Sunday, March 21, 2010

Executing Simple Joins Across Owned Relationships

As promised, today I'm going to explain how to use simple joins with owned relationships. This requires some additional setup because the physical layout of an owned relationship is not the same as the physical layout of an unowned relationship. Specifically, with an unowned relationship, the keys of the child entities are stored in a property on the parent, whereas with an owned relationship the keys of the child entities are not stored in a property on the parent - ownership is instead modeled using the hierarchical structure of the child key.  Simple joins depend on the physical layout we get with unowned relationships, so if we're going to use simple joins with owned relationships we're going to need to change the way we store them.

Fortunately, as of SDK 1.3.1, there is a configuration option that changes the physical layout of owned relationships to match the physical layout of unowned relationships while preserving the hierarchical structure of the child key.  Nice!  You can enable this option by adding the following to your config:

<property name="datanucleus.appengine.storageVersion" value="WRITE_OWNED_CHILD_KEYS_TO_PARENTS"/>

If you're using JPA, add this to persistence.xml. If you're using JDO, add this to jdoconfig.xml. If you don't have any existing data in production you're ready to move on to the examples. If you do have existing data there's more work to do: data migration. If you're like me then you get sad, then angry, then hungry, then sad again whenever you see those words so I apologize for just throwing them in your face like that, but if you have existing data there's really no way around it.  See, setting this config option will cause all newly created and updated parents to be written with a property containing child keys, but all your existing data is still sitting there in the old layout. If you want the results of your simple joins to be accurate (and you do want this, right?) you're going to need to migrate all your existing data. I'll explain how to do this in a later post (note that I said 'a later post' not 'the next post' - I may not get to this for awhile).

Let's go back to our college example.  We have Students, Credits, and Honors Thesis.  Every Student has zero or more Credits and zero or one Honors Thesis.

JPA:
@Entity
public class Student {
  @Id
  @GeneratedValue(strategy = GenerationType.IDENTITY)
  private Long id;

  private String year;

  @OneToMany(cascade = CascadeType.ALL)
  private List<Credit> credits;

  @OneToOne(fetch = FetchType.LAZY, cascade = CascadeType.ALL)
  private HonorsThesis thesis;
}

@Entity
public class Credit {
  @Id
  @GeneratedValue(strategy = GenerationType.IDENTITY)
  private Key key;

  private String department;
}

@Entity
public class HonorsThesis {
  @Id
  @GeneratedValue(strategy = GenerationType.IDENTITY)
  private Key key;

  private boolean complete;
}


JDO:
@PersistenceCapable(detachable = "true")
public class Student {
  @PrimaryKey
  @Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
  private Long id;

  private String year;

  @Element(dependent = "true")
  private List<Credit> credits;

  @Persistent
  private HonorsThesis thesis;
}

@PersistenceCapable(detachable = "true")
public class Credit {
  @PrimaryKey
  @Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
  private Key key;

  private String department;
}

@PersistenceCapable(detachable = "true")
public class HonorsThesis {
  @PrimaryKey
  @Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
  private Key key;

  private boolean complete;
}
---------------------------------------
Here's a query that returns all Seniors with at least one credit from the Philosophy department:
JPA:
Query q = em.createQuery("select from " + Student.class.getName() 
  + " s JOIN s.credits c where "
  + "c.department = 'Philosophy' and "
  + "s.year = 'Senior'");

JDO:
Query q = pm.newQuery("select from " + Student.class.getName() 
  + " where "
  + "credits.contains(c) && c.department == 'Philosophy' && "
  + "year == 'Senior'");
q.declareVariables(Credit.class.getName() + " c");

---------------------------------------

And here's a query that returns all Seniors who have completed an Honors Thesis:
JPA:
Query q = em.createQuery("select from " + Student.class.getName() 
  + " s JOIN s.thesis t where "
  + "t.complete = true and "
  + "s.year = 'Senior'");

JDO:
Query q = pm.newQuery("select from " + Student.class.getName() + " where "
  + "thesis == t && t.complete == true && "
  + "year == 'Senior'");
q.declareVariables(HonorsThesis.class.getName() + " t");


---------------------------------------
Unlike the unowned relationship examples we covered in the last post, we're now firmly back in the land of standard JPA and JDO.  Good!  That's where we want to be.  Give it a try and let me know what you think!  In the next post, hopefully less than 6 weeks away this time, I'll explain how simple joins work under the hood.

64 comments:

  1. Would including
    <property name="datanucleus.appengine.storageVersion" value="WRITE_OWNED_CHILD_KEYS_TO_PARENTS"/>
    affect efficiency?

    Now 2-way joins not enough for me.. But I thought n-way joins would be possible soon. So I just want to prevent migration in future using this config for now..

    ReplyDelete
  2. What about the reverse join ?
    Would a JPA query like this work ?

    select c from Credits c JOIN c.student s where s.id = 123;

    Assuming of course that the Credit class has a student property.

    ReplyDelete
  3. hi max,

    Since *contain* keyword can be translated into several queries, why I got an UnsupportedDatastoreFeatureException?

    ReplyDelete
  4. @Alex:
    WRITE_OWNED_CHILD_KEYS_TO_PARENTS results in an additional write to the parent.
    1) Write the parent
    2) Write child entities
    3) Write child keys to the parent

    The last write has to be a separate step because if you're using id generation we don't know the Keys of the child objects until after they've been written in step 2.

    ReplyDelete
  5. @Jean-Baptiste:
    In your example you can just do an ancestor query, no join needed: "Give me all Credits where the Credit ancestor is Student 123."

    ReplyDelete
  6. @Max Zhu:
    Are you saying you ran the example code I provided and you're getting an exception? If so, can you please post the exception? If not, did you change something? If you changed something, what did you change?

    Thanks,
    Max

    ReplyDelete
  7. Thanks Max.. your explanation really help.

    ReplyDelete
  8. very interesting!
    Apologies if this is an obvious question, but what did you mean (in your previous post) re: that this feature was "available to beta testers"? Should it work with a 'regular' deployment of 1.3.1 ?
    I assume that it's still unofficial in some respects since it's not in the release notes?

    ReplyDelete
  9. It will work with a regular deployment of 1.3.1 or later. 'beta' in this case just means it's not yet an official feature of the SDK.

    ReplyDelete
  10. I appreciate the information. Java is one of the consistent player from the development industry which has been providing the wider scope for developers to come out with different solutions.

    ReplyDelete
  11. Hi Max,

    Here is the stacktrace:

    org.datanucleus.store.appengine.query.DatastoreQuery$UnsupportedDatastoreFeatureException: Problem with query "SELECT FROM c.demo.domain.Task WHERE indexList.contains(i) && :p1.contains(i.index) VARIABLES c.demo.domain.TaskIndex i>: Filter on property 'index' uses operator 'IN'. Joins are only supported when all filters are 'equals' filters.
    at org.datanucleus.store.appengine.query.JoinHelper.validateJoinQuery(JoinHelper.java:131)
    at org.datanucleus.store.appengine.query.JoinHelper.executeJoinQuery(JoinHelper.java:66)
    at org.datanucleus.store.appengine.query.DatastoreQuery.executeQuery(DatastoreQuery.java:247)
    at org.datanucleus.store.appengine.query.DatastoreQuery.performExecute(DatastoreQuery.java:228)
    at org.datanucleus.store.appengine.query.JDOQLQuery.performExecute(JDOQLQuery.java:85)
    at org.datanucleus.store.query.Query.executeQuery(Query.java:1489)
    at org.datanucleus.store.query.Query.executeWithArray(Query.java:1371)
    at org.datanucleus.jdo.JDOQuery.execute(JDOQuery.java:243)
    at c.demo.QueryServlet.doGet(QueryServlet.java:38)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:693)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:806)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:487)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1093)
    at com.google.appengine.api.blobstore.dev.ServeBlobFilter.doFilter(ServeBlobFilter.java:51)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1084)
    at com.google.apphosting.utils.servlet.TransactionCleanupFilter.doFilter(TransactionCleanupFilter.java:43)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1084)
    at com.google.appengine.tools.development.StaticFileFilter.doFilter(StaticFileFilter.java:121)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1084)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:360)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
    at com.google.apphosting.utils.jetty.DevAppEngineWebAppContext.handle(DevAppEngineWebAppContext.java:70)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
    at com.google.appengine.tools.development.JettyContainerService$ApiProxyHandler.handle(JettyContainerService.java:352)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
    at org.mortbay.jetty.Server.handle(Server.java:313)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:506)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:830)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:514)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:211)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:381)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:396)
    at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)

    ReplyDelete
  12. 好的部落格就要和好朋友分享--感謝分享.........................

    ReplyDelete
  13. Nice article Max. I am wondering if you could help me on this dilemma. I have a ListContainer entity with a key and "name" property. It simply stores the names of the list. Then I have a ListItem property which stores a key and the text in the list. Now I want to have one to many between ListContainer and ListItem. I have a page which displays the top 100 list items for the list you've selected. So the first thing is to display all the list names. then on clicking list names, I'll display the list items.
    #1) I am wondering if it would be more performant, if I stored the Key of ListContainer inside the ListItem without any relationship between them. I'd simply do an equality filter on the listitem to check if the key matches and fetch all the records.
    #2) Is it safe to pass the key of listcontainer upto clients browser and return it to server, for querying purposes(say in a hidden variable).
    I'd very much appreciate your help.

    ReplyDelete
  14. Oh and to add to the my previous comment, There will be five list containers and say 1000s of listitem per list container.

    ReplyDelete
  15. Hi,

    I have two classes which are like
    public class Employee {
    ***key and other property
    private Department department;
    *** getter-setter
    }
    and
    public class Department {
    ***key and other property
    private List employees;
    *** getter-setter
    }

    how i can add and update employee's department

    Thanks

    ReplyDelete
  16. Max: Will this capability be leveraged to do efficient loading of child objects when specified in defaultFetchGroup. The current behavior is to load one object at a time from the db leading to a very chatty and hence poorly performing load.

    ReplyDelete
  17. I first used this new functionality (for JDO) a while ago, and I forgot to say "Thank you."
    This is brilliant, and just what I needed. Thanks to you, Max, and to all your colleagues who made this happen!

    ReplyDelete
  18. this is only feasible for 2-way joins ? can we do 3 or more-way joins?

    ReplyDelete
  19. @Max its been 6 weeks already!

    Tell us more about data migration.

    ReplyDelete
  20. Hi Max ,

    How these Joins are indexed , is there any danger of exploding indexes?

    ReplyDelete
  21. 當一個人內心能容納兩樣相互衝突的東西,這個人便開始變得有價值了。............................................................

    ReplyDelete
  22. 要求適合自己的愛情方式,是會得到更多,還是會錯過一個真正愛你的人。.................................................................

    ReplyDelete
  23. 我們必須先有哭泣,才有歡笑;也必須先感到人生的悲哀,然後才感到人生的快樂。............................................................

    ReplyDelete
  24. 一個人的價值,應該看他貢獻了什麼,而不是他取得了什麼....................................................

    ReplyDelete
  25. 與人相處不妨多用眼睛說話,多用嘴巴思考. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

    ReplyDelete
  26. Custom software is the most expensive type of development due to it’s nature - unique functionality and utilization of new challenging technologies. The most important thing to understand is that java software development outsourcing is a significant decision that can have lasting ramifications for an organization

    ReplyDelete
  27. eing engaged in a very long relationship make people unpredictable, because it comes a time that you don't know well your couple but there are some little details that might improve the relation for instance the use of Generic Viagra.

    ReplyDelete
  28. Hi, I have a project that uses spring 3, spring-security 2.something, spring MVC, JPA with hibernate and some more things and Im trying to migrate to appengine...

    In this "process" I only find problems and errors, and I dont know how to make this work.

    can you help me please?

    ReplyDelete
  29. Are you fired Max !!
    Haven't seen you active since around six months.

    ReplyDelete
  30. Thanks for checking up on me. :-) Not fired, still working on App Engine, but focused entirely on internal datastore improvements. I hope to resurface some day...

    ReplyDelete
  31. We are still waiting for that next post ;-)

    ReplyDelete
  32. What if you wanted to check for multiple conditions in the child?

    Say you wanted only students that have taken both Philosophy and Math?

    Is something like this valid:

    Query q = pm.newQuery("select from " + Student.class.getName()
    + " where "
    + "credits.contains(c) && c.department == 'Philosophy' && "
    + "credits.contains(c2) && c2.department == 'Math' && "
    + "year == 'Senior'");
    q.declareVariables(Credit.class.getName() + " c; " + Credit.class.getName() + "c2");

    ReplyDelete
  33. Hi Max,
    what wrong with query with two variables? Query with single variable works well.
    Thanks.

    SELECT FROM com.domain.AdvImpl WHERE ( contacts.contains(v0) && v0.value == p1 ) && ( this.location == v1 && v1.city == p2 ) VARIABLES com.domain.ContactImpl v0; com.domain.LocationImpl v1 PARAMETERS String p0, String p1 import com.domain.ContactImpl; import com.domain.LocationImpl;

    ReplyDelete
  34. Hi Max,

    I have a question regarding what can be manipulated in parameters of a query:

    I have a list of entities where I store the date when I record them, both as a Date object and as an int (20110106 for example would be jan 6 2011). This is because on SQL whenever I wanted The entities for the month I could do something like:

    "Select from entity where IntDate/100 = paramterIntDate/100"

    Right now on GAE I am resorting to the creation of a start of month int (20110100) and an end of month int (20110132) and use inequalities to compare my entity intdate between these two values.

    Of course this means i lose all kind of sorting options since I am restricted to sort by the date first.

    Is there some way to manipulate the parameter like I did in SQL like dividing on both sides of the equation?

    As I said I also store the value as a Date, is there some way just to send the month and instruct the date object to use that as a reference??

    I hope I am being clear enough, thanks

    ReplyDelete
  35. Max! Max! Max!
    Let's go further on Jean-Baptiste's question!

    how to make this work
    -- we want to get the children objects.
    -- two search criteria, one from parent, one from child

    For example:
    select c from Credits c JOIN c.student s where s.id = 123 AND c.department='Math';

    It returns empty results to me...
    Please clarify!
    Thanks,

    ReplyDelete
  36. Hi I ran into the following issue while running a joined query between a Date range through JDO, is there any work around for this??

    org.datanucleus.store.appengine.query.DatastoreQuery$UnsupportedDatastoreFeatureException:
    Problem with query SELECT FROM Task WHERE dailyCells == d && d.date >= :fromDate && d.date <= :toDate && userId == :user VARIABLES DailyCell d

    ReplyDelete
  37. with the OneToMany annotation referring to the list of Credits

    private List credits;

    i assume the number of the 'many' are limited by the hard index value per entity limit of 5000.

    is that correct?
    how can i create a one-2-many for greater than 5K?

    ReplyDelete