What if we set our sights a little bit lower? Sure, n-way joins with arbitrary filters and sort orders would be ideal, but isn't something better than nothing in this area? It turns out that with some additional restrictions in place it is possible to execute joins where the performance scales with the size of the result set, and as of the recently released SDK 1.3.1 this feature, which I'm calling "simple joins," is available to beta testers.
Now, before you start licking your chops let me give you all the bad news (there's plenty of it). First, simple joins are only available to JPA and JDO users. Why? Because JPA and JDO already have an established API for joins and API design is really really hard. I try to avoid it whenever possible. Second, simple joins only support equality filters. Ouch, right? Third, simple joins can only be sorted by the property on which you're joining. Double ouch. Fourth, if you're using owned relationships, in order to use simple joins you're going to need to upgrade your storage version and migrate existing data. This post is already long enough so I'm just going to focus on simple joins across unowned relationships. I'll cover storage versions, data migration, and joining across owned relationships in my next post, I promise.
I know that was a lot of bad news, but my theory is that even with all that bad news, simple joins are sufficiently useful. I'm making them available to try and prove that theory. Ready to try it out?
Let's model a college. We have Students, Courses, and Dorms. Every Student is associated with zero or one Dorm and zero or more Courses.
JPA:
@Entity
public class Student {
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private Long id;
private String year;
@Basic
private List<Key> courses = new ArrayList<Key>();
@OneToMany
@Column(name = "courses", insertable = false, updatable = false)
private List<Course> coursesAlias;
@Basic
private Key dorm;
@OneToOne(fetch = FetchType.LAZY)
@Column(name = "dorm", insertable = false, updatable = false)
private Dorm dormAlias;
}
@Entity
public class Course {
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private Key key;
private String department;
}
@Entity
public class Dorm {
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private Key key;
private String campus;
}
Yes, there's something funny (and by funny I mean hacky) going on with the courseAlias and dormAlias fields. The JPQL compiler won't let you join on a Key or a List<Key> field because these aren't OneToOne or OneToMany fields. To work around this I've created a read-only alias of the appropriate type. Your application code should always read from and write to the 'courses' and 'dorm' fields while your join queries should read from and write to the 'coursesAlias' and 'dormAlias' fields. Once we have proper support for unowned relationships all this nonsense will go away.
WARNING: There's a bug that causes aliased fields to be written to the datastore with incorrect names. Make sure the names of your aliased fields don't end in "Key" or else your join won't return any results. In the above example you could trigger this bug by renaming Student.dorm to Student.dormKey.
JDO:
@PersistenceCapable(detachable = "true")
public class Student {
@PrimaryKey
@Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
private Long id;
private String year;
@Element(dependent = "true")
private List<Key> courses = new ArrayList<Key>();
@Persistent
private Dorm dorm;
}
@PersistenceCapable(detachable = "true")
public class Course {
@PrimaryKey
@Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
private Key key;
private String department;
}
@PersistenceCapable(detachable = "true")
public class Dorm {
@PrimaryKey
@Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
private Key key;
private String campus;
}
---------------------------------------
Here's a query that returns all Juniors enrolled in a Biology course:
JPA:
Query q = em.createQuery("select from " + Student.class.getName()
+ " s JOIN s.coursesAlias c where "
+ "c.department = 'Biology' and "
+ "s.year = 'Junior'");
JDO:
Query q = pm.newQuery("select from " + Student.class.getName()
+ " where "
+ "courses.contains(c) && c.department == 'Biology' && "
+ "year == 'Junior'");
q.declareVariables(Course.class.getName() + " c");
It's definitely strange to be referring to non-existent properties of a Key field and a List<Key> field, but since proper unowned relationships are not yet supported I eased up some of the JDOQL compiler's type checking to let this go through (sorry Andy, I know you hate this). Non-standard!
---------------------------------------
And here's a query that returns all Juniors who live in a dorm on the North campus of the school:
JPA:
Query q = em.createQuery("select from " + Student.class.getName()
+ " s JOIN s.dormAlias d where "
+ "d.campus = 'North' and "
+ "s.year = 'Junior'");
JDO:
Query q = pm.newQuery("select from " + Student.class.getName() + " where "
+ "dorm == d && d.campus == 'North' && "
+ "year == 'Junior'");
q.declareVariables(Dorm.class.getName() + " d");
Again, I'm taking some liberties with standard JDOQL here.---------------------------------------
These examples should be enough to get you started. Currently, only 2-way joins are supported but there's no reason it can't be extended to n-way. As for outer joins, arbitrary sorts, and inequality filters, my colleagues and I are still thinking hard about it.
Stay tuned for more posts in which I'll explain how to make this work for owned relationships and the underlying algorithm involved. Give simple joins a try and let me know how it goes!