Friday, January 15, 2010

Querying with Key parameters

Today we're going to cover a small 'gotcha' that I've seen on the forums a number of times: How do you execute a query where one or more of the parameters are of type com.google.appengine.datastore.api.Key?

Did you know that a Key has two different string representations?  You can see the first by calling Key.toString():
Key key = KeyFactory.createKey("Foo", 25);
System.out.println(key); // prints Foo(25)

You can see the second by calling KeyFactory.keyToString():
Key key = KeyFactory.createKey("Foo", 25);
System.out.println(KeyFactory.keyToString(key)); // prints agR0ZXN0cgkLEgNGb28YGQw

The first string is human readable, just like the javadoc for Object.toString() recommends.  The second string is websafe (no escaping required) and easily parsed by a machine (it's just the Key object translated to a protocol buffer and then base-64 encoded).  Seems reasonable, right?  Key.toString() outputs a string that is useful in logging and debugging, KeyFactory.keyToString() outputs a string that you can safely pass around in forms and urls.  Plus for those of you who, like me, care about symmetry in your apis, KeyFactory.keyToString() and KeyFactory.stringToKey() are symmetrical.  If Key.toString() returned the websafe version, what would the opposite transform look like?  Probably a static method on Key called fromString(), right?  So now you have mirror methods but one requires an instance and the other is static.  I don't like it but I digress.  You're probably reading this because you'd like to learn something useful so let's move along.

The problem with having two string representations of the Key class stems from the fact that for a number of data types it's easy to "inline" the parameter value.  Take this query for example:
JPA:

public List<Flight> flightsByAircraftType(EntityManager em, String aircraftType) {
    String queryStr = String.format("select from " + Flight.class.getName() + 
        " where aircraftType = '%s'", aircraftType);
    return em.createQuery(queryStr).getResultList();
}

JDO:
public List<Flight> flightsByAircraftType(PersistenceManager pm, String aircraftType) {
    String queryStr = String.format("select from " + Flight.class.getName() + 
        " where aircraftType == '%s'", aircraftType);
    return (List<Flight>) pm.newQuery(queryStr).execute();
}


Concise, readable, and correct.  But now let's do the same thing with a Key parameter:
JPA:

public List<Flight> flightPaginationByDest(EntityManager em, Key dest1, Key dest2) {
    String queryStr = String.format("select from " + Flight.class.getName() + 
        " where dest > '%s' and dest <= '%s'", dest1, dest2);
    return em.createQuery(queryStr).getResultList();
}

JDO:
public List<Flight> flightPaginationByDest(PersistenceManager pm, Key dest1, Key dest2) {
    String queryStr = String.format("select from " + Flight.class.getName() + 

        " where dest > '%s' && dest <= '%s'", dest1, dest2);
    return (List<Flight>) pm.newQuery(queryStr).execute();
}

Concise, readable, but incorrect.  We just asked the datastore for all Flights departing from airports between "Airport(15)" and "Airport(30)", and that's not going to match anything, ever, because we passed a String parameter to a query on entities where every single "dest" property is of type Key.  Remember, the datastore is schema-less.  If we were using the low-level api you could create some flight entities where the "dest" property is of type String and some flight entities where the "dest" property is of type Key.  The datastore has no problem with that.
So what's the solution?  Ultimately the query compiler should detect this, and in future releases you can expect to get an exception if you attempt to match an unencoded String value against a Key.  Until this is implemented I'd recommend using parameters:
JPA:
public List<Flight> flightPaginationByDest(EntityManager em, Key dest1, Key dest2) {
    String queryStr = "select from " + Flight.class.getName() + 

        " where dest > :p1 and dest <= :p2";
    Query q = em.createQuery(queryStr);
    q.setParameter("p1", dest1);
    q.setParameter("p2", dest2);
    return q.getResultList();

}

JDO:
public List<Flight> flightPaginationByDest(PersistenceManager pm, Key dest1, Key dest2) {
      String queryStr = "select from " + Flight.class.getName() + 

          " where dest > :p1 && dest <= :p2";
      return (List<Flight>) pm.newQuery(queryStr).execute(dest1, dest2);
}

12 comments:

  1. If you are willing to use an alternative API to the datastore, the opensource Objectify-Appengine abstracts this and lets you issue key-based queries on the field flagged as the id of your entity.

    You can say query.filter("id >", 20) and this internally gets translated to a __key__ GREATER_THAN query.

    http://code.google.com/p/objectify-appengine/

    ReplyDelete
  2. This comment has been removed by a blog administrator.

    ReplyDelete
  3. hi Max,

    Can you have a look at this thread? We do need your help.

    http://groups.google.com/group/google-appengine-java/browse_thread/thread/c99d915005ffac3d/273021f1da44e9ed#273021f1da44e9ed

    Many thanks

    ReplyDelete
  4. KeyFactory.keyToString(key)

    Awsome,

    P.S. your blog is an EXCELLENT resource, i know these long posts do take a while to write and format properly. But please keep them coming.

    ReplyDelete
  5. Nice post, but is this kind of query string composition prone to kinda sql injection:

    String.format("select from " + Flight.class.getName() +
    " where aircraftType = '%s'", aircraftType);

    As far as I know the setParameter(..) does not only make your parameters explicit but also can prevent sql injection.

    ReplyDelete
  6. Anyway, here's the question I really wanna ask: Is there any way that key name (or pk-name), key id (or pk-id), or even parent key (or parent-key) can be used in a query regardless of JPQL or JDQL. I know in the Python edition it's possible (correct me if I'm wrong).

    If there's no way to use this kind query, then the power of key name or id (obviously from the GAE doc you can always benefit from a annotated parent key) would be very limited except that you always use encoded string key in conjunction with annotated "gae.pk-name" or "gae.pk-id".

    ReplyDelete
  7. Thanks for the great article.
    http://bit.ly/9WeN0S

    ReplyDelete
  8. Hello .. firstly I would like to send greetings to all readers. After this, I recognize the content so interesting about this article. For me personally I liked all the information. I would like to know of cases like this more often. In my personal experience, I really enjoyed reading about Querying with Key parameters,i think that this is a very professional information, do you know something about eco vacation costa rica ??

    ReplyDelete
  9. This comment has been removed by the author.

    ReplyDelete
  10. The same thing, but with a different syntax:

    String encodedKey = ...;
    final Key key = KeyFactory.stringToKey(encodedKey);
    final Query query = pm.newQuery(ContainerExample.class);
    query.setFilter("subcontainer == key");
    query.declareParameters("com.google.appengine.api.datastore.Key key");
    query.execute(key);

    Where ContainerExample has a
    private Key subcontainer.

    ReplyDelete
  11. With similar query i have got: ".... Caused by: java.lang.IllegalArgumentException: __key__ filter value must be a Key" (JPA, 1.7.1 appengine)
    I cant find the solution when KEy parameter is user.
    EntityManager cant "declareParameters" in Query object :(

    ReplyDelete