Google App Engine Java Persistence: Queries with != and IN filters

Friday, December 4, 2009

Queries with != and IN filters

Since we launched Google App Engine for Java back in April there has been a sizable disparity between the persistence features of Python and Java: Python had support for the != and IN operators in query filters and Java did not. With the release of SDK 1.2.8 this disparity is now gone (hooray!), but before you go rushing out to take advantage of these new operators I want to take some time to explain how the performance of queries that use these operators is a little bit different from what you're used to.

The key difference is that the datastore does not natively support != and IN. Instead, these operators are implemented in "userland." The code lives in appengine-api.jar, and it knows how to transform queries that use != and IN into queries that the datastore natively supports. Let's look at an example. Suppose you want to issue a query that returns all people not named Max, ordered by name and age.

Under the hood, this query gets broken up into two queries. The first query returns all people whose name is lexicographically smaller than Max, ordered by name and age. The second query returns all people whose name is lexicographically larger than Max, ordered by name and age.

JPA:

select from Person where name <> "Max" order by name, age

is translated into

select from Person where name < "Max" order by name, age

select from Person where name > "Max" order by name, age

JDO:

select from Person where name != "Max" order by name, age

is translated into  

select from Person where name < "Max" order by name, age

select from Person where name > "Max" order by name, age

The results of these two queries are then merged, with the sort applied in-memory as we go. Both result sets are ordered by the same properties, so we can determine the next result to return by comparing the two results at the front of the two result sets. This makes the in-memory sort efficient. We also need to de-dupe as we go because a multi-value property may place an Entity in both result sets. The performance of this query is equivalent to the performance of the two underlying queries with some additional memory consumption because we need to maintain references to all the results we've already returned in order to de-dupe.

IN filters are implemented in similar fashion, but instead of requiring a fixed number of queries to fulfill they require N queries, where N is the number of values in the IN clause. Suppose you want all people whose favorite foods are cheeseburgers, pizza, and fried chicken, ordered by favoriteFood and age.

Under the hood this query gets broken up into three queries. The first query returns all people whose favorite food is cheeseburgers, ordered by favoriteFood and age. The second query returns all people whose favorite food is pizza, ordered by favoriteFood and age. The third query includes all people whose favorite food is fried chicken, ordered by favoriteFood and age.

JPA:

select from Person where
favoriteFood IN ('cheeseburger', 'pizza', 'fried chicken')
order by favoriteFood, age

is translated into

select from Person where favoriteFood = "cheeseburger" order by favoriteFood, age

select from Person where favoriteFood = "pizza" order by favoriteFood, age

select from Person where favoriteFood = "fried chicken" order by favoriteFood, age

JDO:

Query q = pm.newQuery(
"select from Person where :p1.contains(favoriteFood) order by favoriteFood, age");

q.execute(Arrays.asList("cheeseburger", "pizza", "fried chicken"));

is translated into

select from Person where favoriteFood == "cheeseburger" order by favoriteFood, age

select from Person where favoriteFood == "pizza" order by favoriteFood, age

select from Person where favoriteFood == "fried chicken" order by favoriteFood, age

Once again we merge the results of these three queries, sorting and de-duping as we go. Once again the performance of this query is equivalent to the performance of the underlying queries with some additional memory consumption.

Can you issue a query that combines != and IN? Absolutely. Can you have multiple IN filters? Absolutely again! But, be careful, if your query requires more than 30 underlying queries to fulfill you'll get an exception. Here's a query that requires too many underlying queries to fulfill:

JPA:

select from Person where
   name <> "Max" AND
   favoriteFood IN ("cheeseburger", "pizza", "fried chicken") AND
   age IN (10, 11, 12, 13, 14, 15, 16, 17, 18 19) order by name

is translated into

select from Person where name < "Max" AND favoriteFood = "cheeseburger" AND age = 10 order by name

select from Person where name > "Max" AND favoriteFood = "cheeseburger" AND age = 10 order by name

select from Person where name < "Max" AND favoriteFood = "cheeseburger" AND age = 11 order by name

select from Person where name > "Max" AND favoriteFood = "cheeseburger" AND age = 11 order by name

select from Person where name < "Max" AND favoriteFood = "cheeseburger" AND age = 12 order by name

select from Person where name > "Max" AND favoriteFood = "cheeseburger" AND age = 12 order by name

select from Person where name < "Max" AND favoriteFood = "cheeseburger" AND age = 13 order by name

select from Person where name > "Max" AND favoriteFood = "cheeseburger" AND age = 13 order by name

select from Person where name < "Max" AND favoriteFood = "cheeseburger" AND age = 14 order by name

select from Person where name > "Max" AND favoriteFood = "cheeseburger" AND age = 14 order by name

select from Person where name < "Max" AND favoriteFood = "cheeseburger" AND age = 15 order by name

select from Person where name > "Max" AND favoriteFood = "cheeseburger" AND age = 15 order by name

select from Person where name < "Max" AND favoriteFood = "cheeseburger" AND age = 16 order by name

select from Person where name > "Max" AND favoriteFood = "cheeseburger" AND age = 16 order by name

select from Person where name < "Max" AND favoriteFood = "cheeseburger" AND age = 17 order by name

select from Person where name > "Max" AND favoriteFood = "cheeseburger" AND age = 17 order by name

select from Person where name < "Max" AND favoriteFood = "cheeseburger" AND age = 18 order by name

select from Person where name > "Max" AND favoriteFood = "cheeseburger" AND age = 18 order by name

select from Person where name < "Max" AND favoriteFood = "cheeseburger" AND age = 19 order by name

select from Person where name > "Max" AND favoriteFood = "cheeseburger" AND age = 19 order by name

select from Person where name < "Max" AND favoriteFood = "pizza" AND age = 10 order by name

select from Person where name > "Max" AND favoriteFood = "pizza" AND age = 10 order by name

select from Person where name < "Max" AND favoriteFood = "pizza" AND age = 11 order by name

select from Person where name > "Max" AND favoriteFood = "pizza" AND age = 11 order by name

.... and so on.

JDO:

Query q = pm.newQuery("select from Person where " +
"name != 'Max' AND " +
":p1.contains(favoriteFood) && :p2.contains(age) order by name");

q.execute(Arrays.asList("cheeseburger", "pizza", "fried chicken"),
Arrays.asList(10, 11, 12, 13, 14, 15, 16, 17, 18, 19));

is translated into

select from Person where name < "Max" && favoriteFood == "cheeseburger" AND age == 10 order by name

select from Person where name > "Max" && favoriteFood == "cheeseburger" AND age == 10 order by name

select from Person where name < "Max" && favoriteFood == "cheeseburger" AND age == 11 order by name

select from Person where name > "Max" && favoriteFood == "cheeseburger" AND age == 11 order by name

select from Person where name < "Max" && favoriteFood == "cheeseburger" AND age == 12 order by name

select from Person where name > "Max" && favoriteFood == "cheeseburger" AND age == 12 order by name

select from Person where name < "Max" && favoriteFood == "cheeseburger" AND age == 13 order by name

select from Person where name > "Max" && favoriteFood == "cheeseburger" AND age == 13 order by name

select from Person where name < "Max" && favoriteFood == "cheeseburger" AND age == 14 order by name

select from Person where name > "Max" && favoriteFood == "cheeseburger" AND age == 14 order by name

select from Person where name < "Max" && favoriteFood == "cheeseburger" AND age == 15 order by name

select from Person where name > "Max" && favoriteFood == "cheeseburger" AND age == 15 order by name

select from Person where name < "Max" && favoriteFood == "cheeseburger" AND age == 16 order by name

select from Person where name > "Max" && favoriteFood == "cheeseburger" AND age == 16 order by name

select from Person where name < "Max" && favoriteFood == "cheeseburger" AND age == 17 order by name

select from Person where name > "Max" && favoriteFood == "cheeseburger" AND age == 17 order by name

select from Person where name < "Max" && favoriteFood == "cheeseburger" AND age == 18 order by name

select from Person where name > "Max" && favoriteFood == "cheeseburger" AND age == 18 order by name

select from Person where name < "Max" && favoriteFood == "cheeseburger" AND age == 19 order by name

select from Person where name > "Max" && favoriteFood == "cheeseburger" AND age == 19 order by name

select from Person where name < "Max" && favoriteFood == "pizza" AND age == 10 order by name

select from Person where name > "Max" && favoriteFood == "pizza" AND age == 10 order by name

select from Person where name < "Max" && favoriteFood == "pizza" AND age == 11 order by name

select from Person where name > "Max" && favoriteFood == "pizza" AND age == 11 order by name

.... and so on.

One final note that isn't entirely related but that I'll tack on anyway: You can now use the 'OR' operator in your queries as long as the query can be rewritten using IN. That means this now works:

JPA:

select from Person where
   favoriteFood = "cheeseburger" OR
   favoriteFood = "pizza" OR
   favoriteFood = "fried chicken"

JDO:

select from Person where
   favoriteFood == "cheeseburger" ||
   favoriteFood == "pizza" ||
   favoriteFood == "fried chicken"

But this still doesn't:

JPA:

select from Person where
favoriteFood = "cheeseburger" OR age = 10

JDO:

select from Person where
favoriteFood == "cheeseburger" || age == 10

Enjoy the new operators!

38 comments:

Guillaume LaforgeDecember 8, 2009 at 12:57 AM
I would still love to see also examples using the low-level datastore API, that would really be handy!
ReplyDelete
Replies
MaxDecember 8, 2009 at 10:05 AM
The low-level api has far less surface area than JDO and JPA so I would expect the javadoc to be sufficient for most things, but if you have questions about how to accomplish specific tasks, please ask and I'll answer as best I can.

We have plans to document the low-level api alongside JDO and JPA in the official docs but this will take some time.
ReplyDelete
Replies
David GruzmanDecember 24, 2009 at 3:53 PM
Do you think it is a good idea to simulate a capabilities, which are not supported by the underlying engine? It might cause creation of inefficient applications, because developers will be lured into the sense that the capabilities like IN do present and should be used whenever needed.
ReplyDelete
Replies
MaxDecember 28, 2009 at 2:13 PM
In general, no, we don't simulate capabilities that are not supported by the underlying engine. That's why we don't support aggregate functions, joins, etc. The rule of thumb is that we only support operations that scale with the size of the result set (as opposed to the size of the data set). The implementation of != and IN that I've described in this post meets that criteria. These queries are more expensive to fulfill by a factor related to the query itself but still tied to the size of the result set. So, in this case, I think the exception is warranted.
ReplyDelete
Replies
andyJanuary 6, 2010 at 1:38 AM
A very simple alternative to the latter question is to make available access to the in-memory query evaluator. This would then allow people to get the results they want for the query they supply, and they have specified a query extension to enable it ... hence know that things may be done non-optimally. Their query would then be portable to other JDO/JPA implementations too .... this is actually important to some people. This alternative was raised back in April (2009) as an issue, and is NOT a lot of work to enable.
ReplyDelete
Replies
UnknownJanuary 26, 2010 at 5:26 PM
may i know is there hard limit when using "IN" when doing jpa/jdo query?
ReplyDelete
Replies
MaxJanuary 26, 2010 at 6:30 PM
A limit on what?
ReplyDelete
Replies
UnknownJanuary 26, 2010 at 7:08 PM
i mean, how long the "select in" statement.
ReplyDelete
Replies
MaxJanuary 26, 2010 at 8:49 PM
JPA and JDO don't impose any limitations, but if the query requires more than 30 underlying queries you'll get an exception. This is explained in more detail in the post.
ReplyDelete
Replies
UnknownJanuary 26, 2010 at 8:53 PM
thanks @Max ross . can you comment on socialwork presentation below ?

DatastoreService dataSvc =...;

Query query = new Query("MessageRecipients")
.addFilter("recipients"), FilterOperator.EQUAL,userid)
.addSort("date", SortDirection.DESCENDING)
.setKeysOnly(); // <-- Only fetch keys!

//*** from the presentation , mentioned the above can only be done
using low level api, arent jdo/jpa can get the key only value as well
//***http://gae-java-persistence.blogspot.com/2009/10/keys-only-
queries.html

List msgRecpts = dataSvc.prepare(query).asList();
List parents = new ArrayList();
for(Entity recep : msgREcpts) {
parents.add(recep.getParent());

}

//Bulk fetch parents using key list
Map msg = dataSvc.get(parents);

//***for second part above, what is the different from doing directly
"select in" statement in jpa/jdo ?
//***http://gae-java-persistence.blogspot.com/2009/12/queries-with-and-
in-filters.html
ReplyDelete
Replies
Eric JainFebruary 12, 2010 at 10:11 AM
Do these queries run in parallel?
ReplyDelete
Replies
MaxFebruary 12, 2010 at 1:54 PM
Good question. In the traditional sense, no. != and IN filters are executed in the datastore api implementation, which executes inside your app. Since we don't currently allow threads in your app the sub-queries are initiated in sequence. However, we only need one result from each sub-query to determine the next result to return for the top-level query, and query results aren't streamed back from the datastore until they are needed, so in a sense the queries are being evaluated in parallel, just not in parallel within the jvm where your application code is running.
ReplyDelete
Replies
Florian d'ErfurthOctober 11, 2010 at 5:21 AM
When building a query using the low-level datastore API, am I limited to 30 underlying queries too?
I need to count plenty of entites fast in geocells for my map based app.
ReplyDelete
Replies
UnknownMay 2, 2011 at 4:13 AM
Doesn't it support not in or not contains??
ReplyDelete
Replies
Bez IminiJuly 2, 2011 at 10:51 AM
It is possible to not just test if the query string *contains* the Entity property, but also find out whether the query string contains the Entity property at the *beginning* of the query string?

E.g.

Entity Property: "this is just a test"
Query String: "this is" - SHOULD MATCH
Query String: "just a test" - SHOULD FAIL

As it is, if I just use the IN operator, both query strings would match
ReplyDelete
Replies
Bez IminiJuly 2, 2011 at 3:40 PM
I'm just replying to my previous. I think I totally misunderstood the function of contains()

I thought it performed the same function as String.contains()

But it's actually used to search if a property is contained in a collection of possible results.

Oops.

Is there any sort of String.contains() functionality in JDOQL queries?
ReplyDelete
Replies
pinoOctober 13, 2011 at 2:54 PM
This comment has been removed by the author.
ReplyDelete
Replies
BenjaminMarch 4, 2012 at 11:34 AM
The low level API permits batch gets when retrieving by entity key. They don't mention a limit of 30 anywhere for that.

Is it possible to fetch more than 30 entities with JPA if you query with an IN clause on the key property?

For example:
select from MyEntity where id IN (:keyList)

and then keyList is a collection with over 30 keys.
ReplyDelete
Replies
NewOctober 22, 2012 at 7:57 AM
Hi Max,
Nice Blog!!

I have a couple of questions:
1. I didn't understood this ':p1.contains(favoriteFood) && :p2.contains(age)'.
Everything okay but what is ':p1' and ':p2'? Is it a syntax to specify 'parameter 1', 'parameter 2' ?

2. Can we match substrings? If yes, how? Suppose datastore has values "looping", "loop", "loops" and the search parameter value is "loop". So the result should give all the 3 values. :p.contains returns only the search value(here "loop") is returned.

Thanks
ReplyDelete
Replies
AnkurJune 14, 2013 at 1:04 AM
It's been nearly 4 years you started supporting != and IN *non-natively* through a kludge. Are there any plans to support != and IN natively any time soon?

The problem is that in their current avatars, they (especially IN) result in rapid increase of costs and additionally, in limitations. Now I need to put checks in the system so that the total number of items in IN are not more than 30. Too problematic. Need a graceful solution here.
ReplyDelete
Replies
Sophie GraceNovember 17, 2019 at 11:35 PM
This is extremely a pleasant and instructive, containing all data and furthermore greatly affects the new innovation. A debt of gratitude is in order for sharing it. To know about top account on instagram, please go to webstagram
ReplyDelete
Replies
Bitdefender LoginDecember 22, 2019 at 11:02 PM
Very Good Information Thanks for sharing me its very thoughtful article thanks

TomTom Home | office.com/verify | Garmin Express | turbotax login | Norton Security Standard
ReplyDelete
Replies
Bitdefender LoginDecember 22, 2019 at 11:03 PM
Nice Information thanks for sharing me.
Mcafee.com/mis/retailcard | mcafee.com/activate
ReplyDelete
Replies
norton.com/setupMay 29, 2020 at 5:00 PM
Thank you so much for spending time to write this blog article, i must say you are a excellent and talented blogger, I Found This Article To Be Very Informative and Helpful, please keep sharing such amazing blogs I am a Accountant from United States of America, Washington Dc, and I Love to read and Write Blog.
Some Of My Web Blogs Awesome Blog Post Please Feel Free to Check Out My Blog you will find the process of Norton software installation with step by step instructions and much more.

Keep Sharing will come back to read more,

Click to Process

norton.com/setup download and install
norton.com/setup enter product key
www.norton.com/setup
login.norton.com
my.norton.com
norton.com/setup
norton.com/setup activate
norton.com/setup install
norton.com/setup help
norton.com/setup renewals
norton.com/enroll product key

ReplyDelete
Replies
Ramesh jiJune 10, 2020 at 12:16 AM
Thanks for provide great informatic and looking beautiful blog, really nice required information & the things i never imagined and i would request, wright more blog and blog post like that for us. Thanks you once agian
we offer services birth certificate in delhi which inculde name add in birth certificate and birth certificate correction complete process is online and we offer birth certificate onlineand we offer this birth certificate apply online same service offers at yourdoorstep at birth certificate in ghaziabad our dream to provide birth certificate in india and other staes like birth certificate in bengaluru and birth certificate in gurgaon book service with us birth certificate in noida also, service at yoursdoorstep only.
ReplyDelete
Replies
Nancy AngelFebruary 4, 2021 at 9:25 PM
Best method to create your primevideo mytv code account through the activation process by primevideo.com/mytv. You need to follow the instructions to activate www.amazon.com code. If you are viewing Amazon Prime Video on your device go toamazon.com/mytv.
ReplyDelete
Replies
primevideocommytvFebruary 15, 2021 at 12:47 AM
Primevideo.com/mytv is Amazon online interface to enlist a gadget with the enactment code on the TV screen to watch Prime recordings substance on the gadget that produced the Prime video initiation code. The superb video enactment code interfaces the gadget to the real time feature and registers the gadget to transfer the help when you click on the excellent video application on the gadget.
Read more…
ReplyDelete
Replies
primevideo.com/mytv codeMarch 10, 2021 at 12:07 PM
Copy and paste the web link link in your web browser that is www.amazon.com/mytv enter code. Login to your prime video account using your prime video login credentials. After that enter the activation code you have on your screen .Devices Compatible with Prime Video. Exclusive content and Amazon Original series Enjoy exclusive movies and TV shows on your device.
ReplyDelete
Replies
AnonymousMarch 15, 2021 at 3:48 AM
Make sure that your OS is upgraded into the new variant. Restart the program by departing out, shutting the desktop program and re opening Tubi. ... Apparent Tubi cache and data stored in your own program. You can achieve it when you go to Settings onto your own Android apparatus, and selecting Apps > Tubi > tap Clear cache.
tubi.tv/activate
tubi.tv/activate enter code
tubi.tv/activate
ReplyDelete
Replies
donJune 25, 2021 at 11:26 PM
great share
https://higherlovewellness.com/
ReplyDelete
Replies
donJune 26, 2021 at 12:33 AM
Great share and thank you
https://higherlovewellness.com/
ReplyDelete
Replies
Sam BillingsJuly 28, 2021 at 3:30 AM
The Gmail application doesn't perform properly because of some reasons. The ofttimes occurring issues with the Gmail application ar associated with receiving emails and notifications. you have got to resolve these problems by following the Gmail not working troubleshooting steps below.
ReplyDelete
Replies
charlesFebruary 16, 2022 at 9:25 AM
Love it. This is really nice.
locast.org/activate
hbomax/tvsignin
disneyplus.com/begin
showtimeanytime.com activate
ReplyDelete
Replies
AjaypalFebruary 27, 2022 at 3:15 AM
Music and Performing Arts today is growing to be fields that students would like to consider as professional education options. All professional courses In Music and Performing Arts are available through us. Coaching And Mentoring In Music & Performing Arts is also our forte.
ReplyDelete
Replies
glad432March 6, 2022 at 5:11 AM
This comment has been removed by the author.
ReplyDelete
Replies
Jobi JohnsonAugust 4, 2022 at 10:44 PM
This comment has been removed by the author.
ReplyDelete
Replies
GovtAssistancesMarch 27, 2023 at 12:00 AM
The laptops are lightweight and portable, making them easy to carry to and from school. Visit
ReplyDelete
Replies
Help Single MotherSeptember 23, 2023 at 12:09 AM
Before diving into the assistance options, it's crucial to understand why furniture matters. Furniture is not just about aesthetics; it plays a fundamental role in a family's daily life. Click here
ReplyDelete
Replies

Add comment

Google App Engine Java Persistence

Friday, December 4, 2009

Queries with != and IN filters

38 comments:

App Engine Links

Followers

Blog Archive

About Me