I see questions about case insensitive queries on our message board pretty regularly. We usually give people the same answer, but rather than continuing to describe the solution, why not just provide it? Here we go.
If you're using a relational database it's pretty easy to execute case-insensitive queries because it's a native feature of the database. It usually looks something like this:
select * from Person where UPPER(lastName) = 'ROSS'
This is a challenge for the App Engine Datastore because we rely on executing scans over a limited subset of your data and returning results as we encounter them. Consider the following strings in lexicographic order:
...
ROSE
...
ROSS
...
ROSTA
...
Rose
...
Ross
...
Rosta
...
rose
...
ross
...
rosta
...
As you can see there could be an unlimited number of rows in between 'ROSS' and 'Ross', and also between 'Ross' and 'ross' (okay not unlimited, we do have a limit on indexed string length, but definitely Large), so if we start scanning at 'ROSS' we might have to skip a huge number of results before we hit 'Ross', and that doesn't scale. This is why the datastore doesn't support case-insensitive queries.
Fortunately it's not hard to implement support for case-insensitive queries in your application. Here's the approach: for each field that you want to query in a case-insensitive way, create a duplicate field that stores the value of that field in either all upper or lowercase letters. Then, have your model object register a pre-persist callback with your persistence framework (JPA or JDO), and then populate the duplicate fields inside that callback. Here's what it looks like.
JPA:
import javax.persistence.PrePersist;
import javax.persistence.PreUpdate;
@Entity
public class Person {
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private Long id;
private String lastName;
private String uppercaseLastName;
public Long getId() {
return id;
}
public String getLastName() {
return lastName;
}
public void setLastName(String lastName) {
this.lastName = lastName;
}
@PrePersist
@PreUpdate
public void prePersist() {
if (lastName != null) {
uppercaseLastName = lastName.toUpperCase();
} else {
uppercaseLastName = null;
}
}
}
public List<Person> getPeopleByLastName(String lastName) {
Query q = em.createQuery("select from Person where uppercaseLastName = :p");
q.setParameter("p", lastName.toUpperCase());
return (List<Person>) q.getResultList();
}
JDO:
import javax.jdo.listener.StoreCallback;
@PersistenceCapable(identityType = IdentityType.APPLICATION)
public class Person implements StoreCallback {
@PrimaryKey
@Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
private Long id;
private String lastName;
private String uppercaseLastName;
public Long getId() {
return id;
}
public String getLastName() {
return lastName;
}
public void setLastName(String lastName) {
this.lastName = lastName;
}
@PersistenceCapable(identityType = IdentityType.APPLICATION)
public class Person implements StoreCallback {
@PrimaryKey
@Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
private Long id;
private String lastName;
private String uppercaseLastName;
public Long getId() {
return id;
}
public String getLastName() {
return lastName;
}
public void setLastName(String lastName) {
this.lastName = lastName;
}
public void jdoPreStore() {
if (lastName != null) {
uppercaseLastName = lastName.toUpperCase();
} else {
uppercaseLastName = null;
}
}
}
public List<Person> getPeopleByLastName(String lastName) {
Query q = pm.newQuery(Person.class, "uppercaseLastName == :p");
return (List<Person>) q.execute(lastName.toUpperCase());
}
Note that in both examples I have not created a getter or a setter for the uppercaseLastName field. That's because this field is part of the interface we're exposing to the persistence framework but not part of the public interface of the model object. Your query code can filter by the uppercaseLastName field but the client code that interacts with the Person object has no reason to know that there are duplicate fields that are used to support case-insensitive queries.
thanks for that post, max.
ReplyDeletea simple question: wouldn't the property name then be a candidate for being defined as "unindexed" in regard to optimization?
If you're sure you'll never want to filter or sort by that propery in a case sensitive way then sure, "unindexed" sounds reasonable. Still, make sure the optimization is really necessary. It's much easier to change a property from "indexed" to "unindexed" than the other way around (see previous post).
ReplyDeleteDo you have an opinion on whether or not toUpperCase should be used vs. toLowerCase? I can't think of any reason for one over the other, but am wondering if there's something to think about if Locale's are introduced.
ReplyDeleteFor example, my entity stores a Locale in it along with a lower-case field, and based on the locale, I perform the lowercase function in my prePersist(). Not sure if it really matters for English, but perhaps for other language sets?
Interesting question, I don't know the answer. Perhaps someone else can chime in?
ReplyDeleteJavaDoc from String#toUpperCase()
ReplyDelete-----------------------------
Converts all of the characters in this String to upper case using the rules of the default locale. This method is equivalent to toUpperCase(Locale.getDefault()).
Note: This method is locale sensitive, and may produce unexpected results if used for strings that are intended to be interpreted locale independently. Examples are programming language identifiers, protocol keys, and HTML tags. For instance, "title".toUpperCase() in a Turkish locale returns "T\u0130TLE", where '\u0130' is the LATIN CAPITAL LETTER I WITH DOT ABOVE character. To obtain correct results for locale insensitive strings, use toUpperCase(Locale.ENGLISH).
-----------------------------
So, your solution (just as SQL one) will not work for Turkish language.
E.g we have name "Likas". It will be stored as "LIKAS" (Locale.ENGLISH is the default one in AppEngine, right?). But in Turkish uppercase variant is "LİKAS". So, if some Turkish person will look for "LİKAS", he will not fined "Likas". Hopefully you get my point :)
Finally I found that story:
ReplyDeletehttp://gizmodo.com/382026/a-cellphones-missing-dot-kills-two-people-puts-three-more-in-jail
(originally found by Stefan B.)
So a good solution might be to capture the Locale of the user entering in a certain type of data, and then perform a lower/upper case translation with that user's locale (String.toLowerCase(Locale.XYZ))
ReplyDeleteNot as easy as Max's solution since it requires one more piece of data (the Locale), but should work if you can capture the locale.
What is the benefit of the logic being in prePersist() instead of setLastName()
ReplyDeleteGood question! In this example I can't think of a benefit, other than my desire to demonstrate that these callbacks are easy to use. However, if you needed to compute some derived state based on the value of more than one field you'd definitely want the logic in prePersist() since you don't need to think about the order in which your setters are invoked.
ReplyDeleteOk, now I'm stuck with 20k records that do not have this. What can I do?
ReplyDeletecron? It can run for just 30 seconds right. How do I go about the entire process?
I'm subscribing to ur blog now, have so much to learn!
I ran into the problem of mass updating 15k+ object lists. My first idea was to do it on localhost and use the bulk uploader to upload the whole DB, but the bulk uploader is only available in Python...
ReplyDeleteSo what I ended up doing was creating a Task which only processes n objects in a run and then reschedule a new task to continue with the remaining objects. I chose n to be a number that is safe to be finished in 30 seconds. I also ran into the problem of seeking into the middle of the resultset, as the Query.setRange() only accepts offsets below 1000... to solve this problem, I used used a Cursor which was being passed between the tasks, so the next task could continue exactly where the previous once finished, without having to rerun the appstore query.
See AppStore documentation for Tasks and Data storage / Cursor.
Thanks for the post. Really important;
ReplyDeleteThank you!
ReplyDeleteToo bad I saw this after a few wasted hours...
Its great blog..Thanks lot..
ReplyDeletebest dissertation topics
A woman wakes up in the middle of the night in a state of crisis. Her life flashes before her eyes as she struggles to get to the bottom of her anxiety.limo Las Vegas
ReplyDeleteYour column father fasts an extend of their dally duess on buggy vias these persons reconsider sonorous censor match census irrelevant so that you desire to and gown prestige per despite basss.legal recruitment
ReplyDeleteGetting the best price on hard wood flooring in atlanta can be difficult. Because of this I thought this article may be helpful to you and your readers.best vitamins for arthritis
ReplyDeleteGreat blog! I really love how it is easy on my eyes and the information are well written. I am wondering how I might be notified whenever a new post has been made. I have subscribed to your rss feed which really should do the trick! Have a nice day!
ReplyDeleteVery interesting to read this article.I would like to thank you for the efforts you had made for writing this awesome article.e procurement
ReplyDeletethis got me thinking what other workouts are good for those of us who find ourselves on the road or have limited.Ft Lauderdale Patent Attorney
ReplyDeleteInteresting blog. This is one of my favorite blog also I want you to update more post like this. Thanks for sharing this article.send money abroad online
ReplyDeleteI would love to stop by. But, I think it might have to wait until this summer. I did not know that Serlkay had ever expanded its size. I must say that a succesful family owned business in this.buy job posting
ReplyDeleteThis can be well-articulated, sharpened, and also aim. Precisely what important lessons have you found out that an individual applied to this post. Looking forward to go through your experience. Where have you came your own bottom line?unique bow ties
ReplyDeleteYou have a very inspiring way of exploring and sharing your thoughts. restaurant hood cleaning
ReplyDeleteMy partner and i mood tutorial dapper your blog, I want to in order to necessitate some sort of troll elocution in order to tenor an individual although would like which you serviceable extension. Looking an individual the actual search associated with attraction pertaining to numerous your writing a blog surenesss.perfumes cacharel
ReplyDeleteI am having a great time today! Extremely helpful information specially the last part I care for such info a lot.I am just amazed to see this top notch article today.compare conveyancing
ReplyDeleteI don't have sufficient energy right now to completely read your site but I have bookmarked it and likewise include your RSS channels. I will be back in a day or two. much obliged concerning an incredible site.cheap money transfer
ReplyDeleteThanks for sharing. I hope it will be helpful for too many people that are searching for this topic.70 volt speakers
ReplyDeleteast our site definitely will indisputably turn out to be widely known with virtually all operating a blog families, due to its fastidious content pieces and / or review articles.janitorial cleaning services etobicoke
ReplyDeleteA case study evaluation needs one to address the company’s issue, analyze the alternate options, and propose the best answer using supporting evidence.oven cleaning
ReplyDeleteThanks for this great post, i find it very interesting and very well thought out and put together. I look forward to reading your work in the future.room scheduling software for colleges
ReplyDeletePretty good post. I just stumbled upon your blog and wanted to say that I have really enjoyed reading your blog posts. Any way I’ll be subscribing to your feed and I hope you post again soon.paris escort
ReplyDeleteThis diet is an intensive program that allows you to lose weight fast. The speed of this diet means you can do it while you are on a 3-week holiday.room signage software
ReplyDeleteNice post. I leaгn something totally new and challenging on websites I stumbleupon on a daily bɑsis.personalization t shirts
ReplyDeletethe spirit of healty living rhinoplasty dallas
ReplyDeleteThis is really awesome and i love that.. This is very unique thing you put on that post.mtn data plans
ReplyDeleteIt seems interesting. Its like you read my mind! You appear to know so much about this, like you wrote the book in it or something. I think that you could do with some pics to drive the message home a little bit, but other than that, this is great blog. A great read. I will certainly be back.mississauga mortgage agent
ReplyDeleteEverything is very open with a very clear clarification of the challenges. It was definitely informative. Your website is very useful. mortgage broker barrie
ReplyDeleteGive me sales more than 60 bottles of God.Popcorn Machines
ReplyDeleteFirst of all Good day! I would just like to give thanks to the admin who posted this one I really appreciate it thank you.family dentist in Vaughan
ReplyDeleteSuch a wonderful work has been done by you really your work is likeable to praise it is an awesome post. healthcare email database
ReplyDeleteIt is a really good submit and incredibly helpful within the fascinating line of articles.Bound Gravel
ReplyDeleteWhen My partner and i go through some sort of blog site, I'm hoping who's isn't going to let down everyone approximately this blog. What I'm saying is,Commercial snow cone machines
ReplyDeleteThank you so much for the post you do. I like your post and all you share with us is up to date and quite informative, i would like to bookmark the page so i can come here again to read you, as you have done a wonderful job.long island tree services
ReplyDeleteThanks for sharing this information with us, i always like to read these articles, cool..!tree trimming Bronx County
ReplyDeleteI am really enjoying reading your well written articles. It looks like you spend a lot of effort and time on your blog. I have bookmarked it and I am looking forward to reading new articles.Resin Drives Diy Kit
ReplyDeleteGreat post full of useful tips! My site is fairly new and I am also having a hard time getting my readers to leave comments. Analytics shows they are coming to the site but I have a feeling “nobody wants to be first”.Murcia
ReplyDeleteI really like the dear information you offer in your articles. I’m able to bookmark your site and show the kids check out up here generally. Im fairly positive theyre likely to be informed a great deal of new stuff here than anyone.ant extermination
ReplyDeleteWe providing working spaces and commercial office spaces in noida.
ReplyDeleteCoworking Spaces In Noida At WorkAndNetwork
Shared Office Space For Rent In Noida At WorkAndNetwork
Commercial space for rent At WorkAndNetwork
Coworking Space At WorkAndNetwork
Virtual office space in noida At WorkAndNetwork