7

In a Rails 3.2 app I have a Tag model and want to find all records where the value in the name:string field is all lowercase.

Thus the activerecord query (on Postgres) would return Tag(id: 1, name: 'test') but not Tag(id:2, name: 'Test').

I'm sure there's a straightforward way to do this but I haven't been able to produce a working query!

djoll
  • 1,139
  • 1
  • 12
  • 31
  • Thanks ROR Developer, but what I'm after is all `Tags` that have `names` that have only lowercase letters, i.e. 'test', 'banana', 'ror developer' but not 'TeSt', 'BANANA', nor 'ROR Developer'. – djoll Aug 24 '15 at 06:55
  • 2
    There are slight differences in the two answers, and you ought to be clear whether what you want is all rows for which column's value is the same as its lowercase representation, or for which the column's value has no characters other than lowercase letters. – David Aldridge Aug 24 '15 at 09:21
  • Yep, **very true** David. Either would have been acceptable, though what I was most ideally trying to do was *exclude* Tags with a name containing any uppercase character. But in the end I settled on a regex query based on Brad's answer that then did the opposite, `regex: '^[a-z0-9]+$'` finding Tags with names containing only numbers and lowercase letters, as this returned fewer non-relevant queries. Strangely not only was the 'regex way' more flexible, but in production **actually twice as fast!** – djoll Aug 24 '15 at 10:08
  • The performance may be explainable by the regexp returning a false as soon as it reaches a character not in the required set, which would be optimal when strings start with an uppercase character. No need to convert everything and then compare against the original. – David Aldridge Aug 24 '15 at 10:46
  • Great point David. Running some stats on the database shows that of the 109,756 current Tags, 99.17% start with an uppercase letter. Interesting again, that query `Tag.where('name ~ :regex', regex: '^[A-Z]').count` still takes almost 50% longer than the `'^[a-z0-9]+$'` regex & count (ie. 100 ms vs 70 ms), even with a hot postgres cache. – djoll Aug 24 '15 at 11:34

2 Answers2

11

This should work:

Tag.where('name = lower(name)')

If name equals lower(name) it means that name is lowercase.

Mischa
  • 42,876
  • 8
  • 99
  • 111
  • *Thanks Mischa!* Yours is the most elegant and precise query, though when I test performance on my production database (Heroku / Postgres) with just under 110,000 Tags, and after the database cache is warmed up it, it averages **140 ms for your query**, vs **under 70 ms for Brad's Regex answer**. This really surprised me - who would have thought have 'now you have two problems' resulted in the fastest query!? – djoll Aug 24 '15 at 10:01
4

You can use a regular expression to search for all lowercase, like:

Tag.where('name ~ :regex', regex: '^[a-z]+$')
Brad Werth
  • 17,411
  • 10
  • 63
  • 88
  • 1
    This will return incorrect results when `name` contains accented characters like e.g. `é`. – Mischa Aug 24 '15 at 07:01
  • If you need support for non-english characters, you can try the regex at http://stackoverflow.com/a/5005122/525478 – Brad Werth Aug 24 '15 at 07:03
  • This works with postgresql. if you are using MySQL: `Tag.where("name REGEXP ?", '^[a-z]+$')` – Sanad Liaquat Aug 24 '15 at 07:19
  • Thanks Brad! Yours is the fastest query on testing, though you have a typo in the active record expression, it should read `...('name ~ :regex',...)` as `~*` represents a case-**insensitive** regex match for Postgres. If you correct your answer I'll mark it accepted. Thanks! – djoll Aug 24 '15 at 09:14