
Can someone briefly explain the difference in use between the methods uniq and distinct?

I've seen both used in similar contexts, but the difference isn't quite clear to me.

Matthias

4 Answers


Rails queries act like arrays, so .uniq produces the same result as .distinct, but

  • .distinct is an SQL query method
  • .uniq is an array method

Note: Relation#uniq is deprecated as of Rails 5.0; use Relation#distinct instead. See http://edgeguides.rubyonrails.org/5_0_release_notes.html#active-record-deprecations

Hint:

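Once a relation has already been loaded (for example after .includes), choosing between .uniq and .distinct can speed up or slow down your app, because

  • uniq won't issue an additional SQL query; it works on the loaded records
  • distinct will issue a new one

But both results will be the same.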

Example:

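users = User.includes(:posts)
puts users
# Loads the records: the queries for includes run here

users.uniq
# No SQL query! De-duplicates the loaded records in memory (here you speed up your app)
# (applies where uniq is delegated to Array, i.e. Rails 5.1+)
users.distinct.to_a
# A second, SELECT DISTINCT query runs once the new relation is evaluated (here you slow down your app)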

This can be useful for building a performant application.

Hint:

The same applies to:

  • .size vs .count
  • .present? vs .exists?
  • .map vs .pluck
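
One rough way to picture the pairs above is to model a relation as a lazy object that logs every query it would send. The sketch below is plain Ruby, not ActiveRecord: `ToyRelation`, its `ROWS` data, and the logged SQL strings are all hypothetical stand-ins, and it only illustrates the in-memory vs. query-side split for `uniq`/`distinct` and `size`/`count`.

```ruby
# Toy stand-in for a lazy relation (hypothetical; this is NOT ActiveRecord).
# Every simulated database round trip is appended to ToyRelation.query_log.
class ToyRelation
  ROWS = ["ann", "bob", "ann"].freeze   # pretend table containing a duplicate

  def self.query_log
    @query_log ||= []
  end

  def initialize(distinct: false)
    @distinct = distinct
    @records = nil                      # nil means "not loaded yet"
  end

  # Query-side method: returns a NEW, unloaded relation.
  def distinct
    ToyRelation.new(distinct: true)
  end

  # Runs the "query" once and caches the rows, like a loaded relation.
  def to_a
    @records ||= begin
      self.class.query_log << (@distinct ? "SELECT DISTINCT name" : "SELECT name")
      @distinct ? ROWS.uniq : ROWS.dup
    end
  end

  # Array-side method: de-duplicates the cached rows in memory.
  def uniq
    to_a.uniq
  end

  # Uses the cached rows when loaded, otherwise falls back to a count query.
  def size
    @records ? @records.size : count
  end

  # Query-side method: always goes back to the "database".
  def count
    self.class.query_log << "SELECT COUNT(*)"
    ROWS.size
  end
end

users = ToyRelation.new
users.to_a            # first query; rows are now cached
users.uniq            # in memory, no new query
users.size            # in memory, no new query
users.distinct.to_a   # second query
users.count           # third query
```

After these calls the log holds three entries, one for each call marked as a query in the comments; the `uniq` and `size` calls add nothing to it.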
Abel
itsnikolay
  • Thanks for this answer. I have a question: doesn't it depend on the number of rows returned by the DB? If you have a massive number of results rather than a small number of rows, the DB might perform better than Ruby? – GLaDOS Aug 05 '19 at 17:27
  • @GLaDOS yes. *A Rails app is like a building*: SQL is on the first floor, business logic is on the floors above it, and the view for users is on the top floor. 1) If the user doesn't need some data, we shouldn't lift it from the first floor to the top. That means not fetching it at the SQL layer in the first place, rather than dropping it in Rails (so use `.limit(25)` instead of `.first(25)`). 2) If we missed some data on the first floor, it is inefficient to run back down to the first floor to grab it from SQL, so in that case use `.includes(:comments)`. Etc. – itsnikolay Aug 06 '19 at 11:40

Rails 5.1 removed the uniq method from ActiveRecord::Relation in favor of the distinct method.

  • If you call uniq on a query, it just converts the ActiveRecord::Relation to an Array.
  • You cannot continue the query chain after uniq (i.e. you cannot do User.active.uniq.subscribed; it raises an "undefined method subscribed for Array" error).
  • If your DB is large and you only want to fetch the distinct entries you need, it is better to use the distinct method on the ActiveRecord::Relation query.
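
To make the chaining point concrete, here is a minimal plain-Ruby sketch. `ToyScope` and its `active` and `subscribed` scopes are hypothetical stand-ins for a relation, not ActiveRecord itself; the idea is only that query-side methods return another chainable object, while `uniq` hands back a plain Array.

```ruby
# Hypothetical stand-in for a chainable relation (this is NOT ActiveRecord).
class ToyScope
  attr_reader :clauses

  def initialize(rows, clauses = [])
    @rows = rows
    @clauses = clauses
  end

  # Query-side methods return another ToyScope, so the chain can continue.
  def active
    ToyScope.new(@rows, @clauses + ["active"])
  end

  def subscribed
    ToyScope.new(@rows, @clauses + ["subscribed"])
  end

  def distinct
    ToyScope.new(@rows, @clauses + ["distinct"])
  end

  # Array-side method returns a plain Array, which ends the chain.
  def uniq
    @rows.uniq
  end
end

users = ToyScope.new(["ann", "bob", "ann"])

# distinct keeps the chain alive: the result is still a ToyScope.
chained = users.active.distinct.subscribed

# uniq returns an Array, so the next scope call raises NoMethodError.
begin
  users.active.uniq.subscribed
rescue NoMethodError => e
  error_message = e.message
end
```

In this sketch `chained.clauses` records all three steps, while `error_message` names the missing `subscribed` method on the Array.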
komaldhanwani

From the documentation:

uniq(value = true)

Alias for ActiveRecord::QueryMethods#distinct

Community
Roman Kiselenko
  • apidock.com isn't Rails documentation, and it's no longer maintained. A better source is https://api.rubyonrails.org – snowangel May 04 '19 at 21:29

This doesn't exactly answer your question, but here is what I know:

In the ActiveRecord context (before Rails 5.1), uniq is just an alias for distinct. Both remove duplicates from the query result set (one level deep, so to speak).

In the array context, uniq is powerful enough to remove duplicates even when the elements are nested. For example:

arr = [["first"], ["second"], ["first"]]

and if we do

arr.uniq

the answer will be: [["first"], ["second"]]

So even if the elements are themselves arrays, uniq compares their contents in depth and removes the duplicates.
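
As a small illustration of the array behaviour described above: `Array#uniq` compares elements using their `hash` and `eql?` methods, so nested arrays with equal contents count as duplicates. The block form shown last is standard Ruby as well; the sample data is made up.

```ruby
# Nested arrays with equal contents are treated as duplicates.
arr = [["first"], ["second"], ["first"]]
deduped = arr.uniq

# The comparison is by value at every nesting depth.
deep = [[1, [2, 3]], [1, [2, 3]], [1, [2, 4]]]
deep_deduped = deep.uniq

# With a block, uniqueness is decided by the block's return value,
# and the first element seen for each key is kept.
names = ["Ann", "ann", "Bob"]
case_insensitive = names.uniq { |name| name.downcase }
```

Note that `uniq` returns a new array and leaves `arr` unchanged; `uniq!` is the in-place variant.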

Hope it helps you in some ways.

Mayur Shah
Tushar H