1

I am trying to build a simple thesaurus app in Rails, in which a word in a table of words would be in a has-many, self-joined relationship to other words in the table, through a joiner table of synonym-pairs.

My SynonymPair class is built as follows:

class SynonymPair < ActiveRecord::Base
    belongs_to :word1, class_name: :Word
    belongs_to :word2, class_name: :Word
end

A crucial aspect of this thesaurus program is that it should not matter whether a word is in the word1 or word2 column; word1 is a synonym of word2, and vice versa.

In order for my Words class to return the SynonymPairs and Synonyms of a given word, I wrote a SQL query:

class Word < ActiveRecord::Base

def synonym_pairs

    #joins :synonym_pairs and :words where either word1_id OR word2_id matches word.id.
    sql = <<-SQL 
    SELECT synonym_pairs.id, synonym_pairs.word1_id, synonym_pairs.word2_id, words.word FROM synonym_pairs 
    JOIN words ON synonym_pairs.word1_id = words.id WHERE words.word = ? 
    UNION SELECT synonym_pairs.id, synonym_pairs.word1_id, synonym_pairs.word2_id, words.word FROM synonym_pairs 
    JOIN words ON synonym_pairs.word2_id = words.id WHERE words.word = ?
    SQL

    #returns synonym_pair objects from the result of sql query
    DB[:conn].execute(sql,self.word,self.word).map do |element|
        SynonymPair.find(element[0])
    end
end

    def synonyms
        self.synonym_pairs.map do |element|
            if element.word1 == self
                element.word2
            else
                element.word1
            end
        end
    end
end

This code works as intended. However, it does not take advantage of association models in ActiveRecord. So, I was wondering it would be possible to write a has_many :synonyms_pairs/has_many :synonyms through: :synonym-pairs custom relation query in the Words class, rather than writing out an entire SQL query, as I did above. In other words, I'm curious if it's possible to convert my SQL query into a Rails custom relations query.

Note, I tried the following custom relations query:

class Word < ActiveRecord::Base

has_many :synonym_pairs, ->(word) { where("word1_id = ? OR word2_id = ?", word.id, word.id) }
has_many :synonyms, through: :synonym_pairs

end

But, after passing a few Word/SynonymPair seeds, it returned a 'ActiveRecord:Associations:CollectionProxy' when I tried getting I called word#synonym_pairs and the following error when I called word#synonyms:

[17] pry(main)> w2 = Word.create(word: "w2")
=> #<Word:0x00007ffd522190b0 id: 7, word: "w2">
[18] pry(main)> sp1 = SynonymPair.create(word1:w1, word2:w2)
=> #<SynonymPair:0x00007ffd4fea2230 id: 6, word1_id: 6, word2_id: 7>
[19] pry(main)> w1.synonym_pairs
=> #<SynonymPair::ActiveRecord_Associations_CollectionProxy:0x3ffea7f783e4>
[20] pry(main)> w1.synonyms
ActiveRecord::HasManyThroughSourceAssociationNotFoundError: Could not find the source association(s) "synonym" or :synonyms in model SynonymPair. Try 'has_many :synonyms, :through => :synonym_pairs, :source => <name>'. Is it one of word1 or word2?

Any other ideas for getting a custom relation query, or any sort of self-join model working here?

Jeff Adler
  • 11
  • 1
  • You need to indicate that `synonyms` belong to the Word class. So `has_many :synonyms, class_name: Word, through: :synonym_pairs` – Les Nightingill Jul 06 '20 at 03:34
  • @LesNightingill that would only work if the assocation points to one single assocation on the table that it points to. – max Jul 06 '20 at 09:39

3 Answers3

0

You are probably looking for the scope ActiveRecord class method:

class SynonymPair < ActiveRecord::Base
    belongs_to :word1, class_name: :Word
    belongs_to :word2, class_name: :Word

    scope :with_word, -> (word) { where(word1: word).or(where(word2: word)) }
end

class Word < ActiveRecord::Base
  scope :synonyms_for, -> (word) do
    pairs = SynonymPair.with_word(word)
    where(id: pairs.select(:word1_id)).where.not(id: word.id).or(
    where(id: pairs.select(:word2_id)).where.not(id: word.id))
  end
   
  def synonyms
    Word.synonyms_for(self)
  end
end
eikes
  • 4,811
  • 2
  • 31
  • 31
  • 1
    The problem with scopes is that they are not relations and cannot be eager loaded. This can lead to some pretty serious performance problems down the line. – max Jul 06 '20 at 09:16
0

Instead of a table of synonym pairs you can just create a standard M2M join table:

class Word
  has_many :synonymities
  has_many :synonyms, though: :synonymities
end
class Synonymity 
  belongs_to :word
  belongs_to :synonym, class_name: 'Word'
end
class CreateSynonymities < ActiveRecord::Migration[6.0]
  def change
    create_table :synonymities do |t|
      t.belongs_to :word, null: false, foreign_key: true
      t.belongs_to :synonym, null: false, foreign_key: { to_table: :words }
    end
  end
end

While this solution would require twice as many rows in the join table it might be well worth the tradeoff as dealing with relations where the foreign keys are not fixed is a nightmare in ActiveRecord. This just works.

AR does not really let you provide the join sql when using .eager_load and .includes and loading records with a custom query and getting AR to make sense if the results and treat the associations as loaded to avoid n+1 query issues can be extremely hacky and time consuming. Sometimes you just have to build your schema around AR rather then trying to beat it into submission.

You would setup a synonym relationship between two words with:

happy = Word.create!(text: 'Happy')
jolly = Word.create!(text: 'Jolly')
# wrapping this in a single transaction is slightly faster then two transactions
Synonymity.transaction do
  happy.synonyms << jolly
  jolly.synonyms << happy
end
irb(main):019:0> happy.synonyms
  Word Load (0.3ms)  SELECT "words".* FROM "words" INNER JOIN "synonymities" ON "words"."id" = "synonymities"."synomym_id" WHERE "synonymities"."word_id" = $1 LIMIT $2  [["word_id", 1], ["LIMIT", 11]]
=> #<ActiveRecord::Associations::CollectionProxy [#<Word id: 2, text: "Jolly", created_at: "2020-07-06 09:00:43", updated_at: "2020-07-06 09:00:43">]>
irb(main):020:0> jolly.synonyms
  Word Load (0.3ms)  SELECT "words".* FROM "words" INNER JOIN "synonymities" ON "words"."id" = "synonymities"."synomym_id" WHERE "synonymities"."word_id" = $1 LIMIT $2  [["word_id", 2], ["LIMIT", 11]]
=> #<ActiveRecord::Associations::CollectionProxy [#<Word id: 1, text: "Happy", created_at: "2020-07-06 09:00:32", updated_at: "2020-07-06 09:00:32">]>
max
  • 96,212
  • 14
  • 104
  • 165
  • The SynonymPair table already is the M2M join table, why create another one? – eikes Jul 06 '20 at 09:23
  • @eikes I changed the naming and stuff to make it more apparent what is going on here. The table is not really being queried as a table of pairs. Rather it has two rows for each pair (one for each direction). Its thus a very different solution. – max Jul 06 '20 at 09:28
  • You would need to add each pair twice then. Not optimal. This can lead to some pretty serious performance problems down the line. The author specifically said: "A crucial aspect of this thesaurus program is that it should not matter whether a word is in the word1 or word2 column; word1 is a synonym of word2, and vice versa." – eikes Jul 06 '20 at 09:42
  • @eikes as written in the answer its a compromise. But the lack of eager loading with the alternatives will be a much more immediate performance problem. – max Jul 06 '20 at 09:50
0

If you really want to setup associations where the record can be in either column on the join table you need one has_many association and one indirect association for each potential foreign key.

Bear with me here as this gets really crazy:

class Word < ActiveRecord::Base
  has_many :synonym_pairs_as_word_1, 
   class_name: 'SynonymPair',
   foreign_key: 'word_1'

  has_many :synonym_pairs_as_word_2, 
   class_name: 'SynonymPair',
   foreign_key: 'word_2'

  has_many :word_1_synonyms, 
   through: :synonym_pairs_as_word_1,
   class_name: 'Word', 
   source: :word_2

  has_many :word_2_synonyms, 
   through: :synonym_pairs_as_word_2,
   class_name: 'Word',
   source: :word_1

  def synonyms
    self.class.where(id: word_1_synonyms).or(id: word_2_synonyms)    
  end
end

Since synonyms here still is not really an association you still have a potential n+1 query issue if you are loading a list of words and their synonyms.

While you can eager load word_1_synonyms and word_2_synonyms and combine them (by casting into arrays) this poses a problem if you need to order the records.

max
  • 96,212
  • 14
  • 104
  • 165