-1

I have an articles array, and each of its elements has source and score attributes. I can select the article with the highest score for each source like this:

articles = articles.sort_by(&:score).reverse.uniq(&:source)

What if I want to keep the first three elements for each source? uniq keeps only the first.

Here is the desired behavior: a hypothetical uniq(n) that keeps the first n elements for each source:

  # To keep the example simple,
  # each element is an array of [source, score].
  b = [["source1","10"], ["source2","9"], ["source3","8"], ["source1","7"], ["source1","9"], ["source2","8"]]
  # The result should contain ["source1","10"] and ["source1","9"],
  # because they are the first 2 elements for `source1`.
  # (Pseudocode: `uniq` does not actually accept a count.)
  b.sort(&:second).uniq(2) { |s| s.first }
  # => [["source1","10"], ["source2","9"], ["source3","8"], ["source1","9"], ["source2","8"]]
ZK Zhao
  • Your question is not clear. You should either spell it out better or give examples. – sawa Aug 02 '15 at 09:34
  • @sawa, just updated, not sure if this is clear though? – ZK Zhao Aug 02 '15 at 09:52
  • What do you mean by first n uniq elements. Please keep it clear. You could say first n elements or all uniq elements or first element but not first n uniq elements. – Prabhakar Aug 02 '15 at 10:03
  • @Packer, well, first n element by `source`. – ZK Zhao Aug 02 '15 at 10:06
  • `uniq` removes duplicates; it does not necessarily return a single element – Wand Maker Aug 02 '15 at 10:27
  • @cqcn1991 the question is still not clear. When you say "first n elements by `source`", do you want to have a hash with keys being source and values being an array of first n elements by score? – AmitA Aug 02 '15 at 10:32
  • @AmitA, yeah, I feel the same. Let me try to put it another way: `uniq(&:source)` only returns the first element (by `source`), but I want more. How can I return the first n elements, not only the first? – ZK Zhao Aug 02 '15 at 10:36
  • The example and the code that works on it do not run – Wand Maker Aug 02 '15 at 10:42
  • @WandMaker, I'm sorry, I just made up this example, because there is no `uniq(n)` method at all. I put here only to show the desired outcome. – ZK Zhao Aug 02 '15 at 10:45
  • I finally get it. In that case, @sawa's answer is the way to go. – AmitA Aug 02 '15 at 11:11

3 Answers

2

Not sure, but the following may be what you want. I assume the order of the elements is insignificant.

articles = articles
.group_by(&:source)
.values
.flat_map{|a| a.sort_by(&:score).last(2)}

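If the order is significant, keep the original array and intersect it with the above result using & (original & result); the intersection keeps the order of the left operand.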
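To make the above concrete, here is a minimal sketch on the question's sample data. The `Article` struct and the `top_n_per_source` helper are hypothetical names made up for illustration; the asker's real objects only need to respond to `source` and `score`.

```ruby
# Hypothetical stand-in for the asker's article objects.
Article = Struct.new(:source, :score)

articles = [
  Article.new("source1", 10), Article.new("source2", 9),
  Article.new("source3", 8),  Article.new("source1", 7),
  Article.new("source1", 9),  Article.new("source2", 8)
]

# Hypothetical helper: keep the n highest-scoring articles of each source.
def top_n_per_source(articles, n)
  articles
    .group_by(&:source)                              # { source => [articles] }
    .values                                          # drop the keys
    .flat_map { |group| group.sort_by(&:score).last(n) }
end

top = top_n_per_source(articles, 2)

# Intersecting with the original array restores the original order.
ordered = articles & top
```

Note that `&` also drops duplicates, so two articles with the same source and score would collapse into one; if that matters, something like `articles.select { |a| top.include?(a) }` may be a safer choice.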

sawa
  • Yeah, I was also thinking about `group_by`, but didn't know how to put them back into 1 group after splitting them by source. – ZK Zhao Aug 02 '15 at 10:00
  • @AmitA Your code will end up with a hash (which is not an array), and I don't know what you are trying to do, but the OP's problem is solved by my applying `values` and `flat_map`. – sawa Aug 02 '15 at 10:11
  • @sawa, yes you are right. Thought that what the OP wanted when he answered Packer with "first n elements by source" and "put them back into 1 group after splitting them", but removing my comment. – AmitA Aug 02 '15 at 10:15
1

If you want the three sources with the highest scores, one element per source, you can do the following (if you want something else, apologies for not fully understanding your question).

First, make sure the scores are integers, not strings:

b = [["source1",10], ["source2",9], ["source3",8], ["source1",7], ["source1",9], ["source2",8]]

Then do this:

b.sort_by(&:second).reverse!.uniq(&:first).first(3)

(Use #reverse! for fastest results, per this)
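`Array#second` comes from ActiveSupport, so the one-liner above assumes Rails (or ActiveSupport) is loaded. A plain-Ruby sketch of the same idea, using block destructuring instead of `second`, might look like this (the variable name `best_per_source` is just for illustration):

```ruby
b = [["source1", 10], ["source2", 9], ["source3", 8],
     ["source1", 7], ["source1", 9], ["source2", 8]]

# Sort by score, highest first; keep the first (best) row of each source;
# then take the first three sources.
best_per_source = b.sort_by { |_source, score| -score }
                   .uniq { |source, _score| source }
                   .first(3)

p best_per_source
# => [["source1", 10], ["source2", 9], ["source3", 8]]
```

This keeps one element per source, so it answers "top three sources", not "top n elements of each source".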

AmitA
0

Your example array b is an array of arrays; its elements have no methods called score or source. Given that limitation, the following is the closest answer I can come up with. Also, for sorting to work correctly, the second element should be an integer, so we convert it by calling to_i inside map.

class Array
    def second 
        self[1]
    end
end

articles = [["source1","10"], ["source2","9"],
            ["source3","8"], ["source1","7"], 
            ["source1","9"], ["source2","8"], 
            ["source1", "100"]]

p articles.map{|a| [a.first, a.second.to_i]}
       .sort_by(&:second).reverse.uniq(&:first)

# To get first n elements, add first(n)
p articles.map{|a| [a.first, a.second.to_i]}
       .sort_by(&:second).reverse.uniq(&:first).first(2)

Output

[["source1", 100], ["source2", 9], ["source3", 8]]
[["source1", 100], ["source2", 9]]
Wand Maker
  • Since you are calling `uniq` first, there are cases where elements with higher scores will be removed right before sorting. I mean it won't give the right answer if elements with smaller scores for a source are placed before the higher score ones for that source. – limekin Aug 02 '15 at 10:50
  • @limekin Thanks, fixed it. – Wand Maker Aug 02 '15 at 11:33