0

If I wanted to sum foobar.relationships.friend.count across the dated keys, how would I use map/reduce against this document structure so that the result equals 22?

[
    [0] {
              "rank" => nil,
        "profile_id" => 3,
          "20130913" => {
            "foobar" => {
                    "relationships" => {
                      "acquaintance" => {
                        "count" => 0
                    },
                    "friend" => {
                          "males_count" => 0,
                                  "ids" => [],
                        "females_count" => 0,
                                "count" => 10
                    }
                }
            }
        },
          "20130912" => {
            "foobar" => {
                    "relationships" => {
                      "acquaintance" => {
                        "count" => 0
                    },
                    "friend" => {
                          "males_count" => 0,
                                  "ids" => [
                            [0] 77,
                            [1] 78,
                            [2] 79
                        ],
                        "females_count" => 0,
                                "count" => 12
                    }
                }
            }
        }
    }
]
Christian Fazzini

2 Answers

1

In JavaScript, this query gets you the result you expect:

r.db('test').table('test').get(3).do( function(doc) {
  return doc.keys().map(function(key) {
    return r.branch(
      doc(key).typeOf().eq('OBJECT'),
      doc(key)("foobar")("relationships")("friend")("count").default(0),
      0
    )
  }).reduce( function(left, right) {
    return left.add(right)
  })
})

In Ruby, the equivalent should be:

r.db('test').table('test').get(3).do{ |doc|
  doc.keys().map{ |key| 
    r.branch(
      doc.get_field(key).type_of().eq('OBJECT'),
      doc.get_field(key)["foobar"]["relationships"]["friend"]["count"].default(0),
      0
    )
  }.reduce{ |left, right|
    left+right
  }
}
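
For readers without a RethinkDB server at hand, the same map/reduce logic can be sketched in plain Ruby against the document from the question. This is an illustration only, not ReQL: `doc` is a Hash copied by hand from the question, and `counts` and `total` are names chosen for this sketch.

```ruby
# Plain-Ruby sketch of the query's logic (no RethinkDB involved).
# `doc` is copied by hand from the question.
doc = {
  "rank" => nil,
  "profile_id" => 3,
  "20130913" => {
    "foobar" => {
      "relationships" => {
        "acquaintance" => { "count" => 0 },
        "friend" => { "males_count" => 0, "ids" => [], "females_count" => 0, "count" => 10 }
      }
    }
  },
  "20130912" => {
    "foobar" => {
      "relationships" => {
        "acquaintance" => { "count" => 0 },
        "friend" => { "males_count" => 0, "ids" => [77, 78, 79], "females_count" => 0, "count" => 12 }
      }
    }
  }
}

# Map: one number per top-level key. Non-object values ("rank", "profile_id")
# contribute 0, mirroring the r.branch on the value's type.
counts = doc.keys.map do |key|
  value = doc[key]
  if value.is_a?(Hash)
    # dig returns nil when a nested field is missing; fall back to 0, like .default(0)
    value.dig("foobar", "relationships", "friend", "count") || 0
  else
    0
  end
end

# Reduce: add the mapped values pairwise.
total = counts.reduce { |left, right| left + right }
puts total # prints 22
```

The type check plays the same role as `r.branch` in the query: it keeps scalar fields from being treated as nested documents.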

I also think the schema you are using is not well suited to this kind of query. It would be better to use something like:

{
  rank: null,
  profile_id: 3,
  people: [
    {
      id: 20130913,
      foobar: { ... }
    },
    {
      id: 20130912,
      foobar: { ... }
    }
  ]
}
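
With an array like this, the sum no longer has to inspect key names or value types. A minimal plain-Ruby sketch, assuming each `foobar` keeps the nested shape from the question (the schema above elides it) and reusing the `people` and `id` field names:

```ruby
# Hypothetical document in the proposed shape; the counts are taken from the question.
doc = {
  "rank" => nil,
  "profile_id" => 3,
  "people" => [
    { "id" => 20130913, "foobar" => { "relationships" => { "friend" => { "count" => 10 } } } },
    { "id" => 20130912, "foobar" => { "relationships" => { "friend" => { "count" => 12 } } } }
  ]
}

# Map each array entry to its friend count (0 if the field is missing),
# then reduce by addition, starting from 0 so an empty array also works.
friend_counts = doc["people"].map do |entry|
  entry.dig("foobar", "relationships", "friend", "count") || 0
end
total = friend_counts.reduce(0) { |left, right| left + right }
puts total # prints 22
```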

Edit: A simpler way that avoids r.branch is to drop the fields that are not objects first, using the without command.

Example:

r.db('test').table('test').get(3).without('rank', 'profile_id').do{ |doc|
  doc.keys().map{ |key| 
    doc.get_field(key)["foobar"]["relationships"]["friend"]["count"].default(0)
  }.reduce{ |left, right|
    left+right
  }
}.run
neumino
  • Your map reduce method doesn't work. On the last suggestion, you have an extra `}` and a `,`. I get `ArgumentError: [] cannot handle var_14 of type RethinkDB::RQL` – Christian Fazzini Sep 13 '13 at 10:12
  • Err, there's a bug in the ruby driver -- See https://github.com/rethinkdb/rethinkdb/issues/1432 to track progress. – neumino Sep 13 '13 at 17:55
  • Just updated the queries. Using `.get_field()` instead of `[]` should work now. – neumino Sep 14 '13 at 00:57
-1

I think you will need your own input reader. This site has a tutorial on how it can be done: http://bigdatacircus.com/2012/08/01/wordcount-with-custom-record-reader-of-textinputformat/

Then you run MapReduce with a mapper:

Mapper<LongWritable, ClassRepresentingMyRecords, Text, IntWritable>

In your map function, you extract the value for count and emit it as the value. I'm not sure whether you need a key.

In the reducer you add together all the elements with the same key (='count' in your case).
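
The mapper/reducer split described here can be simulated in a few lines of plain Ruby. This is only a sketch of the data flow, not Hadoop code: the `records` array, the `map_record` and `reduce_values` names, and the single shared key "count" are assumptions made for illustration.

```ruby
# Each record stands in for one parsed dated entry; the values come from the question.
records = [
  { "date" => "20130913", "friend_count" => 10 },
  { "date" => "20130912", "friend_count" => 12 }
]

# Mapper: emit one [key, value] pair per record, always under the key "count".
def map_record(record)
  ["count", record["friend_count"]]
end

# Reducer: add together all values that share a key.
def reduce_values(values)
  values.sum
end

pairs = records.map { |record| map_record(record) }

# Shuffle: group the emitted values by key, as a framework would between the two phases.
grouped = pairs.group_by(&:first).transform_values { |kvs| kvs.map(&:last) }

totals = grouped.transform_values { |values| reduce_values(values) }
puts totals["count"] # prints 22
```

Emitting every value under one constant key is what lets a single reducer call see all the counts at once.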

This should get you on your way I think.

DDW