
Suppose I have CouchDB documents that look like this:

{
    "_id": "id",
    "_rev": "rev",
    "title": "foobar",
    "URI": "http://www.foobar.com",
    "notes": "",
    "date": 1334177254774,
    "tags": [
        "tag1",
        "tag2",
        "tag3"
    ],
    "date_modified": 1334177278457,
    "deleted": false
}

What I want is to create an inverted index from the tags, so I end up with something like:

{
    "tag1": [
        _id,
        _id,
        _id
    ],
    "tag2": [
        _id,
        _id,
        ...
    ]
}

From what I've read and tried, CouchDB might not let me do this. I can't get it done in the map phase, and it doesn't seem possible in the reduce phase either. Is this something I need to handle in another layer of the app?

Matthew Flaschen
Geoff Moller
  • What have you tried? It seems like for the map phase, you would do a for loop over the tags, then do `emit(tag, _id)` for each one. Then, in reduce, you would combine key value pairs with the same key. I haven't tried this yet. – Matthew Flaschen Apr 21 '12 at 05:04
  • @MatthewFlaschen, you can't combine key value pairs in reduce, because the size of the result can't be predicted. Reduce functions must strictly reduce the input to a value with a small and fixed maximum size. – Marcello Nuccio Apr 22 '12 at 08:08

1 Answer


You can achieve this with a CouchDB map function alone; no reduce step is needed.

CouchDB likes tall lists, not fat lists. So, to "cut with the grain" on this problem, you want a view keyed on the tag, with one row per document ID.

// View rows (conceptual diagram)
// Key  , Value
[ "tag1", "_id1"
, "tag1", "_id2"
, "tag1", "_id3"

, "tag2", "_id2"
, "tag2", "_id4"
, "tag3", "_id6"
]
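
A map function that produces rows like these might look like the sketch below. The `emit` stand-in and the `rows` array are a hypothetical harness so the function can be run outside the database (CouchDB provides `emit` itself), and skipping documents whose `deleted` flag is true is an assumption about what that field means in the question's schema.

```javascript
// Hypothetical harness: CouchDB supplies emit() itself. Here rows are
// collected in an array so the map function can be tried locally.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function as it might be stored in a design document's view.
function map(doc) {
  // Skip documents without a tag array. Skipping soft-deleted documents
  // is an assumption about the "deleted" flag in the question.
  if (!Array.isArray(doc.tags) || doc.deleted) {
    return;
  }
  // One row per tag: the tag is the key, the document ID is the value.
  doc.tags.forEach(function (tag) {
    emit(tag, doc._id);
  });
}

map({ _id: "_id1", tags: ["tag1", "tag2"], deleted: false });
map({ _id: "_id2", tags: ["tag1"], deleted: false });
map({ _id: "_id3", tags: ["tag3"], deleted: true });
console.log(rows);
```

CouchDB sorts view rows by key, which this harness does not do. Each view row also carries the emitting document's `id`, so emitting `null` as the value would work as well.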

To get all document IDs for a tag, query the view with the tag name as the key: GET /db/_design/ddoc/_view/tags?key="tag1". The key is a JSON string, so the quotes are part of the value.
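
If the nested shape from the question is still wanted on the client, the view rows can be folded into it after the query. The following is a rough sketch, not part of CouchDB: `tagViewUrl` and `toInvertedIndex` are hypothetical helper names, and the sample response is hand-written in the shape a view query returns.

```javascript
// Build the view URL for one tag. The key is JSON-encoded (which adds the
// quotes) and then URL-escaped.
function tagViewUrl(base, tag) {
  return base + "/_design/ddoc/_view/tags?key=" +
    encodeURIComponent(JSON.stringify(tag));
}

// Fold view rows into { tag: [id, id, ...] }, the shape from the question.
function toInvertedIndex(rows) {
  const index = Object.create(null); // no inherited keys to collide with tags
  rows.forEach(function (row) {
    if (!(row.key in index)) {
      index[row.key] = [];
    }
    index[row.key].push(row.id);
  });
  return index;
}

// Hand-written sample in the shape of a view response body.
const response = {
  total_rows: 3,
  offset: 0,
  rows: [
    { id: "_id1", key: "tag1", value: "_id1" },
    { id: "_id2", key: "tag1", value: "_id2" },
    { id: "_id2", key: "tag2", value: "_id2" }
  ]
};

console.log(tagViewUrl("/db", "tag1"));
console.log(JSON.stringify(toInvertedIndex(response.rows)));
```

Querying the view without a `key` parameter returns the rows for every tag, which is the input this fold expects.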

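This should be enough. But as lagniappe, you can set the view's "reduce" value to "_count" and query with ?group=true to get the number of documents per tag (useful for building a tag cloud). Once a reduce is defined, the view returns the reduced result by default, so remember to add &reduce=false when you want the individual rows of your index.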
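
As a sketch of that setup, a design document with the built-in `_count` reduce could look roughly like this. The names `ddoc` and `tags` follow the URL used earlier; `countByKey` is a hypothetical local stand-in that mimics what a grouped query (`?group=true`) returns, so the result can be inspected without a server.

```javascript
// Possible design document. Map functions are stored as strings.
const designDoc = {
  _id: "_design/ddoc",
  views: {
    tags: {
      map: "function (doc) { if (doc.tags) { doc.tags.forEach(function (tag) { emit(tag, doc._id); }); } }",
      reduce: "_count"
    }
  }
};

// Local stand-in for a grouped _count query: one row per distinct key,
// whose value is the number of map rows with that key.
function countByKey(rows) {
  const counts = new Map();
  rows.forEach(function (row) {
    counts.set(row.key, (counts.get(row.key) || 0) + 1);
  });
  return Array.from(counts, function (entry) {
    return { key: entry[0], value: entry[1] };
  });
}

// Map rows from the conceptual diagram, already in key order.
const mapRows = [
  { key: "tag1", value: "_id1" },
  { key: "tag1", value: "_id2" },
  { key: "tag1", value: "_id3" },
  { key: "tag2", value: "_id2" },
  { key: "tag2", value: "_id4" },
  { key: "tag3", value: "_id6" }
];

console.log(JSON.stringify(countByKey(mapRows)));
```

The stand-in keeps keys in first-seen order, whereas CouchDB returns grouped rows sorted by key; the sample input is already sorted, so the two agree here.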

JasonSmith