0

I have a table in a database in rethinkdb with 5.8 million records, a sample of which is provided below:

{
"release": {
    "genres": {
        "genre": "Electronic"
    },
    "identifiers": {
        "identifier": [
            {
                "description": "A-Side",
                "value": "MPO SK 032 A1 G PHRUPMASTERGENERAL T27 LONDON",
                "type": "Matrix / Runout"
            },
            {
                "description": "B-Side",
                "value": "MPO SK 032 B1",
                "type": "Matrix / Runout"
            },
            {
                "description": "C-Side",
                "value": "MPO SK 032 C1",
                "type": "Matrix / Runout"
            },
            {
                "description": "D-Side",
                "value": "MPO SK 032 D1",
                "type": "Matrix / Runout"
            }
        ]
    },
    "status": "Accepted",
    "videos": {
        "video": [
            {
                "title": "The Persuader (Jesper Dahlbäck) - Östermalm",
                "duration": 290,
                "description": "The Persuader (Jesper Dahlbäck) - Östermalm",
                "src": "http://www.youtube.com/watch?v=AHuQWcylaU4",
                "embed": true
            },
            {
                "title": "The Persuader - Vasastaden",
                "duration": 380,
                "description": "The Persuader - Vasastaden",
                "src": "http://www.youtube.com/watch?v=5rA8CTKKEP4",
                "embed": true
            },
            {
                "title": "The Persuader-Stockholm-Sodermalm",
                "duration": 335,
                "description": "The Persuader-Stockholm-Sodermalm",
                "src": "http://www.youtube.com/watch?v=QVdDhOnoR8k",
                "embed": true
            },
            {
                "title": "The Persuader - Norrmalm",
                "duration": 289,
                "description": "The Persuader - Norrmalm",
                "src": "http://www.youtube.com/watch?v=hy47qgyJeG0",
                "embed": true
            }
        ]
    },
    "labels": {
        "label": {
            "catno": "SK032",
            "name": "Svek"
        }
    },
    "companies": {
        "company": [
            {
                "id": 271046,
                "catno": "",
                "name": "The Globe Studios",
                "entity_type_name": "Recorded At",
                "resource_url": "http://api.discogs.com/labels/271046",
                "entity_type": 23
            },
            {
                "id": 56025,
                "catno": "",
                "name": "MPO",
                "entity_type_name": "Pressed By",
                "resource_url": "http://api.discogs.com/labels/56025",
                "entity_type": 17
            }
        ]
    },
    "styles": {
        "style": "Deep House"
    },
    "formats": {
        "format": {
            "text": "",
            "name": "Vinyl",
            "qty": 2,
            "descriptions": {
                "description": [
                    "12\"",
                    "33 ⅓ RPM"
                ]
            }
        }
    },
    "country": "Sweden",
    "id": 1,
    "released": "1999-03-00",
    "artists": {
        "artist": {
            "id": 1,
            "anv": "",
            "name": "Persuader, The",
            "role": "",
            "tracks": "",
            "join": ""
        }
    },
    "title": "Stockholm",
    "master_id": 5427,
    "tracklist": {
        "track": [
            {
                "position": "A",
                "duration": "4:45",
                "title": "Östermalm"
            },
            {
                "position": "B1",
                "duration": "6:11",
                "title": "Vasastaden"
            },
            {
                "position": "B2",
                "duration": "2:49",
                "title": "Kungsholmen"
            },
            {
                "position": "C1",
                "duration": "5:38",
                "title": "Södermalm"
            },
            {
                "position": "C2",
                "duration": "4:52",
                "title": "Norrmalm"
            },
            {
                "position": "D",
                "duration": "5:16",
                "title": "Gamla Stan"
            }
        ]
    },
    "data_quality": "Complete and Correct",
    "extraartists": {
        "artist": {
            "id": 239,
            "anv": "",
            "name": "Jesper Dahlbäck",
            "role": "Music By [All Tracks By]",
            "tracks": "",
            "join": ""
        }
    },
    "notes": "The song titles are the names of Stockholm's districts."
}}

I am trying to retrieve release titles where the artist name is "Persuader, The"

I used the following command:

r.db("discogs").table("releases").getField("release").filter(r.row("artists").getField("artist").getField("name").eq("Persuader, The")).getField("title")

It takes too long. However, it works quickly on smaller databases. How do I speed it up?

1 Answers1

1

Currently, you have no index in your query, which means the database has to go through every single document filtering out by the artist.

You can create an index for the nested artist name property using the indexCreate command:

r
  .db("discogs")
  .table("releases")
  .indexCreate('artistName', r.row('release')('artists')('artist')('name'));

After that, you can just get all rows with a specific artist names by using the getAll command.

r
  .db("discogs")
  .table("releases")
  .getAll("Persuader, The", { index: 'artistName' });

This is not only cleaner, but much faster.

Jorge Silva
  • 4,574
  • 1
  • 23
  • 42
  • 1
    I am getting the following error :( RqlRuntimeError: Index `artistName` on table `discogs.releases` was accessed before its construction was finished in: r.db("discogs").table("releases").getAll("Persuader, The", {index: "artistName"}) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ –  Jun 06 '15 at 06:06
  • 1
    I tried indexWait(). But it is taking too long. Also how do I use them multiple columns? –  Jun 06 '15 at 16:03
  • 1
    You only have to create the index once. When you create the index, it has to index every single document so, yeah, it'll take a while. After you created the index, you don't need to create it again... – Jorge Silva Jun 06 '15 at 17:43
  • 2
    If you want to create an index for multiple columns, you can use an index function (see here: http://rethinkdb.com/api/javascript/index_create/) – Jorge Silva Jun 06 '15 at 17:45