0

I have successfully imported some JSON data into cloudant, the JSON data has three levels. Then created the dashdb warehouse from cloudant to put the data into relational tables. It appears that dashdb has created three tables for each of the levels in the JSON data but has not provided me with a Key to join back to the top level. Is there a customisation that is done somewhere that tells dashdb how to join the tables. A sample JSON doc is below:

 {
  "_id": "579b56388aa56fd03a4fd0a9",
  "_rev": "1-698183d4326352785f213b823749b9f8",
  "v": 0,
  "startTime": "2016-07-29T12:48:04.204Z",
  "endTime": "2016-07-29T13:11:48.962Z",
  "userId": "Ranger1",
  "uuid": "497568578283117a",
  "modes": [
    {
      "startTime": "2016-07-29T12:54:22.565Z",
      "endTime": "2016-07-29T12:54:49.894Z",
      "name": "bicycle",
      "_id": "579b56388aa56fd03a4fd0b1",
      "locations": []
    },
    {
      "startTime": "2016-07-29T12:48:02.477Z",
      "endTime": "2016-07-29T12:53:28.503Z",
      "name": "walk",
      "_id": "579b56388aa56fd03a4fd0ad",
      "locations": [
        {
          "at": "2016-07-29T12:49:05.716Z",
          "_id": "579b56388aa56fd03a4fd0b0",
          "location": {
            "coords": {
              "latitude": -34.0418308,
              "longitude": 18.3503616,
              "accuracy": 37.5,
              "speed": 0,
              "heading": 0,
              "altitude": 0
            },
            "battery": {
              "is_charging": true,
              "level": 0.7799999713897705
            }
          }
        },
        {
          "at": "2016-07-29T12:49:48.488Z",
          "_id": "579b56388aa56fd03a4fd0af",
          "location": {
            "coords": {
              "latitude": -34.0418718,
              "longitude": 18.3503895,
              "accuracy": 33,
              "speed": 0,
              "heading": 0,
              "altitude": 0
            },
            "battery": {
              "is_charging": true,
              "level": 0.7799999713897705
            }
          }
        },
        {
          "at": "2016-07-29T12:50:20.760Z",
          "_id": "579b56388aa56fd03a4fd0ae",
          "location": {
            "coords": {
              "latitude": -34.0418788,
              "longitude": 18.3503887,
              "accuracy": 33,
              "speed": 0,
              "heading": 0,
              "altitude": 0
            },
            "battery": {
              "is_charging": true,
              "level": 0.7799999713897705
            }
          }
        }
      ]
    },
    {
      "startTime": "2016-07-29T12:53:37.137Z",
      "endTime": "2016-07-29T12:54:18.505Z",
      "name": "carshare",
      "_id": "579b56388aa56fd03a4fd0ac",
      "locations": []
    },
    {
      "startTime": "2016-07-29T12:54:54.112Z",
      "endTime": "2016-07-29T13:11:47.818Z",
      "name": "bus",
      "_id": "579b56388aa56fd03a4fd0aa",
      "locations": [
        {
          "at": "2016-07-29T13:00:08.039Z",
          "_id": "579b56388aa56fd03a4fd0ab",
          "location": {
            "coords": {
              "latitude": -34.0418319,
              "longitude": 18.3503623,
              "accuracy": 36,
              "speed": 0,
              "heading": 0,
              "altitude": 0
            },
            "battery": {
              "is_charging": false,
              "level": 0.800000011920929
            }
          }
        }
      ]
    }
  ]
}

SQL for the three tables created in dashdb showing all the fields in each table is here. Note there is no FK that I can see, the "_ID" fields are unique to each table.

SELECT ENDTIME,STARTTIME,USERID,UUID,V,"_ID","_REV" 
FROM <schemaname>.RANGER_DATA
where "_ID" = '579b56388aa56fd03a4fd0a9'

SELECT ARRAY_INDEX,ENDTIME,NAME,STARTTIME,TOTALPAUSEDMS,"_ID"
FROM <schemaname>.RANGER_DATA_MODES
where "_ID" = '579b56388aa56fd03a4fd0b1'

SELECT ARRAY_INDEX,AT,LOCATION_BATTERY_IS_CHARGING,LOCATION_BATTERY_LEVEL,LOCATION_COORDS_ACCURACY,LOCATION_COORDS_ALTITUDE,LOCATION_COORDS_HEADING,LOCATION_COORDS_LATITUDE,LOCATION_COORDS_LONGITUDE,LOCATION_COORDS_SPEED,RANGER_DATA_MODES,"_ID" 
FROM <schemaname>.RANGER_DATA_MODES_LOCATIONS
where "_ID" = '579b56388aa56fd03a4fd0b0'
binway
  • 88
  • 1
  • 11
  • The warehousing task should be creating keys for you automatically. Can you share an example of the JSON and the respective tables that are getting generated? – Ed Shee Sep 14 '16 at 12:13
  • Cloudant uses _id as it's UID for each document. My guess is that the warehousing task is detecting the _id in modes and locations and assuming they are new documents. If you try renaming them to, say, ID does that work? – Ed Shee Sep 15 '16 at 11:22
  • That appeared to be the issue, have replaced all the _id with just id. Now the dashdb tables look better. – binway Sep 16 '16 at 05:54
  • I'll provide it as an answer. Would you mind accepting it? – Ed Shee Sep 16 '16 at 07:56

1 Answers1

0

Cloudant uses _id for its UID for each document. It seems that the warehousing task iterates over these documents and assumes that there is a new document every time it sees a new _id.

Because you're using _id in your modes and locations this will produce an undesired result in the SQL DB.

Renaming your _id in modes and locations to something else should fix the problem.

Ed Shee
  • 931
  • 1
  • 7
  • 22