
I'm trying to get started with Amazon CloudSearch. I have my data in a DynamoDB table that I want to search. I was able to set up the CloudSearch domain; it pulled the fields from the table and let me configure them, etc. However, when I went to upload the data I ran into issues. I pointed it at the DynamoDB table, it did its data pull and reported the items it found, but when I pressed import it gave me errors about needing at least one field and needing a non-null id.

I downloaded the list of documents it is trying to upload, and I see this (or something similar) repeated frequently throughout:

{
  "type": "add",
  "id": "null",
  "fields": {}
},
{
  "type": "add",
  "id": "null",
  "fields": {
    "libraries": "721409e7-5fca-495d-a625-a5bc5f4a0434~d8ccd611-ae75-418b-91a3-13dd57d46934",
    "shadecolor": "170-98-104",
    "objecttype": "20",
    "timestamp": "2017-01-25T22:43:53.672Z"
  }
},

Obviously I can see that one of these documents has a null id and no fields at all, and the other has fields but still a null id, which seems to be the issue. However, I have no idea where this data is coming from or why it's being generated. The ID is my DynamoDB hash key, so it can't be null or Dynamo wouldn't accept it. I have looked through my DynamoDB data and I can't find anything like that.

Can someone tell me what may be happening here, and the best way to deal with it so I can get my data into CloudSearch?

sfaust
  • For now I've ended up using the API to create documents and upload them that way, but it seems like it should be possible to upload them through the console as well; I don't understand why it's generating all the weird null-id docs. The API approach works for me, but I'm still interested to know if anyone knows why this is happening. – sfaust Feb 06 '17 at 16:54
  • I'm having the same issues via the CLI. The only thing I can tell is that some of my data has extra fields and if CS is sampling a subset that doesn't contain the extra fields, it might be choking when it hits a doc with more fields than it's expecting. Does this fit your case as well? – kfblake Mar 22 '17 at 23:12
  • Possibly, I would have to look at it to be sure, but I do have a few fields that aren't on every object, so that variance could be there. I moved to generating them via the API as the requirements got more complex, so I haven't spent much more time on this since the API seems to work fine. – sfaust Mar 23 '17 at 15:25
  • For the record, my issue ended up being with Map attributes. Any docs with a Map attribute broke the AWS CLI conversion to CloudSearch-ready import format. There is a roundabout way to resolve this, but it's laborious and best for large single-use imports only. – kfblake Mar 23 '17 at 18:35
  • Did you guys figure this out? I am facing the exact same issue and have no idea how to go about this. – ASR4 Jun 26 '19 at 05:57
  • We ended up needing to create the objects ourselves through the API anyway for other reasons, so we did that. Unfortunately we never figured out why this was happening. – sfaust Jun 26 '19 at 15:06
  • Oh, that's sad :/ Can you share any doc that talks about using DynamoDB to upload data to CloudSearch using APIs? I did not come across any that specifically uses DynamoDB as a source for uploading via APIs. – ASR4 Jun 27 '19 at 07:12
  • No, we ended up deciding that we needed the properties slightly different for CloudSearch, so we just use DynamoDB for pulling data and we manually create CloudSearch documents with the properties we need and upload them (a sketch of this approach follows below). – sfaust Jun 27 '19 at 07:33
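
Below is a minimal sketch of the API approach described in the comments, using Python and boto3. The table name, the document service endpoint, and the assumption that the hash key is named "id" are all placeholders, not details from the question; the real document endpoint is shown on the domain's dashboard in the CloudSearch console.

import json
import boto3

# Hypothetical names -- substitute your own table and domain endpoint.
TABLE_NAME = "my-table"
DOC_ENDPOINT = "https://doc-mydomain-xxxxxxxxxxxx.us-east-1.cloudsearch.amazonaws.com"

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table(TABLE_NAME)
cloudsearch = boto3.client("cloudsearchdomain", endpoint_url=DOC_ENDPOINT)

def scan_all_items(table):
    # Paginate through the full table scan.
    kwargs = {}
    while True:
        page = table.scan(**kwargs)
        yield from page["Items"]
        if "LastEvaluatedKey" not in page:
            return
        kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]

batch = []
for item in scan_all_items(table):
    doc_id = item.pop("id", None)  # assumes the hash key is named "id"
    if doc_id is None:
        continue  # never emit a document without an id
    if not item:
        continue  # CloudSearch rejects documents with no fields
    batch.append({
        "type": "add",
        "id": str(doc_id),
        # Naive conversion: every attribute becomes a string field.
        # Map/List attributes would need flattening (see the answer below).
        "fields": {k: str(v) for k, v in item.items()},
    })

# upload_documents accepts a JSON batch of up to 5 MB per call.
result = cloudsearch.upload_documents(
    documents=json.dumps(batch).encode("utf-8"),
    contentType="application/json",
)
print(result["status"], result.get("adds"))

For tables larger than 5 MB of batch output, the batch would need to be split across multiple upload_documents calls.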

1 Answer


For me, the problems were Map attributes and List attributes. Once I removed all of those attributes from my table items, the upload worked successfully.
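
A minimal sketch of that workaround in Python, assuming the items come from boto3's DynamoDB Table resource (which returns Map attributes as dicts and List attributes as Python lists), so the complex attributes can be dropped at upload time rather than deleted from the table:

# Drop Map and List attributes before building the CloudSearch batch.
# boto3's Table resource deserializes Maps to dicts and Lists to lists,
# so a simple type check is enough for this sketch.
def strip_complex_attributes(item):
    return {k: v for k, v in item.items() if not isinstance(v, (dict, list))}

Applying this to each scanned item before building its "fields" dict reproduces the "remove the Map and List attributes" fix without mutating the table itself.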

Theo