I am performing a search on my AWS CloudSearch domain from a Lambda function written in Node.js.

I uploaded a document such as this:

    {
        "some_field": "bla bla",
        "some_date_field": 1.466719E9,
        "number_field": 4,
        "some_string": "some long string blabla"
    }

And I perform a search like this:

    var params = {
        query: 'bla bla'
    };

    cloudsearchdomain.search(params, function(err, data) {
        if (err) {
            console.log(err, err.stack); // an error occurred
            context.fail(err);
        } else {
            context.succeed(data);       // successful response
        }
    });

The search works and, as documented, CloudSearch returns the document info in the `fields` property of each hit. Here is an example:

    {
      "status": {
        "timems": 2,
        "rid": "blabla"
      },
      "hits": {
        "found": 1,
        "start": 0,
        "hit": [
          {
            "id": "452545-49B4-45C3-B94F-43524542352-454352435.6666-8532-4099-xxxx-1",
            "fields": {
              "some_field": [
                "bla bla"
              ],
              "some_date_field": [
                "1.466719E9"
              ],
              "number_field": [
                "4"
              ],
              "some_string": [
                "some long string blabla"
              ]
            }
          }
        ]
      }
    }

As you can see, all the fields are returned as strings in arrays. Is there any way to get the results as JSON that preserves the type of each field?

Zigglzworth

2 Answers

After submitting a report about this to AWS I received this reply:

Hello, This is actually the intended behavior. The SDK team chose to implement the "fields" property as a dictionary of string keys and string-array values to maintain consistency across the various languages in which the AWS SDK exists. They place the responsibility for handling the various response formats (HTTP request vs. SDK method) on the client. For more details, please see: https://github.com/aws/aws-sdk-js/issues/791

Unfortunately, the only current solutions to the problem I describe above are:

1) Write a parser that converts the results based on the response you expect, taking your fields' data types into account.

2) Add a new field (text type) to your CloudSearch index containing a stringified version of your entire JSON document. You can then call JSON.parse() on that field to get the document back with its types intact. This solution is not ideal because it adds an unnecessary chunk of text to every document, but it proved a quick fix for my problem above.
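Both workarounds can be sketched in a few lines of Node.js. This is only a minimal sketch, not anything provided by the SDK: `FIELD_TYPES`, `parseHit`, `parseStoredJson` and the `raw_json` field name are hypothetical names chosen for illustration, and the type map assumes the example document's fields were indexed as `text`, `double` and `int`.

```javascript
// Hypothetical type map: field name -> index field type as configured
// for the domain (assumed here, adjust to your own index).
const FIELD_TYPES = {
  some_field: 'text',
  some_date_field: 'double',
  number_field: 'int',
  some_string: 'text',
};

const NUMERIC_TYPES = new Set(['int', 'double']);

// Workaround 1: convert one SDK hit ({ id, fields: { name: ['value'] } })
// into a plain object, using the type map to decide how to convert values.
function parseHit(hit, fieldTypes) {
  const doc = { id: hit.id };
  for (const [name, rawValues] of Object.entries(hit.fields || {})) {
    const type = fieldTypes[name] || 'text'; // unknown fields stay strings
    const isArray = type.endsWith('-array');
    const baseType = isArray ? type.slice(0, -'-array'.length) : type;
    const values = rawValues.map((value) =>
      NUMERIC_TYPES.has(baseType) ? Number(value) : value
    );
    // Array-type fields keep their array; single-valued fields are unwrapped.
    doc[name] = isArray ? values : values[0];
  }
  return doc;
}

// Workaround 2: read back a document stored as a JSON string in an extra
// text field (the field name, e.g. 'raw_json', is an assumption).
function parseStoredJson(hit, fieldName) {
  const raw = hit.fields && hit.fields[fieldName];
  return raw && raw.length > 0 ? JSON.parse(raw[0]) : null;
}
```

Inside the search callback, something like `data.hits.hit.map(function (hit) { return parseHit(hit, FIELD_TYPES); })` would then yield one typed object per hit.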

I'd love to hear of any more solutions if anyone knows of any.

Zigglzworth

CloudSearch does preserve the field type; the results imply that you've configured these fields as arrays.

You can confirm this by going to Indexing Options for your domain in the AWS web console. You should see fields with types such as text-array and literal-array, as in the screenshot below; those are returned as arrays. If you will only ever submit a single value for each field in each document, you can change them to non-array types and you'll get back non-array values.

[Screenshot: indexing options]

alexroussos
  • Thanks, but this doesn't seem to be the case in my issue, as I have the fields configured correctly. I am using only literal, int, and text data types in the index configuration, yet these are still returned in arrays. – Zigglzworth Jun 29 '16 at 22:21
  • I just confirmed the behavior is as I described (ignore the fact that `author` is not return-enabled in my example; I checked other array vs non-array fields that are returnable). I can't come up with any remotely likely reasons it would be behaving differently for you. Are you using the 2013-01-01 version of CloudSearch? Have you tried querying CloudSearch directly via curl or your browser to get the raw results, instead of through a library? – alexroussos Jun 29 '16 at 22:31
  • I am using the latest CloudSearch JavaScript SDK. If I do a search in the CloudSearch console and view the raw JSON, it is as expected, but when I do the search via the JavaScript SDK as I describe, I get the array issue. This may well be a bug in the JavaScript SDK. Did you test using the SDK or curl? – Zigglzworth Jun 30 '16 at 18:16
  • I don't have the JS SDK set up, but yes, it sounds like there's a bug. It works as desired with curl. – alexroussos Jul 01 '16 at 16:48
  • Thanks. I submitted a bug report to AWS concerning this issue. – Zigglzworth Jul 01 '16 at 17:23
  • Nice. I'd be curious to hear what comes of it. If you submitted the bug via the AWS bug forum, can you link me to it? – alexroussos Jul 01 '16 at 18:52
  • Note their reply in the answer I posted – Zigglzworth Jul 05 '16 at 19:43
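
For anyone who wants to compare the SDK output with the direct request mentioned in the comments above, here is a rough Node.js sketch that calls the 2013-01-01 search endpoint over HTTPS instead of going through the SDK. The helper names `buildSearchUrl` and `searchRaw` are made up for illustration, the endpoint host is a placeholder for your own domain's search endpoint, and an unsigned request like this only works if the domain's access policy allows it. The shape of the returned `fields` is not guaranteed by this sketch; it simply hands back whatever JSON the endpoint sends.

```javascript
const https = require('https');

// Build a search URL for the 2013-01-01 API.
// `endpoint` is the search endpoint host of your domain (placeholder in the usage below).
function buildSearchUrl(endpoint, query) {
  const params = new URLSearchParams({ q: query, return: '_all_fields' });
  return `https://${endpoint}/2013-01-01/search?${params.toString()}`;
}

// Fetch the raw JSON response and pass it to a Node-style callback.
function searchRaw(endpoint, query, callback) {
  https
    .get(buildSearchUrl(endpoint, query), (res) => {
      let body = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => {
        try {
          callback(null, JSON.parse(body));
        } catch (parseError) {
          callback(parseError);
        }
      });
    })
    .on('error', callback);
}
```

Usage would look like `searchRaw('search-yourdomain.us-east-1.cloudsearch.amazonaws.com', 'bla bla', function (err, data) { ... })`, with the host replaced by your own search endpoint.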