0

I am new to elasticsearch and i am having tough time switching from mysql to elasticsearch

my Mysql my tables looks like this


    table :  test_request 
    +---------+-------------+--------------+-----------+------------+-----------+
    | test_id | device_name | ip_address   | user_name | time_stamp | show_flag |
    +---------+-------------+--------------+-----------+------------+-----------+
    |   1     |     d1      |   0.0.0.0    |   admin   |            |     Y     |
    +---------+-------------+--------------+-----------+------------+-----------+

    
table: test_results +----+---------+-----+-----------------------+-------------------------+----------------------------------+-----------+ | id | test_id | cli | xml | json | another json | show_flag | +----+---------+-----+-----------------------+-------------------------+----------------------------------+-----------+ | 1 | 1 | c1 | some xml format data | {"some":"json here"} | {"some":" another json here"} | Y | +----+---------+-----+-----------------------+-------------------------+----------------------------------+-----------+ | 2 | 1 | c2 | some xml format data | {"some":"json here"} | {"some":" another json here"} | Y | +----+---------+-----+-----------------------+-------------------------+----------------------------------+-----------+ | 3 | 1 | c2 | some xml format data | {"some":"json here"} | {"some":" another json here"} | Y | +----+---------+-----+-----------------------+-------------------------+----------------------------------+-----------+

the test_id field in the test_request table and the id field in the test_results table are auto increment.
The json and another json fields are of data type JSON.

I am trying to use elasticsearch_dsl to create index and its mappings. I am going through the docs to figure out how to do achieve this but i couldn't figure out three things

  1. how to get the test_id to auto increment
  2. how to make a field of JSON data type
  3. Best way to setup a relationship between both (i partially understood nested could help here) but looking for the correct way to do this

1 Answers1

1

The auto increment id columns play following rules in the SQL tables:

  1. they are unique identifiers of the row
  2. they allow to link rows between tables

To achieve this in elasticsearch you don't need a auto increment field. You can add document to elasticsearch index, and elasticsearch will add a unique id to it.

For JSON fields use simply object datatype.

There are few options to setup relation like SQL join:

  1. You can put test_results as nested objects within test_request document
  2. You can use join datatype field to link test_results documents to test_request document within the same index
  3. You can denormalize and store every test_result into single document together with its test_request. It is ok, that test_request will be stored many times. Elasticsearch is primarily for searching anyway.

Which version you choose is up to you. It depends, how are you going to use your data, what kind of queries are you going to do. Can you collect all test_results together with the test_request and store it with the single call, or do you need store the test_request and the successively add test_results?

Successively updating nested field would mean reindexing the whole document every time. Join datatype is expensive for querying.

Denormalization adds space usage, but if the number of test_results per request is not large, then it is maybe the best option.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
zbusia
  • 571
  • 2
  • 7