  • I have many documents in MongoDB. Mongo-connector inserts that data into Elasticsearch. Is there a way to add an extra field to each document before it is inserted into Elasticsearch? Can mongo-connector do this?

UPDATE

Based on your UPDATE 3, I created a mapping like this. Is it correct?

PUT my_index2
{
  "mappings": {
    "my_type2": {
      "transform": {
        "script": {
          "inline": "if (ctx._source.geopoint.alt) ctx._source.geopoint.remove('alt')",
          "lang": "groovy"
        }
      },
      "properties": {
        "geopoint": {
          "type": "geo_point"
        }
      }
    }
  }
}

ERROR

This is the error I keep getting when I try to insert your mapping:

{
  "error": {
    "root_cause": [
      {
        "type": "script_parse_exception",
        "reason": "Value must be of type String: [script]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "Failed to parse mapping [my_type2]: Value must be of type String: [script]",
    "caused_by": {
      "type": "script_parse_exception",
      "reason": "Value must be of type String: [script]"
    }
  },
  "status": 400
}

UPDATE 2

Now the mapping is accepted and the response says acknowledged: true. But when I try to insert JSON data like the following, it throws an error.

PUT my_index2/my_type2/1
{
  "geopoint": {
    "lon": 48.845877,
    "lat": 8.821861,
    "alt": 0.0
  }
}

ERROR FOR UPDATE2

{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "failed to parse"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "failed to parse",
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "failed to execute script",
      "caused_by": {
        "type": "script_exception",
        "reason": "scripts of type [inline], operation [mapping] and lang [groovy] are disabled"
      }
    }
  },
  "status": 400
}

ERROR 1 FOR UPDATE 2

After adding script.inline: true, I tried to insert the data again but got the following error.

{
  "error": {
    "root_cause": [
      {
        "type": "parse_exception",
        "reason": "field must be either [lat], [lon] or [geohash]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "failed to parse",
    "caused_by": {
      "type": "parse_exception",
      "reason": "field must be either [lat], [lon] or [geohash]"
    }
  },
  "status": 400
}
Shreyas Rao B

1 Answer


mongo-connector aims at synchronizing a Mongo database with another target system, such as ES, Solr or another Mongo DB. Synchronizing means 1:1 replication, so there's no way that I know of for mongo-connector to enrich documents during the replication (and it's not its intent either).

However, in ES 5 we'll soon be able to use ingest nodes in which we'll be able to define processing pipelines whose goal is to enrich documents before they get indexed.

UPDATE

There's probably a way by modifying the formatters.py file.

In transform_value I would add a case to handle Geopoint:

    if isinstance(value, dict):
        return self.format_document(value)
    elif isinstance(value, list):
        return [self.transform_value(v) for v in value]

    # handle Geopoint class
    elif isinstance(value, Geopoint):
        return self.format_document({'lat': value['lat'], 'lon': value['lon']})

    ...

UPDATE 2

Let's try another approach by modifying the transform_element function (on line 104):

def transform_element(self, key, value):
    try:
        # add these next two lines
        if key == 'GeoPoint':
            value = {'lat': value['lat'], 'lon': value['lon']}
        # do not modify the initial code below
        new_value = self.transform_value(value)
        yield key, new_value
    except ValueError as e:
        LOG.warn("Invalid value for key: %s as %s"
                 % (key, str(e)))
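
As a rough, standalone illustration of what the edit above is meant to achieve (this is not mongo-connector's own code), the sketch below reduces a geopoint sub-document to `lat` and `lon`. The helper name `to_geo_point` and the case-insensitive key match are assumptions made for illustration. The match is case-insensitive because the documents quoted in this thread use the key `geopoint`, while the snippet above compares against `GeoPoint`.

```python
# Keys that an ES geo_point object accepts in this sketch.
GEO_KEYS = ("lat", "lon")


def to_geo_point(key, value, field_name="geopoint"):
    """Return value reduced to lat/lon when key names the geopoint field.

    Hypothetical helper: the key is compared case-insensitively, and any
    other key/value pair is returned unchanged.
    """
    if key.lower() == field_name and isinstance(value, dict):
        return {k: value[k] for k in GEO_KEYS if k in value}
    return value


# Document shaped like the one reported in the comments.
doc = {
    "name": "station",
    "geopoint": {"lat": 53.08412, "alt": 0, "valid": False, "lon": 8.64863},
}

cleaned = {k: to_geo_point(k, v) for k, v in doc.items()}
print(cleaned)
```

If something like this were wired into `transform_element`, the comparison on `key` would be the part to check first, since a mismatch in case silently leaves the document untouched.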

UPDATE 3

Another thing you might try is to add a transform. The reason I've not mentioned it before is that it was deprecated in ES 2.0, but in ES 5.0 you'll have ingest nodes and you'll be able to take care of it at ingest time using a remove processor.

You can define your mapping like this:

PUT my_index2
{
  "mappings": {
    "my_type2": {
      "transform": {
        "script": "ctx._source.geopoint.remove('alt'); ctx._source.geopoint.remove('valid')"
      },
      "properties": {
        "geopoint": {
          "type": "geo_point"
        }
      }
    }
  }
}

Note: make sure to enable dynamic scripting by adding script.inline: true to elasticsearch.yml, then restart your ES node.

What is going to happen is that the alt and valid fields will still be visible in the stored _source but will not be indexed, and hence no error should occur.

With ES 5, you'd simply create a pipeline with a remove processor, like this:

PUT _ingest/pipeline/geo-pipeline
{
  "description" : "remove unsupported altitude field",
  "processors" : [
    {
      "remove" : {
        "field": "geopoint.alt"
      }
    }
  ]
}
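
A minimal sketch of how a document might then be sent through that pipeline, assuming Python 3 and an ES 5+ node at `http://localhost:9200` (the host and the helper name `index_url` are assumptions). It only builds the request URL and payload and does not contact a server. The index request names the pipeline through the `pipeline` query parameter. Note that the document quoted in the comments also carries a `valid` field, which would presumably need its own remove processor.

```python
import json
from urllib.parse import urlencode

ES_URL = "http://localhost:9200"  # assumption: a local ES 5+ node

# Same pipeline body as above, expressed as a Python dict.
pipeline_body = {
    "description": "remove unsupported altitude field",
    "processors": [{"remove": {"field": "geopoint.alt"}}],
}


def index_url(index, doc_type, doc_id, pipeline):
    """Build the document URL with the `pipeline` query parameter."""
    query = urlencode({"pipeline": pipeline})
    return "{}/{}/{}/{}?{}".format(ES_URL, index, doc_type, doc_id, query)


url = index_url("my_index2", "my_type2", 1, "geo-pipeline")
payload = json.dumps({"geopoint": {"lon": 48.845877, "lat": 8.821861, "alt": 0.0}})

print("PUT", url)
print(payload)
```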
Val
  • In mongo-connector there is a file called **elastic2_doc_manager.py**, which is a config file. **I think** mongo-connector uses it to add documents to ES from MongoDB, and I guess it's written in Python. If that's true, is there a way to add a field there, so that mongo-connector adds the field to the document before inserting it into ES? – Shreyas Rao B Apr 22 '16 at 05:22
  • I am asking because the documents in MongoDB contain a **Geopoint** field with **lat, lon, alt** subfields. But ES does not support 3D: the ES **"type" : "geo_point"** takes only **lat** and **lon**. How do I solve this? – Shreyas Rao B Apr 22 '16 at 05:28
  • ES will not support 3D geo queries (although it [should now accept three values](https://github.com/elastic/elasticsearch/issues/10510)), but what you want is just to store the altitude field separately? – Val Apr 22 '16 at 06:05
  • In our project we have **alt, lon, lat** as variables in the **Geopoint class**, and we have also inserted a large number of documents that specify values only for **lat** and **lon**, without any **alt** value. So whenever we retrieve a document from MongoDB we get **GeoPoint : { "lat" : 32.002, "lon" : 45.0215, "alt" : " " }**. Removing the **alt** field from **GeoPoint.class** and then re-inserting the documents into MongoDB would take a lot of time. – Shreyas Rao B Apr 22 '16 at 06:15
  • I was thinking of editing the config file so that, before mongo-connector inserts a document into ES, we add a field such as **geo_2d** that copies the **lat and lon** of "GeoPoint". Then in ES we could query location using this new field. – Shreyas Rao B Apr 22 '16 at 06:20
  • What does currently happen if you run your synch process? – Val Apr 22 '16 at 06:31
  • Since we are not specifying any mappings in ES, mongo-connector syncs everything from MongoDB to ES. But the problem is that we are not able to query the **geopoint** in ES, because our **geopoint** has 3 fields whereas ES supports only 2D. – Shreyas Rao B Apr 22 '16 at 06:37
  • What does the field look like in ES after the synch? – Val Apr 22 '16 at 06:42
  • "geopoint": { "lat": 53.08412, "alt": 0, "valid": false, "lon": 8.64863 } ' – Shreyas Rao B Apr 22 '16 at 06:48
  • What I would do is specify the ES mapping explicitly and make `geopoint` a `geo_point` field. You need to delete the index and re-create it with the proper mapping, and then you can run the synch again. – Val Apr 22 '16 at 06:50
  • At sync time, won't there be a problem when the mapping encounters an alt field in the document to be indexed? – Shreyas Rao B Apr 22 '16 at 06:53
  • The best way to know is to try it out :) – Val Apr 22 '16 at 06:56
  • You're going to get a parse exception, indeed, because of the `valid` and `alt` fields. Removing these fields could be a job for the ingest nodes I talked about. – Val Apr 22 '16 at 06:57
  • Here is what I tried. I wrote a mapping like "geopoint" : { "type" : "geo_point" }, but when I passed a document containing GeoPoint : { "lat" : 32.002, "lon" : 45.0215, "alt" : " " } it gave a mapping error. – Shreyas Rao B Apr 22 '16 at 06:58
  • There's probably a way by modifying the [`formatters.py`](https://github.com/mongodb-labs/mongo-connector/blob/v2.2/mongo_connector/doc_managers/formatters.py) file and adding special handling for your Geopoint class – Val Apr 22 '16 at 07:00
  • I don't know Python. It's a lot to ask, but could you specify where and how to make the modification? – Shreyas Rao B Apr 22 '16 at 07:06
  • Will the above **handle for geopoint** remove alt and add just lat and lon to the document when inserting into ES? – Shreyas Rao B Apr 22 '16 at 07:18
  • Yes, that's correct, although I agree I have not tested it. But if you modify the file, you should be able to quickly see if that works or not – Val Apr 22 '16 at 07:20
  • Ok. I will give the feedback on it. – Shreyas Rao B Apr 22 '16 at 07:22
  • It's giving an error like: > **File "/usr/local/lib/python2.7/dist-packages/mongo_connector/doc_managers/formatters.py", line 69, in transform_value elif isinstance(value, geopoint): > NameError: global name 'Geopoint' is not defined** – Shreyas Rao B Apr 22 '16 at 13:45
  • Geopoint is the class name, right? Should be capital-cased like `Geopoint` and also make sure to import it. – Val Apr 22 '16 at 13:46
  • The document in MongoDB has a Geopoint field like this: "geopoint" : { "lat" : 23.03, "lon": 45.035, "alt": 0} – Shreyas Rao B Apr 22 '16 at 13:50
  • Time allowing, I will try it out on my local MongoDB and see what's up – Val Apr 22 '16 at 13:52
  • Sir, try to have a JSON document in MongoDB similar to [this](http://stackoverflow.com/questions/36737559/how-to-write-mappings-in-elsaticsearch-for-geopoint-having-lat-lon-and-alt/36741536#36741536) (see "Update1") and then try to insert it into ES using mongo-connector. – Shreyas Rao B Apr 22 '16 at 14:04
  • Will do and keep you posted. – Val Apr 22 '16 at 14:05
  • Thanks for your time. I will be waiting for your answer :) – Shreyas Rao B Apr 22 '16 at 14:10
  • Sir, did you get any result? – Shreyas Rao B Apr 24 '16 at 06:33
  • Is there anything we can do to handle geo-points with the help of dynamic mappings in ES? – Shreyas Rao B Apr 24 '16 at 06:51
  • Sir, I will test it and give you feedback. – Shreyas Rao B Apr 25 '16 at 05:17
  • Even after adding that line it's not making any changes. ES still has "geopoint": { "lat": 53.08412, "alt": 0, "valid": false, "lon": 8.64863 } in the same format. – Shreyas Rao B Apr 25 '16 at 11:03
  • Sir, is there anything we can change in [elastic2-doc-manager](https://github.com/mongodb-labs/elastic2-doc-manager/blob/master/mongo_connector/doc_managers/elastic2_doc_manager.py)? – Shreyas Rao B Apr 25 '16 at 11:05
  • I even tried using dynamic mappings but it didn't work. – Shreyas Rao B Apr 26 '16 at 05:50
  • I tried copy_to, something like this: 'PUT a { "mappings": { "b": { "properties": { "geopoint": { "properties": { "lon": { "type": "double", "copy_to": "geopoint2d" }, "lat": { "type": "double", "copy_to": "geopoint2d" }, "alt": { "type": "double" }, "geopoint2d": { "type": "geo_point" } } } } } } }' – Shreyas Rao B Apr 26 '16 at 05:53
  • It's giving an error like this: ' "type": "script_parse_exception", "reason": "Value must be of type String: [script]"' – Shreyas Rao B Apr 26 '16 at 06:13
  • Are you sure you properly copy/pasted the index definition? it's working on both ES 1.x and ES 2.x on my end. Please check again. – Val Apr 26 '16 at 06:14
  • You have one too many closing parentheses in the script; please check my update again. – Val Apr 26 '16 at 06:37
  • Sir, what does that "test" indicate? Is it just a type name? – Shreyas Rao B Apr 26 '16 at 06:43
  • Oh, yes, you need to replace that with `my_type2` of course. Please see my revised UPDATE 3 – Val Apr 26 '16 at 06:46
  • I got confused... let me try once again :)) – Shreyas Rao B Apr 26 '16 at 06:47
  • Check my **ERROR**; that's what I get. – Shreyas Rao B Apr 26 '16 at 06:54
  • I've slightly modified the index definition. Please try again and this time it should work. – Val Apr 26 '16 at 06:56
  • Please check my **UPDATE 2**. – Shreyas Rao B Apr 26 '16 at 07:05
  • OK, that's great progress. I missed that, but you also need to enable dynamic scripting by adding `script.inline: true` to `elasticsearch.yml` and restarting your ES node. – Val Apr 26 '16 at 07:06
  • Where in `elasticsearch.yml` should I add it? At the end? – Shreyas Rao B Apr 26 '16 at 07:09
  • Yes, at the end is fine. – Val Apr 26 '16 at 07:10
  • Please see my **ERROR 1 FOR UPDATE 2**. – Shreyas Rao B Apr 26 '16 at 07:16
  • I think it's because of the `valid` field, which I forgot to remove... Replace the script string with this new one and try again `if (ctx._source.geopoint.alt) { ctx._source.geopoint.remove('alt'); ctx._source.geopoint.remove('valid'); }` – Val Apr 26 '16 at 07:18
  • Could you please show the above modification in your **UPDATE 3**? – Shreyas Rao B Apr 26 '16 at 07:22
  • Done, please check again. – Val Apr 26 '16 at 07:23
  • @Val... yes, the JSON data is getting inserted. Now can you show me how to query this geopoint? – Shreyas Rao B Apr 26 '16 at 07:25
  • Awesome, glad we made it work !!! Since we took care of the indexing part, I suggest you create another question for the searching part, I think this one was already quite a ride, don't you think? – Val Apr 26 '16 at 07:27
  • Yes, sir, sure. Thanks a lot for your help. I figured out how to query and got results too. I don't know how to thank you. Thanks a lot. – Shreyas Rao B Apr 26 '16 at 07:38
  • If you learned something in this thread, then that's perfect! There are a few books, among which the [official one](https://www.elastic.co/guide/en/elasticsearch/guide/master/index.html) by ES folks and [another one](https://www.manning.com/books/elasticsearch-in-action) by Manning I contributed to – Val Apr 26 '16 at 07:42
  • I too heard from others that the Manning book is good. Now I will definitely buy it. One last question: if I have any further problems in ES, how do I reach you? For example, if I post a question, how do I link it to you? – Shreyas Rao B Apr 26 '16 at 07:46
  • Just post a question, there are many people around here to help you out :-) – Val Apr 26 '16 at 07:47
  • Sir, sorry to disturb you. Do you know any solution for [this](http://stackoverflow.com/questions/36982849/illegalstateexception-for-mixing-up-field-types)? – Shreyas Rao B May 02 '16 at 15:04