
I am trying to create an index and then do a bulk insert into my ES using RestHighLevelClient (the code is in Kotlin).

The bulk insert code is:

private fun insertEntity(entityList: List<Person>, indexName: String) {
    var count = 0
    val bulkRequest = BulkRequest()

    entityList.forEach {
        bulkRequest.add(IndexRequest(indexName).source(it,XContentType.JSON))
        count++

        if (count == batchSize) {
            performBulkInsert(bulkRequest)
        }
    }
}

When I execute this, I get an exception saying: Limit of 1000 fields is crossed.

On analysing my code, I feel the implementation is wrong, because of this line:

bulkRequest.add(IndexRequest(indexName).source(it,XContentType.JSON))

source takes a String, but I am passing the Person object (it) itself. So I believe that is what causes the 1000-field issue, based on my mapping or something similar.

I am not sure if my assumption is correct. If it is, how can I achieve the bulk insert?

EDIT

Index creation:

private fun createIndex(indexName: String) {
    val request = CreateIndexRequest(indexName)

    val settings = FileUtils.readFileToString(
        ResourceUtils.getFile(
            ResourceUtils.CLASSPATH_URL_PREFIX + "settings/settings.json"), "UTF-8")

    val mappings = FileUtils.readFileToString(
        ResourceUtils.getFile(
            ResourceUtils.CLASSPATH_URL_PREFIX + "mappings/personMapping.json"), "UTF-8")

    request.settings(Settings
        .builder()
        .loadFromSource(settings, XContentType.JSON))
        .source(mappings, XContentType.JSON)
    restHighLevelClient.indices().create(request, RequestOptions.DEFAULT)
    
}

Mapping.json (please note that the original has 16 fields):

{
  "properties": {
    "accessible": {
      "type": "boolean"
    },
    "person_id": {
      "type": "long"
    },
    "person_name": {
      "type": "string",
      "analyzer": "lower_keyword"
    }
  }
}

Thanks.

newLearner
  • I already commented that your answer unfortunately did not resolve the original issue. Hence, I did not accept. – newLearner Aug 05 '20 at 08:40

1 Answer

It looks like you are using dynamic mapping, and because of some mistake in how a document is indexed, new fields keep being created in your index until it crosses the 1000-field limit.

Please see if you can use a static mapping, or debug the code that prepares the document and compare it with your mapping to see whether it is creating new fields.

Please refer to this SO answer to increase the limit if the field count is legitimate; otherwise, use a static mapping or debug the code to figure out why you are adding new fields to the Elasticsearch index.
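
As a rough illustration of the comparison suggested above, the following sketch lists the top-level keys of a document that the mapping does not declare. The function name `unmappedFields` and the idea of representing the document as a `Map` are assumptions made for this example; they are not part of the question's code or of the Elasticsearch client.

```kotlin
// Hypothetical helper: returns the top-level document keys that are
// not declared in the mapping. Nested objects are not inspected.
fun unmappedFields(
    mappedFields: Set<String>,
    document: Map<String, Any?>
): Set<String> = document.keys - mappedFields
```

For example, with the three mapped fields from the question (`accessible`, `person_id`, `person_name`) and a document that also carries a `personName` key, the function returns a set containing only `personName`. Under dynamic mapping, each such key would be added to the index as a new field.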

Amit
  • Ok. So 2 questions. 1. Passing the person Object to the source() is correct ? That is not a problem ? 2. I have my mapping defined in a json file. And while creating an index, I am just reading that json file as a string passing it. Is that an issue ? Please check I have updated my code with the index creation part as well. – newLearner Jul 29 '20 at 10:06
  • Also, I am fairly new to this so not sure if this way of mapping is dynamic or static (looks static as I am just reading it from a file). Would really appreciate help. Struggling with this since yesterday. – newLearner Jul 29 '20 at 10:13
  • @newLearner yeah by looking at your code, its a static mapping but in file you mentioned carmapping while you are passing person object? is this typo? – Amit Jul 29 '20 at 10:24
  • Ah sorry. That is just a typo.. Corrected ! – newLearner Jul 29 '20 at 10:25
  • @newLearner can you also provide your JSON mapping details? – Amit Jul 29 '20 at 10:32
  • I am afraid I can't share the original :(.. But I am posting a dummy one which is very similar except for the actual names. Also in total there are total 16 fields.. For the sake of simplicity I am posting just 3. Hope that helps. – newLearner Jul 29 '20 at 10:42
  • @newLearner thats fine, i don't need your actual one, can you also compare your 16 fields with what you get from GET mapping API output? use http://localhost:9200//_mapping to get the actual no of fields in your index. – Amit Jul 29 '20 at 10:46
  • I am getting site can't be reached. I guess ES is not running or something ? I did not do any localhost configuration for my client. I am just auto wiring it and using. – newLearner Jul 29 '20 at 10:56
  • Thanks. I'll try to debug. – newLearner Jul 29 '20 at 11:02
  • 1
    Hey, I got the issue but not sure why this is happening. So when I am doing ```bulkRequest.add(IndexRequest(indexName).source(it,XContentType.JSON))```, in the kibana I can see that my mapping is completely wrong. ```{ "myindexneww" : { "mappings" : { "properties" : { and inside this properties instead of fields there are Person Objects with all the data inside them ``` Any idea what can be the issue for this to happen ? – newLearner Jul 29 '20 at 16:14
  • 1
    Basically the way I am trying to create index with mapping is wrong. The index is getting created but mappings are not present. ```{ "adindex" : { "mappings" : { } } } ``` – newLearner Jul 29 '20 at 16:20
  • Sure will do that. Just last thing, how can I do static mapping ? Thanks for all the help. – newLearner Jul 29 '20 at 16:30
  • @newLearner you are on the right path, please ask a follow-up question and I shall answer this if you ask in next 10 mins :) as its late for me, otherwise I can give tomorrow – Amit Jul 29 '20 at 16:32
  • @newLearner sure – Amit Jul 29 '20 at 16:37
  • @newLearner done and please explain your question bit more so that you don't get downvote. – Amit Jul 29 '20 at 16:48
  • There are 2 issues with the code that creates the index. 1. Instead of ```request.source()```, it should have been ```request.mapping()```. 2. In the ```mapping.json```, I am using ```string```, which should have been ```text```. By doing this, I was able to create an index with the correct mapping. Confirmed the same in Kibana as well. – newLearner Jul 29 '20 at 18:04
  • Secondly, I am nowhere creating a document here and then indexing that document. Instead I was adding the object itself. I created a document and this issue is fixed now. Now I am getting ```Caused by: java.net.SocketTimeoutException: 5,000 milliseconds timeout on connection http-outgoing-0 [ACTIVE]``` while doing the bulk insert. Will see how I can fix that. Thanks for your effort. Also, sorry, but as this answer did not solve the issue, I am un-accepting it. – newLearner Jul 29 '20 at 21:34
  • I already apologised to you, thinking your answer was correct. But it was not. Do you really think it would be wise to accept an answer which does not resolve the issue? – newLearner Aug 05 '20 at 10:33
  • And regarding accepting and unaccepting, please read my comments. Also, your follow-up question's answer was of no use technically, because your 10 lines of code were doing exactly what my 1-liner was doing. So I don't think it was a wise option. – newLearner Aug 05 '20 at 10:36
  • Would you like to explain how your answer resolved the 1000-field issue? As far as I can read, you suspected dynamic mapping to be the reason, whilst it was a static mapping that I had in my code. So how does your answer solve the issue? – newLearner Aug 05 '20 at 10:39
  • @newLearner debugging and writing efficient code is not the intention of this website, since we can't do live debugging I provided the code which worked for me, we try to help the community without much info, your env, library are different and at least we can expect the proper response from the users who we are trying to help. – Amit Aug 05 '20 at 10:39
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/219257/discussion-between-opster-elasticsearch-ninja-and-newlearner). – Amit Aug 05 '20 at 10:40
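
The fix reported in the comments (turn each object into a JSON document before adding it to the bulk request) could be sketched roughly as below. When the object itself is passed, the call presumably resolves to the varargs `source(Object...)` overload, which treats its arguments as field name/value pairs, so every object ends up as a new field. The sketch also addresses a second problem visible in the question's loop: `count` is never reset and the last, smaller batch is never sent. To keep the example free of client dependencies, the actual bulk call is passed in as a lambda. The `Person` fields, the hand-written `toJson`, and the names `insertInBatches` and `sendBulk` are all assumptions made for illustration.

```kotlin
// Hypothetical stand-in for the asker's Person class; the JSON keys
// follow the dummy mapping shown in the question.
data class Person(val personId: Long, val personName: String, val accessible: Boolean)

// Hand-rolled JSON for the sketch only; a real project would more
// likely use a JSON library for serialisation.
fun Person.toJson(): String {
    val escapedName = personName.replace("\\", "\\\\").replace("\"", "\\\"")
    return """{"person_id":$personId,"person_name":"$escapedName","accessible":$accessible}"""
}

// Splits the items into batches and hands each batch of JSON strings to
// sendBulk, including the final batch that is smaller than batchSize.
// Returns the number of batches sent.
fun <T> insertInBatches(
    items: List<T>,
    batchSize: Int,
    toJson: (T) -> String,
    sendBulk: (List<String>) -> Unit
): Int {
    require(batchSize > 0) { "batchSize must be positive" }
    var batches = 0
    items.chunked(batchSize).forEach { batch ->
        // With RestHighLevelClient, sendBulk would create a fresh BulkRequest,
        // add IndexRequest(indexName).source(json, XContentType.JSON) for each
        // JSON string, and then call bulk(request, RequestOptions.DEFAULT).
        sendBulk(batch.map(toJson))
        batches++
    }
    return batches
}
```

With three `Person` objects and a `batchSize` of 2, `sendBulk` is called twice: first with two JSON documents, then with the remaining one. Creating a new `BulkRequest` per batch, rather than reusing one across the whole list, avoids re-sending documents that were already indexed.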