I have an API service written in Go, using gin-gonic, which is backed by an Elasticsearch service. Queries hit the API, the API server queries Elasticsearch, Elasticsearch replies with the search results to the API, and the API serves the results back to the party that started the process. All of it "works", end-to-end, but there is a problem.

In Elasticsearch, I have 20 indexes -- each index has thousands of documents. When I use go-elasticsearch to query one of the indexes in Elasticsearch, via gin-gonic, I get an Elasticsearch result back, which gets unmarshal'd into these structs:

type esResult struct {
        Took     int        `json:"took"`
        Timedout bool       `json:"timed_out"`
        Shards   jsonShards `json:"_shards"`
        Hits     jsonHits   `json:"hits"`
}

type jsonShards struct {
        Total      int `json:"total"`
        Successful int `json:"successful"`
        Skipped    int `json:"skipped"`
        Failed     int `json:"failed"`
}

type jsonHits struct {
        Total    jsonTotal     `json:"total"`
        Maxscore float64       `json:"max_score"`
        Hitlist  []jsonHitlist `json:"hits"`
}

type jsonHitlist struct {
        Index  string         `json:"_index"`
        Type   string         `json:"_type"`
        ID     string         `json:"_id"`
        Score  float64        `json:"_score"`
        Source customForIndex `json:"_source"`
        //      Source []byte `json:"_source"`
}

My problem is that the Source field of each entry in esResult.Hits.Hitlist, shown above as a struct of type 'customForIndex', differs depending on which index I query.

What I'd like to do is this:

                var esres esResult
                err := json.Unmarshal(resp, &esres)

That works fine, if I use customForIndex, but if I set Source to string or []byte (so that I can unmarshal Source separately), it does not seem to work.

The workaround is to define duplicate structures (one set per index), but that's awful: it would result in many esResult structures, many Hits structures, and so on, all of them duplicates.

So how do I unmarshal Elasticsearch replies, so that I can get at the custom part of the reply?

Said another way, if I take the reply from Elasticsearch and pass it directly to the API server (and thus to the client), without any processing, then it includes all sorts of irrelevant (to the client) Elasticsearch information. So my plan was to strip all that out, and only include the data from the Source object in the Elasticsearch reply. But if I unmarshal esResult, esResult itself is invalid on other indexes, because the child struct is different.

Thoughts? Any help would be appreciated. Basically my question relates to unmarshaling nested structures in golang and just happens to show up in Elasticsearch use cases because most (but not all) of the structures are duplicative across indexes.

mbenoit6

2 Answers

You can use the "github.com/olivere/elastic/v7" library, which decodes the search response into its SearchResult struct for you:

type SearchResult struct {
        Header          http.Header          `json:"-"`
        TookInMillis    int64                `json:"took,omitempty"`             // search time in milliseconds
        TerminatedEarly bool                 `json:"terminated_early,omitempty"` // request terminated early
        NumReducePhases int                  `json:"num_reduce_phases,omitempty"`
        Clusters        *SearchResultCluster `json:"_clusters,omitempty"`    // 6.1.0+
        ScrollId        string               `json:"_scroll_id,omitempty"`   // only used with Scroll and Scan operations
        Hits            *SearchHits          `json:"hits,omitempty"`         // the actual search hits
        Suggest         SearchSuggest        `json:"suggest,omitempty"`      // results from suggesters
        Aggregations    Aggregations         `json:"aggregations,omitempty"` // results from aggregations
        TimedOut        bool                 `json:"timed_out,omitempty"`    // true if the search timed out
        Error           *ErrorDetails        `json:"error,omitempty"`        // only used in MultiGet
        Profile         *SearchProfile       `json:"profile,omitempty"`      // profiling results, if optional Profile API was active for this search
        Shards          *ShardsInfo          `json:"_shards,omitempty"`      // shard information
        Status          int                  `json:"status,omitempty"`       // used in MultiSearch
}

You can then get at each hit with:

for _, hit := range searchResult.Hits.Hits {
        // your logic here
}
Tesla

I was reading your question and it reminded me of when I fought with Elasticsearch to return only what I needed.

I don't know how you are building your queries (it would help if you could share some code).

In my case, I did the following:

         "aggs": {
            "top_values": {
              "terms": {
                "field": "data.field_group.keyword"
              },
              "aggs": {
                "top_field_hits": {
                  "top_hits": {
                    "_source": {
                      "includes": [ "data.field1", "data.field2", "data.field3", "data.field4"]
                    }
                  }
                }
              }
            }
          }

After that I read the response, iterating over the buckets starting with:

for _, hit := range values["aggregations"].(map[string]interface{})["top_values"].(map[string]interface{})["buckets"].([]interface{})

Inside this for loop, I marshal hit back to JSON and then unmarshal it into my own struct:

var totalValue MyStruct
marshalTotal, _ := json.Marshal(hit)
_ = json.Unmarshal(marshalTotal, &totalValue)

The JSON tags in your struct must match the field names in the Elasticsearch result:

MyProperty string `json:"elasticNameField"`

Alternatively, you could use the olivere/elastic library. In my experience it is a good library, although I did notice some difference in query response times.

I hope this information helps you solve your problem.

Regards,

Jsperk