My ES queries fail because the field names in their query strings contain array indexes. This is a consequence of the following approach: I flatten the JSON requests I receive with the method below.

private void flattenJsonRequestToMap(String currentPath, JsonNode jsonNode, Map<String, Object> map) {
        if (jsonNode == null || jsonNode.isNull()) {
            map.remove(currentPath);
        } else if (jsonNode.isObject()) {
            ObjectNode objectNode = (ObjectNode) jsonNode;
            Iterator<Map.Entry<String, JsonNode>> iter = objectNode.fields();
            String pathPrefix = currentPath.isEmpty() ? "" : currentPath + ".";

            while (iter.hasNext()) {
                Map.Entry<String, JsonNode> entry = iter.next();
                flattenJsonRequestToMap(pathPrefix + entry.getKey(), entry.getValue(), map);
            }
        } else if (jsonNode.isArray()) {
            ArrayNode arrayNode = (ArrayNode) jsonNode;
            for (int i = 0; i < arrayNode.size(); i++) {
                flattenJsonRequestToMap(currentPath + "[" + i + "]", arrayNode.get(i), map);
            }
        } else if (jsonNode.isValueNode()) {
            ValueNode valueNode = (ValueNode) jsonNode;
            map.put(currentPath, valueNode.asText());
        } else {
            LOGGER.error("Unexpected JsonNode type found while flattening JSON request: {}", jsonNode.asText());
        }
    }

When the JSON requests contain lists, my flattened map looks like this.

myUserGuid -> user_testuser34_ibzwlm
numberOfOpenings -> 1
managerUserGuids[0] -> test-userYspgF1_S3P6s
accessCategories[0] -> RESTRICTED
employeeUserGuid -> user_user33_m1minh

I then construct the ES query from the above map with the following method.

public SearchResponse searchForExactDocument(final String indexName, final Map<String, Object> queryMap)
            throws IOException {
        BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery();
        queryMap.forEach((name, value) -> {
            queryBuilder.must(QueryBuilders.matchPhraseQuery(name, value));
            LOGGER.info("QueryMap key: {} and value: {} ", name, value);
        });
        return this.executeSearch(indexName, queryBuilder);
    }

As you can see, it ends up executing a query whose field names contain the array indexes. My mapping structure is as follows.

{
  name=job,
  type=_doc,
  mappingData={
    properties={
    
      myUserGuid ={
        type=text,
        fields={
          keyword={
            ignore_above=256,
            type=keyword
          }
        }
      },
      numberOfOpenings ={
        type=long
      },
      numOfUsage={
        type=long
      },
      accessCategories ={
        type=text,
        fields={
          keyword={
            ignore_above=256,
            type=keyword
          }
        }
      },
      managerUserGuids ={
        type=text,
        fields={
          keyword={
            ignore_above=256,
            type=keyword
          }
        }
      },
      employeeUserGuid ={
        type=text,
        fields={
          keyword={
            ignore_above=256,
            type=keyword
          }
        }
      }
  }
}

Because the array index is appended to the field name, the queries return no search results. How can I work around this? One option I see is to drop the array index while flattening the map, but I need to be able to construct a POJO, which has lists for the fields concerned, from the flattened map. I would appreciate any advice or suggestions. Thanks a lot in advance.

AnOldSoul

1 Answer

Lists in ES are handled as a single field that simply holds several values. If you have "accessCategories": ["foo", "bar"], the document will match both "accessCategories": "foo" and "accessCategories": "bar". The field has no notion of item position, though, so with this data schema a query cannot single out one item of the list (for example, only the first).
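Since ES treats the list as one multi-valued field, the flattened keys can be mapped back to plain field names before querying. Here is a minimal sketch, assuming the flattened map uses the `name[i]` key format shown in the question; the class and method names (`FlattenedKeys`, `stripArrayIndexes`, `groupByField`) are hypothetical and not part of any library:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Elasticsearch or Jackson.
final class FlattenedKeys {

    // Removes every "[<digits>]" segment from a flattened path,
    // e.g. "managerUserGuids[0]" becomes "managerUserGuids".
    static String stripArrayIndexes(String path) {
        return path.replaceAll("\\[\\d+\\]", "");
    }

    // Groups the flattened entries by their index-free field name,
    // preserving the iteration order of the input map.
    static Map<String, List<Object>> groupByField(Map<String, Object> flattened) {
        Map<String, List<Object>> grouped = new LinkedHashMap<>();
        flattened.forEach((path, value) ->
                grouped.computeIfAbsent(stripArrayIndexes(path), k -> new ArrayList<>())
                        .add(value));
        return grouped;
    }
}
```

The original flattened map is left untouched, so it should still be usable for rebuilding the POJO with its lists; only the copy used for querying loses the indexes.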

If you need to address specific items, you can unwrap the list into separate fields such as accessCategories_0, accessCategories_1, and so on, though Elasticsearch limits the total number of fields in one index (the index.mapping.total_fields.limit setting, 1000 by default).
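If positional fields are really needed, the renaming could look roughly like the sketch below. It assumes the `name[i]` key format from the question and that the documents are indexed with the same `name_i` field names; `IndexedFieldNames` and `toIndexedFieldName` are hypothetical names used only for illustration:

```java
// Hypothetical helper, not part of Elasticsearch.
final class IndexedFieldNames {

    // Rewrites each "[<digits>]" segment as "_<digits>",
    // e.g. "accessCategories[0]" becomes "accessCategories_0".
    static String toIndexedFieldName(String path) {
        return path.replaceAll("\\[(\\d+)\\]", "_$1");
    }
}
```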

ilvar
  • I actually don't need to access specific items; I want both list items to be combined with "must". However, my bool query ends up with accessCategories[0], accessCategories[1], etc. Do I need to somehow make the map hold accessCategories -> item1, accessCategories -> item2 in Java to build the query? – AnOldSoul Feb 01 '22 at 23:08
  • Yes, would be something like: `queryBuilder.must(QueryBuilders.matchPhraseQuery(name, value1)); queryBuilder.must(QueryBuilders.matchPhraseQuery(name, value2));` – ilvar Feb 01 '22 at 23:10
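
The idea from the comments, one `must` clause per list value under the same field name, can be sketched as follows. To keep the example free of Elasticsearch dependencies, the clause is added through a callback; `MustClauses`, `forEachClause`, and the `Map<String, List<String>>` input shape are assumptions made for illustration:

```java
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical helper, not part of Elasticsearch.
final class MustClauses {

    // Calls addClause once per (field, value) pair, so a field that holds
    // several values produces several clauses under the same field name.
    static void forEachClause(Map<String, List<String>> valuesByField,
                              BiConsumer<String, String> addClause) {
        valuesByField.forEach((field, values) ->
                values.forEach(value -> addClause.accept(field, value)));
    }
}
```

With the builder from the question, the callback would presumably be `(name, value) -> queryBuilder.must(QueryBuilders.matchPhraseQuery(name, value))`.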