I have a map of field mappings from ElasticSearch in a JSON structure. I need to extract the field names into a list of dotted names (for example pineapple.color) to be used as search terms.

For example, the response I get from ElasticSearch looks like:

{
    "orange": {
        "type": "keyword"
    },
    "apple": {
        "type": "keyword"
    },
    "banana": {
        "type": "keyword"
    },
    "pineapple": {
        "properties": {
            "color": {
                "type": "text"
            },
            "size": {
                "type": "text"
            }
        }
    },
    "vegetables": {
        "properties": {
            "potato": {
                "properties": {
                    "quality": {
                        "type": "keyword"
                    },
                    "price": {
                        "type": "keyword"
                    },
                    "location": {
                        "type": "keyword"
                    }
                }
            }
        }
    }
}


I need to transform this into the following list:

[
  "orange",
  "apple",
  "banana",
  "pineapple.color",
  "pineapple.size",
  "vegetables.potato.quality",
  "vegetables.potato.price",
  "vegetables.potato.location"
]

I'm a bit lost as to where to start. I want to end up with something that works no matter how deep the nesting of object fields and their "properties" keys goes.

edit:

I have a couple of methods I'm trying to do this with, but I keep ending up with nested loops instead of a working recursion:

private static String process(final Map.Entry<String, Object> entry) {
    final String fieldName = entry.getKey();
    final Map<String, Object> value = toSourceMap(entry.getValue());
    if (value.containsKey("properties")) {
        final Map<String, Object> properties = toSourceMap(value.get("properties"));
        process(entry); // ??
    }
    return fieldName;
}

And a small helper method I'm using which casts the unknown object to a map

private static Map<String, Object> toSourceMap(final Object sourceMap) {
    try {
        final Map<String, Object> map = (Map) sourceMap;
        return map;
    } catch (final Exception e) {
        return Map.of();
    }
}

And I'm calling it like this:

final List<String> fieldName = new ArrayList<>();

for (final Map.Entry<String, Object> entry : properties.entrySet()) {
    fieldName.add(process(entry));
}

I'm trying to collect every value returned by the process method into a single list.

edit 2: I can get something that works for one level deep, but this won't capture the deeper objects like vegetables.potato.quality

    private static List<String> process(final Map.Entry<String, Object> entry) {

        final String fieldName = entry.getKey();
        final Map<String, Object> value = toSourceMap(entry.getValue());

        final List<String> fields = new ArrayList<>();
        if (value.containsKey("properties")) {
            final Map<String, Object> properties = toSourceMap(value.get("properties"));

            properties.keySet().stream().map(s -> fieldName + "." + s).forEach(fields::add);
        } else {
            fields.add(fieldName);
        }

        return fields;
    }

and the caller

        final List<String> fieldName = new ArrayList<>();

        for (final Map.Entry<String, Object> entry : properties.entrySet()) {
            fieldName.addAll(process(entry));
        }
Alexander Ivanchenko

3 Answers

I'm sure there's a cleaner and better way to do it, but I was able to achieve what I needed with the following recursive method:

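// Assumes MAPPING_PROPERTIES is a constant holding "properties" and that this method
// lives in a class named FieldMappingFactory (neither is shown in this snippet).
private static List<String> process(final Map.Entry<String, Object> entry) {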

    final String fieldName = entry.getKey();
    final List<String> fields = new ArrayList<>();
    final Map<String, Object> value = toSourceMap(entry.getValue());

    if (value.containsKey(MAPPING_PROPERTIES)) {
        toSourceMap(value.get(MAPPING_PROPERTIES)).entrySet()
            .stream()
            .map(FieldMappingFactory::process)
            .map(nestedFields -> nestedFields.stream().map(f -> "%s.%s".formatted(fieldName, f)).toList())
            .forEach(fields::addAll);
    } else {
        fields.add(fieldName);
    }

    return fields;
}
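
A self-contained variant of the same recursion is sketched below, assuming the mapping has already been parsed into nested Map<String, Object> values as in the question. It passes the dotted prefix down the recursion instead of re-prefixing the child list at each level. The names MappingFieldPaths, leafPaths and childProperties are made up for this illustration and do not come from the code above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MappingFieldPaths {

    /** Collects the dotted path of every leaf field in a "properties"-style map. */
    static List<String> leafPaths(Map<String, Object> properties) {
        List<String> out = new ArrayList<>();
        collect("", properties, out);
        return out;
    }

    private static void collect(String prefix, Map<String, Object> properties, List<String> out) {
        for (Map.Entry<String, Object> entry : properties.entrySet()) {
            String path = prefix.isEmpty() ? entry.getKey() : prefix + "." + entry.getKey();
            Map<String, Object> children = childProperties(entry.getValue());
            if (children == null) {
                out.add(path);                // no nested "properties": a leaf field
            } else {
                collect(path, children, out); // object field: recurse with the longer prefix
            }
        }
    }

    /** Returns the nested "properties" map of a field definition, or null if there is none. */
    @SuppressWarnings("unchecked")
    private static Map<String, Object> childProperties(Object field) {
        if (!(field instanceof Map)) {
            return null;
        }
        Object nested = ((Map<String, Object>) field).get("properties");
        return nested instanceof Map ? (Map<String, Object>) nested : null;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the output order is predictable.
        Map<String, Object> pineappleProps = new LinkedHashMap<>();
        pineappleProps.put("color", Map.of("type", "text"));
        pineappleProps.put("size", Map.of("type", "text"));

        Map<String, Object> mapping = new LinkedHashMap<>();
        mapping.put("orange", Map.of("type", "keyword"));
        mapping.put("pineapple", Map.of("properties", pineappleProps));

        System.out.println(leafPaths(mapping)); // [orange, pineapple.color, pineapple.size]
    }
}
```

Because field names are only ever read from the root map and from "properties" children, a field that happens to be called "type" is still reported. An object field whose "properties" map is empty contributes no paths in this sketch.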

Here's a solution based on the depth-first search (DFS) tree-traversal algorithm.

Since it's iterative, it can process even massive, deeply nested JSON without the risk of a StackOverflowError.

To implement DFS, we need a Stack, and a Map to store the path associated with each node being explored.

As the first step, we read the whole JSON tree and push the resulting root JsonNode onto the stack.

Then, while the stack is not empty, we pop each node and examine it. If the node is an ObjectNode, all of its child nodes, obtained via JsonNode.fields(), are pushed onto the stack. (For an ArrayNode, fields() returns an empty iterator, so array elements are not traversed.)

String json = // the source JSON
JsonNode node = new ObjectMapper().readTree(json);
    
List<String> results = new ArrayList<>();
Deque<JsonNode> stack = new ArrayDeque<>();
        
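// Keyed by identity: JsonNode.equals() compares by value, so structurally equal
// subtrees under different parents would otherwise overwrite each other's path.
Map<JsonNode, List<String>> pathByNode = new IdentityHashMap<>();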
pathByNode.put(node, Collections.emptyList());
        
Set<String> keysToExclude = Set.of("type", "properties"); // add more if you need
        
stack.push(node);
        
while (!stack.isEmpty()) {
    JsonNode current = stack.pop();
    List<String> path = pathByNode.get(current);
            
    if (current instanceof ArrayNode || current instanceof ObjectNode) {
                
        for (Iterator<Map.Entry<String, JsonNode>> it = current.fields(); it.hasNext(); ) {
                    
            Map.Entry<String, JsonNode> next = it.next();
            stack.push(next.getValue());
    
            String propertyName = next.getKey();
            List<String> newPath;
                    
            if (!keysToExclude.contains(propertyName)) {
                newPath = new ArrayList<>(path);
                newPath.add(propertyName);
    
                results.add(String.join(".", newPath)); // record the dotted path of this field
            } else {
                newPath = path;
            }
            pathByNode.put(next.getValue(), newPath);
        }
    }
}
        
results.forEach(System.out::println);

Output:

orange
apple
banana
pineapple
vegetables
vegetables.potato
vegetables.potato.quality
vegetables.potato.price
vegetables.potato.location
pineapple.color
pineapple.size
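
The output above also lists intermediate object fields (pineapple, vegetables, vegetables.potato). If only leaf fields are wanted, as in the question, one possible iterative variant keeps the dotted prefix on the stack next to each map, so no separate path map is needed. This is a rough sketch on plain nested Map values rather than Jackson's JsonNode; IterativeFieldPaths, Frame and leafPaths are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;

final class IterativeFieldPaths {

    /** One pending unit of work: a "properties"-style map and the dotted prefix leading to it. */
    private static final class Frame {
        final String prefix;
        final Map<?, ?> properties;

        Frame(String prefix, Map<?, ?> properties) {
            this.prefix = prefix;
            this.properties = properties;
        }
    }

    /** Returns the dotted paths of leaf fields only, using an explicit stack instead of recursion. */
    static List<String> leafPaths(Map<String, ?> root) {
        List<String> results = new ArrayList<>();
        Deque<Frame> stack = new ArrayDeque<>();
        stack.push(new Frame("", root));

        while (!stack.isEmpty()) {
            Frame frame = stack.pop();
            for (Map.Entry<?, ?> entry : frame.properties.entrySet()) {
                String path = frame.prefix.isEmpty()
                        ? String.valueOf(entry.getKey())
                        : frame.prefix + "." + entry.getKey();

                // A field definition with a nested "properties" map is an object field.
                Object nested = entry.getValue() instanceof Map
                        ? ((Map<?, ?>) entry.getValue()).get("properties")
                        : null;

                if (nested instanceof Map) {
                    stack.push(new Frame(path, (Map<?, ?>) nested)); // visit its children later
                } else {
                    results.add(path);                               // leaf field
                }
            }
        }
        return results;
    }
}
```

Because the stack is LIFO, nested groups come out in reverse sibling order (vegetables.* before pineapple.*), the same as in the output above.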
Alexander Ivanchenko

This is a simpler solution.

private static void process(List<String> paths, String path, JsonNode node, Set<String> excludeKeys) {
    if (node.isValueNode()) {
        paths.add(path);
    } else if (node.isObject()) {
        node.fields().forEachRemaining(elem -> {
            String key = excludeKeys.contains(elem.getKey()) ? path : (path == null ? "" : path + ".") + elem.getKey();
            process(paths, key, elem.getValue(), excludeKeys);
        });
    } else { // This part is not required if there is no array inside the source JSON object
        for (int i = 0; i < node.size(); i++) {
            process(paths, String.format("%s[%d]", path, i), node.get(i), excludeKeys);
        }
    }
}

Caller

JsonNode node = new ObjectMapper().readTree(json); // json: the source JSON string
List<String> paths = new ArrayList<>();
process(paths, null, node, Set.of("properties", "type"));
paths.forEach(System.out::println);

Output

orange
apple
banana
pineapple.color
pineapple.size
vegetables.potato.quality
vegetables.potato.price
vegetables.potato.location
Raymond Choi