1

I am geocoding using OpenRefine. I pulled data from OpenStreetMaps to my datasetstructure of data

I am adding a "column based on this column" for the coordinates.I want to check that the display_name contains "Rheinland-Pfalz" and if it does, I want to extract the latitude and longitude,i.e. pair.lat + ',' + pair.lon. I want to do this iteratively but I don't know how. I have tried the following:

if(display_name[0].contains("Rheinland-Pfalz"), with(value.parseJson()[0], pair, pair.lat + ',' + pair.lon),"nothing")

but I want to do this for each index [0] up to however many there are. I would appreciate if anyone could help.

Edit: Thanks for your answer b2m. How would I extract the display_name corresponding to the coordinates that we get. I want the output to be display_name lat,lon for each match (i.e. contains "Rheinland-Pfalz", because I have a different column containing a piece of string that I want to match with the matches generated already.

For example, using b2m's code and incorporating the display_name in the output we get 2 matches:

Schaumburg, Balduinstein, Diez, Rhein-Lahn-Kreis, Rheinland-Pfalz, Deutschland 50.33948155,7.9784308849342604 Schaumburg, Horhausen, Flammersfeld, Landkreis Altenkirchen, Rheinland-Pfalz, Deutschland 52.622319,14.5865283

For each row, I have another string in a different column. Here the entry is "Rhein-Lahn-Kreis". I want to filter the two matches above to only keep those containing my string in the other column. In this case "Rhein-Lahn-Kreis" but the other column entry is different for each row. I hope this is clear and I would greatly appreciate any help

chris
  • 11
  • 2

1 Answers1

1

Assuming we have the following json data

[
    {"display_name": "BW", "lat": 0, "lon": 1},
    {"display_name": "NRW 1", "lat": 2, "long": 3},
    {"display_name": "NRW 2", "lat": 4, "lon": 5}
]

You can extract the combined elements lat and long with forEach and filter using the following GREL expression e.g. in the Add column based on this column dialog.

forEach(
    filter(
        value.parseJson(), geodata, geodata.display_name.contains("NRW")
    ), el, el.lat + "," + el.lon)
.join(";")

This will result in a new field with the value 2,3;4,5.

You can then split the new multi valued field on the semicolon ";" to obtain separated values (2,3 and 4,5).

Another approach would be to split the JSON Array elements into separate rows, avoiding the forEach and filter functions.

b2m
  • 529
  • 3
  • 11
  • Thank you very much - that worked well. Would you mind taking a look at my edited post as well, please? – chris Jan 19 '22 at 13:48
  • Ok, you extended your problem by adding a second question, which does not help the comprehensibility of your original question. As far as I understand your addition you want to compare the content of one column with the content of the other column... like `value.contains(row.cells["Comparison Columns Name"].value)`. You can use this expression in a "Custom Facet" or to create a indicator column with boolean values, or wrap it in an if-clause, ... – b2m Jan 19 '22 at 19:20