Questions tagged [openrefine]

OpenRefine is the new name for the data cleaning tool that was formerly called Google Refine (and originally Freebase Gridworks).


400 questions
4
votes
1 answer

Parse multivalued JSON in GREL (OpenRefine)

I have a column with the following content: 7. {"resource":"abc"} 8. [{"resource":"def"},{"resource":"ghi"}] To get the content of "resource" I use value.parseJson().resource, which works. If I try to get the content of multivalued cells, I can't get it…
CH_
4
votes
2 answers

Best way to parse a big and intricate JSON file with OpenRefine (or R)

I know how to parse JSON cells in OpenRefine, but this one is too tricky for me. I've used an API to extract the calendar of 4730 Airbnb rooms, identified by their IDs. Here is an example of one JSON file:…
Ettore Rizza
4
votes
0 answers

Using Ant to package the webapp directory with the executable jar

I am working on an open source project that uses Ant, which provides all kinds of useful targets. One of them generates an executable jar and a webapp directory in the same directory. I tried to find ways to package that webapp directory with the…
InsaneBot
4
votes
1 answer

Progressive number in OpenRefine column

Is it possible to generate a "counter", a progressive number in a column using GREL? For example, I would like to add value to that number to generate an identifier for each record.
Aubrey
4
votes
1 answer

Combine column x to n in OpenRefine

I have a table with an unknown number of columns, and I need to combine all columns after a certain point. Consider the following:

| A  | B  | C | D | E |
|----|----|---|---|---|
| 24 | 25 | 7 |   |   |
| 12 | 3  | 4 |   |   |
| 5  | 5  | 5 | 5 | …
OleVik
4
votes
1 answer

Open Refine - Add another file to the existing Project

I've imported a CSV file into OR (Open Refine). Since the CSV file I have contains over 200,000 records, I've decided to create separate files, since uploading the large file wouldn't work on my computer (takes too long, not even sure if it is…
olleh
3
votes
2 answers

Add column with number of occurrences, reset for each record

I have records with a variable number of rows and a column A with 7 possible values, all of which are repeatable. I need a new column B based on A showing the number of occurrences of each value per record. The count should reset in every record. I…
pasq87
3
votes
2 answers

OpenRefine - Merge multiple column values into new column should (?) work

My data includes multiple columns that--for my purposes--are the same. In these places, I need to combine the values in multiple selected columns into a single column. For example, combine columns names1, names2, and names3 into a single column…
3
votes
2 answers

How to do a dynamic regex in OpenRefine GREL replace?

I'm trying to remove, case-insensitively, the value of cell 'artist' from the current cell (which is a song name). I know that replace() can take a regex as an argument…
dvalexieva
3
votes
2 answers

Transpose variable number of rows into columns in OpenRefine

I have an XML file containing records from a library catalogue. I have imported it into OpenRefine but all the values are in one column. I want to transpose it so each field in the record has its own column. However, this is complicated by the fact…
3
votes
1 answer

Removing duplicate strings from a comma separated list, in a cell

I'm using Google Sheets and this is way beyond my simple scripting. I have numerous cells containing comma separated values; AA, BB, CC, BBB, CCC, CCCCC, AA, BBB, BB BB, ZZ, ZZ, AA, BB, CC, BBB, CCC, CCCCC, AA, BBB, BB I'm trying to return: AA,…
Callum
3
votes
1 answer

How to use an or statement in an if statement in OpenRefine

I need to verify 2 values inside an if condition in OpenRefine. I already tried:
if(value > 5.6 | < -33, "inside", "outside")
if(value > 5.6 || < -33, "inside", "outside")
if(value > 5.6 or < -33, "inside", "outside")
Joni Hoppen
3
votes
1 answer

Clustering words in sentences in OpenRefine

I'd like to cluster words in a text file with rows like this: number queries waiting support representatives become available query numbers More specifically, I want to replace words with their cluster representatives without changing the…
Viktor
3
votes
1 answer

OpenRefine: key collision-fingerprint clustering + diacritics

I think there is a bug (or a very surprising feature...) in the way OpenRefine manages diacritics in "key collision-fingerprint" clustering: row 1 : école row 2 : école école ecole -> clustering -> 0 cluster same issue with row 1 : école row 2 :…
Mathieu Saby
3
votes
1 answer

OpenRefine changing the port and host when executable is run directly

The refine.ini file allows setting the port and host without the need to rebuild, but it says the following: # NOTE: This file is not read if you run the Refine executable directly # It is only read of you use the refine shell script or…
InsaneBot