Questions tagged [openrefine]

OpenRefine is the current name of the data-cleaning tool formerly called Google Refine (and originally released as Freebase Gridworks).

400 questions
3
votes
3 answers

OpenRefine custom text faceting

I have a column of names like: Quaglia, Pietro Paolo Bernard, of Clairvaux, Saint, or .E., Calvin F. Swingle, M Abate, Agostino, Assereto Abati, Antonio 10-NA)\u, Ferraro, Giuseppe, ed, Biblioteca comunale ariostea. Mss. (Esteri I want to…
Lara M.
  • 855
  • 2
  • 10
  • 23
3
votes
3 answers

Reconciliation services for OpenRefine not working?

Has anyone been experiencing problems with reconciliation in OpenRefine? I've imported a list of American universities and colleges, selected 50 rows, and tried the Freebase, DBpedia, and OpenCorporates reconciliation services. I've previously had multiple…
ultrageek
  • 637
  • 1
  • 6
  • 13
3
votes
2 answers

How to parse XML in Google Refine to extract data?

I need to parse an XML using Google Refine to extract some data from it. The XML is something like this one
Cesare
  • 1,629
  • 9
  • 30
  • 72
3
votes
2 answers

Mod_Proxy doesn't show OpenRefine App properly

I have OpenRefine (a webapp hosted by Jetty) running at http://127.0.0.1:3333, and everything works perfectly. Now I would like to tunnel this through Apache2 (for security and renaming reasons), so I changed my http.conf file…
Jesus
  • 655
  • 1
  • 7
  • 21
3
votes
1 answer

Filter only by having blank/empty string cells

I want to investigate the rows for which a certain column is empty. I'll fill these cells based on values in other columns, but I want to identify which ones have not yet been done. If I make a filter on that column, it doesn't do anything until I…
drevicko
  • 14,382
  • 15
  • 75
  • 97
3
votes
2 answers

Easiest way to merge rows in Google Refine (OpenRefine) if all columns are identical

I'm cleaning data from multiple sources with OpenRefine (formerly Google Refine). I have files from different sources that contain companies; the column definitions are identical, i.e. UNID | Name | Street | City | Country | Phone | ... sg52d…
Christian Waidner
  • 1,324
  • 1
  • 13
  • 22
3
votes
1 answer

Trying to parse JSON with OpenRefine GREL

I'm trying to parse this JSON but really can't find the way to extract the data I want. { "results" : [ { "address_components" : [ { "long_name" : "44", "short_name" : "44", "types" : [ "street_number" ] }, {…
3
votes
1 answer

Google refine cross-reference between row and column

I'm not sure if this can be achieved in Google Refine at all. But basically, I have data like this. The first table is the table of all the users. The second table shows all the friends. However, in the second table, in the "friends" column, not all the…
toy
  • 11,711
  • 24
  • 93
  • 176
2
votes
1 answer

How to get multilingual sites after Wikidata reconciliation in OpenRefine

I have a column of reconciled entities in OpenRefine, including entities like Q56085233, and I would like to retrieve all links inside "Multilingual sites", if possible with a separator or only one at a time. That is, Q56085233, for instance, has…
silviaegt
  • 317
  • 3
  • 12
2
votes
1 answer

How to extract undo/redo transformations from OpenRefine Client?

I am using the OpenRefine client: https://github.com/opencultureconsulting/openrefine-client I need to automate processes and for this I need to be able to extract/export OpenRefine transformation history (undo/redo) in JSON format from the client,…
frankh07
  • 41
  • 4
2
votes
2 answers

OpenRefine error: "object[] value not storable"

I'm trying to extract an array of industry code descriptions from the OpenCorporates.com JSON output using OpenRefine. I've extracted the industry_codes array from the JSON body into a new column. Some records have a full array, some just have [ ].…
woodbine
  • 553
  • 6
  • 26
2
votes
1 answer

Underscore and dash in column names after JSON import

I've been using OpenRefine very successfully for a couple of years, working solely with CSV (and TSV) source files. Recently I had some tables from an SQL database that I wanted to bring into OpenRefine, so I exported them (from SQL) as JSON and then…
jcquokka88
  • 55
  • 6
2
votes
1 answer

OpenRefine sample extension not building

I'd like to write my own OpenRefine extension. Before starting any implementation, I just want to build the sample extension from OpenRefine to get me started. However, I'm getting the Maven error: Could not resolve dependencies for project…
Jan
  • 7,444
  • 9
  • 50
  • 74
2
votes
2 answers

Does OpenRefine support Python3?

I have my own Python library that I would like to use in OpenRefine, as described here. However, it seems that all the Python code in OpenRefine goes through Jython, which supports only Python 2. Is there a way to run Python 3 code in OpenRefine? Cheers
Jan
  • 7,444
  • 9
  • 50
  • 74
2
votes
1 answer

Open Refine: compare strings within cell, and delete everything after a certain character

I have a column of values with "//" as the separator between them. For example, one cell might contain: September 17 2021 // September 18 2021. I want to check whether what comes before and after the separator is the same, and if so, to delete the…
Hev
  • 35
  • 3