2

I have two questions regarding OpenRefine:

  1. I have multiple sets of data in the form of Excel files, and I want to load them all into OpenRefine. How do I append File A, File B, and File C in OpenRefine? All the files have the same column names. Note that I am not trying to merge the files or use cell.cross on a common, unique field. I just want to append the three files together into one project.

  2. I have a dataset which includes the fields Inspection Type and Violations. Some of the common categories under Inspection Type are Accident, Complaint, Referral, Planned, and a couple of others. The Violations records contain three common types: Serious, Repeat, and Willful. What I need to analyze is which Violations correspond to each Inspection Type (say, Accident), and what the count of each is. For example, for how many Accident inspections was the Violation found to be Serious, or Willful? I would like to display that information in a separate column. I was able to facet the Inspection Type column to get the count under each type, but I could not work out the next step.

Any help will be much appreciated!

Brian Tompsett - 汤莱恩

2 Answers

1

As described in "Open Refine - Add another file to the existing Project", you could export each project to a CSV file, create a zip file containing those CSVs, and then re-import that zip file into OpenRefine.
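
The zip step can also be scripted. Below is a minimal Python sketch, assuming the projects have already been exported to CSV; the function name `zip_csv_exports` and the file names in the usage note are hypothetical, not part of OpenRefine.

```python
import zipfile
from pathlib import Path


def zip_csv_exports(csv_paths, zip_path):
    """Bundle exported CSV files into a single zip archive."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for csv_path in csv_paths:
            csv_path = Path(csv_path)
            # Store only the file name so the archive has no nested folders.
            archive.write(csv_path, arcname=csv_path.name)
    return zip_path
```

For example, `zip_csv_exports(["file_a.csv", "file_b.csv", "file_c.csv"], "combined.zip")` would produce an archive that can then be selected in the Create Project screen.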

YudhiWidyatama
-1

1. Append files

When you create your project in Refine, you have the option to select "Worksheets to Import". The preview panel lets you make sure that things are in order before creating the project. If this doesn't work, then it is best to do this in Excel first.

2. Faceting

Note that you can combine multiple facets. For example, you can first select all records that belong to a certain Inspection Type (say, Accident) and then create a new facet on the Violations column to get a count for each violation type. You can then create a new column to hold the count.
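
To make the target numbers concrete, here is a small Python sketch of the same cross-count done outside Refine. It assumes each record carries a single Violations value; the sample rows and the `count_pairs` helper are made up for illustration, and a real run would read the rows from an exported CSV instead.

```python
from collections import Counter

# Hypothetical sample rows standing in for the exported dataset.
rows = [
    {"Inspection Type": "Accident", "Violations": "Serious"},
    {"Inspection Type": "Accident", "Violations": "Serious"},
    {"Inspection Type": "Accident", "Violations": "Willful"},
    {"Inspection Type": "Complaint", "Violations": "Repeat"},
    {"Inspection Type": "Planned", "Violations": "Serious"},
]


def count_pairs(rows, type_field="Inspection Type", violation_field="Violations"):
    """Count how often each (inspection type, violation) pair occurs."""
    return Counter((row[type_field], row[violation_field]) for row in rows)


counts = count_pairs(rows)

# One line per combination: inspection type, violation, count.
for (inspection_type, violation), n in sorted(counts.items()):
    print(f"{inspection_type}\t{violation}\t{n}")
```

Each count should match what the combined facets show when one Inspection Type and one Violations value are selected together.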

PS: It is a best practice on Stack Overflow to ask only one question per post. Next time, create a separate thread for each question.

magdmartin
  • Thanks, Magdmartin. I will comment on Q.1. Just to clarify, I am not talking about worksheets from the same file. I am talking about how to merge different projects that have been uploaded to OpenRefine: for example, given Project A from File A and Project B from File B, I want to append/merge the two projects into one, A+B. Something analogous to the Merge function in Google Fusion Tables. – The Magiclightbulb Oct 06 '14 at 18:25
  • Refine does not support this; you will have to merge the files outside the application. – magdmartin Oct 06 '14 at 23:38
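
One way to merge the files outside the application is a short script. The sketch below is only an illustration: it assumes the Excel files have first been saved as CSV with identical headers, and `append_csv_files` is a hypothetical helper name.

```python
import csv


def append_csv_files(input_paths, output_path):
    """Stack CSV files that share the same header into one CSV file."""
    expected_header = None
    with open(output_path, "w", newline="", encoding="utf-8") as out_file:
        writer = csv.writer(out_file)
        for path in input_paths:
            with open(path, newline="", encoding="utf-8") as in_file:
                reader = csv.reader(in_file)
                header = next(reader, None)
                if header is None:
                    continue  # skip empty files
                if expected_header is None:
                    expected_header = header
                    writer.writerow(header)  # write the header only once
                elif header != expected_header:
                    raise ValueError(f"{path} has different column names")
                writer.writerows(reader)  # copy the remaining data rows
    return output_path
```

The combined file can then be imported as a single project. The header check is there so that a file with mismatched columns fails loudly instead of being appended silently.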