
I am new to OpenRefine, and I cannot figure out how to split a multi-valued cell on each character in the cell. For example, I cannot split a cell with the value "mod" into three rows: one with "m", one with "o", and one with "d".

When a delimiter is present in the data (e.g., "m,o,d"), splitting is easy. However, I deal with a lot of dental data in which the tooth number is in one cell (e.g., "3") and the tooth surfaces are represented as a string in another (e.g., "mod"). In this case, "m" is the mesial surface of the tooth, "o" is the occlusal surface, and "d" is the distal surface.

In Python, I know I can get the separate characters using list(); e.g., list("mod") returns ["m", "o", "d"]. Can I do something like this in OpenRefine?

Bill

1 Answer


I think the simplest way of doing this in OpenRefine is:

value.split(//)

Passing an empty regular expression (//) to the 'split' function splits the string into its individual characters.
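For comparison with the Python behaviour mentioned in the question, here is a minimal sketch of the same per-character split in plain Python. The function name split_chars is made up for illustration and is not part of OpenRefine.

```python
def split_chars(value):
    """Return the characters of `value` as a list, one element per character.

    Hypothetical helper for illustration; it mirrors list("mod") from the question.
    """
    return list(value)


print(split_chars("mod"))  # ['m', 'o', 'd']
```

Note that Python's re.split with an empty pattern is not a drop-in equivalent of the GREL expression, so list() is the safer comparison here.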

Owen Stephens
  • Thanks! This works for creating a list of values. However, when I want to split the cell into multiple rows (i.e., "Split multi-valued cells"), I am only presented with a text box in which to enter a separator. I can't enter "value.split(//)" there. – Bill Nov 30 '16 at 20:41
  • For that you need two steps. First, transform the cells with value.split(//).join("|"), choosing a join character that doesn't appear anywhere in your data. Then use "Split multi-valued cells" with the character you chose for the join as the separator. – Owen Stephens Dec 01 '16 at 07:44
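The two-step approach from the last comment can be sketched in plain Python to show why the join character must not already occur in the data. This only illustrates the logic and is not OpenRefine code; the function names join_for_split and split_multivalued are hypothetical.

```python
def join_for_split(value, sep="|"):
    """Step 1: insert `sep` between every character of `value`.

    Raises ValueError if `sep` already occurs in the data, because the
    later split could not tell real separators from original characters.
    """
    if sep in value:
        raise ValueError("separator %r already appears in %r" % (sep, value))
    return sep.join(value)


def split_multivalued(cell, sep="|"):
    """Step 2: split the joined cell on `sep`, one value per resulting row."""
    return cell.split(sep)


joined = join_for_split("mod")
print(joined)                     # m|o|d
print(split_multivalued(joined))  # ['m', 'o', 'd']
```

In OpenRefine itself, the first step corresponds to the cell transform and the second to the "Split multi-valued cells" dialog.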