3

I'm using Google Sheets and this is way beyond my simple scripting.

I have numerous cells containing comma separated values;

AA, BB, CC, BBB, CCC, CCCCC, AA, BBB, BB

BB, ZZ, ZZ, AA, BB, CC, BBB, CCC, CCCCC, AA, BBB, BB

I'm trying to return:

AA, BB, CC, BBB, CCC, CCCCC etc.

BB, ZZ, AA, CC, BBB, CCC, CCCCC etc.

... remove the duplicates. Per cell.

I can't get my head around a solution. I've tried every online tool that removes duplicates. BUT they all remove duplicates throughout my document.

Part of the problem is, I can't put the cells in 'alphabetical' order (which would make things simple) they have to be kept in the original order they appear.

I also have, at my disposal (but beyond my skill) Open Refine which I believe is a clever tool.

Community
  • 1
  • 1
Callum
  • 554
  • 2
  • 7
  • 18

1 Answers1

4

Here is how to do that in OpenRefine.

enter image description here

The formula I used is :

value.split(',').uniques().join(',')

It means : split the value in the cells by commas, remove duplicates, join them again using commas.

EDIT :

Another solution in OpenRefine using Python instead of GREL. This one keep better the original order.

enter image description here

Python/Jython Script:

from collections import OrderedDict
dedup = list(OrderedDict.fromkeys(value.replace(' ','').split(',')))
return ",".join(dedup)
Ettore Rizza
  • 2,800
  • 2
  • 11
  • 23
  • Amazing, thanks so much. Also how do you make these screenshot animations? – Callum Jun 20 '18 at 05:49
  • @Callum I used a little windows app (http://www.screentogif.com/?l=fr_fr), but you can find a lot of other softwares here : https://alternativeto.net/software/screentogif/ – Ettore Rizza Jun 20 '18 at 08:42