1

I'm currently using this formula to remove the stop words from a list of words:

=SUBSTITUTE(REGEXREPLACE(A1,"or|and|but|for|of",""),",,",",")

However, it also removed some letters in some words which are not stop words. Ex: nord, foreign. Is there any way to remove the words with the matching case only?

player0
  • 124,011
  • 12
  • 67
  • 124
JOY
  • 399
  • 3
  • 15

1 Answers1

3

try:

=SUBSTITUTE(REGEXREPLACE(A1, "\b(or|and|but|for|of)\b", ), ",,", ",")

enter image description here


upgrade:

=JOIN(",", SPLIT(REGEXREPLACE(A1, "\b(or|and|but|for|of)\b", ), ", "))

enter image description here

player0
  • 124,011
  • 12
  • 67
  • 124
  • 2
    Just `=REGEXREPLACE(A1,"\s*\b(or|and|but|for|of)\b,?\s*","")` would do. – JvdV Aug 06 '20 at 09:16
  • 1
    `=REGEXREPLACE(A1, "\s+\b(or|and|but|for|of)\b,?\s+", )` doesn't work but `=REGEXREPLACE(A1,"\s*\b(or|and|but|for|of)\b,?\s*","")` works correctly. – JOY Aug 06 '20 at 20:01
  • Thanks! I compiled this huge regex and list of stop words with adverbs and adjectives and the months here https://web.archive.org/web/20230216023942/https://rentry.co/2fve82 and here https://web.archive.org/web/20230216021429/https://rentry.co/3ibya and reached the 50k characters cell limit but could perform this workaround by player0: https://stackoverflow.com/a/55070275/10789707 – Lod Feb 16 '23 at 02:45