
I'm having trouble removing non-breaking space (`&nbsp;`) whitespace from strings.

vehicle = [" 2013 ", "BMW ", "535 ", "Sedan 4 Door "] 
v = vehicle[0]
# => " 2013 "

v[-1].ord.chr
# => "\xA0"

Failed Attempts:

vehicle.map { |d| d.gsub(/\S(\w*)/, '\1') }
# => ["2013", "MW", "35", "edan  oor"] (space gone but so are other characters.)

vehicle.map { |d| d.gsub(/\xA0/, '') }
# => SyntaxError: (irb):340: invalid multibyte escape: /\xA0/

vehicle.map { |d| d.gsub(/#{160.chr}/, '') }
# => Encoding::CompatibilityError: incompatible encoding regexp match (ASCII-8BIT regexp with UTF-8 string)

Answer from this question works:

vehicle.map { |d| d.gsub("\302\240", ' ').strip }
# => ["2013", "BMW", "535", "Sedan 4 Door"] 

but it doesn't explain why/how. Can someone explain how and why this works? Or suggest an alternative?

MrPizzaFace
  • `\S` means *not* a space (in general, the `\`-capital-letter character classes mean the *opposite* of the corresponding lower-case character classes). Your original attempt works if you change `\S` to `\s`. Note, however, that the use of `(\w*)` is unnecessary; just use `d.gsub(/\s/, '')`. – Kyle Strand May 08 '14 at 23:02
  • @KyleStrand I tried `.gsub(/\s/, '')` and unfortunately that doesn't work either. See sample output: `=> [[" 2013 ", "Lincoln ", "MKXAWD ", "Utility4x44Door "], [" 2013 ", "BMW ", "X5 ", "Utility4x44Door "], [" 2013 ", "BMW ", "535 ", "Sedan4Door "]]` – MrPizzaFace May 08 '14 at 23:14

1 Answer


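You should be able to simply use the POSIX bracket expression /[[:space:]]/, which matches all whitespace, Unicode or not. In Ruby, \s only matches ASCII whitespace, which is why it leaves the non-breaking space (U+00A0) untouched.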
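A minimal sketch of that suggestion, assuming the data looks like the `vehicle` array from the question with the non-breaking spaces written explicitly as `\u00A0` (their exact positions in the real data are an assumption). The names `ascii_only` and `cleaned` are only illustrative:

```ruby
# Sample data modeled on the question; \u00A0 is the non-breaking space.
vehicle = ["\u00A02013\u00A0", "BMW\u00A0", "535\u00A0", "Sedan 4 Door\u00A0"]

# \s only covers ASCII whitespace, so the non-breaking spaces survive.
ascii_only = vehicle.map { |d| d.gsub(/\A\s+|\s+\z/, '') }

# [[:space:]] is Unicode-aware on UTF-8 strings. Anchoring the pattern to
# the ends of the string keeps the ordinary spaces inside "Sedan 4 Door".
cleaned = vehicle.map { |d| d.gsub(/\A[[:space:]]+|[[:space:]]+\z/, '') }
# => ["2013", "BMW", "535", "Sedan 4 Door"]
```

Anchoring with `\A` and `\z` is a design choice here: an unanchored `gsub(/[[:space:]]/, '')` would also remove the spaces between words, as in the `"Sedan4Door"` output shown in the comments.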

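\302\240 is just the UTF-8 encoding of the non-breaking space (U+00A0), written as octal escapes: the two bytes 0xC2 0xA0. That is why gsub("\302\240", ' ') finds the character, and the following strip then removes the ordinary spaces it left at the ends.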
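The byte-level equivalence can be checked in irb. This is a sketch that assumes a UTF-8 source encoding (the default in current Ruby versions) and that `Encoding.default_internal` is unset. The variable names are illustrative only:

```ruby
nbsp = "\u00A0"                   # non-breaking space as a UTF-8 string

bytes      = nbsp.bytes           # => [194, 160], i.e. 0xC2 0xA0
same_octal = nbsp == "\302\240"   # => true: octal 302 = 0xC2, octal 240 = 0xA0
same_hex   = nbsp == "\xC2\xA0"   # => true: the same bytes in hex notation

# 160.chr is one raw byte in a binary (ASCII-8BIT) string, not a UTF-8
# character, which is why interpolating it into a regexp raised
# Encoding::CompatibilityError against the UTF-8 data.
raw  = 160.chr
utf8 = 160.chr(Encoding::UTF_8)   # => "\u00A0", the real non-breaking space
```

Given that, `d.gsub(160.chr(Encoding::UTF_8), '')` or `d.gsub("\u00A0", '')` should behave like the `"\302\240"` version while being easier to read.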

ChristopheD