
My XML file encloses strings in single quotes. Array elements are comma-separated.

'Hello', 'World'

Normally, extracting comma-separated values is easy:

let result = myString.split{ $0 == "," }.map(String.init)

The problem is that I can't recklessly split on the comma: if the comma is enclosed in single quotes it is text, otherwise it is an array element separator:

'Hello', 'World', 'Hello, World'

Should produce:

["Hello", "World", "Hello, World"]

Note two things:

  1. An empty pair of single quotes is an empty string that can't be discarded (it's an element of the array).
  2. I can't guarantee the whitespace between elements; a user may have tampered with the file.

'Hello', 'World'     , 'Hello, World', ''

should produce:

["Hello", "World", "Hello, World", ""]

I need some way to recognize a separator comma, that is, one that lies outside the single quotes, as in ', '. A better approach would probably be to keep only what sits between single quotes, but I don't know how to do this.

MH175
  • This looks more like a CSV format. – Are strings always enclosed in single-quotes? Can there be a single-quote inside a string? How would that be escaped? – Martin R Dec 05 '17 at 18:59
  • Yes, strings are always inside single quotes. The user can't use them. Odd, I know; this is from a niche program from I believe around 2003. – MH175 Dec 05 '17 at 19:04
  • @MH175 Can you elaborate more on what this format is? `XML` doesn't have native "arrays" – Alexander Dec 05 '17 at 19:21
  • I'm afraid I can't :-( Yeah, it's a totally unique product from the early 2000s that ran on its own hardware. The encoding is not quite XML, but very close. If you looked at it quickly you would easily be fooled into thinking it is! – MH175 Dec 05 '17 at 19:24

1 Answer


What about splitting the string on single quotes and then removing the elements that consist only of a comma and surrounding whitespace?

Note that you should also remove the first and last elements of the result: the input starts and ends with a single quote, so splitting produces an empty element at each end.
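To make the extra elements visible, here is a small illustration of what `components(separatedBy:)` returns for a short input. The constant names `sample` and `pieces` are made up for this example:

```swift
import Foundation

// Illustrative input with two quoted strings.
let sample = "'Hello', 'World'"

// Splitting on the quote character leaves an empty piece at each end
// and the separator text between the strings.
let pieces = sample.components(separatedBy: "'")
print(pieces)
// ["", "Hello", ", ", "World", ""]
```

The empty strings at both ends come from the leading and trailing quotes, and the ", " piece is the separator between the two strings.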

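import Foundation

// Split on single quotes, then drop the separator pieces between strings.
var result = myString.components(separatedBy: "'").filter {
    $0.trimmingCharacters(in: .whitespaces) != ","
}

// Discard the pieces before the first quote and after the last quote.
result.removeLast()
result.removeFirst()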
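
One caveat: a quoted string that is itself just a comma (for example `','`) would be discarded by the filter above. If that can occur, a possible alternative is to walk the characters and keep only what lies between quotes. This is a minimal sketch, assuming (as stated in the comments on the question) that a single quote can never appear inside a string. The function name `quotedFields(in:)` is hypothetical and not part of the original answer, and text after an unclosed quote is simply dropped:

```swift
// Hypothetical helper: collects the text between each pair of single quotes
// and ignores everything outside them.
func quotedFields(in line: String) -> [String] {
    var fields: [String] = []
    var current = ""
    var insideQuotes = false

    for character in line {
        if character == "'" {
            if insideQuotes {
                // Closing quote: the field is complete, even if it is empty.
                fields.append(current)
                current = ""
            }
            insideQuotes.toggle()
        } else if insideQuotes {
            // Commas and spaces inside quotes are ordinary text.
            current.append(character)
        }
        // Characters outside quotes (separators, stray whitespace) are skipped.
    }
    return fields
}

print(quotedFields(in: "'Hello', 'World'     , 'Hello, World', ''"))
// ["Hello", "World", "Hello, World", ""]
```

Because separators are never inspected, the amount of whitespace around the commas does not matter, and no first/last cleanup is needed.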
Tamás Sengel