Does anyone know how I can use NSScanner to split a string on commas into an array, EXCEPT where a comma is embedded within quotes?

Previously I had been using:

  NSArray *arrData = [strData componentsSeparatedByString:@","];

However, my string contains quoted sections with commas inside them, and I don't want those to be split. I have never worked with NSScanner before and I am struggling to make sense of the documentation. Has anyone done something similar before?

JH95

2 Answers

If you absolutely have to use an NSScanner, you could do something like this.

        NSScanner *scanner = [[NSScanner alloc] initWithString:@"\"Foo, Inc\",0.00,1.00,\"+1.5%\",\"+0.2%\",\"Foo\""];
        NSCharacterSet *characters = [NSCharacterSet characterSetWithCharactersInString:@"\","];
        [scanner setCharactersToBeSkipped:nil];
        NSMutableArray *words = [[NSMutableArray alloc]init];
        NSMutableString *word = [[NSMutableString alloc] init];
        BOOL inQuotes = NO;
        while(scanner.isAtEnd == NO)
        {
            NSString *subString;
            [scanner scanUpToCharactersFromSet:characters intoString:&subString];
            NSUInteger currentLocation = [scanner scanLocation];
            if(currentLocation >= scanner.string.length)
            {
                if(subString.length > 0)
                    [words addObject:subString];
                break;
            }
            if([scanner.string characterAtIndex:currentLocation] == '"')
            {
                inQuotes = !inQuotes;
                if(subString == nil)
                {
                    [scanner setScanLocation:currentLocation + 1];
                    continue;
                }
                [word appendFormat:@"%@",subString];
                if(word.length > 0)
                   [words addObject:word.copy];
                [word deleteCharactersInRange:NSMakeRange(0, word.length)];
            }
            if([scanner.string characterAtIndex:currentLocation] == ',')
            {
                if(subString == nil)
                {
                    [scanner setScanLocation:currentLocation + 1];
                    continue;
                }
                if(inQuotes == NO)
                    [words addObject:subString];
                else
                    [word appendFormat:@"%@,",subString];
            }
            [scanner setScanLocation:currentLocation + 1];
        }

EDIT: This gives the following output:

Foo, Inc

0.00

1.00

+1.5%

+0.2%

Foo

Hope this is what you want.

As you can see, it gets complicated and very error-prone. I would recommend using a regular expression for this instead.
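
For what it's worth, here is a minimal sketch of the regex route using NSRegularExpression. This is not code from the thread: the helper name `SplitQuotedLine` and the pattern are my own assumptions. It also assumes quotes are always balanced and that there are no escaped quotes inside a quoted field. The pattern matches either a double-quoted run (kept without its quotes) or a run of non-comma characters, so empty unquoted fields such as the middle one in `a,,b` are skipped.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (the name is illustrative, not from the answer above).
// Returns the fields of one line, treating commas inside double quotes as text.
static NSArray<NSString *> *SplitQuotedLine(NSString *line)
{
    // Alternative 1: a double-quoted run, captured without its quotes (group 1).
    // Alternative 2: a run of one or more non-comma characters (group 2).
    NSString *pattern = @"\"([^\"]*)\"|([^,]+)";

    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return @[];
    }

    NSMutableArray<NSString *> *fields = [NSMutableArray array];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:line
                       options:0
                         range:NSMakeRange(0, line.length)];

    for (NSTextCheckingResult *match in matches) {
        // A group that did not take part in the match has location NSNotFound.
        NSRange quoted = [match rangeAtIndex:1];
        NSRange range = (quoted.location != NSNotFound) ? quoted
                                                        : [match rangeAtIndex:2];
        [fields addObject:[line substringWithRange:range]];
    }
    return [fields copy];
}
```

Called on the sample string from the comments, this should give the same six items as the scanner version: Foo, Inc / 0.00 / 1.00 / +1.5% / +0.2% / Foo. If you want to keep the surrounding quotes, take `match.range` instead of the capture groups. For real CSV data with escaped quotes or empty fields, a proper parser is still the safer choice.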

lead_the_zeppelin
  • That keeps crashing for me when I insert my own string. Maybe regex would be better. How would I split it into an array using regex, though? – JH95 Apr 18 '14 at 01:25
  • Still no luck. Here's a copy of my string if it helps: "Foo, Inc",0.00,1.00,"+1.5%","+0.2%","Foo" – JH95 Apr 18 '14 at 01:29
  • I clearly can't write proper code without debugging it. My updated code should work – lead_the_zeppelin Apr 18 '14 at 01:52
  • This is what the array looks like after I looped through each index: Item 0: Foo Item 1: Inc" Item 2: 0.00 Item 3: 1.00 Item 4: "+1.5%","+0.2%","Foo. Is there a way of doing it with regex? – JH95 Apr 18 '14 at 02:01
  • Isn't that what you wanted? It didn't split that part of the string. I only know very basic regex, sorry. Someone else can help you better. – lead_the_zeppelin Apr 18 '14 at 02:03
  • I wanted the data to be like this: item 1: "Foo, Inc" item 2: 0.00 item 3: "+1.5%" item 4: "+0.2%". However, thank you for all your time @lead_the_zeppelin – JH95 Apr 18 '14 at 02:09
  • I think you missed 1.00 and Foo from your output. Check my code now. It should work! Whew – lead_the_zeppelin Apr 18 '14 at 02:51
  • Voted up. But with this note - this code skips empty elements (""). Here is my working (changed part) of code: – Panayot Feb 11 '20 at 09:45
  • if([scanner.string characterAtIndex:currentLocation] == '"') { inQuotes = !inQuotes; if((subString == nil)&&(!([scanner.string characterAtIndex:currentLocation+1] == '"'))) { [scanner setScanLocation:currentLocation + 1]; continue; } else if (subString == nil){subString = @" ";} [word appendFormat:@"%@",subString]; ...etc... – Panayot Feb 11 '20 at 09:46

This is the same general procedure as lead_the_zeppelin's answer, but I think it's considerably more straightforward. We use an NSMutableString, accum, to build up each piece between commas, which may contain any number of quoted segments.

Each iteration, we scan up to a comma, a quote, or the end of the string, whichever comes first. If we find a comma, whatever's been accumulated so far should be saved. For a quote, we pick up everything up to the closing quote mark -- this is the key that avoids interpreting commas inside quotations as split points. Note that this won't work if quotes are not always balanced. If it's neither of those, we've reached the end of the string, so we save the accumulated string and quit.

// Demo data
NSArray * commaStrings = @[@"This, here, \"has\" no quoted, commas.",
                           @"This \"has, no\" unquoted \"commas,\"",
                           @"This, has,no,quotes",
                           @"This has no commas",
                           @"This has, \"a quoted\" \"phrase, followed\", by a, quoted phrase",
                           @"\"This\", one, \"went,\", to, \"mar,ket\"",
                           @"This has neither commas nor quotes",
                           @"This ends with a comma,"];

NSCharacterSet * commaQuoteSet = [NSCharacterSet characterSetWithCharactersInString:@",\""];

for( NSString * commaString in commaStrings ){

    NSScanner * scanner = [NSScanner scannerWithString:commaString];
    // Scanner ignores whitespace by default; turn that off.
    [scanner setCharactersToBeSkipped:nil];

    NSMutableArray * splitStrings = [NSMutableArray new];

    NSMutableString * accum = [NSMutableString new];

    while( YES ){

        // Set to an empty string for the case where the scanner is
        // at the end of the string and won't scan anything;
        // appendString: will die if its argument is nil.
        NSString * currScan = @"";
        // Scan up to a comma or a quote; this will go all the way to
        // the end of the string if neither of those exists.
        [scanner scanUpToCharactersFromSet:commaQuoteSet
                                intoString:&currScan];
        // Add the just-scanned material to whatever we've already got.
        [accum appendString:currScan];

        if( [scanner scanString:@"," intoString:NULL] ){
            // If it's a comma, save the accumulated string,
            [splitStrings addObject:accum];
            // clear it out,
            accum = [NSMutableString new];
            // and keep scanning.
            continue;
        }
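        else if( [scanner scanString:@"\"" intoString:NULL] ) {
            // If a quote, append the quoted segment to the accumulation.
            // Reset first: for an empty segment ("") the scan below fails
            // and would otherwise leave the previous text in currScan.
            currScan = @"";
            [scanner scanUpToString:@"\""
                         intoString:&currScan];
            [accum appendFormat:@"\"%@\"", currScan];
            // Consume the closing quote and continue, appending until
            // the next comma.
            [scanner scanString:@"\"" intoString:NULL];
            continue;
        }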
        else {
            // Otherwise, there's nothing else to split;
            // just save the remainder of the string
            [splitStrings addObject:accum];
            break;
        }

    }
    NSLog(@"%@", splitStrings);
}

Also, as Chuck suggested, you might want to just get a CSV parser.

jscs