3

I am reading text from a PDF into an NSString. I remove the whitespace and line breaks using the code below:

NSString *pdfString = convertPDF(path);
pdfString = [pdfString stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
pdfString = [pdfString stringByReplacingOccurrencesOfString:@"\r" withString:@""];
pdfString = [pdfString stringByReplacingOccurrencesOfString:@"\n" withString:@""];

But this also removes paragraph breaks and blank lines. I want to remove only a \n or \r that occurs on its own, and keep paragraph breaks, repeated tabs, and runs of consecutive line breaks.

Stonz2
ankit_rck
  • I think you'll need to write your own method that checks whether there are multiple occurrences of \n or \r and reacts accordingly – Gil Sand May 06 '15 at 13:06
  • http://stackoverflow.com/questions/8608346/replace-only-the-first-instance-of-a-substring-in-an-nsstring this might help – Vig May 06 '15 at 13:07
  • Can you show us an extract of the text you are processing, and what the result is supposed to be? – Niko May 06 '15 at 13:20
  • I want to replace only single occurrences of \n. For example, h\n e\n l\n l\n o\n \n \n \n w\n o\n r\n l\n d should be shown as hello and world, but the number of line breaks between hello and world should stay the same (3 in this case): hello \n\n\n world – ankit_rck May 06 '15 at 14:48

3 Answers

3

There are two approaches:

  1. Do a manual find in a loop

You can get the range of a line break with -rangeOfCharacterFromSet:options:range:. The trick with this approach is to shrink the search range after every match. You can then simply compare the found range with the search range: if the found range starts at the very beginning of the search range, the match is part of a double (or triple) \r.

  2. Get the individual components

-componentsSeparatedByCharactersInSet: (NSString) returns an array of the strings that were separated by \r. Empty strings in this array mark a double (or triple) \r. Simply replace them with a \r and then rejoin the components with a space.
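
A rough sketch of the first approach might look like the following. It is only an illustration under some assumptions: RemoveSingleLineBreaks is a hypothetical helper name, each \r and \n counts as its own line break (so a Windows \r\n pair is treated as a run of two and kept), and instead of comparing the found range with the search range it measures the length of each run, which is a variation on the idea described above.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of the original answer): drops every line
// break that stands alone and keeps runs of two or more unchanged.
// Assumption: \r and \n are counted separately, so "\r\n" is a run of two.
static NSString *RemoveSingleLineBreaks(NSString *input) {
    NSCharacterSet *breaks = [NSCharacterSet characterSetWithCharactersInString:@"\r\n"];
    NSMutableString *result = [NSMutableString stringWithCapacity:input.length];
    NSUInteger length = input.length;
    NSUInteger position = 0;

    while (position < length) {
        // Search only the part of the string that has not been processed yet.
        NSRange searchRange = NSMakeRange(position, length - position);
        NSRange found = [input rangeOfCharacterFromSet:breaks options:0 range:searchRange];
        if (found.location == NSNotFound) {
            // No more line breaks: copy the remaining text and stop.
            [result appendString:[input substringWithRange:searchRange]];
            break;
        }

        // Copy the text that precedes the line break.
        NSRange textRange = NSMakeRange(position, found.location - position);
        [result appendString:[input substringWithRange:textRange]];

        // Extend the match over all directly following line breaks.
        NSUInteger runEnd = found.location;
        while (runEnd < length && [breaks characterIsMember:[input characterAtIndex:runEnd]]) {
            runEnd++;
        }

        // Keep the run only if it contains more than one line break.
        NSUInteger runLength = runEnd - found.location;
        if (runLength > 1) {
            [result appendString:[input substringWithRange:NSMakeRange(found.location, runLength)]];
        }
        position = runEnd;
    }
    return [result copy];
}
```

Called on the sample from the question's comments, written without the spaces, RemoveSingleLineBreaks(@"h\ne\nl\nl\no\n\n\nw\no\nr\nl\nd") should give @"hello\n\n\nworld". To put a space where a single break was removed, append @" " in the branch where runLength is 1.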

Daij-Djan
Amin Negm-Awad
1

You should use NSRegularExpression to do this:

NSString *pdfString = convertPDF(path);

// Replace each run of consecutive \n with a single \n
NSRegularExpression *regexN = [NSRegularExpression regularExpressionWithPattern:@"\n+" options:0 error:NULL];
pdfString = [regexN stringByReplacingMatchesInString:pdfString options:0 range:NSMakeRange(0, [pdfString length]) withTemplate:@"\n"];

// Replace each run of consecutive \r with a single \r
NSRegularExpression *regexR = [NSRegularExpression regularExpressionWithPattern:@"\r+" options:0 error:NULL];
pdfString = [regexR stringByReplacingMatchesInString:pdfString options:0 range:NSMakeRange(0, [pdfString length]) withTemplate:@"\r"];
Niko
  • I want to replace only single occurrences of \n. For example, h\n e\n l\n l\n o\n \n \n \n w\n o\n r\n l\n d should be shown as hello and world, but the number of line breaks between hello and world should stay the same (3 in this case): hello \n\n\n world – ankit_rck May 06 '15 at 13:19
  • Then modify the regular expression to look for only one instance of \n (ie, not more than one contiguous instance) so only those are replaced / removed. Examples abound. – Joshua Nozzi May 06 '15 at 16:14
0

Have you tried regex? You can match only the occurrences where a \n appears alone, without another \n next to it, then replace those occurrences with an empty string:

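NSError *error = nil;
// Lookbehind and lookahead assert that no \n sits on either side, so only the
// lone \n itself is matched and the neighbouring characters are left intact.
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(?<!\n)\n(?!\n)" options:0 error:&error];
NSString *modifiedString = [regex stringByReplacingMatchesInString:string options:0 range:NSMakeRange(0, [string length]) withTemplate:@""];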
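
To cover \r as well, the same idea can be wrapped in a small function. This is a sketch under stated assumptions rather than a definitive implementation: StripIsolatedLineBreaks is a hypothetical name, the pattern relies on the lookbehind and lookahead support of the ICU engine behind NSRegularExpression, and \r and \n are treated as separate line breaks, so a \r\n pair counts as a run of two and is kept.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: deletes every \r or \n that has no other line break
// directly before or after it. Runs of two or more are left untouched.
static NSString *StripIsolatedLineBreaks(NSString *input) {
    NSError *error = nil;
    // (?<![\r\n])  no line break immediately before
    // [\r\n]       the line break to delete
    // (?![\r\n])   no line break immediately after
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"(?<![\\r\\n])[\\r\\n](?![\\r\\n])"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        // The pattern failed to compile; return the input unchanged.
        NSLog(@"Invalid pattern: %@", error);
        return input;
    }
    return [regex stringByReplacingMatchesInString:input
                                           options:0
                                             range:NSMakeRange(0, input.length)
                                      withTemplate:@""];
}
```

With the sample from the comments, written without the spaces, StripIsolatedLineBreaks(@"h\ne\nl\nl\no\n\n\nw\no\nr\nl\nd") should return @"hello\n\n\nworld". Passing @" " as the template instead would join wrapped lines with a space.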
Aviel Gross