I need to process text files that were converted from PDF files. The files contain columns of data, with the columns separated by multiple spaces. To split each line into its columns, I use

$line=trim($line);
$line=preg_replace("/\s+/", "\t", $line);
$array=explode("\t", $line);

This works pretty well, except for one column, which contains names. The parts of a name are separated by single spaces. Some names have two parts (first and last), but others have more (e.g. John F. Doe), so the names get split across a varying number of array elements.

Is there any way that I can adjust my preg_replace command so only multiple spaces are translated into a single tab, and single spaces are left as single spaces?

user281681

2 Answers

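You could use preg_split with the regex \s{2,}, which matches two or more consecutive whitespace characters. This splits the line directly, without the intermediate tab replacement.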

$line = trim($line);
$array = preg_split('/\s{2,}/', $line);
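
As a minimal sketch of how this behaves, here is the same split applied to a made-up sample line (the values are hypothetical and not taken from the asker's files):

```php
<?php
// Hypothetical sample line: columns separated by two or more spaces,
// with a name column that contains single spaces.
$line = "  1001   John F. Doe    42.50  Paris  ";

$line = trim($line);
// Split only on runs of two or more whitespace characters;
// single spaces inside the name are left alone.
$array = preg_split('/\s{2,}/', $line);

print_r($array); // four elements: 1001, John F. Doe, 42.50, Paris
```

Note that \s also matches tabs and newlines. If only literal spaces should count as separators, a pattern such as '/ {2,}/' might be a closer fit.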
xdazz
  • +1 Nice idea the split. We must be having a mindmeld, I just mentioned `\s{2,}` to the poster of the other answer. :) – zx81 Jun 09 '14 at 03:39
  • Thanks, this does indeed seem like a better solution (when multiple spaces or other repeated characters are involved, this is far more elegant). – user281681 Jun 09 '14 at 19:14

/\s\s+/ matches a run of at least two whitespace characters, so single spaces are left untouched.

$line = preg_replace("/\s\s+/", "\t", $line);
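
For completeness, a sketch of this pattern dropped into the replace-then-explode approach from the question, again on a hypothetical sample line:

```php
<?php
// Hypothetical sample line, made up for illustration.
$line = "  1001   John F. Doe    42.50  ";

$line = trim($line);
// Collapse each run of two or more whitespace characters into one tab.
// The result must be assigned back, since preg_replace returns a new string.
$line = preg_replace('/\s\s+/', "\t", $line);
// Single spaces inside the name survive, so the name stays in one element.
$array = explode("\t", $line);
```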
Fabricator