22

$data contains tabs, leading spaces, and multiple spaces. I want to replace every tab with a space, collapse multiple spaces into a single space, and remove leading spaces.

In fact, something that would turn this input data:

[    asdf asdf     asdf           asdf   ] 

Into output data:

[asdf asdf asdf asdf]

How do I do this?

Giacomo1968
  • 25,759
  • 11
  • 71
  • 103
Arthur
  • 3,376
  • 11
  • 43
  • 70

9 Answers

23

Trim, replace tabs and extra spaces with single spaces:

$data = preg_replace('/[ ]{2,}|[\t]/', ' ', trim($data));
ahc
  • 355
  • 2
  • 4
  • First, this is a clunky pattern. Second, as you wrote it, it doesn't rescan, and so it will produce multiple spaces for a run of `tab` `space` `tab` `space` `tab` `space`. Did you test this? – Jeff Oct 03 '17 at 21:57
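
The failure mode described in the comment above can be reproduced with a short sketch. The input string is made up for illustration, and the second pattern, `[ \t]+`, is one possible alternative rather than part of the original answer.

```php
<?php
// Hypothetical input: tab, space, tab, space, tab, space between two letters.
$data = "a\t \t \t b";

// Pattern from the answer: each tab becomes a space, and the single
// spaces between the tabs never form a run of two, so they are kept.
$first = preg_replace('/[ ]{2,}|[\t]/', ' ', trim($data));

// Alternative (assumption): treat any run of spaces and tabs as one unit.
$second = preg_replace('/[ \t]+/', ' ', trim($data));

var_dump($first);  // "a" and "b" separated by six spaces
var_dump($second); // "a b"
```
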
15
$data = trim(preg_replace('/\s+/g', '', $data));
RaYell
  • 69,610
  • 20
  • 126
  • 152
  • You also forgot to mention trim to get rid of leading spaces. Probably want to mention ltrim too, since he asked for leading spaces then illustrated both ends. – Matthew Scharley Jul 25 '09 at 08:23
  • Yeah, thanks for pointing that out. The example shows that both leading and trailing spaces should be removed, so I updated my code. – RaYell Jul 25 '09 at 08:27
  • helpful. [preg_replace](http://php.net/manual/en/function.preg-replace.php) defaults to replace all occurrences of the pattern, unless you specify the limit parameter. – JustinP Aug 29 '12 at 14:52
  • 12
    The "g" modifier doesn't seem to work. According to http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php, no "g" is needed to repeat the match, because PHP treats the whole text as a single line, even if it contains newlines. – Guillaume Bois Oct 01 '12 at 18:22
  • Typo - you are replacing spaces with '', but you wanted ' '. – Denis Pshenov Dec 11 '14 at 08:00
  • 1
    minus one for a replacement string that removes ALL whitespace, even the single space that the OP wants between his "words" – Jeff Oct 03 '17 at 21:38
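
Taking the comments above together (no `g` modifier, and a single space as the replacement), a corrected version of the line might look like the sketch below. The sample input is borrowed from the question, with one tab added for illustration.

```php
<?php
// Sample input from the question, plus a tab (an addition for illustration).
$data = "    asdf asdf     asdf \t         asdf   ";

// No "g" modifier: preg_replace() already replaces every match by default.
// The replacement is a single space, not an empty string.
$data = trim(preg_replace('/\s+/', ' ', $data));

echo "[$data]"; // prints "[asdf asdf asdf asdf]"
```
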
4
$data = trim($data);

That gets rid of your leading (and trailing) spaces.

$pattern = '/\s+/';
$data = preg_replace($pattern, ' ', $data);

That turns any run of one or more whitespace characters (spaces, tabs, newlines) into a single space.

$data = str_replace("\t", " ", $data);

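That replaces each tab with a space. Note that the \s+ pattern above already matches tabs, so this step is only needed if you use a pattern that matches spaces alone.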

Gabriel Hurley
  • 39,690
  • 13
  • 62
  • 88
4

Assuming the square brackets aren't part of the string and you're just using them for illustrative purposes, then:

$new_string = trim(preg_replace('!\s+!', ' ', $old_string));

You might be able to do that with a single regex but it'll be a fairly complicated regex. The above is much more straightforward.

Note: I'm also assuming you don't want to replace "AB\t\tCD" (\t is a tab) with "AB CD".

cletus
  • 616,129
  • 168
  • 910
  • 942
2
$new_data = preg_replace("/[\t\s]+/", " ", trim($data));
slosd
  • 3,224
  • 2
  • 21
  • 17
0

This answer takes the question completely literally: it is only concerned with spaces and tabs. Granted, the OP probably also wants to include other kinds of whitespace in what gets trimmed/compressed, but let's pretend he wants to preserve embedded CR and/or LF.

First, let's set up some constants. This will allow for both ease of understanding and maintainability, should modifications become necessary. I put in some extra spaces so that you can compare the similarities and differences more easily.

define( 'S', '[ \t]+'      ); # Stuff you want to compress; in this case ONLY spaces/tabs
define( 'L', '/\A'.S.'/'   ); # stuff on the Left edge will be trimmed
define( 'M',   '/'.S.'/'   ); # stuff in the Middle will be compressed
define( 'R',   '/'.S.'\Z/' ); # stuff on the Right edge will be trimmed
define( 'T', ' '           ); # what we want the stuff compressed To

We are using \A and \Z escape characters to specify the beginning and end of the subject, instead of the typical ^ and $ which are line-oriented meta-characters. This is not so much because they are needed in this instance as much as "defensive" programming, should the value of S change to make them needed in the future.

Now for the secret sauce: we are going to take advantage of some special semantics of preg_replace, namely (emphasis added)

If there are fewer elements in the replacement array than in the pattern array, any extra patterns will be replaced by an empty string.

function trim_press( $data ){
    return preg_replace( [ M, L, R ], [ T ], $data );
}

So instead of a pattern string and replacement string, we are using a pattern array and replacement array, which results in the extra patterns L and R being trimmed.
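
To see the array form in action, here is a self-contained sketch that inlines the constants above as literal patterns. The sample input, including its embedded newline, is invented for illustration.

```php
<?php
function trim_press($data) {
    return preg_replace(
        ['/[ \t]+/', '/\A[ \t]+/', '/[ \t]+\Z/'], // middle, left edge, right edge
        [' '],                                    // only the first pattern has a replacement
        $data
    );
}

// Hypothetical input: padded with spaces and tabs, with a newline in the middle.
$input = " \t one \t two\nthree   four\t ";
$output = trim_press($input);

var_dump($output === "one two\nthree four"); // bool(true): the newline survives
```

The patterns are applied in array order, so by the time the edge patterns run, each edge holds at most one space. Spaces or tabs directly next to an embedded newline are compressed to a single space rather than removed.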

Jeff
  • 2,095
  • 25
  • 18
0

In case you need to remove &nbsp; too.

$data = trim(preg_replace('/(?:\s|&nbsp;)+/', ' ', $data));
Ivo Pereira
  • 3,410
  • 1
  • 19
  • 24
0

After much frustration I found this to be the best solution, as it also removes non-breaking spaces, which can be two bytes long (in UTF-8):

$data = html_entity_decode(str_replace('&nbsp;', ' ', htmlentities($data)));
$data = trim(preg_replace('/\h+/', ' ', $data)); // \h replaces more space character types than \s; + collapses runs

See billynoah
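
A different route to the same goal, shown here only as a sketch and not as this answer's method, is to match the non-breaking space directly with the `u` (UTF-8) pattern modifier. The input string is hypothetical, and the `"\u{00A0}"` escape assumes PHP 7 or later.

```php
<?php
// Hypothetical input containing UTF-8 non-breaking spaces (U+00A0).
$data = "\u{00A0} asdf\u{00A0}\u{00A0}asdf \t asdf\u{00A0}";

// With the /u modifier, pattern and subject are treated as UTF-8,
// so \x{00A0} matches the two-byte non-breaking space as one character.
$data = preg_replace('/[\s\x{00A0}]+/u', ' ', $data);
$data = trim($data);

echo "[$data]"; // prints "[asdf asdf asdf]"
```
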

will
  • 153
  • 3
  • 6
-1

Just use this regex

$str = trim(preg_replace('/\s\s+/', ' ', $str));

It replaces every run of two or more whitespace characters (spaces or tabs) with a single space.

Here the + sign in the regex means "one or more times", so the pattern \s\s+ matches wherever there are two or more whitespace characters and replaces them with one space.

dav
  • 8,931
  • 15
  • 76
  • 140
  • 2
    minus one for a pattern that fails to match a solitary tab (and for not testing your solution before posting). – Jeff Oct 03 '17 at 20:17
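
The difference the comment points out can be checked with a small sketch. The input is a made-up string with a single tab between two words.

```php
<?php
// Hypothetical input: a single tab between two words.
$str = "asdf\tasdf";

// \s\s+ needs at least two whitespace characters, so the lone tab is left alone.
$two_or_more = trim(preg_replace('/\s\s+/', ' ', $str));

// \s+ matches one or more, so the tab becomes a space.
$one_or_more = trim(preg_replace('/\s+/', ' ', $str));

var_dump($two_or_more === "asdf\tasdf"); // bool(true)
var_dump($one_or_more === "asdf asdf");  // bool(true)
```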