EDIT: I was taking this too literally. A commenter correctly pointed out that the + in the query string is just the URL encoding for a space. Sorry for the confusion.

I need to split a string on a couple of delimiters (, and +). The background is that I'm saving a query string parameter into $query (like so: $query = $_GET["geo"];), and I want to break it into parts on , and + (but not on space, because towns and states can have multiple words):

?geo=Cambridge+Massachusetts or

?geo=Cambridge,Massachusetts

Reading here I'm trying it like so:

$query_array = preg_split("/[+,]+/", $query, -1, PREG_SPLIT_NO_EMPTY);

It splits on , but not on +.

Do I need to escape it? Or is there fundamentally a different way I should be doing this?

Thanks in advance for any help!

Diane Kaplan
  • `/[\+,]+/` yes, escape it. Just like this, by adding a preceding `\\` – Martin Apr 10 '16 at 14:19
  • @Martin The backslash also needs escaping right? So `/[\\+,]+/`? – Chris Apr 10 '16 at 14:20
  • @Chris no. Running the code through regex101.com gives the correct results without escaping the backslash. OP is searching for `+` not `\+` . – Martin Apr 10 '16 at 14:20
  • [norepro](https://3v4l.org/oXG5t) => Show your relevant real code – Rizier123 Apr 10 '16 at 14:20
  • what is your input string ? also your `preg_split` will work with `+s` not with `+'s` because of `'` – Alive to die - Anant Apr 10 '16 at 14:20
  • https://eval.in/550571 It's working. – B.Kocaman Apr 10 '16 at 14:22
  • @Martin I could be wrong, but won't PHP interpret the backslash when it processes the string so you need two backslashes so one gets passed to the regex engine? In any case, the original code without any escaping at all seems to work fine for me. – Chris Apr 10 '16 at 14:22
  • @Chris if the original code works fine, then having 2 backslashes will also work fine, because it's double escaping. Therefore if the original code does **not** work fine (thus causing the question) then having 2 backslashes will not resolve this. As I say, I test my PHP PCRE on https://regex101.com and that works as expected with a single backslash preceding the first `+` – Martin Apr 10 '16 at 14:24
  • @Martin Right, I can believe your code also works fine, since `\+` isn't actually an escape sequence for anything in PHP. Sorry about the unnecessary correction though. Anyhow, it works fine for me, both in its original form and with one or two backslashes so OP is clearly not sharing their real code. – Chris Apr 10 '16 at 14:27
  • @Chris I actually found likewise that the original code works without an issue too.... but the backslash declares that the + can't be a special character. I think the OPs issue may be something slightly different. – Martin Apr 10 '16 at 14:32
  • Diane Can you please show what value of `$query` you are using and what output is being generated? – Martin Apr 10 '16 at 14:32
  • Please post a clear sample of the input and desired output. – Pedro Lobito Apr 10 '16 at 14:45
  • wow you guys are fast! Sorry for the confusion- I posted a clarification now about what my expected input will be, and my goal to split this into pieces – Diane Kaplan Apr 10 '16 at 14:52
  • http://php.net/manual/en/function.parse-url.php – strangeqargo Apr 10 '16 at 14:55
  • Can you please post what your value of `$query` actually is. have you done any [urldecoding](http://php.net/manual/en/function.urldecode.php)? You're currently showing just what the `$_GET` value is and that is not the same thing! – Martin Apr 10 '16 at 18:09

3 Answers

Your input doesn't contain a +; it contains a space. URL-decoding your geo parameter gives Cambridge Massachusetts, so split on space and , instead of + and ,.
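
To make the decoding step concrete, here is a minimal sketch. It assumes parse_str() is an acceptable stand-in for the way PHP populates $_GET (both URL-decode the value), and the variable names $params, $params2 and $comma_array are only illustrative:

```php
<?php
// Simulate how PHP fills $_GET from the raw query string; parse_str()
// URL-decodes the value, so "+" arrives as a space.
parse_str('geo=Cambridge+Massachusetts', $params);
$query = $params['geo'];            // "Cambridge Massachusetts"

// Split on runs of spaces and/or commas, dropping empty pieces.
$query_array = preg_split('/[ ,]+/', $query, -1, PREG_SPLIT_NO_EMPTY);
print_r($query_array);

// The comma form is not changed by decoding and splits the same way.
parse_str('geo=Cambridge,Massachusetts', $params2);
$comma_array = preg_split('/[ ,]+/', $params2['geo'], -1, PREG_SPLIT_NO_EMPTY);
print_r($comma_array);
```

Note that splitting on space will also break up multi-word towns and states, which the question wanted to avoid, so it may be simpler to accept only the comma as a delimiter.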

Chris
  • ugh you're right- I missed that.... the + in the querystring is representing where a space would be- (so URL decoding makes a space)- I was taking it too literally :( – Diane Kaplan Apr 10 '16 at 18:01

PHP/PCRE is usually happy if you do not escape special characters inside the square brackets of a character class [...]. If you are sure you're having problems, it may depend on how your exact PHP version handles PCRE special characters.

But I find that adding a backslash before the + means that it is always unarguably taken as a literal value rather than any sort of special character.

$query_array = preg_split("/[\+,]+/", $query, -1, PREG_SPLIT_NO_EMPTY);

Regex101 for this.
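
As a quick check of the point above, this sketch runs both patterns against a string that contains literal + characters and compares the results. The sample string is the one used further down in this answer; the variable names $plain and $escaped are only illustrative:

```php
<?php
$input = 'che,ese+tree,s+rh,tht+';   // contains literal "+" characters

// Unescaped "+" inside the character class.
$plain = preg_split('/[+,]+/', $input, -1, PREG_SPLIT_NO_EMPTY);

// Escaped "+" inside the character class.
$escaped = preg_split('/[\+,]+/', $input, -1, PREG_SPLIT_NO_EMPTY);

// Both patterns split the string into the same pieces.
var_dump($plain === $escaped);
```

If the two arrays match, escaping is a matter of readability rather than correctness here.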

However, according to regex101.com your original code works as it should. On regex101 you may need to append the g flag to the regex to see every match; that flag belongs to the regex101 tester, not to the pattern you pass to the PHP function, because preg_split already splits on every match and PHP's PCRE has no g modifier.

PHP may not be recognising your limit value, so you could replace your -1 with NULL to rule out any possible confusion.

$query_array = preg_split("/[+,]+/", $query, NULL, PREG_SPLIT_NO_EMPTY);

should work perfectly for you.

$query = "che,ese+tree,s+rh,tht+";
$query_array = preg_split("/[+,]+/", $query, NULL, PREG_SPLIT_NO_EMPTY);
print_r($query_array);

Outputs:

Array
(
    [0] => che
    [1] => ese
    [2] => tree
    [3] => s
    [4] => rh
    [5] => tht
)
Martin
  • thank you! this is my first adventure with regex and your post is very helpful. Though the plot thickens- as the new comment mentions the + is really just a url-encoded space (I'd been taking it too literally and didn't realize), so I need to approach this differently all together – Diane Kaplan Apr 10 '16 at 18:03

If you know that the geo parameter can arrive with either "+" or "," as the delimiter, don't use a regexp; use:

 explode(',', str_replace('+', ',', $geo))
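
A runnable sketch of the same idea follows. The wrapper name split_geo() is hypothetical and not part of the original answer, and the input is assumed to still contain a literal + (that is, it has not been URL-decoded):

```php
<?php
// split_geo() is a hypothetical helper name used only for this sketch.
function split_geo(string $geo): array
{
    // Turn every "+" into "," so a single explode() handles both delimiters.
    return explode(',', str_replace('+', ',', $geo));
}

$parts = split_geo('Cambridge+Massachusetts');
print_r($parts);

// Unlike preg_split() with PREG_SPLIT_NO_EMPTY, explode() keeps empty pieces.
$with_gap = split_geo('Cambridge,,Massachusetts');
print_r($with_gap);
```

If empty pieces are unwanted, they would need to be filtered out separately, for example with array_filter().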
Martin
strangeqargo