45

I have the following data:

1||1||Abdul-Jabbar||Karim||1996||1974

I want to split it into tokens.

Here the delimiter is "||".

My delimiter setter is:

public void setDelimiter(String delimiter) {
    char[] c = delimiter.toCharArray();
    this.delimiter = "\"" + "\\" + c[0] + "\\" + c[1] + "\"";
    System.out.println("Delimiter string is: " + this.delimiter);
}

However,

String[] tokens = line.split(delimiter);

is not giving the required result.

Alex Kulinkovich
Vicky
  • You can split on a set of characters by passing a character class. For example, with string1 = today +1 and string2 = today -1, calling string1.split("[+-]") or string2.split("[+-]") gives today, 1. – Raj Asapu Apr 05 '17 at 19:23

10 Answers

64

There is no need to set the delimiter by breaking it up in pieces like you have done.

Here is a complete program you can compile and run:

import java.util.Arrays;
public class SplitExample {
    public static final String PLAYER = "1||1||Abdul-Jabbar||Karim||1996||1974";
    public static void main(String[] args) {
        String[] data = PLAYER.split("\\|\\|");
        System.out.println(Arrays.toString(data));
    }
}

If you want to use split with a pattern, you can use Pattern.compile or Pattern.quote.

To see compile and quote in action, here is an example using all three approaches:

import java.util.Arrays;
import java.util.regex.Pattern;
public class SplitExample {
    public static final String PLAYER = "1||1||Abdul-Jabbar||Karim||1996||1974";
    public static void main(String[] args) {
        String[] data = PLAYER.split("\\|\\|");
        System.out.println(Arrays.toString(data));

        Pattern pattern = Pattern.compile("\\|\\|");
        data = pattern.split(PLAYER);
        System.out.println(Arrays.toString(data));

        pattern = Pattern.compile(Pattern.quote("||"));
        data = pattern.split(PLAYER);
        System.out.println(Arrays.toString(data));
    }
}

The use of patterns is recommended if you are going to split often using the same pattern. BTW the output is:

[1, 1, Abdul-Jabbar, Karim, 1996, 1974]
[1, 1, Abdul-Jabbar, Karim, 1996, 1974]
[1, 1, Abdul-Jabbar, Karim, 1996, 1974]
Ray Toal
  • Sanjay's answer without quotes worked.. thanks for the help!! – Vicky Aug 11 '11 at 05:22
  • Yes, `Pattern.quote` is proper, but like the plain string with the backslashes it is still inefficient if you are going to split many times. In this case compiling the pattern is more efficient. Just FYI I updated my answer to show how it is done with a compiled regex, with and without quoting. – Ray Toal Aug 11 '11 at 05:32
  • My delimiter string is "\\|\\|" (quotes excluded, i.e. 6 characters). But Pattern.compile(delimiter) is not working as expected. However, Pattern.compile("\\|\\|") works as presented in your code. What if I want to pass in the delimiter string externally? Pattern.compile("\\|\\|") is hard coding, right? – Vicky Aug 11 '11 at 06:17
  • String delimiter = "||"; and then Pattern pattern = Pattern.compile(Pattern.quote(delimiter)); String[] tokens = pattern.split(line); worked.. thanks!! – Vicky Aug 11 '11 at 06:21
  • Good to hear! You nailed it. The clean regex goes in your variable, you pass that to `quote` then `compile` it. Now you can efficiently use your pattern for splitting and replacing and matching, etc. Glad it works. – Ray Toal Aug 11 '11 at 06:33
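
The approach settled on in the comments above can be sketched as a replacement for the original setter. This is only a minimal sketch: the class name `LineSplitter` and its `split` method are hypothetical and not taken from the question. It assumes the caller passes the raw delimiter text (such as `||`), not a regex.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical holder class; the name and the split method are assumptions.
class LineSplitter {
    private Pattern delimiterPattern;

    // Takes the raw delimiter text, e.g. "||", with no escaping by the caller.
    public void setDelimiter(String delimiter) {
        // Pattern.quote makes every character literal; compiling once lets
        // the same pattern be reused for every line.
        this.delimiterPattern = Pattern.compile(Pattern.quote(delimiter));
    }

    // Must be called after setDelimiter, otherwise the pattern is still null.
    public String[] split(String line) {
        return delimiterPattern.split(line);
    }

    public static void main(String[] args) {
        LineSplitter splitter = new LineSplitter();
        splitter.setDelimiter("||");
        String[] tokens = splitter.split("1||1||Abdul-Jabbar||Karim||1996||1974");
        System.out.println(Arrays.toString(tokens));
    }
}
```

Because the quoting happens inside the setter, the delimiter can come from a config file or command-line argument unchanged.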
35

Use the Pattern#quote() method for escaping ||. Try:

final String[] tokens = myString.split(Pattern.quote("||"));

This is required because | is the regex alternation operator, so it has a special meaning when passed to split (the argument to split is a regular expression in string form).
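
To illustrate the point about alternation, here is a small sketch comparing the unescaped and the quoted delimiter. The class and method names are made up for the illustration. An unescaped `||` is a regex consisting only of empty alternatives, so it matches the empty string and the input is cut between characters instead of between fields.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
class AlternationDemo {
    // Unescaped: "||" is parsed as empty alternatives, which match the empty string.
    static String[] splitUnescaped(String s) {
        return s.split("||");
    }

    // Quoted: both pipes are matched as literal characters.
    static String[] splitQuoted(String s) {
        return s.split(Pattern.quote("||"));
    }

    public static void main(String[] args) {
        String line = "1||1||Abdul-Jabbar";
        // Splits between individual characters, not between fields.
        System.out.println(Arrays.toString(splitUnescaped(line)));
        // Splits into the three fields.
        System.out.println(Arrays.toString(splitQuoted(line)));
    }
}
```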

asherbret
Sanjay T. Sharma
  • will final String[] tokens = myString.split(Pattern.quote(delimiter)); work ? – Vicky Aug 11 '11 at 05:15
  • I think he knows that `|` has special meaning; that's the whole point of the misguided escaping he's doing. The problem is in the double quotes he's surrounding the delimiter with. +1 for mentioning `Pattern.quote()`. – Mark Peters Aug 11 '11 at 05:16
  • nopes.. this did not work.. token[0] is taking the complete string... I passed delimiter as "||" how you mentioned. – Vicky Aug 11 '11 at 05:16
  • @Nikunj: You're probably doing something wrong. Are you still trying to put in double quotes? Sanjay's solution should work fine. – Mark Peters Aug 11 '11 at 05:17
  • @Nikunj: It should work. Update your question with more details if it doesn't. – Sanjay T. Sharma Aug 11 '11 at 05:20
  • @Mark: I guess I must have missed the escaping part. It looked more like some satanic ritual to appease the Gods. ;-) – Sanjay T. Sharma Aug 11 '11 at 05:21
  • @Nikunj: Given that you have recently joined SO, it would be nice if you start accepting answers which help you out. As a start, accept any one of the answers on this thread by clicking on the "check mark" next to the answer. – Sanjay T. Sharma Aug 11 '11 at 05:23
8

Double quotes are ordinary literal characters in a regex, not special ones. With the quotes you add, the pattern only matches the text "||" including the quote characters.

Just use Pattern.quote(delimiter). As requested, here's a line of code (same as Sanjay's):

final String[] tokens = line.split(Pattern.quote(delimiter));

If that doesn't work, you're not passing in the correct delimiter.

Mark Peters
6
String[] strArray= str.split(Pattern.quote("||"));

where

  1. str = "1||1||Abdul-Jabbar||Karim||1996||1974";
  2. Pattern.quote("||") returns a pattern string in which both pipes are treated as literal characters.
  3. split then splits the string at every occurrence of ||.
  4. strArray holds the substrings that were separated by ||.
Yesh
5

The pipe (|) is a special character in regex. To escape it, you prefix it with a backslash (\). But in a Java string literal the backslash is itself an escape character, so you have to escape it with another backslash. So your regex, written as a Java literal, should be "\\|\\|", e.g. String[] tokens = myString.split("\\|\\|");
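
The two layers of escaping can be checked directly. The sketch below (class and constant names are made up for the illustration) shows that the Java literal "\\|\\|" produces a four-character string, which is what the regex engine actually sees. If the delimiter is read from outside the source code (a file or a command-line argument), no Java escaping applies, so the text supplied there would be the four characters \|\| and not six.

```java
// Hypothetical demo class, for illustration only.
class EscapeLayersDemo {
    // Java source literal; the compiler turns each \\ into one backslash.
    static final String ESCAPED_PIPES = "\\|\\|";

    public static void main(String[] args) {
        // The regex engine receives the characters \|\|
        System.out.println(ESCAPED_PIPES);
        // Four characters at runtime: backslash, pipe, backslash, pipe.
        System.out.println(ESCAPED_PIPES.length());
        // Number of fields when splitting a two-field line.
        System.out.println("1||2".split(ESCAPED_PIPES).length);
    }
}
```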

Nirmit Shah
2

Split uses regex, and the pipe char | has special meaning in regex, so you need to escape it. There are a few ways to do this, but here's the simplest:

String[] tokens = line.split("\\|\\|");
Bohemian
0
String[] splitArray = subjectString.split("\\|\\|");

Or wrap it in a function:

public String[] stringSplit(String string){

    String[] splitArray = string.split("\\|\\|");
    return splitArray;
}
Pedro Lobito
0
StringTokenizer st = new StringTokenizer("1||1||Abdul-Jabbar||Karim||1996||1974",
             "||");
while(st.hasMoreTokens()){
     System.out.println(st.nextElement());
}

This prints each token on its own line:

1
1
Abdul-Jabbar
Karim
1996
1974
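
One caveat: StringTokenizer treats its second argument as a set of delimiter characters, not as a multi-character delimiter, and it never returns empty tokens. So "||" behaves the same as "|" here. The following sketch (class and method names are made up for the illustration) contrasts this with split on an input containing a single pipe inside a field and an empty field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
class TokenizerVsSplit {
    // Collects all tokens produced by StringTokenizer for the given delimiter set.
    static List<String> tokenize(String s, String delimiterChars) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(s, delimiterChars);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    // Splits on the delimiter as one literal sequence.
    static List<String> splitLiteral(String s, String delimiter) {
        return Arrays.asList(s.split(Pattern.quote(delimiter)));
    }

    public static void main(String[] args) {
        String line = "a|b||||c"; // fields: "a|b", "" (empty), "c"
        System.out.println(tokenize(line, "||"));     // [a, b, c]
        System.out.println(splitLiteral(line, "||")); // [a|b, , c]
    }
}
```

For the data in the question both approaches give the same six tokens. They differ only when a field contains a single | or is empty.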

Bond - Java Bond
0

There is something wrong in your setDelimiter() function. You don't want to double quote the delimiters, do you?

public void setDelimiter(String delimiter) {
    char[] c = delimiter.toCharArray();
    this.delimiter = "\\" + c[0] + "\\" + c[1];
    System.out.println("Delimiter string is: " + this.delimiter);
}

However, as other users have said, it's better to use the Pattern.quote() method to escape your delimiter if your requirements permit.

shinkou
0

The problem is that you are adding quotes to your delimiter. Remove them and it will work fine.

public void setDelimiter(String delimiter) {
    char[] c = delimiter.toCharArray();
    this.delimiter = "\\" + c[0] + "\\" + c[1];
    System.out.println("Delimiter string is: " + this.delimiter);
}
Raze