StringTokenizer returns wrong tokens

Question

I want to "parse" a given string into vectors. A vector starts with "[" and ends with "]". The values of a vector, as well as the vectors themselves, are separated by ",". If I use integers as values, my code works fine: "[1,2,3],[5,2,3],[1,6,3]". But when I mix integer values with double values, as in "[1,2.5,3],[5,2,3],[1,6,3]", StringTokenizer returns wrong values (in this case "1" and "2.5", but then "3]" ...).

String s = "[1,2.5,3],[5,2,3],[1,6,3]";

Vector<Vector<Double>> matrix = new Vector<Vector<Double>>();        
for(int j=0;j<s.length();j++) {
   if (s.charAt(j)=='[') {
      int k=s.indexOf("]");      
      StringTokenizer st = new StringTokenizer(s.substring(j+1, j+k));// j+k-1 does not work either
      Vector<Double> vector = new Vector<Double>();
      while (st.hasMoreTokens()) {
          vector.add(Double.parseDouble(st.nextToken(",")));//Exception in thread     "main" java.lang.NumberFormatException: For input string: "3]"
      }
      matrix.add(vector);
   }
}

verdesmarald · Accepted Answer · 2012-09-23T14:37:09.097

if (s.charAt(j)=='[') {
    int k=s.indexOf("]");

This finds the index of the first occurrence of ] in s, starting from the beginning of the string. What you really want is the first occurrence after the start of the current vector:

if (s.charAt(j)=='[') {
    int k = s.indexOf("]", j);

The reason it works when you just have 2 instead of 2.5 is that every vector then happens to have the same number of characters, so using the first occurrence of ] to calculate the length of each vector worked by luck.

Note, you will also have to change the end index of your substring call to k:

StringTokenizer st = new StringTokenizer(s.substring(j+1, k));
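
To see the difference concretely, here is a small sketch that slices the second vector of the question's input both ways. The class name `IndexOfDemo` and the two helper methods are made up for illustration; only `indexOf` and `substring` come from the code above.

```java
final class IndexOfDemo {

    // Slicing as in the question: k is the first ']' in the whole string,
    // so j + k is only the right end index if all vectors have equal length.
    static String sliceFromStart(String s, int j) {
        int k = s.indexOf("]");
        return s.substring(j + 1, j + k);
    }

    // Slicing as in the fix: k is the first ']' at or after position j.
    static String sliceFromBracket(String s, int j) {
        int k = s.indexOf("]", j);
        return s.substring(j + 1, k);
    }

    public static void main(String[] args) {
        String s = "[1,2.5,3],[5,2,3],[1,6,3]";
        int j = 10; // index of the second '['
        System.out.println(sliceFromStart(s, j));   // prints 5,2,3],
        System.out.println(sliceFromBracket(s, j)); // prints 5,2,3
    }
}
```

The stray "]" in the first result is what ends up in the token "3]" and triggers the NumberFormatException.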

As a side note, use of StringTokenizer is not recommended. In this case you should be using split() instead:

String[] elements = s.substring(j+1, k).split(",");
Vector<Double> vector = new Vector<Double>();
for (String element : elements) {
    vector.add(Double.parseDouble(element));
}
matrix.add(vector);
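
Putting the pieces together, a complete version might look like the sketch below. It is one possible arrangement, not the only way to do it: the class name `VectorParser` and the method `parse` are hypothetical, it uses `ArrayList` where the question uses `Vector`, and it jumps from bracket to bracket with `indexOf` instead of testing every character.

```java
import java.util.ArrayList;
import java.util.List;

final class VectorParser {

    // Parses text such as "[1,2.5,3],[5,2,3]" into a list of rows.
    static List<List<Double>> parse(String s) {
        List<List<Double>> matrix = new ArrayList<>();
        int open = s.indexOf('[');
        while (open >= 0) {
            // Look for the closing bracket starting at the current '['.
            int close = s.indexOf(']', open);
            if (close < 0) {
                throw new IllegalArgumentException("Missing ']' after index " + open);
            }
            List<Double> row = new ArrayList<>();
            String body = s.substring(open + 1, close).trim();
            if (!body.isEmpty()) {
                for (String element : body.split(",")) {
                    row.add(Double.parseDouble(element.trim()));
                }
            }
            matrix.add(row);
            // Continue with the next '[' after the bracket just closed.
            open = s.indexOf('[', close);
        }
        return matrix;
    }

    public static void main(String[] args) {
        System.out.println(parse("[1,2.5,3],[5,2,3],[1,6,3]"));
        // prints [[1.0, 2.5, 3.0], [5.0, 2.0, 3.0], [1.0, 6.0, 3.0]]
    }
}
```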

int k=s.indexOf("]",j); // needed to add j
StringTokenizer st = new StringTokenizer(s.substring(j+1, k)); // needed to remove +j
– Drilon Berisha, Sep 23 '12 at 14:34
Yep, was just updating my answer to say that when you posted. :) – verdesmarald, Sep 23 '12 at 14:37

score 1 · Answer 2 · answered Sep 23 '12 at 14:36

Vector and StringTokenizer ? :) cute!

    while (s.contains("[")) {
        // Text between the first '[' and the first ']' still left in s
        String s1 = s.substring(s.indexOf("[") + 1, s.indexOf("]"));
        if (!s1.isEmpty()) {
            String[] sArr = s1.split(",");
            for (String string : sArr) {
                Double d = Double.valueOf(string);
                System.out.println(d);
                // put it where you need
            }
        }
        // Drop everything up to and including the ']' just processed
        s = s.substring(s.indexOf("]") + 1);
    }

aviad

Should I use split instead of StringTokenizer only because it's faster, or is there another reason? It's not like I have to "parse" 100k-char-long strings. – Drilon Berisha Sep 23 '12 at 14:51
If you look at String.split() and compare it to StringTokenizer, the relevant difference is that String.split() uses a regular expression, whereas StringTokenizer just uses verbatim split characters. So if I wanted to tokenize a string with more complex logic than single characters (e.g. split on \r\n), I can't use StringTokenizer but I can use String.split() – aviad Sep 23 '12 at 15:00
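
A quick sketch of that difference, using the \r\n case from the comment above. The class name `SplitVsTokenizer` and its helper methods are made up for illustration; the sample text is an assumption chosen so that a lone \r appears in the middle.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

final class SplitVsTokenizer {

    // StringTokenizer treats each character of the delimiter string
    // as a separate delimiter.
    static List<String> withTokenizer(String text, String delimiterChars) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, delimiterChars);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    // String.split interprets its argument as a regular expression,
    // so "\r\n" only matches the two-character sequence.
    static List<String> withSplit(String text, String regex) {
        return Arrays.asList(text.split(regex));
    }

    public static void main(String[] args) {
        String text = "a\rb\r\nc";
        System.out.println(withTokenizer(text, "\r\n").size()); // prints 3 (a, b, c)
        System.out.println(withSplit(text, "\r\n").size());     // prints 2 (a\rb, c)
    }
}
```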

score 0 · Answer 3 · answered Sep 23 '12 at 14:25

StringTokenizer is deprecated, you should use the string split()

meirrav

First of all, `StringTokenizer` is **not** deprecated, however *"It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead"*. Second, this doesn't actually answer the question, and should be a comment. – verdesmarald Sep 23 '12 at 14:27
