I'm using BufferedReader to pass individual lines of a file to Java's StringTokenizer. The file is structured as follows:

"2,0";"12345";"foo";"foo.doc"
"2,4";"23456";"foo";"foo.doc";"34567";"foo7";"foo7.doc";"45678";"foo6";"foo6.doc";"56789";"foo5";"foo5.doc";"67890";"foo4";"foo4.doc"   
"3,0";"34567";"foo7";"foo7.doc"
"3,0";"45678";"foo6";"foo6.doc"
"3,0";"56789";"foo5";"foo5.doc"
"3,0";"67890";"foo4";"foo4.doc"

Here's the code I'm using so far.

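import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.StringTokenizer;

public class parse {
  public static void main(String args[]) throws IOException {
    FileInputStream inputStream = new FileInputStream("whidata0.txt");
    BufferedReader br = new BufferedReader(new InputStreamReader(inputStream));
    String scrubbedInput;
    while ((scrubbedInput = br.readLine()) != null) {
      StringTokenizer strTok = new StringTokenizer(scrubbedInput, ";", false);
      int tokens = strTok.countTokens();
      String tok01 = null;
      while (strTok.hasMoreTokens()) {
        // Overwritten on every pass, so only the last token survives the loop.
        tok01 = strTok.nextToken();
      }
      System.out.println("  scrubbed: " + scrubbedInput);
      System.out.println("    tokens: " + tokens);
      System.out.println("     tok01: " + tok01);
    }
    br.close();
  }
}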

I need to be able to assign each token in a string to a variable to do additional manipulation. However, if I assign those variables in my while loop, each iteration overwrites them, and they all end up with the same value.

I'm trying to devise a way to do the following:

String token01 = strTok.tokenNumber(0);
String token02 = strTok.tokenNumber(1);
String token03 = strTok.tokenNumber(2);
String token04 = strTok.tokenNumber(3);
etc.

but cannot find any method in the StringTokenizer documentation that allows that. I can certainly write each line to a String array thisLineOfTokens[] and use a for loop to create String tokenN = thisLineOfTokens[n], but is there a more direct way to access specific tokens?

I'm kinda lost about the best way to reference a SPECIFIC token from my string.

dwwilson66

2 Answers

You can use String.split for that instead of a StringTokenizer.

String[] split = scrubbedInput.split(";");

String third = split[2]; // token at index 2 (the third field)
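
Applied to the sample lines from the question, this could look like the sketch below. It assumes the surrounding double quotes should be dropped and that no field contains a semicolon; the class name SplitFields and the helper parseLine are made up for illustration, and the variable names just follow the token01, token02 pattern from the question.

```java
public class SplitFields {
    // Hypothetical helper: splits one line on ';' and strips the double quotes.
    // Assumes no field contains ';' or an embedded quote that must be kept.
    static String[] parseLine(String line) {
        String[] fields = line.trim().split(";");
        for (int i = 0; i < fields.length; i++) {
            fields[i] = fields[i].replace("\"", "");
        }
        return fields;
    }

    public static void main(String[] args) {
        // First sample line from the question.
        String scrubbedInput = "\"2,0\";\"12345\";\"foo\";\"foo.doc\"";
        String[] split = parseLine(scrubbedInput);

        // Each token is now reachable by index, so nothing gets overwritten.
        String token01 = split[0];
        String token02 = split[1];
        String token03 = split[2];
        String token04 = split[3];

        System.out.println(token01 + " | " + token02 + " | " + token03 + " | " + token04);
        // prints: 2,0 | 12345 | foo | foo.doc
    }
}
```

Lines with more fields, like the second sample line, simply produce a longer array, so check split.length before indexing.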
Gangaraju
  • This answer is right. In fact, @dwwilson66, you are using StringTokenizer the wrong way. It acts as an iterator: you loop through it and take each value in turn. What you want is .split(), which lets you take out the value at a particular index. Also, the JavaDocs recommend using .split() instead of the tokenizer. – george_h Oct 07 '13 at 12:09
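
To illustrate the iterator behavior described in the comment, here is a minimal sketch that drains a StringTokenizer into a List so tokens can be read by index. The class name TokenizerToList, the helper tokens, and the input "a;b;;c" are made up for illustration. It also shows one difference worth knowing: StringTokenizer skips the empty field between two adjacent delimiters, while split keeps it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerToList {
    // Hypothetical helper: collects every token so it can be accessed by index.
    static List<String> tokens(String line) {
        StringTokenizer strTok = new StringTokenizer(line, ";", false);
        List<String> result = new ArrayList<>();
        while (strTok.hasMoreTokens()) {
            result.add(strTok.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> t = tokens("a;b;;c");
        System.out.println(t.get(1));  // b
        System.out.println(t.size());  // 3, the empty field between ";;" is skipped
        System.out.println("a;b;;c".split(";").length);  // 4, split keeps it
    }
}
```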

From the StringTokenizer Javadoc:

It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.

So, you can use something like this:

String testLine = "your;test;data;";

String[] result = testLine.split(";");
for (int x=0; x<result.length; x++){
    System.out.println(result[x]);
}

Output:

your
test
data
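
Note that the output has three lines even though the test string ends with a delimiter: split(";") drops trailing empty strings. If a trailing empty field matters for your data, the two-argument overload with a negative limit keeps it. A minimal sketch, where the class name SplitTrailing and the helper splitKeepEmpty are made up for illustration:

```java
import java.util.Arrays;

public class SplitTrailing {
    // Hypothetical helper: a negative limit tells split to keep trailing empty strings.
    static String[] splitKeepEmpty(String line) {
        return line.split(";", -1);
    }

    public static void main(String[] args) {
        String testLine = "your;test;data;";

        String[] dropped = testLine.split(";");
        String[] kept = splitKeepEmpty(testLine);

        System.out.println(dropped.length);         // 3
        System.out.println(kept.length);            // 4
        System.out.println(Arrays.toString(kept));  // [your, test, data, ]
    }
}
```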
Max Gabderakhmanov