I'm going to start by posting what the data in the text file looks like. This is just 4 lines of it; the actual file is a couple hundred lines long.

Friday, September 9 2011
-STV 101--------05:00 - 23:59 SSB 4185 Report Printed on 9/08/2011 at 2:37

0-AH 104--------07:00 - 23:00 AH GYM Report Printed on 9/08/2011 at 2:37

-BG 105--------07:00 - 23:00 SH GREAT HALL Report Printed on 9/08/2011 at 2:37

What I want to do with this text file is ignore the first line with the date on it. On each following line, I want to skip the leading '-', read in "STV 101", "05:00" and "23:59" and save them to variables, then ignore all other characters on that line, and so on for each line after that.

Here is how I am currently reading each line in full. I call this function once the user has entered the path in the scheduleTxt JTextField. It reads and prints each line fine.

// Needs java.io.BufferedReader, java.io.FileReader and java.io.IOException
public void readFile()
{
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader br = new BufferedReader(new FileReader(scheduleTxt.getText())))
    {
        String strLine;

        while ((strLine = br.readLine()) != null)
        {
            System.out.println(strLine);
        }
    }
    catch (IOException e) // e.g. the file is missing or unreadable
    {
        System.err.println("Error: " + e.getMessage());
    }
}

UPDATE: It turns out I also need to strip "Friday" out of the top line and put it in a variable as well. Thanks! Beef.

Beef

1 Answer

I did not test it thoroughly, but this regular expression would capture the info you need in groups 2, 5 and 7 (assuming you're only interested in "AH 104" in the example of "0-AH 104----"): ^(\S)*-(([^-])*)(-)+((\S)+)\s-\s((\S)+)\s(.)*

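    // Assumes java.util.regex.Matcher and java.util.regex.Pattern are imported,
    // and that br and strLine come from the question's readFile().
    String regex = "^(\\S)*-(([^-])*)(-)+((\\S)+)\\s-\\s((\\S)+)\\s(.)*";
    Pattern pattern = Pattern.compile(regex);
    while ((strLine = br.readLine()) != null) {
        Matcher matcher = pattern.matcher(strLine);
        // Lines that don't fit the pattern (the date line, blank lines) are skipped
        if (matcher.find()) {
            String s1 = matcher.group(2); // e.g. "STV 101"
            String s2 = matcher.group(5); // start time, e.g. "05:00"
            String s3 = matcher.group(7); // end time, e.g. "23:59"
            System.out.println(s1 + " " + s2 + " " + s3);
        }
    }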

The expression could be tuned with non-capturing groups in order to capture only the information you want.
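As a rough sketch of that tuning (only checked against the sample lines from the question): since every helper group in the original expression wraps a single character class, the helper groups can simply be dropped, which has the same effect as marking them non-capturing with (?:...). The three wanted values then land in groups 1, 2 and 3. The class and method names below are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ScheduleLineParser {
    // Same structure as the original expression, but only the three wanted
    // pieces are captured, so they end up in groups 1, 2 and 3.
    private static final Pattern LINE = Pattern.compile(
            "^\\S*-([^-]*)-+(\\S+)\\s-\\s(\\S+)\\s.*");

    /** Returns {room, start, end}, or null if the line is not a schedule entry. */
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            return null; // e.g. the date line or a blank line
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = parse(
                "0-AH 104--------07:00 - 23:00 AH GYM Report Printed on 9/08/2011 at 2:37");
        System.out.println(parts[0] + " " + parts[1] + " " + parts[2]);
        // prints "AH 104 07:00 23:00"
    }
}
```

Returning null for non-matching lines keeps the calling loop simple: it can skip the date line and blank lines with one check.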

Explanation of the regexp's elements:

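  1. ^(\S)*- Matches a run of non-whitespace characters ending in -. Note: this will not work if there is whitespace before the first -. A greedy ^(.)*- is not a safe substitute, because it runs on to a later - and leaves group 2 empty; a reluctant ^(.)*?- stops at the first -.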
  2. (([^-])*) Matches a run of any characters except -. This is captured in group 2.
  3. (-)+ Matches one or more -.
  4. ((\S)+) Matches one or more non-whitespace characters. This is captured in group 5.
  5. \s-\s Matches whitespace, followed by -, followed by whitespace.
  6. ((\S)+) Same as 4. This is captured in group 7.
  7. \s(.)* Matches whitespace followed by anything, which will be skipped.
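To see which group ends up holding what, a small throwaway program along these lines could dump every group for one of the sample lines. This is only a sketch; GroupDump and groups are hypothetical names, and the sample line is taken from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDump {
    // The original expression, unchanged.
    static final Pattern LINE = Pattern.compile(
            "^(\\S)*-(([^-])*)(-)+((\\S)+)\\s-\\s((\\S)+)\\s(.)*");

    /** Returns groups 0..groupCount() for the line, or null when it does not match. */
    static String[] groups(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            return null;
        }
        String[] result = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            result[i] = m.group(i); // may be null if the group took no part in the match
        }
        return result;
    }

    public static void main(String[] args) {
        String sample =
                "-BG 105--------07:00 - 23:00 SH GREAT HALL Report Printed on 9/08/2011 at 2:37";
        String[] g = groups(sample);
        for (int i = 0; i < g.length; i++) {
            System.out.println("group " + i + ": " + g[i]);
        }
    }
}
```

Groups 2, 5 and 7 hold the whole values ("BG 105", "07:00" and "23:00" for this line), while the inner groups 3, 6 and 8 each hold only a single character. That is why the outer groups are the ones to read.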

More information on regular expressions can be found in this tutorial. There are also several useful cheat sheets around. When designing or debugging an expression, a regexp testing tool can prove quite useful, too.

Xavi López
  • Yes, in the case of "0-AH 104----" I only want the "AH 104". Thanks, I'll give it a try and see what I get! – Beef Sep 14 '11 at 16:54
  • Update: Worked great, tested it with more extensive versions of the text file and worked without a problem, thanks again – Beef Sep 14 '11 at 17:00
  • I've added an explanation of the expression's elements to the answer for further reference – Xavi López Sep 14 '11 at 18:39
  • @Beef If the file format is fixed, you could just look for `,` in the string before testing the expression, and keep the `strLine.split(",")[0]`. – Xavi López Sep 14 '11 at 18:41
  • I was just informed that I may have to work with a different format than this one. I am going to argue against it, but if I do have to use a different format, hopefully I can use your edit of the answer to adjust the regex accordingly – Beef Sep 14 '11 at 18:43
  • @Beef I'll be happy if that edit helps you understand regexps and tailor them to your own needs :) – Xavi López Sep 14 '11 at 19:05
  • I don't know if you will be checking this, but I have another regex-related question I'm trying to work through. Here is the link: http://stackoverflow.com/questions/7432018/strip-data-from-text-file-using-regex – Beef Sep 15 '11 at 14:04
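
For the "Friday" part of the update, the comma check suggested in the comments above could look roughly like this. It is a sketch that assumes, as that comment does, that the format is fixed and only the date line contains a comma. DayOfWeekLine and dayOf are hypothetical names.

```java
class DayOfWeekLine {
    /**
     * Returns the text before the first comma (e.g. "Friday"),
     * or null if the line has no comma and so is not a date line.
     */
    static String dayOf(String line) {
        if (!line.contains(",")) {
            return null;
        }
        // The limit of 2 keeps index 0 valid even for a line that is just ","
        return line.split(",", 2)[0].trim();
    }

    public static void main(String[] args) {
        System.out.println(dayOf("Friday, September 9 2011")); // prints "Friday"
    }
}
```

In the reading loop, this check would run before the regex test, so the date line is handled first and every other line falls through to the pattern.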