
I have to make a program that reads a text file and checks the 8 pieces of data on each line against certain criteria. The program already reads the text file correctly, and I've implemented a tokenizer to split each line into the 8 separate tokens. Now I need a good way to test the tokens for possible errors. Is there an easy way to do this? Here is the list of errors.

CODE ERROR

A Invoice code is too short

B Invoice code does not have the right characters

C Invoice code digits are all zero

D Name field has fewer than two words

E Name field has more than four words

F Name field has no comma

G Name field has a bad title

H Name field has a bad initial

I Sale price has no decimal point

J Sale price has more than one decimal point

K Sale price has a leading zero

L Genre has bad symbols (contains upper case letter or contains symbols)

M Order date is not six characters

N Order date is not all digits

P Order date is not a legal date

Q Shipping date is not six characters

T Shipping date is not all digits

U Shipping date is not a legal date

V Unclassified error

Each field could have up to one error. Thus, each record could have up to six errors. Each field should be checked for errors in the order that the above errors are listed.

INVOICE CODE exactly 6 characters

CUSTOMER NAME maximum 30 characters

SALE PRICE maximum 8 characters

GENRE maximum 10 characters

ORDER DATE exactly 6 characters

SHIPPING DATE exactly 6 characters

These fields are separated by a semicolon.

Invoice code is supposed to be three upper case letters followed by three digits where at least one digit is not zero.

The customer name should be in the form last name followed by a comma, then optional title, first name, and optional middle initial. Titles must be one of Mr., Mrs., Dr., Miss, or Ms. Middle initials must be an upper case letter followed by a period. The words should be delimited by a single space.

The sale price has two decimal digits to the left of the decimal point. The price should have no leading zero.

The genre should only contain lowercase letters.

The two dates should be in MMDDYY format where all six characters are digits.

The dates must be legal dates. The order date/shipping date may not be after today’s date.

Here is my full code:

package Project3;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.StringTokenizer;

public class Main {

public static void main(String[] args) throws IOException{

BufferedReader in = new BufferedReader(new FileReader("movie.dat"));
String line;
String token;
String delimiter = ";";
StringTokenizer tokenizer;


while((line = in.readLine()) != null){
    tokenizer = new StringTokenizer(line, delimiter);

    while (tokenizer.hasMoreTokens()){
        token = tokenizer.nextToken();
        System.out.print("Invoice Code: "+token+" ");
        // Error A: the invoice code must be exactly 6 characters long
        if (token.length() != 6) {
            System.out.println("A");
        }
        token = tokenizer.nextToken();
        System.out.print("Customer Name: "+token+" ");
        token = tokenizer.nextToken();
        System.out.print("Sale Price: "+token+" ");
        token = tokenizer.nextToken();
        System.out.print("Genre: "+token+" ");
        token = tokenizer.nextToken();
        System.out.print("Order Date: "+token+" ");
        token = tokenizer.nextToken();
        System.out.print("Shipping Date: "+token+" ");
        }
    System.out.println();
    }


in.close();
}

}

  • You should learn how to use RegEx. – CloudPotato Aug 18 '16 at 13:25
  • Can you please post a sample record? – Laur Ivan Aug 18 '16 at 13:27
  • You need to know (and maybe you already know, then fine) the fields and the requirements in more detail. For code A, for example, you need to know which field is invoice code and what the minimum length is. Specifically code V looks like a mystery — maybe you need not use it, or maybe you can use it if you discover errors that don’t fall under the other codes. – Ole V.V. Aug 18 '16 at 13:50
  • Sorry, yes I do know the parameters of the error codes, I will include them – Daniel Winter Aug 18 '16 at 14:01

1 Answer


It looks like you first have to group the codes by field. Depending on the format, you might get away with a simple split() instead of a tokenizer.
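As a rough sketch of the split() idea, assuming the fields never contain a semicolon themselves: the class name, method name and sample record below are made up for illustration and do not come from the real movie.dat. One difference worth knowing is that StringTokenizer silently skips empty fields, while split() with a limit of -1 keeps them, so a missing field stays in its position.

```java
import java.util.Arrays;

class SplitSketch {
    // Splits one record into its fields; the -1 limit keeps trailing empty fields.
    static String[] splitRecord(String line) {
        return line.split(";", -1);
    }

    public static void main(String[] args) {
        // Hypothetical sample record, not taken from the real input file
        String line = "ABC123;Smith, Mr. John Q.;19.99;comedy;081516;081716";
        String[] fields = splitRecord(line);
        System.out.println(fields.length + " fields: " + Arrays.toString(fields));
    }
}
```

With the fields in an array, fields[0] is always the invoice code, fields[1] the customer name, and so on, even when one of them is empty.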

To keep things simple, I'd create a method in the class for each field check. Then, in the respective method, you need to parse it (via regex) and generate the error codes (if any). The method would return the error code or nothing (you'll have to define what nothing means to you).
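For the invoice code, such a method could look roughly like this. It is only a sketch: the names InvoiceCheck and checkInvoiceCode are made up, "nothing" is represented by null here, and I am assuming that a code longer than six characters counts as error B, since the list has no "too long" code.

```java
class InvoiceCheck {
    // Returns "A", "B" or "C" for the first error found, or null when the code is valid.
    // The checks run in the order the error codes are listed.
    static String checkInvoiceCode(String code) {
        if (code.length() < 6) {
            return "A"; // too short
        }
        // matches() tests the whole string: three upper case letters, then three digits.
        // Assumption: a code that is too long also ends up here.
        if (!code.matches("[A-Z]{3}[0-9]{3}")) {
            return "B"; // wrong characters
        }
        if (code.endsWith("000")) {
            return "C"; // digits are all zero
        }
        return null; // no error
    }

    public static void main(String[] args) {
        for (String code : new String[] {"ABC123", "AB12", "abc123", "ABC000"}) {
            System.out.println(code + " -> " + checkInvoiceCode(code));
        }
    }
}
```

The other fields would each get a similar method that returns the first matching code from its own group.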

You can then adapt the second part of the code to print the error codes.
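One possible way to collect and print the codes, shown for the two date fields only. This is a sketch under some assumptions: the names RecordReport, checkDate and checkDates are invented, a date after today is reported with the "not a legal date" code, and two-digit years are read as 2000 to 2099, which is how java.time resolves the "uu" pattern. Today's date is passed in as a parameter so the check can be tried with a fixed date.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.ArrayList;
import java.util.List;

class RecordReport {
    // STRICT resolution rejects impossible dates such as February 30.
    private static final DateTimeFormatter MMDDYY =
            DateTimeFormatter.ofPattern("MMdduu").withResolverStyle(ResolverStyle.STRICT);

    // Checks one date field. The three codes are passed in so the same method
    // serves the order date (M, N, P) and the shipping date (Q, T, U).
    static String checkDate(String text, LocalDate today,
                            String lengthCode, String digitCode, String legalCode) {
        if (text.length() != 6) {
            return lengthCode;
        }
        if (!text.matches("[0-9]{6}")) {
            return digitCode;
        }
        try {
            LocalDate date = LocalDate.parse(text, MMDDYY);
            // Assumption: a date after today counts as "not a legal date".
            return date.isAfter(today) ? legalCode : null;
        } catch (DateTimeParseException e) {
            return legalCode;
        }
    }

    // Collects at most one code per date field, in the listed order.
    static List<String> checkDates(String orderDate, String shippingDate, LocalDate today) {
        List<String> errors = new ArrayList<>();
        String error = checkDate(orderDate, today, "M", "N", "P");
        if (error != null) {
            errors.add(error);
        }
        error = checkDate(shippingDate, today, "Q", "T", "U");
        if (error != null) {
            errors.add(error);
        }
        return errors;
    }

    public static void main(String[] args) {
        // Hypothetical dates; a real program would pass LocalDate.now() as today.
        LocalDate today = LocalDate.of(2016, 8, 18);
        System.out.println("Errors: " + checkDates("081516", "023016", today));
    }
}
```

In the loop from the question, the codes returned for every field of a record could be added to one list like this and printed after the field values.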

Laur Ivan
  • I understand the concept, but I'm not exactly sure how the code would look. Would you be able to write me a little example? I'm fairly new to Java. – Daniel Winter Aug 18 '16 at 13:52