
I am looking for ideas on how to accomplish this task, so I'll start with how my program works.

My program reads a CSV file. Each line is a key-value pair separated by a comma.

  L1234456,ygja-3bcb-iiiv-pppp-a8yr-c3d2-ct7v-giap-24yj-3gie
  L6789101,zgna-3mcb-iiiv-pppp-a8yr-c3d2-ct7v-gggg-zz33-33ie

etc

The function takes a file and parses it into an ArrayList of String[], which it returns.

    public ArrayList<String[]> parseFile(File csvFile) {
        Scanner scan = null;
        try {
            scan = new Scanner(csvFile);
        } catch (FileNotFoundException e) {

        }

        ArrayList<String[]> records = new ArrayList<String[]>();
        String[] record = new String[2];
        while (scan.hasNext()) {
            record = scan.nextLine().trim().split(",");
            records.add(record);
        }
        return records;
    }

Here is the code where I call parseFile and pass in the CSV file.

  ArrayList<String[]> Records = parseFile(csvFile);

I then created another ArrayList for the records that could not be parsed.

  ArrayList<String> NotParsed = new ArrayList<String>();

The program then sanitizes the key-value pairs. We start with the first element of each record, e.g. L1234456. If it cannot be sanitized, the program replaces it with the text "CouldNotBeParsed".

    for (int i = 0; i < Records.size(); i++) {
        if (!validateRecord(Records.get(i)[0].toString())) {
            Logging.info("Records could not be parsed " + Records.get(i)[0]);
            NotParsed.add(Records.get(i)[0].toString());
            Records.get(i)[0] = "CouldNotBeParsed";
        } else {
            Logging.info(Records.get(i)[0] + " has been sanitized");
        }
    }

Next we do the second element of the key-value pair, e.g. ygja-3bcb-iiiv-pppp-a8yr-c3d2-ct7v-giap-24yj-3gie

    for (int i = 0; i < Records.size(); i++) {
        if (!validateRecordKey(Records.get(i)[1].toString())) {
            Logging.info("Record Key could not be parsed " + Records.get(i)[0]);
            NotParsed.add(Records.get(i)[1].toString());
            Records.get(i)[1] = "CouldNotBeParsed";
        } else {
            Logging.info(Records.get(i)[1] + " has been sanitized");
        }
    }

The problem is that I need both parts of each key-value pair to be sanitized, and I need two lists: one with the pairs that were sanitized, so they can be inserted into a database, and one with the pairs that could not be sanitized, which will be printed out to the user.

I thought about looping through the records and removing the ones containing the "CouldNotBeParsed" text, which would leave only the ones that could be parsed. I also tried removing records during the for loop with Records.remove(i). However, that messes up the loop: if the first record cannot be sanitized and is removed, the next record is skipped on the following iteration because record 2 has become record 1. That's why I went with replacing the text instead.

So I was thinking there must be a better way to do this, such as sanitizing both parts of the key-value pair at the same time. Suggestions?

user1158745

1 Answer


Start by changing the data structure: rather than using a list of two-element String[] arrays, define a class for your key-value pairs:

class KeyValuePair {
    private final String key;
    private final String value;
    public KeyValuePair(String k, String v) { key = k; value = v; }
    public String getKey() { return key; }
    public String getValue() { return value; }
}

Note that the class is immutable.

Now make an object with three lists of KeyValuePair objects:

class ParseResult {
    // The final fields are assigned once, in the constructor. Initializing them
    // at the declaration as well would not compile.
    private final List<KeyValuePair> sanitized;
    private final List<KeyValuePair> badKey;
    private final List<KeyValuePair> badValue;
    public ParseResult(List<KeyValuePair> s, List<KeyValuePair> bk, List<KeyValuePair> bv) {
        sanitized = s;
        badKey = bk;
        badValue = bv;
    }
    public List<KeyValuePair> getSanitized() { return sanitized; }
    public List<KeyValuePair> getBadKey() { return badKey; }
    public List<KeyValuePair> getBadValue() { return badValue; }
}

Finally, populate these three lists in a single loop that reads from the file:

import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// FileNotFoundException is declared rather than caught here,
// so the caller decides how to deal with a missing file.
public static ParseResult parseFile(File csvFile) throws FileNotFoundException {
    final List<KeyValuePair> sanitized = new ArrayList<KeyValuePair>();
    final List<KeyValuePair> badKey = new ArrayList<KeyValuePair>();
    final List<KeyValuePair> badValue = new ArrayList<KeyValuePair>();
    // try-with-resources closes the Scanner when the loop is done.
    try (Scanner scan = new Scanner(csvFile)) {
        while (scan.hasNextLine()) {
            String line = scan.nextLine().trim();
            String[] tokens = line.split(",");
            if (tokens.length != 2) {
                // One option: log a message and continue.
                // The alternative is to throw an exception.
                Logging.info("Skipping malformed line: " + line);
                continue;
            }
            KeyValuePair kvp = new KeyValuePair(tokens[0], tokens[1]);
            // Do the validation on the spot. As in the question, validateRecord
            // checks the first field and validateRecordKey checks the second.
            if (!validateRecord(kvp.getKey())) {
                badKey.add(kvp);
            } else if (!validateRecordKey(kvp.getValue())) {
                badValue.add(kvp);
            } else {
                sanitized.add(kvp);
            }
        }
    }
    return new ParseResult(sanitized, badKey, badValue);
}

Now you have a single function that produces a single result with all your records cleanly separated into three buckets: sanitized records, records with bad keys, and records with good keys but bad values.
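
To try the bucketing idea without a file on disk, here is a minimal, self-contained sketch that partitions in-memory lines. Everything in it is an assumption made for illustration: the class name `PartitionDemo`, the nested `Pair` and `Result` types, the extra `malformed` list, and both validators. The question does not show `validateRecord` or `validateRecordKey`, so the regular expressions below are only guessed from the two sample lines.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PartitionDemo {
    // Immutable pair, the same shape as KeyValuePair in the answer.
    static final class Pair {
        final String key;
        final String value;
        Pair(String key, String value) { this.key = key; this.value = value; }
    }

    // Holds one list per outcome; 'malformed' is a hypothetical extra bucket
    // for lines that do not split into exactly two fields.
    static final class Result {
        final List<Pair> sanitized = new ArrayList<Pair>();
        final List<Pair> badKey = new ArrayList<Pair>();
        final List<Pair> badValue = new ArrayList<Pair>();
        final List<String> malformed = new ArrayList<String>();
    }

    // ASSUMPTION: a valid key is 'L' followed by exactly seven digits.
    static boolean isValidKey(String key) {
        return key.matches("L\\d{7}");
    }

    // ASSUMPTION: a valid value is ten dash-separated groups of four
    // lowercase letters or digits.
    static boolean isValidValue(String value) {
        return value.matches("[a-z0-9]{4}(-[a-z0-9]{4}){9}");
    }

    // Single pass: every line lands in exactly one bucket, so nothing is
    // removed from a list while it is being iterated.
    static Result partition(List<String> lines) {
        Result result = new Result();
        for (String line : lines) {
            String[] tokens = line.trim().split(",");
            if (tokens.length != 2) {
                result.malformed.add(line);
                continue;
            }
            Pair pair = new Pair(tokens[0].trim(), tokens[1].trim());
            if (!isValidKey(pair.key)) {
                result.badKey.add(pair);
            } else if (!isValidValue(pair.value)) {
                result.badValue.add(pair);
            } else {
                result.sanitized.add(pair);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Result r = partition(Arrays.asList(
            "L1234456,ygja-3bcb-iiiv-pppp-a8yr-c3d2-ct7v-giap-24yj-3gie",
            "X999,zgna-3mcb-iiiv-pppp-a8yr-c3d2-ct7v-gggg-zz33-33ie",
            "L6789101,not-a-valid-value",
            "no comma here"));
        System.out.println("sanitized=" + r.sanitized.size()
            + " badKey=" + r.badKey.size()
            + " badValue=" + r.badValue.size()
            + " malformed=" + r.malformed.size());
    }
}
```

Because each record is classified once and appended to a new list, the index-shifting problem described in the question does not arise.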

Sergey Kalinichenko
  • Thanks, that makes a lot more sense and is a more efficient way to do things. So now I have ParseResult Records = parseFile(csvFile); and I can see my badKey, badValue and sanitized lists. Last question: how do I loop through the Records to print out just the bad records, or just the sanitized records? For instance, I only want to pass the sanitized records to another function. Before, I was passing Records into my prepared statement function, e.g. ps(Records). How would I pass only the sanitized records to the ps function? – user1158745 Nov 14 '14 at 17:02
  • @user1158745 Now that you have `ParseResult Records`, you could take the individual lists using their getters - for example, like this: `saveToDatabase(Records.getSanitized());` – Sergey Kalinichenko Nov 14 '14 at 17:05
  • Last question for you. How would I take the classes and put them in their own files and reference them in the original code, just to clean things up? – user1158745 Nov 14 '14 at 19:14
  • @user1158745 The usual way: create a separate `class-name.java` file for each class, add package names, put the files in the proper spot in your project tree (or in your directory structure if you use the command line instead of an IDE), and use them from your "main" file by name. If you name the classes right and put them in the proper spots in the project tree, the compiler will find them automatically. You may need to make these classes public as well, if you want to make your `parseFile` public. – Sergey Kalinichenko Nov 14 '14 at 19:19
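
For the question in the comments about looping over just one bucket, here is a small sketch of a helper that turns any list of pairs into printable report lines. `ReportDemo` and `describeRejected` are hypothetical names introduced for the illustration, and `KeyValuePair` is redeclared inside the demo class only so that the block compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

class ReportDemo {
    // Redeclared here only to keep the sketch self-contained.
    static final class KeyValuePair {
        private final String key;
        private final String value;
        KeyValuePair(String k, String v) { key = k; value = v; }
        String getKey() { return key; }
        String getValue() { return value; }
    }

    // Builds one line per rejected record, prefixed with the reason.
    static List<String> describeRejected(String reason, List<KeyValuePair> rejected) {
        List<String> lines = new ArrayList<String>();
        for (KeyValuePair kvp : rejected) {
            lines.add(reason + ": " + kvp.getKey() + "," + kvp.getValue());
        }
        return lines;
    }

    public static void main(String[] args) {
        List<KeyValuePair> badKeys = new ArrayList<KeyValuePair>();
        badKeys.add(new KeyValuePair("X999", "zgna-3mcb-iiiv-pppp-a8yr-c3d2-ct7v-gggg-zz33-33ie"));
        for (String line : describeRejected("bad key", badKeys)) {
            System.out.println(line);
        }
    }
}
```

With the `ParseResult` from the answer, the same helper could be called as `describeRejected("bad key", Records.getBadKey())` and `describeRejected("bad value", Records.getBadValue())`, while `Records.getSanitized()` is the only list handed to the database function.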