I have 2 files that I'm parsing line by line, adding the contents of each to its own ArrayList<String>. I'm trying to build a final container, "finalPNList", that matches the 'Resulting File/ArrayList' below.

The issue is that I'm not successfully avoiding duplicates. I've changed the code in various ways without success. Sometimes the condition is too strict and drops every duplicate, and sometimes it is too loose and lets every duplicate through. I can't seem to find the conditions that get it just right.

Here is the code so far. The contents of processLine() aren't really relevant here; just know that you end up with a map holding 2 ArrayList<String> values:

public static Map<String, List<String>> masterList = new HashMap<String, List<String>>();
public static List<String> finalPNList = new ArrayList<String>();
public static List<String> modifier = new ArrayList<String>();
public static List<String> skipped = new ArrayList<String>();

for (Entry<String, String> e : tab1.entrySet()) {
    String key = e.getKey();
    String val = e.getValue();

    // returns BufferedReader to start line processing
    inputStream = getFileHandle(val);
    // builds masterList containing all data
    masterList.put(key, processLine(inputStream));
}
for (Entry<String, List<String>> e : masterList.entrySet()) {
    String key = e.getKey();
    List<String> val = e.getValue();
    System.out.println(modifier.size());
    for (String s : val) {
        if (modifier.size() == 0)
            finalPNList.add(s);
        if (!modifier.isEmpty() && finalPNList.contains(s)
                && !modifier.contains(key)) {
            // s has been added by parent process so SKIP!
            skipped.add(s);
        } else
            finalPNList.add(s);    
    }    
    modifier.add(key);
}

Here is what the data may look like (extremely simplified; the real data is about 20K lines in total, about 10K lines in each file):

File A

123;data
123;data
456,data

File B

123;data
789,data
789,data

Resulting File/ArrayList

123;data
123;data
789,data
789,data
Meesh
Roberto Navarro
  • What is your expected output? – Sualeh Fatehi Sep 27 '13 at 01:33
  • The output I expect is in the resulting file section of my post... The last piece – Roberto Navarro Sep 27 '13 at 02:44
  • Would you please explain in a bit more detail the rules of how you wish to handle duplicates? It's not yet clear from the above example. – Meesh Sep 27 '13 at 05:39
  • @Meesh sorry about not being too clear. That is the meat of the puzzle. In essence, all the data from file A should make it to the output file; data from file B, however, should only be added if it doesn't already exist (provided by file A). If it exists, add it to an "exists" ArrayList for review. I account for this but end up removing too much. I can add some examples to my thread, but I'd like to keep it reduced to avoid clutter. – Roberto Navarro Sep 27 '13 at 06:04

1 Answer

  • !modifier.contains(key) is always true, because key is only added to modifier after its list has been processed and each key occurs once in the map. It can be removed from your if-statement.
  • modifier.size() == 0 can be replaced with modifier.isEmpty().
  • Since you seem to want to keep lines that are duplicated within File B (789 appears twice in your expected output), you need to check File A, not finalPNList, when checking for existence. I just checked the applicable list in masterList; feel free to change this to something more appropriate or efficient.
  • You need an else after your first if-statement, otherwise you're adding items from File A twice.
  • I assumed you just missed 456 in your expected output; otherwise I may not have fully understood the requirement.

Modified code with your file-IO replaced with something that's more in the spirit of an SSCCE:

masterList.put("A", Arrays.asList("123","123","456"));
masterList.put("B", Arrays.asList("123","789","789"));
for (Map.Entry<String, List<String>> e : masterList.entrySet()) {
    String key = e.getKey();
    List<String> val = e.getValue();
    System.out.println(modifier.size());
    for (String s : val) {
        if (modifier.isEmpty()) {
            // first list processed: take every line, duplicates included
            finalPNList.add(s);
        } else if (!modifier.isEmpty() && masterList.get("A").contains(s)) {
            // s has been added by parent process so SKIP!
            skipped.add(s);
        } else {
            // not present in File A, so keep it
            finalPNList.add(s);
        }
    }
    // mark this key's list as processed
    modifier.add(key);
}
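
As a possible follow-up to the "more appropriate / efficient" remark above, here is a minimal, self-contained sketch of the same rule. It rests on two assumptions that are not in the original code: the map is a LinkedHashMap, so File A is guaranteed to be iterated first (a plain HashMap makes no promise about iteration order), and File A's lines are copied into a HashSet, so each lookup avoids scanning a 10K-line list. The class name MergeDemo and the helper merge() are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class MergeDemo {
    // Hypothetical helper: keeps every line of the first list, then only those
    // lines of later lists that the first list does not contain. Lines that are
    // rejected are collected in 'skipped' for review.
    static List<String> merge(Map<String, List<String>> masterList, List<String> skipped) {
        List<String> finalPNList = new ArrayList<>();
        Set<String> firstFileLines = null; // filled from the first map entry
        for (Map.Entry<String, List<String>> e : masterList.entrySet()) {
            if (firstFileLines == null) {
                // first file: keep everything, duplicates included
                finalPNList.addAll(e.getValue());
                firstFileLines = new HashSet<>(e.getValue());
                continue;
            }
            for (String s : e.getValue()) {
                if (firstFileLines.contains(s)) {
                    skipped.add(s);      // already provided by the first file
                } else {
                    finalPNList.add(s);  // new line, keep it (even if repeated)
                }
            }
        }
        return finalPNList;
    }

    public static void main(String[] args) {
        // LinkedHashMap preserves insertion order, so "A" is processed before "B"
        Map<String, List<String>> masterList = new LinkedHashMap<>();
        masterList.put("A", Arrays.asList("123", "123", "456"));
        masterList.put("B", Arrays.asList("123", "789", "789"));
        List<String> skipped = new ArrayList<>();
        System.out.println(merge(masterList, skipped)); // [123, 123, 456, 789, 789]
        System.out.println(skipped);                    // [123]
    }
}
```

With the sample data this keeps 456 from File A, in line with the assumption stated in the last bullet above.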


Bernhard Barker
  • I'll give it a go in a few minutes and let you know if it worked--thank you for your time and effort. Be sure I'll be back to confirm if this did the trick. – Roberto Navarro Sep 27 '13 at 18:01
  • This line made all the difference: "&& masterList.get("A").contains(s)" Thanks for your time and effort! – Roberto Navarro Sep 27 '13 at 18:43