2

I have two lists, let's call them list A and list B. Both of these lists contain names and there are no duplicates (they are unique values). Every name in list B can be found in list A. I want to find which names are missing from list B in order to insert those missing names into a database. Basic example:

List<String> a = new ArrayList<>(Arrays.asList("name1", "name2", "name3", "name4","name5","name6"));
List<String> b = new ArrayList<>(Arrays.asList("name1", "name2", "name4", "name6"));

a.removeAll(b);
//iterate through and insert into my database here

From what I've found, removeAll() seems to be the go-to answer. In my case the quantity varies widely: anywhere from 500 to 50,000 names. Will removeAll() suffice for this? I've read that removeAll() is O(n^2), which may not be a problem for small lists but sounds like it could be for larger ones. I imagine it also depends on the user's patience as to when it becomes a problem. Ultimately, I'm wondering whether there is a better way to do this without adding a lot of complexity, as I do appreciate simplicity (to a point).

Matheus Lacerda
LJR135

2 Answers

1

If the only thing you're doing with these lists is inserting them into a database, you shouldn't need to care about the order of the elements. You could use HashSets instead of ArrayLists and get O(n) performance instead of O(n^2), since a HashSet lookup takes constant time on average. As a side bonus, using a Set will ensure the values in a and b really are unique.
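As a rough sketch of that idea (the class name `MissingNames` and the helper `missingFrom` are made up for illustration and are not part of the question's code), the difference could be computed like this. A `LinkedHashSet` is used for the copy only so the result keeps the original order, which is optional if order really does not matter:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class MissingNames {

    // Hypothetical helper: returns the names in `all` that do not appear in `present`.
    static Set<String> missingFrom(List<String> all, List<String> present) {
        // Copy `all` so the caller's list is left untouched.
        Set<String> missing = new LinkedHashSet<>(all);
        // Wrapping `present` in a HashSet keeps every lookup O(1) on average,
        // whichever side removeAll decides to iterate over.
        missing.removeAll(new HashSet<>(present));
        return missing;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("name1", "name2", "name3", "name4", "name5", "name6");
        List<String> b = Arrays.asList("name1", "name2", "name4", "name6");

        // Prints [name3, name5]; this is where the database insert would go.
        System.out.println(missingFrom(a, b));
    }
}
```

Building the two sets and removing one from the other is linear in the total number of names, so 50,000 entries should not be a concern with this approach.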

Mureinik
  • Thanks. So using HashSet's contains() seems like it would be much better. If I understand correctly HashSet contains() is O(1). If I stick that in a for loop I'm at O(n) as you mentioned correct? Sorry if it seems like I'm echoing or being redundant. I'm still at a beginner/ low-intermediate level with this stuff and want to make sure I understand correctly. – LJR135 Jun 14 '18 at 06:06
  • @LJR135 a loop would work, but you don't need it. `HashSet` has a `removeAll` method which basically does that for you, so the only thing you need to change in your code are the types. – Mureinik Jun 14 '18 at 06:09
  • Awesome thank you. Didn't even realize HashSet also has a removeAll. Marking as answered. Cheers! – LJR135 Jun 14 '18 at 06:15
0

50,000 names is a very small amount of data. Unless you're doing this repeatedly, any reasonable approach would likely be good enough.

One way to implement this:

List<String> a = new ArrayList<>(Arrays.asList("name1", "name2", "name3", "name4","name5","name6"));
Set<String> b = new HashSet<>(Arrays.asList("name1", "name2", "name4", "name6"));

for (String s : a) {
  if (!b.contains(s)) {
    insertToDb(s);
  }
}

Or, using the Stream API in Java 8:

List<String> result = a.stream().filter(s -> !b.contains(s)).collect(Collectors.toList());
// alternatively: Set<String> result = a.stream().filter(s -> !b.contains(s)).collect(Collectors.toSet());
Lie Ryan