I have the following problem. I want to compare two lists of dictionaries and find matches.

Schema List A:

[
  {
    authorID: string,
    articleID: string
  }
]

Schema List B:

[
  {
    authorID: string,
    firstName: string,
    lastName: string
  }
]

List A has 1653 entries and list B has 1344.

This is the code I used to compare the lists and save the matches.

def unitedListArticleAuthor(listA, listB):
    new_list = []
    for article in listA:
        for author in listB:
            if article["authorID"] == author["authorID"]:
                new_list.append({
                    "articleID": article["articleID"],
                    "authorID": article["authorID"],
                    "firstName": author["firstName"],
                    "lastName": author["lastName"],
                })
    return new_list

Now the length of the new list is 1667. That means a mistake must have happened somewhere, because list A is the longer one and only has to be matched against list B. Only the author IDs should be matched, in order to add the author's name to each article.

What is my mistake?

  • Is there a chance there are 14 cases where there are 3 of the same author id? – Sayse Jun 01 '22 at 11:08
  • So each author has a unique ID. There can be no duplicates. Or what exactly do you mean? – ghandesign Jun 01 '22 at 11:26
  • Maybe check the integrity of the authors' names. It could be that they are not unique, such as "Robert V. Higgs" also appearing as "Robert Higgs". – cards Jun 01 '22 at 11:48
  • I'm suggesting that list A or list B contains the same author id (or author) for more than one entry in the list. – Sayse Jun 01 '22 at 11:49

1 Answer

As Sayse said, it is likely that there are multiple instances of the same authorID.
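One way to check this before changing the loop is to count how often each authorID occurs. This is only a minimal sketch: the helper name `duplicate_author_ids` and the sample records are made up for illustration and are not your real data.

```python
from collections import Counter


def duplicate_author_ids(records):
    """Return {authorID: count} for every authorID that occurs more than once."""
    counts = Counter(record["authorID"] for record in records)
    return {author_id: n for author_id, n in counts.items() if n > 1}


# Hypothetical sample data with one repeated authorID.
sample_authors = [
    {"authorID": "a1", "firstName": "Ada", "lastName": "Lovelace"},
    {"authorID": "a2", "firstName": "Alan", "lastName": "Turing"},
    {"authorID": "a1", "firstName": "Ada", "lastName": "Lovelace"},
]

print(duplicate_author_ids(sample_authors))  # {'a1': 2}
```

Running the same helper on listB should show whether any author appears more than once. Every extra copy of an author adds one extra row per article by that author in the nested loop.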

You seem really sure that this is not the case, but try adding a break statement like this:

def unitedListArticleAuthor(listA, listB):
    new_list = []
    for article in listA:
        for author in listB:
            if article["authorID"] == author["authorID"]:
                new_list.append({
                    "articleID": article["articleID"],
                    "authorID": article["authorID"],
                    "firstName": author["firstName"],
                    "lastName": author["lastName"],
                })
                break
    return new_list
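The same "first match only" behaviour can also be sketched with a dictionary lookup instead of the inner loop. This assumes that keeping the first author record per authorID is what you want, which is what the break does. The function name `join_articles_with_authors` and the sample data are hypothetical.

```python
def join_articles_with_authors(articles, authors):
    """Attach author names to articles, using the first author record per authorID."""
    # setdefault keeps the first record seen for each authorID,
    # which mirrors what `break` does in the nested loop.
    author_by_id = {}
    for author in authors:
        author_by_id.setdefault(author["authorID"], author)

    joined = []
    for article in articles:
        author = author_by_id.get(article["authorID"])
        if author is None:
            continue  # no matching author: skipped, as in the nested loop
        joined.append({
            "articleID": article["articleID"],
            "authorID": article["authorID"],
            "firstName": author["firstName"],
            "lastName": author["lastName"],
        })
    return joined


# Hypothetical sample data: "a1" is duplicated, "a9" has no author record.
sample_articles = [
    {"authorID": "a1", "articleID": "x1"},
    {"authorID": "a2", "articleID": "x2"},
    {"authorID": "a9", "articleID": "x3"},
]
sample_authors = [
    {"authorID": "a1", "firstName": "Ada", "lastName": "Lovelace"},
    {"authorID": "a1", "firstName": "A.", "lastName": "Lovelace"},
    {"authorID": "a2", "firstName": "Alan", "lastName": "Turing"},
]

print(len(join_articles_with_authors(sample_articles, sample_authors)))  # 2
```

With this version the result can never have more rows than listA, and it has fewer rows only when some articles have an authorID that is missing from listB.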

Avadem
  • With the break, the length is 1344. But data is now missing, because the list should actually have 1653 entries. – ghandesign Jun 01 '22 at 11:35
  • If the list is now shorter, it is because there are duplicates in listB. I am guessing that if the list you should get is longer than the list you are getting, it is because some articles have an authorID that doesn't appear in listB. – Avadem Jun 01 '22 at 13:46
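
The guess in the last comment can be checked with a small sketch that lists the articles whose authorID never appears in listB. The helper name `articles_without_author` and the sample data are assumptions for illustration only.

```python
def articles_without_author(articles, authors):
    """Return the articles whose authorID does not occur in the author list."""
    known_ids = {author["authorID"] for author in authors}
    return [article for article in articles if article["authorID"] not in known_ids]


# Hypothetical sample data: "a9" has no entry in the author list.
sample_articles = [
    {"authorID": "a1", "articleID": "x1"},
    {"authorID": "a9", "articleID": "x2"},
]
sample_authors = [
    {"authorID": "a1", "firstName": "Ada", "lastName": "Lovelace"},
]

print(articles_without_author(sample_articles, sample_authors))
# [{'authorID': 'a9', 'articleID': 'x2'}]
```

If this returns a non-empty list for the real data, those articles are the rows missing from the result.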