
I have two text files that list camera models. Not every model in one file is present in the other, so I want to find the missing models. One issue, though: some models have extra strings in their names, e.g.:

  • NIKON D610
  • D610
  • CANON POWERSHOT A1200
  • POWERSHOT A1200

"Nikon" and "Canon" are missing from the names in one file.

I have been scratching my head over this for two days.

NBG
  • Is it always the manufacturer's name that is missing in one of the files? If so, you could make a list of possible manufacturer names and ignore these string parts while comparing. – Felix Jun 08 '22 at 06:04
  • The thing is, it is totally random: some manufacturers are present, some are not. – NBG Jun 08 '22 at 06:12
  • That is what I expected. But as long as it is feasible to make a complete list of all possible manufacturers (manufacturer strings, actually), you can just ignore every string that is in that list when comparing two model strings. (I will try to post some pseudocode in an answer.) – Felix Jun 09 '22 at 07:16

1 Answer


This answer relies on two assumptions:

  1. Two strings describing the same model differ only in whether or not they contain a manufacturer string.
  2. It is feasible to make a list of all possible manufacturer strings.

If both assumptions hold, one can ignore every string that is part of the manufacturer string list while comparing two model strings. This way only the rest of the model string is evaluated.

Here is an example in C#. The local strings aClean and bClean are used so that the original strings a and b stay unmodified.

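using System.Collections.Generic;

List<string> manufacturers = new List<string>();  // Fill with all possible manufacturer strings
List<string> modelsA = new List<string>();        // Fill with all model strings from file A
List<string> modelsB = new List<string>();        // Fill with all model strings from file B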

foreach (string a in modelsA)
{
    // Remove manufacturer name and spaces
    string aClean = RemoveManufacturer(a).Replace(" ", "");
    foreach (string b in modelsB)
    {
        // Remove manufacturer name and spaces
        string bClean = RemoveManufacturer(b).Replace(" ", "");

        // Now compare and process the strings. 
        // Store original strings a or b if required
        ...
    }
}

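string RemoveManufacturer(string model)
{
    foreach (string manufacturer in manufacturers)
    {
        // Strings are immutable: Replace returns a new string, so assign
        // the result back, otherwise the removal is lost.
        // Note that Replace is case-sensitive, so the manufacturer strings
        // must use the same casing as the files.
        model = model.Replace(manufacturer, "");
    }
    return model;
}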

This code is far from optimized, since it compares every model in file A with every model in file B. But it seems that your use case is not exactly performance-sensitive anyway.
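If the nested loops ever become a problem, one possible variant is to normalize every model once and look it up in a set. The following is only a minimal sketch under the same two assumptions; the names ModelMatcher, Normalize and FindMissing are made up for illustration, and upper-casing everything before the comparison is my own addition to cope with "Nikon" versus "NIKON". It also assumes that no manufacturer string is empty.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ModelMatcher
{
    // Hypothetical helper: upper-cases the model, strips every known
    // manufacturer string and then removes all spaces.
    public static string Normalize(string model, IReadOnlyList<string> manufacturers)
    {
        string result = model.ToUpperInvariant();
        foreach (string manufacturer in manufacturers)
        {
            // Replace returns a new string, so the result is reassigned.
            result = result.Replace(manufacturer.ToUpperInvariant(), "");
        }
        return result.Replace(" ", "");
    }

    // Returns the original strings from 'source' whose normalized form
    // does not occur among the normalized strings of 'other'.
    public static List<string> FindMissing(
        IReadOnlyList<string> source,
        IReadOnlyList<string> other,
        IReadOnlyList<string> manufacturers)
    {
        var known = new HashSet<string>(
            other.Select(m => Normalize(m, manufacturers)));

        return source
            .Where(m => !known.Contains(Normalize(m, manufacturers)))
            .ToList();
    }
}
```

With the manufacturers "Nikon" and "Canon", file A containing "NIKON D610", "CANON POWERSHOT A1200" and "NIKON D750", and file B containing "D610" and "POWERSHOT A1200", FindMissing(modelsA, modelsB, manufacturers) returns only "NIKON D750". Call it a second time with the two files swapped to find the models that are missing in the other direction.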

Felix