First, this answer relies on two assumptions:
- Two strings describing the same model differ only in whether or not they contain a manufacturer string.
- It is feasible to make a list of all possible manufacturer strings.
If these two assumptions are satisfied, one can ignore every string that is part of the manufacturer string list while comparing two model strings. This way only the rest of the model string is evaluated.
Here is an example in C#. The local strings aClean and bClean are used so that the original strings are left untouched.
List<string> manufacturers; // List of all possible manufacturer strings
List<string> modelsA; // List of all model strings from file A
List<string> modelsB; // List of all model strings from file B
foreach (string a in modelsA)
{
    // Remove manufacturer name and spaces
    string aClean = RemoveManufacturer(a).Replace(" ", "");
    foreach (string b in modelsB)
    {
        // Remove manufacturer name and spaces
        string bClean = RemoveManufacturer(b).Replace(" ", "");
        // Now compare and process the strings.
        // Store original strings a or b if required
        ...
    }
}
string RemoveManufacturer(string model)
{
    foreach (string manufacturer in manufacturers)
    {
        // Remove manufacturer from model if present.
        // Strings are immutable: Replace returns a new string, so assign the result.
        model = model.Replace(manufacturer, "");
    }
    return model;
}
This code is far from optimized: for example, bClean is recomputed for every a. But your use case does not seem to be particularly performance-sensitive anyway.
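If you want something you can run directly, here is a minimal self-contained sketch of the same idea. It cleans each string from file B only once and groups the originals by their cleaned form. The names ModelMatcher, Normalize and FindMatches, as well as the sample data, are made up for illustration. It also assumes that "compare" means exact equality of the cleaned strings, which may be stricter than what you need.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ModelMatcher
{
    // Strip every known manufacturer name and all spaces from a model string.
    // Note: string.Replace(string, string) is case-sensitive.
    public static string Normalize(string model, IReadOnlyCollection<string> manufacturers)
    {
        foreach (string manufacturer in manufacturers)
        {
            // Replace returns a new string, so the result has to be assigned.
            model = model.Replace(manufacturer, "");
        }
        return model.Replace(" ", "");
    }

    // Pair up original strings from A and B whose cleaned forms are equal
    // (assumption: exact equality is the desired comparison).
    public static List<(string A, string B)> FindMatches(
        IEnumerable<string> modelsA,
        IEnumerable<string> modelsB,
        IReadOnlyCollection<string> manufacturers)
    {
        // Clean each string from file B once; keep the originals grouped by cleaned form.
        ILookup<string, string> cleanedB = modelsB.ToLookup(b => Normalize(b, manufacturers));

        var matches = new List<(string A, string B)>();
        foreach (string a in modelsA)
        {
            string aClean = Normalize(a, manufacturers);
            // A lookup returns an empty sequence for a key it does not contain.
            foreach (string b in cleanedB[aClean])
            {
                matches.Add((a, b));
            }
        }
        return matches;
    }
}

static class Program
{
    static void Main()
    {
        // Hypothetical sample data, only for demonstration.
        var manufacturers = new List<string> { "Acme", "Globex" };
        var modelsA = new List<string> { "Acme X 100", "Globex Z9" };
        var modelsB = new List<string> { "X100", "Z 9", "Q7" };

        foreach (var (a, b) in ModelMatcher.FindMatches(modelsA, modelsB, manufacturers))
        {
            Console.WriteLine($"{a} <-> {b}");
        }
    }
}
```

The lookup replaces the inner loop over modelsB, so each string is cleaned once instead of once per pair. If exact equality is too strict for your data, you can keep the nested loops and swap in whatever comparison you need on aClean and bClean.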