I have a list of company names, and I have a list of URLs for pages that mention company names.
The end goal is to fetch each URL and find out how many of the companies mentioned on the page are in my list.
Example URL: http://www.dmx.com/about/our-clients
Each URL will be structured differently, so I don't have a good way to do a regex search and create individual strings for each company name.
I'd like to build a for loop that searches the entire contents of each page for every company in my list. But it seems like Levenshtein distance is better suited to comparing two short strings than a short string against a large body of text.
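To make the loop idea concrete, here is a minimal sketch using only the Python standard library. It assumes that exact matching after normalization (lowercasing and collapsing punctuation) is good enough, which sidesteps Levenshtein entirely and will miss misspellings or abbreviated names. The helper names (`fetch_html`, `html_to_text`, `normalize`, `companies_on_page`) and the sample data are made up for illustration.

```python
import re
import urllib.request
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text from an HTML document, skipping <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def fetch_html(url):
    # Hypothetical helper: download a page and decode it leniently as UTF-8.
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def html_to_text(html):
    # Strip the markup so only the page's text remains.
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def normalize(text):
    # Lowercase and turn every run of non-alphanumeric ASCII characters into
    # one space, so "Globex, Inc." and "globex inc" compare equal.
    # Assumption: names are plain ASCII; accented letters would be dropped.
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def companies_on_page(companies, html):
    # Pad with spaces so names only match on whole-word boundaries.
    page = " " + normalize(html_to_text(html)) + " "
    found = []
    for name in companies:
        key = normalize(name)
        if key and f" {key} " in page:
            found.append(name)
    return found


# Made-up sample data; a real run would use fetch_html(url) instead.
sample_html = (
    "<html><body><ul><li>Acme Corp.</li><li>Globex, Inc</li></ul>"
    "<script>var x = 'Initech';</script></body></html>"
)
sample_companies = ["Acme Corp", "Initech", "Globex Inc", "Umbrella"]
matches = companies_on_page(sample_companies, sample_html)
print(len(matches), matches)
```

Because the whole page is reduced to one normalized string, the page's structure does not matter and no per-site regex is needed. If exact matching turns out to be too strict, a fuzzy comparison could be applied to short windows of the page text rather than to the page as a whole.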
Where should this beginner be looking?