I developed an application in C# using Windows Forms that scrapes selected websites for images.
The first problem is that the websites I monitor constantly change their look and feel, so my code constantly needs updating. I switched to using XPaths to isolate the divs I'm looking for, but the div IDs change too. I've thought of using a text file holding the div XPath for each site, which the software would read at startup, saving me the time of editing and recompiling the code. Is there a better way to solve this problem? Maybe CodeDOM?
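To make it concrete, this is roughly what I had in mind for the text-file approach — a minimal sketch, assuming a tab-separated file I'm calling `sites.txt` (the file name and format are just placeholders I made up):

```csharp
// Hypothetical "sites.txt", one entry per line, tab-separated:
//   example<TAB>https://example.com/gallery<TAB>//div[@class='thumbs']//img

using System.Collections.Generic;
using System.IO;

public static class XPathConfig
{
    // Reads the per-site XPath map at startup, so a site layout change
    // only requires editing the text file, not recompiling the program.
    public static Dictionary<string, (string Url, string XPath)> Load(string path)
    {
        var map = new Dictionary<string, (string, string)>();
        foreach (var line in File.ReadAllLines(path))
        {
            if (string.IsNullOrWhiteSpace(line) || line.StartsWith("#"))
                continue; // skip blank lines and comments

            var parts = line.Split('\t');
            map[parts[0]] = (parts[1], parts[2]);
        }
        return map;
    }
}
```

That way, when a div ID changes, I would only edit one line of the text file.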
Secondly, since every website uses different formatting and encoding, I had to rewrite parts of the code with HtmlDocument, HttpWebResponse, HtmlNode, and others for each of them, which ended up accounting for nearly half of my code. I could not consolidate them, since some sites need extra scraping and pagination and some do not. Is there a way to simplify this problem?
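To give an idea of the duplication: each per-site method currently repeats boilerplate along these lines, with only the XPath (and any pagination) differing. This is a simplified sketch assuming HtmlAgilityPack, which is where my HtmlDocument/HtmlNode types come from:

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class PageScraper
{
    // The part that is basically identical for every site: load the page
    // and pull image URLs out of whatever nodes the XPath selects.
    public static List<string> ExtractImageUrls(string url, string xpath)
    {
        var doc = new HtmlWeb().Load(url); // HtmlWeb auto-detects the encoding
        var nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes == null)
            return new List<string>(); // the XPath matched nothing

        return nodes.Select(n => n.GetAttributeValue("src", null))
                    .Where(src => src != null)
                    .ToList();
    }
}
```

Everything around this (pagination, extra scraping steps) is what varies per site, and that is the part I can't see how to share.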
Lastly, I have the whole program in one class file of around 600 lines. The only methods I have are the BackgroundWorker handlers, the UI event handlers, one scraping method per site, and one method to save the images. Is it all right to have all the code in one class? When I wrote Java, I often split functionality into multiple classes and used them as objects, which made changing particular sections easier. Can I do the same in C#?
Is there a more efficient way to structure the software? I was thinking of making a class for each site, so that modifications could be made directly to the class in question, but that would mean repeating a lot of lines in each class. Or is it okay to keep the whole thing in one class file?
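Roughly, the class-per-site idea I was imagining looks like this — a base class holding the shared logic, with each site overriding only what differs. All the names and the paging scheme here are made up for illustration:

```csharp
using System.Collections.Generic;

// Shared skeleton: each site-specific subclass only supplies its XPath
// and, if needed, its own pagination rule.
public abstract class SiteScraper
{
    protected abstract string ImageXPath { get; }

    // Default: a single page. Sites that paginate override this.
    protected virtual IEnumerable<string> GetPageUrls(string startUrl)
    {
        yield return startUrl;
    }

    public List<string> Scrape(string startUrl)
    {
        var images = new List<string>();
        foreach (var pageUrl in GetPageUrls(startUrl))
            images.AddRange(ExtractImages(pageUrl));
        return images;
    }

    // The shared fetch/parse code (HtmlAgilityPack, encoding handling,
    // etc.) would live once, here, instead of being copied per site.
    private List<string> ExtractImages(string url)
    {
        // ... load url, SelectNodes(ImageXPath), collect src attributes ...
        return new List<string>();
    }
}

// A site that paginates overrides only the default single-page behaviour.
public class ExampleSiteScraper : SiteScraper
{
    protected override string ImageXPath => "//div[@class='gallery']//img";

    protected override IEnumerable<string> GetPageUrls(string startUrl)
    {
        for (int page = 1; page <= 3; page++)   // made-up paging scheme
            yield return $"{startUrl}?page={page}";
    }
}
```

My worry is whether this actually avoids the repetition, or whether I'd still end up duplicating the fetch/parse code across subclasses.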
Thanks.
PS: This software is for personal use, but I think it is a good opportunity to learn and apply good programming practices.