I have thousands of web pages (each requires a login with a username and password) such as https://XXX.incometax.XXX/Preview/ViewDetail?TIN_INFO_NO=11935# where only the trailing number (11935 in this example) changes from one URL to the next. Each URL retrieves the tax information for one taxpayer, presented in several types of tables. Which tables and fields appear depends on the information entered in the system for that taxpayer. For example, the information table shows a National Identity Card (NID) number for taxpayers who created their electronic taxpayer's identification number (eTIN) using an NID, and a passport number for those who created their eTIN using a passport. The bottom line is that the information table differs from taxpayer to taxpayer. I need an automation that extracts these tables so that a column is created for every newly encountered field and each value is placed under its corresponding column.
For example, suppose a taxpayer can create an eTIN using either an NID or a passport number, but not both. Say that on the first pass the automation finds NID information and on the second pass it finds passport information: it should then create a new column named passport and place that information under it. If on the third pass it finds NID information again, it should place it under the NID column created on the first pass. Finally, the automation should generate a single CSV file.
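To make the expected result concrete, here is a minimal sketch in Python of the merge behavior described above. It only illustrates the desired output and is not a proposed solution: the sample values, the column names and the helper names (collect_columns, to_csv) are made up, and logging in, fetching and parsing the pages are left out entirely.

```python
import csv
import io


def collect_columns(records):
    """Return every column name seen across all records, in first-seen order."""
    columns = []
    for record in records:
        for name in record:
            if name not in columns:
                columns.append(name)
    return columns


def to_csv(records):
    """Write all records into one CSV; fields a record lacks are left empty."""
    columns = collect_columns(records)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical results of three passes (values are placeholders, not real data).
passes = [
    {"TIN_INFO_NO": "11935", "NID": "nid-value-1"},
    {"TIN_INFO_NO": "11936", "Passport": "passport-value-1"},
    {"TIN_INFO_NO": "11937", "NID": "nid-value-2"},
]

csv_text = to_csv(passes)
print(csv_text)
```

With this sample input, the second pass adds a Passport column, and the third pass reuses the NID column created on the first pass, leaving its Passport cell empty.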
N.B. There are no legal restrictions on my extracting information from that site. I would like a non-programmatic solution.