We need to integrate and link a number of "tables" at hand, in CSV, TSV, and Excel file formats. They are partially related by field names and/or field values. A simple example would be:
table1
    id  name   City
    xx  namex  NY
    yy  namey  Houston
    zz  namez  SA
and table2
    old_id  old_name  vendor
    yy      namey     ven1
    zz      namez     ven2
Without formally linking all fields, which would be very time-consuming, we are looking for a software tool that automatically explores and links information from multiple resources.
For example:
If given table1 and table2, the system would try to match the fields automatically and generate a combined table, based on the values in these two tables.
If given table1 id zz, the system would find all data sources that contain or partially match the value zz, evaluate their relevance (e.g. whether the data source also contains xx and yy, or whether the same row contains namez), and list all relevant information for zz.
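To make the desired behavior concrete, here is a rough Python sketch of the two cases on the toy tables above. It is only an illustration of what we have in mind, not an existing tool: the function names (match_columns, combine, lookup), the use of Jaccard overlap between column values, and the 0.5 threshold are all assumptions made up for this example. Relevance is reduced to the overlap score; a real tool would presumably do something smarter.

```python
from itertools import product

# Toy tables from the question, as lists of row dicts.
table1 = [
    {"id": "xx", "name": "namex", "City": "NY"},
    {"id": "yy", "name": "namey", "City": "Houston"},
    {"id": "zz", "name": "namez", "City": "SA"},
]
table2 = [
    {"old_id": "yy", "old_name": "namey", "vendor": "ven1"},
    {"old_id": "zz", "old_name": "namez", "vendor": "ven2"},
]


def jaccard(a, b):
    """Share of distinct values that two columns have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_columns(left, right, threshold=0.5):
    """Pair up columns whose value sets overlap enough (hypothetical rule)."""
    pairs = []
    for lc, rc in product(left[0].keys(), right[0].keys()):
        left_vals = {row[lc] for row in left}
        right_vals = {row[rc] for row in right}
        score = jaccard(left_vals, right_vals)
        if score >= threshold:
            pairs.append((lc, rc, score))
    # Best-scoring pairs first; ties keep their discovery order.
    return sorted(pairs, key=lambda p: -p[2])


def combine(left, right, left_col, right_col):
    """Left-join `right` onto `left` using one matched column pair."""
    index = {}
    for row in right:
        index.setdefault(row[right_col], []).append(row)
    combined = []
    for row in left:
        # Rows without a partner are kept, just without the extra fields.
        for partner in index.get(row[left_col], [{}]):
            combined.append({**row, **partner})
    return combined


def lookup(value, sources):
    """List every row, in any source, with a cell that contains `value`."""
    hits = []
    for name, rows in sources.items():
        for row in rows:
            cols = [c for c, v in row.items() if value in str(v)]
            if cols:
                hits.append((name, cols, row))
    return hits


matches = match_columns(table1, table2)
best_left, best_right, _ = matches[0]
combined = combine(table1, table2, best_left, best_right)
hits = lookup("zz", {"table1": table1, "table2": table2})

for row in combined:
    print(row)
for hit in hits:
    print(hit)
```

With the toy data, id pairs with old_id and name pairs with old_name purely from the shared values, and the lookup for zz returns the matching row from each table. What we are after is a tool that does this kind of thing across many files without us hand-coding each pairing.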
There is a plethora of tools in the general field of data integration and linking, but can anyone point me to one that fits this particular scenario?