I have three tables in my database: tbl_object
, tbl_tag
adn tbl_object_tag_xref
.
tbl_object
and tbl_tag
only have the two columns id
and name
.
tbl_object_tag_xref
matches objects to tags and vice versa, having the columns id
, tbl_object_id
and tbl_tag_id
.
The tbl_object
and tbl_tag
tables be already filled with the appropriate data. Only the tbl_object_tag_xref
table is still empty.
I have a script which parses various data sources and outputs these xrefs in CSV format:
"object1","tag1"
"object1","tag2"
"object2","tag2"
...
Since the script can fail parsing and produce a few wrong lines, this output is then validated and corrected from humans by hand (the real data here is much more complicated and this can't be automated).
After correction, the data is converted to SQL queries:
INSERT INTO tbl_object_tag_xref VALUES (NULL, (SELECT id FROM tbl_object WHERE name = 'object1'), (SELECT id FROM tbl_tag WHERE name = 'tag1'))
INSERT INTO tbl_object_tag_xref VALUES (NULL, (SELECT id FROM tbl_object WHERE name = 'object1'), (SELECT id FROM tbl_tag WHERE name = 'tag2'))
INSERT INTO tbl_object_tag_xref VALUES (NULL, (SELECT id FROM tbl_object WHERE name = 'object2'), (SELECT id FROM tbl_tag WHERE name = 'tag2'))
...
Now I want to import this data into the database by sourcing these statements. Of course there is a intrinsic possibility of errors in these statements. How is the best way to debug these errors? Are there tools which can assist me? Should I rethink the whole workflow?
The optimal debugging-workflow for me would look like this (pseudocode):
BEGIN;
SOURCE insert_statements.sql;
SUMMARIZE all warnings and errors;
including information whether already a subquery failed, which one and why
including information about the respective line numbers and content in the .sql and the .csv files
ASK whether to COMMIT or ROLLBACK;
However, SHOW WARNINGS
and SHOW ERRORS
does not work, since the failing statement is usually not the last one. SHOW INNODB STATUS
is a pain in the ass and only shows the last foreign key error, no summary. Since the data may be huge (3000+ lines) scrolling is no option and the output buffer is usually to small to save all the output.
Any suggestions?