I am developing an access DB where in one of the table inputs, duplicates regularly appear in (what I would eventually like to be) the primary key due to the use of some old software for reasons as can be seen below:
ID | Fluid | Designer | Team |
--------------------------------------------------
A | Water | John | Piping |
A | Water | James | Piping |
B | Steam | Sam | Piping |
B | Steam | Sam | Modeling |
C | Hydrogen | Joe | Piping |
D | Steam | Joe | Piping |
A duplicates query with reference to the ID field yields:
ID | Fluid | Designer | Team |
--------------------------------------------------
A | Water | John | Piping |
A | Water | James | Piping |
B | Steam | Sam | Piping |
B | Steam | Sam | Modeling |
Deleting these duplicates automatically is not desired as it is important to know, say, that the software model changed designer or was transferred to another team. As such, these are manually reviewed.
However, this table can contain as much as 30 fields and finding a single non-duplicate can be quite an eyesore. As such, what my desired output would be:
ID | Fluid | Designer | Team |
--------------------------------------------------
A | | John | |
A | | James | |
B | | | Piping |
B | | | Modeling |
I have very little experience with SQL but have spent some time using visual basic so my guess would be some criteria like:
if(record(n)=record(n-1), display("")
However from my basic understanding of access and SQL, I know that relating visual basic is by far, an apples to oranges comparison.
I'm running MS Access 2007-2010.
Edit: Looking into normalization now (apologies, I'm a beginner). More specifics though on my DB issue. Also updated the final table above.
How I believe my DB should work:
- Import data from software (daily). This yields table 1.
- Use a duplicate query to find any duplicates in the ID column. This is table 2.
- Use a query to generate a table which will display the duplicated IDs along with the respective column that is not duplicated.
The errors in the software derive from the fact that it amends new data onto preexisting data but only removes exact duplicates. If say, a different user works on the model, then instead of updating the current ID, it adds a new row with all details the same except a different user.