
So I have just started a job and all of our data is in a terrible Access database with no input validation and no protection against accidental changes. I am not that familiar with database management yet, though I am working on it, and I need to polish this turd, if you will. There are multiple tables with the same data copied from one table to another, and no relationships where they would make sense. One particular table has no primary key designated, and somehow every record has been duplicated at least once, sometimes up to 4 times. It is a modem database: Field1 is a modem name, Field2 is a software edition, Field3 is a waveform, and the rest of the fields are data attributed to that waveform.

So, as you can see, there are many records that say "modemA" in Field1, "software 1.2.3" in Field2, "waveformX" in Field3, and so on. I need to delete the duplicate records, but I don't understand how to make the query compare all of the fields as one whole record rather than as individual fields. Say modemA and modemB both have a software edition of 1.0, and both use waveformX and waveformY: "modemA" & "1.0" & "waveformX" is different from "modemA" & "1.0" & "waveformY", which is different from "modemB" & "1.0" & "waveformX", etc.

I wonder if it is best to just break it all out into separate tables with appropriate one-to-many relationships. I cannot go through the 7k records and delete every other one by hand, and I can't figure out how to combine all the fields into one key to filter out the excess.

Corno
  • You can use general ideas from the topic of "data cleansing", which you could research yourself. But you need to tell us the new DB table(s) & exactly how to determine their values given an old table value before we can help you code a conversion. This description isn't detailed enough to give a detailed answer. Give a [mre]. Please [use text, not images/links, for text--including tables & ERDs](https://meta.stackoverflow.com/q/285551/3404097). – philipxy Jan 30 '20 at 05:21

3 Answers


I would suggest first taking a backup as a precaution before trying anything.

To identify unique rows in a table, you can use the following query:

SELECT DISTINCT * FROM TableName

You can create a new table from this query's output using a make-table query, as outlined here:

https://support.office.com/en-us/article/create-a-make-table-query-96424f9e-82fd-411e-aca4-e21ad0a94f1b
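In Access SQL, the make-table step can also be written directly as a `SELECT ... INTO` query. This is a sketch using placeholder table names (`TableName` and `TableName_Deduped` are not from the original post):

```sql
-- Copy only the distinct rows into a new table (make-table query).
-- TableName and TableName_Deduped are placeholder names.
SELECT DISTINCT *
INTO TableName_Deduped
FROM TableName;
```

After checking that the new table's row count looks right, you can delete the old table and rename the deduplicated one in its place.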

Neodawn

You could concatenate all relevant fields into one string and then group by that string. This would give you a query that returns only unique records.
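A minimal sketch of that idea in Access SQL, assuming the asker's `Field1`/`Field2`/`Field3` column names and a placeholder table name `TableName`:

```sql
-- One output row per unique (modem, edition, waveform) combination.
-- The "|" separator prevents accidental collisions, e.g. "ab" & "c"
-- producing the same string as "a" & "bc".
SELECT Field1 & "|" & Field2 & "|" & Field3 AS RecordKey
FROM TableName
GROUP BY Field1 & "|" & Field2 & "|" & Field3;
```

In practice you would extend the concatenation to cover every field that defines a "whole record", or simply list all of the fields in the `GROUP BY` clause instead of concatenating them.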

Corno

This could be a stab in the dark, but what I usually do is use the Query Wizard, which has an option for finding duplicate data. You can then choose to display the duplicated fields as well as any other fields you like.
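The Find Duplicates wizard generates a query roughly of this form (column and table names here are placeholders based on the asker's description):

```sql
-- Rows whose (Field1, Field2, Field3) combination occurs more than once.
SELECT Field1, Field2, Field3, COUNT(*) AS Occurrences
FROM TableName
GROUP BY Field1, Field2, Field3
HAVING COUNT(*) > 1;
```

This only identifies the duplicates; you would still delete or consolidate them in a separate step.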

If I understand you correctly about checking all fields as one record, could you not make a composite primary key out of those fields? That way Access will only consider a record a duplicate if ALL of the key fields have identical values.
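A sketch of that in Access DDL, again with placeholder names (`TableName`, `pk_modem_record`):

```sql
-- Composite primary key across the identifying fields.
-- Note: this will fail until the existing duplicates are removed,
-- and it then prevents new duplicates from being inserted.
ALTER TABLE TableName
ADD CONSTRAINT pk_modem_record PRIMARY KEY (Field1, Field2, Field3);
```

So the composite key is the fix for the future; you still need one of the deduplication approaches above to clean up the 7k existing records first.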

David Britz