I have a database that has over 100,000 records of users and has no primary key. Could you explain to me how I would accomplish the following: 1. Auto generate primary keys based on Firstname MiddleName LastName. 2. Note that some records are redundant ie same users and I want the same users to have the same primary key e.g Mary Jane Joe is appearing 20 times in the rows I would want her to have the same primary key in all appearances. Thanks in advance
-
1Why not delete duplicates, then add an auto increment `int` key? – Bohemian Jan 22 '17 at 21:03
-
2Then it's not a primary key – Strawberry Jan 22 '17 at 21:03
-
@Bohemian...how do I delete the duplicates as it's a large database?? – Cavein Muthama Jan 22 '17 at 21:11
-
you can create a composite unique key/ primary key with the OPTION IGNORE then MySQL deletes all duplicates – Bernd Buffen Jan 22 '17 at 21:21
-
Are rows "redundant" based just on the name? Or on _all_ columns of the rows? – Rick James Jan 23 '17 at 00:30
-
Do the redundant rows violate entity integrity, or are they different entity instances that only look the same? – Walter Mitty Jan 23 '17 at 12:33
2 Answers
Best Option
Use a truly unique primary key by adding a not null int field with auto-increment. This will automatically give each record a unique primary key. If you still want to use the fields described above as a unique field, I would add a composite unique key between the fields.
One of the main benefits of this is that integer fields are a lot faster to query than varchars. That being said, your table only has 100,000 records so it will be negligible at this time.
Alternate
If you still want to use the keys you defined, then you have a few options both with the same process.
- Add a regular, nullable varchar field to your table, not a primary key yet(like this)
- Find duplicates and delete them
- Loop through each record and create your primary key based on the values you described and update the current records field defined in step 1 with the key.
- Alter the table to now add your primary key (like this)
I do have to caution you though, it sounds like your idea of a primary key is not truly unique, which defeats the purpose of a primary key.
Build the new table with the auto-inc (if needed) and the uniqueness constraint on name:
CREATE TABLE new (
id INT UNSIGNED AUTO_INCREMENT,
PRIMARY KEY(id),
UNIQUE(Firstname, MiddleName, LastName)
);
Copy all non-dup rows into that table. (This assumes "dup" involves just the 3 columns, not just the name.)
INSERT IGNORE INTO new
(Firstname MiddleName LastName, ...)
SELECT DISTINCT Firstname MiddleName LastName, ...
FROM real;
Switch tables (if desired):
RENAME TABLE real TO old, new TO real;
DROP TABLE old;

- 135,179
- 13
- 127
- 222