I am trying to convert a large database (~3m rows) that contains the following data set titled "Posts":
+-------|---------------|-----------------------+
| id | name | tags |
|-------|---------------------------------------|
| 1 | post title | tag_a, tag_b |
| 2 | another title | tag_b, tag_e, tag_j |
+-------|---------------|-----------------------+
I also have an empty "tags" table with the headings id, title and a "posts_tags" table with the headings id, post_id, tag_id
Post <-- Habtm --> Tag
My question:
I would like to know the most efficient (preferred but not required cake way) of populating the "tags" table and the "posts_tags" habtm table while keeping the tags table free from duplicates?
Many Thanks SO Team!