0

I am trying to convert a large database (~3m rows) that contains the following data set titled "Posts":

+-------|---------------|-----------------------+ | id | name | tags | |-------|---------------------------------------| | 1 | post title | tag_a, tag_b | | 2 | another title | tag_b, tag_e, tag_j | +-------|---------------|-----------------------+

I also have an empty "tags" table with the headings id, title and a "posts_tags" table with the headings id, post_id, tag_id

Post <-- Habtm --> Tag


My question:

I would like to know the most efficient (preferred but not required cake way) of populating the "tags" table and the "posts_tags" habtm table while keeping the tags table free from duplicates?

Many Thanks SO Team!

Tricky
  • 410
  • 8
  • 17
  • Please always mention your exact CakePHP version (last line in vendor/cakephp/cakephp/VERSION.txt or lib/Cake/VERSION.txt) – Rayann Nayran May 15 '17 at 13:39
  • @RayannNayran thanks for the heads up - Version added – Tricky May 15 '17 at 13:45
  • Why do you need to do it the "cake way" (whatever that means exactly) in the first place? This could even be done purely on SQL level. – ndm May 15 '17 at 16:54
  • @ndm It does not strictly have to be the cake way but it has not failed me so far. however if a pure sql answer were provided i see no harm. – Tricky May 15 '17 at 19:24

1 Answers1

0

I have no time to write code right now.

You could get all posts (I recommend you paginate your result) and for each post, you get its tags and explode it by comma.

Then you create an HABTM array data using the tags and the currently post, and finally you save your data.

Rayann Nayran
  • 1,135
  • 7
  • 15