In my database I have a Series table associated with a Videos table, and the Videos table has a Title column. I am trying to figure out a way to create a rake task to scrub the data.

For example, if I run

Series.first.videos.all.map { |x| x.title } #=> ["Some Title Episode 1", "Some Title | Episode 1"]

These two titles are duplicates, but both records pass validation on the model.

So I am trying to figure out a way to create a rake task that scrubs the data and deletes the extra records, preferably the ones with the older created_at timestamp.

ericraio
  • This will be hard due to inconsistencies in the titles. Is it common that a pipe character will separate the title and episode parts? – Austin Mar 29 '12 at 05:21
  • Well, the pipe character is just an example of an inconsistency I may have. Most of the records do have "Episode" followed by digits in the title, so I was trying to think of a way to grab each record, compare it against all of the other records (or elements of the same array), and then delete the records with the older created_at timestamp. – ericraio Mar 29 '12 at 05:52
  • This sounds very hard. It will take a lot of string reading, comparing and parsing. You probably want to take a look at a `Scanner`, as mentioned here: http://stackoverflow.com/q/713559/859762 – Ben Mar 29 '12 at 06:33
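
The compare-and-delete idea from the comments can be sketched in plain Ruby. This is only a rough sketch under assumptions: `Video` here is a hypothetical `Struct` standing in for the real model, `normalize_title` and `stale_duplicates` are made-up helper names, and the normalization rule (lowercase, treat every run of punctuation or whitespace as one space) is a guess that would only catch differences like the pipe in the example, not every inconsistency.

```ruby
# Hypothetical stand-in for the Video model so the sketch runs without Rails.
Video = Struct.new(:id, :title, :created_at)

# Build a comparison key (assumed rule): lowercase the title and collapse
# every run of non-alphanumeric characters into a single space.
def normalize_title(title)
  title.downcase.gsub(/[^a-z0-9]+/, " ").strip
end

# Group records by their normalized title and, within each group,
# return every record except the one with the newest created_at.
def stale_duplicates(videos)
  videos.group_by { |v| normalize_title(v.title) }.values.flat_map do |group|
    group.sort_by(&:created_at)[0...-1]
  end
end

videos = [
  Video.new(1, "Some Title Episode 1",   Time.utc(2012, 3, 1)),
  Video.new(2, "Some Title | Episode 1", Time.utc(2012, 3, 20)),
  Video.new(3, "Some Title Episode 2",   Time.utc(2012, 3, 21)),
]

# The ids of the older duplicates that would be candidates for deletion.
stale_ids = stale_duplicates(videos).map(&:id)
puts stale_ids.inspect
```

Inside a rake task, the same helper could presumably be fed each series' `videos` collection, calling `destroy` on every record it returns. Printing the candidates first, as a dry run, seems safer than deleting straight away.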

0 Answers