I am trying to extract the movie genre strings from a data set. The data is in the following format, where the genre types were entered by different reviewers and are scattered randomly across the columns. Luckily there are only 4 genre types (comedy, action, horror, scifi) in the dataset, but there are also repetitions. So I need to extract just those strings from each row.
id movie v1 v2 v3 v4 v5 v6
1 LTR comedy highbudget action comedy jj horror
2 MI newmovie fiction scifi funny xx jhee
I am expecting an output of the following form.
id movie genretype1 genretype2 genretype3 genretype4
1 LTR comedy action comedy horror
2 MI scifi --- --- ---
Any suggestions?
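One possible approach, sketched here in Python (an assumption, since no language is named above): keep only the values in each row that match one of the 4 known genres, preserve their order and repetitions, and pad the remainder with `---`. The names `GENRES`, `extract_genres`, `n_slots`, and `pad` are hypothetical and chosen only for illustration, and the sketch assumes the data is whitespace-separated exactly as in the sample.

```python
# Hypothetical names: GENRES, extract_genres, n_slots, and pad are
# illustrative assumptions, not part of the original data set.
GENRES = {"comedy", "action", "horror", "scifi"}


def extract_genres(row, n_slots=4, pad="---"):
    """Return [id, movie, genre1..genreN], padded to n_slots genre columns."""
    row_id, movie, *values = row
    # Keep only known genres, preserving order and repetitions.
    found = [v for v in values if v in GENRES][:n_slots]
    return [row_id, movie] + found + [pad] * (n_slots - len(found))


# Sample data copied from the question; assumed whitespace-separated.
raw = """id movie v1 v2 v3 v4 v5 v6
1 LTR comedy highbudget action comedy jj horror
2 MI newmovie fiction scifi funny xx jhee"""

# Skip the header line and split each data row into fields.
rows = [line.split() for line in raw.splitlines()[1:]]
result = [extract_genres(r) for r in rows]

print("id movie genretype1 genretype2 genretype3 genretype4")
for r in result:
    print(" ".join(r))
```

On the two sample rows this prints the expected output shown above. A row with more than 4 matching values would be truncated to the first 4, which may or may not be what is wanted.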