I have a dictionary whose keys are customer IDs and whose values are the movie IDs each customer has watched. A customer may have watched the same movie several times, but I want it counted only once. I need to convert this dictionary into a binary table with one row per customer ID and one column per movie ID, where the cell is 1 if the customer has watched the movie and 0 otherwise.
d = {'121212121': [111, 222, 333, 333, 444, 444], '212121212': [222, 555, 555, 666], '212123322': [555, 666, 666, 666, 777]}
Desired output:
customer ID  111  222  333  444  555  666  777
121212121      1    1    1    1    0    0    0
212121212      0    1    0    0    1    1    0
121323231      0    0    0    0    1    1    1
I have tried using CountVectorizer().
Code:
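import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

# cust holds the 'customer_id' and 'movies_list' columns
cv = CountVectorizer()
movies = cv.fit_transform(cust['movies_list'])
cols = cv.vocabulary_  # dict mapping each term to its column index
movies_ = pd.DataFrame(movies.toarray(), columns=cols, index=cust['customer_id'])
movies_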
Output:
customer ID  111  222  333  444  555  666  777
212121212      1    1    2    2    0    0    0
121212121      0    1    0    0    2    1    0
121323231      0    0    0    0    1    3    1
The customer IDs didn't match, and I got a count of how many times each customer watched each movie instead of a 1 or 0.
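
For reference, the transformation I am after can be sketched in plain Python. This is only a minimal sketch, not a definitive solution: it assumes each dictionary value is a list of movie IDs, and the helper name to_binary_matrix is hypothetical, made up for this illustration. Turning each customer's list into a set is what collapses repeat viewings into a single 1.

```python
# Sample data, assuming each value is a list of movie IDs.
d = {
    '121212121': [111, 222, 333, 333, 444, 444],
    '212121212': [222, 555, 555, 666],
    '212123322': [555, 666, 666, 666, 777],
}


def to_binary_matrix(watched):
    """Hypothetical helper: return (sorted movie IDs, {customer_id: 0/1 flags})."""
    # Collect every distinct movie ID once to fix the column order.
    movie_ids = sorted({m for movies in watched.values() for m in movies})
    rows = {}
    for customer_id, movies in watched.items():
        seen = set(movies)  # a set collapses repeat viewings
        rows[customer_id] = [1 if m in seen else 0 for m in movie_ids]
    return movie_ids, rows


movie_ids, rows = to_binary_matrix(d)

# Print the table: header first, then one row per customer.
print('customer ID', *movie_ids)
for customer_id, flags in rows.items():
    print(customer_id, *flags)
```

If a DataFrame is needed, pd.DataFrame.from_dict(rows, orient='index', columns=movie_ids) should build one from this result. Staying with scikit-learn, CountVectorizer also accepts binary=True, which sets every nonzero count to 1.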