I am teaching myself data mining from a free online resource I found. I have a CSV file with a bunch of names, movie titles, and the rating each person gave each movie. I'm trying to find the K-nearest neighbors from it using a cosine metric, but I can't get the output to look readable. Here's the code I have so far:
from pandas import DataFrame
import pandas as pd
import numpy as np
from sklearn.neighbors import NearestNeighbors as nn
df = pd.read_csv("https://docs.google.com/spreadsheets/d/1MSBm3M6YmaLf0aiJCvkvrPsIJB2pPuBwse5ylnzEHRI/pub?gid=639849687&single=true&output=csv",index_col='Unnamed: 0')
df = df.fillna(0)
nn([df], metric = 'cosine')
Pretty simple to do! Except my output looks like this:
NearestNeighbors(algorithm='auto', leaf_size=30, metric='cosine',
metric_params=None, n_jobs=1,
n_neighbors=[ Patrick C Heather Bryan
Patrick T Thomas aaron \
Alien NaN NaN 2.0 NaN 5.0
4.0
Avatar 4.0 5.0 5.0 4.0 2.0 NaN
Blade Runner 5.0 NaN NaN N...
You Got Mail NaN 2.0 2.0 1.0 2.0 NaN 2.0
[25 rows x 25 columns]],
p=2, radius=1.0)
It's messy and doesn't even show all the data. I tried casting it into an array, but I got the error message "'ABCMeta' object does not support indexing".
I'm fairly new to Python; I can do a few basic things, but I am no expert. I was hoping someone could nudge me in the right direction to clean this up.
Thank you.
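For reference, the printed output above shows the whole DataFrame sitting in the n_neighbors slot, which suggests the data was passed where a hyperparameter was expected. Below is a minimal sketch of the usual scikit-learn pattern: configure the estimator, call fit() with the data, then call kneighbors() to get the results. The small ratings table and the names Ann, Bob, and Cat are made up for illustration and stand in for the CSV, and transposing so that each row is a person is an assumption about what "neighbors" should mean here.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import NearestNeighbors

# Hypothetical ratings standing in for the CSV:
# rows are movies, columns are people, NaN means "not rated".
ratings = pd.DataFrame(
    {
        "Ann": [5.0, 4.0, np.nan, 1.0],
        "Bob": [5.0, 4.0, 1.0, np.nan],
        "Cat": [1.0, np.nan, 5.0, 4.0],
    },
    index=["Alien", "Avatar", "Blade Runner", "You Got Mail"],
)

# Assumption: we want neighbours among people, so make one row per person.
# Missing ratings are filled with 0, as in the question.
people = ratings.T.fillna(0)

# Hyperparameters go to the constructor; the data goes to fit().
model = NearestNeighbors(n_neighbors=2, metric="cosine", algorithm="brute")
model.fit(people.values)

# kneighbors() returns two arrays: distances and row positions.
# Column 0 is each person matched with themselves, so column 1 is used.
distances, indices = model.kneighbors(people.values)

nearest = pd.DataFrame(
    {
        "nearest": people.index[indices[:, 1]].to_numpy(),
        "cosine_distance": distances[:, 1],
    },
    index=people.index,
)
print(nearest)
```

Putting the results back into a small DataFrame keeps the names attached to the numbers, which tends to print far more readably than the estimator object itself.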