I have written the following code to compute the cosine similarity between a number of preprocessed documents (after stop word removal, stemming, and term frequency-inverse document frequency weighting):
print(X.shape)
similarity = []
for each in X:
    similarity.append(cosine_similarity(X[i:1], X))
    print(cosine_similarity(X[i:1], X))
    i = i+1
However, when I run it I receive this:
(2235, 7791)
[[ 1. 0.01490594 0.11752643 ..., 0.00941571 0.03652551
0.01239277]]
Traceback (most recent call last):
  File "...", line 83, in <module>
    similarity.append(cosine_similarity(X[i:1], X))
  File "/Users/.../anaconda/lib/python3.5/site-packages/sklearn/metrics/pairwise.py", line 881, in cosine_similarity
    X, Y = check_pairwise_arrays(X, Y)
  File "/Users/.../anaconda/lib/python3.5/site-packages/sklearn/utils/validation.py", line 96, in check_pairwise_arrays
    X = check_array(X, accept_sparse='csr', dtype=dtype)
  File "/Users/.../anaconda/lib/python3.5/site-packages/sklearn/utils/validation.py", line 407, in check_array
    context))
ValueError: Found array with 0 sample(s) (shape=(0, 7791)) while a minimum of 1 is required.
[Finished in 56.466s]
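The output above is consistent with the slice X[i:1] selecting one row only when i is 0: the first iteration prints a result, and from i = 1 onward the slice is empty, which matches the reported shape (0, 7791). The sketch below reproduces that slicing behaviour in plain Python. It is only an illustration under stated assumptions: `rows` is a made-up stand-in for X, and `cosine` is a hypothetical helper written for this example, not the scikit-learn function.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors (hypothetical helper)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy stand-in for the TF-IDF matrix X (made-up values, 3 documents x 3 terms).
rows = [
    [1.0, 0.0, 2.0],
    [0.0, 3.0, 1.0],
    [2.0, 1.0, 0.0],
]

# Number of rows selected by the slice pattern from the loop, rows[i:1].
# Only i == 0 selects a row; after that start >= stop, so the slice is empty.
buggy_counts = [len(rows[i:1]) for i in range(len(rows))]

# rows[i:i + 1] selects exactly one row for every valid i.
fixed_counts = [len(rows[i:i + 1]) for i in range(len(rows))]

# With the corrected slice, each document is compared with all documents.
similarity = []
for i in range(len(rows)):
    query = rows[i:i + 1][0]
    similarity.append([cosine(query, other) for other in rows])

print(buggy_counts)
print(fixed_counts)
```

In the original loop, the equivalent change would be X[i:i+1] in place of X[i:1]. As far as I know, calling cosine_similarity(X) with a single argument also returns the full pairwise matrix in one call, which would make the loop unnecessary.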