I would like to create a summary containing the major points of an original document. To do this, I computed sentence embeddings with the Universal Sentence Encoder (https://tfhub.dev/google/universal-sentence-encoder/2). Next, I would like to apply clustering to these vectors.

I tried this with scikit-learn:

import numpy as np
from sklearn.cluster import KMeans

n_clusters = np.ceil(len(encoded)**0.5)
kmeans = KMeans(n_clusters=n_clusters)
kmeans = kmeans.fit(encoded)

But I get an error message:

TypeError: 'numpy.float64' object cannot be interpreted as an integer
d9ngle
Eva Rolin

1 Answer

The problem is caused by this line:

n_clusters = np.ceil(len(encoded)**0.5)

np.ceil returns a float (numpy.float64), but KMeans expects an integer for n_clusters, so cast the result to int:

n_clusters = int(np.ceil(len(encoded)**0.5))
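For reference, here is a minimal end-to-end sketch of the fix. The random `encoded` array is only a stand-in for the real sentence embeddings (an assumption for illustration, as are the 20 x 512 shape and the `n_init` and `random_state` values), so the script runs without downloading the encoder:

```python
import numpy as np
from sklearn.cluster import KMeans

# Stand-in for the sentence embeddings (assumption): 20 random 512-dim vectors.
rng = np.random.default_rng(0)
encoded = rng.normal(size=(20, 512))

# np.ceil returns a numpy.float64, so cast it to a plain int
# before passing it to KMeans as n_clusters.
n_clusters = int(np.ceil(len(encoded) ** 0.5))

# n_init and random_state are set only to make the run reproducible.
kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
labels = kmeans.fit_predict(encoded)  # one cluster index per sentence
```

After fitting, `labels` holds the cluster assigned to each sentence and `kmeans.cluster_centers_` holds one centroid per cluster.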
d9ngle