Questions tagged [cosine-similarity]

Cosine similarity is a measure of similarity between two vectors of an inner product space that measures the cosine of the angle between them. It is a popular similarity measure between two vectors because it is calculated as a normalized dot product between the two vectors, which can be calculated with simple mathematical operations.

From Wikipedia:

Cosine similarity is a measure of similarity between two vectors of an inner product space that measures the cosine of the angle between them. The cosine of 0 degrees is 1, and it is less than 1 for any other angle. It is thus a judgement of orientation and not magnitude: two vectors with the same orientation have a cosine similarity of 1, two vectors at 90 degrees have a similarity of 0, and two vectors diametrically opposed have a similarity of -1, independent of their magnitude.

Cosine similarity is a popular similarity measure between two vectors a and b because it can be computed efficiently by dividing the dot product of the two vectors by the product of their Euclidean norms (the norm being the square root of the sum of the squared terms). For instance, vectors (0, 3, 4) and (-3, 4, 0) have dot product 12 and each has norm 5, so their cosine similarity is 12/5/5 = 0.48.
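The calculation described above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function name `cosine_similarity` is chosen here for the example, and raising an error for a zero vector is an assumption (the measure is undefined when either norm is 0).

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # Dot product: sum of the element-wise products.
    dot = sum(x * y for x, y in zip(a, b))
    # Euclidean norm: square root of the sum of the squared terms.
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Assumption for this sketch: treat a zero vector as an error.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# The worked example from the text: dot product 12, both norms 5.
print(cosine_similarity((0, 3, 4), (-3, 4, 0)))  # 0.48
```

Because only the orientation matters, scaling either vector by a positive constant leaves the result unchanged, while vectors pointing in opposite directions give -1.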

1004 questions
-1
votes
1 answer

How to calculate row-wise cosine similarity in a dataframe

Suppose I have a Python dataframe and I want to calculate the cosine similarity between the first row of the dataframe and each of the remaining rows. Can anyone please help?
-1
votes
1 answer

Machine Learning - Comparing Two Vectors

Is there a way to compare two vectors whose elements do not follow any ordering semantics, by using any ML algorithm? Example - Compare (1,3,5) vs (9,7,5) and arrive at some result, and then use that result to check how close/far away they…
-1
votes
2 answers

Cosine similarity for special vectors (only one component)

I'm trying to implement cosine similarity for two vectors, but I ran into a special case where the two vectors only have one component, like this: v1 = [3] v2 = [4] Here is my implementation for the cosine similarity: def dotProduct(v1, v2): …
efsee
-1
votes
1 answer

Efficient calculation of a very large matrix in R

I am very new to R. I have a dataset with 139 columns and more than 46.5k rows. I have computed pairwise cosine similarity matrices between rows of the dataset, where one row is compared with all of the other rows and will be excluded during…
-1
votes
1 answer

Comparison of two documents in Python

Given two documents, I wish to calculate the similarity between them. I have measures for the cosine distance, n-grams and tf-idf from this previously asked question. I wish to know what further needs to be done using these…
Chinmay Joshi
-1
votes
1 answer

Compare text stored in each row across 2 columns in R

I have 2 vectors: a=c("abc","def","ghi","jkl") and b=c("abc","dez","gyx","mno"). How can I get cosine values to compare corresponding entries? In this case, I need to be able to say that the 1st entries in each vector are perfectly similar and the 2nd entry in each…
VSAT
-1
votes
1 answer

Display nth highest value if matched with comboBox value

I am using a cosine similarity function to compare the user input with the data in SQL. The highest value will be retrieved and displayed. However, k is the value obtained from the comboBox, and it is a hard constraint, which means they need to be…
John Joe
-1
votes
1 answer

How to pass vector to different class?

How can a vector be passed to another class? I want to compare vector 1, which is in User.java, with vector 2, which is in Case.java, by using cosine similarity. User.java JButton btnNewButton = new JButton("Process"); btnNewButton.setBounds(360, 296,…
John Joe
-1
votes
1 answer

Memory issue with sklearn pairwise_distances calculation

I have a large data frame whose index is movie_id and whose column headers represent tag_id. Each row represents movie-to-tag relevance 639755209030196 691838465332800 \ 46126718359 0.042 …
add-semi-colons
-1
votes
2 answers

Is this the right approach to calculate cosine similarity?

Could you please review whether the following approach (pseudo-code) is a sound way to calculate cosine similarity between 2 vectors: var vectorA = [2,5,7,8]; var referenceVector= [1,1,1,1]; //Apply weights to vectors (apply positive or negative…
Rookie
-1
votes
1 answer

Dataset help for TF-IDF and Vector Model

I want to compare TF-IDF, the vector model and some optimizations of the TF-IDF algorithm. For that I need a dataset (at least 100 documents of English text), but I have not been able to find one. Any suggestions?
-2
votes
1 answer

Calculate the cosine similarity between two arrays and save the result in a matrix?

I have two arrays, A (size = (20, 200)) and B (size = (15, 200)). I want to construct a matrix C (size = (20, 15)) such that C[i,j] stores the cosine similarity between elements A[i] and B[j]. I can do that using a loop, but it takes a very long time if A and B…
-2
votes
2 answers

How to: square root squares for cosine similarity within an array ~java~

My issue is that I am creating a book recommendation system, and when I try to take the square root of the squares to determine similarity, I do not believe it is square-rooting all the contents of each array. The user is prompted with the twenty books and…
-2
votes
1 answer

How to calculate cosine similarity between scalar and vector?

How to calculate cosine similarity between a scalar and a vector in Python? I am trying to multiply the probability output of an n-gram model with the output of a pretrained word2vec model to rerank the next possible word using word prediction.…
pr338
-2
votes
1 answer

How to count the words of sentences from a database with PHP

I have a table in a database:

|ID| Sentence        |
|1 | I have a Rabbit |
|2 | I have a Turtle |

How to count every word in that table (or this is a TF-IDF Raw method)? I = 2, have = 2, a = 2, Rabbit = 1, Turtle = 1. Anybody help me please…