Questions tagged [similarity]

Similarity measures quantify how much alike objects (e.g. documents, feature vectors) are.

In information retrieval, similarity is used to describe the relevance between document vectors. The measurement is then used to rank search results.
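
As a minimal sketch of the ranking idea in the description above, the snippet below scores documents against a query with cosine similarity over raw term counts. The function names `cosine_similarity` and `rank` and the whitespace tokenization are illustrative assumptions, not something the tag description prescribes; real systems typically use weighted vectors such as TF-IDF.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors (Counters)."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank(query, documents):
    """Return (score, document) pairs sorted from most to least similar."""
    # Assumption: lowercase whitespace tokenization is good enough for a demo.
    q = Counter(query.lower().split())
    scored = [(cosine_similarity(q, Counter(d.lower().split())), d)
              for d in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Calling `rank("similar images", ["detect similar images", "update word vectors"])` places the first document on top, because it shares two terms with the query while the second shares none.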

1866 questions
0
votes
0 answers

Retrieving synonyms from an ontology

My question is about retrieving synonyms from my ontology by using the isSynonymOf object property. (Note that my ontology file is 260 KB and is expected to grow to 500 KB.) I am using the following code to retrieve the synonyms. The…
ALee
  • 53
  • 1
  • 8
0
votes
1 answer

Identifying similar elements in a dataset to detect spammers

I am trying to identify spammers in my dataset. My dataset has listing id, user id, email id and phone number. Listing id is the unique key, and a single person can have multiple listings. I have used fuzzy lookup to find a similarity index between…
Astha
  • 1
0
votes
1 answer

Find Top X most similar users, based on Y boolean attributes

Let's say I have 5 users, each with 5 boolean attributes, which could look like this:

       | A B C D E
--------------------
User 0 | 1 1 0 1 0
User 1 | 0 1 0 1 0
User 2 | 0 0 1 0 1
User 3 | 1 1 0 0…
user3262883
  • 119
  • 7
0
votes
1 answer

Recommended algorithms for word similarity

I'm researching viable algorithms/solutions to implement and solve the following problem: match users based on their common interests. Example: U1: skiing, asian culture, meditation, java, crypto; U2: yoga, meditation, management, travel tips USA; U3:…
zeratul021
  • 2,644
  • 3
  • 32
  • 45
0
votes
1 answer

Best algorithm to predict 3 similar blogs based only on a blog's properties and content

{ "blogid": 11, "blog_authorid": 2, "blog_content": "(this is blog complete content: html encoded on base64 such as)…
sns
  • 221
  • 4
  • 17
0
votes
2 answers

Compute the similarity between two lists of objects

I'd like to compute the similarity between two lists of various lengths. In particular, the similarity has to take into account different conditions: - Given 2 lists A and B, if A=B then similarity(A,B)=1 - In general, if B contains A, then similarity…
0
votes
2 answers

How to compare two lists and return the highest similarity of words in a list

I have a list: list1 = ['good']. I have another list with synonyms of the word "good": list2 = ['unspoilt', 'respectable', 'honorable', 'undecomposed', 'goodness', 'well', 'near', 'commodity', 'safe', 'dear', 'just', 'secure', 'in_force',…
Mathwog
  • 17
  • 3
0
votes
0 answers

Detect similar images using PHP with perceptual hashing or other technology

I have 20 JPG images: I need to detect these 10 similar images: I tried to use a perceptual hash implementation for PHP with the above code:
kostya572
  • 169
  • 2
  • 21
0
votes
2 answers

How to get the most different strings from a list

I have a list of many strings that have similarities, for example: $str = array('monkey eat a banana', 'dog eat a banana', 'cat devour an apple', 'cat dine a coco'); //etc I would like to extract X strings from…
JojoLapin45
  • 139
  • 2
  • 11
0
votes
1 answer

Hadoop MapReduce Word (Hashtag) count with similar word grouping not working

I've been trying to create a Twitter Hashtag-Count Hadoop program. I've successfully extracted the text, gotten the hashtags and started trying to count them. One of the earliest problems I encountered is that many hashtags are extremely similar…
Spyros
  • 197
  • 5
  • 17
0
votes
1 answer

Find similarity in MySQL columns

I have been asked to fetch data from a MySQL table where the Firstname and Lastname columns contain similar values. So far I have tried to use the soundex expression select * from table where soundex(firstname) = soundex(lastname) but there are…
Christian Felix
  • 644
  • 1
  • 9
  • 28
0
votes
0 answers

Update Word2vec Vectors

I have a corpus that contains several documents, for example 10 documents. The idea is to compute the similarity between them and combine the most similar ones into one document. So the result may be 4 documents. What I have done so far is that I…
practitioner
  • 412
  • 1
  • 5
  • 12
0
votes
1 answer

skimage SimilarityTransform does not work

I am trying to apply a similarity transform to align faces: from skimage.transform import SimilarityTransform, ProjectiveTransform from skimage import transform from scipy.misc import imshow # face image detected facial landmarks src =…
redsphinx
  • 91
  • 5
0
votes
1 answer

gensim document similarity: how to get document titles from most similar results?

I am using gensim to analyze document similarity in a large corpus. Each document has a "title", or more specifically, a unique ID string, along with the content text. After looking through several tutorials about topic modeling, indexing and…
tony_tiger
  • 789
  • 1
  • 11
  • 25
0
votes
0 answers

How to best represent movie actors as a binary vector using Spark

I have a database of movies. Each movie has an Actors field among other fields represented as an Array of Strings like: ['Johny Depp', 'Leonardo DiCaprio', 'John Malkovich'] I've already done Plot-based similarity calculation using Spark's ml…