Questions tagged [vocabulary]

For questions related to dictionary-like structures in programming, mainly Semantic Web vocabularies. Please do not use this tag in place of the "terminology" tag.

In Semantic Web

  • In the Semantic Web field, a controlled vocabulary is a set of URIs used to identify things, relations or classes.

  • A vocabulary with well-developed subsumption (subclass-superclass) relations is often called a taxonomy.

  • A taxonomy with well-developed non-subsumption relations is often called an ontology.
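The three definitions above can be illustrated with a minimal sketch in plain Python, where each statement is a (subject, predicate, object) tuple of URIs. The `http://example.org/vocab#` namespace, the terms `Vehicle`, `Ship`, `Port` and `dockedAt`, and the helper `superclasses` are all hypothetical and exist only for illustration; the `rdfs:` URIs are the standard RDF Schema ones.

```python
# Hypothetical namespace and terms, used only for illustration.
EX = "http://example.org/vocab#"

# Standard RDF Schema property URIs.
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
RDFS_DOMAIN = "http://www.w3.org/2000/01/rdf-schema#domain"
RDFS_RANGE = "http://www.w3.org/2000/01/rdf-schema#range"

# A controlled vocabulary: a set of URIs identifying classes and relations.
vocabulary = {EX + "Vehicle", EX + "Ship", EX + "Port", EX + "dockedAt"}

# A taxonomy: the vocabulary plus subsumption (subclass-superclass) statements.
taxonomy = [
    (EX + "Ship", RDFS_SUBCLASS_OF, EX + "Vehicle"),
]

# An ontology: the taxonomy plus non-subsumption statements,
# here the domain and range of the dockedAt relation.
ontology = taxonomy + [
    (EX + "dockedAt", RDFS_DOMAIN, EX + "Ship"),
    (EX + "dockedAt", RDFS_RANGE, EX + "Port"),
]


def superclasses(term, triples):
    """Return the direct superclasses of `term` stated in `triples`."""
    return {o for s, p, o in triples if s == term and p == RDFS_SUBCLASS_OF}


print(superclasses(EX + "Ship", ontology))
```

In practice an RDF library would normally hold these statements; plain tuples are used here only to keep the sketch self-contained.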

190 questions
0
votes
1 answer

RDF vocabulary for units

I am looking for an RDF vocabulary that describes the unit gross tonnage. I have browsed the QUDT unit vocabulary without finding the unit I am looking for. I can't seem to find a vocabulary describing gross tonnage when searching around either. Does…
veleda
0
votes
6 answers

What's a fast way to look up text data in a large text file?

I have a vocabulary with different words and information about them. It's about 100MB in size. Searching this file takes a very long time, however. Is there any way to improve the speed at which I can look up the data? For example, I was thinking of…
Dj Sushi
0
votes
1 answer

Is There A Way To Not Get False For The Definition Of A Word With The Vocabulary Module?

I am making a script that takes a random word and gets its definition, then converts the definition into speech and plays it. For some reason, whenever I try to get the definition of the word, it just returns false. Is there any way to fix…
0
votes
3 answers

Build vocabulary representations python

I have a list of strings of this form: ['1---d--e--g--gh','1---c---e--gh--', '1---ghj--h--h--', '1---g--gkk--h--', '1---d--dfe---fg', '1---c--d--dh--j', '1---f--gh--h--h', '1---fg-hg-hh-fg', '1---d--cd7--d--', '1---gghG--g77--', '1---hkj--kl--l-',…
joasa
0
votes
0 answers

Is there a tool to search for, or a list of, existing concepts when developing an ontology?

Is there any search tool or list of common concepts/vocabularies from existing schemas that can be used as a reference when developing a new ontology? E.g. foaf http://xmlns.com/foaf/spec/ Currently I am developing a new ontology, and wondering whether I…
MDGS
0
votes
1 answer

Get specific classes n-grams

I have a dataset of tweets, each labeled as hate (1) or non-hate (0). I vectorized the data using a bag of words of character n-grams in the range [3,4] (sklearn's CountVectorizer) and I want to extract the most frequent n-grams for each class. The following code…
GRoutar
0
votes
1 answer

hash vectorizer in R text2vec package with stopwords removal option

I am using R text2vec package for creating document-term-matrix. Here is my code: library(lime) library(text2vec) # load data data(train_sentences, package = "lime") # tokens <- train_sentences$text %>% word_tokenizer it <- itoken(tokens,…
Sam S.
0
votes
2 answers

Difference between the total number of words (length of a list) and vocabulary of a list or file in NLP?

How do I compute the total number of words and the vocabulary of a corpus stored as a list in Python? What is the major difference between these two terms? Suppose I am using the following list. The total number of words or the length of the list can be…
M S
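For the question above, a minimal sketch of the distinction: the total number of words counts every token, repeats included, while the vocabulary is the set of distinct tokens. The sample list `tokens` is hypothetical and stands in for the asker's corpus, which is not shown in the excerpt.

```python
# Hypothetical corpus stored as a list of tokens (not the asker's data).
tokens = ["the", "cat", "sat", "on", "the", "mat"]

# Total number of words: every occurrence counts, so "the" is counted twice.
total_words = len(tokens)

# Vocabulary: the distinct words only, so "the" is counted once.
vocabulary = set(tokens)
vocabulary_size = len(vocabulary)

print(total_words, vocabulary_size)  # prints: 6 5
```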
0
votes
1 answer

Tensorflow: How to feed in data in vocabulary feature column?

I'm currently working on a classification problem based on text input, and my main question is the following: am I correct in assuming that I can pass my complete sentence as one string to the vocabulary column, or do I need to split the sentence in…
Tom
0
votes
2 answers

How to use Views to display a taxonomy vocabulary list in 2 levels on Drupal 7

I'm working on my first Drupal 7 project. I'm having a problem with vocabulary term listing :/ I hope you can advise... I created a vocabulary named Services, and its terms are in 2 levels as below (there will be many more); and I have a Services page to…
designer-trying-coding
0
votes
1 answer

Gensim DOC2VEC trims and deletes the vocabulary

I tried creating a simple Doc2Vec model: sentences = [] sentences.append(doc2vec.TaggedDocument(words=[u'scarpe', u'rosse', u'con', u'tacco'], tags=[1])) sentences.append(doc2vec.TaggedDocument(words=[u'scarpe', u'blu'], tags=[2])) …
Nicolò Gasparini
0
votes
0 answers

pytorch language table for use in deep learning

can anyone tell me the error here? import pytorch_translator as pt import pickle import torch # Set up the dictionary (vocab table) pairs, src, tgt = pt.genLangs("de", "en", "train-short.txt", vocab_size=6000) with open("langs.pkl", 'wb') as…
Helen
0
votes
1 answer

Offline use of semantic web vocabularies

There are some useful vocabularies out there for use in Semantic Web applications, one of which is the well-known "foaf". How should I use it in an offline system, meaning a network disconnected from the WWW? Is it downloadable? Should I use some…
DannyL
0
votes
1 answer

How to create an object which stores mappings from a word in a vocabulary to its index?

I have a tokenized list of words in a vocabulary. (It's been passed through a set, so there are no duplicates.) My problem: I want to write a method that creates a dictionary mapping each word to its index in the vocabulary. My…
quanty
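For the question above, one possible sketch of a word-to-index mapping, assuming the vocabulary is a duplicate-free list as the asker describes. The function name `build_word_index` and the sample words are hypothetical; the asker's own code is cut off in the excerpt and is not reproduced here.

```python
def build_word_index(vocabulary):
    """Map each word to its position in the vocabulary list."""
    return {word: index for index, word in enumerate(vocabulary)}


# Hypothetical duplicate-free vocabulary, kept as a list so positions are stable.
vocabulary = ["apple", "banana", "cherry"]
word_to_index = build_word_index(vocabulary)

print(word_to_index["banana"])  # prints: 1
```

If the words come straight from a `set`, converting to a sorted list first (for example `sorted(words)`) gives the same indices on every run, since set iteration order is not guaranteed.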
0
votes
1 answer

gensim: Retrieving word frequency in doc2vec vocabulary

I just came across this StackOverflow post on word counts in a doc2vec model vocabulary. I wonder if there is another method to retrieve the word frequency, other than for word, vocab_obj in model.wv.vocab.items(): print(str(word) +…
Christopher