Questions tagged [udpipe]

UDPipe comprises a free C++ library and a binary executable for Natural Language Processing (NLP).

UDPipe is a free C++ library for Natural Language Processing (NLP). It performs tokenization, part-of-speech tagging, lemmatization and dependency parsing of raw text.

Binaries for Windows/Linux/OS X are also available, and there exist a web service and a REST API.

For details, see http://ufal.mff.cuni.cz/udpipe and https://github.com/ufal/udpipe.
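UDPipe writes its annotations in the CoNLL-U format: one token per line with ten tab-separated columns, comment lines starting with `#`, and a blank line between sentences. The sketch below reads such text into plain Python dictionaries using only the standard library. The function name `parse_conllu` and the `SAMPLE` text are hypothetical and hand-written for illustration; they are not real UDPipe output and not part of any UDPipe API.

```python
from typing import Dict, List

# The ten column names defined by the CoNLL-U format, in order.
CONLLU_FIELDS = ["id", "form", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc"]


def parse_conllu(text: str) -> List[List[Dict[str, str]]]:
    """Return a list of sentences; each sentence is a list of token dicts."""
    sentences: List[List[Dict[str, str]]] = []
    current: List[Dict[str, str]] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            # Comment lines (sent_id, text, ...) are skipped in this sketch.
            continue
        cols = line.split("\t")
        if len(cols) != len(CONLLU_FIELDS):
            # Skip lines that do not have exactly ten columns.
            continue
        current.append(dict(zip(CONLLU_FIELDS, cols)))
    if current:
        sentences.append(current)
    return sentences


# Hand-written sample in CoNLL-U layout (an assumption for illustration).
SAMPLE = (
    "# sent_id = 1\n"
    "1\tDogs\tdog\tNOUN\tNNS\tNumber=Plur\t2\tnsubj\t_\t_\n"
    "2\tbark\tbark\tVERB\tVBP\t_\t0\troot\t_\t_\n"
    "\n"
)

for token in parse_conllu(SAMPLE)[0]:
    print(token["form"], token["lemma"], token["upos"])
```

Keeping every column as a string avoids guessing at types: the `id` column, for example, can hold ranges for multiword tokens, so converting it to an integer would not always work.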

37 questions
0
votes
1 answer

How to extract entity names with spacyr with personalized data?

Good afternoon, I am trying to sort a large corpus of normative texts of different lengths, and to tag the parts of speech (POS). For that purpose, I was using the tm and udpipe libraries, and given the length of the database. The other task I need…
0
votes
0 answers

Identify Country Name with nametagger model

I need to identify all countries mentioned in a text file using the nametagger model. However, I found out that there are mistakes in the output. For example, it identifies Cuba as 'O' instead of 'B-LOC'. Also, it cannot correctly identify words…
0
votes
1 answer

Output Bash pipes to Python-compatible format

I'm working on text tokenization and lemmatization using UDPipe models. I can complete the task itself by using !echo commands or printing into a file, but I would like to generate a Python data structure to further process the output. What…
Cath Neman
0
votes
0 answers

Split the string into multiple sentences with R and pos tagging

I don't know if this is the right place, but if possible, could you help me split a text into several sentences using R? I have a database that contains the description of activities that employees perform. I would like to split this text into…
0
votes
1 answer

Find all possible phrase matches between string and lookup table

I have a data frame with a bunch of text strings. In a second data frame I have a list of phrases that I'm using as a lookup table. I want to search the text strings for all possible phrase matches in the lookup table. My problem is that some of the…
Obed
0
votes
0 answers

r udpipe cooccurrence function throwing error 'i is not found in calling scope and it is not a column name either'

I would like to find out how many times nouns and adjectives are used in the same doc id. I found the cooccurrence() function of the udpipe package that perfectly serves this purpose. Here is my data frame: x <- structure(list(doc_id = c("doc1",…
Ane
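As a rough illustration of the counting described in this question, the sketch below counts, for each pair of noun or adjective lemmas, the number of documents in which both appear. It is a standard-library Python sketch under stated assumptions, not the `cooccurrence()` function of the udpipe R package and not a fix for the reported error. The function name `count_cooccurrences`, the row layout `(doc_id, lemma, upos)`, and the `ROWS` data are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple

Row = Tuple[str, str, str]  # (doc_id, lemma, upos): an assumed layout


def count_cooccurrences(rows: Iterable[Row],
                        keep_upos: Tuple[str, ...] = ("NOUN", "ADJ")) -> Counter:
    """Count, per unordered lemma pair, the documents containing both lemmas."""
    lemmas_by_doc: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, lemma, upos in rows:
        if upos in keep_upos:
            lemmas_by_doc[doc_id].add(lemma)

    counts: Counter = Counter()
    for lemmas in lemmas_by_doc.values():
        # Sorting makes each pair appear in one canonical order.
        for pair in combinations(sorted(lemmas), 2):
            counts[pair] += 1
    return counts


# Hand-written rows for illustration only.
ROWS = [
    ("doc1", "big", "ADJ"), ("doc1", "dog", "NOUN"), ("doc1", "bark", "VERB"),
    ("doc2", "big", "ADJ"), ("doc2", "dog", "NOUN"), ("doc2", "house", "NOUN"),
]

for pair, n in count_cooccurrences(ROWS).most_common():
    print(pair, n)
```

Because lemmas are collected into a set per document, a pair is counted at most once per document, however often each word occurs there. Counting every occurrence instead would need a per-document `Counter` in place of the set.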
0
votes
1 answer

R - NLP - Extract pair

Hi guys, I'm new to NLP with R. I would like to extract (VERB-NOUN) pairs from a PDF. I'm stuck at the word-frequency topic. For example: "Represent clients in criminal and civil litigation and other legal proceedings, draw up legal…
0
votes
1 answer

R SQL Server file does not exist error - but it does

I'm running R 3.5.2 inside SQL Server 2019. Loading the pre-trained udpipe model using the following command: udmodel_english <- udpipe_load_model(file = ''C:/ud/english-ewt-ud-2.5-191206.udpipe'') This works fine in RStudio and in R directly.…
Maz
0
votes
1 answer

R Udpipe package install into SQL Server error

I get the following error when I try to run UDPIPE via external script call in SQL Server. Msg 39004, Level 16, State 20, Line 31 A 'R' script error occurred during execution of 'sp_execute_external_script' with HRESULT 0x80004004. Msg 39019, Level…
Maz
0
votes
1 answer

How to augment udpipe models with custom dictionary?

Is there a way to add a dictionary of custom user defined words to the udpipe models? For example, below using the default english model, some of the words should have been identified as the keywords, such as R, Python, SQL, javascript, Excel,…
Afiq Johari
0
votes
0 answers

problems with UDpipe models

I'm trying to implement a sentiment analysis study on data extracted from Twitter, with R. I am using the udpipe library. When I write udpipe_download_model("model") model <- udpipe_load_model("directory") out <- as.data.frame(udpipe_annotate(object,…
0
votes
1 answer

How to fix memory allocation issues when converting annotated NLP model to dataframe in R

I am trying to convert an annotated NLP model of size 1.2GB to a data frame. I am using the udpipe package for natural language processing in R with the following code: # Additional Topic Models # annotate and tokenize corpus model <-…
nigus21
0
votes
1 answer

SpaCy-UDpipe load custom model colab

I'm trying to load a custom spacy-udpipe model into google colab. I tried !pip install ufal.udpipe !pip install spacy-udpipe import spacy_udpipe nlp = udpipe_download_model(language = ("italian-postwita")) but I get the following error…
komy83
0
votes
1 answer

How to find the co-occurrences of a specific term with udpipe in R?

I am new to the udpipe package, and I think it has great potential for the social sciences. A current project of mine is to study how news articles write about networks and networking (i.e. the people kind, not computer networks). For this, I…
0
votes
0 answers

Can I enforce consistent return data types when processing a data.table?

I’m (trying) to annotate a fairly large data set with the udpipe package. For efficiency, I’ve got my data in a data.table, and am looping over the data set in smaller batches. Like so (data sample at end):…