Questions tagged [spacy]

Industrial strength Natural Language Processing (NLP) with Python and Cython

spaCy is a library for advanced Natural Language Processing in Python and Cython. Its features include tokenization, part-of-speech tagging, dependency parsing, sentence boundary detection, named entity recognition and training of statistical neural network models.



3742 questions
28
votes
5 answers

Evaluation in a Spacy NER model

I am trying to evaluate a trained NER model created with the spaCy library. For this kind of problem you would normally use the F1 score (the harmonic mean of precision and recall). I could not find in the documentation an accuracy function for a trained NER…
Mpizos Dimitris
  • 4,819
  • 12
  • 58
  • 100
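
As a rough illustration of the metric the question above refers to, here is a minimal sketch of exact-match precision, recall and F1 over entity spans. The `ner_prf` helper and the sample spans are hypothetical and are not part of spaCy; the sketch assumes entities are compared as `(start, end, label)` tuples and that only exact matches count.

```python
def ner_prf(gold_spans, predicted_spans):
    """Exact-match precision, recall and F1 over (start, end, label) spans."""
    gold = set(gold_spans)
    predicted = set(predicted_spans)
    true_positives = len(gold & predicted)

    # Guard against empty inputs so we never divide by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        # F1 is the harmonic mean of precision and recall.
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations: character offsets plus an entity label.
gold = [(0, 5, "PERSON"), (10, 16, "ORG"), (20, 26, "GPE")]
predicted = [(0, 5, "PERSON"), (10, 16, "GPE")]

precision, recall, f1 = ner_prf(gold, predicted)
print(precision, recall, f1)
```

Only the first predicted span matches a gold span exactly; the second has the right offsets but the wrong label, so it counts as both a false positive and a missed entity.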
27
votes
3 answers

How to add a Spacy model to a requirements.txt file?

I have an app that uses the spaCy model "en_core_web_sm". I have tested the app on my local machine and it works fine. However, when I deploy it to Heroku, it gives me this error: "Can't find model 'en_core_web_sm'. It doesn't seem to be a shortcut…
rohit0505
  • 375
  • 1
  • 3
  • 10
27
votes
4 answers

Sharing data across my gunicorn workers

I have a Flask app, served by Nginx and Gunicorn with 3 workers. My Flask app is an API microservice designed for NLP tasks, and I am using the spaCy library for it. My problem is that the workers are using a huge amount of RAM as loading the spaCy…
Lee
  • 3,044
  • 1
  • 12
  • 25
26
votes
12 answers

Failed building wheel for spacy

I'm trying to install spaCy by running pip install spacy for Python 3.6.1, but I keep getting errors like the ones below. How can I get rid of this issue? Previously I was getting a cl.exe not found error; after that I added the Visual Studio path in…
Sachin Prasad
  • 363
  • 1
  • 3
  • 9
26
votes
3 answers

How to find the most common words using spacy?

I'm using spaCy with Python and it's working fine for tagging each word, but I was wondering whether it is possible to find the most common words in a string. Also, is it possible to get the most common nouns, verbs, adverbs and so on? There's a count_by…
Harry Loyd
  • 429
  • 2
  • 7
  • 14
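
One way to approach the question above is to count tokens with `collections.Counter`, optionally filtering by part-of-speech tag. The sketch below is standard-library only and assumes the input is a list of `(text, pos)` pairs; with spaCy these would typically come from `token.text` and `token.pos_` for each token in a processed `Doc`. The `most_common_words` helper and the sample data are hypothetical.

```python
from collections import Counter


def most_common_words(tagged_tokens, pos=None, n=3):
    """Return the n most frequent lower-cased words, optionally for one POS tag."""
    counts = Counter(
        text.lower()
        for text, tag in tagged_tokens
        if pos is None or tag == pos
    )
    return counts.most_common(n)


# Hypothetical (text, coarse POS tag) pairs, standing in for spaCy tokens.
tagged = [
    ("The", "DET"), ("cat", "NOUN"), ("chased", "VERB"),
    ("the", "DET"), ("dog", "NOUN"), ("and", "CCONJ"),
    ("the", "DET"), ("cat", "NOUN"), ("ran", "VERB"),
]

top_words = most_common_words(tagged, n=2)
top_nouns = most_common_words(tagged, pos="NOUN", n=2)
print(top_words)
print(top_nouns)
```

Lower-casing before counting merges "The" and "the"; whether that is desirable depends on the use case.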
25
votes
7 answers

Spacy nlp = spacy.load("en_core_web_lg")

I already have spaCy downloaded, but every time I try the nlp = spacy.load("en_core_web_lg") command, I get this error: OSError: [E050] Can't find model 'en_core_web_lg'. It doesn't seem to be a shortcut link, a Python package or a valid path to a…
codingInMyBasement
  • 728
  • 1
  • 6
  • 20
25
votes
12 answers

Spacy link error

When running: import spacy nlp = spacy.load('en') the following is printed: Warning: no model found for 'en' Only loading the 'en' tokenizer. /site-packages/spacy/data is empty with the exception of the init file. All file paths are only…
negfrequency
  • 1,801
  • 3
  • 18
  • 30
24
votes
3 answers

Where does spacy language model download?

I have a simple command: python -m spacy download en_core_web And I cannot for the life of me figure out where it downloads. I search for "en_core_web" but can find absolutely nothing, anywhere. And I can't for the life of me figure out what to…
Josh Flori
  • 295
  • 1
  • 2
  • 13
24
votes
1 answer

Get position of word in sentence with spacy

I'm aware of the basic spaCy workflow for getting various attributes from a document. However, I can't find a built-in function that returns the position (start/end) of a word that is part of a sentence. Would anyone know if this is possible with…
jack west
  • 569
  • 1
  • 4
  • 14
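
To make the idea of a word's start/end position concrete, here is a minimal standard-library sketch that records the token index and character offsets of each whitespace-delimited word. It is an illustration only, not spaCy's tokenizer: spaCy also splits off punctuation, and its tokens expose comparable information through `token.i` (index in the document) and `token.idx` (character offset of the token's start). The `word_positions` helper is hypothetical.

```python
import re


def word_positions(sentence):
    """Return (token_index, start_char, end_char, text) per whitespace-delimited word."""
    return [
        (index, match.start(), match.end(), match.group())
        for index, match in enumerate(re.finditer(r"\S+", sentence))
    ]


positions = word_positions("spaCy makes tokenization easy")
for index, start, end, text in positions:
    print(index, start, end, text)
```

The end offset is exclusive, so `sentence[start:end]` gives back the word.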
23
votes
4 answers

SpaCy: how to load Google news word2vec vectors?

I've tried several methods of loading the google news word2vec vectors (https://code.google.com/archive/p/word2vec/): en_nlp = spacy.load('en',vector=False) en_nlp.vocab.load_vectors_from_bin_loc('GoogleNews-vectors-negative300.bin') The above…
Jasper
  • 1,115
  • 1
  • 8
  • 12
22
votes
3 answers

spaCy and spaCy models in setup.py

In my project I have spaCy as a dependency in my setup.py, but I also want to add a default model. My attempt so far has been: install_requires=['spacy',…
w4nderlust
  • 1,057
  • 2
  • 12
  • 22
21
votes
4 answers

How can i work with Example for nlp.update problem with spacy3.0

I am trying to train on my data with spaCy v3.0, and apparently nlp.update does not accept tuples. Here is the piece of code: import spacy import random import json nlp = spacy.blank("en") ner =…
TanrCans
  • 213
  • 1
  • 2
  • 5
21
votes
7 answers

spacy with joblib library generates _pickle.PicklingError: Could not pickle the task to send it to the workers

I have a large list of sentences (~7 million), and I want to extract the nouns from them. I used the joblib library to parallelize the extraction, as in the following: import spacy from tqdm import tqdm from joblib import Parallel,…
Minions
  • 5,104
  • 5
  • 50
  • 91
21
votes
5 answers

Spacy, Strange similarity between two sentences

I have downloaded the en_core_web_lg model and am trying to find the similarity between two sentences: nlp = spacy.load('en_core_web_lg') search_doc = nlp("This was very strange argument between american and british person") main_doc = nlp("He was from…
Mr.D
  • 7,353
  • 13
  • 60
  • 119
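
For context on the question above: with models that ship word vectors, spaCy's default document similarity is, as far as I know, the cosine similarity between the averaged token vectors of the two documents. The sketch below reproduces that calculation in plain Python with made-up 2-dimensional vectors, to show how shared function words can push the score up. The `toy_vectors` values and both helper functions are hypothetical, not spaCy's implementation.

```python
import math


def average_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    dims = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dims)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up vectors: function words point one way, the content word another.
toy_vectors = {
    "was": [1.0, 0.0],
    "very": [1.0, 0.0],
    "from": [1.0, 0.0],
    "strange": [0.0, 1.0],
}

doc_a = average_vector([toy_vectors[w] for w in ["was", "very", "strange"]])
doc_b = average_vector([toy_vectors[w] for w in ["was", "from"]])

similarity = cosine_similarity(doc_a, doc_b)
print(round(similarity, 3))
```

Even though the second toy document has no content word in common with the first, the averaged vectors stay close because the function words dominate the mean.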
21
votes
1 answer

How does spacy use word embeddings for Named Entity Recognition (NER)?

I'm trying to train an NER model using spaCy to identify locations, (person) names, and organisations. I'm trying to understand how spaCy recognises entities in text and I've not been able to find an answer. From this issue on GitHub and this…
Navaneethan Santhanam
  • 1,707
  • 2
  • 13
  • 17