I'm trying to use Stanza language models with Presidio and running into this blocker.

import stanza
stanza.download("en")

from presidio_analyzer.nlp_engine import StanzaNlpEngine
StanzaNlpEngine(models={"en": "en"})

The above throws:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../presidio_analyzer/nlp_engine/stanza_nlp_engine.py", line 41, in __init__
    for lang_code, model_name in models.items()
  File ".../presidio_analyzer/nlp_engine/stanza_nlp_engine.py", line 41, in <dictcomp>
    for lang_code, model_name in models.items()
NameError: name 'StanzaLanguage' is not defined

Looking at the code, this seems like it should work.

Even this throws the same error.

StanzaNlpEngine()

1 Answer

The issue was fixed in version 2.2.2:

The spacy-stanza interface had changed in spaCy 3. This PR proposes:

  1. The fix to the spacy-stanza interface.
  2. Bug fix on getting the right recognizer given the nlp engine (SpacyRecognizer for SpacyNlpEngine and StanzaRecognizer for StanzaNlpEngine)
  3. Bug fix in the decision process string for Stanza
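
To check whether your environment already has the fix, you can compare the installed version against 2.2.2. The snippet below is only a rough sketch: it assumes the package is installed under the distribution name `presidio-analyzer`, and the helper names (`parse_release`, `has_stanza_fix`, `installed_presidio_version`) are made up for illustration, not part of Presidio.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Version named in the answer above as containing the fix.
FIXED_IN = (2, 2, 2)


def parse_release(version_string):
    """Return the leading numeric components of a version string as a tuple.

    Deliberately simple: suffixes such as "rc1" are ignored, so a
    pre-release of 2.2.2 is treated the same as 2.2.2 itself.
    """
    parts = []
    for piece in version_string.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def has_stanza_fix(version_string):
    """True if the given version is 2.2.2 or newer."""
    return parse_release(version_string) >= FIXED_IN


def installed_presidio_version():
    """Return the installed version string, or None if not installed.

    Assumes the distribution name is "presidio-analyzer".
    """
    try:
        return version("presidio-analyzer")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    installed = installed_presidio_version()
    if installed is None:
        print("presidio-analyzer is not installed")
    elif has_stanza_fix(installed):
        print(f"presidio-analyzer {installed} should include the fix")
    else:
        print(f"presidio-analyzer {installed} predates 2.2.2; "
              "try: pip install --upgrade presidio-analyzer")
```

If it reports an older version, upgrading and then re-running the `StanzaNlpEngine(models={"en": "en"})` call from the question is the quickest way to confirm the `NameError` is gone.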