I need help with regex in spaCy for Japanese. I have this text: 道が凍っているから気を付けなさい。 I need to match everything up to and including the character "を", so essentially I need to get "道が凍っているから気を". I tried this code:
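import spacy
from spacy.matcher import Matcher

# Load the small Japanese pipeline
nlp = spacy.load("ja_core_news_sm")
matcher = Matcher(nlp.vocab)
# Pattern: a single token whose TEXT matches the regex
pattern = [{"TEXT": {"REGEX": "^.*?[を]"}}]
matcher.add("mypattern", [pattern])
Verbwithnoun = "道が凍っているから気を付けなさい。"  # the text from above
doc = nlp(Verbwithnoun)
matches = matcher(doc)
for match_id, start, end in matches:
    string_id = nlp.vocab.strings[match_id]  # name of the matched pattern
    print(doc[start:end])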
But it prints nothing. When I try the same pattern "^.*?[を]" on Python regex test sites like Regex101 or Pythex, it works perfectly and returns the correct part of the sentence. In spaCy it doesn't work. Can somebody please help me?
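
One possible explanation, offered as a sketch rather than a confirmed diagnosis: a REGEX inside a Matcher token pattern is tested against each token's text separately, whereas Regex101 and Pythex test the whole string at once. The snippet below uses only Python's re module to show the difference. The token list is a hand-written assumption for illustration; the actual tokenization from ja_core_news_sm may differ.

```python
import re

text = "道が凍っているから気を付けなさい。"
pattern = re.compile(r"^.*?を")

# ASSUMPTION: hand-written tokenization for illustration only;
# the real tokens produced by ja_core_news_sm may differ.
tokens = ["道", "が", "凍っ", "て", "いる", "から", "気", "を", "付け", "なさい", "。"]

# Per-token matching: each token is tested on its own, so no single
# token can ever contain the whole prefix of the sentence.
per_token_hits = [tok for tok in tokens if pattern.search(tok)]

# Whole-string matching: this is what the online regex testers do.
whole_match = pattern.search(text)
prefix = whole_match.group(0) if whole_match else None

print(per_token_hits)
print(prefix)
```

If this is indeed the cause, one option would be to run the regex on doc.text and pass whole_match.start() and whole_match.end() to doc.char_span(...), which returns a Span, or None when the character offsets do not line up with token boundaries.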