I'm not sure what exactly you want to achieve. If you posted a curl query that does what you want, it would be easier to translate it into Elasticsearch DSL or the elasticsearch-py interface.
If you're looking for a Python alternative to the _analyze endpoint, you can use elasticsearch-py; I'm not sure whether the same is possible with Elasticsearch DSL. Say I want to see how my string jestem biały miś is analyzed by my analyzer named morfologik. Using curl I would just run:
$ curl -XGET "http://localhost:9200/morf_texts/_analyze" -H 'Content-Type: application/json' -d'
{
  "analyzer": "morfologik",
  "text": "jestem biały miś"
}'
{
  "tokens": [
    {
      "token": "być",
      "start_offset": 0,
      "end_offset": 6,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "biały",
      "start_offset": 7,
      "end_offset": 12,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "miś",
      "start_offset": 13,
      "end_offset": 16,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "misić",
      "start_offset": 13,
      "end_offset": 16,
      "type": "<ALPHANUM>",
      "position": 2
    }
  ]
}
In order to achieve the same result using elasticsearch-py, you can run the following:
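from elasticsearch import Elasticsearch

# Connect to the same node the curl example targets
client = Elasticsearch("http://localhost:9200")

# client.indices exposes the indices API, so no separate IndicesClient is needed
client.indices.analyze(
    index="morf_texts",  # same index as in the curl request
    body={
        "analyzer": "morfologik",
        "text": "jestem biały miś",
    },
)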
The output of the analyze method is the same as that of the curl request above:
{'tokens': [{'token': 'być',
'start_offset': 0,
'end_offset': 6,
'type': '<ALPHANUM>',
'position': 0},
{'token': 'biały',
'start_offset': 7,
'end_offset': 12,
'type': '<ALPHANUM>',
'position': 1},
{'token': 'miś',
'start_offset': 13,
'end_offset': 16,
'type': '<ALPHANUM>',
'position': 2},
{'token': 'misić',
'start_offset': 13,
'end_offset': 16,
'type': '<ALPHANUM>',
'position': 2}]}
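
If you only care about the tokens themselves, the returned dictionary is easy to post-process. Below is a minimal sketch of one way to do that; tokens_by_position is a hypothetical helper name of mine, not part of elasticsearch-py, and it assumes the response has the shape shown above. It groups token strings by their position, so alternative analyses of the same word (here miś and misić) end up together. The sample data is copied from the output above, trimmed to the two keys the helper reads.

```python
from collections import defaultdict


def tokens_by_position(response):
    """Group token strings from an _analyze-style response by position.

    Hypothetical helper: assumes `response` is a dict with a "tokens" list
    whose entries carry "token" and "position" keys, as in the output above.
    """
    grouped = defaultdict(list)
    for entry in response.get("tokens", []):
        grouped[entry["position"]].append(entry["token"])
    # Return a plain list of token groups, ordered by position
    return [grouped[pos] for pos in sorted(grouped)]


# Sample response copied from the output above (only the keys the helper uses)
sample = {
    "tokens": [
        {"token": "być", "position": 0},
        {"token": "biały", "position": 1},
        {"token": "miś", "position": 2},
        {"token": "misić", "position": 2},
    ]
}

print(tokens_by_position(sample))
# → [['być'], ['biały'], ['miś', 'misić']]
```

With a live cluster you would pass the return value of the analyze call instead of the hard-coded sample.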