I'm working with OpenSearch, and I have a large input text that contains several exercise names. I'd like to extract these exercise names from the input text and search for documents that match these names in my OpenSearch index.
The input text can be of any format and contain various characters, such as lowercase or uppercase letters, numbers, and special characters. Exercise names within the input text are not guaranteed to start with a capital letter or follow any specific pattern. Here's an example of an input text:
I will make a good 10 push-ups and Dumbbell Deficit Push-up
In the index I have:
[
  {
    "id": 2,
    "name": "Ankle Circles"
  },
  {
    "id": 3,
    "name": "Barbell Deep Squat"
  },
  {
    "id": 10,
    "name": "Push-ups"
  },
  {
    "id": 11,
    "name": "Sit-up"
  },
  {
    "id": 12,
    "name": "Air Squats"
  },
  {
    "id": 13,
    "name": "Dumbbell Deficit Push-up"
  },
  {
    "id": 14,
    "name": "Pretzel Stretch"
  },
  {
    "id": 15,
    "name": "Cobra Stretch"
  },
  {
    "id": 20,
    "name": "Push-ups with Elevated Feet"
  }...
]
Here is my search request:
SearchResponse<ExerciseOSDto> searchResponse = openSearchClient.search(
        s -> s.index("exercises")
                .query(new Query.Builder()
                        .match(new MatchQuery.Builder()
                                .field("name")
                                // The whole input text is passed as the match query string
                                .query(new FieldValue.Builder()
                                        .stringValue(payload.getText())
                                        .build())
                                // OR: a document matches if any single term matches
                                .operator(Operator.Or)
                                .build())
                        .build()),
        ExerciseOSDto.class);
But with this request I get back every exercise whose name contains any of the terms up, ups, or push, not just the exercises actually mentioned in the text.
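To illustrate what I think is happening, here is a rough sketch in plain Java (no OpenSearch involved). It assumes the "name" field uses the default standard analyzer, which I approximate by lowercasing and splitting on anything that is not a letter or digit. The class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class TokenOverlapDemo {

    // Rough stand-in for the standard analyzer (an assumption, not the real one):
    // lowercase the text and split on every character that is not a letter or digit.
    static Set<String> tokens(String text) {
        Set<String> out = new LinkedHashSet<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // With operator OR, a name counts as a hit when it shares at least one token with the input.
    static List<String> orMatches(String input, List<String> names) {
        Set<String> queryTokens = tokens(input);
        List<String> hits = new ArrayList<>();
        for (String name : names) {
            if (!Collections.disjoint(queryTokens, tokens(name))) {
                hits.add(name);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String input = "I will make a good 10 push-ups and Dumbbell Deficit Push-up";
        List<String> names = List.of(
                "Ankle Circles", "Barbell Deep Squat", "Push-ups", "Sit-up", "Air Squats",
                "Dumbbell Deficit Push-up", "Pretzel Stretch", "Cobra Stretch",
                "Push-ups with Elevated Feet");
        // Prints: [Push-ups, Sit-up, Dumbbell Deficit Push-up, Push-ups with Elevated Feet]
        System.out.println(orMatches(input, names));
    }
}
```

Under that assumption "Sit-up" is a hit only because it shares the token "up", and "Push-ups with Elevated Feet" only because of "push" and "ups", which is consistent with the unwanted results I see.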
From the input text above, I'd like to get only the exercises with ids 10 and 13.
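To make the expected result concrete, here is a minimal sketch in plain Java of the behavior I'm after: an exercise counts as mentioned only if its full name appears in the text as a whole phrase, ignoring case. This is not an OpenSearch query, just a description of the desired output in code; the class and method names are hypothetical, and looping over every name like this is only workable for a small list.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class ExpectedMatchDemo {

    // Returns the ids of exercises whose complete name occurs in the input,
    // case-insensitively and not embedded inside a longer word.
    static List<Integer> idsMentioned(String input, Map<Integer, String> exercises) {
        List<Integer> ids = new ArrayList<>();
        for (Map.Entry<Integer, String> e : exercises.entrySet()) {
            Pattern p = Pattern.compile(
                    // No letter or digit directly before or after the quoted name
                    "(?<![\\p{L}\\p{N}])" + Pattern.quote(e.getValue()) + "(?![\\p{L}\\p{N}])",
                    Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
            if (p.matcher(input).find()) {
                ids.add(e.getKey());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the insertion order, so ids come back in index order
        Map<Integer, String> exercises = new LinkedHashMap<>();
        exercises.put(2, "Ankle Circles");
        exercises.put(3, "Barbell Deep Squat");
        exercises.put(10, "Push-ups");
        exercises.put(11, "Sit-up");
        exercises.put(12, "Air Squats");
        exercises.put(13, "Dumbbell Deficit Push-up");
        exercises.put(14, "Pretzel Stretch");
        exercises.put(15, "Cobra Stretch");
        exercises.put(20, "Push-ups with Elevated Feet");

        String input = "I will make a good 10 push-ups and Dumbbell Deficit Push-up";
        // Prints: [10, 13]
        System.out.println(idsMentioned(input, exercises));
    }
}
```

Note that "Push-ups with Elevated Feet" is excluded here because the text contains only part of its name, and "Sit-up" is excluded because it never appears as a phrase.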
What is the best approach to extract these exercise names from the input text and perform a search in OpenSearch?
Any help or guidance would be much appreciated!