
My problem is matching text that contains math expressions. I saw this topic: What is the best way to index documents which contain mathematical expression in elastic search?

But in my case I can't just create one additional field holding a math expression. I have to match texts such as math tasks for students, and a single task can contain multiple math expressions. I think MathML is the preferred format here, because I can split the MathML tags into words and match them like ordinary words.
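To illustrate what splitting MathML into words could look like, here is a minimal Python sketch. The token scheme (`tag:value` for the leaf elements `mi`, `mn`, `mo`, and the bare tag name for structural elements) and the function name `mathml_tokens` are assumptions made for illustration, not an established convention or part of Elasticsearch.

```python
import xml.etree.ElementTree as ET
from typing import List

# Leaf elements whose text content carries the actual symbol.
LEAF_TAGS = {"mi", "mn", "mo"}


def mathml_tokens(mathml: str) -> List[str]:
    """Flatten a MathML fragment into word-like tokens (hypothetical scheme).

    Assumes the fragment is well-formed XML and uses literal characters
    rather than named entities such as &minus;, which ElementTree
    cannot resolve without a DTD.
    """
    root = ET.fromstring(mathml)
    tokens = []
    for element in root.iter():  # document order
        tag = element.tag.split("}")[-1]  # drop an XML namespace, if any
        text = (element.text or "").strip()
        if tag in LEAF_TAGS and text:
            tokens.append(f"{tag}:{text}")
        elif tag not in LEAF_TAGS:
            tokens.append(tag)
    return tokens


sample = (
    "<math><msup><mrow><mo>(</mo><mn>2</mn><mi>x</mi><mo>+</mo>"
    "<mn>7</mn><mo>)</mo></mrow><mn>2</mn></msup></math>"
)
print(" ".join(mathml_tokens(sample)))
```

The resulting space-separated string could then be indexed with a whitespace-style analyzer, so that each token is matched as a word.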

I want the results ranked by how closely their math expressions match the ones in the query. What is the proper way to achieve this kind of matching?

Examples:

  1. Solve the equation (2x + 7) ^ 2 = (2x - 1) ^ 2 .

  2. Find all values of the parameter a, for each of which the equation: | x - a ^ 2 + a + 2 | + | x - a ^ 2 + 3a - 1 | = 2a - 3 has roots, but none of them belongs to the interval (4; 19)

    P.S. Graphical representation of the equation: [image: equation pict]

Dmitry K.
  • Can you share a sample document? – Val Jan 25 '21 at 08:57
  • @Val added examples to question – Dmitry K. Jan 25 '21 at 09:12
  • I think it would still be a good idea to extract the equations or math expressions into their own fields, while keeping them in the description. Maybe even have three kinds of fields: a description field containing the human-readable text, which you wouldn't index, and two separate searchable fields, the description without equations and the equations, each analyzed differently. Then you could use `multi_match` to query both of them. Trying to index a single field with very, very, very different rules will be impractical. – Evaldas Buinauskas Jan 25 '21 at 14:36
  • @EvaldasBuinauskas Yes, I thought about it too. MathML expressions can be easily extracted from the text because they have XML markup. And if I understand you correctly, it is also possible to rank matches by the match score of the math expressions. – Dmitry K. Jan 26 '21 at 05:38
  • There's nothing out of the box that would work for you, so you'd need to build a custom analyzer or parser for that. But it should be possible. The main idea is to separate searchable and user-facing fields due to completely different search semantics. :) – Evaldas Buinauskas Jan 26 '21 at 07:48
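The field split suggested in the comments could be sketched roughly as follows. This is only an illustration under assumptions: the field names `description`, `description_text`, and `equations`, the helper names, and the regex-based extraction of `<math>` elements are all hypothetical. The query body uses the standard `multi_match` query with `type: most_fields`, but no index mapping or custom analyzer is shown.

```python
import re

# Non-greedy match of a whole <math>...</math> element (assumes no nesting).
MATH_RE = re.compile(r"<math\b.*?</math>", re.DOTALL)


def split_task(raw: str) -> dict:
    """Split one task into a stored original plus two searchable fields."""
    equations = MATH_RE.findall(raw)
    prose = " ".join(MATH_RE.sub(" ", raw).split())  # collapse whitespace
    return {
        "description": raw,         # user-facing, would not be indexed
        "description_text": prose,  # prose only, standard text analysis
        "equations": equations,     # MathML only, needs a custom analyzer
    }


def build_query(query_string: str) -> dict:
    """Build a multi_match request body over both searchable fields."""
    return {
        "query": {
            "multi_match": {
                "query": query_string,
                "fields": ["description_text", "equations"],
                "type": "most_fields",
            }
        }
    }


task = "Solve the equation <math><mi>x</mi><mo>=</mo><mn>1</mn></math> ."
doc = split_task(task)
print(doc["description_text"])
print(doc["equations"])
```

With `most_fields`, the scores of both fields are combined, so a document that matches on prose and on equations ranks above one that matches on only one of them.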

0 Answers