Converting the PDF to the Scalable Vector Graphics (SVG) XML format with MuPDF will give you the information you want.
Download the mupdf tool here:
http://artifex.com/developers-mupdf-download/mupdf-download-resources/
and choose the GNU AGPL LICENSE
Or here:
https://mupdf.com/downloads/
Details:
https://mupdf.com/index.html
After you download the executable, add the directory that contains it to your PATH environment variable.
You can then use the following from a command line interface (CLI) to convert the PDF (note: there will be a separate SVG file for each page):
mutool convert -F svg -O text=text -o "your_pdf_pg.svg" your_pdf.pdf
More CLI details:
https://mupdf.com/docs/manual-mutool-convert.html
In all of the cases I have seen, the font, size, style, color, and page coordinates are given for each line of text where that information is the same. The exceptions are underlines and strikeouts, which are included in the SVG file as <path> elements in the same coordinate system as the text. So you can write some code to parse the XML and tag the text with <u> </u> or <del> </del> accordingly.
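
As a starting point for that parsing step, here is a minimal Python sketch that uses only the standard library. It collects every text element with its attributes and every path element, so you can then compare their coordinates yourself. The function names and the sample SVG string are made up for illustration, and the sample is not real mutool output. The exact attributes mutool writes may differ between versions, so inspect one of your own SVG files before relying on any attribute name.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def collect_text_and_paths(svg_source):
    """Return (texts, paths) found in an SVG document given as a string.

    texts: list of {"text": str, "attrs": dict}; attrs merges the attributes
           of the <text> element and of any <tspan> elements inside it.
    paths: list of attribute dicts, one per <path> element.
    """
    root = ET.fromstring(svg_source)
    texts, paths = [], []
    for elem in root.iter():
        name = local_name(elem.tag)
        if name == "text":
            attrs = dict(elem.attrib)
            for child in elem.iter():
                if local_name(child.tag) == "tspan":
                    # Keep the first value seen for each attribute name.
                    for key, value in child.attrib.items():
                        attrs.setdefault(key, value)
            content = "".join(elem.itertext()).strip()
            texts.append({"text": content, "attrs": attrs})
        elif name == "path":
            paths.append(dict(elem.attrib))
    return texts, paths


# Hypothetical hand-written sample, NOT actual mutool output.
SAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg" width="200" height="100">
<text font-family="Helvetica" font-size="12" fill="#000000"><tspan x="10" y="20">Hello</tspan></text>
<path d="M 10 22 H 40" stroke="#000000"/>
</svg>"""

if __name__ == "__main__":
    found_texts, found_paths = collect_text_and_paths(SAMPLE_SVG)
    for item in found_texts:
        print(item["text"], item["attrs"])
    for path_attrs in found_paths:
        print("path:", path_attrs)
```

Deciding whether a given path is an underline or a strikeout is left out on purpose. One plausible approach is to compare the path's vertical position with the text baseline (a line just below the baseline suggests an underline, one through the middle of the glyphs suggests a strikeout), but the thresholds depend on your documents.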