If the font size is not set on the run, and a style is in use, you need to check the style hierarchy. If it is not set there either, it falls back to the document defaults.
As ECMA-376 4th edition, Part 1 puts it in §17.7.2 (Style Hierarchy):
This process can be described as follows:
- First, the document defaults are applied to all runs and paragraphs in the document.
- Next, the table style properties are applied to each table in the document, following the conditional formatting inclusions and exclusions specified per table.
- Next, numbered item and paragraph properties are applied to each paragraph formatted with a numbering style.
- Next, paragraph and run properties are applied to each paragraph as defined by the paragraph style.
- Next, run properties are applied to each run with a specific character style applied.
- Finally, we apply direct formatting (paragraph or run properties not from styles). If this direct formatting includes numbering, that numbering + the associated paragraph properties are applied.
If the value of the rFonts element (§17.3.2.26) references a font
which is not available, applications determine a suitable alternative
font via a process called font substitution, which is defined in
§17.8.2.
docx4j does something like this; see, for example, line 430 onwards in https://github.com/plutext/docx4j/blob/master/src/main/java/org/docx4j/model/PropertyResolver.java
Similar principles apply to font color.
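To make the order of precedence concrete, here is a minimal sketch in plain Java. It is not docx4j's PropertyResolver; the class and method names (EffectiveFontSize, resolveHalfPoints, toPoints) are made up for illustration, and it assumes you have already looked up the w:sz value at each level yourself. It also ignores style inheritance via w:basedOn, which a real resolver has to follow within each style level.

```java
class EffectiveFontSize {

    /**
     * Takes the w:sz value (in half-points) found at each level, listed in the
     * order the levels are applied: document defaults first, direct formatting
     * last. A null entry means that level does not set a size.
     * Returns the last non-null entry, or null if no level sets a size.
     */
    static Integer resolveHalfPoints(Integer... levelsInApplicationOrder) {
        Integer effective = null;
        for (Integer size : levelsInApplicationOrder) {
            if (size != null) {
                effective = size; // a later level overrides an earlier one
            }
        }
        return effective;
    }

    /** w:sz is expressed in half-points, so 36 means 18pt. */
    static double toPoints(int halfPoints) {
        return halfPoints / 2.0;
    }

    public static void main(String[] args) {
        // Hypothetical values: document defaults = 22, table style = none,
        // numbering = none, paragraph style = 28, character style = none,
        // direct run formatting = 36.
        Integer sz = resolveHalfPoints(22, null, null, 28, null, 36);
        System.out.println(toPoints(sz) + "pt"); // prints "18.0pt"
    }
}
```

If the direct formatting entry were null in that call, the paragraph style's 28 (14pt) would win, and with only the first entry set you would end up with the document default.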
I don't address here how to iterate through the document word by word (or rather, run by run), other than to say: search for docx4j's TraversalUtil.
Example of setting the font size explicitly in a run (w:sz is measured in half-points, so 36 means 18 points):
<w:r>
<w:rPr>
<w:sz w:val="36"/>
</w:rPr>
<w:t>this is 18 points</w:t>
</w:r>
You can set that in Microsoft Word, or using docx4j. To see how to do it in docx4j, you can use the webapp to generate code from a sample docx, but the essence is:
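org.docx4j.wml.R yourRun; // the run you want to format
org.docx4j.wml.HpsMeasure size = new org.docx4j.wml.HpsMeasure();
size.setVal(java.math.BigInteger.valueOf(36)); // half-points: 36 = 18pt
yourRun.getRPr().setSz(size); // getRPr() is null if the run has no w:rPr yet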
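Going the other way, the first check described at the top (is the size set directly on the run?) can be sketched without docx4j, using only the JDK's DOM parser on the run's XML. This is just an illustration, not how docx4j does it; the class and method names (RunFontSize, directSizeInPoints) are hypothetical, and the XML string must declare the w: namespace, which the fragment shown earlier leaves out.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RunFontSize {

    // Namespace bound to the w: prefix in WordprocessingML
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /**
     * Returns the size set directly on the run, in points, or null if the run
     * has no w:sz, in which case you need to walk the style hierarchy instead.
     */
    static Double directSizeInPoints(String runXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed to match w:sz by namespace
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(runXml)));
            NodeList sizes = doc.getDocumentElement()
                    .getElementsByTagNameNS(W_NS, "sz");
            if (sizes.getLength() == 0) {
                return null; // no direct size on this run
            }
            String halfPoints = ((Element) sizes.item(0)).getAttributeNS(W_NS, "val");
            return Integer.parseInt(halfPoints) / 2.0; // half-points to points
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read run XML", e);
        }
    }

    public static void main(String[] args) {
        String run = "<w:r xmlns:w=\"" + W_NS + "\">"
                + "<w:rPr><w:sz w:val=\"36\"/></w:rPr>"
                + "<w:t>this is 18 points</w:t></w:r>";
        System.out.println(directSizeInPoints(run)); // prints "18.0"
    }
}
```

A null result here is the cue to fall back to the character style, paragraph style and so on, in the order quoted from the spec.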