In short:
With iWork rich text objects, breaking the text up into words goes from:
"This... he said, is a sentence!"
to:
["This", "he", "said", "is", "a", "sentence"]
So: the periods, the comma, and the exclamation point have disappeared. This is similar to the AppleScript situation, but with JavaScript for Automation it is unclear to me how to set the text item delimiters (plus: I am hoping it can be simpler than in the old days).
In detail:
I would like to modify rich text like:
testing [value] units <ignore this>
>>>
also ignore this
<<<
etc.
The text can vary in size, color, and weight, and that formatting should be kept. The result should be, for example:
testing 123 units
etc.
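To make the intended transformation concrete, here is a minimal sketch of it on a plain JavaScript string. It is only an illustration: the function name fillTemplate and the values object are hypothetical, and working on a plain string discards the size/color/weight attributes, which is exactly what I need to avoid in the real rich text.

```javascript
// Hypothetical helper (not part of any Keynote or JXA API): applies the
// desired edits to a plain string. Rich text attributes are NOT handled here.
function fillTemplate(text, values) {
  return text
    // Drop whole blocks delimited by a ">>>" line and a "<<<" line.
    .replace(/^>>>$[\s\S]*?^<<<$\n?/gm, "")
    // Drop inline <...> spans that stay on a single line.
    .replace(/<[^<>\n]*>/g, "")
    // Replace [name] placeholders with the matching entry in `values`;
    // unknown placeholders are left untouched.
    .replace(/\[(\w+)\]/g, function (match, key) {
      return Object.prototype.hasOwnProperty.call(values, key)
        ? String(values[key])
        : match;
    })
    // Remove trailing spaces left behind by the deletions.
    .replace(/[ \t]+$/gm, "");
}

const template =
  "testing [value] units <ignore this>\n>>>\nalso ignore this\n<<<\netc.";
const filled = fillTemplate(template, { value: 123 });
```

With the sample above, filled holds the two lines "testing 123 units" and "etc.".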
When I go through the words (in my case: presenter notes in Keynote), I get:
["testing", "value", "units", "ignore", "this", "also", "ignore", "this", "etc"]
instead of:
["testing", "[value]", "units", "<ignore", "this>", ">>>", "also", "ignore", "this", "<<<", "etc."]
So: characters like ., [, and > don't show up, which makes it impossible to search/replace.
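For comparison, this is the tokenization I am after, sketched on a plain JavaScript string by splitting on whitespace only. The helper name splitOnWhitespace is hypothetical, and the sketch assumes the notes are available as a plain string, so it says nothing about how to keep each word tied to its rich text formatting.

```javascript
// Hypothetical helper: split on runs of whitespace so that punctuation
// and brackets stay attached to the words they belong to.
function splitOnWhitespace(text) {
  return text.split(/\s+/).filter(function (token) {
    return token.length > 0; // discard empty tokens from leading/trailing whitespace
  });
}

const notes =
  "testing [value] units <ignore this>\n>>>\nalso ignore this\n<<<\netc.";
const tokens = splitOnWhitespace(notes);
// tokens: ["testing", "[value]", "units", "<ignore", "this>", ">>>",
//          "also", "ignore", "this", "<<<", "etc."]
```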
To get the words, I use:
const words = Application("Keynote").documents[0].slides[0].presenterNotes.words()
I also tried using whose() in combination with ignoring/considering (case, hyphens, punctuation), but the result is the same.
How can I get a list of words that include the non-alphanumeric characters?