I'm using this JavaScript code as an Action in Acrobat Pro:
    // Iterates over all pages, finds a given string, and extracts all
    // pages on which that string is found to a new file.
    var pageArray = [];
    var stringToSearchFor = "USA";
    for (var p = 0; p < this.numPages; p++) {
        // iterate over all words on the page
        for (var n = 0; n < this.getPageNumWords(p); n++) {
            if (this.getPageNthWord(p, n) == stringToSearchFor) {
                pageArray.push(p);
                break; // one match is enough for this page
            }
        }
    }
    if (pageArray.length > 0) {
        // extract all pages that contain the string into a new document
        var d = app.newDoc(); // this adds a blank page, which we remove once we are done
        for (var n = 0; n < pageArray.length; n++) {
            d.insertPages( {
                nPage: d.numPages-1,
                cPath: this.path,
                nStart: pageArray[n],
                nEnd: pageArray[n]
            } );
        }
        // remove the blank first page
        d.deletePages(0);
    }
It performs as expected when the stringToSearchFor variable is set to "USA", but it fails when I change that string to the text I actually want to match, "U.S.A.", with a period after each initial.
I've tried escaping the . characters, replacing them with a * wildcard, and changing the string to a RegEx pattern, all to no avail.
To be specific: when the text in the PDF and the variable are both USA, the action extracts the matching pages. When the PDF and the variable are both U.S.A., the script runs but does not extract any pages.
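One possible cause, offered as an untested guess rather than a confirmed diagnosis: getPageNthWord takes an optional third argument, bStrip, which defaults to true and removes punctuation and whitespace from the returned word. Acrobat may also split "U.S.A." into several words at the periods. If either is the case, no single returned word would ever equal "U.S.A.". A minimal sketch of a workaround under that assumption rebuilds each page's text from unstripped words and looks for the string inside it. The helper name pageContainsText is hypothetical (it is not part of the Acrobat API), and it does a substring match rather than the exact-word comparison used in the script.

```javascript
// Hypothetical helper, not part of the Acrobat API.
// Assumption: passing false as the third argument (bStrip) makes
// getPageNthWord return each word with its punctuation and trailing
// whitespace intact, so concatenating the words approximates the page text.
function pageContainsText(doc, pageNum, target) {
    var text = "";
    var numWords = doc.getPageNumWords(pageNum);
    for (var n = 0; n < numWords; n++) {
        text += doc.getPageNthWord(pageNum, n, false);
    }
    // Substring match: also true when the target is part of a longer token.
    return text.indexOf(target) !== -1;
}
```

In the Action, the inner word loop would then be replaced by a single check such as `if (pageContainsText(this, p, stringToSearchFor)) { pageArray.push(p); }`.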