I want to convert a document to PDF, splitting it by searching each page for a specific word (the word may appear on one page or on several). For example, suppose a file has three pages, with a special word on the first page and the next special word on the third page. I want to save the first and second pages as one PDF and the third page as a separate PDF. Each PDF file should be named after the special word found on its pages.

My problem is that I don't know how to loop over the pages, read each page's content to find the special word, and save the pages as a PDF. Thank you.

  • All that has been posted is a program description. However, we need you to ask a question according to the [ask] page. We can't be sure what you want from us. Please [edit] your post to include a valid question that we can answer. Reminder: make sure you know what is on-topic here by visiting the [help/on-topic]; asking us to write the program for you, suggestions, and external links are off-topic. – gunr2171 Dec 23 '19 at 15:32

1 Answer

Here is how you can do it.

  1. Paginate your Word document using DocumentModel.GetPaginator method.
  2. Read the text content of each page using FrameworkElement.ToText extension method.
  3. Save selected pages to PDF using DocumentModelPage.Save method.

In other words, try the following:

string search = "Your Specific Word";
string inputPath = "input.docx";

// Load Word document.
var document = DocumentModel.Load(inputPath);

// 1. Get document's pages.
var pages = document.GetPaginator().Pages;

for (int i = 0, count = pages.Count; i < count; ++i)
{
    // 2. Read page's text content.
    DocumentModelPage page = pages[i];
    string pageTextContent = page.PageContent.ToText();

    // 3. Save page as PDF.
    if (pageTextContent.Contains(search))
    {
        string outputPath = $"{search}_{i}.pdf";
        page.Save(outputPath);
    }
}
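
If each match should also carry along the non-matching pages that follow it (as in the three-page example from the question), the grouping step can be worked out separately from the saving step. The following is only a minimal sketch under stated assumptions: `PageGrouping` and `GroupByMatch` are hypothetical names, not part of any library, and the input is assumed to be the per-page text collected with `ToText()` in the loop above. Pages before the first match are skipped.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of GemBox.Document.
public static class PageGrouping
{
    // Returns inclusive, zero-based (Start, End) page ranges.
    // Each range starts at a page whose text contains the search word
    // and ends on the page before the next match (or on the last page).
    public static List<(int Start, int End)> GroupByMatch(
        IReadOnlyList<string> pageTexts, string search)
    {
        var ranges = new List<(int Start, int End)>();
        int start = -1; // -1 means no match has been seen yet

        for (int i = 0; i < pageTexts.Count; i++)
        {
            if (!pageTexts[i].Contains(search))
                continue;

            // A new match closes the range opened by the previous one.
            if (start >= 0)
                ranges.Add((start, i - 1));

            start = i;
        }

        // Close the final range at the last page.
        if (start >= 0)
            ranges.Add((start, pageTexts.Count - 1));

        return ranges;
    }
}
```

Each returned range would then have to be written out as a single PDF file; how to do that depends on the library features available to you and is not shown here.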
Mario Z
  • Hi Mario. Thanks for the reply. This works, but it always exports a single page. If the specific word exists on the first page and the third page, I want to export two files (first file: first to second page; second file: only the third page). – Sam Samangi Dec 25 '19 at 06:40
  • @SamSamangi can you upload your Word document somewhere and send me a link so that I can take a look at it? – Mario Z Dec 25 '19 at 09:21
  • One solution to this is to save the desired pages into separate PDF files, one by one, and then [merge those PDF files into one](https://www.gemboxsoftware.com/pdf/examples/c-sharp-vb-net-merge-pdf/201). Or you can find out [what content is on what Word page](https://stackoverflow.com/a/60769555/2699178) and then delete all the content that is not in the targeted range of pages and save the whole resulting `DocumentModel` to PDF. I would recommend using the first approach, it's easier. – Mario Z Jul 12 '21 at 09:09