I want to crop a PDF file using iTextSharp and the rectangle (0,1000,600,155). Everything looks fine: when you open the created *.pdf file, you see only the cropped content. BUT if you parse that PDF, it still contains the information and text from the non-visible part of the document, and I can't accept that. How can I remove that data completely?

Here is my code sample:

    static void cropiTxtSharp()
    {
        string file = "C:\\testpdf.pdf";
        string oldchar = "testpdf.pdf";
        string repChar = "test.pdf";
        PdfReader reader = new PdfReader(file);
        // Set the CropBox of page 1 to the desired rectangle.
        PdfRectangle rect = new PdfRectangle(0, 1000, 600, 115);
        PdfDictionary pageDict = reader.GetPageN(1);
        pageDict.Put(PdfName.CROPBOX, rect);
        // Write the result next to the input file, as test.pdf.
        PdfStamper stamper = new PdfStamper(reader, new FileStream(file.Replace(oldchar, repChar), FileMode.Create, FileAccess.Write));
        stamper.Close();
        reader.Close();
    }

EDIT: Here is code that works. I spent some hours on it, but I finally did it :P
First, add the following to your project:

    using System.Collections.Generic; // List<T>
    using System.IO;                  // FileStream, FileMode, FileAccess
    using iTextSharp.text.pdf;
    using iTextSharp.text;
    using iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup;


Then you can use my code:

    static void textsharpie()
    {
        string file = "C:\\testpdf.pdf";
        string oldchar = "testpdf.pdf";
        string repChar = "test.pdf";
        PdfReader reader = new PdfReader(file);
        PdfStamper stamper = new PdfStamper(reader, new FileStream(file.Replace(oldchar, repChar), FileMode.Create, FileAccess.Write));
        // Each location names a page, the rectangle whose content is removed,
        // and the color used to fill the cleaned area.
        List<PdfCleanUpLocation> cleanUpLocations = new List<PdfCleanUpLocation>();
        cleanUpLocations.Add(new PdfCleanUpLocation(1, new iTextSharp.text.Rectangle(0f, 0f, 600f, 115f), iTextSharp.text.BaseColor.WHITE));
        PdfCleanUpProcessor cleaner = new PdfCleanUpProcessor(cleanUpLocations, stamper);
        cleaner.CleanUp();
        stamper.Close();
        reader.Close();
    }


Unfortunately, I can't use that code if I want to commercialize my application without paying for a license, so I had to look for a different library...

rafixwpt

1 Answer

What you're doing is setting the CropBox of the page, which does absolutely nothing to the content of the document. This is by design and has been that way since Acrobat 1.0.

What you want to do is called redaction (or in your case, exclusive redaction, since you want to remove everything outside the bounds of a rectangle). It is decidedly non-trivial to do correctly, mostly because of issues with content that partially overlaps the bounds to which you want to redact (images, text, and paths).
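
To make "exclusive redaction" concrete, here is a minimal sketch of the geometry only, in plain C# with no PDF library involved. It splits a page into up to four strips that lie outside the rectangle you want to keep; each strip could then be handed to whatever redaction tool you use. The names `Box`, `ExclusiveRedaction`, and `OutsideRegions` are hypothetical, made up for this illustration, and the sketch does not address the hard part mentioned above, namely content that straddles the boundary.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper type: an axis-aligned box in PDF user space
// (origin at the lower-left corner, y grows upward).
public struct Box
{
    public readonly float Left, Bottom, Right, Top;

    public Box(float left, float bottom, float right, float top)
    {
        Left = left; Bottom = bottom; Right = right; Top = top;
    }

    // A box with no area does not need to be redacted.
    public bool IsEmpty { get { return Right <= Left || Top <= Bottom; } }
}

public static class ExclusiveRedaction
{
    // Returns the non-overlapping strips of `page` that lie outside `keep`:
    // a full-width strip below, a full-width strip above, and the left and
    // right strips in between. Empty strips are skipped.
    public static List<Box> OutsideRegions(Box page, Box keep)
    {
        // Clamp the keep box to the page so the strips never leave the page.
        float l = Math.Max(page.Left, Math.Min(keep.Left, page.Right));
        float r = Math.Max(page.Left, Math.Min(keep.Right, page.Right));
        float b = Math.Max(page.Bottom, Math.Min(keep.Bottom, page.Top));
        float t = Math.Max(page.Bottom, Math.Min(keep.Top, page.Top));

        Box[] candidates =
        {
            new Box(page.Left, page.Bottom, page.Right, b), // below
            new Box(page.Left, t, page.Right, page.Top),    // above
            new Box(page.Left, b, l, t),                    // left
            new Box(r, b, page.Right, t),                   // right
        };

        var result = new List<Box>();
        foreach (Box c in candidates)
        {
            if (!c.IsEmpty) result.Add(c);
        }
        return result;
    }
}
```

For example, assuming a 600 x 1000 page (an assumption, the real page size is not stated) and a keep box of (0, 115, 600, 1000), `OutsideRegions` returns a single strip (0, 0, 600, 115). With iTextSharp, each strip could be wrapped as `new iTextSharp.text.Rectangle(left, bottom, right, top)` and passed as a clean-up region.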

plinth
  • So there is no way to create a new PDF file based only on the content inside that rectangle? Any examples would be helpful; maybe everything will be OK for my document because it contains only text. – rafixwpt Mar 25 '15 at 15:53
  • iText recently started providing redaction functionality. The OP may use that. – mkl Mar 25 '15 at 15:55
  • Could you give me some links to examples for C#? I can't find any. – rafixwpt Mar 25 '15 at 16:12
  • Have a look at [this answer](http://stackoverflow.com/a/24037497/1729265) for a first impression. Additionally, there are other `PdfCleanUpProcessor` constructors that allow you to explicitly select the locations to clean, not merely implicitly via redaction annotations, cf. [this test](https://svn.code.sf.net/p/itext/code/trunk/xtra/src/test/java/com/itextpdf/text/pdf/pdfcleanup/PdfCleanUpProcessorTest.java). Equivalent calls should be possible for C#. – mkl Mar 25 '15 at 16:30
  • [mkl](http://stackoverflow.com/users/1729265/mkl), thanks a lot. After some time, and a lot more compiling, fighting with errors, crying and hating the God, I figured out how to run that PdfCleanUp directly from the *.dll. Now everything works as it should :) – rafixwpt Mar 25 '15 at 23:18
  • @rafixwpt *fighting with errors, crying and hating the God* - ;)) Glad to read that you've got it working eventually. – mkl Mar 26 '15 at 09:14
  • @rafixwpt maybe you could post your solution? – Sam Sippe Mar 31 '15 at 23:43
  • @firstTimeCaller I edited my question; you can see the working method there. – rafixwpt Apr 01 '15 at 22:11