Questions tagged [pdf-manipulation]

56 questions
2
votes
2 answers

How do I execute Ghostscript from a C# program

I am trying to call Ghostscript from my C# program, passing it some args to crop the footer of a PDF file, then overwrite the temp file with the new modified version. I think I'm calling gs.exe incorrectly. Does anyone see a reason that the…
Frantumn
2
votes
2 answers

Co-ordinates of an element in a PDF file using iText

I'm creating a PDF file using the BIRT reporting library. Later I need to digitally sign these files. I'm using iText to digitally sign the document. The issue I'm facing is that I need to place the signature in different places in different reports. I…
Arun P Johny
2
votes
2 answers

free PDF manipulation library or code?

I am thinking of developing a tool for commercial use (I intend to sell it), which will include manipulating document files. The manipulations will include: 1. concatenating several PDF files into one. 2. converting a doc/docx file into a PDF file. 3.…
user1028741
2
votes
4 answers

Web component to redact sensitive data in PDF (or image)

First use case: in our web application, the user scans or uploads a PDF (to the server). Then we let them black out some sensitive data. Right now I have written some code that extracts a TIFF from the PDF and shows it to the user, who draws black rectangles in the places he wants to…
tester.one
1
vote
3 answers

PDF document manipulation

I have several PDFs with the following properties: Each PDF contains a variable number of "documents" with differing number of pages. Each page in a "document" has text such as "Page 3 of 26". I want to be able to automatically identify the first…
bugmenot
1
vote
0 answers

C# Mask or Hide or Remove or Redact certain areas in pdf file

Currently we have a webservice called by clients to get a pdf file. The webservice goes out to another system to fetch that file, returned in hex format. Our webservice then converts the Hex string to bytes and then responds back to the clients with…
techrookie
1
vote
2 answers

Convert content stream of graphical text (consisting of `q` and `Q`) to proper content stream

I have a PDF whose content stream looks like image1. But once I opened the PDF in Adobe DC and tried to change the reading order, the entire content stream changed. (Please see image2.) And here is the link to the source PDF…
SuperNova
1
vote
1 answer

Tag content in pdf

I have a PDF which looks like the one below. I would like to tag the paragraph as 'paragraph'. I have searched a lot about this, and there are ways to create a tagged PDF from scratch, or convert HTML content to tagged PDF, but I have not had success in…
SuperNova
1
vote
0 answers

pdf manipulation - tagging image or figure

I have a source PDF (untagged.pdf) from which I will create a tagged version (tagged.pdf). I have information on all the HTML tags for all contents of the source PDF. Now I have a figure on page 3. When I programmatically parse, this will not be…
SuperNova
1
vote
0 answers

React Native: Placing image into existing PDF at specific X,Y

I am currently working on a React Native app, where I would like to place an image into an existing PDF at specific coordinates. First, I create a PDF from an HTML template with the following library, and at this point, I also create a space (let's say some X…
1
vote
1 answer

Sejda merging PDFs from CSV filelist names

I recently installed sejda-console for merging PDF files from the command line. The names of the input PDF files are in a CSV file named filelist-inputs.csv like…
Trimax
1
vote
0 answers

HTML to PDF back to HTML

Trying to wrap my head around this one. Two-part question... Is this possible? I am trying to create an HTML page with certain elements editable (class="editme" contentEditable). Once they click save I want to take that page and convert/save it to…
RooksStrife
1
vote
0 answers

How to merge a PDF into another PDF after every n-th page in an efficient way

I would like to find the best (most efficient and most reliable) solution to do the following: I have one big PDF file A (let's say with 1000 pages). I have another PDF file B (smaller, let's say 2 pages). I want to merge PDF B into PDF A after…
spaudanjo
1
vote
2 answers

Python PyPDF2 join pages

I have a PDF with a big table split across pages, so I need to join the per-page tables into one big table on a single large page. Is this possible with PyPDF2 or another library? Cheers
Felipe Buccioni
1
vote
1 answer

Chinese characters instead of text in the metadata field called "Producer"

I have a problem when I edit the metadata of a PDF with iTextSharp. I save a Word document as a PDF with Word. The field called "Producer" is filled in by Word with the text "Microsoft Word 210". Afterwards, I edit the metadata with iTextSharp and iTextSharp…