Questions tagged [pdfbox]

The Apache PDFBox library is an open source Java tool for working with PDF documents. The project supports creating new PDF documents, manipulating existing documents, and extracting content from documents. Apache PDFBox also includes several command line utilities.

Features:

  • PDF to text extraction
  • Merge PDF Documents
  • PDF Document Encryption/Decryption
  • Lucene Search Engine Integration
  • Fill in form data FDF and XFDF
  • Create a PDF from a text file
  • Create images from PDF pages
  • Print a PDF
  • PDF/A validation

Official Website: http://pdfbox.apache.org/

Latest release: 2.0.21 released on 2020-08-20

3571 questions
1
vote
0 answers

Image Rendering in java throws Out of memory error (using pdfbox)

I was trying to render an image and got an out of memory error on this line: try { BufferedImage image = pdfRenderer.renderImageWithDPI(page-1, 300, ImageType.GRAY); ImageIOUtil.writeImage(image, "G:/Trial/tempImg.png", 300); int…
ANKIT
  • 126
  • 2
  • 11
1
vote
1 answer

PDF rendering using pdfbox

When I try to convert a PDF to an image, I get an "out of memory" error for some PDFs. I increased the heap size, and then got the error again for a different PDF file. For the time being, assume I have no memory leak from other objects. So what…
ANKIT
  • 126
  • 2
  • 11
1
vote
1 answer

Confusion about current transformation matrix in a PDF

I am confused about the current transformation matrix (CTM) in PDFs. For page 5 in this PDF, I have examined the token stream (http://pastebin.com/k6g4BGih), and it shows that the last cm operation before the curve (c) commands sets the…
rivu
  • 2,004
  • 2
  • 29
  • 45
1
vote
1 answer

move PDF content using PDFBox

I need to be able to specify a rectangular area on a PDF page and move the text and graphic content of that area to a new location on the same page using PDFBox. Any graphics (lines, pictures, etc) will each move as a whole unit if selected in the…
DavesPlanet
  • 576
  • 5
  • 14
1
vote
0 answers

PDFbox how to create a PADES-LTV sample

I'm using PDFBox 2.0. I would like to create a PDF in PAdES-LTV format, but I don't know the steps to do so. My question is about the LTV parameters and when they are applied. I need to know at what point they are added and how. I put part of my code if you can…
Leuqarut
  • 21
  • 1
  • 4
1
vote
1 answer

Add image in pdf using pdfbox at a particular cell

I am using PDFBox to generate a PDF. I want to make a letterhead. I am not able to place the image at the front of the PDF; instead it appears at the end of the document. Why is it not appearing at the front?
1
vote
1 answer

pdf reading via pdfbox in java

I have encountered a problem while reading a PDF using PDFBox. My actual PDF is partially unreadable, so when I copy and paste the unreadable part into an editor it shows little box symbols, but when I try to read the same file via PDFBox, those…
ANKIT
  • 126
  • 2
  • 11
1
vote
1 answer

PDFBox overlapped links in index document

I'm a complete newbie to PDFBox and I'm having an issue that I can't find a way to solve at the moment. I get from my database a list of folders and the documents located in those folders, and I iterate over all this data to generate an index with active links…
1
vote
0 answers

How to shrink content of pdf page according to media box dimensions in pdfbox 2.0

How do I resize PDF page content to its media box size in Apache PDFBox 2.0? My application will receive PDF documents as input with different margins. If the margin is "x+1" inch, then the content has to be fit into the predefined media box dimensions 612…
1
vote
0 answers

Getting ClassNotFound Exception while running my program

I am getting a ClassNotFoundException, and I have included the fontbox and pdfbox jar files in my classpath. package com.KyaHub.action; import java.io.File; import java.io.FileInputStream; import java.io.IOException; import…
Dipti Kadu
  • 39
  • 5
1
vote
1 answer

Apache Tika and Apache PDFBox 2.0

We are using Tika 1.4. Now we need to use PDFBox 2.0.1 for digital signatures. I can see that some PDFBox classes are used in Tika. Is PDFBox part of Tika? If so, I needn't add PDFBox separately, do I? Is Tika 1.13 backward compatible…
Cybermonk
  • 514
  • 1
  • 6
  • 28
1
vote
0 answers

PDFBox Nested Colored Tables with Word Wrap

I am trying to create a table which has multiple nested and formatted tables or rows in it. Currently I have created this, whereas I am trying to create something more similar to this. My data comes from an XML file which I have full access to. I…
codeCompiler77
  • 508
  • 7
  • 22
1
vote
1 answer

PDFBox Arabic text not shown when retrieved from MySQL database

I want to display a PDF report in Arabic, generated from a MySQL database. Here is my code: protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException { // TODO Auto-generated…
Nadeem Mohammed
  • 559
  • 1
  • 6
  • 14
1
vote
1 answer

PDFBox import error in intellij

The PDFBox jar is added in IntelliJ (Setting > Project Structure > Modules > Dependencies), and I have added the Gradle dependency as testCompile 'org.apache.pdfbox:pdfbox:2.0.1' in the Gradle build, and the build is successful. Even after this, importing 'import…
Sera
  • 21
  • 1
  • 6
1
vote
2 answers

Using pdfbox to convert a color PDF to a b/w tiff

I am having a bit of a problem converting some color PDFs to TIFF images. The PDFs I am having problems with have handwritten signatures in blue ink. These signatures do not appear in the generated binary TIFFs. I suspect there is a threshold…
Safford96
  • 23
  • 1
  • 7