  1. Read a PDF file from Amazon S3 (using boto)
  2. Save it locally as 123.pdf
  3. Open and parse the locally saved PDF using PDFlib TET

I can currently perform all three steps above, but I want to skip step 2 to save on disk I/O.

It looks like tet_open_document_mem can make TET open a document from memory, but I could not find any documentation on how to use it.

comiventor

1 Answer


TET offers the so-called PDFlib Virtual Filesystem (PVF) to handle this situation.

You may use create_pvf() to create a named virtual read-only file from data provided in memory.

The API looks like this (C):

void TET_create_pvf(TET *tet, const char *filename, int len, const void *data, size_t size, const char *optlist)

So it might be used like this:

TET_create_pvf(tet, pvfname, 0, data, length, "");
doc = TET_open_document(tet, pvfname, 0, docoptlist);
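Since the question uses boto, here is a minimal Python sketch of the same idea. It assumes the TET Python binding exposes the calls as `create_pvf(filename, data, optlist)`, `open_document(filename, optlist)` (returning -1 on failure), `get_errmsg()`, `close_document(doc)` and `delete_pvf(filename)`; please check these against the manual for your TET version. The helper name `open_document_from_bytes` and the PVF name `/pvf/input.pdf` are hypothetical and chosen only for illustration.

```python
from contextlib import contextmanager


@contextmanager
def open_document_from_bytes(tet, data, pvfname="/pvf/input.pdf", docoptlist=""):
    """Open an in-memory PDF through a PVF file and yield the document handle.

    `tet` is assumed to be a TET object from the TET Python binding and
    `data` the raw PDF bytes. Helper and PVF names are hypothetical.
    """
    # Register the in-memory bytes under a virtual file name.
    tet.create_pvf(pvfname, data, "")
    try:
        # Open the virtual file exactly like a file on disk.
        doc = tet.open_document(pvfname, docoptlist)
        if doc == -1:
            raise RuntimeError("open_document failed: " + tet.get_errmsg())
        try:
            yield doc
        finally:
            tet.close_document(doc)
    finally:
        # Remove the virtual file once the document is closed
        # (or if opening it failed).
        tet.delete_pvf(pvfname)
```

With boto 2 the bytes could come from something like `key.get_contents_as_string()`, and the handle would then be used inside `with open_document_from_bytes(tet, data) as doc:`, so nothing is written to disk.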

More details can be found in the TET manual: http://www.pdflib.com/fileadmin/pdflib/pdf/manuals/TET-4.3-manual.pdf

TET_open_document_mem is an old API which is no longer supported.

rjs