3

For some reason I need to store some files (mostly images or pdfs) into my database (PG 9.2.20). Those files are uploaded by users and when I download them back, they are corrupted.

  • I'm working with nodejs.
  • The column type I store the file is BYTEA.

This is how I store them :

const { files, fields } = await asyncBusboy(ctx.req);
const fileName = files[0].filename;
const mimeType = files[0].mimeType;
const bufferedFile = fs.readFileSync(files[0].path, { encoding: 'hex' });
const fileData = `\\x${bufferedFile}`;

//Just a basic insert into with knex.raw
const fileId = await storageModel.create(fields.name, fields.description, fileName, mimeType, fileData, ctx.user);

And this is how I retrieve my file :

const file = await storageModel.find(ctx.params.fileId, ctx.user);
ctx.body = Buffer.from(file.file_bin, 'hex');
ctx.set('Content-disposition', `attachment; filename=${file.file_name}`);

The file is corrupted, and of course, if I look closely, the uploaded file and the one I downloaded are different. See hex screenshot, there is some additional data at the start of the downloaded one : https://i.stack.imgur.com/0Pq33.jpg

After some more testing I can tell the problem lies into the koa part, when I put the buffer into the ctx.body. It got corrupted (???)

EDIT : I was working with Swagger UI : https://github.com/swagger-api/swagger-ui/issues/1605

thesearentthedroids
  • 588
  • 1
  • 7
  • 24

1 Answers1

3

You should not use bytea as a regular text string. You should pass in type Buffer directly, and have the driver escape it for you correctly.

Not sure which driver you are using, but for example...

vitaly-t
  • 24,279
  • 15
  • 116
  • 138
  • I think knexjs escape Buffers as it should be. I wasnt sure so I continued to search with replacing my BYTEA column by just a TEXT one. So now I use the .toString method on my buffer to store it and then when I retrieve the data I use Buffer.from (both in base 64) to recreate a buffer, then I return it. My files are still corrupted (with always the same offset as in my screenshot) – thesearentthedroids Jul 15 '17 at 14:27
  • @thesearentthedroids That is a very a bad approach. You should instead inquire how to insert binary data within knex, and keep type as `bytea`. I can't offer more details, as I'm not familiar with knex myself. – vitaly-t Jul 15 '17 at 15:00
  • @vitaly-t, So to be clear, you suggest to save buffers in "text" postgreSQL columns? The pbkdf2 in nodejs produces buffers and I want to store it in the postgreSQL. I was searching around and I found your answer. Thanks – slevin May 26 '18 at 22:49
  • 1
    @slevin No, I never suggested that. You normally use type `bytea`, and save type `Buffer` directly, let the library escape it automatically. – vitaly-t May 26 '18 at 23:56