86

I want to download a PDF file with axios and save it to disk (server side) with fs.writeFile. I have tried:

axios.get('https://xxx/my.pdf', {responseType: 'blob'}).then(response => {
    fs.writeFile('/temp/my.pdf', response.data, (err) => {
        if (err) throw err;
        console.log('The file has been saved!');
    });
});

The file is saved, but the content is broken.

How do I correctly save the file?

ar099968
  • you get the console log "The file has been saved" and the file is created and just the content is wrong? – Roland Starke Mar 27 '19 at 10:19
  • Where are you calling axios.get? It will not wait for the file to be written. Better to promisify fs, use fs-extra, or use the promisified methods from fs, and then use it like return fs.writeFile(...) – AZ_ Mar 27 '19 at 10:21
  • @RolandStarke yes, the file is saved – ar099968 Mar 27 '19 at 10:27
  • 1
    I've posted a cleaner approach to solve the problem using node stream pipelines below. It's based on the same concept that the accepted answer proposes. stackoverflow.com/a/64925465/3476378 – Aman Saraf Nov 20 '20 at 07:39

13 Answers

139

Actually, I believe the previously accepted answer has some flaws, as it will not handle the writestream properly, so if you call "then()" after Axios has given you the response, you will end up having a partially downloaded file.

This is a more appropriate solution when downloading slightly larger files:

import { createWriteStream } from 'fs';
import Axios from 'axios';

export async function downloadFile(fileUrl: string, outputLocationPath: string) {
  const writer = createWriteStream(outputLocationPath);

  return Axios({
    method: 'get',
    url: fileUrl,
    responseType: 'stream',
  }).then(response => {

    //ensure that the user can call `then()` only when the file has
    //been downloaded entirely.

    return new Promise((resolve, reject) => {
      let error = null;
      //register the handlers before piping, so that no event is missed
      writer.on('error', err => {
        error = err;
        writer.close();
        reject(err);
      });
      writer.on('close', () => {
        if (!error) {
          resolve(true);
        }
        //no need to call the reject here, as it will have been called in the
        //'error' handler;
      });
      response.data.pipe(writer);
    });
  });
}

This way, you can call downloadFile() and then call then() on the returned promise, knowing that the downloaded file has been completely written by the time it resolves.

Or, if you use a more modern version of NodeJS, you can try this instead:

import * as stream from 'stream';
import { promisify } from 'util';
import { createWriteStream } from 'fs';
import Axios from 'axios';

const finished = promisify(stream.finished);

export async function downloadFile(fileUrl: string, outputLocationPath: string): Promise<any> {
  const writer = createWriteStream(outputLocationPath);
  return Axios({
    method: 'get',
    url: fileUrl,
    responseType: 'stream',
  }).then(response => {
    response.data.pipe(writer);
    return finished(writer); //this is a Promise
  });
}
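
A minimal sketch of the same wait-for-completion idea that runs without a network call. The helper name saveStream and the fakeBody stand-in are hypothetical and not part of the answer; with axios, the readable would presumably be response.data from a request made with responseType: 'stream'.

```javascript
const fs = require('fs');
const { Readable, finished } = require('stream');
const { promisify } = require('util');

const finishedAsync = promisify(finished);

// Hypothetical helper: pipes any readable stream into a file and resolves
// only after the write stream has flushed everything.
async function saveStream(readable, outputPath) {
  const writer = fs.createWriteStream(outputPath);
  readable.pipe(writer);
  await finishedAsync(writer); // rejects if the writer emits 'error'
  return outputPath;
}

// Hypothetical stand-in for an HTTP body that arrives in several chunks.
function fakeBody() {
  return Readable.from([
    Buffer.from('%PDF-'),
    Buffer.from('1.3\n'),
    Buffer.from([0xe2, 0xe3, 0xcf, 0xd3]),
  ]);
}

// Assumed usage with axios (not executed here):
//   const response = await Axios({ url: fileUrl, responseType: 'stream' });
//   await saveStream(response.data, outputLocationPath);
```

Keep in mind that pipe() does not forward errors from the source stream to the writer, so a production version would probably also listen for 'error' on the readable.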
Joel
csotiriou
  • 1
    This is correct and solves exactly the problem related to Partial data error – Giorgio Andretti Apr 27 '20 at 13:12
  • 2
    this should be the accepted answer. it fixed the partial download error – ariezona May 12 '20 at 18:10
  • 2
    I've posted a cleaner approach, based on the same concept as yours, using stream pipelines below: https://stackoverflow.com/a/64925465/3476378. – Aman Saraf Nov 20 '20 at 07:38
  • It's better to register close and error event handlers before piping the response stream on the write stream. – Mohammed Essehemy Jan 19 '21 at 09:49
  • 1
    Does this wait for the file to be completely downloaded before starting to write using `writer`? Ie, that doesn't seem like it is really streaming read to write in that case as it's waiting for one to finish before the other begins. – 1252748 Feb 26 '21 at 01:36
  • 2
    I'm not sure I follow. As bytes are downloaded, they are being streamed to a file, and once all bytes have been streamed, the Promise ends, and the rest of the application flow continues. The 'then' in the example is called before the file has finished downloading - check the documentation about the `stream` responseType of axios. – csotiriou Feb 27 '21 at 22:17
  • 1
    good, but, why not use await ? (await axios .... followed by pipe, and finaly return the promise returned by finished, there are no cases where you have to use then, and only very few cases where you need explicit promises (mostly to wrap old code) – Martijn Scheffer Apr 24 '21 at 20:38
  • 7
    response.data.pipe is not a function – rendom Jun 12 '21 at 05:56
  • A very good answer, worked for me after hours of searching. – Aerodynamika Aug 17 '22 at 09:23
  • There are easier ways nowadays. Scroll down! – Mattia Rasulo Aug 30 '22 at 08:02
89

You can simply use response.data.pipe and fs.createWriteStream to pipe the response to a file:

axios({
    method: "get",
    url: "https://xxx/my.pdf",
    responseType: "stream"
}).then(function (response) {
    response.data.pipe(fs.createWriteStream("/temp/my.pdf"));
});
ponury-kostek
  • Thank you so much!! Was looking for this forever – Harrison Cramer Feb 26 '20 at 01:16
  • 1
    This answer is not complete, because when you download some larger files, the pipe will give you more than one events. This code does not wait until the whole file has been downloaded before one can call `then` on it. Take a look at my solution to find what I consider a more complete solution. – csotiriou Apr 17 '20 at 10:33
  • 32
    response.data.pipe is not a function – Murat Serdar Akkuş Jun 11 '20 at 01:36
  • if to not download file on local storage then how to do that i tried res.sendFile in node.js – s.j Apr 16 '21 at 08:06
  • 1
    To critique this solution, you **must** set the responseType to "stream". Not doing so will cause an error when you try to pipe it to another stream. – RashadRivera Oct 21 '22 at 15:02
31

The problem with the broken file is caused by backpressure in Node streams. You may find this link useful: https://nodejs.org/es/docs/guides/backpressuring-in-streams/

I'm not really a fan of using Promise-based declarative objects in JS code, as I feel they pollute the core logic and make the code hard to read. On top of that, you have to set up event handlers and listeners to make sure the code has completed.

A cleaner approach, using the same logic that the accepted answer proposes, is given below. It uses stream pipelines.

const util = require('util');
const stream = require('stream');
const fs = require('fs');
const axios = require('axios');
const pipeline = util.promisify(stream.pipeline);

const downloadFile = async () => {
  try {
    const request = await axios.get('https://xxx/my.pdf', {
      responseType: 'stream',
    });
    await pipeline(request.data, fs.createWriteStream('/temp/my.pdf'));
    console.log('download pdf pipeline successful');   
  } catch (error) {
    console.error('download pdf pipeline failed', error);
  }
}

exports.downloadFile = downloadFile
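
One thing the pipeline approach makes easy is cleaning up after a failed download. The following is only a sketch under assumptions: the helper name saveOrCleanUp is hypothetical, and it takes any readable stream (with axios, presumably request.data from the code above) so that it can be tried without a network call.

```javascript
const fs = require('fs');
const stream = require('stream');
const util = require('util');

const pipeline = util.promisify(stream.pipeline);

// Hypothetical helper: streams `readable` into `outputPath` and deletes the
// partial file if anything in the pipeline fails.
async function saveOrCleanUp(readable, outputPath) {
  try {
    await pipeline(readable, fs.createWriteStream(outputPath));
    return outputPath;
  } catch (error) {
    // force: true makes rm a no-op when the file was never created
    await fs.promises.rm(outputPath, { force: true });
    throw error; // let the caller decide how to report the failure
  }
}

// Assumed usage with axios (not executed here):
//   const request = await axios.get(url, { responseType: 'stream' });
//   await saveOrCleanUp(request.data, '/temp/my.pdf');
```

Because pipeline destroys every stream in the chain when one of them fails, the write stream is closed before the partial file is removed.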

I hope you find this useful.

Aman Saraf
14

The following code, taken from https://gist.github.com/senthilmpro/072f5e69bdef4baffc8442c7e696f4eb?permalink_comment_id=3620639#gistcomment-3620639, worked for me:

const res = await axios.get(url, { responseType: 'arraybuffer' });
fs.writeFileSync(downloadDestination, res.data);
12
// This works perfectly well!
const axios = require('axios');
const fs = require('fs');

axios.get('http://www.sclance.com/pngs/png-file-download/png_file_download_1057991.png', {responseType: "stream"})
    .then(response => {
        // Save the file to the working directory
        response.data.pipe(fs.createWriteStream("todays_picture.png"));
    })
    .catch(error => {
        console.log(error);
    });
Armand
  • 10
    Welcome to StackOverflow! You may want to provide a bit of explanation to go along with your code sample. – Airn5475 Sep 05 '19 at 13:06
  • 2
    this is not going to work properly, as it will not wait until the file has finished downloading before continuing the promise chain. – csotiriou Feb 28 '21 at 10:08
6

Node's fs.writeFile encodes string data as UTF-8 by default, which could be a problem in your case.

Try setting the encoding to null to skip encoding the received data:

fs.writeFile('/temp/my.pdf', response.data, {encoding: null}, (err) => {...});

You can also declare the encoding as a string (instead of an options object) if you need no other options; the string is then treated as the encoding value. It must be a valid encoding name (the string 'null' is rejected as an unknown encoding), for example:

fs.writeFile('/temp/my.pdf', response.data, 'binary', (err) => {...});

Read more in the Node.js fs API documentation for writeFile.
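
To see why treating binary data as UTF-8 text is lossy, here is a small sketch that needs no network or disk access. The byte values are sample data chosen for illustration (the start of a PDF header followed by four non-ASCII bytes), not taken from the question's file.

```javascript
// Sample bytes: "%PDF-1.3\n%" followed by four non-ASCII bytes.
const original = Buffer.from([
  0x25, 0x50, 0x44, 0x46, 0x2d, 0x31, 0x2e, 0x33, 0x0a, 0x25,
  0xe2, 0xe3, 0xcf, 0xd3,
]);

// What happens when the body is decoded to a UTF-8 string and written back:
// bytes that are not valid UTF-8 are replaced during decoding.
const asText = original.toString('utf8');
const roundTripped = Buffer.from(asText, 'utf8');

// Keeping the data as bytes the whole way preserves it.
const copied = Buffer.from(original);

console.log(original.equals(roundTripped)); // false
console.log(original.equals(copied));       // true
```

Once the bytes have been decoded into a string this way, no encoding option passed to writeFile can bring the original bytes back, so it is safer to receive the response as binary data in the first place.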

bendytree
fedesc
  • @double-beep tnx for your comment. i've edited with some explanation and read material from `node fileSystem API` about writeFile function. :) – fedesc Mar 27 '19 at 11:56
5

There is a much simpler way that can be accomplished in a couple of lines:

import fs from 'fs';
import axios from 'axios';

const fsPromises = fs.promises;

const fileResponse = await axios({
    url: fileUrl,
    method: "GET",
    responseType: "stream",
});

// Write the file to disk. fs.promises.writeFile accepts a stream in recent
// Node versions; writeFileSync does not, so it is not a drop-in replacement here.
await fsPromises.writeFile(filePath, fileResponse.data);

Axios can handle streams internally, so you don't necessarily need to meddle with low-level Node APIs for that.

Check out https://axios-http.com/docs/req_config (find the responseType part in the docs for all the types you can use).
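
A sketch of the stream-to-writeFile step on its own, so it can be tried without axios. It assumes a Node version in which fs.promises.writeFile accepts streams (to my knowledge, 14.18 / 15.14 or later); the helper name writeStreamToFile is hypothetical.

```javascript
const fs = require('fs');
const { Readable } = require('stream');

// Hypothetical helper: hands a readable stream straight to
// fs.promises.writeFile, which consumes it chunk by chunk.
async function writeStreamToFile(readable, filePath) {
  await fs.promises.writeFile(filePath, readable);
  return filePath;
}

// Hypothetical stand-in for fileResponse.data.
function sampleStream() {
  return Readable.from([Buffer.from('first chunk, '), Buffer.from('second chunk')]);
}

// Assumed usage with axios (not executed here):
//   await writeStreamToFile(fileResponse.data, filePath);
```

The returned promise settles only after the stream has been fully consumed, so no extra event handling is needed.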

Mattia Rasulo
3

I have tried it, and I can confirm that using response.data.pipe with fs.createWriteStream works.

Besides, I want to add my situation and solution.

Situation:

  • using koa to develop a Node.js server
  • using axios to get a PDF via URL
  • using pdf-parse to parse the PDF
  • extracting some information from the PDF and returning it as JSON to the browser

Solution:

const Koa = require('koa');
const app = new Koa();
const axios = require('axios')
const fs = require("fs")
const pdf = require('pdf-parse');
const utils = require('./utils')

app.listen(process.env.PORT || 3000)

app.use(async (ctx, next) => {
    let url = 'https://path/name.pdf'
    let resp = await axios({
        url: encodeURI(url),
        responseType: 'arraybuffer'
    })

    let data = await pdf(resp.data)

    ctx.body = {
        phone: utils.getPhone(data.text),
        email: utils.getEmail(data.text),
    }
})

This solution doesn't need to write the file to disk and read it back, so it's more efficient.

levy9527
  • How is buffering the entire data file in memory before sending it "more efficient" than streaming it? – RashadRivera Oct 06 '22 at 14:09
  • I don't get your point? What do you mean by saying "sending"? In my situation, I only need to parse pdf file, no need to respond – levy9527 Oct 08 '22 at 08:03
3

If you just want the file, use this:

const { writeFile } = require("fs")
const axios = require("axios")

const media_data = await axios({url: url, method: "get", responseType: "arraybuffer"})
writeFile("./image.jpg", Buffer.from(media_data.data), {encoding: "binary"}, console.log)
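
A sketch of the write step in isolation, assuming the response body may arrive as a Buffer, an ArrayBuffer, or a typed array. The helper names toBuffer and saveBinary are hypothetical. When the data is already a Buffer, writeFile ignores the encoding option, so it is left out here.

```javascript
const fs = require('fs');

// Hypothetical helper: accepts a Buffer, an ArrayBuffer, or a typed array
// and returns a Buffer over the same bytes.
function toBuffer(data) {
  if (Buffer.isBuffer(data)) return data;
  if (ArrayBuffer.isView(data)) {
    return Buffer.from(data.buffer, data.byteOffset, data.byteLength);
  }
  return Buffer.from(data); // ArrayBuffer
}

// Hypothetical helper: writes binary data as-is; no encoding option is
// needed when the data is a Buffer.
async function saveBinary(filePath, data) {
  await fs.promises.writeFile(filePath, toBuffer(data));
  return filePath;
}

// Assumed usage with axios (not executed here):
//   const media_data = await axios({ url, method: "get", responseType: "arraybuffer" });
//   await saveBinary("./image.jpg", media_data.data);
```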
Seth Samuel
2

This is what worked for me and it also creates a temporary file for the image file in case the output file path is not specified:

const fs = require('fs')
const axios = require('axios').default
const tmp = require('tmp');

const downloadFile = async (fileUrl, outputLocationPath) => {
    if(!outputLocationPath) {
        outputLocationPath = tmp.fileSync({ mode: 0o644, prefix: 'kuzzle-listener-', postfix: '.jpg' });
    }
    // tmp.fileSync returns an object; a caller-supplied path is a plain string
    let path = typeof outputLocationPath === 'object' ? outputLocationPath.name : outputLocationPath
    const writer = fs.createWriteStream(path)
    const response = await axios.get(fileUrl, { responseType: 'arraybuffer' })
    return new Promise((resolve, reject) => {
        // register the handlers before writing, so that no event is missed
        writer.on('error', err => {
            writer.close()
            reject(err)
        })
        // resolve with the path in both cases, once the file has been closed
        writer.on('close', () => resolve(path))
        if(response.data instanceof Buffer) {
            // end() writes the buffer and then closes the stream
            writer.end(response.data)
        } else {
            response.data.pipe(writer)
        }
    })
}

Here is a very simple Jest test:

it('downloadFile should download the file', () => {
    // return the promise so that Jest waits for the assertions
    return downloadFile('https://i.ytimg.com/vi/HhpbzPMCKDc/hq720.jpg').then((file) => {
        console.log('file', file)
        expect(file).toBeTruthy()
        expect(file.length).toBeGreaterThan(10)
    })
})
gil.fernandes
2

Lorenzo's response is probably the best answer as it's using the axios built-in. Here's a simple way to do it if you just want the buffer:

const downloadFile = url => axios({ url, responseType: 'stream' })
  .then(({ data }) => {
    const buff = []
    data.on('data', chunk => buff.push(chunk))
    return new Promise((resolve, reject) => {
      data.on('error', reject)
      data.on('close', () => resolve(Buffer.concat(buff)))
    })
  })

// better: an alternative to the version above (use one or the other)
const downloadFile = url => axios({ url, responseType: 'arraybuffer' }).then(res => res.data)

const res = await downloadFile(url)
fs.writeFileSync(downloadDestination, res)

I'd still probably use the 'arraybuffer' responseType.

Aronanda
-3
import axios from "axios";
import download from "downloadjs";

export const downloadFile = async (fileName) => {
    const response = await axios({
        method: "get",
        url: `/api/v1/users/resume/${fileName}`,
        responseType: "blob",
    });
    download(response.data, fileName);
};

This works fine for me (in the browser; downloadjs triggers a client-side download).

-4

This is my example code, run with Node.js. There is a syntax error in yours:

it should be writeFile, not WriteFile.

const axios = require('axios');
const fs = require('fs');
axios.get('http://www.africau.edu/images/default/sample.pdf', {responseType: 'blob'}).then(response => {
  fs.writeFile('./my.pdf', response.data, (err) => {
        if (err) throw err;
        console.log('The file has been saved!');
    });
});

After the file is saved, it might look like this in a text editor, but the file was saved properly:

%PDF-1.3
%����

1 0 obj
<<
/Type /Catalog
/Outlines 2 0 R
/Pages 3 0 R
>>
endobj

2 0 obj
<<
/Type /Outlines
/Count 0
>>
endobj

3 0 obj
<<
/Type /Pages
/Count 2
/Kids [ 4 0 R 6 0 R ] 
>>
endobj