I'm trying to download a PDF using the latest google-chrome and chromedriver on an Ubuntu 16.04 LTS VPS with the following code.

import json
import time
from pyvirtualdisplay import Display
from selenium import webdriver

display = Display(visible=0, size=(1768, 1368))
display.start()
chrome_options = webdriver.ChromeOptions()
# chrome_options.add_argument('--headless')
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("--disable-notifications")
chrome_options.add_argument("--disable-popup-blocking")
chrome_options.add_argument("--disable-logging")
chrome_options.add_argument("--log-level=3")
chrome_options.add_argument("--kiosk-printing")
appState = {
    "recentDestinations": [{"id": "Save as PDF", "origin": "local"}],
    "selectedDestinationId": "Save as PDF",
    "version": 2,
}

prefs = {
    "printing.print_preview_sticky_settings.appState": json.dumps(appState),
    "download": {
        "default_directory": "/path/to/dir/",
        "prompt_for_download": False,
        "directory_upgrade": True,
    },
}
chrome_options.add_experimental_option("prefs", prefs)
driver = webdriver.Chrome(options=chrome_options)  # "chrome_options=" is the deprecated keyword
driver.get(
    "https://www.adobe.com/content/dam/acom/en/accessibility/products/acrobat/pdfs/acrobat-x-accessibility-checker.pdf"
)
time.sleep(10)
driver.execute_script("window.print();")
time.sleep(30)
driver.quit()
display.stop()

When I test the above code locally, the file is downloaded, but it lands in the system's default download directory instead of /path/to/dir/.

However, when the same code runs on the VPS, nothing is downloaded at all.

Things I've tried so far:

  • Looking for any PDF downloaded by the script using locate -i *.pdf (this confirms that no new PDFs were downloaded)
  • Setting an environment variable using: export XDG_DOWNLOAD_DIR='path/to/dir'
  • Running the command: xdg-user-dirs-update --set DOWNLOAD path/to/dir
  • Verifying the default download directory using the command: xdg-user-dir DOWNLOAD (it shows the system's default download folder)

Nothing has worked so far. Any help would be appreciated!

NOTE: I know it is possible to download the file by making a GET request with modules like requests, urllib3, etc. I'm specifically looking for a Selenium-based solution.

Wasi
  • Have you checked that the directory is accessible by the user? I use the same approach, `prefs = {"download.default_directory": dl_location}`, and it works perfectly for me. – Prashant Godhani Sep 28 '19 at 18:51
  • Yes, I did. I even tried running the script as root, so permissions shouldn't be an issue. I also tried setting `"download.default_directory"` to a non-existent folder on the VPS. That trick got the PDF saved in the system's default directory, but if I give the path of some other existing directory, nothing is downloaded, which is odd! – Wasi Sep 28 '19 at 19:00

1 Answer

I ran into a similar problem and was able to find the following solution:

The download.default_directory setting only applies to downloaded content. Chrome handles files saved from a page, such as the PDF produced by window.print(), separately. To change the folder used for a printout of the page, set the savefile.default_directory value as well.

To update the output directory for window printing, extend your prefs like this:

prefs = {
    "printing.print_preview_sticky_settings.appState": json.dumps(appState),
    "download": {
        "default_directory": "/path/to/dir/",
        "prompt_for_download": False,
        "directory_upgrade": True,
    },
    # Used by "Save as PDF" printing; same absolute path as the download directory
    "savefile.default_directory": "/path/to/dir/",
}
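
As a rough illustration of how these prefs might be assembled, here is a minimal sketch. The helper names `build_print_prefs` and `wait_for_pdf` are hypothetical and not part of Selenium or Chrome. Creating the directory up front and passing an absolute path are assumptions on my part, prompted by the comment above about Chrome falling back to the default directory for a path it could not use. The polling helper is only a suggested replacement for the fixed time.sleep(30).

```python
import json
import os
import time


def build_print_prefs(target_dir):
    """Build Chrome prefs that point downloads and print-to-PDF at target_dir.

    Hypothetical helper: creates the directory if needed and uses its
    absolute path for both settings (an assumption, not a documented rule).
    """
    target_dir = os.path.abspath(target_dir)
    os.makedirs(target_dir, exist_ok=True)
    app_state = {
        "recentDestinations": [{"id": "Save as PDF", "origin": "local"}],
        "selectedDestinationId": "Save as PDF",
        "version": 2,
    }
    return {
        "printing.print_preview_sticky_settings.appState": json.dumps(app_state),
        "download": {
            "default_directory": target_dir,
            "prompt_for_download": False,
            "directory_upgrade": True,
        },
        # Directory used for the "Save as PDF" output of window.print()
        "savefile.default_directory": target_dir,
    }


def wait_for_pdf(target_dir, timeout=30.0, poll=0.5):
    """Poll target_dir until a .pdf file shows up.

    Returns the path of the first PDF found (in sorted order),
    or None if the timeout expires first.
    """
    deadline = time.monotonic() + timeout
    while True:
        for name in sorted(os.listdir(target_dir)):
            if name.lower().endswith(".pdf"):
                return os.path.join(target_dir, name)
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)
```

The returned dict can be passed to chrome_options.add_experimental_option("prefs", ...) exactly as in the question, and wait_for_pdf can be called after driver.execute_script("window.print();") to confirm that a file actually appeared before calling driver.quit().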
Jack Hubert