How can I read and store the JavaScript-rendered HTML of a website using Selenium in headless mode? I wrote the code below, but the div tag I want to store is not rendered in headless mode, whereas the same script run in non-headless mode returns the tag with its contents.

from selenium.webdriver.chrome.options import Options
from selenium import webdriver
import time
from bs4 import BeautifulSoup

userAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36"
options = Options()
options.add_argument(f"user-agent={userAgent}")
options.add_argument("--headless")

driver = webdriver.Chrome(options=options, executable_path="C:/Users/User/chromedriver.exe")
driver.get("https://www.yeezysupply.com")
time.sleep(2)

html_src = driver.page_source
soup = BeautifulSoup(html_src, "lxml")

print(soup.prettify())

The website's HTML:

<div id="app" role="main">
   <div class="header___2fE4A">
    <div class="header-item___19I2L">
     <button aria-hidden="false" aria-label="Close menu" class="navigation-menu-trigger___1TiqR" data-auto-id="menu-close-button">
      Menu
     </button>
    </div>
    <div class="minicart-link___12Tm8 header-item___19I2L header-item__right___1TF3M">
     <div>
      <a class="cartlink___XXQml" data-auto-id="yeezy-mini-basket" href="/cart">
      </a>
     </div>
    </div>
   </div>
   <div class="minicart-modal___2Q8-_">
   </div>
   <div class="container___3PPPZ">
    <div class="main___2aRHM">
     <div class="bloom_plp___Zq2FC">
      <div class="row gl-align-items-center gl-vspacing-all-large">
       <div class="col-s-12 col-l-8">
        <div class="col-s-12 col-l-24 gl-text-center">
         <a class="" data-auto-id="yeezy-plp-bloom-link" href="/product/FV3258" title="700 MNVN ADULTS">
          <img alt="700 MNVN ADULTS" class="img_with_fallback___2aHBu image___1XIlT" data-auto-id="yeezy-plp-product-img" src="https://assets.yeezysupply.com/images/w_800,f_auto,q_auto:sensitive,fl_lossy/e4a5a081a5e44ebca8d0ab4c015f4adb_ce49/700_MNVN_ADULTS_FV3258_FV3258_04_standard.png" style=""/>
         </a>
        </div>
        <article class="col-s-12 col-l-24">
         <a class="bloom_plp_available___3vKoX gl-text-center" data-auto-id="yeezy-plp-bloom-link-available" href="/product/FV3258" title="700 MNVN ADULTS">
          <p class="denseText___2wFAe gl-text-center" data-auto-id="ys-product-name">
           700 MNVN ADULTS
          </p>
          <p class="denseText___2wFAe gl-text-center" data-auto-id="ys-product-color">
           ORANGE
          </p>
          <p class="denseText___2wFAe gl-text-center">
           IN STORES ONLY. LA &amp; PARIS.
          </p>
          <p class="denseText___2wFAe gl-text-center" data-auto-id="ys-product-name">
           February 28
          </p>
         </a>
        </article>
       </div>
       <div class="col-s-12 col-l-8">
        <div class="col-s-12 col-l-24 gl-text-center">
         <a class="" data-auto-id="yeezy-plp-bloom-link" href="/product/FX3354" title="700 MNVN KIDS">
          <img alt="700 MNVN KIDS" class="img_with_fallback___2aHBu image___1XIlT" data-auto-id="yeezy-plp-product-img" src="https://assets.yeezysupply.com/images/w_800,f_auto,q_auto:sensitive,fl_lossy/4413ccfad2274aadaab1ab4c015f48a6_ce49/700_MNVN_KIDS_FX3354_FX3354_04_standard.png" style=""/>
         </a>
        </div>
        <article class="col-s-12 col-l-24">
         <a class="bloom_plp_available___3vKoX gl-text-center" data-auto-id="yeezy-plp-bloom-link-available" href="/product/FX3354" title="700 MNVN KIDS">
          <p class="denseText___2wFAe gl-text-center" data-auto-id="ys-product-name">
           700 MNVN KIDS
          </p>
          <p class="denseText___2wFAe gl-text-center" data-auto-id="ys-product-color">
           ORANGE
          </p>
          <p class="denseText___2wFAe gl-text-center">
           IN STORES ONLY. LA &amp; PARIS.
          </p>
          <p class="denseText___2wFAe gl-text-center" data-auto-id="ys-product-name">
           February 28
          </p>
         </a>
        </article>
       </div>
       <div class="col-s-12 col-l-8">
        <div class="col-s-12 col-l-24 gl-text-center">
         <a class="" data-auto-id="yeezy-plp-bloom-link" href="/product/FX3355" title="700 MNVN INFANTS">
          <img alt="700 MNVN INFANTS" class="img_with_fallback___2aHBu image___1XIlT" data-auto-id="yeezy-plp-product-img" src="https://assets.yeezysupply.com/images/w_800,f_auto,q_auto:sensitive,fl_lossy/b13df737dd564c8fa44bab4c015f4e08_ce49/700_MNVN_INFANTS_FX3355_FX3355_04_standard.png" style=""/>
         </a>
        </div>
        <article class="col-s-12 col-l-24">
         <a class="bloom_plp_available___3vKoX gl-text-center" data-auto-id="yeezy-plp-bloom-link-available" href="/product/FX3355" title="700 MNVN INFANTS">
          <p class="denseText___2wFAe gl-text-center" data-auto-id="ys-product-name">
           700 MNVN INFANTS
          </p>
          <p class="denseText___2wFAe gl-text-center" data-auto-id="ys-product-color">
           ORANGE
          </p>
          <p class="denseText___2wFAe gl-text-center">
           IN STORES ONLY. LA &amp; PARIS.
          </p>
          <p class="denseText___2wFAe gl-text-center" data-auto-id="ys-product-name">
           February 28
          </p>
         </a>
        </article>
       </div>
      </div>
      <div class="row gl-align-items-center gl-vspacing-all-large">
       <div class="col-s-12 col-l-8">
        <div class="col-s-12 col-l-24 gl-text-center">
         <a class="" data-auto-id="yeezy-plp-bloom-link" href="/product/FV6125" title="YEEZY POWERPHASE">
          <img alt="YEEZY POWERPHASE" class="img_with_fallback___2aHBu image___1XIlT" data-auto-id="yeezy-plp-product-img" src="https://assets.yeezysupply.com/images/w_800,f_auto,q_auto:sensitive,fl_lossy/8453c34c94ae490cb790aad10125695b_ce49/YEEZY_POWERPHASE_FV6125_FV6125_04_standard.png" style=""/>
         </a>
        </div>
        <article class="col-s-12 col-l-24">
         <a class="bloom_plp_available___3vKoX gl-text-center" data-auto-id="yeezy-plp-bloom-link-available" href="/product/FV6125" title="YEEZY POWERPHASE">
          <p class="denseText___2wFAe gl-text-center" data-auto-id="ys-product-name">
           YEEZY POWERPHASE
          </p>
          <p class="denseText___2wFAe gl-text-center" data-auto-id="ys-product-color">
           Quiet Grey
          </p>
          <p class="denseText___2wFAe gl-text-center">
           Yeezy Supply Exclusive
          </p>
          <p class="denseText___2wFAe gl-text-center" data-auto-id="ys-product-orderable">
           Available now
          </p>
          <div class="gl-price gl-text-center gl-price__block___30YeK denseText___2wFAe" data-auto-id="ys-product-price">
           <span class="gl-price__value">
            $120
           </span>
          </div>
         </a>
        </article>
       </div>
       <div class="col-s-12 col-l-8">
        <div class="col-s-12 col-l-24 gl-text-center">
         <a class="" data-auto-id="yeezy-plp-bloom-link" href="/product/FV6126" title="YEEZY POWERPHASE">
          <img alt="YEEZY POWERPHASE" class="img_with_fallback___2aHBu image___1XIlT" data-auto-id="yeezy-plp-product-img" src="https://assets.yeezysupply.com/images/w_800,f_auto,q_auto:sensitive,fl_lossy/b6ffa790975241eea8b8aad101256602_ce49/YEEZY_POWERPHASE_FV6126_FV6126_04_standard.png" style=""/>
         </a>
        </div>
        <article class="col-s-12 col-l-24">
         <a class="bloom_plp_available___3vKoX gl-text-center" data-auto-id="yeezy-plp-bloom-link-available" href="/product/FV6126" title="YEEZY POWERPHASE">
          <p class="denseText___2wFAe gl-text-center" data-auto-id="ys-product-name">
           YEEZY POWERPHASE
          </p>
          <p class="denseText___2wFAe gl-text-center" data-auto-id="ys-product-color">
           Clear Brown
          </p>
          <p class="denseText___2wFAe gl-text-center">
           Yeezy Supply Exclusive
          </p>
          <p class="denseText___2wFAe gl-text-center" data-auto-id="ys-product-orderable">
           Available now
          </p>
          <div class="gl-price gl-text-center gl-price__block___30YeK denseText___2wFAe" data-auto-id="ys-product-price">
           <span class="gl-price__value">
            $120
           </span>
          </div>
         </a>
        </article>
       </div>
       <div class="col-s-12 col-l-8">
        <div class="col-s-12 col-l-24 gl-text-center">
         <a class="" data-auto-id="yeezy-plp-bloom-link" href="/product/FV6129" title="YEEZY POWERPHASE">
          <img alt="YEEZY POWERPHASE" class="img_with_fallback___2aHBu image___1XIlT" data-auto-id="yeezy-plp-product-img" src="https://assets.yeezysupply.com/images/w_800,f_auto,q_auto:sensitive,fl_lossy/7eb2186490734230a20faad101256852_ce49/YEEZY_POWERPHASE_FV6129_FV6129_04_standard.png" style=""/>
         </a>
        </div>
        <article class="col-s-12 col-l-24">
         <a class="bloom_plp_available___3vKoX gl-text-center" data-auto-id="yeezy-plp-bloom-link-available" href="/product/FV6129" title="YEEZY POWERPHASE">
          <p class="denseText___2wFAe gl-text-center" data-auto-id="ys-product-name">
           YEEZY POWERPHASE
          </p>
          <p class="denseText___2wFAe gl-text-center" data-auto-id="ys-product-color">
           Simple Brown
          </p>
          <p class="denseText___2wFAe gl-text-center">
           Yeezy Supply Exclusive
          </p>
          <p class="denseText___2wFAe gl-text-center" data-auto-id="ys-product-orderable">
           Available now
          </p>
          <div class="gl-price gl-text-center gl-price__block___30YeK denseText___2wFAe" data-auto-id="ys-product-price">
           <span class="gl-price__value">
            $120
           </span>
          </div>
         </a>
        </article>
       </div>
      </div>
     </div>
    </div>
    <div class="footer___1Hsf2">
     <footer class="footer___1_Npt">
      <a data-auto-id="ccpa-data-do-not-sell-link">
       Do not sell my info
      </a>
      <span class="separator___1ZOj_">
      </span>
      <a data-auto-id="ccpa-data-settings-link">
       Data settings
      </a>
     </footer>
    </div>
   </div>
  </div>
  <div id="modal-root">
  </div>
  <script id="__LOADABLE_REQUIRED_CHUNKS__" type="application/json">
   []
  </script>
  <script async="" data-chunk="app" src="/glass/react/0cf642c/yeezy/runtime.js">
  </script>
  <script async="" data-chunk="app" src="/glass/react/0cf642c/yeezy/vendor.app.js">
  </script>
  <script async="" data-chunk="app" src="/glass/react/0cf642c/yeezy/app.app.js">
  </script>
  <noscript>
   <img src="https://www.yeezysupply.com/akam/11/pixel_c4f80c3?a=dD0wYWEyOTVlOGNiMGVmOTRiYjg4OWY1NjI1MTkyZmQyNGZlOTY2YjA2JmpzPW9mZg==" style="visibility: hidden; position: absolute; left: -999px; top: -999px;"/>
  </noscript>
  <script type="text/javascript">
   var _cf = _cf || []; _cf.push(['_setFsp', true]);  _cf.push(['_setBm', true]); _cf.push(['_setAu', '/static/bcb1cb0eead157e63de815826062ee8']);
  </script>
  <script src="/static/bcb1cb0eead157e63de815826062ee8" type="text/javascript">
  </script>
 </body>
</html>

In headless mode, the script doesn't render the HTML inside the 'div class="main___2aRHM"' tag.

Here is the output I get in the terminal:

<html class="theme-yeezysupply" data-reactroot="" lang="en" prefix="og: http://ogp.me/ns# fb: http://ogp.me/ns/fb#">
 <head>
  <title data-rh="true">
   YEEZY SUPPLY
  </title>
  <script>
   (function(i,s,o,g,r,a,m){i['InstanaEumObject']=r;i[r]=i[r]||function(){ (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o), m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m) })(window,document,'script','//eum.instana.io/eum.min.js','ineum'); ineum('reportingUrl', 'https://eum-eu-west-1.instana.io'); ineum('autoClearResourceTimings', false); ineum('ignoreErrorMessages',[ /.*Failed to execute 'querySelector'.*/i ]); ineum('apiKey', 'Whstm32ZQCKMZdZ-8ugNSg');
  </script>
  <script>
   bazadebezolkohpepadr="206536812"
  </script>
  <script defer="" src="https://www.yeezysupply.com/akam/11/c4f80dc" type="text/javascript">
  </script>
 </head>
 <body>
  <div style="width:0;height:0;position:absolute;overflow:hidden">
   <?xml version="1.0" encoding="utf-8"?>
   <svg xmlns="http://www.w3.org/2000/svg">
    <symbol id="arrow-down" viewbox="0 0 24 24">
     <path d="M5 9h14l-7 6z" fill="currentColor">
     </path>
    </symbol>
    <symbol id="arrow-up" viewbox="0 0 24 24">
     <path d="M5 15h14l-7-6z" fill="currentColor">
     </path>
    </symbol>
    <symbol id="broken-image" viewbox="0 0 24 24">
     <g fill="none" fill-rule="evenodd" stroke="none" stroke-width="1">
      <g id="Group" stroke="#ABABAB" stroke-width="2" transform="translate(3.000000, 3.000000)">
       <circle cx="9" cy="9" id="Oval" r="8">
       </circle>
       <line id="Path-6" x1="15" x2="3" y1="3" y2="15">
       </line>
      </g>
     </g>
    </symbol>
    <symbol id="checkbox-checkmark" viewbox="0 0 16 24">
     <path d="M1 13l4 4L15 7" fill="none" stroke="currentColor" stroke-miterlimit="10" stroke-width="2">
     </path>
    </symbol>
    <symbol id="close" viewbox="0 0 24 24">
     <path d="M8 8l7.604 7.604M15.604 8L8 15.604" stroke="currentColor" stroke-width="2">
     </path>
    </symbol>
    <symbol id="cross-small" viewbox="0 0 24 24">
     <path d="M8 8l7.604 7.604M15.604 8L8 15.604" stroke="currentColor" stroke-width="2">
     </path>
    </symbol>
    <symbol id="cross" viewbox="0 0 24 24">
     <path d="M8 8l7.604 7.604M15.604 8L8 15.604" stroke="currentColor" stroke-width="2">
     </path>
    </symbol>
    <symbol id="dropdown" viewbox="0 0 24 24">
     <path d="M5 9h14l-7 6z" fill="currentColor">
     </path>
    </symbol>
    <symbol id="message" viewbox="0 0 24 24">
     <path d="M22,1 C23.1045695,1 24,1.8954305 24,3 L24,16.9230769 C24,18.0276464 23.1045695,18.9230769 22,18.9230769 L11.9918686,18.9230769 L7.5511675,22.6302054 C7.12719804,22.9841381 6.49658376,22.927362 6.14265107,22.5033926 C5.99254107,22.3235786 5.91031569,22.0967759 5.91031569,21.8625408 L5.91031569,18.9230769 L2,18.9230769 C0.8954305,18.9230769 1.3527075e-16,18.0276464 0,16.9230769 L0,3 C-1.3527075e-16,1.8954305 0.8954305,1 2,1 L22,1 Z M6,8 C4.8954305,8 4,8.8954305 4,10 C4,11.1045695 4.8954305,12 6,12 C7.1045695,12 8,11.1045695 8,10 C8,8.8954305 7.1045695,8 6,8 Z M18,8 C16.8954305,8 16,8.8954305 16,10 C16,11.1045695 16.8954305,12 18,12 C19.1045695,12 20,11.1045695 20,10 C20,8.8954305 19.1045695,8 18,8 Z M12,8 C10.8954305,8 10,8.8954305 10,10 C10,11.1045695 10.8954305,12 12,12 C13.1045695,12 14,11.1045695 14,10 C14,8.8954305 13.1045695,8 12,8 Z" fill="currentColor">
     </path>
    </symbol>
    <symbol id="minus" viewbox="0 0 24 24">
     <path d="M9.5 11.802h6.316" stroke="currentColor" stroke-linecap="square" stroke-width="2">
     </path>
    </symbol>
    <symbol id="plus" viewbox="0 0 24 24">
     <path d="M9 12.5h6M12 9.5v6" stroke="currentColor" stroke-linecap="square" stroke-width="2">
     </path>
    </symbol>
    <symbol id="tooltip" viewbox="0 0 24 24">
     <g fill="none" fill-rule="evenodd">
      <path d="M11 18h2v2h-2z" fill="currentColor">
      </path>
      <path d="M7.522 8.577c0-1.96 1.774-3.51 4.35-3.51 2.574 0 3.65 2.194 3.65 3.51 0 3.092-3.65 3.175-3.65 6.49" stroke="currentColor" stroke-width="2">
      </path>
     </g>
    </symbol>
   </svg>
  </div>
  <div id="app" role="main">
   <div class="minicart-link___12Tm8 minicart-link-desktop___36sXW">
    <div>
     <a class="cartlink___XXQml" data-auto-id="yeezy-mini-basket" href="/cart">
     </a>
    </div>
   </div>
   <div class="minicart-modal___2Q8-_">
   </div>
   <div class="sidebar___POIEI">
    <div class="navigation___3100Z" data-auto-id="yeezy-sidebar">
     <a class="active" data-auto-id="yeezy-navigation-link-help-terms" href="/">
      Home
     </a>
     <nav class="navigation-bottom___3rb_3">
      <ul class="navigation-items___quP0r">
       <li>
        <a class="" data-auto-id="yeezy-navigation-link-help-general" href="/pages/general">
         General Info
        </a>
       </li>
       <li>
        <a class="" data-auto-id="yeezy-navigation-link-help-contact" href="/pages/contact">
         Contact
        </a>
       </li>
       <li>
        <a class="" data-auto-id="yeezy-navigation-link-help-privacy" href="/pages/privacy">
         Privacy
        </a>
       </li>
       <li>
        <a class="" data-auto-id="yeezy-navigation-link-help-terms" href="/pages/terms">
         Terms
        </a>
       </li>
       <li>
        <a class="" data-auto-id="yeezy-navigation-link-archive" href="/archive">
         Archive
        </a>
       </li>
      </ul>
     </nav>
    </div>
   </div>
   <div class="container___3PPPZ desktop-container___1UV4E header-margin___1jFHm">
    <div class="main___2aRHM">
    </div>
    <div class="footer___1Hsf2">
     <footer class="footer___1_Npt">
      <a data-auto-id="ccpa-data-do-not-sell-link">
       Do not sell my info
      </a>
      <span class="separator___1ZOj_">
      </span>
      <a data-auto-id="ccpa-data-settings-link">
       Data settings
      </a>
     </footer>
    </div>
   </div>
  </div>
  <div id="modal-root">
  </div>
  <script id="__LOADABLE_REQUIRED_CHUNKS__" type="application/json">
   []
  </script>
  <script async="" data-chunk="app" src="/glass/react/0cf642c/yeezy/runtime.js">
  </script>
  <script async="" data-chunk="app" src="/glass/react/0cf642c/yeezy/vendor.app.js">
  </script>
  <script async="" data-chunk="app" src="/glass/react/0cf642c/yeezy/app.app.js">
  </script>
  <noscript>
   <img src="https://www.yeezysupply.com/akam/11/pixel_c4f80dc?a=dD0wYWEyOTVlOGNiMGVmOTRiYjg4OWY1NjI1MTkyZmQyNGZlOTY2YjA2JmpzPW9mZg==" style="visibility: hidden; position: absolute; left: -999px; top: -999px;"/>
  </noscript>
  <script type="text/javascript">
   var _cf = _cf || []; _cf.push(['_setFsp', true]);  _cf.push(['_setBm', true]); _cf.push(['_setAu', '/static/bcb1cb0eead157e63de815826062ee8']);
  </script>
  <script src="/static/bcb1cb0eead157e63de815826062ee8" type="text/javascript">
  </script>
 </body>
</html>

Any idea is appreciated.

EDIT: I added the code and the HTML

  • Headless Chrome should also render JavaScript. You could write out both HTML versions and compare them; maybe they are totally different, i.e. the site sends a different page to bots/scripts. You should also add the URL of this page and the code as text so we can test it. – furas Feb 27 '20 at 15:44
  • Always put code, data and error messages as text in the question. Python can't run code from an image, so we can't test it. – furas Feb 27 '20 at 15:44
  • @furas I added the code and HTML. I don't get any errors, it's just that headless chrome (I think) can't render javascript generated HTML. A similar error occurred in this topic https://stackoverflow.com/questions/58091266/chrome-browser-headless-problem-some-specific-pages-are-not-rendering-in-headl , but no solution was found. Again, any help is really appreciated. – Andrea Ginevro Feb 27 '20 at 16:54
  • The code works for me with `headless` on Linux: Python 3.7, Selenium 3.141.0, Chromium 80.0.3987.87 (Official version) Built on Ubuntu, ChromeDriver 79.0.3945.36 – furas Feb 27 '20 at 18:12
  • All right, thanks. I just fixed the problem by reinstalling the chromedriver. I think it got corrupted when I tried modifying it in a hex editor to bypass a control. Thanks a lot. – Andrea Ginevro Feb 27 '20 at 18:20
  • You could add your last comment as an answer. – furas Feb 27 '20 at 18:24
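
The suggestion in the comments to dump both HTML versions and compare them can be sketched roughly as follows. This is only an illustration: `diff_html` is a hypothetical helper name, and the two sample strings stand in for the `driver.page_source` values you would save from a headed run and a headless run.

```python
import difflib
from typing import List


def diff_html(headed_html: str, headless_html: str) -> List[str]:
    """Return unified-diff lines between two page-source dumps."""
    return list(
        difflib.unified_diff(
            headed_html.splitlines(),
            headless_html.splitlines(),
            fromfile="headed",
            tofile="headless",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Stand-in data; in practice these would be the saved page_source
    # strings from the non-headless and headless runs.
    headed = '<div class="main">\n <p>700 MNVN ADULTS</p>\n</div>'
    headless = '<div class="main">\n</div>'
    for line in diff_html(headed, headless):
        print(line)
```

Lines prefixed with `-` exist only in the headed dump, which makes it easy to see which parts of the page were never rendered in headless mode.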

1 Answer

I fixed the problem by reinstalling ChromeDriver. Make sure that it is compatible with the Chrome version you are using and that the binary has not been modified. To check the Chrome version, enter chrome://version in the address bar. To download ChromeDriver, go to https://chromedriver.chromium.org/downloads . Thanks to @furas.
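
The compatibility check can also be done from the script itself. The sketch below is a rough rule of thumb, not an official check: it only compares major version numbers, and the helper names (`major_version`, `versions_compatible`, `check_driver`) are made up for illustration. It assumes the session capabilities expose the browser version under `browserVersion` (or `version` on older setups) and the driver version under `chrome` → `chromedriverVersion`; verify those keys against your own `driver.capabilities` output.

```python
def major_version(version: str) -> int:
    """Extract the leading major number from a string like '80.0.3987.87'."""
    return int(version.strip().split(".")[0])


def versions_compatible(browser_version: str, driver_version: str) -> bool:
    """Heuristic: treat matching major versions as compatible."""
    return major_version(browser_version) == major_version(driver_version)


def check_driver(driver) -> bool:
    """Compare Chrome and ChromeDriver versions reported by a live session.

    The capability keys used here are an assumption; print
    driver.capabilities to confirm what your setup reports.
    """
    caps = driver.capabilities
    browser = caps.get("browserVersion") or caps.get("version")
    chromedriver = caps["chrome"]["chromedriverVersion"]
    return versions_compatible(browser, chromedriver)
```

With a running session, `check_driver(driver)` returning `False` is a hint to download a matching ChromeDriver. A mismatch does not always break things (the setup reported in the comments ran Chromium 80 with ChromeDriver 79), so treat the result as a first diagnostic only.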