There are two link formats: the title is in either a[1] or a[2].

I want to click every link on every page of the website, but with the following code it just clicks the first one again and again.

The first link on the page is always a[2] and the rest are a[1], so I don't know how to click all the links. If I only use a[1], I get 'WebDriverException: unknown error: unsupported protocol (Session info: chrome=88.0.4324.104)'.

<div class="company-left-title">
<a href="javascript:Go('/qy-l-0-4-3595-3595-1.html');">
            </a>
<a href="http://15256160037.58food.com/" target="_blank">亳州市九熹堂药业有限公司</a>  
            </div>

OR

<div class="company-left-title">
            <a href="http://hubeianran.58food.com/" target="_blank">湖北安然保健品有限公司</a>                
            </div>

I used:

driver.get('http://www.58food.com/qy-l-0-3595.html')
while True:
    try:
        links = [link.get_attribute('href') for link in driver.find_elements_by_xpath('//*[@class="company-left-title"]/a[2]')]
    except:
        links = [link.get_attribute('href') for link in driver.find_elements_by_xpath('//*[@class="company-left-title"]/a[1]')]
    for link in links:
        driver.get(link)
        driver.back()
halfer
Joyce
  • This sounds like an [X-Y problem](http://xyproblem.info/). Instead of asking for help with your solution to the problem, edit your question and ask about the actual problem. What are you trying to do? – undetected Selenium Feb 07 '21 at 12:32
  • @Cathy did you want all the links matched by the two xpaths? On your site that is 16 links. – Arundeep Chohan Feb 07 '21 at 15:09
  • Please read [Under what circumstances may I add “urgent” or other similar phrases to my question, in order to obtain faster answers?](//meta.stackoverflow.com/q/326569) - the summary is that this is not an ideal way to address volunteers, and is probably counterproductive to obtaining answers. Please refrain from adding this to your questions. – halfer Feb 11 '21 at 21:07

3 Answers

In the first format you showed, this is the element hierarchy:

div
  |-- a
  |-- a

The xpath expression to find the a elements would be '//*[@class="company-left-title"]/a'.

In the second format, there are also a elements underneath a p element:

div
  |--a
  |--p
     |--a

You need the previous xpath expression to find the first a, but you also need another, separate, xpath expression for the second a: '//*[@class="company-left-title"]/p/a'.
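If only the company links are wanted, an attribute predicate may be simpler than a positional one. In the HTML shown in the question, the company anchors carry target="_blank" and the javascript: anchor does not. The following is a minimal sketch under the assumption that this holds for every entry on the page; it runs against a hand-written copy of the question's markup using Python's standard ElementTree (the link texts are placeholders). With Selenium, the corresponding expression would be '//*[@class="company-left-title"]/a[@target="_blank"]'.

```python
import xml.etree.ElementTree as ET

# Hand-written sample mirroring the two formats from the question
# (link texts are placeholders, not the real page content).
SAMPLE = """<root>
  <div class="company-left-title">
    <a href="javascript:Go('/qy-l-0-4-3595-3595-1.html');"></a>
    <a href="http://15256160037.58food.com/" target="_blank">first company</a>
  </div>
  <div class="company-left-title">
    <a href="http://hubeianran.58food.com/" target="_blank">second company</a>
  </div>
</root>"""

root = ET.fromstring(SAMPLE)

# Select anchors by attribute instead of by position, so it no longer
# matters whether the company link is the first or the second <a>.
XPATH = ".//div[@class='company-left-title']/a[@target='_blank']"
company_links = [a.get("href") for a in root.findall(XPATH)]

print(company_links)
```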

joao
  • Hi, but the title I want is in the first a element; I don't actually want the p/a. I can delete that part to prevent any confusion, sorry about that. – Joyce Feb 07 '21 at 09:17
  • You should also show more of the output or errors that you're receiving, so we can understand what's happening. And be sure to change the a[2] to just a in the xpath. – joao Feb 07 '21 at 09:39
  • It just clicks the first title again and again. The first link on the page is always a[2] and the rest are a[1], so I don't know how to click all the links. – Joyce Feb 07 '21 at 09:42

You can try putting your XPath in parentheses so that you get the second occurrence:

(//*[@class="company-left-title"]/a)[2]

Reference:
Xpath to get the 2nd url with the matching text in the href tag
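To make the difference concrete: without parentheses, a[2] means "the second a child of each matching div", while the parenthesized form picks the second node of the whole result set, which is a single element. The sketch below illustrates this on a hand-written copy of the question's markup with Python's standard ElementTree. ElementTree does not support parenthesized XPath, so indexing the flat result list stands in for (…)[2]; that substitution is an assumption of this example, not Selenium behaviour.

```python
import xml.etree.ElementTree as ET

# Hand-written sample mirroring the two formats from the question.
SAMPLE = """<root>
  <div class="company-left-title">
    <a href="javascript:Go('/qy-l-0-4-3595-3595-1.html');"></a>
    <a href="http://15256160037.58food.com/">first company</a>
  </div>
  <div class="company-left-title">
    <a href="http://hubeianran.58food.com/">second company</a>
  </div>
</root>"""

root = ET.fromstring(SAMPLE)
BASE = ".//div[@class='company-left-title']/a"

# Positional predicates are evaluated per parent <div>.
first_per_div = [a.get("href") for a in root.findall(BASE + "[1]")]
second_per_div = [a.get("href") for a in root.findall(BASE + "[2]")]

# Stand-in for (BASE)[2]: take the second node of the flat result list.
second_overall = root.findall(BASE)[1].get("href")

print(first_per_div)   # includes the javascript: pseudo-link
print(second_per_div)  # only divs that actually have a second <a>
print(second_overall)  # exactly one link
```

On this sample, neither a[1] nor a[2] alone returns both company links, and the parenthesized form returns only one of them.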

K. B.

To grab all 16 links, combine both XPaths with the union operator |.

links = [link.get_attribute('href') for link in driver.find_elements_by_xpath('//*[@class="company-left-title"]/a[2] | //*[@class="company-left-title"]/a[1] ')]
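One caveat: in the first format a[1] is the javascript: anchor, so the union also returns its href, and passing that to driver.get() is likely what triggers the "unsupported protocol" error. Two further points from the question's loop: find_elements_by_xpath returns an empty list rather than raising when nothing matches, so the except branch never runs, and the outer while True re-collects the same links forever. A minimal sketch that filters by URL scheme and collects the links once follows. The helper names is_navigable, collect_company_links and visit_all are hypothetical, and the code assumes the Selenium 3 API used in the question (in Selenium 4 the equivalent call is driver.find_elements(By.XPATH, ...)).

```python
from urllib.parse import urlparse

XPATH = '//*[@class="company-left-title"]/a'


def is_navigable(href):
    """Return True only for http(s) URLs; rejects None and javascript: links."""
    return bool(href) and urlparse(href).scheme in ("http", "https")


def collect_company_links(driver):
    """Collect hrefs of all anchors under the title divs, keeping real URLs only."""
    # Returns an empty list (no exception) when nothing matches.
    anchors = driver.find_elements_by_xpath(XPATH)
    hrefs = [a.get_attribute("href") for a in anchors]
    return [h for h in hrefs if is_navigable(h)]


def visit_all(driver, start_url):
    """Open the listing page, then visit each company link exactly once."""
    driver.get(start_url)
    # Collect plain strings before navigating away, so no element
    # references are reused after the page changes.
    links = collect_company_links(driver)
    for link in links:
        driver.get(link)
    return links
```

With a real driver this would be called as visit_all(driver, 'http://www.58food.com/qy-l-0-3595.html'). Because the hrefs are plain strings, driver.back() is not needed between visits.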
Arundeep Chohan