
I am trying to scrape the links of the job offers on a website. I have managed to extract the position title and the company name, but I cannot extract the link to each offer.

The source of the data is: https://www.jobs.ch/en/vacancies/?term=Data%20Analyst

import requests
from bs4 import BeautifulSoup as bs

url = 'https://www.jobs.ch/en/vacancies/?term=Data%20Analyst'
page = requests.get(url)
soup = bs(page.content, "html.parser")

results = soup.find(class_="Div-sc-1cpunnt-0 ujqkk")
job_elements = results.find_all("a", class_="Link__ExtendedRR6Link-sc-czsz28-1 jzwvjr Link-sc-czsz28-2 VacancyLink___StyledLink-sc-ufp08j-0 bzpUGN zoplL")

for job_element in job_elements:
    title_element = job_element.find("span", class_="Span-sc-1ybanni-0 Text__span-sc-1lu7urs-12 Text-sc-1lu7urs-13 VacancySerpItem___StyledText-sc-ppntto-4 jpKTRn bbefum hSicAH")
    company_element = job_element.find("p", class_="P-sc-hyu5hk-0 Text__p2-sc-1lu7urs-10 Span-sc-1ybanni-0 Text__span-sc-1lu7urs-12 Text-sc-1lu7urs-13 cHnalP cTUsVs")
    print(title_element.text)
    print(company_element.text)
    print()

# Everything works up to this point.

Now I want to be able to get the links of each job offer.

I have tried with this code:

for job_element in job_elements:
    link = job_element.find('a', attrs={'class':'Link__ExtendedRR6Link-sc-czsz28-1 jzwvjr Link-sc-czsz28-2 VacancyLink___StyledLink-sc-ufp08j-0 bzpUGN zoplL'})
    print(link.get('href'))

I get this message:

AttributeError                            Traceback (most recent call last)
c:\Users\leant\OneDrive\Documentos\Jupyter\WebScrapping\Youtube\program01.ipynb Cell 8 in <cell line: 1>()
      2 link = job_element.find('a', attrs={'class':'Link__ExtendedRR6Link-sc-czsz28-1 jzwvjr Link-sc-czsz28-2 VacancyLink___StyledLink-sc-ufp08j-0 bzpUGN zoplL'})
      3 #print(title_element.text)
      4 #print(company_element.text)
----> 5 print(link.get('href'))

AttributeError: 'NoneType' object has no attribute 'get'

I have also tried this:

for job_element in job_elements:
    link = job_element.find('a', class_='Link__ExtendedRR6Link-sc-czsz28-1 jzwvjr Link-sc-czsz28-2 VacancyLink___StyledLink-sc-ufp08j-0 bzpUGN zoplL')
    print(link.get('href'))

But I get the same result and cannot find the error. Here is a fragment of the site's HTML:

<a class="Link__ExtendedRR6Link-sc-czsz28-1 jzwvjr Link-sc-czsz28-2 VacancyLink___StyledLink-sc-ufp08j-0 bzpUGN zoplL" data-cy="job-link" data-event-type="internal_link" href="/en/vacancies/detail/c82b50d0-cccb-42af-88a3-8cb9e79a88a6/?source=vacancy_search" tabindex="0" title="Data Analyst / Anwendungsentwickler*in">

Thank you very much for your contributions!

1 Answer

job_elements = results.find_all("a", class_="Link__ExtendedRR6Link-sc-czsz28-1 jzwvjr Link-sc-czsz28-2 VacancyLink___StyledLink-sc-ufp08j-0 bzpUGN zoplL")

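The find_all call above already returns the a tags with that class, so each job_element in the list is itself the link.

Calling find('a', ...) on job_element searches only its descendants. The a tag contains no nested a tag, so find returns None, which is why the code you tried raises the AttributeError: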

for job_element in job_elements:
    link = job_element.find('a', class_='Link__ExtendedRR6Link-sc-czsz28-1 jzwvjr Link-sc-czsz28-2 VacancyLink___StyledLink-sc-ufp08j-0 bzpUGN zoplL')
    print(link.get('href'))
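
To make the failure easier to see, here is a minimal sketch that reproduces it offline. The HTML string and the shortened class name "job" are stand-ins invented for this illustration; they are not the real markup of the site.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for one search result.
html = '<div><a class="job" href="/en/vacancies/detail/abc/"><span>Data Analyst</span></a></div>'
soup = BeautifulSoup(html, "html.parser")

anchor = soup.find("a", class_="job")    # this is already the <a> tag
nested = anchor.find("a", class_="job")  # find() looks only inside the anchor
direct_href = anchor.get("href")         # attribute read from the tag itself

print(nested)       # None, because the anchor contains no further <a>
print(direct_href)
```

Because nested is None, calling nested.get('href') would raise the same AttributeError as in the question, while anchor.get('href') returns the link.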

Read the href attribute directly from each element instead:

for job_element in job_elements:
    link = job_element.get('href')
    print(link)

Output:

/en/vacancies/detail/c82b50d0-cccb-42af-88a3-8cb9e79a88a6/?source=vacancy_search
/en/vacancies/detail/da91a6ab-d29d-4458-b26b-32cbec00a614/?source=vacancy_search
/en/vacancies/detail/46d87b52-fdda-4c83-a879-a173af224b94/?source=vacancy_search
/en/vacancies/detail/2a6cc7ba-8290-4fbd-a81a-03a8c2f77da2/?source=vacancy_search
/en/vacancies/detail/2c8ae61b-2646-4b92-bfd1-b2af30d0aae0/?source=vacancy_search
/en/vacancies/detail/8f7f8be8-c263-4ee6-885b-dfd499ea5bb5/?source=vacancy_search
/en/vacancies/detail/ff1e42ca-d952-4985-8e38-bb772e44bc45/?source=vacancy_search
/en/vacancies/detail/86d66a1b-0c44-4d84-9891-f52584406fd9/?source=vacancy_search
/en/vacancies/detail/834b9b81-b412-45c6-bd52-5d66e33c2fe5/?source=vacancy_search
/en/vacancies/detail/7fb10e0c-5637-4311-b127-4309fe19587e/?source=vacancy_search
/en/vacancies/detail/7c016932-e2da-446e-bc7c-dae090988890/?source=vacancy_search
/en/vacancies/detail/b9e1e034-4140-49d8-8450-345e14810788/?source=vacancy_search
/en/vacancies/detail/ce245255-01bb-4dc5-b744-ea661266bc1b/?source=vacancy_search
/en/vacancies/detail/3086cb28-bef2-4a37-905b-5ddfd7d949b4/?source=vacancy_search
/en/vacancies/detail/8e30de7b-d7d2-43d1-9d18-f157138ebebd/?source=vacancy_search
/en/vacancies/detail/7576f680-f770-4d67-9fc5-26ed7a0210a8/?source=vacancy_search
/en/vacancies/detail/f86858c3-0068-407e-a278-098ff840f11a/?source=vacancy_search
/en/vacancies/detail/97caa664-3db8-49e2-ae08-1fe577c41f51/?source=vacancy_search
/en/vacancies/detail/ea7e7d45-8059-40c5-aadf-6b3fc58c301e/?source=vacancy_search
/en/vacancies/detail/bb71a0c2-c5a9-42cf-b91d-46d53c4ad105/?source=vacancy_search
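
The href values are relative paths, so they cannot be opened or requested as they are. One possible way to turn them into full URLs is urljoin from the standard library, sketched below. The base URL is the search page from the question and the single entry in relative_links is copied from the output above; the variable names are chosen only for this example.

```python
from urllib.parse import urljoin

# Search page from the question, used as the base for resolving relative links.
base_url = "https://www.jobs.ch/en/vacancies/?term=Data%20Analyst"

# One relative link copied from the output; in practice this would be
# the list of href values collected in the loop.
relative_links = [
    "/en/vacancies/detail/c82b50d0-cccb-42af-88a3-8cb9e79a88a6/?source=vacancy_search",
]

# A path starting with "/" replaces the path and query of the base URL.
absolute_links = [urljoin(base_url, href) for href in relative_links]

for link in absolute_links:
    print(link)
```

In the scraping loop, the same call would be urljoin(url, job_element.get('href')).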
Nehal Birla