
I'm new to Python and trying to scrape Indeed for remote data analyst positions and write them to a CSV file. I purposely added code to get past an SSL certificate issue I've been facing. The output says the jobs were written to my file, but nothing shows up in it except my headers.

Can you help me figure out what I'm doing wrong? Much thanks.

Here's my code:

import requests
import csv
from bs4 import BeautifulSoup

# Define the dataanalyst variable.
dataanalyst = "data analyst"

def get_job_postings(dataanalyst):
  """Gets the job postings from Indeed for the given keyword."""

  # Get the Indeed search URL for the given keyword.
  search_url = "https://www.indeed.com/jobs?q=data+analyst&l=remote&vjk=30f58c7471301c42".format(keyword)

  # Make a request to the Indeed search URL.
  response = requests.get(search_url, verify=False)

  # Parse the response and get the job postings.
  soup = BeautifulSoup(response.content, "html.parser")
  job_postings = soup.find_all("div", class_="jobsearch-result")

  return job_postings

def write_job_postings_to_csv(job_postings, filename):
  """Writes the job postings to a CSV file."""

  # Create a CSV file to store the job postings.
  with open(filename, "w", newline="") as csvfile:

    # Create a CSV writer object.
    writer = csv.writer(csvfile)

    # Write the header row to the CSV file.
    writer.writerow(["Title", "Company", "Location", "Description"])

    # Write the job postings to the CSV file.
    for job_posting in job_postings:
      title = job_posting.find("h2", class_="jobtitle").text
      company = job_posting.find("span", class_="company").text
      location = job_posting.find("span", class_="location").text
      description = job_posting.find("div", class_="job-snippet").text

      writer.writerow([title, company, location, description])

if __name__ == "__main__":

  # Define the dataanalyst variable.
  dataanalyst = "data+analyst"

  # Get the keyword from the user.
  keyword = "data analyst"

  # Get the job postings from Indeed.
  job_postings = get_job_postings(dataanalyst)

  # Write the job postings to a CSV file.
  write_job_postings_to_csv(job_postings, "remote_data_analyst_positions.csv")

  print("The job postings have been successfully scraped and written to a CSV file.")

Here's my terminal results:

PS C:\Users\chlor\OneDrive\Documents\Python> & C:/Users/chlor/AppData/Local/Programs/Python/Python311/python.exe c:/Users/chlor/OneDrive/Documents/Python/Indeed_DataAnalyst_Remote/DataAnalyst_Remote.py
C:\Users\chlor\AppData\Local\Programs\Python\Python311\Lib\site-packages\urllib3\connectionpool.py:1045: InsecureRequestWarning: Unverified HTTPS request is being made to host 'localhost'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings
  warnings.warn(
The job postings have been successfully scraped and written to a CSV file.
PS C:\Users\chlor\OneDrive\Documents\Python>

I expected this to write the job openings to my CSV file.

EpicMe

1 Answer


Assuming you are able to fetch the complete HTML content using requests, just change

 job_postings = soup.find_all("div", class_="jobsearch-result")

To

job_postings = soup.find_all("div", class_="job_seen_beacon")

I have tested this and it works, but I used Selenium to get the HTML content, as requests was not fetching the complete page.
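
In case it helps, here is a minimal sketch of that Selenium route, assuming you have selenium installed (pip install selenium) and a Chrome/ChromeDriver setup Selenium can find. The function name get_job_postings_selenium and the search URL are just illustrative; only the job_seen_beacon class is the one confirmed above.

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

def get_job_postings_selenium(search_url):
  """Loads the search page in a headless browser and returns the job cards."""
  options = Options()
  options.add_argument("--headless")  # no visible browser window
  driver = webdriver.Chrome(options=options)
  try:
    driver.get(search_url)
    # Wait until at least one job card is present before grabbing the HTML.
    WebDriverWait(driver, 15).until(
      EC.presence_of_element_located((By.CLASS_NAME, "job_seen_beacon"))
    )
    soup = BeautifulSoup(driver.page_source, "html.parser")
  finally:
    driver.quit()
  return soup.find_all("div", class_="job_seen_beacon")

if __name__ == "__main__":
  postings = get_job_postings_selenium("https://www.indeed.com/jobs?q=data+analyst&l=remote")
  print(f"Found {len(postings)} job cards")

The explicit wait is the important part: driver.page_source is only read after the job cards exist in the rendered page, which may be why plain requests came back without any of them.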

Abhay Chaudhary