I am working on a Python script to collect data from an API, specifically tweets from Twitter. Currently, my script retrieves all the available tweets within a specified time range. However, I want to modify the script to collect a specific number of tweets per hour, with a controlled time step.

Here's a simplified version of my code:

# Code snippet

import requests
import time
import pandas as pd  # needed for pd.read_csv in main()
from datetime import datetime
from datetime import timedelta

search_url = "https://api.twitter.com/2/tweets/search/all"
sleep_seconds = 300  # Sleep time in case of reaching API limit

# Other code...

def main(loop_counter, total_tweets):
    jobs = pd.read_csv("capture_jobs.csv", sep=";")

    for index, row in jobs.iterrows():
        start_date = datetime.strptime(row["start"], "%d/%m/%Y").strftime("%Y-%m-%d")
        end_date = datetime.strptime(row["end"], "%d/%m/%Y").strftime("%Y-%m-%d")
        timestep_minutes = 60  # Set the desired time step in minutes
        current_time = datetime.strptime(row["start_time"], "%H:%M:%S")

        while current_time <= datetime.strptime(row["end_time"], "%H:%M:%S"):
            start_time = current_time.strftime("%H:%M:%S")
            current_time += timedelta(minutes=timestep_minutes)
            end_time = current_time.strftime("%H:%M:%S")

            # Rest of the code...

            # Sleeping to control the time step between iterations
            time.sleep(timestep_minutes * 60)  # Convert minutes to seconds

        # Rest of the code...

if __name__ == "__main__":
    main(1, 0)

How can I modify this code to achieve the desired time step for data collection? For example, if I want to collect 100 tweets per hour, how can I control the time step between API requests to ensure that I collect data within each hour and then start collecting data within the next hour?
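To make the goal concrete, here is a rough sketch of the windowing I have in mind. It is not a working collector: it assumes the times are in UTC, that the endpoint accepts `start_time`, `end_time` and `max_results` as query parameters, and that 100 is within the allowed `max_results` range. The helper names `hourly_windows` and `build_params` are made up for illustration. No request is sent, it only builds the parameters for each hourly window.

```python
from datetime import datetime, timedelta


def hourly_windows(start, end, step_minutes=60):
    """Yield (window_start, window_end) pairs covering [start, end)."""
    step = timedelta(minutes=step_minutes)
    current = start
    while current < end:
        # Clip the last window so it never runs past the requested end.
        window_end = min(current + step, end)
        yield current, window_end
        current = window_end


def build_params(query, window_start, window_end, tweets_per_window=100):
    """Build the query parameters for one window (times assumed to be UTC)."""
    return {
        "query": query,
        "start_time": window_start.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "end_time": window_end.strftime("%Y-%m-%dT%H:%M:%SZ"),
        # Assumption: one page of this size is the per-window cap I want.
        "max_results": tweets_per_window,
    }


# Example: three one-hour windows between 09:00 and 12:00 on one day.
windows = list(
    hourly_windows(datetime(2023, 1, 1, 9, 0, 0), datetime(2023, 1, 1, 12, 0, 0))
)
params = [build_params("python", s, e) for s, e in windows]
```

With this layout, each window would get a single request capped at 100 results, and `time.sleep` would only be needed to stay under the rate limit rather than to wait out a real hour between historical windows. I am not sure whether that is the right approach.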

Any help or suggestions would be greatly appreciated.

Note: I have already reviewed the code and the existing questions on Stack Overflow, but I couldn't find a suitable solution that fits my requirements.

Let me know if you need any further clarification or information.

Brian Tompsett - 汤莱恩
