Implementing pause and resume to handle a flood of requests using Pyrogram in Python

Question

I'm trying to list the message_id and filenames from a Telegram supergroup topic. The last messages saved to results.txt were

Message ID: 142452
File Name: 12_La_Modella_Di_Pickman.pdf

Message ID: 142451
File Name: 11_Halloween.pdf

But when there are a lot of files, I run into a FLOOD error. By the time it happens, results.txt contains about 1200 Message ID entries and 1200 File Name entries. This is the traceback:

Traceback (most recent call last):
  File "C:\Users\Peter\Desktop\script\messagge_id_telegram\temp\tg_id_fileditesto.py", line 42, in <module>
    app.run(main())
  File "C:\Users\Peter\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyrogram\methods\utilities\run.py", line 77, in run
    run(coroutine)
  File "C:\Users\Peter\AppData\Local\Programs\Python\Python311\Lib\asyncio\base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "C:\Users\Peter\Desktop\script\messagge_id_telegram\temp\tg_id_fileditesto.py", line 18, in main
    async for message in app.get_discussion_replies(chat_id=group_id, message_id=topic_id):
  File "C:\Users\Peter\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyrogram\methods\messages\get_discussion_replies.py", line 59, in get_discussion_replies
    r = await self.invoke(
        ^^^^^^^^^^^^^^^^^^
  File "C:\Users\Peter\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyrogram\methods\advanced\invoke.py", line 79, in invoke
    r = await self.session.invoke(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Peter\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyrogram\session\session.py", line 389, in invoke
    return await self.send(query, timeout=timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Peter\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyrogram\session\session.py", line 357, in send
    RPCError.raise_it(result, type(data))
  File "C:\Users\Peter\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyrogram\errors\rpc_error.py", line 91, in raise_it
    raise getattr(
pyrogram.errors.exceptions.flood_420.FloodWait: Telegram says: [420 FLOOD_WAIT_X] - A wait of 16 seconds is required (caused by "messages.GetReplies")
PS C:\Users\Peter\Desktop\script\messagge_id_telegram\temp>

When this happens the script stops, and it cannot resume from where it left off. Instead of aborting the scan, I would like it to wait for the required time and then continue.
I would also like the scan to restart from the last scanned message_id, to avoid redoing everything from the beginning (and therefore never finishing writing all the documents to the text file). In my case the scan should resume from

 Message ID: 142451
 File Name: 11_Halloween.pdf

Obviously this is just an example. I shouldn't have to enter the message ID manually, because the flood error could occur at different points.
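
One way the resume point could be recovered without typing it in is to read it back from results.txt. The following is only a minimal sketch: the helper names last_saved_message_id and already_saved are hypothetical, and it assumes the file uses the "Message ID: N" layout shown above and that replies arrive newest first (descending IDs), as the sample output suggests.

```python
import re
from pathlib import Path

# Matches lines such as "Message ID: 142451" (leading spaces tolerated).
MESSAGE_ID_LINE = re.compile(r"^\s*Message ID:\s*(\d+)\s*$")


def last_saved_message_id(path="results.txt"):
    """Return the last 'Message ID: N' value written to the file, or None."""
    results = Path(path)
    if not results.exists():
        return None
    last_id = None
    with results.open(encoding="utf-8") as f:
        for line in f:
            match = MESSAGE_ID_LINE.match(line)
            if match:
                last_id = int(match.group(1))
    return last_id


def already_saved(message_id, last_id):
    """True if the message was written in an earlier run.

    Assumes IDs arrive in descending order, so everything at or above
    the last saved ID has already been handled.
    """
    return last_id is not None and message_id >= last_id
```

Inside the scan loop, a check such as `if already_saved(message.id, last_id): continue` would then skip everything down to and including 142451 and pick up at the next older message.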

With my original code I can print many values (more than 1000 requests per run).

To pause for 20 seconds every 500 requests and append to results.txt instead of rewriting it (so that previously acquired data is not lost), I tried the approach below. This time, however, the Flood error appears after only one request, whereas with the original code I can make 1000 or more requests each time.

from pyrogram import Client
import time

app = Client(
    name="@Peter_LongX",
    api_id=*******,
    api_hash="******************",
    phone_number="+3********",
    password="" or None
)

group_id = -1001867911973
topic_id = 665
msg_file_dict = {}
last_processed_message_id = 0  # Initialize the variable here

async def main():
    global last_processed_message_id  # Declare it as global

    async with app:
        count = 0
        file = open("results.txt", "a", encoding="utf-8")
        async for message in app.get_discussion_replies(chat_id=group_id, message_id=topic_id):
            if message.id <= last_processed_message_id:
                continue

            last_processed_message_id = message.id  # Update the ID of the last processed message

            print(f"Message ID: {message.id}")

            file_name = None
            if message.video or (message.document and message.document.mime_type.endswith(("rar", "zip", "pdf", "epub", "cbr"))):
                file = message.video or message.document
                print("File video o rar/zip/pdf/epub/cbr trovato")

                file_name = file.file_name or f"VID_{message.id}_{file.file_unique_id}.{file.mime_type.split('/')[-1]}"
                print(file_name)
                msg_file_dict[message.id] = file_name

            # Add the values to the text file
            with open("results.txt", "a", encoding="utf-8") as file:
                file.write(f"Message ID: {message.id}\n")
                file.write(f"File Name: {file_name}\n\n")

            count += 1
            if count == 500:
                # Take a 20 second break
                time.sleep(20)
                count = 0

app.run(main())

# Print the collected data
print(msg_file_dict.keys())  # List of Message ID
print(msg_file_dict.values())  # List of File Name

score 0 · Answer 1 · answered Aug 30 '23 at 17:11

I fixed it by adding a request counter and pausing after every 40 messages:

import asyncio

# app, group_id, topic_id and msg_file_dict are defined as in the question;
# start_date and end_date are datetime objects defined elsewhere.
ARCHIVE_SUFFIXES = ("rar", "zip", "pdf", "epub", "cbr")

async def main():
    async with app:
        processed_messages = 0  # Number of messages handled so far
        async for message in app.get_discussion_replies(chat_id=group_id, message_id=topic_id):
            print(f"Message ID: {message.id}")

            # Print the ID again if the message date is within the specified range
            if start_date <= message.date <= end_date:
                print(f"Message ID: {message.id}")

            file_name = None  # Stays None when the message carries no matching file

            # mime_type may be missing, so fall back to an empty string
            is_archive = message.document and (message.document.mime_type or "").endswith(ARCHIVE_SUFFIXES)
            if message.video or is_archive:
                media = message.video or message.document
                print("Video or rar/zip/pdf/epub/cbr file found")

                file_name = media.file_name or f"VID_{message.id}_{media.file_unique_id}.{media.mime_type.split('/')[-1]}"
                print(file_name)
                msg_file_dict[message.id] = file_name
            print()

            # Append the result to the text file (utf-8) only when a file was found
            if file_name:
                with open("results.txt", "a", encoding="utf-8") as results:
                    results.write(f"Message ID: {message.id}\n")
                    results.write(f"File Name: {file_name}\n\n")

            processed_messages += 1

            # Pause for 15 seconds after every 40 messages; asyncio.sleep is used
            # instead of time.sleep so the event loop is not blocked
            if processed_messages % 40 == 0:
                await asyncio.sleep(15)

app.run(main())
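
A fixed pause only lowers the chance of hitting the limit. The wait-and-continue behaviour asked for in the question could also be sketched by catching the flood error, sleeping for the time Telegram requests, and restarting the iteration while skipping IDs already handled. This is a sketch under assumptions, not Pyrogram's own mechanism: the FloodWait class below is a stand-in so the example runs without Pyrogram (in Pyrogram 2.x the real pyrogram.errors.FloodWait exposes the wait in seconds as .value, as far as I know), and scan_with_resume, make_iterator and handle are hypothetical names.

```python
import asyncio


class FloodWait(Exception):
    """Stand-in for pyrogram.errors.FloodWait (assumption: wait is in .value)."""

    def __init__(self, value):
        super().__init__(f"A wait of {value} seconds is required")
        self.value = value


async def scan_with_resume(make_iterator, handle, max_restarts=5, sleep=asyncio.sleep):
    """Iterate messages; after a FloodWait, wait and restart, skipping seen IDs.

    make_iterator: zero-argument callable returning a fresh async iterator
                   of messages, each with an .id attribute (hypothetical).
    handle:        callable invoked exactly once per new message.
    Returns the number of messages handled.
    """
    seen = set()
    restarts = 0
    while True:
        try:
            async for message in make_iterator():
                if message.id in seen:
                    continue  # already handled before the interruption
                handle(message)
                seen.add(message.id)
            return len(seen)
        except FloodWait as e:
            if restarts == max_restarts:
                raise  # give up after too many interruptions
            restarts += 1
            await sleep(e.value + 1)  # wait slightly longer than requested
```

With Pyrogram, the stand-in class would be replaced by `from pyrogram.errors import FloodWait`, and make_iterator could be `lambda: app.get_discussion_replies(chat_id=group_id, message_id=topic_id)`. Note that each restart requests the earlier pages again, so this trades extra requests for simplicity. Pyrogram's Client also accepts a sleep_threshold argument (10 seconds by default, as far as I know), below which flood waits are slept through automatically; raising it above the 16 seconds seen in the traceback may be the simpler option.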
