Problem when using bot.retrieve_data #2245

Open
YouKnow-sys opened this issue Apr 27, 2024 · 4 comments


@YouKnow-sys
Contributor

YouKnow-sys commented Apr 27, 2024

Please answer these questions before submitting your issue. Thanks!

  1. What version of pyTelegramBotAPI are you using?
    4.16.1

  2. What OS are you using?
    Linux and Windows

  3. What version of python are you using?
    3.12

There is a problem with bot.retrieve_data that can sometimes lead to data loss. It appears when we do other async work inside the async block that uses bot.retrieve_data (awaiting bot.send_message, for example). While that call is being awaited, the handler can receive another update and enter bot.retrieve_data again, so in the end we can lose one of the modifications we made to the data.
Here is a fully working example of the problem:

import asyncio
from telebot import async_telebot
import logging
from telebot import types

async_telebot.logger.setLevel(logging.INFO)

bot = async_telebot.AsyncTeleBot(token="TOKEN")

class Single:
    def __init__(self, message: types.Message) -> None:
        self.message = message
        
class Group:
    def __init__(self, message: types.Message) -> None:
        self.messages = [message]
        self.media_group_id = message.media_group_id
        
    def add_message(self, message: types.Message):
        self.messages.append(message)

@bot.message_handler(commands=['start'])
async def on_start(message: types.Message):
    await bot.reply_to(
        message,
        "Hi",
    )
    await bot.set_state(message.from_user.id, 1, message.chat.id)
    await bot.add_data(message.from_user.id, message.chat.id, posts=[])

@bot.message_handler(chat_types=['private'], content_types=["audio", "photo", "voice", "video", "document", "text"],)
async def on_posts(message: types.Message):
    async with bot.retrieve_data(message.from_user.id, message.chat.id) as data:
        posts: list[Single | Group] = data['posts']
        
        if message.media_group_id:
            if posts:
                for idx in reversed(range(len(posts))):
                    if isinstance(posts[idx], Group) and posts[idx].media_group_id == message.media_group_id: # type: ignore
                        posts[idx].add_message(message) # type: ignore
                        print(f"Added as part of {idx} group")
                        return
            posts.append(Group(message))
            print(f"Added new group at {len(posts)}")
        else:
            posts.append(Single(message))
            print(f"Added new single at {len(posts)}")
            
        # Awaiting here yields to the event loop while the data context is still open.
        await bot.reply_to(
            message,
            "Msg added to posts."
        )

asyncio.run(bot.infinity_polling())

If you run this example and send a few media galleries to the bot (remember to send /start first), you can see that some of them get lost completely in the process, and that is all because of the await bot.reply_to call inside the block.
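
The lost update can also be shown without Telegram at all. The sketch below is a toy model, not pyTelegramBotAPI's code: it assumes the context hands out a copy of the stored data on enter and writes that copy back on exit. `ToyContext` and `STORE` are made-up names, and `asyncio.sleep(0)` stands in for an awaited API call such as bot.reply_to.

```python
import asyncio
import copy

# Hypothetical in-memory store standing in for the bot's state storage.
STORE = {"posts": []}


class ToyContext:
    """Toy model (an assumption): copy the data on enter, write it back on exit."""

    async def __aenter__(self):
        self.data = copy.deepcopy(STORE)
        return self.data

    async def __aexit__(self, exc_type, exc, tb):
        STORE.update(self.data)


async def handler(item):
    async with ToyContext() as data:
        data["posts"].append(item)
        # Stands in for an awaited API call; other handlers run during it.
        await asyncio.sleep(0)


async def main():
    # Three updates arriving close together, like a media group.
    await asyncio.gather(handler("a"), handler("b"), handler("c"))
    return STORE["posts"]


result = asyncio.run(main())
print(len(result))
```

Each handler copies the empty list before any of them writes back, so every write-back overwrites the previous one and only one of the three items survives.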

Possible ways to fix

  • Add some kind of lock to the StateContext (maybe asyncio.Lock), so that StateContext acquires the lock in __aenter__ and releases it in __aexit__. This way we can guarantee that only one StateContext for the same data is open at a time.
  • Ensure that if more than one reference to the same storage is grabbed at a time, they all get the same copy of the data rather than separate copies.
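
The first option could look roughly like the sketch below. It is only an illustration under assumptions: `LockedContext`, `STORE`, and `LOCKS` are hypothetical names, the storage is a plain dict, and the real StateContext may be structured differently. The only library call relied on is `asyncio.Lock`.

```python
import asyncio
import copy
from collections import defaultdict

STORE = {}  # hypothetical storage keyed by (chat_id, user_id)
LOCKS = defaultdict(asyncio.Lock)  # one lock per key, created on first use


class LockedContext:
    """Sketch: hold a per-user lock from enter to exit."""

    def __init__(self, chat_id, user_id):
        self.key = (chat_id, user_id)

    async def __aenter__(self):
        await LOCKS[self.key].acquire()
        self.data = copy.deepcopy(STORE.setdefault(self.key, {"posts": []}))
        return self.data

    async def __aexit__(self, exc_type, exc, tb):
        try:
            STORE[self.key] = self.data
        finally:
            # Always release, even if saving raised.
            LOCKS[self.key].release()


async def handler(item):
    async with LockedContext(1, 1) as data:
        data["posts"].append(item)
        await asyncio.sleep(0)  # stands in for an awaited API call


async def main():
    await asyncio.gather(handler("a"), handler("b"), handler("c"))
    return STORE[(1, 1)]["posts"]


result = asyncio.run(main())
print(len(result))
```

The trade-off is that handlers for the same user now run one after another, so a slow call inside the block delays every later update for that user.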
@coder2020official
Collaborator

NEVER send messages inside that function

@YouKnow-sys
Contributor Author

NEVER send messages inside that function

I know I shouldn't do it, but it was just an example. It's still possible that we need to call another async function in order to modify the data or do something else, which is why I think this function should be usable in that scenario as well.

@BlocksDevPro

I found the issue.

The problem with the current bot.retrieve_data is that when you get multiple updates, it won't push the data on each processed update; it waits until all the updates are processed and then saves only the last one.

Explanation in code.

@bot.message_handler(content_types=['photo', 'video'])
async def on_post(message: types.Message):
    async with bot.retrieve_data(message.from_user.id, message.chat.id) as data:
        # For a media group, the data is only saved for the last update (the last media of the group),
        # so whatever you change for the other media of the group is lost.

        # If we append 1 to the list on every update, only the append from the last update is kept.
        # With 3 media in a group, we expect data['posts'] to be [1, 1, 1]
        data['posts'].append(1)
        # but we get data['posts'] as [1].

Solution

@bot.message_handler(content_types=['photo', 'video'])
async def on_post(message: types.Message):
    async with bot.retrieve_data(message.from_user.id, message.chat.id) as data:

        data['posts'].append(1)

        # The solution is simple: save the data manually.
        await bot.add_data(message.from_user.id, message.chat.id, posts=data['posts'])

        # Now when the user submits 3 media in a group, data['posts'] is [1, 1, 1].

@coder2020official
Collaborator

Use retrieve_data to quickly get data, alter it, or add to it. But never make any API requests inside it; that will solve the issue. Calling a function to add data inside retrieve_data is ridiculous.
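
A minimal sketch of that advice, using a toy stand-in rather than the real bot: `ToyContext` and `STORE` are hypothetical, and the model assumes the context copies the data on enter and writes it back on exit. The awaited call (here `asyncio.sleep(0)`, standing in for something like bot.reply_to) is moved after the block.

```python
import asyncio
import copy

STORE = {"posts": []}  # hypothetical stand-in for the state storage


class ToyContext:
    """Toy model (an assumption): copy on enter, write back on exit."""

    async def __aenter__(self):
        self.data = copy.deepcopy(STORE)
        return self.data

    async def __aexit__(self, exc_type, exc, tb):
        STORE.update(self.data)


async def handler(item):
    # Touch the data only; nothing is awaited inside the block.
    async with ToyContext() as data:
        data["posts"].append(item)
    # The slow call runs after the data has been saved.
    await asyncio.sleep(0)


async def main():
    await asyncio.gather(handler("a"), handler("b"), handler("c"))
    return STORE["posts"]


posts = asyncio.run(main())
print(len(posts))
```

In this model all three items are kept, because enter and exit never suspend. A storage backend that does real I/O on enter or exit may still leave a window for interleaving.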
