
Bug: list_and_count triggering a lot of SQL queries #193

Closed
1 of 4 tasks
sorasful opened this issue May 12, 2024 · 3 comments
Labels
bug Something isn't working

Comments


sorasful commented May 12, 2024

Description

I don't know if this could be related to issue #128.

Basically, I have a route that uses a Service which does a list_and_count. It took way too much time for this small select, so I decided to show the queries with the echo parameter. (I'm using the Litestar Fullstack repo, by the way.)

I noticed a lot of queries that looked like the ones below (see logs).

I know that I'm not using the latest version of advanced_alchemy, but I've looked through the commits since 0.8.3 and did not see anything related to this.

I am using a PostgreSQL 15.1 database.
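As a reproduction aid (this is a generic SQLAlchemy sketch, not code from the fullstack repo), an event listener counts exactly how many statements a block of code emits, which is easier than eyeballing echo output. For an AsyncEngine the listener goes on `engine.sync_engine`:

```python
from sqlalchemy import create_engine, event, text

# Stand-in in-memory engine for illustration; attach the listener to your real
# engine instead (for an AsyncEngine, listen on engine.sync_engine).
engine = create_engine("sqlite://")

statements: list[str] = []


@event.listens_for(engine, "before_cursor_execute")
def record_statement(conn, cursor, statement, parameters, context, executemany):
    # Fires once for every statement the engine hands to the DB driver.
    statements.append(statement)


with engine.connect() as conn:
    conn.execute(text("SELECT 1"))
    conn.execute(text("SELECT 2"))

print(len(statements))  # → 2
```

Calling one request handler between a counter reset and a read gives the per-request statement count directly.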

URL to code causing the issue

No response

MCVE

# Here are the main parts of my program



@dataclass
class DatabaseSettings:
    ECHO: bool = field(
        default_factory=lambda: os.getenv("DATABASE_ECHO", "False") in {"True", "1", "yes", "Y", "T", "debug"},
    )
    """Enable SQLAlchemy engine logs."""
    ECHO_POOL: bool = field(
        default_factory=lambda: os.getenv("DATABASE_ECHO_POOL", "False") in {"True", "1", "yes", "Y", "T", "debug"},
    )
    """Enable SQLAlchemy connection pool logs."""
    POOL_DISABLED: bool = field(
        default_factory=lambda: os.getenv("DATABASE_POOL_DISABLED", "False") in TRUE_VALUES,
    )
    """Disable SQLAlchemy pool configuration."""
    POOL_MAX_OVERFLOW: int = field(default_factory=lambda: int(os.getenv("DATABASE_MAX_POOL_OVERFLOW", "10")))
    """Max overflow for SQLAlchemy connection pool"""
    POOL_SIZE: int = field(default_factory=lambda: int(os.getenv("DATABASE_POOL_SIZE", "5")))
    """Pool size for SQLAlchemy connection pool"""
    POOL_TIMEOUT: int = field(default_factory=lambda: int(os.getenv("DATABASE_POOL_TIMEOUT", "30")))
    """Time in seconds for timing connections out of the connection pool."""
    POOL_RECYCLE: int = field(default_factory=lambda: int(os.getenv("DATABASE_POOL_RECYCLE", "300")))
    """Amount of time to wait before recycling connections."""
    POOL_PRE_PING: bool = field(
        default_factory=lambda: os.getenv("DATABASE_PRE_POOL_PING", "False") in TRUE_VALUES,
    )
    """Optionally ping database before fetching a session from the connection pool."""
    URL: str = field(default_factory=lambda: os.getenv("DATABASE_URL", "sqlite+aiosqlite:///db.sqlite3"))
    """SQLAlchemy Database URL."""
    MIGRATION_CONFIG: str = f"{BASE_DIR}/db/migrations/alembic.ini"
    """The path to the `alembic.ini` configuration file."""
    MIGRATION_PATH: str = f"{BASE_DIR}/db/migrations"
    """The path to the `alembic` database migrations."""
    MIGRATION_DDL_VERSION_TABLE: str = "ddl_version"
    """The name to use for the `alembic` versions table name."""
    FIXTURE_PATH: str = f"{BASE_DIR}/db/fixtures"
    """The path to JSON fixture files to load into tables."""
    _engine_instance: AsyncEngine | None = None
    """SQLAlchemy engine instance generated from settings."""

    @property
    def engine(self) -> AsyncEngine:
        return self.get_engine()

    def get_engine(self) -> AsyncEngine:
        if self._engine_instance is not None:
            return self._engine_instance
        if self.URL.startswith("postgres"):
            engine = create_async_engine(
                url=self.URL,
                future=True,
                json_serializer=encode_json,
                json_deserializer=decode_json,
                echo=self.ECHO,
                echo_pool=self.ECHO_POOL,
                max_overflow=self.POOL_MAX_OVERFLOW,
                pool_size=self.POOL_SIZE,
                pool_timeout=self.POOL_TIMEOUT,
                pool_recycle=self.POOL_RECYCLE,
                pool_pre_ping=self.POOL_PRE_PING,
                pool_use_lifo=True,  # use lifo to reduce the number of idle connections
                connect_args={"server_settings": {"jit": "off"}},
                poolclass=NullPool if self.POOL_DISABLED else None,
            )
            """Database session factory.

            See [`async_sessionmaker()`][sqlalchemy.ext.asyncio.async_sessionmaker].
            """

            @event.listens_for(engine.sync_engine, "connect")
            def _sqla_on_connect(dbapi_connection: Any, _: Any) -> Any:  # pragma: no cover
                """Using msgspec for serialization of the json column values means that the
                output is binary, not `str` like `json.dumps` would output.
                SQLAlchemy expects that the json serializer returns `str` and calls `.encode()` on the value to
                turn it to bytes before writing to the JSONB column. I'd need to either wrap `serialization.to_json` to
                return a `str` so that SQLAlchemy could then convert it to binary, or do the following, which
                changes the behaviour of the dialect to expect a binary value from the serializer.
                See Also https://github.com/sqlalchemy/sqlalchemy/blob/14bfbadfdf9260a1c40f63b31641b27fe9de12a0/lib/sqlalchemy/dialects/postgresql/asyncpg.py#L934  pylint: disable=line-too-long
                """

                def encoder(bin_value: bytes) -> bytes:
                    return b"\x01" + encode_json(bin_value)

                def decoder(bin_value: bytes) -> Any:
                    # the byte is the \x01 prefix for jsonb used by PostgreSQL.
                    # asyncpg returns it when format='binary'
                    return decode_json(bin_value[1:])

                dbapi_connection.await_(
                    dbapi_connection.driver_connection.set_type_codec(
                        "jsonb",
                        encoder=encoder,
                        decoder=decoder,
                        schema="pg_catalog",
                        format="binary",
                    ),
                )
                dbapi_connection.await_(
                    dbapi_connection.driver_connection.set_type_codec(
                        "json",
                        encoder=encoder,
                        decoder=decoder,
                        schema="pg_catalog",
                        format="binary",
                    ),
                )

        elif self.URL.startswith("sqlite"):
            engine = create_async_engine(
                url=self.URL,
                future=True,
                json_serializer=encode_json,
                json_deserializer=decode_json,
                echo=self.ECHO,
                echo_pool=self.ECHO_POOL,
                pool_recycle=self.POOL_RECYCLE,
                pool_pre_ping=self.POOL_PRE_PING,
            )
            """Database session factory.

            See [`async_sessionmaker()`][sqlalchemy.ext.asyncio.async_sessionmaker].
            """

            @event.listens_for(engine.sync_engine, "connect")
            def _sqla_on_connect(dbapi_connection: Any, _: Any) -> Any:  # pragma: no cover
                """Override the default begin statement.  The disables the built in begin execution."""
                dbapi_connection.isolation_level = None

            @event.listens_for(engine.sync_engine, "begin")
            def _sqla_on_begin(dbapi_connection: Any) -> Any:  # pragma: no cover
                """Emits a custom begin"""
                dbapi_connection.exec_driver_sql("BEGIN")

        else:
            engine = create_async_engine(
                url=self.URL,
                future=True,
                json_serializer=encode_json,
                connect_args={"server_settings": {"jit": "off"}},
                json_deserializer=decode_json,
                echo=self.ECHO,
                echo_pool=self.ECHO_POOL,
                max_overflow=self.POOL_MAX_OVERFLOW,
                pool_size=self.POOL_SIZE,
                pool_timeout=self.POOL_TIMEOUT,
                pool_recycle=self.POOL_RECYCLE,
                pool_pre_ping=self.POOL_PRE_PING,
            )
        self._engine_instance = engine
        return self._engine_instance
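Side note on the settings above: ECHO and ECHO_POOL compare against an inline literal set while the other flags use TRUE_VALUES, so the two can drift apart. A single helper (hypothetical name `env_flag`; TRUE_VALUES shown here is illustrative, mirroring what the snippet assumes) keeps the parsing uniform:

```python
import os

# Illustrative stand-in for the TRUE_VALUES constant assumed by the settings above.
TRUE_VALUES = {"True", "true", "1", "yes", "Y", "T", "debug"}


def env_flag(name: str, default: str = "False") -> bool:
    """Parse a boolean-like environment variable consistently."""
    return os.getenv(name, default) in TRUE_VALUES


os.environ["DATABASE_ECHO"] = "yes"
print(env_flag("DATABASE_ECHO"))       # → True
print(env_flag("DATABASE_ECHO_POOL"))  # → False
```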


# Models 


class User(UUIDAuditBase):
    __tablename__ = "user_accounts"  # type: ignore[assignment]
    __table_args__ = {"comment": "User accounts for application access"}
    __pii_columns__ = {"name", "email", "avatar_url"}

    email: Mapped[str] = mapped_column(unique=True, index=True, nullable=False)
    name: Mapped[str | None] = mapped_column(nullable=True, default=None)
    # ...

    devices: Mapped[list[Device]] = relationship(
        back_populates="user",
        lazy="selectin",
        uselist=True,
        cascade="all, delete",
    )
    groups: Mapped[list[Group]] = relationship(
        back_populates="user",
        uselist=True,
        lazy="selectin",
        cascade="all, delete",
    )


    def __repr__(self) -> str:
        return self.email



class Group(UUIDAuditBase):
    """Group of devices that belong to a specific user."""

    __tablename__ = "groups"
    __table_args__ = {"comment": "A group of devices belonging to a user"}

    name: Mapped[str] = mapped_column(String(length=255), nullable=False, index=True)

    # -----------
    # ORM Relationships
    # ------------
    user_id: Mapped[UUID] = mapped_column(ForeignKey("user_accounts.id", ondelete="cascade"), nullable=False)
    user: Mapped[User] = relationship(
        back_populates="groups",
        innerjoin=True,
        uselist=False,
        lazy="joined",
    )

    devices: Mapped[list[Device]] = relationship(
        secondary=device_group_association,
        back_populates="groups",
        lazy="selectin",
    )

    def __repr__(self) -> str:
        return self.name



class Device(UUIDAuditBase):
    """Device to send push notifications to."""

    __tablename__ = "devices"
    __table_args__ = (UniqueConstraint("user_id", "expo_token"),)  # type: ignore[assignment]


    # -----------
    # ORM Relationships
    # ------------
    user_id: Mapped[UUID] = mapped_column(ForeignKey("user_accounts.id", ondelete="cascade"), nullable=False)
    user: Mapped[User] = relationship(
        back_populates="devices",
        innerjoin=True,
        uselist=False,
        lazy="joined",
    )

    groups: Mapped[list[Group]] = relationship(
        secondary=device_group_association,
        back_populates="devices",
    )

    def __repr__(self) -> str:
        return self.expo_token




async def provide_groups_service(
    db_session: AsyncSession | None = None,
) -> AsyncGenerator[GroupService, None]:
    """

    Args:
        db_session (AsyncSession | None, optional): current database session. Defaults to None.

    """
    async with GroupService.new(
        session=db_session,
        statement=select(Group).options(
            selectinload(Group.devices),
        ),
    ) as service:
        yield service


# And the controller

class GroupController(Controller):
    tags = ["Groups"]

    dependencies = {
        "users_service": Provide(provide_users_service),
        "devices_service": Provide(provide_devices_service),
        "groups_service": Provide(provide_groups_service),
    }

    path = "/api/groups"

    @get(
        path="/",
        media_type=MediaType.JSON,
        cache=False,
        status_code=200,
        guards=[],
        summary="Route for an user to list the groups and count.",
    )
    async def list_groups(
        self,
        request: Request,
        groups_service: GroupService,
        current_user: UserModel,
    ) -> OffsetPagination[GroupShow]:

        results, total = await groups_service.list_and_count(user_id=current_user.id)

        return groups_service.to_schema(GroupShow, results, total)

Steps to reproduce

No response

Screenshots

No response

Logs

INFO - 2024-05-12 23:01:43,454 - sqlalchemy.engine.Engine - base - SELECT user_account_roles.user_id AS user_account_roles_user_id, user_account_roles.role_id AS user_account_roles_role_id, user_account_roles.assigned_at AS user_account_roles_assigned_at, user_account_roles.id AS user_account_roles_id, user_account_roles.sa_orm_sentinel AS user_account_roles_sa_orm_sentinel, user_account_roles.created_at AS user_account_roles_created_at, user_account_roles.updated_at AS user_account_roles_updated_at, user_accounts_1.email AS user_accounts_1_email, user_accounts_1.name AS user_accounts_1_name, user_accounts_1.team_name AS user_accounts_1_team_name, user_accounts_1.hashed_password AS user_accounts_1_hashed_password, user_accounts_1.avatar_url AS user_accounts_1_avatar_url, user_accounts_1.is_active AS user_accounts_1_is_active, user_accounts_1.is_superuser AS user_accounts_1_is_superuser, user_accounts_1.is_verified AS user_accounts_1_is_verified, user_accounts_1.is_demo AS user_accounts_1_is_demo, user_accounts_1.verified_at AS user_accounts_1_verified_at, user_accounts_1.joined_at AS user_accounts_1_joined_at, user_accounts_1.login_count AS user_accounts_1_login_count, user_accounts_1.premium_plan AS user_accounts_1_premium_plan, user_accounts_1.premium_plan_expiration_date AS user_accounts_1_premium_plan_expiration_date, user_accounts_1.lemonsqueezy_customer_id AS user_accounts_1_lemonsqueezy_customer_id, user_accounts_1.id AS user_accounts_1_id, user_accounts_1.sa_orm_sentinel AS user_accounts_1_sa_orm_sentinel, user_accounts_1.created_at AS user_accounts_1_created_at, user_accounts_1.updated_at AS user_accounts_1_updated_at, roles_1.slug AS roles_1_slug, roles_1.name AS roles_1_name, roles_1.description AS roles_1_description, roles_1.id AS roles_1_id, roles_1.sa_orm_sentinel AS roles_1_sa_orm_sentinel, roles_1.created_at AS roles_1_created_at, roles_1.updated_at AS roles_1_updated_at 
FROM user_account_roles JOIN user_accounts AS user_accounts_1 ON user_accounts_1.id = user_account_roles.user_id JOIN roles AS roles_1 ON roles_1.id = user_account_roles.role_id 
WHERE user_account_roles.user_id IN ($1::UUID)
INFO - 2024-05-12 23:01:43,456 - sqlalchemy.engine.Engine - base - [caching disabled (excess depth for ORM loader options) 0.00310s ] ('11db6fe2-861b-49c7-97db-6b62c753ca5f',)
2024-05-12 23:01:43,503 INFO sqlalchemy.engine.Engine SELECT user_account_roles.user_id AS user_account_roles_user_id, user_account_roles.role_id AS user_account_roles_role_id, user_account_roles.assigned_at AS user_account_roles_assigned_at, user_account_roles.id AS user_account_roles_id, user_account_roles.sa_orm_sentinel AS user_account_roles_sa_orm_sentinel, user_account_roles.created_at AS user_account_roles_created_at, user_account_roles.updated_at AS user_account_roles_updated_at, user_accounts_1.email AS user_accounts_1_email, user_accounts_1.name AS user_accounts_1_name, user_accounts_1.team_name AS user_accounts_1_team_name, user_accounts_1.hashed_password AS user_accounts_1_hashed_password, user_accounts_1.avatar_url AS user_accounts_1_avatar_url, user_accounts_1.is_active AS user_accounts_1_is_active, user_accounts_1.is_superuser AS user_accounts_1_is_superuser, user_accounts_1.is_verified AS user_accounts_1_is_verified, user_accounts_1.is_demo AS user_accounts_1_is_demo, user_accounts_1.verified_at AS user_accounts_1_verified_at, user_accounts_1.joined_at AS user_accounts_1_joined_at, user_accounts_1.login_count AS user_accounts_1_login_count, user_accounts_1.premium_plan AS user_accounts_1_premium_plan, user_accounts_1.premium_plan_expiration_date AS user_accounts_1_premium_plan_expiration_date, user_accounts_1.lemonsqueezy_customer_id AS user_accounts_1_lemonsqueezy_customer_id, user_accounts_1.id AS user_accounts_1_id, user_accounts_1.sa_orm_sentinel AS user_accounts_1_sa_orm_sentinel, user_accounts_1.created_at AS user_accounts_1_created_at, user_accounts_1.updated_at AS user_accounts_1_updated_at, roles_1.slug AS roles_1_slug, roles_1.name AS roles_1_name, roles_1.description AS roles_1_description, roles_1.id AS roles_1_id, roles_1.sa_orm_sentinel AS roles_1_sa_orm_sentinel, roles_1.created_at AS roles_1_created_at, roles_1.updated_at AS roles_1_updated_at 
FROM user_account_roles JOIN user_accounts AS user_accounts_1 ON user_accounts_1.id = user_account_roles.user_id JOIN roles AS roles_1 ON roles_1.id = user_account_roles.role_id 
WHERE user_account_roles.user_id IN ($1::UUID)
2024-05-12 23:01:43,504 INFO sqlalchemy.engine.Engine [caching disabled (excess depth for ORM loader options) 0.00148s ] ('11db6fe2-861b-49c7-97db-6b62c753ca5f',)

Package Version

0.8.3

Platform

  • Linux
  • Mac
  • Windows
  • Other (Please specify in the description above)
@sorasful sorasful added the bug Something isn't working label May 12, 2024
sorasful (Author) commented:

Actually, I managed to update to the latest version 0.9.4 and now I have the same problem with the /api/access/me route.

class AccessController(Controller):
    """User login and registration."""

    path = "/api/access"
    tags = ["Access"]
    dependencies = {
        "users_service": Provide(provide_users_service),
        "roles_service": Provide(provide_roles_service),
        "magic_links_service": Provide(provide_magic_link_service),
    }
    signature_namespace = {
        "UserService": UserService,
        "RoleService": RoleService,
        "OAuth2Login": OAuth2Login,
        "RequestEncodingType": RequestEncodingType,
        "Body": Body,
        "User": User,
        "MagicLinkService": MagicLinkService,
    } 
    
    @get(
        operation_id="AccountProfile",
        name="account:profile",
        path=urls.ACCOUNT_PROFILE,
        guards=[requires_active_user],
        summary="User Profile",
        description="User profile information.",
    )
    async def profile(self, request: Request, current_user: UserModel, users_service: UserService) -> User:
        """User Profile."""
        return users_service.to_schema(current_user, schema_type=User)

The logs are very similar to the ones I provided earlier.

sorasful (Author) commented:

Actually, it may just be something related to the way I load all my relations.

I'll keep this updated.

sorasful (Author) commented:

Okay, this was clearly caused by how I loaded my relations.

My User has Devices and Groups, and my Groups have Devices, which produced a funny result: a lot of loading and a recursion depth error.

This is not related to advanced_alchemy, so I'm closing this.
