Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

META total_found is wrong for a SQL query with HAVING #2094

Open
8 tasks
mytskine opened this issue Apr 26, 2024 · 2 comments
Open
8 tasks

META total_found is wrong for a SQL query with HAVING #2094

mytskine opened this issue Apr 26, 2024 · 2 comments
Assignees

Comments

@mytskine
Copy link

Bug Description:

This is a simplest way I could find to reproduce the bug.
After mysql --port 9306, I enter these SQL lines into an empty DB:

create table category (groupid int);
insert into category (groupid) values (1), (2), (1), (3), (4), (4);

-- The data
-- Plain GROUP BY
select groupid from category GROUP BY groupid;
show meta like 'total%';
-- | total          | 4     |
-- | total_found    | 4     |

-- With HAVING
select groupid, count(*) as c from category GROUP BY groupid HAVING c > 1;
-- +---------+------+
-- | groupid | c    |
-- +---------+------+
-- |       1 |    2 |
-- |       4 |    2 |
-- +---------+------+
show meta like 'total%';
-- +----------------+-------+
-- | Variable_name  | Value |
-- +----------------+-------+
-- | total          | 2     |
-- | total_found    | 4     |
-- | total_relation | eq    |
-- +----------------+-------+

There were only 2 rows in the response to the last query, yet the metadata state that there were 4, which is wrong.

The bug was detected with real-world data and many queries: HAVING seems to have no influence at all on the total_found value.

I didn't test with the testing release of Manticore since I saw no mention of a similar bug in the issues or the changelog.

Manticore Search Version:

6.2.12

Operating System Version:

Debian stable

Have you tried the latest development version?

  • Yes

Internal Checklist:

To be completed by the assignee. Check off tasks that have been completed or are not applicable.

  • Task estimated
  • Specification created, reviewed, and approved
  • Implementation completed
  • Tests developed
  • Documentation updated
  • Documentation proofread
  • Changelog updated
@mytskine mytskine added the bug label Apr 26, 2024
@sanikolaev
Copy link
Collaborator

@mytskine thanks for the issue. It's better to not rely on total_found as a source of a precise results count. As said in the docs:

The estimated total number of matches for the query in the index.

We'll think what we can do in case of HAVING though. One-liner MRE:

mysql> drop table if exists category; create table category (groupid int); insert into category (groupid) values (1), (2), (1), (3), (4), (4); select groupid, count(*) as c from category GROUP BY groupid HAVING c > 1; show meta;
--------------
drop table if exists category
--------------

Query OK, 0 rows affected (0.04 sec)

--------------
create table category (groupid int)
--------------

Query OK, 0 rows affected (0.00 sec)

--------------
insert into category (groupid) values (1), (2), (1), (3), (4), (4)
--------------

Query OK, 6 rows affected (0.00 sec)

--------------
select groupid, count(*) as c from category GROUP BY groupid HAVING c > 1
--------------

+---------+------+
| groupid | c    |
+---------+------+
|       1 |    2 |
|       4 |    2 |
+---------+------+
2 rows in set (0.00 sec)
--- 2 out of 4 results in 0ms ---

--------------
show meta
--------------

+----------------+-------+
| Variable_name  | Value |
+----------------+-------+
| total          | 2     |
| total_found    | 4     |
| total_relation | eq    |
| time           | 0.000 |
+----------------+-------+
4 rows in set (0.00 sec)

@sanikolaev
Copy link
Collaborator

Let's improve docs about:

  • total_found
  • HAVING, REMOVE_REPEATS etc.

so the current behaviour is not confusing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants