Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cache is created but unable to query data #1247

Open
santhoshstalin opened this issue May 6, 2024 · 5 comments
Open

cache is created but unable to query data #1247

santhoshstalin opened this issue May 6, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@santhoshstalin
Copy link

this is interesting, while testing readyset with sbtest table found that the proxied queries shows supported but after creation of cache the data is not visible
tried in small table second time and same behavior is seen
the datatype in where condition is char

mysql> drop cache q_a84848222487ada9;
Query OK, 1 row affected (0.01 sec)

mysql> select count() from sbtest2 where c="17683594789-79094539502-64837025634-64401068468-93240347875-80207879399-06064830238-97247190956-35331442429-27373799568";
+----------+
| count(
) |
+----------+
| 1 |
+----------+
1 row in set (0.00 sec)

mysql> create cache from q_a84848222487ada9;
Query OK, 0 rows affected (0.01 sec)

mysql> select count() from sbtest2 where c="17683594789-79094539502-64837025634-64401068468-93240347875-80207879399-06064830238-97247190956-35331442429-27373799568";
+----------+
| count(
) |
+----------+
| 0 |
+----------+
1 row in set (0.00 sec)

mysql> show create table sbtest2;
+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| sbtest2 | CREATE TABLE sbtest2 (
id int NOT NULL AUTO_INCREMENT,
k int NOT NULL DEFAULT '0',
c char(120) NOT NULL DEFAULT '',
pad char(60) NOT NULL DEFAULT '',
PRIMARY KEY (id),
KEY k_2 (k)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=latin1

@santhoshstalin santhoshstalin added the bug Something isn't working label May 6, 2024
@santhoshstalin santhoshstalin changed the title cache is created without data cache is created but unable to query data May 6, 2024
@santhoshstalin
Copy link
Author

Please highlight if I've left anything out

@altmannmarcelo
Copy link
Contributor

Hi @santhoshstalin ,

Thanks for raising this issue. I can confirm I can reproduce it. We will start investigating it.

@santhoshstalin
Copy link
Author

santhoshstalin commented May 6, 2024

thanks , pfb more info:

Instance info: AWS RDS MySQL (8.0.35 Source distribution)
Benchmarking tool sysbench is used to populate 1M records to sbtest1 table
Client machine: Ec2 - Ubuntu 24.04 LTS (2 core / 4G running only readyset)

mysql> select count() from sbtest1;
+----------+
| count(
) |
+----------+
| 1000000 |
+----------+

mysql> show create table sbtest1;
+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| sbtest1 | CREATE TABLE sbtest1 (
id int NOT NULL AUTO_INCREMENT,
k int NOT NULL DEFAULT '0',
c char(120) NOT NULL DEFAULT '',
pad char(60) NOT NULL DEFAULT '',
PRIMARY KEY (id),
KEY k_1 (k)
) ENGINE=InnoDB AUTO_INCREMENT=1000001 DEFAULT CHARSET=latin1 |

mysql> select k from sbtest1 where c="63898592995-24128622894-65843699462-08660435971-77519216553-68029124508-89851108208-50959617462-51765555050-53840398311";
+--------+
| k |
+--------+
| 430614 |
+--------+

===

Using the caching layer:

Server version: 8.0.26-readyset Source distribution

mysql> select k from sbtest1 where c="63898592995-24128622894-65843699462-08660435971-77519216553-68029124508-89851108208-50959617462-51765555050-53840398311";
+--------+
| k |
+--------+
| 430614 |
+--------+
1 row in set (0.32 sec)

mysql> show proxied queries;
+--------------------+--------------------------------------------+--------------------+-------+
| query id | proxied query | readyset supported | count |
+--------------------+--------------------------------------------+--------------------+-------+
| q_5416b1e51211f8a | SELECT k FROM sbtest1 WHERE (c = $1) | yes | 0 |

mysql> create cache from q_5416b1e51211f8a;
Query OK, 0 rows affected (8.01 sec)

mysql>
mysql>
mysql> select k from sbtest1 where c="63898592995-24128622894-65843699462-08660435971-77519216553-68029124508-89851108208-50959617462-51765555050-53840398311";
Empty set (0.00 sec)

On creating the cache the data retrieval is not happening.


the reverse -- searching for int value using char works fine

mysql>
mysql> select c from sbtest1 where k=430614;
+-------------------------------------------------------------------------------------------------------------------------+
| c |
+-------------------------------------------------------------------------------------------------------------------------+
| 63898592995-24128622894-65843699462-08660435971-77519216553-68029124508-89851108208-50959617462-51765555050-53840398311 |
+-------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

mysql>
mysql> show proxied queries;
+--------------------+--------------------------------------------+--------------------+-------+
| query id | proxied query | readyset supported | count |
+--------------------+--------------------------------------------+--------------------+-------+
| q_2df945b4c14f7d8f | SELECT c FROM sbtest1 WHERE (k = $1) | yes | 0 |

mysql> create cache from q_2df945b4c14f7d8f;
Query OK, 0 rows affected (7.00 sec)

mysql> select c from sbtest1 where k=430614;
+-------------------------------------------------------------------------------------------------------------------------+
| c |
+-------------------------------------------------------------------------------------------------------------------------+
| 63898592995-24128622894-65843699462-08660435971-77519216553-68029124508-89851108208-50959617462-51765555050-53840398311 |
+-------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql>

====

mysql> show caches;
+--------------------+--------------------+-------------------------------------------------------------------------------------------+-------------------+-------+
| query id | cache name | query text | fallback behavior | count |
+--------------------+--------------------+-------------------------------------------------------------------------------------------+-------------------+-------+
| q_5416b1e51211f8a | q_5416b1e51211f8a | SELECT sbtest.sbtest1.k FROM sbtest.sbtest1 WHERE (sbtest.sbtest1.c = $1) | fallback allowed | 0 |
| q_2df945b4c14f7d8f | q_2df945b4c14f7d8f | SELECT sbtest.sbtest1.c FROM sbtest.sbtest1 WHERE (sbtest.sbtest1.k = $1) | fallback allowed | 0 |
+--------------------+--------------------+-------------------------------------------------------------------------------------------+-------------------+-------+
2 rows in set (0.00 sec)

====

on dropping the cache the value is retrieved. Same is tested for a smaller table with 2 records as well

mysql> show caches;
+--------------------+--------------------+-------------------------------------------------------------------------------------------+-------------------+-------+
| query id | cache name | query text | fallback behavior | count |
+--------------------+--------------------+-------------------------------------------------------------------------------------------+-------------------+-------+
| q_5416b1e51211f8a | q_5416b1e51211f8a | SELECT sbtest.sbtest1.k FROM sbtest.sbtest1 WHERE (sbtest.sbtest1.c = $1) | fallback allowed | 0 |
| q_2df945b4c14f7d8f | q_2df945b4c14f7d8f | SELECT sbtest.sbtest1.c FROM sbtest.sbtest1 WHERE (sbtest.sbtest1.k = $1) | fallback allowed | 0 |
+--------------------+--------------------+-------------------------------------------------------------------------------------------+-------------------+-------+
2 rows in set (0.00 sec)

mysql>
mysql> select k from sbtest1 where c="63898592995-24128622894-65843699462-08660435971-77519216553-68029124508-89851108208-50959617462-51765555050-53840398311";
Empty set (0.00 sec)

mysql>
mysql> drop cache q_5416b1e51211f8a;
Query OK, 1 row affected (0.00 sec)

mysql> show caches;
+--------------------+--------------------+-------------------------------------------------------------------------------------------+-------------------+-------+
| query id | cache name | query text | fallback behavior | count |
+--------------------+--------------------+-------------------------------------------------------------------------------------------+-------------------+-------+
| q_2df945b4c14f7d8f | q_2df945b4c14f7d8f | SELECT sbtest.sbtest1.c FROM sbtest.sbtest1 WHERE (sbtest.sbtest1.k = $1) | fallback allowed | 0 |
+--------------------+--------------------+-------------------------------------------------------------------------------------------+-------------------+-------+
1 row in set (0.00 sec)

mysql> select k from sbtest1 where c="63898592995-24128622894-65843699462-08660435971-77519216553-68029124508-89851108208-50959617462-51765555050-53840398311";
+--------+
| k |
+--------+
| 430614 |
+--------+
1 row in set (0.26 sec)

@linuxgood1230
Copy link

linuxgood1230 commented May 10, 2024

Check the log, the output (SQL: Create INDEX k_2 ON sbtest2(k))
WARN replicators::mysql_connector::connector: Error extending recipe, DDL statement will not be used error=Query failed to parse: INDEX k_2 ON sbtest2(k)

fix: trigger Change in DDL requires partial resnapshot
alter table sbtest2 drop index k_2;

@altmannmarcelo
Copy link
Contributor

An update on this issue. The problem is caused by CHAR fields which require padding if the row value does not fill the column width. For example:

CREATE TABLE tb (ID INT, c CHAR(2));
INSERT INTO tb VALUES (1, 'A');

Currently we don't have a proper way to identify how many spaces we need to pad when saving this data to disk.
We have submitted a patch to the mysql rust drive we use at blackbeam/rust_mysql_common#135 to add the dictionary of collations necessary for this and other metadata information about collations.
Once this work is incorporated into the driver we will be able to properly lookup collations and calculate the necessary padding required to store those values in disk.

readysetbot pushed a commit that referenced this issue May 21, 2024
This commits adds proper collation support for CHAR and BINARY columns
in MySQL.
CHAR columns should be right padded with spaces to the column length
when storing them and BINARY should right pad zeros.

This commit fixes the issue at snapshot - During snapshot we do a
logical dump of data. MySQL removes padding spaces from CHAR columns
when retrieving them. So, we need to take the column collation into
consideration when storing them. One gotcha is with ENUM/SET columns,
they are retrieved as Strings(MYSQL_TYPE_STRING), but we should not
pad them.
During CDC, we need to retrieve proper
metadata from TME in order to validate if padding is necessary or not.

This commit also fixes an issue when storing BINARY columns. We were
storing them as TinyText/Text if the binary representation of the
columns was a valid UTF-8 string. This is not correct. We should store
them as ByteArray.

Test cases were written taking into consideration a mix of characters
from different bytes, like mixing ASCII and UTF-8 characters from
2nd and 3rd bytes.

Note: MySQL uses the terminology of charset and collation interchangeably.
In the end everything is stored as collation ID, which can be used to
determine the charset and collation.

Ref: REA-4366
Ref: REA-4383
Closes: #1247 #1259

Release-Note-Core: Added collation support for storing CHAR and BINARY
   columns in MySQL is correct padding. This fixes and issue when
   looking up CHAR/BINARY columns with values that do not match the
   column length.

Change-Id: Ibb436b99b46500f940efe79d06d86494bfc4bf30
readysetbot pushed a commit that referenced this issue May 21, 2024
This commits adds proper collation support for CHAR and BINARY columns
in MySQL.
CHAR columns should be right padded with spaces to the column length
when storing them and BINARY should right pad zeros.

This commit fixes the issue at snapshot - During snapshot we do a
logical dump of data. MySQL removes padding spaces from CHAR columns
when retrieving them. So, we need to take the column collation into
consideration when storing them. One gotcha is with ENUM/SET columns,
they are retrieved as Strings(MYSQL_TYPE_STRING), but we should not
pad them.
During CDC, we need to retrieve proper
metadata from TME in order to validate if padding is necessary or not.

This commit also fixes an issue when storing BINARY columns. We were
storing them as TinyText/Text if the binary representation of the
columns was a valid UTF-8 string. This is not correct. We should store
them as ByteArray.

Test cases were written taking into consideration a mix of characters
from different bytes, like mixing ASCII and UTF-8 characters from
2nd and 3rd bytes.

Note: MySQL uses the terminology of charset and collation interchangeably.
In the end everything is stored as collation ID, which can be used to
determine the charset and collation.

Ref: REA-4366
Ref: REA-4383
Closes: #1247 #1259

Release-Note-Core: Added collation support for storing CHAR and BINARY
   columns in MySQL using the correct padding. This fixes an issue when
   looking up CHAR/BINARY columns with values that do not match the
   column length.

Change-Id: Ibb436b99b46500f940efe79d06d86494bfc4bf30
readysetbot pushed a commit that referenced this issue May 24, 2024
This commits adds proper collation support for CHAR and BINARY columns
in MySQL.
CHAR columns should be right padded with spaces to the column length
when storing them and BINARY should right pad zeros.

This commit fixes the issue at snapshot - During snapshot we do a
logical dump of data. MySQL removes padding spaces from CHAR columns
when retrieving them. So, we need to take the column collation into
consideration when storing them. One gotcha is with ENUM/SET columns,
they are retrieved as Strings(MYSQL_TYPE_STRING), but we should not
pad them.
During CDC, we need to retrieve proper
metadata from TME in order to validate if padding is necessary or not.

This commit also fixes an issue when storing BINARY columns. We were
storing them as TinyText/Text if the binary representation of the
columns was a valid UTF-8 string. This is not correct. We should store
them as ByteArray.

Test cases were written taking into consideration a mix of characters
from different bytes, like mixing ASCII and UTF-8 characters from
2nd and 3rd bytes.

Note: MySQL uses the terminology of charset and collation interchangeably.
In the end everything is stored as collation ID, which can be used to
determine the charset and collation.

Ref: REA-4366
Ref: REA-4383
Closes: #1247 #1259

Release-Note-Core: Added collation support for storing CHAR and BINARY
   columns in MySQL using the correct padding. This fixes an issue when
   looking up CHAR/BINARY columns with values that do not match the
   column length.

Change-Id: Ibb436b99b46500f940efe79d06d86494bfc4bf30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants