Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: SQLTarget - missing automapping for FeatureSet definition #5051

Open
2 tasks done
george0st opened this issue Feb 2, 2024 · 0 comments
Open
2 tasks done

[Bug]: SQLTarget - missing automapping for FeatureSet definition #5051

george0st opened this issue Feb 2, 2024 · 0 comments

Comments

@george0st
Copy link
Collaborator

george0st commented Feb 2, 2024

MLRun Version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of MLRun CE.

Reproducible Example

import mlrun
import mlrun.feature_store as fstore
from mlrun.datastore.targets import RedisNoSqlTarget,SQLTarget
from mlrun.data_types.data_types import ValueType
import pandas as pd

def mysql_test(project_name="bb"):

    mlrun.set_env_from_file("mlrun-nonprod.env")
    project = mlrun.get_or_create_project(project_name, context='./', user_project=False)

    feature_name="basic-party"
    feature_set = fstore.FeatureSet(feature_name, entities=[fstore.Entity("party-id", value_type=ValueType.INT64)],
                                    engine="storey")
    feature_set.add_feature(fstore.Feature(name="party-type"))

    conn="mysql+pymysql://jist:jist@localhost:3306/test"

    # Sample 1. - ingest the FeatureSet works fine with schema definition
    feature_set.set_targets(targets=[SQLTarget(name="we2", db_url=conn, table_name = 'my_table',
                                               schema = {'party-id': int, 'party-type': str},
                                               create_table=True,
                                               primary_key_column='party-id')],
                            with_defaults=False)

    # Sample 2. - Error (without schema definition)
    #  File "C:\Python\test\.venv\lib\site-packages\mlrun\datastore\targets.py", line 294, in add_target_steps
    #     table = driver.get_table_object() or table
    #   File "C:\Python\test\.venv\lib\site-packages\mlrun\datastore\targets.py", line 1690, in get_table_object
    #     (db_path, table_name, _, _, primary_key, _) = self._parse_url()
    #   File "C:\Python\test\.venv\lib\site-packages\mlrun\datastore\targets.py", line 1804, in _parse_url
    #     path = self.path[len("mlrunSql:///") :]
    # TypeError: 'NoneType' object is not subscriptable
    feature_set.set_targets(targets=[SQLTarget(name="we3",
                                               db_url=conn)],
                            with_defaults=False)

    # Sample 3. - Error (without schema definition)
    #  File "C:\Python\test\.venv\lib\site-packages\pymysql\protocol.py", line 221, in raise_for_error
    #     err.raise_mysql_exception(self._data)
    #   File "C:\Python\test\.venv\lib\site-packages\pymysql\err.py", line 143, in raise_mysql_exception
    #     raise errorclass(errno, errval)
    # sqlalchemy.exc.ProgrammingError: (pymysql.err.ProgrammingError) (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ')' at line 2")
    # [SQL:
    # CREATE TABLE my_table2 (
    # )
    #
    # ]
    # (Background on this error at: https://sqlalche.me/e/14/f405)
    feature_set.set_targets(targets=[SQLTarget(name="we4",
                                               db_url=conn,
                                               table_name = 'my_table2',
                                               create_table=True)],
                            with_defaults=False)


    feature_set.save()
    data={"party-id": [1,2,3],
          "party-type": ["a1", "b2", "c3"]}
    dataFrm=pd.DataFrame(data)
    fstore.ingest(feature_set,dataFrm,overwrite=True)

if __name__ == '__main__':
    mysql_test(project)

Issue Description

I mentioned in sample three cases. First is fine and it works correctly, second generated error ' TypeError: 'NoneType' object is not subscriptable' and third generated error 'You have an error in your SQL syntax; check ...'.

It seems that SQLTarget support only manual mapping of FeatureSet to SQLTarget based on parameter schema and it is different behavioral than in e.g. RedisNoSqlTarget where the redis target works without schema mapping).

Expected Behavior

It will be very useful to support SQLTarget without manual schema definition, I think about ability to use only this easy sample
SQLTarget(name="we3", db_url=conn)

It means, the schema will be generate automatically (in background) based on FeatureSet definition (in case, that schema was not added manually).

Installation OS

Windows

Installation Method

Docker

Python Version

3.9.10

MLRun Version

1.5.2

Additional Information

Setting for MySQL see https://github.com/george0st/qgate-sln-mlrun/blob/master/docs/mysql.md

@george0st george0st changed the title [Bug]: SQLTarget - missing automapping to FeatureSet definition [Bug]: SQLTarget - missing automapping for FeatureSet definition Feb 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant