Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parser().tables fix for missing schema in ms_sql and sybase #475

Open
davlee1972 opened this issue Mar 20, 2024 · 2 comments
Open

Parser().tables fix for missing schema in ms_sql and sybase #475

davlee1972 opened this issue Mar 20, 2024 · 2 comments

Comments

@davlee1972
Copy link

davlee1972 commented Mar 20, 2024

Can we modify the two functions below?

Microsoft SQL Server and Sybase both allow empty db schema(s).
If empty it defaults to either the user login or "dbo" depending if user.table or dbo.table exists.

This works..

>>> from sql_metadata import Parser
>>> Parser("select 1 from my_db.my_schema.my_table").tables
['my_db.my_schema.my_table']

But this does not:

>>> from sql_metadata import Parser
>>> Parser("select 1 from my_db..my_table").tables
['..my_table']

The code modifications will resolve to proper table names..

>>> from sql_metadata import Parser
>>> Parser("select 1 from my_db..my_table").tables
['my_db..my_table']

Code fix: Only step back one token if schema is missing.

    def _combine_qualified_names(self, index: int, token: SQLToken) -> None:
        """
        Combines names like <schema>.<table>.<column> or <table/sub_query>.<column>
        """
        value = token.value
        is_complex = True
        while is_complex:
            value, index, is_complex = self._combine_tokens(index=index, value=value)
        token.value = value

    def _combine_tokens(self, index: int, value: str) -> Tuple[str, int, bool]:
        """
        Checks if complex identifier is longer and follows back until it's finished
        """
        if index > 1 and str(self.non_empty_tokens[index - 1]) == ".":
            prev_value = self.non_empty_tokens[index - 2].value.strip("`").strip('"')
            """
            Only step back one token if schema is missing to get database name
            Otherwise step back two tokens to get database name
            """
            if prev_value == ".":
                index = index - 1
                value = f".{value}"
            else:
                index = index - 2
                value = f"{prev_value}.{value}"
            return value, index, True
        return value, None, False
@davlee1972
Copy link
Author

Technical Docs..

From the documentation:

When database objects are referenced by using a one-part name, SQL Server first looks in the user's default schema. If the object is not found there, SQL Server looks next in the dbo schema. If the object is not in the dbo schema, an error is returned.

They say "one-part name" but this also applies when you reference an object with a pattern like:

[server].[database]..[name]

or

[database]..[name]

@macbre
Copy link
Owner

macbre commented May 8, 2024

@davlee1972 - thanks! Can you prepare a PR with the fix you've proposed above?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants