Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why custom Dialect doesn't work properly #1186

Open
groobyming opened this issue Mar 18, 2024 · 8 comments
Open

Why custom Dialect doesn't work properly #1186

groobyming opened this issue Mar 18, 2024 · 8 comments

Comments

@groobyming
Copy link
Contributor

I have extended a Dialect named MyDialect, and implemented the is_identifier_start method. is_identifier_start allows the identifier to start with a number, but why does my test code report an error? The prompt says that it cannot start with a number, and if I run it with the native HiveDialect, the code No problem again

My Code:

#[derive(Debug)]
pub struct MyDialect;

    impl Dialect for MyDialect {
    fn is_delimited_identifier_start(&self, ch: char) -> bool {
        (ch == '"') || (ch == '`')
    }

    fn is_identifier_start(&self, ch: char) -> bool {
        ch.is_ascii_lowercase() || ch.is_ascii_uppercase() || ch.is_ascii_digit() || ch == '$'
    }

    fn is_identifier_part(&self, ch: char) -> bool {
        ch.is_ascii_lowercase()
            || ch.is_ascii_uppercase()
            || ch.is_ascii_digit()
            || ch == '_'
            || ch == '$'
            || ch == '{'
            || ch == '}'
    }

    fn supports_filter_during_aggregation(&self) -> bool {
        true
    }
}

#[cfg(test)]
mod tests {
    use sqlparser::parser::Parser;

    #[test]
    pub fn test_ast() {
        let sql = "SELECT * from 1_a";
        let expected_sql = "SELECT * from 1_a"
            .replace("\n", "");
        let dialect = crate::core::ast::MyDialect {};
        let ast = Parser::parse_sql(&dialect, sql).unwrap();
        assert_eq!(ast[0].to_string(), expected_sql);
    }
}

HIveDialect Code(from sqlparser-rs)
use crate::dialect::Dialect;

/// A [Dialect] for Hive.

#[derive(Debug)]
pub struct HiveDialect {}

impl Dialect for HiveDialect {
    fn is_delimited_identifier_start(&self, ch: char) -> bool {
        (ch == '"') || (ch == '`')
    }

    fn is_identifier_start(&self, ch: char) -> bool {
        ch.is_ascii_lowercase() || ch.is_ascii_uppercase() || ch.is_ascii_digit() || ch == '$'
    }

    fn is_identifier_part(&self, ch: char) -> bool {
        ch.is_ascii_lowercase()
            || ch.is_ascii_uppercase()
            || ch.is_ascii_digit()
            || ch == '_'
            || ch == '$'
            || ch == '{'
            || ch == '}'
    }

    fn supports_filter_during_aggregation(&self) -> bool {
        true
    }
}

Error:
image

@groobyming
Copy link
Contributor Author

@alamb Hi, could you please take a look at this issue? I didn't find any more instructions about this custom Dialect

@alamb
Copy link
Collaborator

alamb commented Mar 18, 2024

I am not sure what is going on @groobyming -- your description certainly sounds like a bug.

Maybe there is code in the parser that is special casing HiveDialect rather than using a trait method:

https://github.com/search?q=repo%3Asqlparser-rs%2Fsqlparser-rs%20HiveDialect&type=code

@groobyming
Copy link
Contributor Author

@alamb Thanks, when I added the following code in src/dialect/tokenizer.rs, it worked for me, but I still don't know why it works
image

test success:
image
image

So if I want to extend my own dialect, do I need to modify the source code and then package it for release?

@alamb
Copy link
Collaborator

alamb commented Mar 19, 2024

So if I want to extend my own dialect, do I need to modify the source code and then package it for release?

that is one alternative

Another is to change the code so that rather than testing dilaect_of!(...) it would do something like self.supports_numeric_prefix() and adding fn supports_numeric_prefix to the the Dialect trait

@groobyming
Copy link
Contributor Author

groobyming commented Mar 20, 2024

@alamb Thanks for the feedback! I'll create a PR to address this issue.
PR addressing: #1188

@alamb
Copy link
Collaborator

alamb commented Mar 21, 2024

Thanks @groobyming -- I will try and take a look at the outstanding sqlparser PRs next week sometime. I don't think I'll have a chance to do so this week

@groobyming
Copy link
Contributor Author

@alamb OK,Thanks

@universalmind303
Copy link
Contributor

@groobyming I ran into a similar issue here #1227.

If you want your dialect to "extend" an existing dialect, you can implement the dialect function

impl Dialect for MyDialect {
    fn dialect(&self) -> std::any::TypeId {
        std::any::TypeId::of::<HiveDialect>()
    }

This'll allow the parser to treat it as that dialect whenever there is a check for dialect_of!.

IMO, more logic needs to move out of the dialect_of! macro, and into the Dialect trait, but considering there are 100+ uses of dialect_of in the parser, it may not be practical to add every one of those checks to Dialect.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants