Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor issue regarding whitespace parsing #136

Open
igheorghita opened this issue Oct 27, 2021 · 1 comment · May be fixed by #137
Open

Minor issue regarding whitespace parsing #136

igheorghita opened this issue Oct 27, 2021 · 1 comment · May be fixed by #137

Comments

@igheorghita
Copy link
Contributor

What happened and code sample

This doesn't seem like a big deal but wanted to document it. In a situation where we write a formula having an unnecessary space between the sheet name and range like =SUM('Sheet2'! A1:A5), Excel evaluates it, but pycel gives #NAME?

Checked other variations while I was at it:

from openpyxl import load_workbook
wb = load_workbook()
ws = wb['Sheet1']
ws['A1'] = '=SUM(\'Sheet2\'!A1:A5)'    # no spaces
ws['A2'] = '=SUM(\'Sheet2\'! A1:A5)'  # Excel can compute this, but pycel can't
ws['A3'] = '=SUM (\'Sheet2\'!A1:A5)'  # this is the only one that Excel can't compute
ws['A4'] = '=SUM( \'Sheet2\'!A1:A5)'
ws['A5'] = '=SUM(\'Sheet2\'!A1: A5)'
ws['A6'] = '=SUM(\'Sheet2\'!A1 :A5)'

Here's the wb saved:
test_tokenizer.xlsx

I'm not really familiar with the tokenizer, but something like this fixes it

    def _items(self):
        ...
        next_consumed = False
        for prev_token, token, next_token in zip(t, t[1:], t[2:]):
            if next_consumed:
                next_consumed = False
                continue
            if token.type != Token.WSPACE or not prev_token or not next_token:
               ...
            # testing: case when there's a whitespace like 'Sheet'! A1:A5
            elif token.type == Token.WSPACE and (
                prev_token.matches(type_=Token.OPERAND, subtype=Token.RANGE) and
                next_token.matches(type_=Token.OPERAND, subtype=Token.RANGE)):

                tokens.pop()
                tokens.append(Token(prev_token.value + next_token.value, Token.OPERAND, Token.RANGE))
                next_consumed = True
            elif 
               ...
        return tokens

This is probably not the best way and might have unintended consequences though.

@stephenrauch
Copy link
Collaborator

Thanks for reporting this. Ideally these things would be addressed in the tokenizer which is in openpyxl. I don't remember the context for the HACK comment I put in there referencing the openpyxl pull request. openpyxl had to move from bitbucket so that documentation is no longer there.

But as far as hacks go, I see nothing immediately wrong with what you are proposing. Do you fancy doing another PR?

@igheorghita igheorghita linked a pull request Oct 30, 2021 that will close this issue
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants