'SyntaxError: invalid syntax' occurs during MS Project module generation with GetModule function #524

Closed
kateryna-ruzhytska opened this issue Apr 9, 2024 · 20 comments · Fixed by #536
Labels
bug Something isn't working help wanted Extra attention is needed

@kateryna-ruzhytska

kateryna-ruzhytska commented Apr 9, 2024

MS_PROJECT_HASH = '{A7107640-94DF-1068-855E-00DD01075445}'
com.GetModule((MS_PROJECT_HASH, 4, 0)) raises 'SyntaxError: invalid syntax' starting with comtypes 1.4.0:

  File "C:\Python39\lib\site-packages\comtypes\gen\MSHTML.py", line 4
    from comtypes.gen._3050F1C5_98B5_11CF_BB82_00AA00BDCE0B_0_4_0 import
                                                                        ^
SyntaxError: invalid syntax

[Screenshot: comtypes_syntax_error_1 4 0]

This issue reproduces intermittently.
Sometimes the appropriate classes are imported successfully, but is the second import expected?

[Screenshot: comtypes_1 4 0]

@junkmd
Collaborator

junkmd commented Apr 9, 2024

Sometimes the appropriate classes are imported successfully, but is the second import expected?

Yes, indeed, it is expected behavior that classes are imported in "the second import".

@junkmd
Collaborator

junkmd commented Apr 9, 2024

This is the same issue that's happening with #517.
It's possible that it has been latent since comtypes==1.3.0.
I thought this was a partial file issue like #114, but it seems to be different.

What's puzzling is the situation in which this problem occurs.
As you say, it happens occasionally and sometimes it doesn't.

Please share the codebase of _3050F1C5_98B5_11CF_BB82_00AA00BDCE0B_0_4_0.py (hereafter, the wrapper module) when the codebase of MSHTML.py (hereafter, the friendly module) causes a SyntaxError.

When this problem is reproduced, the approach that can be taken should differ depending on which of the following is happening:

  • The wrapper module is empty or partial
  • Despite the content that should be defined in the wrapper module, the import part of the friendly module fails to generate

The import part of the friendly module generated by CodeGenerator is made from a set of names defined in the wrapper module.

    def _make_friendly_module_import_part(self, modname: str) -> str:
        # The `modname` is the wrapper module name like `comtypes.gen._xxxx..._x_x_x`
        txtwrapper = textwrap.TextWrapper(
            subsequent_indent=" ", initial_indent=" ", break_long_words=False
        )
        symbols = set(self.names)
        symbols.update(self.imports.get_symbols())
        symbols.update(self.declarations.get_symbols())
        symbols -= set(self.enums.get_symbols())
        symbols -= set(self.enum_aliases)
        joined_names = ", ".join(str(n) for n in symbols)
        part = f"from {modname} import {joined_names}"
        if len(part) > 80:
            txtwrapper = textwrap.TextWrapper(
                subsequent_indent=" ", initial_indent=" ", break_long_words=False
            )
            joined_names = "\n".join(txtwrapper.wrap(joined_names))
            part = f"from {modname} import (\n{joined_names}\n)"
        return part

If something makes this set empty, the generated import statement ends right after `import` with no names, which is exactly the invalid line reported in the SyntaxError.
However, at the moment, nothing comes to mind that would cause this set to be empty.
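
To make the failure mode concrete, here is a minimal sketch of how an empty symbol set turns into the reported SyntaxError. `make_import_part` is a simplified stand-in written for this illustration, not the actual comtypes method, and the class names passed to it are only examples.

```python
import textwrap


def make_import_part(modname, symbols):
    """Simplified stand-in for the import-part builder quoted above."""
    joined_names = ", ".join(str(n) for n in sorted(symbols))
    part = f"from {modname} import {joined_names}"
    if len(part) > 80:
        wrapper = textwrap.TextWrapper(
            subsequent_indent="    ", initial_indent="    ", break_long_words=False
        )
        joined_names = "\n".join(wrapper.wrap(joined_names))
        part = f"from {modname} import (\n{joined_names}\n)"
    return part


def is_valid_python(source):
    """Return True if `source` compiles; nothing is imported or executed."""
    try:
        compile(source, "<friendly module>", "exec")
    except SyntaxError:
        return False
    return True


modname = "comtypes.gen._3050F1C5_98B5_11CF_BB82_00AA00BDCE0B_0_4_0"
# An empty set yields "from <modname> import " with nothing after it.
empty_part = make_import_part(modname, set())
# Example names only; any non-empty set produces a valid statement.
ok_part = make_import_part(modname, {"IHTMLDocument", "HTMLDocument"})

print(is_valid_python(empty_part), is_valid_python(ok_part))
```

Because `compile` only parses the text, the check works even when the wrapper module itself is not importable.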

Help from the community is also welcome.

@junkmd junkmd added the help wanted Extra attention is needed label Apr 9, 2024
@kateryna-ruzhytska
Author

Could you please try to reproduce this issue with the steps below?

Precondition: comtypes version is 1.4.0

Steps to reproduce:

  1. Remove the folder <your_python_path>\Lib\site-packages\comtypes\gen
  2. Run this command in Command Prompt:
    python -c "import comtypes.client as com; com.GetModule('C:\Windows\System32\mshtml.tlb')"

NOTE: you may need to perform the steps above several times to reproduce.

Actual result:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\Kateryna\.pyenv\pyenv-win\versions\3.9.4\lib\site-packages\comtypes\client\_generate.py", line 124, in GetModule
    return ModuleGenerator().generate(tlib, pathname)
  File "C:\Users\Kateryna\.pyenv\pyenv-win\versions\3.9.4\lib\site-packages\comtypes\client\_generate.py", line 203, in generate
    return self._create_friendly_module(tlib, modulename)
  File "C:\Users\Kateryna\.pyenv\pyenv-win\versions\3.9.4\lib\site-packages\comtypes\client\_generate.py", line 222, in _create_friendly_module
    return _create_module_in_file(modulename, code)
  File "C:\Users\Kateryna\.pyenv\pyenv-win\versions\3.9.4\lib\site-packages\comtypes\client\_generate.py", line 172, in _create_module_in_file
    return _my_import(modulename)
  File "C:\Users\Kateryna\.pyenv\pyenv-win\versions\3.9.4\lib\site-packages\comtypes\client\_generate.py", line 27, in _my_import
    return importlib.import_module(fullname)
  File "C:\Users\Kateryna\.pyenv\pyenv-win\versions\3.9.4\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 790, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "C:\Users\Kateryna\.pyenv\pyenv-win\versions\3.9.4\lib\site-packages\comtypes\gen\MSHTML.py", line 2882, in <module>
    class _htmlInput(IntFlag):
  File "C:\Users\Kateryna\.pyenv\pyenv-win\versions\3.9.4\lib\enum.py", line 264, in __new__
    enum_member = __new__(enum_class, *args)
TypeError: int() argument must be a string, a bytes-like object or a number, not '_coclass_meta'

@junkmd
Collaborator

junkmd commented Apr 9, 2024

2. python -c "import comtypes.client as com; com.GetModule('C:\Windows\System32\mshtml.tlb')"

    class _htmlInput(IntFlag):
  File "C:\Users\Kateryna\.pyenv\pyenv-win\versions\3.9.4\lib\enum.py", line 264, in __new__
    enum_member = __new__(enum_class, *args)
TypeError: int() argument must be a string, a bytes-like object or a number, not '_coclass_meta'

I encountered the same error in my environment.
I deleted .../comtypes/gen/... and ran the command several more times, but the error occurred every time.

After checking the source code and going to the definition, it seems the cause is that the name htmlInputImage exists in the COM type library both as a value of the enumeration '_htmlInput' and as a CoClass.
(I don't know why passing mshtml.tlb to generate MSHTML.py raises an error, while in your environment passing (MS_PROJECT_HASH, 4, 0), which generates MSHTML.py as a side effect, sometimes does not.)

Such COM type libraries were overlooked when implementing this feature.

I think the way to deal with this kind of problem is to assign numerical literals directly, rather than referring to the values of enumeration members from __wrapper_module__.
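
The name collision and the proposed workaround can be sketched in isolation. Everything below is a toy model: `FakeCoClass` and the `wrapper` namespace stand in for the generated wrapper module, and the integer values 1 and 2 are placeholders rather than the real values from mshtml.tlb.

```python
from enum import IntFlag
from types import SimpleNamespace


class FakeCoClass:
    """Stands in for a generated CoClass; not a real comtypes type."""


# Toy wrapper namespace: the enum constants are defined first, then a
# CoClass with the same name rebinds one of them, so its integer is lost.
wrapper = SimpleNamespace(htmlInputButton=1, htmlInputImage=2)  # placeholder values
wrapper.htmlInputImage = FakeCoClass


def build_enum_by_reference():
    """Member values are looked up in the wrapper namespace."""
    class _htmlInput(IntFlag):
        htmlInputButton = wrapper.htmlInputButton
        htmlInputImage = wrapper.htmlInputImage  # now a class, not an int
    return _htmlInput


def build_enum_with_literals():
    """Member values are written out as numerical literals."""
    class _htmlInput(IntFlag):
        htmlInputButton = 1
        htmlInputImage = 2
    return _htmlInput


try:
    build_enum_by_reference()
    reference_error = None
except TypeError as exc:
    reference_error = exc

literal_enum = build_enum_with_literals()
print(type(reference_error).__name__, literal_enum.htmlInputImage == 2)
```

With literals, the enumeration no longer depends on what a name happens to be bound to in the wrapper module at import time.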

junkmd added a commit to junkmd/comtypes that referenced this issue Apr 9, 2024
@junkmd
Collaborator

junkmd commented Apr 9, 2024

I have changed the generated code so that numerical literals are assigned to the members of the enumeration.
This should prevent raising the TypeError.
https://github.com/junkmd/comtypes/tree/assign_numerical_literals_to_enum_members

  • pip install https://github.com/junkmd/comtypes/archive/refs/heads/assign_numerical_literals_to_enum_members.zip

Please install it in your environment and give it a try.

@junkmd
Collaborator

junkmd commented Apr 9, 2024

From #524 (comment),

Please share the codebase of _3050F1C5_98B5_11CF_BB82_00AA00BDCE0B_0_4_0.py (hereafter, the wrapper module) when the codebase of MSHTML.py (hereafter, the friendly module) causes a SyntaxError.

I am also waiting for information on this matter.

@kateryna-ruzhytska
Author

I have attached _3050F1C5_98B5_11CF_BB82_00AA00BDCE0B_0_4_0.py as it was generated when the SyntaxError occurred (converted to .txt so that it could be attached):
_3050F1C5_98B5_11CF_BB82_00AA00BDCE0B_0_4_0.txt

@junkmd
Collaborator

junkmd commented Apr 10, 2024

I've looked at the code generated in your environment, and it seems that there are no elements in the codebase of the wrapper module itself that would cause a SyntaxError.
If you do from comtypes.gen import _3050F1C5_98B5_11CF_BB82_00AA00BDCE0B_0_4_0 in your environment, it probably won't cause an error.
Therefore, I'm guessing that the problem is not with the generation of the wrapper module, but with the generation of the friendly module.

I have changed the generated code so that numerical literals are assigned to the members of the enumeration. This should prevent raising the TypeError. https://github.com/junkmd/comtypes/tree/assign_numerical_literals_to_enum_members

  • pip install https://github.com/junkmd/comtypes/archive/refs/heads/assign_numerical_literals_to_enum_members.zip

Please install it in your environment and give it a try.

Please share the results about this as well.

Also, when sharing such a codebase, it would be helpful if you could upload it to your public repository and let us know the permalink, because attached files can't be checked immediately and GitHub's syntax highlight can't be used.

@junkmd
Collaborator

junkmd commented Apr 10, 2024

There has been a comment about mshtml.tlb in this project since its inception.

# these do NOT work:
# XXX infinite loop?
# path = r"mshtml.tlb" # has propputref

In your case, tlbparser.py is not stuck in an infinite loop; rather, the error occurs during code generation, after the type library has been parsed.

Therefore, it seems that compatibility between mshtml.tlb and comtypes has improved since that comment was committed (although it is not clear what changed).

@kateryna-ruzhytska
Author

I have changed the generated code so that numerical literals are assigned to the members of the enumeration. This should prevent raising the TypeError. https://github.com/junkmd/comtypes/tree/assign_numerical_literals_to_enum_members

  • pip install https://github.com/junkmd/comtypes/archive/refs/heads/assign_numerical_literals_to_enum_members.zip

Please install it in your environment and give it a try.

Those changes help.

@junkmd
Collaborator

junkmd commented Apr 10, 2024

The cause of this SyntaxError problem could be that the wrapper module already exists, but the friendly module does not.

    def generate(
        self, tlib: typeinfo.ITypeLib, pathname: Optional[str]
    ) -> types.ModuleType:
        # create and import the real typelib wrapper module
        mod = self._create_wrapper_module(tlib, pathname)
        # try to get the friendly-name, if not, returns the real typelib wrapper module
        modulename = codegenerator.name_friendly_module(tlib)
        if modulename is None:
            return mod
        # create and import the friendly-named module
        return self._create_friendly_module(tlib, modulename)

    def _create_friendly_module(
        self, tlib: typeinfo.ITypeLib, modulename: str
    ) -> types.ModuleType:
        """helper which creates and imports the friendly-named module."""
        try:
            mod = _my_import(modulename)
        except Exception as details:
            logger.info("Could not import %s: %s", modulename, details)
        else:
            return mod
        # the module is always regenerated if the import fails
        logger.info("# Generating %s", modulename)
        # determine the Python module name
        modname = codegenerator.name_wrapper_module(tlib)
        code = self.codegen.generate_friendly_code(modname)
        if comtypes.client.gen_dir is None:
            return _create_module_in_memory(modulename, code)
        return _create_module_in_file(modulename, code)

    def _create_wrapper_module(
        self, tlib: typeinfo.ITypeLib, pathname: Optional[str]
    ) -> types.ModuleType:
        """helper which creates and imports the real typelib wrapper module."""
        modulename = codegenerator.name_wrapper_module(tlib)
        if modulename in sys.modules:
            return sys.modules[modulename]
        try:
            return _my_import(modulename)
        except Exception as details:
            logger.info("Could not import %s: %s", modulename, details)
        # generate the module since it doesn't exist or is out of date
        logger.info("# Generating %s", modulename)
        p = tlbparser.TypeLibParser(tlib)
        if pathname is None:
            pathname = tlbparser.get_tlib_filename(tlib)
        items = list(p.parse().values())
        code = self.codegen.generate_wrapper_code(items, filename=pathname)
        for ext_tlib in self.codegen.externals:  # generates dependency COM-lib modules
            GetModule(ext_tlib)
        if comtypes.client.gen_dir is None:
            return _create_module_in_memory(modulename, code)
        return _create_module_in_file(modulename, code)

ModuleGenerator does not generate a wrapper module with CodeGenerator if the wrapper module can already be imported. The friendly module is generated from the names of the objects that are defined while the wrapper module is being generated. In other words, if the wrapper-generation step is skipped, those names are never collected and the friendly module ends up partial.
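
The dependency described above can be shown with a toy model. `ToyCodeGenerator` and `generate` below are hypothetical simplifications written for this illustration; only the method names `generate_wrapper_code` and `generate_friendly_code` are borrowed from the quoted code, and their bodies here are invented.

```python
class ToyCodeGenerator:
    """Toy model: names are recorded only while wrapper code is generated."""

    def __init__(self):
        self.names = set()

    def generate_wrapper_code(self, items):
        lines = []
        for name in items:
            self.names.add(name)  # side effect the friendly module relies on
            lines.append(f"class {name}: pass")
        return "\n".join(lines)

    def generate_friendly_code(self, modname):
        joined = ", ".join(sorted(self.names))
        return f"from {modname} import {joined}"


def generate(codegen, wrapper_importable, items, modname="gen._wrapper"):
    """Mirror the flow above: skip wrapper generation if it can be imported."""
    if not wrapper_importable:
        codegen.generate_wrapper_code(items)
    return codegen.generate_friendly_code(modname)


# Fresh run: the wrapper is generated first, so the names are available.
fresh = generate(ToyCodeGenerator(), False, ["A", "B"])
# Wrapper file already on disk: generation is skipped and the names stay empty.
stale = generate(ToyCodeGenerator(), True, ["A", "B"])

print(repr(fresh))
print(repr(stale))
```

In this model the second call produces an import statement with no names, which matches the shape of the broken line in MSHTML.py.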

I think this is similar to issue #114 in terms of the event and cause. If Python terminates after the wrapper module is generated by calling comtypes.client.GetModule, and before the friendly module is generated, this error can occur the next time comtypes.client.GetModule is called.

By determining if the wrapper module already exists and the friendly module does not, and displaying an appropriate error message, we can help users understand what to do next to resolve the error.

Also, by not writing the wrapper module file until the code for the friendly module has been generated, and then writing both files together once both are ready, I think we can reduce the occurrence of this error.

@junkmd
Collaborator

junkmd commented Apr 10, 2024

Those changes help.

Thank you.

I’m planning to apply a patch to at least resolve the TypeError and release it as 1.4.1.

I would like a little more time to respond to the SyntaxError.

@kateryna-ruzhytska
Author

The cause of this SyntaxError problem could be that the wrapper module already exists, but the friendly module does not.

I've got both modules present when this issue occurs.
Also, I uploaded both files to the public repository kateryna-ruzhytska/comtypes_syntax_error_1_4_0

@junkmd
Collaborator

junkmd commented Apr 10, 2024

Please keep the _3050F1C5_98B5_11CF_BB82_00AA00BDCE0B_0_4_0.py intact, delete only the MSHTML.py, and then try calling comtypes.client.GetModule.

Let us know the result.

@junkmd junkmd added the bug Something isn't working label Apr 10, 2024
@junkmd junkmd added this to the 1.4.2 milestone Apr 11, 2024
@kateryna-ruzhytska
Author

Please keep the _3050F1C5_98B5_11CF_BB82_00AA00BDCE0B_0_4_0.py intact, delete only the MSHTML.py, and then try calling comtypes.client.GetModule.

Let us know the result.

In this case we still get the error and MSHTML.py is generated the same way again.

@junkmd
Collaborator

junkmd commented Apr 11, 2024

Please keep the _3050F1C5_98B5_11CF_BB82_00AA00BDCE0B_0_4_0.py intact, delete only the MSHTML.py, and then try calling comtypes.client.GetModule.

Let us know the result.

In this case we still get the error and MSHTML.py is generated the same way again.

The same thing happened in my environment.

@junkmd
Collaborator

junkmd commented Apr 11, 2024

I noticed that both the condition that decides whether ModuleGenerator imports an existing module or creates a new one, and the order of operations for (re)generating a module, are not appropriate.

This is the same kind of thing that I discussed in #116 (comment).

Regenerating both the wrapper module and the friendly module unless both already exist might make such partial files less likely.

In the current implementation, the wrapper module file is generated, then the codebase for the friendly module is generated, and then the friendly module file is generated.
If Python crashes while generating the codebase for the friendly module, only the wrapper module will exist.
After that, if we call comtypes.client.GetModule, only a partial friendly module will be created.

By generating both the codebase for the wrapper module and the codebase for the friendly module, and then generating the module files for both, the time between generating the two files can be reduced, which could increase stability.
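
The reordering described above could look roughly like the following sketch. `write_modules_together` is a hypothetical helper, not part of comtypes, and the `compile` check before writing is an extra assumption added for the illustration. It narrows the window between the two writes but is not truly atomic.

```python
import tempfile
from pathlib import Path


def write_modules_together(gen_dir, sources):
    """Write every module file only after all sources are known to compile.

    `sources` maps a file name to its generated source text.
    """
    for filename, code in sources.items():
        compile(code, filename, "exec")  # raises SyntaxError before any write
    paths = []
    for filename, code in sources.items():
        path = Path(gen_dir) / filename
        path.write_text(code, encoding="utf-8")
        paths.append(path)
    return paths


# Both sources are valid, so both files are written back to back.
with tempfile.TemporaryDirectory() as good_dir:
    written = write_modules_together(
        good_dir,
        {"_wrapper.py": "class Foo: pass\n", "Friendly.py": "from _wrapper import Foo\n"},
    )
    good_names = sorted(p.name for p in written)

# The friendly source is broken, so nothing is written at all.
with tempfile.TemporaryDirectory() as bad_dir:
    try:
        write_modules_together(
            bad_dir,
            {"_wrapper.py": "class Foo: pass\n", "Friendly.py": "from _wrapper import\n"},
        )
        rejected = False
    except SyntaxError:
        rejected = True
    leftovers = sorted(p.name for p in Path(bad_dir).iterdir())

print(good_names, rejected, leftovers)
```

The point of the design is that a lone wrapper file is never left behind when friendly-module generation fails.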

@junkmd
Collaborator

junkmd commented Apr 12, 2024

@kateryna-ruzhytska

Based on the considerations I made in #524 (comment), I changed the implementation of ModuleGenerator.

  • pip install https://github.com/junkmd/comtypes/archive/refs/heads/fix_issue_524_syntaxerror.zip

Please install this in your environment, try the following, and let us know the result:

  • Call comtypes.client.GetModule('mshtml.tlb') when only __init__.py exists in comtypes/gen/...
  • When both _3050F1C5_98B5_11CF_BB82_00AA00BDCE0B_0_4_0.py and MSHTML.py exist in comtypes/gen/...
    • Delete _3050F1C5_98B5_11CF_BB82_00AA00BDCE0B_0_4_0.py and call comtypes.client.GetModule('mshtml.tlb')
    • Delete MSHTML.py and call comtypes.client.GetModule('mshtml.tlb')

Thank you.

@junkmd
Collaborator

junkmd commented Apr 16, 2024

@kateryna-ruzhytska

In addition to the branch mentioned earlier, I would like you to pip install https://github.com/junkmd/comtypes/archive/refs/heads/fix_issue_524_syntaxerror_and_more.zip and test in your environment.

This branch merges the contents of #527, #528, and #529 into https://github.com/junkmd/comtypes/archive/refs/heads/fix_issue_524_syntaxerror.zip.

@junkmd
Collaborator

junkmd commented Apr 23, 2024

I ran the fix_issue_524_syntaxerror_and_more branch in my environment.

  • If BOTH the wrapper module and the friendly module do NOT exist, calling comtypes.client.GetModule('mshtml.tlb') will create both.
  • If the wrapper module exists and the friendly module does NOT, calling comtypes.client.GetModule('mshtml.tlb') will regenerate not only the friendly module but also the wrapper module.
  • If the wrapper module does NOT exist and the friendly module does, calling comtypes.client.GetModule('mshtml.tlb') will regenerate not only the wrapper module but also the friendly module.

I confirmed that if only one of the wrapper module file and the friendly module file existed, both files and modules were regenerated.
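
The observed behavior amounts to a simple rule, sketched here as a truth table. `plan_generation` is a hypothetical function for illustration only; the branch itself decides through Python's import machinery rather than through two booleans.

```python
def plan_generation(wrapper_exists: bool, friendly_exists: bool) -> str:
    """Reuse the modules only when both exist; otherwise rebuild both."""
    if wrapper_exists and friendly_exists:
        return "import existing modules"
    return "regenerate both modules"


# Enumerate all four combinations of (wrapper exists, friendly exists).
cases = {
    (w, f): plan_generation(w, f)
    for w in (False, True)
    for f in (False, True)
}
for (w, f), action in cases.items():
    print(f"wrapper={w!s:5} friendly={f!s:5} -> {action}")
```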

Testing this requires diving into Python's import system, which is difficult.

However, even with this implementation, all the existing unit tests were able to run and they all passed.
Afterwards, I deleted some of the .py files under comtypes.gen and ran the tests again. Deleted .py files were regenerated and all tests passed. And, when both the friendly module file and the wrapper module file existed, they were not regenerated.
Therefore, I recognize it as fully functional.

In early May, I plan to merge this content and release it as comtypes==1.4.2.
If there are opinions from the community by then, there is a possibility that the plan may change.

Any opinions would be appreciated.
