Type annotations in canopen #358

sveinse · 2023-03-26T22:54:44Z

With reference to the discussion in #339 with regards to type hints, I have started looking at the work needed for type annotating canopen. What is the overall ambitions with regards to implementation of type hints? What level of hinting and static code analysis are we aiming for? What tool? mypy, flake8, pyright/pylance ?

I'm a big proponent of type annotated python. It helps development, navigation in the editor works and it shows the expected types for calls and operations and thereby reducing bugs and misunderstandings. I think adding typing to canopen is the right thing to do. The benefits outweighs the disadvantages.

Having a full compliant type annotated library will reduce the degrees of freedom in the codebase. You can do more "tricks" with python than a linter is able to understand, and thus coding for it requires a degree of code conformity. canopen relies on a lot of duck typing and dynamic types (especially variables that conditionally contain None). This results a quite a bit of warnings and/or code constructs to avoid the warnings. - And rightly so, as the code in many cases is not executing any type checks to verify the datatype before invoking an operation. Lastly, many of these linting tools are very opinionated on code structure, so we must expect some functional changes.

I've given a stab at implementing type annotation in canopen (see link below). I'm using mypy and pyright. After doing this I think we will have to face a few discussions on the correct and intended behavior on a few details. E.g. the use None as optional value.

sveinse@4e8925e

The text was updated successfully, but these errors were encountered:

acolomb · 2023-03-27T06:23:07Z

I'm fine with and definitely welcome type hinting, especially if it means that structured type info lands in the documentation as well.

Regarding None as "unspecified" or "not supported" etc., I think that is a very valid and concise expression. Even Python itself uses it as such, e.g. by returning None from a function not returning anything explicitly. As for argument default values, I don't care much whether for example an empty string "" or None is used, as long as the intention is clear. That means probably an Optional type hint in most cases, but I think reading def frobnicate(foo: str, bar: str = None) is very clear and adding Optional to that only makes it longer without real benefit. IDE code parsers do usually understand that a keyword argument with default value can be omitted. In this case, def frobnicate(foo: str, bar: str = "") is slightly less obvious, but also fine with me. Any code relying on None and "" being different things here is ill-designed anyway.

In your fork, why do you have the from __future__ import annotations thing? Isn't that a Python2 only thing, which we're not aiming for in new code anymore?

sveinse · 2023-03-27T08:19:02Z

None (or any other falsey expressions) is a valid way to declare optional values, and it is used quite often in canopen. However, dealing with multiple or optional types, also requires the code to handle all variants of the values in all places.

Let me give an example: class NmtBase doesn't initalize self.network and set it to None. The proper value of self.network is injected later from e.g. RemoteNode.associate_network(). Functionally this is fine. Calling NmtBase.send_command() with self.network is None will crash the program. But the type checker complains about this construct since calling self.network.something() when self.network is None is invalid. The principal question here becomes: Is this how we want it? If it is None let it crash? Let the type checker complain. Or is the type checker pointing out a valid weakness in the code?

class NmtBase(object):
    def __init__(self, node_id: int):
        self.id = node_id
        self.network: Optional[Network] = None

    def send_command(self, code: int):
        super(NmtMaster, self).send_command(code)
        logger.info(
            "Sending NMT command 0x%X to node %d", code, self.id)
        assert self.network  # <-- This is needed to instruct the type checker to ignore None
        self.network.send_message(0, [code, self.id])   # <-- Can't do this without checking self.network

Using from __future__ import annotation gives postponed evaluation in type annotations. E.g.

class NmtSlave(NmtBase):

    def __init__(self, node_id: int, local_node: LocalNode):  # With __future__ import annotations

    def __init__(self, node_id: int, local_node: "LocalNode"):  # Without

It is optional from py3.7 and is not yet made mandatory. The feature is described in PEP 563. See https://docs.python.org/3/library/__future__.html

christiansandberg · 2023-04-01T05:50:34Z

Go for it! Static analysis will definitely find some issues, but hopefully the fixes are fairly easy like asserting that attributes are not None like in your example.

sveinse · 2023-04-07T15:06:15Z

What is reasonable accept criterias for type annotations in canopen? What is the definition of done? What tools should we aim compliance for (and with what settings)?

Based on what I usually do I usually use mypy and pyright (the latter because I'm using VSCode) for type checking and pylint and/or flake8 for general linting. As pointed out in #346 , pylint is a commonly used tool.

sveinse · 2023-04-07T22:37:00Z

I've been working a lot with type annotating canopen lately as can be found in PR #360 . It has taken a lot more time than I expected, but I hope it can be of use. canopen uses a lot of dynamic types in variables and it was much work to figure that out. It is mostly done, but there is a few remaining errors to deal with. Please review the PR and give comments. Large or small.

I've collected a series of comments and questions. I'm sorry for the length of this post, but it would help a lot if we could get some insight into this.

1) Use of multiple types

This is a general type annotation advice: When using vars that can hold multiple types, e.g. int, str or None, it is important to always expect all variants of the variable in the code. The type checker has (rightfully so) problems understanding which type might be passed and complains a lot if assumptions are made.

There is a lot of this in canopen, and this is what's what have taken the most time to resolve. Its also harder to code with, because one have to test everything all the time. I'm sure there are lots of ´Optional` uses in canopen that can be removed, but its not always clear if the optional value serves a function or merely a convenience or habit.

2) Type annotating OD variable access is tedious

Due to the dynamic nature of (canopen OD) objects (variable, array, record) and .raw that might return all sorts of object types, it make it inherently difficult to do properly type annoation of sdo requests.

    # This creates erros in type checkers
    var: int = sdo[0x6061].raw

    # This is what's needed to do full type checking
    obj = sdo[0x6061]     # obj might be Variable, Array, Record
    assert isinstance(obj, Variable)    # Limit type to Variable
    data = obj.raw    # data might be bool, int, float, str, bytes
    assert isinstance(obj, int)    # Limit type to int
    var = data   # Now its guaranteed that var is int

Is the current design too flexible? It is impossible for the type checker to know that OD 0x6061 is a Variable (it might not be if the object dictionary is misconfigured), nor can the type checker understand that sdo[0x6061].raw return an int. What about having explicit typed methods? I know its not everyone's cup of tea, but its explicit and it can be type checked. It can be checked run-time, so in case the OD is wrong, it will clearly give error. I suppose the fundamental question is: What is expected to happen if the parameter isn't Variable, or that the type returned by .raw is not int?

Some quick proposals as food for thought:

    sdo[0x6061]  # Only returns `Variable`. Fails if used on an `Array` or `Record`
    sdo.get_container(0x1234)   # Returns `Array` or `Record`. Fails on `Variable`

    sdo.get_variable(0x6061)  # Named methods for getting variable only
    sdo.get_array(0x1234)

    var.raw_int   # Only returns int. Fails if any other types are returned

3) Needed to check variable type when encoding objects

The encoding of a parameter is stored in objectdictionary.Variable.data_type with the encoding is done in Variable.encode_raw. However, the input type of the value this funcion is very broad. I have added a runtime type check and it will raise TypeError if there is mismatch between data_type and the expected type.

4) Attribute injection

In objectdictionary.import_eds() there is a rather nasty attribute injection hack that inserts __edsFileInfo into the od object for storage. This is considered very smelly and I think we need to find a better way to save these data.

5) Is a node_id value optional or not?

Are there any use cases where initalization of LocalNode or RemoteNode is not made with a node_id value? It is declared as Optional that allows None, but a lot of logic in these classes crash if node_id is None.

6) How should `class Bits` work?

How does the variable.Bits class work with the corresponding objectdictionary.Variable.decode_bits() and .encode_bits()? I fail to understand the logic behind the item get/set and the way the slicing works (if at all). Nor are there any examples of this in use, so any help would be appreciated.

7) Random comments

pdo.base.PdoBase.__getitem__() is a bit inconsistent. For some key values it returns Map type for others Variable. PdoBase is a Mapping containing Map, so __getitem__() deviates the expected type.
pdo.base.Variable.get_data() and .set_data() doesn't handle float datatypes at all
Shall float values be used with objectdictionary.Variable.decode_phys() and .encode_phys()? Should decode_phys() return int or float?
.phys is intended to give a numerical value. However the current implementation will pass through any other types as well, leaving the type very broad. Could this be changed such that it will fail unless its an numerical type?
objectdictionary.eds._convert_variable() doesn't deal with BOOLEAN. Is that intentional?

sveinse linked a pull request Apr 4, 2023 that will close this issue

Add type annotations [WIP] #360

Draft

sveinse changed the title ~~Ambitions for type annotations in canopen~~ Type annotations in canopen Apr 7, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Type annotations in canopen #358

Type annotations in canopen #358

sveinse commented Mar 26, 2023

acolomb commented Mar 27, 2023 •

edited

sveinse commented Mar 27, 2023

christiansandberg commented Apr 1, 2023

sveinse commented Apr 7, 2023

sveinse commented Apr 7, 2023

Type annotations in canopen #358

Type annotations in canopen #358

Comments

sveinse commented Mar 26, 2023

acolomb commented Mar 27, 2023 • edited

sveinse commented Mar 27, 2023

christiansandberg commented Apr 1, 2023

sveinse commented Apr 7, 2023

sveinse commented Apr 7, 2023

1) Use of multiple types

2) Type annotating OD variable access is tedious

3) Needed to check variable type when encoding objects

4) Attribute injection

5) Is a node_id value optional or not?

6) How should class Bits work?

7) Random comments

acolomb commented Mar 27, 2023 •

edited

6) How should `class Bits` work?