Type annotations in canopen #358
I'm fine with and definitely welcome type hinting, especially if it means that structured type info lands in the documentation as well. Regarding In your fork, why do you have the |
Let me give an example:

```python
class NmtBase(object):
    def __init__(self, node_id: int):
        self.id = node_id
        self.network: Optional[Network] = None
```

```python
    def send_command(self, code: int):
        super(NmtMaster, self).send_command(code)
        logger.info(
            "Sending NMT command 0x%X to node %d", code, self.id)
        assert self.network  # <-- This is needed to instruct the type checker to ignore None
        self.network.send_message(0, [code, self.id])  # <-- Can't do this without checking self.network
```

Using:

```python
class NmtSlave(NmtBase):
    def __init__(self, node_id: int, local_node: LocalNode):    # With __future__ import annotations
    def __init__(self, node_id: int, local_node: "LocalNode"):  # Without
```

It is optional from py3.7 and is not yet made mandatory. The feature is described in PEP 563. See https://docs.python.org/3/library/__future__.html |
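As an aside, the postponed-evaluation behavior of PEP 563 can be demonstrated in a small self-contained sketch (the class names here are illustrative, not canopen's real ones):

```python
from __future__ import annotations  # PEP 563: annotations become strings, evaluated lazily

class Node:
    # "Network" is only defined further down, yet needs no string quotes here
    def attach(self, network: Network) -> None:
        self.network = network

class Network:
    pass

n = Node()
n.attach(Network())
```

Without the `__future__` import, the unquoted `Network` annotation would raise a `NameError` when `attach` is defined, because annotations are evaluated eagerly by default.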
Go for it! Static analysis will definitely find some issues, but hopefully the fixes are fairly easy, like asserting that attributes are not None as in your example. |
What are reasonable acceptance criteria for type annotations in canopen? What is the definition of done? What tools should we aim for compliance with (and with what settings)? I usually use mypy and pyright (the latter because I'm using VSCode) for type checking and pylint and/or flake8 for general linting. As pointed out in #346, pylint is a commonly used tool. |
I've been working a lot with type annotating canopen lately, as can be found in PR #360. It has taken a lot more time than I expected, but I hope it can be of use. canopen uses a lot of dynamic types in variables, and it was much work to figure that out. It is mostly done, but there are a few remaining errors to deal with. Please review the PR and give comments, large or small. I've collected a series of comments and questions. I'm sorry for the length of this post, but it would help a lot if we could get some insight into this.

1) Use of multiple types

This is general type annotation advice: When using vars that can hold multiple types, e.g. There is a lot of this in canopen, and this is what has taken the most time to resolve. It's also harder to code with, because one has to test everything all the time. I'm sure there are lots of `Optional` uses in canopen that can be removed, but it's not always clear if the optional value serves a function or is merely a convenience or habit.

2) Type annotating OD variable access is tedious

Due to the dynamic nature of (canopen OD) objects (variable, array, record) and

```python
# This creates errors in type checkers
var: int = sdo[0x6061].raw

# This is what's needed to do full type checking
obj = sdo[0x6061]                 # obj might be Variable, Array, Record
assert isinstance(obj, Variable)  # Limit type to Variable
data = obj.raw                    # data might be bool, int, float, str, bytes
assert isinstance(data, int)      # Limit type to int
var = data                        # Now it's guaranteed that var is int
```

Is the current design too flexible? It is impossible for the type checker to know that OD

Some quick proposals as food for thought:

```python
sdo[0x6061]                # Only returns `Variable`. Fails if used on an `Array` or `Record`
sdo.get_container(0x1234)  # Returns `Array` or `Record`. Fails on `Variable`
sdo.get_variable(0x6061)   # Named methods for getting variable only
sdo.get_array(0x1234)
var.raw_int                # Only returns int. Fails if any other types are returned
```

3) Needed to check variable type when encoding objects

The encoding of a parameter is stored in

4) Attribute injection

In

5) Is a node_id value optional or not?

Are there any use cases where initialization of

6) How should
|
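One way to realize the `get_variable` proposal above is a thin accessor that performs the `isinstance` check once and narrows the return type, so callers no longer need asserts at every access site. A minimal sketch with hypothetical stand-in classes (`Variable` and `Array` here are placeholders, not canopen's real implementations):

```python
from typing import Dict, Union

class Variable:
    # stand-in for canopen's Variable; `raw` can hold several types
    raw: Union[bool, int, float, str, bytes] = 0

class Array:
    pass

def get_variable(od: Dict[int, object], index: int) -> Variable:
    # Typed accessor: the return type is narrowed to Variable, so the
    # type checker needs no further isinstance checks in calling code.
    obj = od[index]
    if not isinstance(obj, Variable):
        raise TypeError(f"object at 0x{index:04X} is not a Variable")
    return obj

objects = {0x6061: Variable(), 0x1234: Array()}
var = get_variable(objects, 0x6061)  # checker knows: var is Variable
```

The trade-off is the one raised later in this thread: such wrappers enlarge the API surface purely for the benefit of type checking.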
We should break this up in multiple smaller issues and not forget about it. |
It's not forgotten per se, but more recent activity supersedes much of this. It's an "in-between" task for me in PR #360, and it needs an update. |
Sorry for not considering this more thoroughly in the past year. Other things needed my attention more urgently, but I think we're on a good path to ramp up type annotations in canopen gradually. Some questions from your list above have been answered as part of different discussions, and frankly I had forgotten about them here. If still needed, let me go back and summarize my stance (and current project practice) for each point raised. Which of these do you think still need clarification, @sveinse? |
The hard part is not type annotating our library: It's deciding how to deal with the type inconsistencies it uncovers. Particularly surrounding implicit use of So, how do we approach type annotation in canopen in general? Adding them via PRs is straightforward. Can we agree on which type checker to use, e.g. mypy? If we do, should we implement a GitHub Actions step for it? Are typing errors considered PR-blocking, or are they more "nice to have"? How do we validate type annotations? It is extremely easy to get carried away down a deep, deep rabbit hole when starting on the type annotation path. The type annotation language has limitations and Python is far more flexible. I know we will hit a few of them in canopen. When these "inflexibilities" are encountered, it's easy for the code to get littered with what I've dubbed "type checker aids" (e.g. asserts and/or type ignores) rather than functional code. TL;DR: It's very important that we agree on how deep or complete we're willing to go with type checking. |
FWIW, here are my thoughts:
This is a broad question, but regarding how to add them, I would definitely suggest using gradual typing over several PRs, and definitely don't fall into the trap of mixing bugfixes (discovered by type annotations) and adding annotations.
+1 for mypy
Yes.
Make it a required check and fix things right away. Using
Can you expand on this?
IMO, it's ok to add An approach I've found usable is:
Of course, each iteration of step 2) involves possibly a series of PRs. |
Agreed.
+1 from me as well.
I doubt we will ever get to strict mode in canopen. It deals with flexible data types and the current design is very hard to describe with type hints. E.g.
If we chose mypy, then that's the answer. |
One possible solution for such problems is to introduce new APIs and deprecate the old ones. Another is to make breaking behavioural changes. Yet another is to let the status quo win and never enable strict typing. |
Yes, agreed that adding via PR in chunks is review-friendly.
Also +1 for mypy from my side. Therefore we have consensus and mypy it shall be.
I don't care much about this point. Sure it's nice to see, but IMO type checkers are most valuable during coding.
Given the nature of this API, I would definitely not try to enforce it. Python is not a type-safe language, and this library embraces that fact by design. Which is a good thing IMHO, as it makes the code easy to use and mirrors the design of CANopen protocols themselves. Making sure the passed / returned objects are of a certain type should be done outside the library, by the application code that knows which objects it tries to access. Any wrapper functions for specific types should be added on top of the current generic API. It should be really easy and not much additional code. But it would enlarge our library API surface quite a bit, which I don't think is worthwhile just because of type checking.
I'd avoid those "type checker aids" usually, rather treating the type hints as a helpful, best-effort option. But that needs to be decided on a case-by-case basis. |
ISTM a gradual typing CI check would be a good compromise; you'd type up the parts of the code base that makes sense, and a lenient mypy check should be able to help you keeping those functional without being too much of a nuisance. I believe a CI check is important, as type annotations are susceptible to bit rot, like any other code snippet. Without CI checks, you risk that they become incorrect, which can end up being a problem for users of this library. |
Accepted, since mypy seems to explicitly support such a use case. Go ahead and PR the check configuration if you like. I've never really touched all the CI and GitHub related parts. |
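For reference, a non-blocking check along these lines could look like the sketch below. This is a hypothetical, minimal configuration (file path, action versions, and Python version are assumptions, not anything agreed in this thread); `continue-on-error` keeps the job informational rather than failing the workflow:

```yaml
# .github/workflows/mypy.yml (hypothetical)
name: mypy
on: [push, pull_request]
jobs:
  mypy:
    runs-on: ubuntu-latest
    continue-on-error: true   # report results without failing the run
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.8"
      - run: pip install mypy .
      - run: mypy canopen
```

Whether the check is additionally marked "required" is controlled in the repository's branch protection settings, not in the workflow file itself.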
What I'd like to do next is to add an absolutely minimal mypy configuration to canopen, which I'm working on now. mypy has excellent guides on how to approach existing projects: https://mypy.readthedocs.io/en/stable/existing_code.html#existing-code mypy throws ~107 errors right off the bat with a minimal configuration. If we were to introduce this into a GH action as a blocker, we would have to address all of those issues first. If we choose not to make them blocking in GH actions, I think there is no point in running them in GH at all! Who inspects the action logs unless they fail? |
The only reason I see is if it supports pointing out type errors only related to the current PR, so that we can be sure a PR doesn't introduce new type check errors. Don't know enough about the "gradual typing" feature in mypy though to tell if it's possible. |
I don't know if it's possible to set up partial mypy checking of PRs, like codecov does for code coverage. I know commercial tools, such as SonarCloud, do that. To be able to do so, we'd need two sets of configurations: one for the whole project (which needs to be lenient in the beginning) and a stricter one for a PR. I don't know if that's possible with actions or mypy. |
Well, if that gets hard to get right, let's skip on the CI integration for now. But if @erlend-aasland sees value in it, I'm not opposed to adding the CI action (without enforcing the check). |
All of this is possible using GH and mypy. I can look in to it in a couple of days, unless @sveinse beats me to it. |
@erlend-aasland I'm curious if you have a setup for running incremental mypy with stricter rules than the overall project. I've found the mypy settings that give no errors. We can use this and remove error exceptions one by one while fixing the code base. Not sure if that's the best approach to do this in steps though.

```toml
[tool.mypy]
files = "canopen"
strict = false
ignore_missing_imports = true
disable_error_code = [
    "return-value",
    "var-annotated",
    "attr-defined",
    "union-attr",
    "annotation-unchecked",
    "operator",
    "arg-type",
    "assignment",
    "override",
    "index",
    "return",
    "has-type",
    "no-redef",
    "misc",
]
```
|
What good is a checker tool if all its warnings are silenced? I'd rather not silence them, but accept the fact that it shows a red light as long as there is something to fix. Remember, the goal is not the green light, but a working code-base with a decent amount of support for the users, so they can spot errors before the application hits them at runtime. |
A silenced tool has no value, but it works as a strategy to enable options incrementally, one by one, if we think the PR for getting all 101 errors fixed in one go is a bit too much. |
But I want to see the errors while fixing them. Sorry, but silencing warnings is the opposite of right in my experience. Keep them visible, work on fixing bit by bit. It helps nobody to see a green light when it was reached by looking away. Overall, I think this is getting too focused on the tools rather than the workpiece. Let's spend time (which isn't abundant) discussing what needs fixing instead of how to configure the tools. |
I was under the assumption that we'd want them plugged into the CI/CD, and that we're starting from scratch, not using the tools. We need to either put the bar so low that everything passes and tighten it PR by PR, or do a rather large PR to get the level up to what we consider the acceptable minimum. I repeat my opinion that there is no use for running mypy in CI/CD unless it fails the build. Otherwise it'll never get used. ...if we don't agree on that ambition, perhaps we should begin there. |
I'm not pushing for mypy to be included in CI runs. But if we do, then it should show its true result, not just pass because we basically told it to lie to us. What's the benefit of having it if any possible warnings are hidden anyway? Newly introduced errors will not be noticed, and it gives a false sense of compliance seeing these checks pass although nothing is actually fixed yet. There will be no sign of progress either, because we start off with a green light. I do use mypy locally already. But I find it's okay to see some warnings there while focusing on different work. Sure they should be fixed at some point, but they shouldn't distract from ongoing work and the fact that we do have a working code-base even without perfect typing. Remember it's just helpful pointers to avoid pitfalls caused by wrong assumptions. Summing up, we should enforce CI checks once we have a state that passes without any silencing options. The road to that point either has them enabled but not enforced (accepting the red flag), or only running mypy locally. The former has the advantage that the analysis results are easy to look at without checking out the affected branch and running locally. If someone goes through that effort, they might as well fix what they see, since the IDE will already be open. |
OOI what settings do you use where you only get warnings from mypy? |
Sorry, I should have been more precise. I didn't mean to differentiate between errors and warnings. To me, both categories are just "warnings", because the code works nonetheless. So whenever I speak about warnings in type checking, I also mean errors. I don't use any special mypy configuration. Only what we have in |
@erlend-aasland do you have any experience with using mypy with GH actions? I know it's straightforward to have it run as an action job, and that it must pass without errors for the CI/CD flow to pass. But do you know of any system where the errors/warnings are fed back to the contributor? |
I use
Note that you can configure your GitHub project so it's not a required check. It does not need to pass without errors in order for the CI to be green.
IMO, examining the CI logs is sufficient, but I'm not sure if I understood your question correctly. Footnotes
|
canopen type annotation support is very rudimentary. My opinion is that type annotation is equally valuable to a project as unit testing is: It helps determine the expected behavior and uncovers errors. Type annotation is not just "sticking some annotations on things"; it imposes a stricter, perhaps more conscious, way of coding. Examples from this project are typically the use of default value of We need to agree on what level of annotation support we aim for. E.g. mypy strict support is a very different ballgame than using mypy for development guidance only and allowing errors to remain. For reference: our current master gives 106 mypy errors in standard rules mode and 630 errors in strict mode.

Furthermore, I strongly believe that type annotations and functional changes go hand in hand and not separately, much like how functional and unit testing go together. E.g. PR #513.

I've been spending a lot of my spare time on improving the type annotation support in canopen. I've tried a few different approaches through various PRs already, but none of them seems to resonate. They are too big. While I agree that we need to go through the discussions this unlocks, I'm really afraid that going very granular is going to slow progress to a crawl. Going for type annotation is a big job. Yet, it'll take a long time baking bread from individual grains. So perhaps full type annotation support is not within our ambition?

I've come to a point where I don't know how to proceed, and there is certainly no point in wasting your time on PRs that won't get accepted. I'm sorry, but I'm strongly considering ceasing my type annotation efforts. Since we want to go very granular on the PRs, I'm genuinely afraid it'll take more effort than I'm able to offer in my spare time. |
I OTOH think it is a reasonable goal. With a code base of this size, I'd expect it to take between 5 and 8 months given the number of currently active contributors. |
Thanks @sveinse for your efforts so far. I think they are definitely valuable and not wasted. The fact that so far it hasn't resonated much is from my perspective a misunderstanding / disagreement on how to approach the thing overall. Yes, #456 by @friederschueler was put down pretty quickly, unfortunately! Thanks as well for that effort, it was just too hard to review. We can still use it to see how @friederschueler had solved some problems, without re-inventing the wheel while doing smaller steps. And for my part, I haven't had enough time to actually look at the recent suggestions. Too much going on in this project with better tests, fixes, etc. that I just deferred further typing efforts a little. But I do intend to get back to them of course. Yes, it will take a lot more time to get typing up to an acceptable level in this library. Here's what next steps I had in mind:
My vision of this process is to really embrace these facts: Right now we have about 5 or 10 percent "OK". The rest is missing (not annotated) or buggy (needs code adaptations). The more we annotate, the more there will be to fix. But that doesn't mean that any annotation must also immediately result in fixing all newly introduced warnings. They can very well be left piling up for some later PR or even someone else to fix them. But this way we will make progress, in small steps. There is a lot of space between our current state and whatever 100 percent we aim for. Let's reach it incrementally. That means in essence, I don't want anything silenced, but see the stuff we are talking about. I want discussions about concrete improvements, whether they just add annotations or also add fixes. But not too big a chunk, rather stop at some point early and let that be reviewed and merged. See #451 which went through with little discussion and didn't make anything perfect, but improved a small part in a comprehensible, non-controversial way. That's roughly the size I'd be comfortable reviewing regularly. Does that sound like a plan? Who's up for bullet number 1 to add the ugly red light? P.S. All of these changes should be of very low risk, as we will mostly add extra safety checks. As long as the code itself keeps working, just without these added precautions, we're not worse off than right now where stuff will simply explode at runtime when stepping on a type-confusion mine. P.P.S. And I don't think we will target strict mode anytime soon, if ever. That would be a goal for a different API, not this one with its heritage and exploiting dynamic typing for ease of use as we currently do. |
I humbly disagree. I think it's a better approach to get the project up to a minimum level and begin tightening the reins from there. PR #512 is doing precisely that, where it starts at a mypy default minimum. The objective: Get to green, like we do for any pytest checks. How do we easily detect when new typing errors are introduced? With a test that normally passes, we'd see it immediately. If it's in a list of 100 other errors, it'll drown.
Just running it through as "not required" doesn't create an ugly red light, does it? AFAIK the only way to get a red light is to fail the job, and a red light will block any merge, won't it? Maybe it's possible to run it in a separate GHA workflow and update the branch protection rules NOT to require that job to succeed? Is this possible @erlend-aasland ?
Yes, this is the intent of #512.
mypy has an excellent guide on how to migrate existing projects from default settings towards stricter ones: https://mypy.readthedocs.io/en/stable/existing_code.html. I'm following the approach it proposes with the start in #512. IMHO going for untyped
I can see that it would be valuable to do "case" based improvements. E.g. the Network fix as proposed in PR #513. This is where annotations and functional improvements goes hand in glove.
Well, I have some experience with the road ahead, and I erroneously thought that #512 was a slow and gentle start, because in the volume of changes to come it is small. Yet, as said, this PR is too broad and needs to be split up into pieces. #512 is definitely not a perfect fix. It's a starting point.
Agreed. But I hope, not too slow.
This is why it is so great that the test coverage and unit tests are being improved in parallel! Yet, at some point we have to break an egg to make the omelet. If we are to fix and improve the library, we must be able to look forward, not just curate the existing. By this I don't mean that we should disregard stability, but we need to allow ourselves to be more forward-looking than history-gazing. My 2 cents.
I agree. It'd be a stretch to get to strict. We'd have to accept far more functional changes than what we've deliberated so far. I propose we do the following:
|
We are several people actively working on it and that's really awesome. It's cool working on a project with many contributors 💪
Yes, we agree about the disagreement :-)
I see a logic error in your analysis. If the test normally passes because it is muted, then it will NOT raise a flag on new errors. They will simply be muted as well. The warnings are there to be seen and fixed, not to be hidden. Better to concentrate on getting that list of 100 errors shorter. But we need a list for starters.
What I mean should look like this: There are required and non-required checks. An optional one does get a red cross, but doesn't fail the CI run completely. (Source: syncthing/syncthing#9175)
My stance on that is very clear: It is OK (right now) because it is the status quo (many typing failures), but we release working code nevertheless. We don't need to raise that bar from many failures to zero in one single step. Let there be errors, as they still do not affect the code functionality. It's just a linter. As a reference, have you ever checked out what the Debian GNU/Linux distribution does with their Regarding your proposal #512, sorry, I simply haven't looked at it yet. I need some time, which I didn't find today. That doesn't mean it's not being considered at all. But it is more than my outlined first step, i.e. it goes too far yet. I just think this discussion is the bottom line of defining how we want to proceed, therefore I spent my time here instead of looking at the PR, to get the basics squared and agreed upon. Let's try the American approach in this case: move fast, break stuff. Whoever first finds an opportunity to dive into GitHub checks configuration, please proceed with adding an optional mypy check, as outlined above. If it gets too annoying or problematic, we can always back out of it again. In the meantime / additionally, everyone is free to run mypy locally as they see fit, with or without any silencing options. |
No worries, I won't hold it against you that you haven't looked at the PR yet 👍. FTR the PR doesn't mute anything. Nor does it suppress any errors; it uses the default settings of mypy. And the proposed PR (albeit too functional-y) fixes all of the open issues mypy addresses. When that PR is done we will be compliant with and passing the mypy default. THEN we can start tightening the rope by enabling new settings that draw us towards strict(er) compliance.
Ah nice. Perfect.
Type annotation is not just a linter, it's a methodology. It's a more restrictive application of Python. Conceptually it's the same as applying black to Python. Black adds opinions on how things shall be formatted. Type annotations do the same.
PR #512 contains two things: a set of annotations and a set of functional changes. From the discussions I agree it should be divided into smaller groups of changes. However, I stand by the goal and wanted end result of implementing the essence of PR #512: getting mypy passing into the green with default settings.
Yes, we need the GHA in place here anyways. |
Precisely, it concerns only the how, not what the code does. That's what I meant.
Of course, I don't think the end result was ever disputed. Just the size of the first step(s). What I want is to see the progress made, i.e. run mypy in GHA so everyone looks at the same output. After that is public, any improvement can be easily measured against that baseline. |
FTR, the project owners must configure GHA checks as not required in the project settings; I cannot do this with a PR. I can help with other CI related stuff, though. Regarding pace; I don't see any reason to rush things. We'll get there eventually. I'll be focusing on test coverage for now; IMO, that's the most important task. |
I'd like to make a progress-tracking issue like #514 for type annotations, and I started outlining one. However, I stopped because I think we do not agree on its definition of done1, so stating the issue objective and its tasks is not yet possible. More so, if there is little interest and/or capacity to consider type annotation improvements in parallel with the test improvements, there's no point in doing this work now. Footnotes
|
As a minimum, you will need to configure the minimum supported Python version for your annotations, so something like this:
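The configuration snippet itself didn't survive here; a plausible minimal sketch of what it likely refers to is mypy's `python_version` option (the `pyproject.toml` placement is an assumption):

```toml
[tool.mypy]
# Check annotations against the oldest supported interpreter,
# so newer-only typing syntax gets flagged.
python_version = "3.8"
```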
|
Good catch. This is a comment that should be posted in #512. Luckily adding this to the mypy config didn't change anything. Everything in that PR is compatible with 3.8. 😓 |
I closed #456 because it was a mess with too many changes as we agreed. I am currently very busy in my dayjob, but I plan to commit some single PRs from this work. |
Establish a minimal mypy config that ensures annotations are compatible with Python 3.8 (the oldest supported Python version). Add a forgiving CI job that does not affect the status of the GitHub workflow run.
I'd like to revisit one option, discussed previously in this thread, namely how to deal with all the optional members etc. Instead of a blanket change of sprinkling the code with (potentially) performance-degrading behavioural changes ( Use Pro:
Con: code smell (but so is the currently proposed alternative ⚖️) Using
Pro: ? I urge you to reconsider this, as I believe using Since everyone agrees What do you think? |
Establish a minimal mypy config that ensures annotations are compatible with Python 3.8 (the oldest supported Python version). Add a forgiving CI job that does not affect the status of the GitHub workflow run.
With reference to the discussion in #339 with regards to type hints, I have started looking at the work needed for type annotating canopen. What are the overall ambitions with regards to implementation of type hints? What level of hinting and static code analysis are we aiming for? What tools? mypy, flake8, pyright/pylance?
I'm a big proponent of type-annotated Python. It helps development, navigation in the editor works, and it shows the expected types for calls and operations, thereby reducing bugs and misunderstandings. I think adding typing to canopen is the right thing to do. The benefits outweigh the disadvantages.
Having a fully compliant type-annotated library will reduce the degrees of freedom in the codebase. You can do more "tricks" with Python than a linter is able to understand, and thus coding for it requires a degree of code conformity. canopen relies on a lot of duck typing and dynamic types (especially variables that conditionally contain `None`). This results in quite a few warnings and/or code constructs to avoid the warnings. - And rightly so, as the code in many cases is not executing any type checks to verify the datatype before invoking an operation. Lastly, many of these linting tools are very opinionated on code structure, so we must expect some functional changes.

I've given a stab at implementing type annotation in canopen (see link below). I'm using mypy and pyright. After doing this I think we will have to face a few discussions on the correct and intended behavior of a few details, e.g. the use of `None` as optional value.

sveinse@4e8925e