Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup of DTS processing in the build process #78159

Merged
merged 5 commits into from
Sep 25, 2024

Conversation

benediktibk
Copy link
Collaborator

@benediktibk benediktibk commented Sep 9, 2024

This PR includes a number of cleanups in the edtlib, as well as a split of the gen_defines.py into gen_edt.py and gen_defines.py.

This ensures a clear responsibility of the scripts. So far the pickled edt was generated as a byproduct by gen_defines and later on used as an input by another script.

@benediktibk benediktibk changed the title Cleanup of DTS processing the build process Cleanup of DTS processing in the build process Sep 10, 2024
@benediktibk benediktibk marked this pull request as ready for review September 10, 2024 09:18
# This doesn't violate any devicetree.h API guarantees about
# instance ordering, since we make no promises that instance
# numbers are stable across builds.
for compat, nodes in self.compat2nodes.items():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fgrandel why do you believe that this belongs in edtlib instead of gen_defines main? also, if you really think it belongs here, should we take the opportunity to convert it to an insertion sort?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @decsny , thanks for reviewing this PR.

why do you believe that this belongs in edtlib instead of gen_defines main?

Conceptually this property should be read-only, because it is just a materialized view of internal EDT state. IMO it is bad OO practice to change a derived property owned by another object from the outside (breaks encapsulation, introduces path dependency to other usage sites). And the sorting concern seems to fit the general EDT concern well, too.

Do you see any disadvantage in having the same sorting for all clients?

should we take the opportunity to convert it to an insertion sort?

Yes, why not. As we only have a 0/1 sort, we could change

                self.compat2nodes[compat].append(node)
                if node.status == "okay":
                    self.compat2okay[compat].append(node)

to something like

                if node.status == "okay":
                    self.compat2okay[compat].append(node)
                    self.compat2nodes[compat].insert(0, node)
                else
                    self.compat2nodes[compat].append(node)

@benediktibk As it's your PR, probably better if you change that and I rebase later.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @fgrandel about this being an internal detail of the class. Therefore I think the sorting should happen within the class.

Using an insertion sort is not a bad idea. Probably not a big game changer, but it also doesn't hurt.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm not sure this is the correct location for this.

The purpose edtlib (and dtlib.py) is to have a standalone abd reusable python library for working with devicetrees, see:

I don't see the sorting as a necessity for working with devicetrees in general.
True that it's used by gen_defines for assigning instance numbers, and for that purpose the it wants to sort the nodes.

But instance numbers are not a devicetree thing: https://github.com/devicetree-org/devicetree-specification/releases/download/v0.4/devicetree-specification-v0.4.pdf but a Zephyr code thing on how we access nodes.

As it should be clear from the comments, this sorting is purely done because gen_defines assigns instance numbers based on the sorted collection.

Ref:

# As a way to satisfy people's intuitions about instance numbers,
# we sort this list so enabled instances come first.

Also, edtlib.py suddenly mentions devicetree.h, but edtlib should not know anything about generate source code, that's the responsibility of gen_defines.

# This doesn't violate any devicetree.h API guarantees about
# instance ordering, since we make no promises that instance
# numbers are stable across builds.

scripts/dts/python-devicetree/src/devicetree/edtlib.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@tejlmand tejlmand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for this cleanup, generally good looking.

Main concern is the moving of sort into edtlib.
The sorting is done due to instance numbering (a Zephyr thing), but the devicetree spec doesn't have the concept of instance number, and therefore this sorting cannot be argued from a devicetree point of view to be required in edtlib.
In fact doing a sort inside edtlib starts to make promises on something which is not a core devicetree thing.

Based on above and because the sorting is an internal Zephyr requirement, then I believe this sorting should still happen in Zephyr's devicetree code, and not in edtlib.py.

# This doesn't violate any devicetree.h API guarantees about
# instance ordering, since we make no promises that instance
# numbers are stable across builds.
for compat, nodes in self.compat2nodes.items():
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm not sure this is the correct location for this.

The purpose edtlib (and dtlib.py) is to have a standalone abd reusable python library for working with devicetrees, see:

I don't see the sorting as a necessity for working with devicetrees in general.
True that it's used by gen_defines for assigning instance numbers, and for that purpose the it wants to sort the nodes.

But instance numbers are not a devicetree thing: https://github.com/devicetree-org/devicetree-specification/releases/download/v0.4/devicetree-specification-v0.4.pdf but a Zephyr code thing on how we access nodes.

As it should be clear from the comments, this sorting is purely done because gen_defines assigns instance numbers based on the sorted collection.

Ref:

# As a way to satisfy people's intuitions about instance numbers,
# we sort this list so enabled instances come first.

Also, edtlib.py suddenly mentions devicetree.h, but edtlib should not know anything about generate source code, that's the responsibility of gen_defines.

# This doesn't violate any devicetree.h API guarantees about
# instance ordering, since we make no promises that instance
# numbers are stable across builds.

scripts/dts/gen_edt.py Outdated Show resolved Hide resolved
cmake/modules/dts.cmake Outdated Show resolved Hide resolved
cmake/modules/dts.cmake Outdated Show resolved Hide resolved
scripts/dts/python-devicetree/src/devicetree/edtlib.py Outdated Show resolved Hide resolved
scripts/dts/gen_edt.py Show resolved Hide resolved
scripts/dts/edtlib_logger.py Show resolved Hide resolved
@fgrandel
Copy link
Contributor

fgrandel commented Sep 13, 2024

Hi @tejlmand ,

thanks for your review.

The sorting is done due to instance numbering (a Zephyr thing), but the devicetree spec doesn't have the concept of instance number, and therefore this sorting cannot be argued from a devicetree point of view to be required in edtlib.

It is one of the defining distinctions between dtlib and edtlib that dtlib tries to be "client neutral" while edtlib introduces Zephyr-specific concepts (ordinals and bindings among others).

Please note, that edtlib has the responsibility to generate ordinals, too, so why not instance numbers? From the current docs:

dep_ord2node: A dict that maps an ordinal to the node with that dependency ordinal.
dep_ordinal: A non-negative integer value such that the value for a Node is less than the value for all Nodes that depend on it.

gen_defines.py uses these ordinals unchanged from edtlib:

out_comment("Node's dependency ordinal:")
out_dt_define(f"{node.z_path_id}_ORD", node.dep_ordinal)
out_dt_define(f"{node.z_path_id}_ORD_STR_SORTABLE", f"{node.dep_ordinal:0>5}")

I think along the lines of this argument instance numbers even SHOULD be in edtlib (just like ordinals) for improved re-use rather than in some client using them - but the concern should be properly documented as it is for ordinals. Backround: We intent to re-use edtlib for more clients (currently working on a settings subsys target) and there we want to re-use ordinals and instance numbers, too.

So what you actually discovered (and this should be fixed - but not in this PR) is that the README (but not the API doc) is outdated and misleading.

Also note that the version of edtlib/dtlib in PiPy/readthedocs you point to is very old and not relevant any more. The corresponding github repository has been retired quite some time ago because Zephyr withdrew from the non-Zephyr client projects still mentioned in the README (which was basically Lopper AFAIK).

@benediktibk I think @tejlmand is right, though, that this inline comment is really misleading and largely wrong. After all no sorting along instance numbers or source code generation is done here. We only sort by "okay" vs. the rest - I should have realized that and updated that comment. Plus we should mention the sorting (and maybe the intent to represent ordered instances) in the API docs similarly to how it is documented for ordinals. Can you change this (iif @tejlmand agrees) or should I do so in my own commit and you cherry-pick again?

@tejlmand Could you please re-review on this basis and also please mark optional change proposals as such as required by our reviewing standards. I like your proposals, but some of them are unrelated to the PR/cosmetic AFAICS and should therefore be marked as optional.

Edited to fix s/ordinals/instance numbers/ where appropriate.

@benediktibk
Copy link
Collaborator Author

Will take care of these things when I'm back from vacation.

@tejlmand
Copy link
Collaborator

It is one of the defining distinctions between dtlib and edtlib that dtlib tries to be "client neutral" while edtlib introduces Zephyr-specific concepts (ordinals and bindings among others).

True, but bindings is still a general concept which is also used outside of Zephyr, such as the Linux kernel
https://www.kernel.org/doc/html/latest/devicetree/index.html#devicetree-bindings

The ordinal is well described in the sense that they have a clear purpose and how they are constructed:

dep_ordinal:
A non-negative integer value such that the value for a Node is
less than the value for all Nodes that depend on it.

Please note, that edtlib has the responsibility to generate ordinals, too, so why not instance numbers? From the current docs:

That highly depends on how they are generated and how deterministic they are.
For example for ordinals it is clear why an ordinal for a given node is different from another node (see above).

But currently, the current instance numbers are simply chosen as:

The nodes within a compatible are sorted by enabled-status.
This is a way to satisfy peoples intuitions about instance numbers.
It also ensures that drivers and applications which do not use instance numbers can use their singleton with index 0.

to me, that doesn't look as a very good argument for the instance number in a public api.
If a better reasoning / way to describe how instance numbers are chosen which has some more "deterministic" way how a given instance number is determined, then please provide such proposal and I will take a second look at it.

Alternative is to keep the sorting in edtlib, and then simple describe that the collection is sorted, for example here:

compat2nodes:
A collections.defaultdict that maps each 'compatible' string that appears
on some Node to a list of Nodes with that compatible.
by appending:

The collection is sorted so that enabled nodes appears first in the collection

Also note that the version of edtlib/dtlib in PiPy/readthedocs you point to is very old

correct, but the text haven't changed and is still the same. It was only re-ordered in #56849, but not modified in any way relevant for this discussion.

and not relevant any more.

wrong, text is actually still the same, and afaik the link provide is the only rendered docs we have, hence the reason it's the easiest link to use.

The corresponding github repository has been retired quite some time ago ...

yep, but this doesn't change the fact that the docs is still matches what we have in the repo today.

please mark optional change proposals as such

sure, feel free to consider the two extra line proposals in python as optional.

@fgrandel
Copy link
Contributor

fgrandel commented Sep 22, 2024

Thanks @tejlmand for taking the time for such a detailed response. This makes your position much clearer to me and gives good hints how to proceed!

@fgrandel
Copy link
Contributor

Just as an addition to my initial comment in the light of @tejlmand's arguments:

Alternative is to keep the sorting in edtlib, and then simple describe that the collection is sorted, for example here:
zephyr/scripts/dts/python-devicetree/src/devicetree/edtlib.py

Lines 1928 to 1930 in 4b2c030

 compat2nodes: 
   A collections.defaultdict that maps each 'compatible' string that appears 
   on some Node to a list of Nodes with that compatible. 

by appending:
The collection is sorted so that enabled nodes appears first in the collection

@benediktibk Do you agree that we follow this proposal by @tejlmand initially? Maybe we could also add a TODO to remind us that we should make the overall instance id ordering more deterministic in the future?

@tejlmand I'll provide a proposal for deterministic instance ID ordering in a future PR. I first have to think about sensible ordering criteria. Is that ok?

Introduces type hints to all functions for improved static type checking
and IDE support.

Also equalizes spacing between functions as the lines are being touched
anyway.

Signed-off-by: Florian Grandel <[email protected]>
Removes redundant sorting of nodes.

Signed-off-by: Florian Grandel <[email protected]>
Moves node sorting concern into EDT.

Signed-off-by: Florian Grandel <[email protected]>
@benediktibk
Copy link
Collaborator Author

I've kept the actually sorting now in edtlib, but just changed the comments to state that they are sorted. I also tried to avoid instances numbers and such in edtlib, as these concepts do not belong there.

@tejlmand Did I understand you correctly, that this is acceptable for you?

@benediktibk
Copy link
Collaborator Author

@benediktibk Do you agree that we follow this proposal by @tejlmand initially? Maybe we could also add a TODO to remind us that we should make the overall instance id ordering more deterministic in the future?

I currently to do not see an actual necessity to have a deterministic instance ordering, but in general I agree that it feels more correctly to have these kind of things defined and adhered to.

Separate the pickled EDT generation from the C-Macro header
generation in gen_defines.py to have a more clear responsibility
of the scripts in the DTS parsing process.

Signed-off-by: Benedikt Schmidt <[email protected]>
Sort the elements in the lists of compat2nodes already during insertion.

Signed-off-by: Benedikt Schmidt <[email protected]>
Copy link
Collaborator

@tejlmand tejlmand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approved, and thanks for the patience and cleanup 👍

just note that it's a bad practice to have changes in one commit being removed in a later commit, as this makes reviewing commit-by-commit harder.
Example, comment is move in this commit: eed5a59
just to be completely remove in this later commit:
4a10b18

@benediktibk
Copy link
Collaborator Author

just note that it's a bad practice to have changes in one commit being removed in a later commit, as this makes reviewing commit-by-commit harder.

I actually thought about this, but for two reasons I decided against squashing them:

  • There was a dedicated thought process behind removing the comment (as can be seen in the discussion), separated from the actual move of the comment.
  • I didn't want to tinker around with the commit from @fgrandel and rather keep it as I cherry picked it.

But anyway, a rather cosmetic detail.

Thanks for the review, hopefully this is now more clear for the next person working in this area.

@tejlmand
Copy link
Collaborator

I actually thought about this, but for two reasons I decided against squashing them:

I fully understand your reasons, and as I have been part of the discussion then it's not impacting my review process.
But there are 14 assigned reviewers in this PR who still haven't conducted a review(afaict), and I don't expect any late arriving reviewer to go through all old discussions.

This PR is not so bad, but you can find PRs with >100 comments where you can't expect everyone to read all discussions, hence the reason why I just wanted to mention some good practice 😉 No reason to do any changes at this point.

@nashif
Copy link
Member

nashif commented Sep 25, 2024

image

Please do not assign additional people that are not maintainers of any of the areas being changed.
See https://docs.zephyrproject.org/latest/project/project_roles.html#teams-and-supporting-activities.

@mmahadevan108 mmahadevan108 merged commit e7bf414 into zephyrproject-rtos:main Sep 25, 2024
38 checks passed
@benediktibk benediktibk deleted the gen_edt branch September 25, 2024 18:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants