Make it easier to create python stub files #3268
Open
+100
−41
This PR makes it easier to create python stub files, which are used for IDE completion and static type analysis. The direct beneficiaries of this change are a very niche group, but we're a group that supports the community by providing tools that make developing with OpenUSD a more productive and pleasurable experience.
I'm using these changes to create stubs which are published to PyPI here. You can peruse the stub files here. For the past year or two I've had to maintain my own fork of OpenUSD to make this possible.
In order to sell you on this change, I'm going to demonstrate that the function signatures in the stub files that I'm creating are superior to those included with docstrings that ship with OpenUSD, and I'd like to plant the idea that this PR is the first step towards an eventual integration of stub generation into the native OpenUSD build process.
Background
As someone who wants very accurate python stubs, I've faced a handful of challenges:
`object`. However, it does give us the actual number of python overloads, and is reliable about certain result types like `list` and `tuple`, and this extra information can help fill the gap between live inspection of python objects and the rich information in the C++ docs.
Comparison between python signatures: yours (native) vs mine (cg-stubs)
Overview
As part of its build process, OpenUSD generates docstrings in side-car python files named `__DOC.py`, which are loaded at import time. In order to make use of these docstrings within your IDE, it must be configured to be able to import the OpenUSD python modules to inspect them.
By comparison, pyi stub files are a lightweight approach to load not only function and class documentation into your IDE but also type information, which is informative, and is used to drive code completion. Additionally, stub files can be used by static type analysis tools like mypy to validate your code. Providing access to stub files within your IDE is typically as simple as running `pip install types-usd` within your project's virtualenv.
Quality
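Since the comparison in this section turns on overload signatures, here is a minimal sketch of how overloads are expressed with `typing.overload`, the same syntax a pyi stub uses. The class `ClipReader` and its method `get_paths` are hypothetical names invented for illustration; they are not part of OpenUSD or of the published stubs.

```python
from typing import List, Optional, overload


class ClipReader:
    """Hypothetical class used only to illustrate overload syntax."""

    # In a .pyi stub, only the @overload declarations below would appear,
    # each with "..." as its body. A type checker picks the matching one.
    @overload
    def get_paths(self) -> List[str]: ...
    @overload
    def get_paths(self, clip_set: str) -> List[str]: ...

    # The runtime implementation (this part lives in the real module,
    # not in the stub).
    def get_paths(self, clip_set: Optional[str] = None) -> List[str]:
        name = "default" if clip_set is None else clip_set
        return [f"/clips/{name}/a.usd", f"/clips/{name}/b.usd"]


reader = ClipReader()
default_paths = reader.get_paths()
hero_paths = reader.get_paths("hero")
```

With declarations like these, a tool such as mypy can report a call like `reader.get_paths(42)` as an error without importing or running the module.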
This is the docstring that OpenUSD generates for the method `Usd.ClipsAPI.GetClipAssetPaths`:
These are my stubs generated for the same methods:
You'll notice that the docstrings present in yours and mine are equivalent (and share the same whitespace idiosyncrasies) because I'm using the same `doxygenlib` modules to process docstrings.
The primary difference is in the overload signatures. Here's a more direct side-by-side comparison:
First overload
Second overload
tl;dr The native signatures are wrong in many ways. Below is a table explaining the sources of these differences:
`std::vector` and `std::sequence`
`std::set`, `std::unordered_set`, `std::function`, `std::map`, `std::unordered_map`, `std::optional`
`Ar.ResolvedPath`
`typedef` and `using` statements to substitute these aliases for their actual types
`self` or `cls` args
`self` and `cls` args for methods
Ok, I'm convinced. What next?
Here's my proposed 3-step plan:
Step 1
Merge this PR.
Step 2
I will contribute type annotations to several open source projects which I can use to further validate my stubs, and continue to broadcast the existence of these stubs far and wide to drive adoption. I've already created a PR for the great USD Cookbook repo. Ideally, if Pixar is open to it, I would like to contribute type annotations to OpenUSD itself, for example in `Usdviewq`. To ease into this, I have a followup to this PR to add annotations to `doxygenlib`.
Step 3
If we are all in agreement, then the final step would be to move the pyi stub generation process into OpenUSD itself. I'm a pretty busy guy and moving the project into OpenUSD would ensure that it undergoes more regular releases than I can manage. I will be around to help out with maintenance of the generator.
Q & A
What has changed under the hood?
There are two main changes in this PR, both limited to the build process:
`doxygenlib` is updated to hide these signatures from users.
`Tf.PreparePythonModule` overrides function `__doc__` attributes with values from `__DOC.py`. This behavior has remained unchanged. What has changed is which functions are included in `__DOC.py` files during the build process. Prior to this change, `cdWriterDocstring` omitted entries in `__DOC.py` for functions which had a docstring provided by Boost, because an existing docstring indicated that it had been manually authored in the C++ wrapper.
After this PR, all functions have docstrings generated by Boost (because of the new Boost signatures), so we have to do a little extra work to determine if the Boost docstrings are just the auto-generated python signatures, or custom docstrings from the wrappers. Now when `cdWriterDocstring` processes an existing docstring, it removes the Boost signatures and, if there's anything left, writes it to `__DOC.py`.
The upshot is that the additional docstrings make the python lib directory slightly larger, by 25MB or about 0.6% of a full OpenUSD build:
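Going back to the docstring filtering step described above, the idea can be sketched roughly as follows. This is not the code in `cdWriterDocstring`: the function names `strip_boost_signatures` and `has_custom_docstring` are hypothetical, and the regular expressions assume the layout Boost.Python commonly emits (a `name( (type)arg, ...) -> result :` line, followed later by a `C++ signature :` header and one line holding the C++ signature).

```python
import re

# Assumed shape of an auto-generated python signature line, e.g.
#   "Foo( (Bar)arg1, (int)arg2) -> None :"
# Caveat: a hand-written line that looks the same would also be removed.
_PY_SIG = re.compile(r"^\s*\w+\(.*\)\s*->\s*[\w.]+\s*:?\s*$")
# Assumed header that precedes the C++ signature line.
_CPP_HEADER = re.compile(r"^\s*C\+\+ signature\s*:\s*$")


def strip_boost_signatures(doc: str) -> str:
    """Return whatever remains of `doc` once generated signatures are removed."""
    kept = []
    skip_next = False
    for line in doc.splitlines():
        if skip_next:
            # This is the C++ signature that follows the header line.
            skip_next = False
            continue
        if _CPP_HEADER.match(line):
            skip_next = True
            continue
        if _PY_SIG.match(line):
            continue
        kept.append(line)
    return "\n".join(kept).strip()


def has_custom_docstring(doc: str) -> bool:
    """True if text authored in the wrapper survives the stripping."""
    return bool(strip_boost_signatures(doc))


# Illustrative inputs (made up for this sketch, not copied from a real build).
auto_only = (
    "\nFoo( (int)arg1) -> None :\n\n"
    "    C++ signature :\n        void Foo(int)"
)
with_custom = (
    "\nFoo( (int)arg1) -> None :\n    Returns the answer.\n\n"
    "    C++ signature :\n        void Foo(int)"
)
```

Under these assumptions, `auto_only` strips down to an empty string and would be treated as having no manually authored docstring, while `with_custom` keeps its one authored line.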
What are the differences for the user?
As mentioned, I've taken great pains to preserve the original behavior as much as possible. However, there are some edge cases.
Functions for which `cdWriterDocstring` found a C++ match in doxygen will remain the same, because they will receive an entry in `__DOC.py`. However, there are functions which `cdWriterDocstring` never visits and which therefore do not receive an override in `__DOC.py`. These functions will now have docstrings generated by Boost.
For example, before this PR, the following function had no docstring:
$ python3.9 -c "import pxr.Sdf as Sdf;print(Sdf.ListEditorProxy_SdfNameKeyPolicy.ContainsItemEdit.__doc__)"
None
After this PR, it has a docstring generated by Boost:
Why enable the Boost signatures by default?
If we follow through with the 3-step plan above, building with signatures will need to be the default, because we will not want to build one version of OpenUSD with signatures enabled to create the stubs, and then rebuild with signatures disabled.
Why not provide an option to enable the Boost signatures?
I'm open to doing this, but adding an option to enable the Boost signatures adds complexity to the doxygenlib code because it needs to produce equivalent output in both scenarios.
I know this is a lot to read, but I thought I'd front load all of the context necessary to make a decision on this PR. Take your time and ask as many questions as you like.
Thanks!