Problem Description
In general, I understand and agree with pdoc's use of __all__ to determine what to document.
However, when it comes to submodules within the __init__.py of a parent module, there are drawbacks to importing all submodules to the parent level.
Specifically, importing submodules eagerly can increase the memory footprint (in theory) and can surface circular dependency issues that would not otherwise occur (confirmed through my own experience).
There may also be implications when considering submodules that have imports that are only valid given the inclusion of certain python "extras". There's a risk that recursively importing submodules at runtime could cause unexpected failures in those cases.
I have loosely/tentatively confirmed that putting submodule imports in if typing.TYPE_CHECKING: seems to avoid these issues, but I'm not sure of the other implications.
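A minimal sketch of the TYPE_CHECKING pattern described above, built in a temporary directory so it can be run as-is. The package name `demo_pkg_type_checking` and the submodule `heavy` are hypothetical, and the sketch only checks the runtime side (the submodule is not imported when the parent package is imported). Whether pdoc then documents the submodule is the behavior tentatively reported above; it is not verified here.

```python
import importlib
import pathlib
import sys
import tempfile

# Hypothetical package layout, created in a temp dir for illustration only.
root = pathlib.Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg_type_checking"
pkg_dir.mkdir()

init_source = (
    "from typing import TYPE_CHECKING\n"
    "\n"
    "if TYPE_CHECKING:\n"
    "    # Seen by static analysis only; never executed at runtime.\n"
    "    from . import heavy\n"
    "\n"
    '__all__ = ["heavy"]\n'
)
(pkg_dir / "__init__.py").write_text(init_source)
(pkg_dir / "heavy.py").write_text("VALUE = 42\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Importing the parent package does not pull in the submodule.
importlib.import_module("demo_pkg_type_checking")
loaded_eagerly = "demo_pkg_type_checking.heavy" in sys.modules

# The submodule is still importable on demand.
heavy = importlib.import_module("demo_pkg_type_checking.heavy")
loaded_on_demand = "demo_pkg_type_checking.heavy" in sys.modules

print(loaded_eagerly, loaded_on_demand, heavy.VALUE)
```

One caveat with this sketch: because `heavy` is listed in `__all__` without being bound at runtime, `from demo_pkg_type_checking import *` would import the submodule at that point, so the deferral only holds for plain imports of the parent package.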
Proposal
Either by promoting the TYPE_CHECKING workaround, or by another method, I'd like to find a path forward where submodules would be available for docs generation without risking adverse runtime impacts on the package itself.
Alternatives
A second option would be for pdoc to adopt a behavior of always documenting submodules if they are not prefixed with "_".
A third option would be for me to cleverly adapt my own docs/generate.py script to accomplish the same effect. (Not sure if this is possible.)
A fourth option would be to come up with an HTML template to accomplish the same effect. (I'm pretty sure this is not possible.)
It's a lot of code, but basically we recurse all subdirectories manually and use naming conventions to add submodule files and folders explicitly in the docs/generate.py script.
Note: For submodules that don't declare __all__ at all, pdoc will emit a warning that their sub-submodules are being declared twice. This is because when __all__ is omitted, pdoc already auto-documents submodules.
docs/generate.py
import os
import pathlib
import shutil
from typing import cast

import pdoc


def run() -> None:
    public_modules = [
        "airbyte_cdk",
    ]

    # Walk all subdirectories and add them to the `public_modules` list
    # if they do not begin with a "_" character.
    for parent_dir, dirs, files in os.walk(pathlib.Path("airbyte_cdk")):
        for dir_name in dirs:
            if "/." in parent_dir or "/_" in parent_dir:
                continue
            if dir_name.startswith((".", "_")):
                continue

            print(f"Found module dir: {parent_dir + '|' + dir_name}")

            # Check if the directory name does not begin with a "_"
            module = (parent_dir + "." + dir_name).replace("/", ".")
            if "._" not in module and not module.startswith("_"):
                public_modules.append(module)

        for file_name in files:
            if not file_name.endswith(".py"):
                continue
            if file_name in ["py.typed"]:
                continue
            if file_name.startswith((".", "_")):
                continue

            print(f"Found module file: {'|'.join([parent_dir, file_name])}")
            module = cast(str, ".".join([parent_dir, file_name])).replace("/", ".").removesuffix(".py")
            public_modules.append(module)

    # recursively delete the docs/generated folder if it exists
    if pathlib.Path("docs/generated").exists():
        shutil.rmtree("docs/generated")

    pdoc.render.configure(
        template_directory="docs",
        show_source=True,
        search=True,
        logo="https://docs.airbyte.com/img/logo-dark.png",
        favicon="https://docs.airbyte.com/img/favicon.png",
        mermaid=True,
        docformat="google",
    )

    nl = "\n"
    print(f"Generating docs for public modules: {nl.join(public_modules)}")
    pdoc.pdoc(
        *set(public_modules),
        output_directory=pathlib.Path("docs/generated"),
    )


if __name__ == "__main__":
    run()
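The naming-convention walk in the script above could also be isolated into a small pure function, which makes the filtering rules easier to check without invoking pdoc. This is only a sketch: `discover_public_modules` is a hypothetical helper, not part of pdoc or of the script, and it differs from the script in one respect, namely that it also skips files that sit inside a private or hidden directory. The demo tree is created in a temporary directory.

```python
import pathlib
import tempfile


def discover_public_modules(package_root: pathlib.Path) -> list[str]:
    """Return sorted dotted names of public modules under `package_root`.

    Hypothetical helper: any path with a component starting with "." or "_"
    is treated as private and skipped.
    """
    modules = {package_root.name}
    for path in package_root.rglob("*"):
        rel_parts = path.relative_to(package_root).parts
        if any(part.startswith((".", "_")) for part in rel_parts):
            continue
        dotted = ".".join((package_root.name, *rel_parts))
        if path.is_dir():
            modules.add(dotted)
        elif path.suffix == ".py":
            modules.add(dotted.removesuffix(".py"))
    return sorted(modules)


# Demo tree (illustrative names only).
root = pathlib.Path(tempfile.mkdtemp()) / "pkg"
(root / "sub").mkdir(parents=True)
(root / "_private").mkdir()
for rel in ("__init__.py", "a.py", "py.typed", "sub/__init__.py", "sub/b.py", "_private/x.py"):
    (root / rel).write_text("")

found = discover_public_modules(root)
print(found)
```

Using a set and returning a sorted list keeps the result free of duplicates and stable across runs, so it could be passed to pdoc.pdoc(...) in place of *set(public_modules).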
Additional context
Here is a ChatGPT conversation that explores implications and options: https://chatgpt.com/share/672ecbd4-47e0-8004-bf55-269227d96b9f