Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python POC #282

Closed
wants to merge 445 commits into from
Closed

Python POC #282

wants to merge 445 commits into from

Conversation

lucasmcdonald3
Copy link
Contributor

@lucasmcdonald3 lucasmcdonald3 commented Aug 15, 2023

Python codegen for Smithy-Dafny.

This generates conversion code between Dafny-generated Python code and

  1. Polymorph LocalServices
  2. Polymorph wrapped LocalServices
  3. AWS SDKs (boto3 wrapper)

The generated code works for all TestModels, the MPL, and MPL TestVectors.
MPL PR: aws/aws-cryptographic-material-providers-library#171

Sub-PRs

These smaller PRs focus on parts of this PR. These are not functional on their own. These are solely intended for reviewer readability.

Pending work

  • "ExternV2"
    • Replace PYTHONPATH hacks with Dafny ModuleName. These are identified with TODO-Python-PYTHONPATH.
    • Change generated extern names back from _ to . (this will fix Java and Net AWS SDK GHA failures)
  • Use PyPi DafnyRuntime
  • Timestamp shapes (not used in MPL I believe; will be needed for DBESDK)

Build instructions

Set up project:

  • Check out fork with submodules
  • Build Dafny-nightly-latest from source; set as your default Dafny
  • Install Python 3.11

Build a TestModel:

make polymorph_dafny
make polymorph_python
make transpile_python
make test_python

GHA Failures

  • NET/Java/Verification AWS SDKs: I committed a name change to AWS SDKs to help Python out. I will remove this as part of Dafny ModuleName work ("ExternV2").
  • smithy-dafny-codegen-cli: This should solve itself once Smithy-Python is on Maven central, so I'm not planning to debug it (unless this becomes blocking).

Misc

  • This PR is WIP. Its primarily provides a testbed for me during development.
  • I won't merge this in until all CI passes to avoid blocking mainline Smithy-Dafny on in-dev work.
  • Smithy-Python is included as a submodule. It points to my fork of Smithy-Python. I am not sure how to proceed with having this fork of Smithy-Python. Ideally, we would include Smithy-Python's mainline as a submodule. I will clean up the changes on my fork, get those changes looked at by us, then open a PR against Smithy-Python.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Copy link
Contributor

@robin-aws robin-aws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sticking my foot in the door as I really want to review this - feel free to pester me if I haven't started in a week :)

Copy link
Contributor

@texastony texastony left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Praise: I have only viewed 15 files, but I like what I have read so far.

.github/workflows/smithy-dafny-conversion.yml Outdated Show resolved Hide resolved
TestModels/Aggregate/smithy-build.json Outdated Show resolved Hide resolved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I am skipping over the rest of the test Models for now.

Comment on lines 397 to 410
# Hacky workaround until Dafny supports per-language extern names.
# Replaces `.`s with `_`s in strings like `{:extern ".*"}`.
# This is flawed logic and should be removed, but is a reasonable band-aid for now.
# TODO: Once Dafny supports per-language extern names, remove and replace with Pythonic extern names.
# This is tracked in https://github.com/dafny-lang/dafny/issues/4322.
# This may require new Smithy-Dafny logic to generate Pythonic extern names.
_python_underscore_extern_names:
find src -regex ".*\.dfy" -type f -exec sed -i $(SED_PARAMETER) '/.*{:extern \".*\".*/s/\./_/g' {} \;
find Model -regex ".*\.dfy" -type f -exec sed -i $(SED_PARAMETER) '/.*{:extern \".*\.*"/s/\./_/g' {} \;
find test -regex ".*\.dfy" -type f -exec sed -i $(SED_PARAMETER) '/.*{:extern \".*\".*/s/\./_/g' {} \;

_python_underscore_dependency_extern_names:
$(MAKE) -C $(STANDARD_LIBRARY_PATH) _python_underscore_extern_names
$(patsubst %, $(MAKE) -C $(PROJECT_ROOT)/% _python_underscore_extern_names;, $(LIBRARIES))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Praise/Note:
Nice way to "Make it Work".
This is very close to solution written by an ARG Intern from 2019.

Comment on lines 9 to 14
Note:
Python's `.encode('utf-8')` does not handle surrogates.
These methods are expected to handle surrogates (e.g. "\uD808\uDC00").
To work around this, we encode Dafny strings into UTF-16-LE (little endian)
and decode them before encoding into UTF-8 (`_strict_tostring`).
To decode, we reverse the encode step. (`_reverse_strict_tostring`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Tell me about the performance characteristics of this implementation.
It appears that we parse every character three times.

For Encoding:

  1. Encode to UTF-16-LE
  2. Decode to ?
  3. Encode to UTF-8

For Decode:

  1. Decode from UTF-8

Is this correct?

Copy link
Contributor Author

@lucasmcdonald3 lucasmcdonald3 Oct 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are parsing each character 3 times on decode as well (reverse of encode).

(Obviously I'm not happy about this, but I struggled to handle surrogate characters, and this seemed to be the most commonly suggested pattern to handle those...)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not need to block development on this.
But we should not ship a product with this deficiency.

I do not know how often we convert strings,
but I imagine it's often.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be a good time to at least discuss the eventual path to using --unicode-char:true in Dafny. A big part of supporting that mode is avoiding exactly this annoying edge case.

@texastony
Copy link
Contributor

Note: https://github.com/smithy-lang/smithy-dafny/blob/lucmcdon/python-poc/.github/workflows/test_models_dafny_verification.yml#L87

It looks like the Python GHA uses Dafny 4.3 but the Dafny Verification uses 4.1.

@robin-aws
Copy link
Contributor

Smithy-Python is included as a submodule. It points to my fork of Smithy-Python. I am not sure how to proceed with having this fork of Smithy-Python. Ideally, we would include Smithy-Python's mainline as a submodule. I will clean up the changes on my fork, get those changes looked at by us, then open a PR against Smithy-Python.

What changes did you have to make to smithy-python? If we can get those contributed upstream or find a way to not need them, you could just depend on the software.amazon.smithy.python:smithy-python-codegen package from Maven Central (I'm not sure when they are planning on publishing it there but it's likely pretty soon).

@lucasmcdonald3
Copy link
Contributor Author

What changes did you have to make to smithy-python? If we can get those contributed upstream or find a way to not need them, you could just depend on the software.amazon.smithy.python:smithy-python-codegen package from Maven Central (I'm not sure when they are planning on publishing it there but it's likely pretty soon).

Changes are here.

@lucasmcdonald3 lucasmcdonald3 deleted the lucmcdon/python-poc branch August 27, 2024 20:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants