Faster pprint #1145
Hi there, thanks for the PR.
This sounds slightly odd -- exceptions are only formatted when they're printed (or otherwise when Regardless though, this PR likely can't go forward, because it isn't the case that exception contents will be JSON serializable! The instance or schema may have been deserialized using a custom
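To illustrate the serializability concern, here is a minimal sketch. The `Decimal` case is an assumed example chosen for illustration, not something taken from this thread: an instance loaded with `parse_float=decimal.Decimal` can be pretty-printed, but `json.dumps()` rejects it.

```python
import decimal
import json
import pprint

# Assumed example: deserialize floats as Decimal, which json cannot re-encode by default.
instance = json.loads('{"price": 1.10}', parse_float=decimal.Decimal)

# pprint.pformat() falls back to repr() and works for arbitrary Python objects.
print(pprint.pformat(instance))

# json.dumps() raises TypeError for values without a JSON representation.
try:
    json.dumps(instance)
except TypeError as error:
    print(f"json.dumps failed: {error}")
```

Any formatter used inside an exception's string representation would need to handle such values without raising.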
Hi, thank you for the comment!
Yes, we output errors as log entries. Generally this is harmless, but formatting can be time-consuming when the JSON involved has many elements.
Agreed. So the
I still wouldn't want to use Is there a reason in your logs you don't simply format exceptions in whatever way you'd like by pulling off whatever attributes you're interested in? You might also be interested in #243 which is about making this precise function (
Points taken.
No, we can format our logs ourselves. But I thought using json.dumps() would be a reasonable default to improve jsonschema. I'm closing the PR. Thank you!
Fair enough, I appreciate the offer(s) regardless! Thanks for giving it a shot.
jsonschema can sometimes be very slow when validation errors occur. This is due to the slow formatting of large schemas or data with pprint.pformat(). In this PR, I propose replacing pprint.pformat() with json.dumps(), which is expected to perform more than 10 times faster.
The following samples measure the time taken to format 300 GitHub PRs.
pprint.pformat()
json.dumps()
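The original timing samples did not survive on this page. As a rough stand-in, the sketch below times both formatters on synthetic records; `make_records` and `time_formatters` are hypothetical helpers, and the generated data is an assumption, not the 300 GitHub PRs used in the original measurement.

```python
import json
import pprint
import timeit


def make_records(count=300):
    """Build synthetic, JSON-compatible records (a stand-in for the PR data)."""
    return [
        {
            "number": i,
            "title": f"Example pull request {i}",
            "labels": ["bug", "enhancement"],
            "user": {"login": f"user{i}", "id": i},
            "body": "lorem ipsum " * 20,
        }
        for i in range(count)
    ]


def time_formatters(data, repeat=3):
    """Return the best wall-clock time in seconds for each formatter."""
    return {
        "pprint.pformat": min(
            timeit.repeat(lambda: pprint.pformat(data), number=1, repeat=repeat)
        ),
        "json.dumps": min(
            timeit.repeat(lambda: json.dumps(data), number=1, repeat=repeat)
        ),
    }


if __name__ == "__main__":
    for name, seconds in time_formatters(make_records()).items():
        print(f"{name}: {seconds:.4f}s")
```

Actual ratios depend on the shape and size of the data, so results from this sketch should not be read as reproducing the numbers claimed above.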
(Memo: if adding a dependency is an option, using orjson would make it more than 10 times faster.)
json.dumps() does not perform the word wrapping or other formatting that pformat() does. However, such formatting can insert unnecessary newline characters into the data, sometimes making debugging hard. I believe it's more appropriate not to perform such modifications.
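A small sketch of the wrapping difference, using an assumed sample record and the default settings of both functions: pformat() splits a long string value across several lines, while json.dumps() keeps it on one.

```python
import json
import pprint

# Assumed sample data: one long string value (150 characters).
record = {"body": "word " * 30}

pretty = pprint.pformat(record)   # wraps the long string at the default width
compact = json.dumps(record)      # emits a single line with no added newlines

print(pretty)
print(compact)
```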