Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Ranges so that they are easier to use #190

Merged
merged 6 commits into from
Jul 6, 2023

Conversation

popematt
Copy link
Contributor

Issue #, if available:

N/A

Description of changes:

Current implementation of Range has unnecessary structs/enums that make it confusing to use and maintain. Range was an enum of the possible types of ranges, and was used in places where there should only be one type of range. The Rust-y thing to do would be to rely on generics/monomorphization of RangeImpl. (This is also problem in the Kotlin implementation that was ported over when ion-schema-rust was in its early days.)

In addition, TypedRangeBoundaryValue and RangeBoundaryValue had a lot of redundancy, and RangeBoundaryType models inclusivity/exclusivity, but it's not truly orthogonal with the boundedness/unboundedness (which is modeled in RangeBoundaryValue and TypedRangeBoundaryValue).

Major changes in this PR include:

  • There's no longer an untyped range
  • Collapsed the exclusivity and the boundedness into a single enum (Limit)
  • Updated all range-based constraints to have the appropriate type of range (e.g. Utf8ByteLength(Range) => Utf8ByteLength(UsizeRange))
  • Change ValidValue to be an enum of Element, NumberRange, and TimestampRange – this is required because there's no longer a single range type that covers all ranges.
  • Change ValidValue::Element to hold a Value. Now it's impossible to construct a valid values constraint that has annotations on a ValidValue::Element. (May as well use the type system to our advantage.)

There's also PR tour notes in the PR diff that call out smaller changes.

Some things that I don't have a strong opinion about and/or want feedback:

  • Naming of Limit – it could also be Boundary and the variants could be Unbounded, Inclusive, and Exclusive.
  • Should the precision constraint use NonZeroU64 instead of u64? I wasn't sure if the benefit or the non-zero type would outweigh the loss of the convenience of using a primitive (u64) for this.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

ion-schema/src/constraint.rs Show resolved Hide resolved
})?;

isl_require!(occurs.upper() != &Limit::Closed(0usize) => "Occurs cannot be 0")?;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ isl_require! is a macro I created along the lines of Kotlin's require() function to make it more concise to check ISL syntax. It takes a condition that must be true, and format args for an error message if it's false. You'll see the definition of the macro further down in result.rs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

@@ -730,338 +716,6 @@ mod isl_tests {
values.iter().cloned().map(|v| v.into()).collect()
}

#[rstest(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ These tests are for ranges, so when I rewrote the Range implementation, I moved/ported them from here to the new ranges module.

.any(|p| p == file_name)
}

#[test_resources("ion-schema-tests/**/*.isl")]
#[test_resources("ion-schema-schemas/**/*.isl")]
fn test_write_to_isl(file_name: &str) {
if is_skip_list_path(&file_name) {
if is_skip_list_path(file_name) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ This is a Clippy fix.

@@ -1075,14 +729,14 @@ mod isl_tests {
fn is_skip_list_path(file_name: &str) -> bool {
SKIP_LIST
.iter()
.map(|p| p.replace('/', &MAIN_SEPARATOR.to_string()))
.map(|p| p.replace('/', std::path::MAIN_SEPARATOR_STR))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ I'm pretty sure this was a Clippy fix.

Comment on lines +99 to +100
macro_rules! isl_require {
($expression:expr => $fmt_string:literal $(, $($tt:tt)*)?) => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ Nifty little macro to make it more concise to validate the ISL.

Example usage:

isl_require!(foo.is_bar() => "Foo must be bar, but was {foo}")?;

I'm not very keen on the => to separate the condition from the format args, but there's a very limited set of characters that are allowed to follow an expr designator. (I originally wanted else, but the other options are , and ;.)

Feel free to tell me that it should be , or ; instead if you have an opinion on the matter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think with the limited options => sounds good to me.

ion-schema/src/types.rs Show resolved Hide resolved
Comment on lines +869 to +870
"`timestamp_offset` values must be non-null strings, found null",
);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ Automated change from cargo fmt.

Comment on lines +875 to +876
"`timestamp_offset` values must be non-null strings, found {e}"
));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ Automated change from cargo fmt.

Comment on lines -1375 to -1387
/// Returns IonSchemaError whenever annotations are provided within ValidValue::Element
/// only `range` annotations are accepted for ValidValue::Element
pub fn new(valid_values: Vec<ValidValue>, isl_version: IslVersion) -> IonSchemaResult<Self> {
let valid_values: IonSchemaResult<Vec<ValidValue>> = valid_values
.iter()
.map(|v| match v {
ValidValue::Range(r) => Ok(v.to_owned()),
ValidValue::Element(e) => ValidValue::from_ion_element(e, isl_version),
})
.collect();
Ok(Self {
valid_values: valid_values?,
})
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ ValidValue::Element can no longer have annotations, so all of this logic is no longer necessary.

@popematt popematt requested review from desaikd and zslayton and removed request for desaikd June 28, 2023 16:50
@popematt popematt marked this pull request as ready for review June 28, 2023 16:54
ion-schema/src/constraint.rs Show resolved Hide resolved
Comment on lines +99 to +100
macro_rules! isl_require {
($expression:expr => $fmt_string:literal $(, $($tt:tt)*)?) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think with the limited options => sounds good to me.

ion-schema/src/isl/mod.rs Outdated Show resolved Hide resolved
/// The end of a [`Range`].
#[derive(Debug, PartialEq, Clone)]
pub enum Limit<T> {
Unbounded,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does unbound represent here? It would be nice to add some document comments for these enum variants and how they are used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find Inclusive and Exclusive more intuitive than Closed and Open, but I think which pair you're familiar with depends on the domain(s) you're coming from.

Copy link
Contributor

@desaikd desaikd Jul 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think we should add some comments about what Unbounded represents here and when to use it? I see that you have used it to represent min/max but it would be helpful to add some doc comments for it so the usecase is clear.

})?;

isl_require!(occurs.upper() != &Limit::Closed(0usize) => "Occurs cannot be 0")?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

ion-schema/src/constraint.rs Show resolved Hide resolved
}

/// Defines a type that can be "ranged over" using a [`Range`].
pub trait RangeOverType: PartialEq + PartialOrd + Clone
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

Comment on lines 238 to 239
// TODO: consider changing to Range<NonZeroU64> since this is only used for the `precision` constraint
pub type U64Range = Range<u64>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not understand what its suggesting. But I think even if its only used by precision we should still keep it.

@desaikd
Copy link
Contributor

desaikd commented Jun 28, 2023

Overall this is a really nice improvement! Just added few minor comments and related questions.

ion-schema/src/constraint.rs Show resolved Hide resolved
ion-schema/src/constraint.rs Outdated Show resolved Hide resolved
Constraint::Precision(PrecisionConstraint::new(Range::NonNegativeInteger(
precision,
)))
pub fn precision(precision: U64Range) -> Constraint {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point in my readthrough, it's not obvious to me why we need both a U64Range and a UsizeRange.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not obvious, but here's why. All of the *_length constraints check a len() function that returns a usize. However, Decimal::precision() returns a u64. Rather than deal with a potentially lossy conversion if usize is not 64 bits, I'm just using both. I'll add a comment to that effect.

Now I'm curious to find out whether u64 is the the best return type for precision(). (Maybe one of the non-zero types, even.)

Comment on lines +17 to +23
fn as_usize(&self) -> Option<usize> {
match self.value() {
Value::Int(Int::I64(i)) => i.to_usize(),
Value::Int(Int::BigInt(i)) => i.to_usize(),
_ => None,
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-blocking concern: for these conversions, note that it will not be possible to tell the difference between "The Element was not an integer" and "The Element was an integer that couldn't fit in a usize."

Something I plan to eventually add to Element is a series of expect_TYPE() -> IonResult<T> methods so the caller can receive a more detailed error if things go wrong. It also makes it easy to propagate errors with ? where needed.

Also: the next release of ion-rs makes Int and UInt into opaque types with From and TryFrom implementations for the various integer types. Nothing to do for the moment though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-blocking concern: for these conversions, note that it will not be possible to tell the difference between "The Element was not an integer" and "The Element was an integer that couldn't fit in a usize."

I think that scenario is unlikely to be a problem—it will only occur if people are using numbers that are too large for u64 and usize as arguments to constraints in their ISL. We can revisit it once the expect_TYPE() functions are available. For now, I'll make the documentation a little more explicit.

ion-schema/src/ion_extension.rs Outdated Show resolved Hide resolved
ion-schema/src/ion_extension.rs Show resolved Hide resolved
/// The end of a [`Range`].
#[derive(Debug, PartialEq, Clone)]
pub enum Limit<T> {
Unbounded,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find Inclusive and Exclusive more intuitive than Closed and Open, but I think which pair you're familiar with depends on the domain(s) you're coming from.

ion-schema/src/isl/ranges.rs Outdated Show resolved Hide resolved
Comment on lines 9 to 12
//! using `u64`. It would be ideal to use [`NonZeroU64`][std::num::NonZeroU64] for this case, but
//! `NonZeroU64` is difficult to work with because it doesn't implement core ops (such as `Add`)
//! and as a result, it cannot even implement the checked ops traits (such as `CheckedAdd`) from
//! the `num_traits` crate. See https://github.com/rust-num/num-traits/issues/274.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "It would be ideal [...]" portion of this seems like it might be better suited to a developer comment, but I don't feel strongly about it.

/// Checks if two limits would result in an empty range of integers. Returns true if there are
/// no integer values between `start` and `end`, taking into consideration the exclusivity of
/// the limits.
pub fn _is_int_range_empty<T: CheckedAdd + One + PartialOrd>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Rust, names with a leading _ indicate that the item is never used but you wanted to give it a meaningful name anyway. Unless there's another convention of which I'm unaware, consider renaming this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops... too much python recently. I'll fix it.

ion-schema/src/types.rs Outdated Show resolved Hide resolved
/// The end of a [`Range`].
#[derive(Debug, PartialEq, Clone)]
pub enum Limit<T> {
Unbounded,
Copy link
Contributor

@desaikd desaikd Jul 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think we should add some comments about what Unbounded represents here and when to use it? I see that you have used it to represent min/max but it would be helpful to add some doc comments for it so the usecase is clear.


/// Creates a new range with inclusive endpoints.
/// [start] must be less than or equal to [end].
pub fn new_inclusive(start: T, end: T) -> IonSchemaResult<Self> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Suggestion) I should have probably brought this up in the first revision but should we name this new_*() methods without the new keyword. e.g. (new_inclusive() -> inclusive())? For constraints/types I have used similar convention (e.g. IslConstraint::byte_length(), IslType::named()). Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😅 I actually think that IslConstraint::byte_length() should cease to exist. Instead, to get a byte length variant of IslConstraint, you should use IslConstraint::ByteLength(ByteLength::new(...)) or ByteLength::new(...).into(). Same thing with IslType.

That being said, if you feel strongly about getting rid of the new_ prefix, I will do it. I just figured it would be a helpful way to signal that it is a constructor.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to follow this design pattern example(although this is for a builder pattern ):

impl FooBuilder {
    pub fn new(/* ... */) -> FooBuilder {
        // Set the minimally required fields of Foo.
        FooBuilder {
            ...
        }
    }

    pub fn name(mut self, bar: String) -> FooBuilder {
        // Set the name on the builder itself, and return the builder by value.
        ...
    }
}

I don't think this needs to block the PR though, but I do think whatever we decide to do it should be consistent overall. I have created this issue to make changes for it according to what we decide.

@popematt popematt merged commit 5c099af into amazon-ion:main Jul 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants