Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add assertions to support MetricWriter #129

Open
wants to merge 2 commits into
base: ms_metric-writer-feature-branch
Choose a base branch
from

Conversation

msto
Copy link
Contributor

@msto msto commented Jun 6, 2024

I split the remaining code in #107 into two PRs to facilitate review.

This PR introduces several of the assertion methods used when constructing the MetricWriter. (See #107 for how they are used in practice).

@msto msto mentioned this pull request Jun 6, 2024
6 tasks
@msto msto force-pushed the ms_metric-writer-assertions branch from 35593a1 to 30447a9 Compare June 6, 2024 13:34
@msto msto marked this pull request as ready for review June 6, 2024 13:44
@msto msto requested review from nh13 and tfenne as code owners June 6, 2024 13:44
Copy link
Member

@nh13 nh13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor comments I'd like to see addressed. Can you also add tests (pass and fail) for:

  • _assert_fieldnames_are_metric_attributes
  • _assert_file_header_matches_metric

fgpyo/util/metric.py Show resolved Hide resolved
if dataclasses.is_dataclass(metric_class):
return [f.name for f in dataclasses.fields(metric_class)]
elif attr.has(metric_class):
return [f.name for f in attr.fields(metric_class)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use metric.header() method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Short answer: this is an artifact from the initial proof-of-concept for this class being implemented in https://github.com/msto/dataclass_io/ (i.e. only for dataclasses, not considering Metric)

Long answer: metric.header() and inspect.get_fields() have a lot of type: ignores , and I'd rather use an implementation that's type-safe. (I have a mental TODO to clean up the volume of type: ignores in the repo, and I can leave a TODO that this function should be deprecated/merged into inspect once that happens.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use it and fix it?

assert False, "Unreachable"


def _assert_file_header_matches_metric(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just saying out loud this will not work if the path is a stream, like standard input, since it consumes the header.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add an assertion that the path is not /dev/stdin. Are there other streams that are likely to be represented as a Path?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just streams I think

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bump

fgpyo/util/metric.py Outdated Show resolved Hide resolved
fgpyo/util/metric.py Outdated Show resolved Hide resolved
fgpyo/util/metric.py Outdated Show resolved Hide resolved
fgpyo/util/metric.py Show resolved Hide resolved
fgpyo/util/tests/test_metric.py Show resolved Hide resolved
@msto msto force-pushed the ms_metric-writer-feature-branch branch from 032388d to 78f2246 Compare June 6, 2024 14:46
Copy link
Member

@tfenne tfenne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stopping since you're actively working on it. Once you've addressed @nh13's comments I'll take another pass.

fgpyo/util/metric.py Show resolved Hide resolved
fgpyo/util/metric.py Outdated Show resolved Hide resolved
@msto msto force-pushed the ms_metric-writer-assertions branch from 30447a9 to 0fd747b Compare June 6, 2024 14:53
fgpyo/util/metric.py Show resolved Hide resolved
fgpyo/util/metric.py Outdated Show resolved Hide resolved
TypeError: If the given class is not a Metric.
"""
if not _is_metric_class(cls):
raise TypeError(f"Not a dataclass or attr decorated Metric: {cls}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we feel about _assert functions raising errors other than AssertionError? I would probably be surprised by this.

Copy link
Contributor Author

@msto msto Jun 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think assert should be used outside of unit tests, and I prefer to raise more specific exceptions.

The Google style guide recommends/requires this as well:

Do not use assert statements in place of conditionals or validating preconditions. They must not be critical to the application logic. A litmus test would be that the assert could be removed without breaking the code. assert conditionals are not guaranteed to be evaluated. For pytest based tests, assert is okay and expected to verify expectations.

The primary reason to avoid the use of assert statements is that they are intended for debugging and can be disabled.

Additionally, using assert makes it easy to fail to cover all branches of a program:

# Covered by any unit test that calls the parent function
assert is_ok(foo), "Foo was not ok"

if not is_ok(foo):
    # Not covered unless the condition where `is_ok(foo)` is False is tested explicitly
    raise ValueError("Foo was not ok")  

I'm happy to rename the functions to something other than assert_* (I've used validate_* in the past), I was just trying to remain consistent with naming convention in the project.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be a good case for having a custom exception:

class ValueAssertion(AssertionError, ValueError):
    pass

Then you would catch the exception with either AssertionError or ValueError.

I do like having the explicit branch with if since code coverage analysis will identify it as a covered/uncovered branch depending on actual test coverage. With single-line statements (e.g. assert), you don't get any branch information so it is easy to forget to test the exception paths.

msto added 2 commits June 6, 2024 12:00
doc: update doc

refactor: try-except read header

doc: rm raises block from get_fieldnames

refactor: set

refactor: avoid dup assert
@msto msto force-pushed the ms_metric-writer-assertions branch from 0fd747b to beee947 Compare June 6, 2024 16:00
Copy link
Member

@nh13 nh13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more item to address, then lgtm

assert False, "Unreachable"


def _assert_file_header_matches_metric(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bump

@msto msto self-assigned this Sep 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants