Fix core utils #1967

Merged
merged 8 commits into main from fix_core_utils
Oct 6, 2024

Conversation

ternaus
Collaborator

@ternaus ternaus commented Oct 5, 2024

Fixes #1958

Summary by Sourcery

Fix numerical label handling in the LabelEncoder class, refactor label processing logic, enhance utility functions for better type handling, and update tests to cover new scenarios. Update pre-commit configuration to the latest Ruff version.

Bug Fixes:

  • Fix the handling of numerical labels in the LabelEncoder class so that numerical data passes through fit and transform unchanged instead of being encoded.

Enhancements:

  • Refactor the label processing logic by introducing helper methods for validation, encoding, and decoding of label fields.
  • Improve the to_tuple function to handle scalar and sequence inputs and to apply an optional lower bound or bias, with added overloads for type safety.
  • Enhance the create_symmetric_range function with overloads for better type handling of integer and float inputs.
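
The overload pattern mentioned above can be sketched as follows. This is a minimal illustration, not the library's code: the function name `symmetric_range_sketch` is hypothetical, the alias definitions are assumed from the type names in this PR, and the behavior (a scalar `x` becomes `(-|x|, |x|)`, a pair is returned as-is) is an assumption about what a symmetric range means here.

```python
from typing import Tuple, Union, overload

# Hypothetical aliases that mirror the type names mentioned in this PR;
# the real definitions in albumentations/core/types.py may differ.
ScaleIntType = Union[int, Tuple[int, int]]
ScaleFloatType = Union[float, Tuple[float, float]]


@overload
def symmetric_range_sketch(value: ScaleIntType) -> Tuple[int, int]: ...
@overload
def symmetric_range_sketch(value: ScaleFloatType) -> Tuple[float, float]: ...


def symmetric_range_sketch(value):
    """Return (-|v|, |v|) for a scalar, or the given pair unchanged."""
    if isinstance(value, (int, float)):
        bound = abs(value)
        return (-bound, bound)
    low, high = value  # assumed to be a sequence of exactly two numbers
    return (low, high)
```

With the overloads in place, a type checker can infer an integer pair for integer input and a float pair for float input, while the runtime implementation stays a single function.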

Tests:

  • Add tests for the LabelEncoder to ensure correct handling of numpy arrays and 2D arrays, verifying the shape and content of encoded and decoded labels.
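
The kind of check these tests describe (shape and content preserved across encode and decode) might look like the sketch below. It does not use the library's LabelEncoder; `np.unique` with `return_inverse=True` is swapped in as a stand-in encoder, and the helper names `encode`, `decode`, and `check_round_trip` are hypothetical.

```python
import numpy as np


def encode(labels):
    """Stand-in encoder: map each label to its index among the sorted unique labels."""
    arr = np.asarray(labels)
    # Flatten first so the inverse index is 1-D, then restore the input shape.
    classes, inverse = np.unique(arr.ravel(), return_inverse=True)
    return classes, inverse.reshape(arr.shape)


def decode(classes, encoded):
    """Look the original labels back up by index."""
    return classes[encoded]


def check_round_trip(labels):
    """Assert that encoding keeps the shape and decoding restores the content."""
    arr = np.asarray(labels)
    classes, encoded = encode(arr)
    assert encoded.shape == arr.shape
    assert np.issubdtype(encoded.dtype, np.integer)
    assert np.array_equal(decode(classes, encoded), arr)


check_round_trip(np.array(["cat", "dog", "cat"]))    # 1D numpy array
check_round_trip(np.array([["a", "b"], ["b", "c"]]))  # 2D array
```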

Chores:

  • Update the pre-commit configuration to use Ruff version v0.6.9.

Contributor

sourcery-ai bot commented Oct 5, 2024

Reviewer's Guide by Sourcery

This pull request implements several improvements and bug fixes to the core utils of the Albumentations library, focusing on enhancing the LabelEncoder class and the to_tuple function. The changes aim to improve type handling, add support for numerical labels, and refactor code for better maintainability.

Updated class diagram for LabelEncoder

classDiagram
    class LabelEncoder {
        - classes_: dict[str | int | float, int]
        - inverse_classes_: dict[int, str | int | float]
        - num_classes: int
        - is_numerical: bool
        + fit(y: Sequence[Any] | np.ndarray) LabelEncoder
        + transform(y: Sequence[Any] | np.ndarray) np.ndarray
        + fit_transform(y: Sequence[Any] | np.ndarray) np.ndarray
        + inverse_transform(y: Sequence[Any] | np.ndarray) np.ndarray
    }
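
A minimal sketch of an encoder with the attributes and methods in this diagram is shown below. It is an illustrative stand-in named `LabelEncoderSketch`, not the implementation merged in this PR. It assumes that "numerical" means an integer or floating-point NumPy dtype and that numerical labels are returned unchanged.

```python
from typing import Any, Dict

import numpy as np


class LabelEncoderSketch:
    """Illustrative stand-in for the diagram above, not the library class."""

    def __init__(self) -> None:
        self.classes_: Dict[Any, int] = {}
        self.inverse_classes_: Dict[int, Any] = {}
        self.num_classes = 0
        self.is_numerical = True

    def fit(self, y):
        arr = np.asarray(y)
        # Assumption: labels count as numerical when the array dtype is numeric.
        self.is_numerical = bool(np.issubdtype(arr.dtype, np.number))
        if self.is_numerical:
            return self  # nothing to learn, labels are used as they are
        unique = sorted(set(arr.ravel().tolist()))
        self.classes_ = {label: idx for idx, label in enumerate(unique)}
        self.inverse_classes_ = {idx: label for label, idx in self.classes_.items()}
        self.num_classes = len(unique)
        return self

    def transform(self, y):
        arr = np.asarray(y)
        if self.is_numerical:
            return arr  # pass numerical labels through without encoding
        flat = [self.classes_[label] for label in arr.ravel().tolist()]
        return np.array(flat, dtype=np.int64).reshape(arr.shape)

    def fit_transform(self, y):
        return self.fit(y).transform(y)

    def inverse_transform(self, y):
        arr = np.asarray(y)
        if self.is_numerical:
            return arr
        flat = [self.inverse_classes_[int(code)] for code in arr.ravel()]
        return np.array(flat).reshape(arr.shape)
```

In this sketch, `LabelEncoderSketch().fit_transform([3, 7, 3])` returns the same values it was given, while string labels are mapped to integer codes and restored by `inverse_transform`.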

Updated class diagram for DataProcessor

classDiagram
    class DataProcessor {
        - data_fields: list[str]
        - label_encoders: dict[str, dict[str, LabelEncoder]]
        - is_sequence_input: dict[str, bool]
        - is_numerical_label: dict[str, dict[str, bool]]
        + add_label_fields_to_data(data: dict[str, Any]) dict[str, Any]
        + remove_label_fields_from_data(data: dict[str, Any]) dict[str, Any]
        + _process_label_fields(data: dict[str, Any], data_name: str) np.ndarray
        + _validate_label_field_length(data: dict[str, Any], data_name: str, label_field: str) void
        + _encode_label_field(data: dict[str, Any], data_name: str, label_field: str) np.ndarray
        + _handle_empty_data_array(data: dict[str, Any]) void
        + _remove_label_fields(data: dict[str, Any], data_name: str) void
        + _decode_label_field(data_name: str, label_field: str, encoded_labels: np.ndarray) np.ndarray
    }
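
One way to read the helper methods in this diagram is as a pack and unpack cycle: validate that a label field has one entry per data row, attach the labels to the data array, and later split them off again. The sketch below illustrates that idea only. The function names are hypothetical, it assumes labels are already numerical and the data is a non-empty 2D array, and it skips the empty-array case that `_handle_empty_data_array` covers.

```python
import numpy as np


def validate_label_field_length(data, data_name, label_field):
    """Raise if the label field does not have one entry per data row."""
    if len(data[data_name]) != len(data[label_field]):
        raise ValueError(
            f"Length of {label_field!r} does not match length of {data_name!r}"
        )


def add_label_field(data, data_name, label_field):
    """Append the labels as an extra column of data[data_name]."""
    validate_label_field_length(data, data_name, label_field)
    rows = np.asarray(data[data_name], dtype=np.float32)
    labels = np.asarray(data[label_field], dtype=np.float32).reshape(-1, 1)
    packed = {key: value for key, value in data.items() if key != label_field}
    packed[data_name] = np.hstack([rows, labels])
    return packed


def remove_label_field(data, data_name, label_field):
    """Split the last column back out into its own label field."""
    rows = np.asarray(data[data_name])
    unpacked = dict(data)
    unpacked[data_name] = rows[:, :-1]
    unpacked[label_field] = rows[:, -1]
    return unpacked
```

Non-numerical labels would need an encoding step before packing and a decoding step after unpacking, which is the role the diagram assigns to `_encode_label_field` and `_decode_label_field`.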

Updated class diagram for to_tuple function

classDiagram
    class to_tuple {
        + validate_args(low: ScaleType | None, bias: ScalarType | None) void
        + process_sequence(param: Sequence[ScalarType]) tuple[ScalarType, ScalarType]
        + process_scalar(param: ScalarType, low: ScalarType | None) tuple[ScalarType, ScalarType]
        + apply_bias(min_val: ScalarType, max_val: ScalarType, bias: ScalarType) tuple[ScalarType, ScalarType]
        + ensure_int_output(min_val: ScalarType, max_val: ScalarType, param: ScalarType) tuple[int, int] | tuple[float, float]
        + to_tuple(param: ScaleType, low: ScaleType | None = None, bias: ScalarType | None = None) tuple[int, int] | tuple[float, float]
    }
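
The steps named in this diagram (validate the arguments, handle a sequence or a scalar, apply a bias) can be sketched in a single function. This is a hedged illustration called `to_tuple_sketch`, not the merged implementation. It assumes that `low` and `bias` are mutually exclusive, that a scalar without `low` yields a symmetric range, and that a scalar with `low` yields the ordered pair of the two values. It omits the integer/float output coercion step.

```python
from typing import Optional, Sequence, Tuple, Union

Scalar = Union[int, float]


def to_tuple_sketch(
    param: Union[Scalar, Sequence[Scalar]],
    low: Optional[Scalar] = None,
    bias: Optional[Scalar] = None,
) -> Tuple[Scalar, Scalar]:
    """Convert a scalar or a two-element sequence into a (min, max) tuple."""
    # Validate: assumed rule that low and bias cannot be combined.
    if low is not None and bias is not None:
        raise ValueError("Arguments low and bias cannot be used together")

    if isinstance(param, (int, float)):
        if low is None:
            bound = abs(param)
            min_val, max_val = -bound, bound  # symmetric range around zero
        else:
            min_val, max_val = (low, param) if low < param else (param, low)
    else:
        if len(param) != 2:
            raise ValueError("Expected a scalar or a sequence of two values")
        min_val, max_val = param[0], param[1]

    # Shift both ends by the bias when one is given.
    if bias is not None:
        min_val, max_val = min_val + bias, max_val + bias
    return (min_val, max_val)
```

Under these assumptions, `to_tuple_sketch(5)` gives a range symmetric around zero, `to_tuple_sketch(5, low=0)` gives a range starting at the lower bound, and `to_tuple_sketch((1, 3), bias=2)` shifts both ends by the bias.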

File-Level Changes

Change: Enhance LabelEncoder to handle numerical labels
Details:
  • Add is_numerical flag to determine if labels are numerical
  • Modify fit method to handle numerical labels
  • Update transform and inverse_transform methods to process numerical labels
  • Add support for numpy array inputs
Files:
  • albumentations/core/utils.py
  • tests/test_core_utils.py

Change: Refactor and improve to_tuple function
Details:
  • Add type hints and overloads for better type checking
  • Improve error handling and input validation
  • Split functionality into smaller, more focused functions
  • Enhance documentation with examples and more detailed explanations
Files:
  • albumentations/core/utils.py
  • albumentations/core/pydantic.py

Change: Update type definitions and imports
Details:
  • Add new type definitions for ScaleFloatType and ScaleIntType
  • Update imports to include new types and functions
  • Add overloads for create_symmetric_range function
Files:
  • albumentations/core/types.py
  • albumentations/core/pydantic.py

Change: Enhance test coverage
Details:
  • Add new test cases for LabelEncoder with numpy arrays
  • Include tests for 2D array inputs
  • Update existing tests to cover new functionality
Files:
  • tests/test_core_utils.py

Change: Update dependencies
Details:
  • Update ruff-pre-commit from v0.6.8 to v0.6.9
Files:
  • .pre-commit-config.yaml


Contributor

@sourcery-ai sourcery-ai bot left a comment


Hey @ternaus - I've reviewed your changes and they look great!

Here's what I looked at during the review
  • 🟢 General issues: all looks good
  • 🟢 Security: all looks good
  • 🟡 Testing: 4 issues found
  • 🟢 Complexity: all looks good
  • 🟢 Documentation: all looks good


Resolved review comments:
  • tests/test_core_utils.py (4)
  • albumentations/core/utils.py (2)
@ternaus ternaus merged commit b8648ff into main Oct 6, 2024
17 checks passed
@ternaus ternaus deleted the fix_core_utils branch October 6, 2024 21:08
Development

Successfully merging this pull request may close these issues.

TypeError with v1.4.15 and newer for masks augmentation
1 participant