Expose Distance Information in KMedoids #168

Open
wants to merge 2 commits into
base: main

skfaysal

Description

This pull request exposes the distance from each data point to its assigned medoid in the KMedoids implementation. These distances were already computed internally during clustering but were not available to users. The new distances_ attribute gives access to them without an additional pairwise distance calculation, which can be computationally expensive.

Changes Made

Addition of distances_ attribute:

A new attribute, distances_, stores the distance from each data point to its assigned medoid.

Modification of fit method:

fit now computes the distances with the existing transform method and stores them in distances_.
inertia_ is now computed directly from these distances, avoiding a redundant pairwise distance calculation.
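The idea behind the fit change can be sketched with plain NumPy. This is an illustration only, not the diff in this PR: the helper assign_to_medoids is hypothetical, Euclidean distance is assumed, and inertia is assumed to be the sum of each point's distance to its nearest medoid.

```python
import numpy as np

def assign_to_medoids(X, medoid_indices):
    """Hypothetical helper: labels, distance to the assigned medoid, and inertia."""
    medoids = X[medoid_indices]
    # Distance from every point to every medoid, shape (n_samples, n_clusters).
    # This matrix plays the role of what a transform-style call would return.
    all_distances = np.linalg.norm(X[:, None, :] - medoids[None, :, :], axis=2)
    labels = all_distances.argmin(axis=1)
    # Keep only the distance to the assigned (nearest) medoid, shape (n_samples,).
    distances = all_distances[np.arange(X.shape[0]), labels]
    # Inertia reuses the stored distances instead of recomputing them.
    inertia = distances.sum()
    return labels, distances, inertia

X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 13.0]])
labels, distances, inertia = assign_to_medoids(X, medoid_indices=[0, 2])
```

The distance matrix is computed once; both the per-point distances and the inertia are derived from it.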

Motivation

The goal is to give users direct access to the distance between each data point and its medoid. This is useful for follow-up analyses, such as identifying the data points closest to each medoid, without paying the cost of recomputing pairwise distances.

Example Usage

Users can now access the distances using the distances_ attribute after fitting the model:

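from sklearn_extra.cluster import KMedoids  # assumes the scikit-learn-extra package

kmedoids_model = KMedoids(n_clusters=3)
kmedoids_model.fit(data)  # data: array-like of shape (n_samples, n_features)
distances_to_medoids = kmedoids_model.distances_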

Because the distances are stored at fit time, downstream analyses can reuse them instead of recomputing pairwise distances.
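As one possible use, here is a sketch of finding the points closest to a medoid. It assumes distances_ is a one-dimensional array with one entry per sample, as described above; the arrays below are stand-in values rather than output from a fitted model, and closest_points is a hypothetical helper.

```python
import numpy as np

# Stand-in values; in practice these would be model.labels_ and model.distances_.
labels = np.array([0, 0, 1, 1, 1])
distances = np.array([0.0, 1.0, 0.0, 3.0, 2.0])

def closest_points(labels, distances, cluster, k):
    """Return the indices of the k samples in `cluster` nearest to its medoid."""
    members = np.flatnonzero(labels == cluster)
    # Sort the cluster's members by their stored distance to the medoid.
    order = np.argsort(distances[members])
    return members[order[:k]]

nearest_in_cluster_1 = closest_points(labels, distances, cluster=1, k=2)
```

No pairwise distance computation is needed here, only an argsort over the stored values.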

@TimotheeMathieu
Contributor

Thanks, this looks good. The tests are failing for now, but that should be fixed by PR #167. Once PR #167 is merged, we can merge here and check that everything is OK.

Member

@adrinjalali left a comment

This would significantly increase the size of the estimator. Instead, we should add a constructor argument, False by default, that users can set to enable this attribute.
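A minimal sketch of that opt-in pattern, using a toy class rather than the real KMedoids. The parameter name store_distances is hypothetical, and the medoid indices are passed in directly to keep the example short.

```python
import numpy as np

class ToyMedoidAssigner:
    """Toy stand-in, not the real KMedoids; `store_distances` is a hypothetical name."""

    def __init__(self, store_distances=False):
        self.store_distances = store_distances

    def fit(self, X, medoid_indices):
        X = np.asarray(X, dtype=float)
        medoids = X[medoid_indices]
        # Euclidean distance from every point to every medoid.
        d = np.linalg.norm(X[:, None, :] - medoids[None, :, :], axis=2)
        self.labels_ = d.argmin(axis=1)
        nearest = d.min(axis=1)
        self.inertia_ = float(nearest.sum())
        # Only keep the per-sample array when the user asked for it.
        if self.store_distances:
            self.distances_ = nearest
        return self

X = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 13.0]]
default_model = ToyMedoidAssigner().fit(X, [0, 2])
opt_in_model = ToyMedoidAssigner(store_distances=True).fit(X, [0, 2])
```

With the default, the fitted object carries no per-sample array beyond labels_, so its size is unchanged; inertia_ is the same either way.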

3 participants