InterDim is a streamlined package for interactive exploration of latent data dimensions

MShinkle/interdim

InterDim

Interactive Dimensionality Reduction, Clustering, and Visualization

InterDim is a Python package for interactive exploration of latent data dimensions. It wraps existing tools for dimensionality reduction, clustering, and data visualization in a streamlined interface, allowing for quick and intuitive analysis of high-dimensional data.

Features

  • Easy-to-use pipeline for dimensionality reduction, clustering, and visualization
  • Interactive 3D scatter plots for exploring reduced data
  • Support for various dimensionality reduction techniques (PCA, t-SNE, UMAP, etc.)
  • Multiple clustering algorithms (K-means, DBSCAN, etc.)
  • Customizable point visualizations for detailed data exploration

Installation

You can install from PyPI via pip (recommended):

pip install interdim

Or from source:

git clone https://github.com/MShinkle/interdim.git
cd interdim
pip install .
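
To confirm that the install succeeded, you can ask Python for the installed version. This is a minimal sketch using only the standard library; the helper name `installed_version` is made up for this example and is not part of InterDim.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# "interdim" is the distribution name used in the pip command above.
version = installed_version("interdim")
print("interdim version:", version if version else "not installed")
```

If this prints "not installed", check that pip ran in the same Python environment you are using to run the script.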

Quick Start

Here's a basic example using the Iris dataset:

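from sklearn.datasets import load_iris
from interdim import InterDimAnalysis

# Load the Iris measurements and their species labels
iris = load_iris()
analysis = InterDimAnalysis(iris.data, true_labels=iris.target)
# Reduce the data to 3 dimensions with t-SNE
analysis.reduce(method='tsne', n_components=3)
# Group the samples into 3 clusters with K-means
analysis.cluster(method='kmeans', n_clusters=3)
# Show the interactive 3D scatter plot with a bar chart for the hovered point
analysis.show(n_components=3, point_visualization='bar')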

[Figure: 3D scatter plot with interactive bar charts]

This reduces the Iris dataset to 3 dimensions using t-SNE, clusters the data using K-means, and displays an interactive 3D scatter plot that shows a bar chart for each data point as you hover over it.
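
To make the clustering step less of a black box, here is a minimal pure-Python sketch of K-means (Lloyd's algorithm) applied to a handful of 3D points standing in for reduced data. It is illustrative only: InterDim wraps existing tools for this step rather than using this code, and the `kmeans` function, its parameters, and the toy points are assumptions made for the example.

```python
import math
import random


def kmeans(points, n_clusters, n_iter=20, seed=0):
    """Plain Lloyd's-algorithm K-means on a list of equal-length tuples.

    Illustrative only; not InterDim's implementation.
    Returns (labels, centroids).
    """
    rng = random.Random(seed)
    # Start from n_clusters distinct points chosen at random.
    centroids = [tuple(p) for p in rng.sample(points, n_clusters)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(n_clusters), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster went empty
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return labels, centroids


# Two well-separated toy blobs in 3D (hypothetical data).
points = [
    (0.0, 0.1, 0.0), (0.2, 0.0, 0.1), (0.1, 0.2, 0.1),
    (5.0, 5.1, 5.0), (5.2, 5.0, 4.9), (4.9, 5.2, 5.1),
]
labels, centroids = kmeans(points, n_clusters=2)
print(labels)
```

The first three points should share one label and the last three the other. Which blob gets label 0 depends on the random initialization, so only the grouping is meaningful, not the label values themselves.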

This is just a small example of what InterDim can do. You can use it to explore many kinds of data, including high-dimensional data such as language model embeddings.

Demo Notebooks

For more in-depth examples and use cases, check out our demo notebooks:

  1. Iris Species Analysis: Basic usage with the classic Iris dataset.

  2. DNN Latent Space Exploration: Visualizing deep neural network activations.

  3. LLM Token Analysis: Exploring language model token embeddings and layer activations.

Documentation

For detailed API documentation and advanced usage, visit our documentation on GitHub Pages.

Contributing

We welcome discussion and contributions!

License

InterDim is released under the BSD 3-Clause License. See the LICENSE file for details.

Contact

For questions and feedback, please open an issue on GitHub.
