
DataLogger/DataMonitor callback for logging input, pred, and target #103

Open
ibro45 opened this issue Jan 30, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@ibro45
Collaborator

ibro45 commented Jan 30, 2024

🚀 Feature Request


🔈 Motivation

After reverting from LighterLogger to the PyTorch Lightning (PL) loggers, we lost the ability to log the input, pred, and target.

🛰 Alternatives

📎 Additional context

@ibro45 ibro45 added the enhancement New feature or request label Jan 30, 2024

dosubot bot commented Jan 30, 2024

🤖

Sure, it's possible to add a DataLogger/DataMonitor callback to the Lighter framework to log input, prediction, and target data. This can be achieved by extending the LighterBaseWriter class and implementing the writers property and write method.

In the on_predict_batch_end method, the outputs dictionary contains the predictions and the batch variable contains the input data. You can extend this method to also log the input and target data.
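As a rough, stdlib-only illustration of the shapes involved (the exact keys depend on your LighterSystem configuration; "input", "target", "pred", and "id" are assumptions here, and plain lists stand in for tensors):

```python
# Hypothetical contents of the objects passed to on_predict_batch_end.
batch = {
    "input": [[0.1, 0.2], [0.3, 0.4]],  # stand-ins for input tensors
    "target": [1, 0],                   # stand-ins for target tensors
}
outputs = {
    "id": [0, 1],                       # one ID per sample in the batch
    "pred": [0.9, 0.1],                 # stand-ins for predictions
}

# Pair each sample's ID with its input and target, as the callback would.
rows = list(zip(outputs["id"], batch["input"], batch["target"]))
```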

Here is an example of how you can do this:

from typing import Any, Callable, Dict

import torch
from pytorch_lightning import Trainer

# Import paths may differ depending on your Lighter version.
from lighter import LighterSystem
from lighter.callbacks import LighterBaseWriter


class DataLogger(LighterBaseWriter):
    @property
    def writers(self) -> Dict[str, Callable]:
        return {
            "csv": self.write_csv,
            # Add other writers here
        }

    def write(self, tensor: torch.Tensor, id: int) -> None:
        # Implement your writing strategy here
        pass

    def write_csv(self, tensor: torch.Tensor, id: int) -> None:
        # Implement your CSV writing strategy here
        pass

    def on_predict_batch_end(
        self, trainer: Trainer, pl_module: LighterSystem, outputs: Any, batch: Any, batch_idx: int, dataloader_idx: int = 0
    ) -> None:
        # Let the base writer handle the predictions first.
        super().on_predict_batch_end(trainer, pl_module, outputs, batch, batch_idx, dataloader_idx)

        # Log the input and target data alongside each prediction.
        input_data = batch["input"]
        target_data = batch["target"]
        for sample_id, input_tensor, target_tensor in zip(outputs["id"], input_data, target_data):
            # Log input and target data here (e.g. via self.write or a logger).
            pass

In this example, DataLogger is a new callback that logs the input, prediction, and target data. The writers property returns a dictionary of writer functions, and the write method defines how a tensor should be saved. The on_predict_batch_end method is extended to also log the input and target data.
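As one possible writing strategy, write_csv could append one row per sample: the ID followed by the sample's values. This is a stdlib-only sketch under the assumption that tensors have already been flattened to plain Python lists; the file layout is an assumption, not Lighter's actual format:

```python
import csv


def write_csv(path: str, sample_id: int, values: list) -> None:
    """Append one CSV row: the sample ID followed by its flattened values."""
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([sample_id, *values])
```

A per-sample call like `write_csv("preds.csv", 0, [0.9, 0.1])` would then accumulate one row per prediction across batches.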

You can find more details about the LighterBaseWriter class in the Lighter codebase.

As for the PyTorch Lightning logger, it doesn't directly handle the logging of input, prediction, and target data. Instead, it uses a callback system to handle these tasks. The on_predict_batch_end method is a callback method that is executed at the end of each prediction batch/step. If the IDs are not provided, it generates global unique IDs based on the prediction count. Finally, it writes the predictions using the specified writer.
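A minimal sketch of that fallback ID scheme, as a running counter across batches (the class and attribute names here are assumptions for illustration, not Lighter's actual implementation):

```python
class PredictionCounter:
    """Generates globally unique sample IDs from a running prediction count."""

    def __init__(self) -> None:
        self._count = 0

    def next_ids(self, batch_size: int) -> list:
        # Hand out the next `batch_size` consecutive IDs, then advance.
        ids = list(range(self._count, self._count + batch_size))
        self._count += batch_size
        return ids


counter = PredictionCounter()
first_batch_ids = counter.next_ids(4)   # IDs 0..3
second_batch_ids = counter.next_ids(3)  # IDs 4..6
```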

I hope this helps! If you have any other questions or need further clarification, feel free to ask.

