Setting normalize_scores default to False and adding some documentation about the parameter
NohTow committed Oct 15, 2024
1 parent ddaf8f8 commit 7a0671f
Show file tree
Hide file tree
Showing 2 changed files with 7 additions and 1 deletion.
6 changes: 6 additions & 0 deletions docs/documentation/training.md
@@ -176,6 +176,12 @@ trainer.train()

Refer to this [documentation](https://sbert.net/docs/sentence_transformer/training/distributed.html) for more information.

Note that the Distillation loss also supports min-max normalization of the output scores, which has been shown to improve results when the teacher scores are also normalized, as in [JaColBERTv2.5](https://arxiv.org/pdf/2407.20750), although the gains are not guaranteed, as shown in [Jina-ColBERT-v2](https://arxiv.org/abs/2408.16672).
To normalize the output scores, simply set the `normalize_scores` parameter when creating the loss object (you still have to normalize the scores in your dataset):
```python
train_loss = losses.Distillation(model=model, normalize_scores=True)
```
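
As a complement, here is a minimal sketch of min-max normalizing the teacher scores in your dataset before training; the example data and column names are illustrative assumptions, not a schema required by PyLate:
```python
def min_max_normalize(scores: list[float]) -> list[float]:
    """Min-max normalize a list of teacher scores to the [0, 1] range."""
    minimum, maximum = min(scores), max(scores)
    if maximum == minimum:
        # Degenerate case: all scores are equal, avoid division by zero.
        return [0.0 for _ in scores]
    return [(score - minimum) / (maximum - minimum) for score in scores]


# Hypothetical example: normalize the teacher scores of each training example
# before building the knowledge distillation dataset.
train_examples = [
    {
        "query": "what is late interaction retrieval?",
        "documents": ["ColBERT computes token-level similarities.", "Paris is in France."],
        "scores": [21.7, 3.4],  # raw teacher scores
    },
]
for example in train_examples:
    example["scores"] = min_max_normalize(example["scores"])
```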

## ColBERT parameters
All the parameters of the ColBERT modeling can be found [here](https://lightonai.github.io/pylate/api/models/ColBERT/#parameters). Important parameters to consider are:

2 changes: 1 addition & 1 deletion pylate/losses/distillation.py
@@ -54,7 +54,7 @@ def __init__(
model: ColBERT,
score_metric: Callable = colbert_kd_scores,
size_average: bool = True,
- normalize_scores: bool = True,
+ normalize_scores: bool = False,
) -> None:
super(Distillation, self).__init__()
self.score_metric = score_metric
