-
The C# tutorial is very helpful, but it loses me at the post-processing step. The underlying LLM I'm using is Alpaca-LoRA, and its output is an array of logit values, so the algorithm in the tutorial doesn't work. I need to replicate the generate function here: https://github.com/tloen/alpaca-lora/blob/630d1146c8b5a968f5bf4f02f50f153a0c9d449d/generate.py or, for LLaMA, here: https://github.com/facebookresearch/llama/blob/main/llama/generation.py Does ONNX Runtime provide support for converting the logit values to token IDs I can pass to my decoder?
-
This kind of post-processing can be done by modifying the original model and exporting the modified model to ONNX again. For example, if your model currently returns logits, you can wrap it so that it applies an argmax over the vocabulary dimension to get the actual index with the max probability, export that wrapped model, and finally call the exported model from your C# code -- its output will already be token IDs.
-
Not sure which part is the question. If you want to edit the exported ONNX, you can try creating your own ONNX node and inserting that node into the ONNX model (e.g., via `onnx_model.graph.node.append(new_node)` and `onnx_model.graph.output.append(new_node.output[0])`). If you want to figure out the word corresponding to an index (e.g., 124 -> `hello`), you need to check the original dictionary used to train the model -- that dictionary is not captured in the ONNX file.