
A script for inference using T5 in Java #17432

For the next step the input ids should be your start-of-sequence token (which you can get by loading the sentencepiece protobuf and querying it for the start-of-sequence id), followed by whatever token you sampled from the logits. The encoder_hidden_states should be the same; those won't change. You'll want the decoder variant that accepts past_key_values, and supply those from the previous decoder outputs.
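
As a rough sketch of what one of those follow-up decoder calls could look like with the ONNX Runtime Java API — the names `input_ids`, `encoder_hidden_states`, the keys inside `past`, and logits being the first output are assumptions about your particular export, so check `session.getInputNames()` / `getOutputNames()` against your model:

```java
import java.util.HashMap;
import java.util.Map;

import ai.onnxruntime.OnnxTensor;
import ai.onnxruntime.OrtException;
import ai.onnxruntime.OrtSession;

final class DecoderStep {
    /**
     * Runs one decoder call over the ids decoded so far and greedily picks the next token.
     * inputIds is [1, n]; encoderHiddenStates is reused unchanged every step; past holds
     * the previous step's present.* outputs keyed by the matching past_key_values.* input names.
     */
    static long decodeStep(OrtSession decoder,
                           OnnxTensor inputIds,
                           OnnxTensor encoderHiddenStates,
                           Map<String, OnnxTensor> past) throws OrtException {
        Map<String, OnnxTensor> inputs = new HashMap<>();
        inputs.put("input_ids", inputIds);
        inputs.put("encoder_hidden_states", encoderHiddenStates);
        if (past != null) {
            inputs.putAll(past);
        }
        try (OrtSession.Result result = decoder.run(inputs)) {
            // Assumes logits is the first output, shaped [1, n, vocabSize];
            // take the distribution for the last position and argmax it.
            float[][][] logits = (float[][][]) result.get(0).getValue();
            float[] last = logits[0][logits[0].length - 1];
            int best = 0;
            for (int i = 1; i < last.length; i++) {
                if (last[i] > last[best]) {
                    best = i;
                }
            }
            return best;
        }
    }
}
```

Bear in mind the `present.*` outputs are owned by the `Result`, so in a real loop you'd keep the previous step's `Result` open (or copy those tensors out) until the next step has consumed them.
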

It's a little messy as you don't want to reallocate the buffer every time. It might be better to allocate a single direct LongBuffer (with ByteBuffer.allocateDirect(seqLength*8).order(ByteOrder.nativeOrder()).asLongBuffer()), then set the position to 0 and increment the limit each time you wrap it into an OnnxTensor.
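
Something like this (untested) sketch, which relies on the `OnnxTensor.createTensor(env, LongBuffer, shape)` overload using a direct buffer's memory without copying it — the `TokenBuffer` class and its method names are just illustrative:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.LongBuffer;

import ai.onnxruntime.OnnxTensor;
import ai.onnxruntime.OrtEnvironment;
import ai.onnxruntime.OrtException;

/** Allocates the token id buffer once, then grows the visible window each step. */
final class TokenBuffer {
    private final LongBuffer ids;
    private int length = 0;

    TokenBuffer(int maxSeqLength) {
        // 8 bytes per long, native byte order so ORT can read the memory directly.
        this.ids = ByteBuffer.allocateDirect(maxSeqLength * 8)
                             .order(ByteOrder.nativeOrder())
                             .asLongBuffer();
    }

    void append(long tokenId) {
        // Restore the full window before writing past the previously set limit.
        ids.limit(ids.capacity());
        ids.put(length, tokenId);
        length++;
    }

    /** Exposes only the filled prefix as a [1, length] tensor, without copying. */
    OnnxTensor asTensor(OrtEnvironment env) throws OrtException {
        ids.position(0);
        ids.limit(length);
        return OnnxTensor.createTensor(env, ids, new long[] {1, length});
    }
}
```

Each asTensor call just re-exposes a wider window over the same memory, so there's no per-step allocation for the id buffer; you should still close each tensor once its run has finished.
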

Answer selected by AayushSameerShah