The hierarchical model has a special data-preparation step that splits the input into a fixed number of chunks of a fixed length each. The maximum sequence length is the product of these two numbers. However, only the chunk length is constrained, by the base encoder (say ~512); the number of chunks isn't baked into the network because attention averages over them. So it isn't strictly required to process the data with the same chunking parameters at inference as during training, and for that reason we don't even put those parameters in the model config. Without them in the config, it's hard to even suggest good numbers, but maybe we want to stay flexible enough to allow them to change? Eh, maybe not.
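A minimal sketch of the chunking step described above, assuming hypothetical parameters `chunk_len` (bounded by the base encoder, e.g. ~512) and `num_chunks` (a data-preparation choice rather than a model parameter). The names are illustrative, not taken from the actual codebase:

```python
def chunk_tokens(tokens, chunk_len=512, num_chunks=4, pad_id=0):
    """Split a token list into `num_chunks` chunks of `chunk_len` tokens each.

    The maximum sequence length handled is chunk_len * num_chunks;
    longer inputs are truncated and shorter inputs are padded.
    """
    max_len = chunk_len * num_chunks
    tokens = tokens[:max_len]
    # Pad to a whole number of chunks so every chunk has length chunk_len.
    tokens = tokens + [pad_id] * (max_len - len(tokens))
    return [tokens[i * chunk_len:(i + 1) * chunk_len] for i in range(num_chunks)]

chunks = chunk_tokens(list(range(1000)), chunk_len=512, num_chunks=4)
# 4 chunks of 512 tokens; tokens 0-999 fill the first two, the rest is padding.
```

Since attention averages over chunk representations, `num_chunks` could in principle differ between training and inference, which is the flexibility in question.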