Cross-platform Near-zero Overhead Grammar-guided Generation for LLMs
- G1: Universal: support any common tokenizer and any common grammar
- G2: Efficient: grammar guidance should add near-zero overhead to generation
- G3: Cross-platform: pure C++ implementation, portable to every platform, so an end-to-end (E2E) pipeline can be built on each one
- G4: Easy to understand and maintain
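
To make the goals concrete, here is a minimal sketch of what grammar-guided generation does at each decoding step: mask out every token that would break the grammar, then choose among the rest. This is an illustration only, not this project's implementation. The toy bracket grammar and the names `IsValidPrefix`, `ComputeTokenMask`, and `PickToken` are all hypothetical. The sketch checks every vocabulary entry on every step, which is the kind of per-step cost that G2 is meant to avoid.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a grammar (assumption): balanced square brackets,
// nested at most max_depth deep. Returns true if `text` is a prefix
// that can still be extended to a valid string.
bool IsValidPrefix(const std::string& text, int max_depth = 8) {
  int depth = 0;
  for (char c : text) {
    if (c == '[') {
      if (++depth > max_depth) return false;
    } else if (c == ']') {
      if (--depth < 0) return false;
    } else {
      return false;  // character outside the grammar's alphabet
    }
  }
  return true;
}

// mask[i] is true when appending vocab[i] to `prefix` keeps it a valid prefix.
// Tokens may span several characters, so each one is checked as a whole string.
std::vector<bool> ComputeTokenMask(const std::string& prefix,
                                   const std::vector<std::string>& vocab) {
  std::vector<bool> mask(vocab.size());
  for (std::size_t i = 0; i < vocab.size(); ++i) {
    mask[i] = IsValidPrefix(prefix + vocab[i]);
  }
  return mask;
}

// Greedy choice: index of the highest-logit token the mask allows,
// or -1 if the grammar rejects every token.
int PickToken(const std::vector<float>& logits, const std::vector<bool>& mask) {
  assert(logits.size() == mask.size());
  int best = -1;
  for (std::size_t i = 0; i < logits.size(); ++i) {
    if (mask[i] && (best < 0 || logits[i] > logits[best])) {
      best = static_cast<int>(i);
    }
  }
  return best;
}
```

Because the mask is computed over token strings rather than single characters, the same loop works for any tokenizer whose vocabulary can be listed as strings, which is the property G1 asks for. A practical engine would replace the brute-force scan with precomputed or incremental matching.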