Create @huggingface/jinja for parsing and rendering Jinja chat templates (#352)

This PR introduces the `@huggingface/jinja` library, which is a
minimalistic JavaScript implementation of the Jinja templating engine,
specifically designed for parsing ML chat templates.

Although it was [originally
created](huggingface/transformers.js#408) for (and
integrated into) transformers.js, it became clear that others can use
this functionality too, without the overhead of the transformers.js
library.


**Example usage:** Load a `tokenizer_config.json` from the HF Hub and
render a list of messages:

```js
import { Template } from "@huggingface/jinja";
import { downloadFile } from "@huggingface/hub";

const config = await (await downloadFile({
    repo: "mistralai/Mistral-7B-Instruct-v0.1",
    path: "tokenizer_config.json"
})).json();

const chat = [
    { "role": "user", "content": "Hello, how are you?" },
    { "role": "assistant", "content": "I'm doing great. How can I help you today?" },
    { "role": "user", "content": "I'd like to show off how chat templating works!" },
];

const template = new Template(config.chat_template);
const result = template.render({
    messages: chat,
    bos_token: config.bos_token,
    eos_token: config.eos_token,
});
// "<s>[INST] Hello, how are you? [/INST]I'm doing great. How can I help you today?</s> [INST] I'd like to show off how chat templating works! [/INST]"
```
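
To make the expected output above easier to read, here is a hand-written approximation of the formatting it implies. This is only a sketch: `renderMistralStyle` is a hypothetical helper, not part of `@huggingface/jinja`, and its rules (as well as the `<s>` and `</s>` token values) are inferred solely from the output string shown in the example, not from the actual `chat_template`.

```js
// Hypothetical helper (not part of the library): hard-codes the format that the
// expected output above suggests, instead of parsing a Jinja template.
function renderMistralStyle({ messages, bos_token, eos_token }) {
	let out = bos_token;
	for (const [i, message] of messages.entries()) {
		// Assumption: roles alternate user/assistant, starting with "user".
		const expectedRole = i % 2 === 0 ? "user" : "assistant";
		if (message.role !== expectedRole) {
			throw new Error("Conversation roles must alternate user/assistant");
		}
		if (message.role === "user") {
			// User turns are wrapped in [INST] ... [/INST].
			out += `[INST] ${message.content} [/INST]`;
		} else {
			// Assistant turns are followed by the end-of-sequence token and a space.
			out += `${message.content}${eos_token} `;
		}
	}
	return out;
}

const chat = [
	{ role: "user", content: "Hello, how are you?" },
	{ role: "assistant", content: "I'm doing great. How can I help you today?" },
	{ role: "user", content: "I'd like to show off how chat templating works!" },
];

// Assumption: "<s>" and "</s>" are the token values, as implied by the output above.
const result = renderMistralStyle({ messages: chat, bos_token: "<s>", eos_token: "</s>" });
console.log(result);
```

The point of the library is that this logic lives in the model's own template string, so callers do not have to hard-code it per model as the sketch does.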

---------

Co-authored-by: coyotte508 <[email protected]>
xenova and coyotte508 authored Dec 14, 2023
1 parent 2410462 commit 355dfe5
Showing 21 changed files with 3,838 additions and 0 deletions.
54 changes: 54 additions & 0 deletions .github/workflows/jinja-publish.yml
@@ -0,0 +1,54 @@
name: Jinja - Version and Release

on:
  workflow_dispatch:
    inputs:
      newversion:
        description: "Semantic Version Bump Type (major minor patch)"
        default: patch

concurrency:
  group: "push-to-main"

defaults:
  run:
    working-directory: packages/jinja

jobs:
  version_and_release:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
        with:
          token: ${{ secrets.BOT_ACCESS_TOKEN }}
      - run: corepack enable
      - uses: actions/setup-node@v3
        with:
          node-version: "18"
          cache: "pnpm"
          cache-dependency-path: |
            packages/jinja/pnpm-lock.yaml
          # setting a registry enables the NODE_AUTH_TOKEN env variable where we can set an npm token. REQUIRED
          registry-url: "https://registry.npmjs.org"
      - run: pnpm install
      - run: git config --global user.name machineuser
      - run: git config --global user.email [email protected]
      - run: |
          PACKAGE_VERSION=$(node -p "require('./package.json').version")
          BUMPED_VERSION=$(node -p "require('semver').inc('$PACKAGE_VERSION', '${{ github.event.inputs.newversion }}')")
          # Update package.json with the new version
          node -e "const fs = require('fs'); const package = JSON.parse(fs.readFileSync('./package.json')); package.version = '$BUMPED_VERSION'; fs.writeFileSync('./package.json', JSON.stringify(package, null, '\t') + '\n');"
          git commit . -m "🔖 @huggingface/jinja $BUMPED_VERSION"
          git tag "jinja-v$BUMPED_VERSION"
      - run: pnpm publish --no-git-checks .
        env:
          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
      - run: git push --follow-tags
      # hack - reuse actions/setup-node@v3 just to set a new registry
      - uses: actions/setup-node@v3
        with:
          node-version: "18"
          registry-url: "https://npm.pkg.github.com"
      - run: pnpm publish --no-git-checks .
        env:
          NODE_AUTH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
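
Reviewer note: the version-bump step in the workflow above packs its logic into two inline `node` one-liners, which are hard to read. The sketch below restates that logic as plain functions. It rests on a few assumptions: `bumpVersion` is a simplified, hypothetical stand-in for `semver`'s `inc` that only handles plain `major.minor.patch` versions, and `bumpPackageJson` works on a string rather than reading and writing `./package.json` as the workflow does.

```js
// Simplified stand-in for semver's inc(): supports plain "major.minor.patch"
// versions only (no pre-release or build metadata).
function bumpVersion(version, type) {
	const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
	if (!match) {
		throw new Error(`Unsupported version: ${version}`);
	}
	const [major, minor, patch] = match.slice(1).map(Number);
	switch (type) {
		case "major":
			return `${major + 1}.0.0`;
		case "minor":
			return `${major}.${minor + 1}.0`;
		case "patch":
			return `${major}.${minor}.${patch + 1}`;
		default:
			throw new Error(`Unknown bump type: ${type}`);
	}
}

// Mirrors the workflow's inline `node -e` script, but takes and returns the
// package.json text instead of touching the file system.
function bumpPackageJson(packageJsonText, type) {
	const pkg = JSON.parse(packageJsonText);
	pkg.version = bumpVersion(pkg.version, type);
	// Same serialization as the workflow: tab indentation plus a trailing newline.
	return JSON.stringify(pkg, null, "\t") + "\n";
}

const before = JSON.stringify({ name: "@huggingface/jinja", version: "0.0.1" });
console.log(bumpPackageJson(before, "patch"));
```

The sketch uses `pkg` rather than `package` as the variable name because `package` is a reserved word in strict-mode JavaScript; the workflow's one-liner only works because `node -e` runs it in sloppy mode.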
1 change: 1 addition & 0 deletions packages/hub/package.json
@@ -34,6 +34,7 @@
"format:check": "prettier --check .",
"prepublishOnly": "pnpm run build",
"build": "tsup",
"prepare": "pnpm run build",
"test": "vitest run",
"test:browser": "vitest run --browser.name=chrome --browser.headless --config vitest-browser.config.mts",
"check": "tsc"
1 change: 1 addition & 0 deletions packages/jinja/.eslintignore
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
dist
2 changes: 2 additions & 0 deletions packages/jinja/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
types
dist
69 changes: 69 additions & 0 deletions packages/jinja/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,69 @@
# Jinja

A minimalistic JavaScript implementation of the Jinja templating engine, specifically designed for parsing and rendering ML chat templates.

## Usage

### Load template from a model on the Hugging Face Hub

First, install the jinja and hub packages:

```sh
npm i @huggingface/jinja
npm i @huggingface/hub
```

You can then load a tokenizer configuration from the Hugging Face Hub and render a list of chat messages as follows:

```js
import { Template } from "@huggingface/jinja";
import { downloadFile } from "@huggingface/hub";

const config = await (
	await downloadFile({
		repo: "mistralai/Mistral-7B-Instruct-v0.1",
		path: "tokenizer_config.json",
	})
).json();

const chat = [
	{ role: "user", content: "Hello, how are you?" },
	{ role: "assistant", content: "I'm doing great. How can I help you today?" },
	{ role: "user", content: "I'd like to show off how chat templating works!" },
];

const template = new Template(config.chat_template);
const result = template.render({
	messages: chat,
	bos_token: config.bos_token,
	eos_token: config.eos_token,
});
// "<s>[INST] Hello, how are you? [/INST]I'm doing great. How can I help you today?</s> [INST] I'd like to show off how chat templating works! [/INST]"
```

### Transformers.js (coming soon)

First, install the `@huggingface/jinja` and `@xenova/transformers` packages:

```sh
npm i @huggingface/jinja
npm i @xenova/transformers
```

```js
import { AutoTokenizer } from "@xenova/transformers";

const tokenizer = await AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1");

const chat = [
	{ role: "user", content: "Hello, how are you?" },
	{ role: "assistant", content: "I'm doing great. How can I help you today?" },
	{ role: "user", content: "I'd like to show off how chat templating works!" },
];

const text = tokenizer.apply_chat_template(chat, { tokenize: false });
// "<s>[INST] Hello, how are you? [/INST]I'm doing great. How can I help you today?</s> [INST] I'd like to show off how chat templating works! [/INST]"

const input_ids = tokenizer.apply_chat_template(chat, { tokenize: true, return_tensor: false });
// [1, 733, 16289, 28793, 22557, 28725, 910, 460, 368, 28804, 733, 28748, 16289, 28793, 28737, 28742, 28719, 2548, 1598, 28723, 1602, 541, 315, 1316, 368, 3154, 28804, 2, 28705, 733, 16289, 28793, 315, 28742, 28715, 737, 298, 1347, 805, 910, 10706, 5752, 1077, 3791, 28808, 733, 28748, 16289, 28793]
```
56 changes: 56 additions & 0 deletions packages/jinja/package.json
@@ -0,0 +1,56 @@
{
	"name": "@huggingface/jinja",
	"packageManager": "[email protected]",
	"version": "0.0.1",
	"description": "A minimalistic JavaScript implementation of the Jinja templating engine, specifically designed for parsing and rendering ML chat templates.",
	"repository": "https://github.com/huggingface/huggingface.js.git",
	"publishConfig": {
		"access": "public"
	},
	"type": "module",
	"main": "./dist/index.cjs",
	"module": "./dist/index.js",
	"types": "./dist/index.d.ts",
	"exports": {
		".": {
			"types": "./dist/index.d.ts",
			"require": "./dist/index.cjs",
			"import": "./dist/index.js"
		}
	},
	"engines": {
		"node": ">=18"
	},
	"source": "src/index.ts",
	"scripts": {
		"lint": "eslint --quiet --fix --ext .cjs,.ts .",
		"lint:check": "eslint --ext .cjs,.ts .",
		"format": "prettier --write .",
		"format:check": "prettier --check .",
		"prepublishOnly": "pnpm run build",
		"build": "tsup src/index.ts --format cjs,esm --clean --dts",
		"test": "vitest run",
		"test:browser": "vitest run --browser.name=chrome --browser.headless",
		"check": "tsc"
	},
	"files": [
		"src",
		"dist",
		"README.md",
		"tsconfig.json"
	],
	"keywords": [
		"huggingface",
		"jinja",
		"templates",
		"hugging",
		"face"
	],
	"author": "Hugging Face",
	"license": "MIT",
	"devDependencies": {
		"typescript": "^5.3.2",
		"@huggingface/hub": "workspace:^",
		"@xenova/transformers": "^2.9.0"
	}
}