[DRAFT] Support multiple tokenizers and other layers with assets #1860

Open
wants to merge 1 commit into master
Conversation

@mattdangerw (Member) commented Sep 22, 2024

Preset saving and loading does not currently generalize to multiple tokenizers (or other preprocessors with static assets); this is a work-in-progress PR toward adding that support, specifically for Stable Diffusion.

The high-level API would allow something like this:

```python
# High-level loading.
image_to_text = keras_hub.models.ImageToText.from_preset(
    "sd3_preset_name",
)
# Low-level tokenizer loading.
clip_l_tokenizer = keras_hub.tokenizers.Tokenizer.from_preset(
    "sd3_preset_name", config_file="clip_l_tokenizer.json",
)
clip_g_tokenizer = keras_hub.tokenizers.Tokenizer.from_preset(
    "sd3_preset_name", config_file="clip_g_tokenizer.json",
)
```
During conversion, we would need to make sure each tokenizer is created with a separate `config_file` passed to the constructor. Then, when calling `task.save_to_preset("path")`, you would get the following structure (a construction sketch follows the listing).

```
assets/clip_l_tokenizer/...
assets/clip_g_tokenizer/...
assets/t5_tokenizer/...
clip_l_tokenizer.json
clip_g_tokenizer.json
t5_tokenizer.json
```
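For concreteness, here is a minimal sketch of what the conversion side might look like under this proposal. The `config_file` constructor argument is the mechanism this PR adds; the tokenizer class names, vocabulary arguments, file paths, and the `task` object below are placeholders for the eventual SD3 port, not a confirmed API.

```python
import keras_hub

# Hypothetical conversion-side sketch: give each tokenizer its own
# `config_file` so configs and asset directories do not collide on save.
# Class names and vocabulary/merges paths are placeholders.
clip_l_tokenizer = keras_hub.tokenizers.CLIPTokenizer(
    vocabulary="clip_l_vocab.json",
    merges="clip_l_merges.txt",
    config_file="clip_l_tokenizer.json",
)
clip_g_tokenizer = keras_hub.tokenizers.CLIPTokenizer(
    vocabulary="clip_g_vocab.json",
    merges="clip_g_merges.txt",
    config_file="clip_g_tokenizer.json",
)
t5_tokenizer = keras_hub.tokenizers.T5Tokenizer(
    proto="t5_vocab.spm",
    config_file="t5_tokenizer.json",
)

# `task` stands in for the assembled SD3 task wrapping these tokenizers
# (construction elided). Saving should write one `<name>.json` config at
# the preset root and one `assets/<name>/` directory per tokenizer,
# matching the listing above.
task = ...
task.save_to_preset("path")
```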

This is just a WIP commit to share with @james77777778; tests will not pass yet.

@mattdangerw (Member Author)

@james77777778 just creating this to share a possible solution to #1820 (comment); feel free to patch it in and play around. Though I would still merge #1820 without a solution here so we can stay incremental.

@mattdangerw mentioned this pull request Sep 22, 2024
@james77777778 (Collaborator)

I will try this PR when porting the weights of SD3.
Should I take over this PR, or will you merge it?

@divyashreepathihalli (Collaborator)

> I will try this PR when porting the weights of SD3. Should I take over this PR, or will you merge it?

Hi James! Please feel free to take this PR up.

@mattdangerw (Member Author)

Yes, sorry for the delay, but please take this PR up. Let's test this out with SD3.

