
Graphic Design with Large Multimodal Model

Yutao Cheng*, Zhao Zhang*, Maoke Yang*
Hui Nie, Chunyuan Li, Xinglong Wu, and Jie Shao
[arXiv 📚] [Layout Results 🖼️] [Bibtex 🔗]


Graphist is a graphic design model built on a Large Multimodal Model (LMM) for Hierarchical Layout Generation (HLG). Unlike traditional graphic layout generation (GLG) tasks, which require a predefined sequence of layers, HLG generates graphic compositions from unordered sets of elements. The following figure illustrates the distinction between the two tasks. In HLG, the accuracy of both layer ordering and spatial arrangement is crucial to the quality of the final composition.

[Figure: comparison of the GLG and HLG tasks]

The following posters were created by volunteers using our Graphist web demo. Users upload design elements, and Graphist automatically generates a variety of graphic compositions.

[Figure: posters created by volunteers with the Graphist web demo]

Graphist effectively reinterprets HLG by treating it as a sequence generation problem. It accepts RGB-A images as input and produces a JSON draft protocol that specifies the coordinates, dimensions, and sequence of each design element. For an in-depth explanation, please consult our manuscript.
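To make the idea of a JSON draft protocol concrete, here is a minimal Python sketch of how a consumer might read such a draft and recover the layer order. The schema is an assumption made for illustration: the field names (`canvas`, `layers`, `element`, `index`, `left`, `top`, `width`, `height`), the sample values, and the helper functions `parse_draft` and `fits_canvas` are hypothetical and are not taken from the Graphist manuscript or code.

```python
import json
from dataclasses import dataclass

# Hypothetical draft. The field names and values below are illustrative
# assumptions, not the schema defined by Graphist.
DRAFT_JSON = """
{
  "canvas": {"width": 800, "height": 600},
  "layers": [
    {"element": "title.png",      "index": 2, "left": 100, "top": 40,  "width": 600, "height": 120},
    {"element": "background.png", "index": 0, "left": 0,   "top": 0,   "width": 800, "height": 600},
    {"element": "photo.png",      "index": 1, "left": 150, "top": 200, "width": 500, "height": 350}
  ]
}
"""


@dataclass(frozen=True)
class Layer:
    element: str  # file name of the RGB-A input element
    index: int    # position in the layer sequence (0 = bottom)
    left: int     # x coordinate of the top-left corner, in pixels
    top: int      # y coordinate of the top-left corner, in pixels
    width: int
    height: int


def parse_draft(text):
    """Parse a draft and return (canvas, layers sorted bottom to top)."""
    draft = json.loads(text)
    layers = [Layer(**item) for item in draft["layers"]]
    return draft["canvas"], sorted(layers, key=lambda layer: layer.index)


def fits_canvas(layer, canvas):
    """Check that a layer's bounding box lies inside the canvas."""
    return (
        layer.left >= 0
        and layer.top >= 0
        and layer.left + layer.width <= canvas["width"]
        and layer.top + layer.height <= canvas["height"]
    )


canvas, layers = parse_draft(DRAFT_JSON)
for layer in layers:
    print(layer.index, layer.element, fits_canvas(layer, canvas))
```

A renderer would then paste each element onto the canvas in this sorted order, so that later layers are drawn on top of earlier ones.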


News

[2024/04/23] Our manuscript is now available on arXiv.

To-Do List

  • Release the Graphist checkpoint trained with the Crello dataset
  • Publish layout results on the Crello dataset

Cite

If you find this work beneficial, please cite it. We look forward to more researchers paying attention to the HLG task.

@article{graphist2023hlg,
  title={Graphic Design with Large Multimodal Model},
  author={Cheng, Yutao and Zhang, Zhao and Yang, Maoke and Nie, Hui and Li, Chunyuan and Wu, Xinglong and Shao, Jie},
  journal={arXiv preprint arXiv:2404.14368},
  year={2024}
}