
DriveDiTFit: Fine-tuning Diffusion Transformers for Autonomous Driving.

Introduction

In autonomous driving, deep models have shown remarkable performance across various visual perception tasks, but they demand high-quality, highly diverse training datasets. Such datasets are expected to cover various driving scenarios with adverse weather, lighting conditions, and diverse moving objects. However, manually collecting these data presents significant challenges and high costs. With the rapid development of large generative models, we propose DriveDiTFit, a novel method for efficiently generating autonomous Driving data by FineTuning pre-trained Diffusion Transformers (DiTs). Specifically, DriveDiTFit utilizes a gap-driven modulation technique to carefully select and efficiently fine-tune a small number of parameters in DiTs according to the discrepancy between the pre-trained source data and the target driving data. Additionally, DriveDiTFit develops an effective weather and lighting condition embedding module, initialized by a nearest-semantic-similarity approach, to ensure diversity in the generated data. Through a progressive tuning scheme that refines detail generation in the early diffusion process, and by enlarging the weights corresponding to small objects in the training loss, DriveDiTFit ensures high-quality generation of small moving objects in the generated data. Extensive experiments conducted on driving datasets confirm that our method can efficiently produce diverse, realistic driving data.
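Since the code release is still marked "Coming soon" below, here is a minimal sketch of the nearest-semantic-similarity initialization idea described above. It assumes precomputed semantic embeddings (e.g., from a text encoder) for the pre-trained source classes and for the target weather/lighting conditions; every name in the snippet (`init_condition_embeddings`, `src_cond_table`, `source_sem`, `target_sem`) is hypothetical and not the paper's actual API:

```python
import torch
import torch.nn.functional as F

def init_condition_embeddings(src_cond_table: torch.Tensor,
                              source_sem: torch.Tensor,
                              target_sem: torch.Tensor) -> torch.Tensor:
    """Nearest-semantic-similarity initialization (illustrative sketch).

    src_cond_table: (num_src_classes, d)  pre-trained DiT class-embedding table
    source_sem:     (num_src_classes, k)  semantic embeddings of source classes
    target_sem:     (num_conditions, k)   semantic embeddings of target conditions
    returns:        (num_conditions, d)   initial weather/lighting embedding table
    """
    # Cosine similarity between every target condition and every source class.
    sim = F.normalize(target_sem, dim=-1) @ F.normalize(source_sem, dim=-1).T
    # For each target condition, pick the most semantically similar source class
    # and copy its pre-trained embedding as the starting point.
    nearest = sim.argmax(dim=-1)          # (num_conditions,)
    return src_cond_table[nearest].clone()

# Hypothetical usage: 1000 source classes, 6 target weather/lighting conditions.
table = init_condition_embeddings(torch.randn(1000, 1152),
                                  torch.randn(1000, 512),
                                  torch.randn(6, 512))
print(table.shape)  # torch.Size([6, 1152])
```

The intuition is that a target condition such as "snowy" starts from the pre-trained class embedding whose semantics it most resembles, rather than from random weights.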

Discrepancy between Driving Scenarios and Classification Datasets.


Method

(Figure: overview of the DriveDiTFit method.)

Results

(Figures: generation results on the driving datasets.)

Requirements

Environment

  1. torch 2.1.2
  2. torchvision 0.16.2
  3. timm 0.9.12
  4. CUDA 12.2
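As a quick sanity check, the following snippet (our addition, not part of the repository) prints the installed versions for comparison with the list above:

```python
import torch
import torchvision
import timm

print("torch:", torch.__version__)              # expect 2.1.2
print("torchvision:", torchvision.__version__)  # expect 0.16.2
print("timm:", timm.__version__)                # expect 0.9.12
print("CUDA:", torch.version.cuda)              # expect 12.2
```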

Datasets

We compare DriveDiTFit with other efficient fine-tuning methods on the Ithaca365 and BDD100K datasets.

Preprocessing

Coming soon!

Fine-Tuning

Coming soon!

Acknowledgment

Our implementation of the Diffusion Transformer is based on DiT and DiffFit.

Many thanks to their contributors!

Citation

If you find our work helpful for your research, please consider citing our work.

@article{tu2024driveditfit,
  title={DriveDiTFit: Fine-tuning Diffusion Transformers for Autonomous Driving},
  author={Tu, Jiahang and Ji, Wei and Zhao, Hanbin and Zhang, Chao and Zimmermann, Roger and Qian, Hui},
  journal={arXiv preprint arXiv:2407.15661},
  year={2024}
}
