Finetune_segment_anything_tutorial

This is a tutorial on fine-tuning Segment Anything (SAM) on the VOC2007 dataset.

Main Results

Point prompt

The model was fine-tuned specifically for point prompts. Results are compared using the ISAT_with_segment_anything tool, which uses SAM for automatic annotation.

Before fine-tuning, multiple clicks on several points were needed for good segmentation.

After fine-tuning, a single click can segment the corresponding category effectively.

Before fine-tuning

After fine-tuning

Box prompt

Fine-tuning for point prompts and box prompts.
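
To make the two prompt types concrete, here is a minimal sketch of how a point prompt and a box prompt could be derived from a ground-truth binary mask. It is an illustration only, not necessarily how this repo samples prompts: the helper names `box_from_mask` and `point_from_mask` are hypothetical, and the mask is a plain list of rows so the example runs without extra dependencies. The output conventions follow what SAM's predictor expects: points as (x, y) pixel coordinates and boxes as (x_min, y_min, x_max, y_max).

```python
from typing import List, Tuple

Mask = List[List[int]]  # rows of 0/1 values, indexed as mask[y][x]


def box_from_mask(mask: Mask) -> Tuple[int, int, int, int]:
    """Return the tight foreground box as (x_min, y_min, x_max, y_max)."""
    xs = [x for row in mask for x, value in enumerate(row) if value]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        raise ValueError("mask has no foreground pixels")
    return min(xs), min(ys), max(xs), max(ys)


def point_from_mask(mask: Mask) -> Tuple[int, int]:
    """Return the foreground pixel (x, y) closest to the foreground centroid."""
    pixels = [(x, y) for y, row in enumerate(mask) for x, value in enumerate(row) if value]
    if not pixels:
        raise ValueError("mask has no foreground pixels")
    cx = sum(x for x, _ in pixels) / len(pixels)
    cy = sum(y for _, y in pixels) / len(pixels)
    # Choosing a real foreground pixel keeps the click inside the object
    # even when the centroid itself falls on background (e.g. ring shapes).
    return min(pixels, key=lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2)


if __name__ == "__main__":
    demo_mask = [
        [0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0],
    ]
    print("box prompt:", box_from_mask(demo_mask))
    print("point prompt:", point_from_mask(demo_mask))
```

With a real dataset the same idea would normally be written with NumPy on the instance masks; the pure-Python version is used here only to keep the sketch self-contained.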

Getting Started

Installation

The code requires python>=3.8, pytorch>=1.7, and torchvision>=0.8. Follow the official PyTorch installation instructions to install both PyTorch and TorchVision. Installing both with CUDA support is strongly recommended.

A Python 3.9 virtual environment created with conda is recommended:

conda create --name finetuneSAM python=3.9
conda activate finetuneSAM 
git clone https://github.com/xzyun2011/finetune_segment_anything_tutorial.git
cd finetune_segment_anything_tutorial
pip install -r requirements.txt

❗❗❗ If your CUDA version is below 11.7, use the following commands to set up the torch environment instead:

conda create --name finetuneSAM python=3.8
conda activate finetuneSAM 
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch
git clone https://github.com/xzyun2011/finetune_segment_anything_tutorial.git
cd finetune_segment_anything_tutorial
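
After either install path, the minimum versions stated above (python>=3.8, pytorch>=1.7, torchvision>=0.8) can be checked with a short script. The following is a minimal sketch, not part of this repo: the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical, and it reads installed versions through the standard-library `importlib.metadata` so it runs even when PyTorch is missing.

```python
import re
import sys
from importlib import metadata
from typing import Dict, Tuple

# Minimum versions taken from the requirements stated in this README.
MINIMUMS = {"torch": "1.7", "torchvision": "0.8"}


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.12.1+cu113' into (1, 12, 1), ignoring local build suffixes."""
    parts = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two version strings numerically, padding the shorter with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment() -> Dict[str, bool]:
    """Report, per requirement, whether the current environment satisfies it."""
    report = {"python": sys.version_info >= (3, 8)}
    for package, minimum in MINIMUMS.items():
        try:
            report[package] = meets_minimum(metadata.version(package), minimum)
        except metadata.PackageNotFoundError:
            report[package] = False  # package is not installed at all
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'missing or too old'}")
```

This only checks version numbers. Whether the installed PyTorch build can actually use the GPU is a separate question that `torch.cuda.is_available()` answers once PyTorch is importable.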

Usage

Fine-tuning

Step 0, download a model checkpoint from the segment-anything GitHub repo.
Step 1, prepare the VOC2007 dataset. You can download it from the official website or from BaiduNetdisk (Link: https://pan.baidu.com/s/1vkk3lMheUm6IjTXznlg7Ng Password: 44mk).

This repo provides a mini demo VOC2007 in “data_example”.
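
Before training, it can help to confirm that the dataset folder has the usual VOC layout. The sketch below checks for the standard VOC2007 subfolders under `VOCdevkit/VOC2007`. Which of these folders the fine-tuning script actually reads is an assumption here, and `missing_voc_dirs` is a hypothetical helper, not part of this repo.

```python
from pathlib import Path
from typing import List

# Standard VOC2007 subfolders relevant to segmentation. Assumption: the
# fine-tuning script may need only some of these; adjust the tuple as needed.
EXPECTED_SUBDIRS = ("JPEGImages", "SegmentationClass", "SegmentationObject", "ImageSets")


def missing_voc_dirs(devkit_root: str, year: str = "VOC2007") -> List[str]:
    """Return the expected subfolder names that are absent under <devkit_root>/<year>."""
    base = Path(devkit_root) / year
    return [name for name in EXPECTED_SUBDIRS if not (base / name).is_dir()]


if __name__ == "__main__":
    missing = missing_voc_dirs("data_example/VOCdevkit")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("VOC2007 layout looks complete.")
```

The path passed in is the same one given to `--data` in the fine-tuning command, so the check can be run with your own dataset location before starting a long training job.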

Step 2, start fine-tuning

❗ Note: change the paths to your own model and data ❗

python3 finetune_sam_voc.py --w weights/sam_vit_b_01ec64.pth --type vit_b --data data_example/VOCdevkit
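
Since the model type and the checkpoint must agree, a small wrapper can catch a mismatch before the job starts. This is a minimal sketch under stated assumptions: `build_finetune_command` and `checkpoint_matches_type` are hypothetical helpers, the flags `--w`, `--type`, and `--data` are the ones shown in the command above, and it is assumed that the script accepts the same model type names as the segment-anything model registry (`vit_b`, `vit_l`, `vit_h`).

```python
import shlex
from pathlib import Path
from typing import List

# Model type names used by the segment-anything model registry.
SAM_MODEL_TYPES = ("vit_b", "vit_l", "vit_h")


def checkpoint_matches_type(weights: str, model_type: str) -> bool:
    """Heuristic: official checkpoint file names contain the model type,
    e.g. 'sam_vit_b_01ec64.pth' for 'vit_b'. Renamed files will not match."""
    return model_type in Path(weights).name


def build_finetune_command(weights: str, model_type: str, data_root: str) -> List[str]:
    """Assemble the fine-tuning command as an argument list for subprocess."""
    if model_type not in SAM_MODEL_TYPES:
        raise ValueError(f"unknown model type {model_type!r}, expected one of {SAM_MODEL_TYPES}")
    return [
        "python3", "finetune_sam_voc.py",
        "--w", weights,
        "--type", model_type,
        "--data", data_root,
    ]


if __name__ == "__main__":
    weights = "weights/sam_vit_b_01ec64.pth"
    if not checkpoint_matches_type(weights, "vit_b"):
        print("warning: checkpoint file name does not mention the model type")
    command = build_finetune_command(weights, "vit_b", "data_example/VOCdevkit")
    # Print the command; pass `command` to subprocess.run(command, check=True) to launch it.
    print(shlex.join(command))
```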

Show result

You can load a fine-tuned decoder for inference with the following command:

python3 predict_show.py --w weights/sam_vit_b_01ec64.pth  --type vit_b --decoder weights/sam_decoder_finetune_pointbox.pth  --data data_example/VOCdevkit

Some fine-tuned decoders (one trained with point prompts, one with point and box prompts) are available on BaiduNetdisk (Link: https://pan.baidu.com/s/1sDQu5Oth4FqNYbIY2qreKw Password: 36ou).

A Chinese blog document

Segment-anything学习到微调系列3_SAM微调decoder (Segment Anything, from learning to fine-tuning, part 3: fine-tuning the SAM decoder)

Acknowledgement

License

This project is released under the MIT License. Please also adhere to the Licenses of models and datasets being used.
