I am working on fine-tuning the PP-OCRv3 model for text recognition, with a focus on adding new characters to the dictionary. For this purpose, I have added the new characters to the dictionary and collected a dataset of around 10,000 images. After data augmentation, the dataset has grown to 33,000 images, and I would like advice on the optimal configuration parameters for this task. Specifically:
1. Optimizer and Learning Rate:
I am currently using the Adam optimizer with a learning rate of 0.001. Would it be beneficial to try other optimizers like NAdam or SGD? What learning rate values would you recommend? Also, I am considering switching from the OneCycle learning rate scheduler to options such as CosineAnnealingLR, Piecewise, or a constant learning rate. What are your thoughts?
```yaml
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: OneCycle
    # learning_rate: 0.001
    max_lr: 0.001
    warmup_epoch: 10
  regularizer:
    name: L2
    factor: 3.0e-05
```
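For comparison, the cosine-annealing alternative I am considering would look roughly like this in PaddleOCR's config format (the `Cosine` scheduler name follows the stock PaddleOCR recognition configs; the learning rate of 0.0005 is just a placeholder I have not tuned):

```yaml
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine           # cosine decay over the training run
    learning_rate: 0.0005  # placeholder starting LR, not tuned
    warmup_epoch: 5        # shorter warmup than my current OneCycle setup
  regularizer:
    name: L2
    factor: 3.0e-05
```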
2. Model Architecture:
The current architecture is set to rec with the SVTR_LCNet algorithm. Are there alternative architectures or adjustments you would recommend for better performance? For example, would models like CRNN or a ResNet-backbone recognizer be more effective?
```yaml
Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
```
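If I were to try CRNN instead, my understanding is the config would change along these lines (backbone, neck, and head names are taken from the stock PaddleOCR CRNN recognition configs, e.g. `rec_mv3_none_bilstm_ctc.yml`; I have not verified this exact combination with my dictionary):

```yaml
Architecture:
  model_type: rec
  algorithm: CRNN
  Backbone:
    name: MobileNetV3   # lightweight backbone used in the stock CRNN config
    scale: 0.5
    model_name: large
  Neck:
    name: SequenceEncoder
    encoder_type: rnn   # BiLSTM sequence encoder
    hidden_size: 96
  Head:
    name: CTCHead       # CTC decoding head
```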