change the feature extraction setting of vctk config #433

Open
bondio77 opened this issue Aug 21, 2024 · 2 comments
Labels
question Further information is requested


@bondio77

Hi, first of all, thank you so much for providing pre-trained models from so many experiments. I want to fine-tune the pre-trained VCTK model on my own multi-speaker dataset. The VCTK config file uses fft_size = 2048, hop_length = 300, win_length = 1024, but the TTS model I trained uses 1024, 256, 1024 for the same three settings. Will fine-tuning work if I change the config file to 1024, 256, 1024 to match my TTS model? The sampling rate is 24000. Thank you!

@kan-bayashi
Owner

Sorry for the late reply.
I think you should train from scratch, for the following reasons:

  • A difference in FFT size or window size might be OK if you fine-tune the model.
  • A difference in hop length is critical, since the hop length determines the structure of the upsampling layers.
  • If you want to change the hop length, you need to change the upsample layers as well.

Example (the upsampling scales multiply to the hop length):
hop length = 300 -> 5 * 5 * 4 * 3

upsample_scales: [5, 5, 4, 3] # Upsampling scales.
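
A quick way to check a candidate setting is to multiply the scales and compare the result with the hop length. The sketch below is only an illustration: `scales_match_hop` is a hypothetical helper, not part of the repository, and `[8, 8, 2, 2]` is just one possible factorization of 256, not a tested recommendation.

```python
from math import prod


def scales_match_hop(upsample_scales, hop_length):
    """Return True if the upsampling factors multiply out to the hop length."""
    return prod(upsample_scales) == hop_length


# Values from this thread: the VCTK config uses hop length 300.
print(scales_match_hop([5, 5, 4, 3], 300))  # True

# The same scales no longer match a hop length of 256.
print(scales_match_hop([5, 5, 4, 3], 256))  # False

# One hypothetical factorization of 256 (an assumption, not from the repo).
print(scales_match_hop([8, 8, 2, 2], 256))  # True
```

Any other upsampling-related settings in the config that are tied to these scales would likely need to change together with them.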

@kan-bayashi kan-bayashi added the question Further information is requested label Aug 27, 2024
@bondio77
Author

bondio77 commented Sep 6, 2024

Thank you so much for your answer.
I will try your recommendation. Thank you!
