Questions regarding the use of Sopare #77

Open
shandrios opened this issue Jul 9, 2019 · 10 comments

@shandrios

Hello again, still a big fan. I've been tinkering around a bit with your program, and I've got some questions.

  1. By default, the frequency range used in the speech analysis is 20-600 Hz, even though human speech goes much higher than that. Does increasing this range have an effect on the accuracy of the analysis? Also, the x-axis of the built-in FFT plot function seems to cap out at 2000 Hz by default, even if the frequency range is e.g. 20-5000. Does the actual analysis still take the entire range into account, and can the axis limits be increased?

  2. I noticed the master branch hasn't had a commit since January 2018. Have there been any significant improvements in the testing branch that would warrant using that over the master branch for my own project?

  3. In my own project, I would like my Raspberry to recognise simple voice commands from a handful of different people. I can't necessarily get word samples from all of them, so I'm wondering what settings in the config file I should tinker with in order to improve the rate of success.

Thanks in advance, have a good day.

@bishoph
Owner

bishoph commented Jul 9, 2019

  1. Yes, a wider frequency range normally means more precision. But the default range already gives decent results, even with ordinary hardware, and also works over some range.
    The plot scales automatically and shows the time domain as well as the frequency domain.

  2. Testing branch is ahead of master. Mostly smaller bugfixes. You get the full view right here: https://github.com/bishoph/sopare/compare/testing
    Switching branches is easy so you can give it a try.

  3. Test. Adapt. Repeat. Can't really give better advice without details ;)
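
To make the "more frequencies, more precision" point from item 1 concrete, here is a minimal sketch that counts how many FFT bins fall inside a configured frequency range. The sample rate and window size are illustrative assumptions, not SOPARE defaults, and `bins_in_range` is a hypothetical helper rather than part of SOPARE.

```python
def bins_in_range(sample_rate, n_samples, low_hz, high_hz):
    """Indices of real-FFT bins whose centre frequency lies in [low_hz, high_hz]."""
    bin_width = sample_rate / float(n_samples)  # Hz covered by one bin
    last_bin = n_samples // 2                   # bin at the Nyquist frequency
    return [k for k in range(last_bin + 1)
            if low_hz <= k * bin_width <= high_hz]

# Assumed values for illustration only: 48 kHz input, 4096-sample window.
narrow = bins_in_range(48000, 4096, 20, 600)
wide = bins_in_range(48000, 4096, 20, 5000)
print(len(narrow), len(wide))
```

With these assumed numbers the wider range keeps several times as many bins, so there is more data per comparison, at the cost of more computation.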

@shandrios
Author

My issue is that the plots don't seem to scale past 2000 Hz. In the attached image, you can see that I've set HIGH_FREQ and START_PROGRESSIVE_FACTOR to 5000, but the plots still only show up to 2000/400. Is there something else I need to do to get the full range?
[attached image: graphs]

@bishoph
Owner

bishoph commented Jul 10, 2019

Check if your hardware limits the input. Other than that it could be that you are using a different configuration while you are using plot. Or it is a bug ;).
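
One way the hardware can limit the input is the sample rate: a sampled signal cannot represent frequencies above half the sample rate (the Nyquist limit). A minimal sketch of that check follows; the function names and the example sample rates are assumptions for illustration, not SOPARE settings.

```python
def highest_usable_frequency(sample_rate_hz):
    """Nyquist limit: the highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2.0

def range_is_reachable(sample_rate_hz, high_freq_hz):
    """True if the requested upper frequency fits below the Nyquist limit."""
    return high_freq_hz <= highest_usable_frequency(sample_rate_hz)

# Assumed example rates: an 8 kHz device cannot deliver 5000 Hz, a 44.1 kHz one can.
print(range_is_reachable(8000, 5000))
print(range_is_reachable(44100, 5000))
```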

@shandrios
Author

Probably a bug then, because that is the only config file I have, and the Full FFT graph shows frequencies all the way up to 20000.
[attached image: fullfft]

On another note, is it possible to use the volume of the analyzed sound in a plugin? I noticed that the plugin's run function takes three parameters, but only readable_results is used in your examples. Can the volume of the sound be gotten from the rawbuf or data parameters?
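
If the raw buffer turns out to hold plain PCM audio, a volume figure could be derived along these lines. This is only a sketch: it assumes 16-bit signed little-endian mono samples, which the thread does not confirm for `rawbuf`, and `rms_volume` is a hypothetical helper, not a SOPARE function.

```python
import math
import struct

def rms_volume(pcm_bytes):
    """Root-mean-square amplitude of 16-bit signed little-endian mono PCM bytes."""
    n = len(pcm_bytes) // 2  # two bytes per sample
    if n == 0:
        return 0.0
    samples = struct.unpack("<%dh" % n, pcm_bytes[:n * 2])
    return math.sqrt(sum(s * s for s in samples) / float(n))

# Synthetic buffers standing in for captured audio (assumed format, see above).
quiet = struct.pack("<4h", 100, -100, 100, -100)
loud = struct.pack("<4h", 1000, -1000, 1000, -1000)
print(rms_volume(quiet), rms_volume(loud))
```

Inside a plugin, the same helper could be called on the buffer argument once its actual format has been checked.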

@kimgenegaby

I noticed you spent a lot of time on the frequency domain in your YouTube videos. How can I get the frequency domain from a WAV file?
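
Independently of SOPARE, the general recipe is to read the samples from the WAV file and apply a Fourier transform. The sketch below uses only the Python 3 standard library and a naive DFT so that it runs anywhere; it assumes a 16-bit mono file, and the helper names are hypothetical. For real work a library FFT such as `numpy.fft.rfft` would normally replace the slow loop.

```python
import cmath
import io
import math
import struct
import wave

def read_mono_16bit(wav_file, max_samples):
    """Read up to max_samples from a 16-bit mono WAV; return (sample_rate, samples)."""
    with wave.open(wav_file, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected a 16-bit mono WAV file")
        rate = wav.getframerate()
        raw = wav.readframes(max_samples)
    n = len(raw) // 2
    return rate, struct.unpack("<%dh" % n, raw)

def magnitude_spectrum(samples, sample_rate):
    """Naive O(n^2) DFT; returns (frequency_hz, magnitude) pairs up to Nyquist."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        spectrum.append((k * sample_rate / float(n), abs(acc)))
    return spectrum

# Demo input: a 500 Hz test tone written to an in-memory WAV file.
rate = 8000
tone = [int(10000 * math.sin(2 * math.pi * 500 * i / rate)) for i in range(256)]
buf = io.BytesIO()
with wave.open(buf, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(rate)
    out.writeframes(struct.pack("<%dh" % len(tone), *tone))
buf.seek(0)

rate_read, samples = read_mono_16bit(buf, 256)
peak_freq, _ = max(magnitude_spectrum(samples, rate_read), key=lambda p: p[1])
print(peak_freq)
```

Passing a file path instead of the in-memory buffer to `read_mono_16bit` works the same way, since `wave.open` accepts both.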

@dumblob

dumblob commented Dec 28, 2019

The question about frequency range would interest me as well. Any insights?

@bishoph
Owner

bishoph commented Dec 28, 2019

What question about frequency range is unanswered?

@dumblob

dumblob commented Dec 28, 2019

The question is how the following is possible:

Probably a bug then, because that is the only config file I have, and the Full FFT graph shows frequencies all the way up to 20000.

(see three comments above, in #77 (comment))

@bishoph
Owner

bishoph commented Dec 28, 2019

Only the full FFT graph shows all frequencies. Single token graphs are, as the word states, tokenized and contain only part of the frequencies. Just like a single piece of cake doesn't contain all the ingredients of the full cake...

@dumblob

dumblob commented Dec 28, 2019

Single token graphs are, as the word states, tokenized and inherit only parts of the frequencies.

That hadn't occurred to me. Thanks for the clarification and for a great project overall!
