Recognize with a file #47

Open
mario-nasc opened this issue Jan 3, 2019 · 10 comments

@mario-nasc

Hello and happy new year. Is there any way to run voice recognition on a file instead of on microphone input (loop)? I know I can use files to train, but can I also use files for recognition?

@bishoph
Owner

bishoph commented Jan 3, 2019

Sure, as documented. You can write and read files. With the files you can run normal operations like training or recognition. I use these features for testing and to get reproducible results.

@mario-nasc
Author

Hello, I was able to run both the training and the testing through files. For context: I'm doing a college project where the program's response should be just "yes" or "no". I'm training with audio files from four different people and also testing with files, but the accuracy is very low. Have you had a similar experience, and where am I most likely going wrong? Could it be something in default.ini, is training from files less precise than training directly from the microphone, or do you suspect my dataset?

@bishoph
Owner

bishoph commented Jan 7, 2019

As I know nothing about your training or your configuration, I can only guess. For many different voices, 4 people seems to be OK. Mind that the file input may have data at the start and at the end that can change the results. Therefore, precise recording is key! View your input and cross-check the data with the plot option. Config-wise, I would try to focus on the dominant frequency and add SIMILARITY_NORM if needed. Later on, decrease the distances to avoid false positives. You will still get many, as "yes" and "no" are very short words. I'm not sure whether the length check must be enabled or disabled. That requires testing and adjusting.
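
To illustrate the point about extra data at the start and end of a recording, here is a minimal sketch that trims quiet samples from both ends of a plain list of PCM amplitudes. It is not SOPARE's own logic and does not read SOPARE's raw files; the function name `trim_silence` and the `threshold` value are assumptions chosen for illustration.

```python
def trim_silence(samples, threshold=500):
    """Return samples with leading and trailing quiet values removed.

    samples:   sequence of signed integer amplitudes (e.g. 16-bit PCM).
    threshold: hypothetical amplitude below which a sample counts as silence.
    """
    # Indices of all samples that are loud enough to count as signal.
    loud = [i for i, s in enumerate(samples) if abs(s) >= threshold]
    if not loud:
        return []  # nothing above the threshold: treat it all as silence
    # Keep everything between the first and the last loud sample.
    return list(samples[loud[0]:loud[-1] + 1])


recording = [0, 3, -2, 900, -1200, 40, 1500, -5, 2, 0]
print(trim_silence(recording))
```

Quiet samples in the middle of the word are kept on purpose, so only the padding around the word is dropped.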

@chuddy1

chuddy1 commented Oct 11, 2022

Hi, how exactly can I run the recognition through files? Would it read a WAV file for recognition or a raw file? Thank you.

@bishoph
Owner

bishoph commented Oct 11, 2022

I added recognition through files for debugging and testing purposes. Only raw files are supported. The options are:

-w --write [file] : write raw to [dir/filename]

-r --read [file] : read raw from [dir/filename]
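
As a rough sketch of how the two options might be used from a test script, the helpers below only build the command lines and do not run anything. The script path `./sopare.py`, the helper names, and the sample file path are assumptions; depending on the setup, further options may be needed alongside `--write` or `--read`.

```python
import shlex

# Assumption: SOPARE is started via this script from the current directory.
SOPARE = "./sopare.py"


def write_command(raw_path):
    """Build the argument list for writing raw data to raw_path."""
    return [SOPARE, "--write", raw_path]


def read_command(raw_path):
    """Build the argument list for reading raw data from raw_path."""
    return [SOPARE, "--read", raw_path]


# Print a shell-safe version of the command for inspection.
print(shlex.join(read_command("samples/yes.raw")))
```

Passing the returned list to `subprocess.run` would execute it; keeping the command as a list avoids quoting problems with file names that contain spaces.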

@chuddy1

chuddy1 commented Oct 12, 2022

Hi, thanks for your reply. I tried to use the write function, but it does not allow me to write files longer than 5 seconds. I am trying to record up to 10 minutes of sound so that I can perform a sensitivity analysis on the characteristics I have used in Sopare. I am trying to find the best possible combination of characteristics for detecting my keyword. I also tried reading a raw file that contains my keyword, but Sopare did not detect it. As a matter of fact, it analyzes every raw file I read as not containing my keyword. I am not sure if I am doing this correctly.

Thank you very much for your time!

@bishoph
Owner

bishoph commented Oct 13, 2022

The name "raw file" is misleading, as the content is processed by SOPARE and modified based on the config parameters. This means the length of the raw file is determined by config parameters like MAX_TIME etc. One consequence is that raw files are tightly bound to a specific config. As soon as you change the config, you should delete the raw files and create new ones, or store the raw files separately for each config. Hope that clarifies the topic.
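
One possible way to keep raw files separated per config is to derive a directory name from a hash of the config text, so that any config change automatically points to a fresh, empty directory. This is only a sketch: the helper names, the `samples` base directory, and the example `MAX_TIME` values are assumptions, and SOPARE itself does not provide this mechanism as far as this thread shows.

```python
import hashlib
from pathlib import Path


def config_fingerprint(config_text):
    """Return a short, stable identifier for the given config contents."""
    digest = hashlib.sha256(config_text.encode("utf-8")).hexdigest()
    return digest[:12]  # 12 hex characters are enough to tell configs apart


def raw_dir_for_config(base_dir, config_text):
    """Return the directory where raw files for this config should live."""
    return Path(base_dir) / config_fingerprint(config_text)


# Hypothetical config contents; in practice, read the real config file.
config_a = "MAX_TIME = 3.5\n"
config_b = "MAX_TIME = 5.0\n"

print(raw_dir_for_config("samples", config_a))
print(raw_dir_for_config("samples", config_b))
```

With this layout, raw files written under one config are never read back under another, which suits a sensitivity analysis that runs through many parameter combinations.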

@chuddy1

chuddy1 commented Oct 24, 2022

Thank you bishoph for the reply. I understand now.

@chuddy1

chuddy1 commented Oct 24, 2022

I also made an observation when running tests with Sopare. I have 5 Raspberry Pis which are exact clones of each other. I used Sopare on all 5 Raspberry Pis within the same environment, but I got different prediction results across all 5. The furthest distance between 2 Raspberry Pis in my setup is 29 inches. I am not sure why I am getting different prediction results. Is this expected behavior from the software?

@bishoph
Owner

bishoph commented Oct 25, 2022

Depends on the environment. Echoes, obstructions, and even the volume of the same sound can vary a lot between machines in the same room. Try to analyse one sound on all 5 Pis; you may easily find the differences, even in the graphical representation.
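
As a simple numeric complement to the graphical comparison, the sketch below computes the RMS level of the same sound as captured on several devices and reports each level relative to the loudest one in decibels. It assumes the captures are already available as plain lists of PCM amplitudes; the function names and the sample values are made up for illustration.

```python
import math


def rms(samples):
    """Root mean square amplitude of a sequence of samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def level_spread_db(recordings):
    """Map each device name to its RMS level in dB relative to the loudest.

    recordings: dict of device name -> list of samples of the same sound.
    Devices that captured only silence are left out of the result.
    """
    levels = {name: rms(samples) for name, samples in recordings.items()}
    loudest = max(levels.values())
    return {
        name: 20 * math.log10(level / loudest)
        for name, level in levels.items()
        if level > 0
    }


# Made-up captures of the same sound on two devices.
captures = {"pi1": [1000, -1000, 1000, -1000], "pi2": [500, -500, 500, -500]}
for name, db in level_spread_db(captures).items():
    print(f"{name}: {db:.1f} dB")
```

A spread of several dB between devices for the same sound would be consistent with the volume differences described above.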


3 participants