AMontagu edited this page Aug 24, 2015 · 8 revisions

Voice

Class Overview

The Voice class is built on the pocketSphinx library and provides voice recognition. It lets you use your voice to trigger actions in your software.

Public Constructors

  • Voice();

The simplest constructor. All configuration values are preset: it uses my roboticModel language model and the French HMM (stored in data/modelPocketSphinx/roboticModel), along with the settings for my Windows microphone.

  • Voice(const char* hmm, const char* lm, const char* dict, const char* samprate, const char* nfft);

Constructor that lets you use your own HMM and language model for voice recognition.

hmm: Path to the acoustic model folder.

lm: Path to the language model.

dict: Path to the dictionary.

samprate: The sample rate used by your microphone. If you don't know it, record a few sentences and open the .raw data with Audacity to find which sample rate your microphone uses.

nfft: FFT size, which must be adapted to the sample rate.
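As a rough guide for choosing nfft, the sketch below picks the smallest power of two that can hold one analysis window of audio at a given sample rate. The helper names (suggestNfft, asArgument) are hypothetical and not part of the Voice class, and the window length of about 25 ms is an assumption about pocketSphinx's default settings, so check the result against your own configuration.

```cpp
#include <cassert>
#include <cmath>
#include <string>

// Smallest power of two that holds one analysis window of audio.
// windowSeconds is an assumption (a default window of roughly 25 ms).
int suggestNfft(int sampleRateHz, double windowSeconds = 0.025625)
{
    assert(sampleRateHz > 0);
    const int samplesPerWindow =
        static_cast<int>(std::ceil(sampleRateHz * windowSeconds));
    int nfft = 1;
    while (nfft < samplesPerWindow)
        nfft *= 2;
    return nfft;
}

// The constructor takes samprate and nfft as strings, so convert before passing.
std::string asArgument(int value)
{
    return std::to_string(value);
}
```

Under these assumptions, a 16000 Hz microphone gives 512 and a 44100 Hz microphone gives 2048, so the strings passed to the constructor would be "44100" and "2048" in the second case.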

  • Voice(const char* hmm, const char* lm, const char* dict, const char* samprate, const char* nfft, const char* pathToDirForData);

Constructor that lets you use your own HMM and language model for voice recognition and store your voice recordings in a specified folder. This is useful for checking which sample rate your microphone works at.

hmm: Path to the acoustic model folder.

lm: Path to the language model.

dict: Path to the dictionary.

samprate: The sample rate used by your microphone. If you don't know it, record a few sentences and open the .raw data with Audacity to find which sample rate your microphone uses.

nfft: FFT size, which must be adapted to the sample rate.

pathToDirForData: Path to a directory where your recordings are saved, so you can track down problems with the recorder.

Public Methods

  • void recognizeFromMicrophone();

The pocketSphinx sample function, renamed. It waits until you start talking, records until you stop, and prints what it recognized.

  • const char* recognizeFromMicrophoneWhileTime(int timeToWait);

An improved function that does the same as recognizeFromMicrophone, but with a timeout and for a single command instead of an infinite loop. Returns the recognized words.

timeToWait: Time in seconds for the listening process.
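The timeout behaviour described above can be pictured with the following minimal sketch. It is not the class's actual implementation: listenWithTimeout and the poll callback are hypothetical stand-ins for "read some audio and try to decode one utterance", and only the control flow (stop at the first result or when the time runs out) is illustrated.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <string>

// Calls poll() repeatedly until it yields a non-empty result or the time
// budget runs out. poll is a hypothetical stand-in for the decoding step.
std::string listenWithTimeout(int timeToWait,
                              const std::function<std::string()>& poll)
{
    assert(poll);
    const auto deadline = std::chrono::steady_clock::now()
                        + std::chrono::seconds(timeToWait);
    do {
        std::string heard = poll();
        if (!heard.empty())
            return heard;      // one command only, then stop listening
    } while (std::chrono::steady_clock::now() < deadline);
    return std::string();      // nothing recognized before the timeout
}
```

Returning an empty string on timeout is a design choice of this sketch; the real method may signal a timeout differently.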

  • const char* recognizeFromFile(char* fname);

Recognizes words in a .wav file. Be careful to record the file with the right settings, especially the sample rate.

fname: Name of the audio file we want to recognize.

  • std::string processOnRecognition(std::string dataToProcess);

Takes all the recognized words and looks for the keyword robotlab and the command that follows it. If the command is not possible, the function detects this and returns invalide or a command that does nothing.

dataToProcess: All the recognized words that need to be processed.
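To make the keyword-then-command idea concrete, here is a minimal sketch of that kind of processing. It is an illustration under assumptions, not the code of processOnRecognition: the function name findCommand and the command list ("direction", "stop") are hypothetical, and the sketch always answers invalide when no valid command follows the keyword.

```cpp
#include <set>
#include <sstream>
#include <string>

// Finds the keyword "robotlab" and returns the word right after it if it is
// a known command, otherwise "invalide". The command list is hypothetical.
std::string findCommand(const std::string& dataToProcess)
{
    static const std::set<std::string> knownCommands = {"direction", "stop"};
    std::istringstream words(dataToProcess);
    std::string word;
    while (words >> word) {
        if (word != "robotlab")
            continue;                       // skip everything before the keyword
        std::string command;
        if (words >> command && knownCommands.count(command) > 0)
            return command;
        return "invalide";                  // keyword without a valid command
    }
    return "invalide";                      // keyword never heard
}
```

Requiring the keyword before any command is what keeps ordinary conversation from triggering an action.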

Sample

#include <cstdio>
#include <iostream>
#include <string>

#include "Voice.h"

/*###############################################################################################################################
#																																#
#												Sample for voice recognition													#
#																																#
#################################################################################################################################*/


int main(int argc, const char** argv)
{
	std::string again = "a";
	Voice myVoice; // default constructor: preset models and microphone settings

	// Recognize the commands stored in pre-recorded 44100 Hz .wav files.
	std::string retour(myVoice.recognizeFromFile("../../../data/sound44100Hz/direction.wav"));
	std::cout << "finished direction : " << retour << std::endl;
	std::string retour2(myVoice.recognizeFromFile("../../../data/sound44100Hz/recoSourires.wav"));
	std::cout << "finished recoSmile : " << retour2 << std::endl;
	std::string retour3(myVoice.recognizeFromFile("../../../data/sound44100Hz/recoYeux.wav"));
	std::cout << "finished reco eyes : " << retour3 << std::endl;
	std::string retour4(myVoice.recognizeFromFile("../../../data/sound44100Hz/recoFacial.wav"));
	std::cout << "finished reco facial : " << retour4 << std::endl;
	std::string retour5(myVoice.recognizeFromFile("../../../data/sound44100Hz/stop.wav"));
	std::cout << "finished reco stop : " << retour5 << std::endl;
	getchar(); // wait for Enter before listening to the microphone

	//myVoice.recognizeFromMicrophone();

	// Listen for one command with a 10-second timeout; repeat while the user enters "a".
	while (again == "a")
	{
		std::string retour(myVoice.recognizeFromMicrophoneWhileTime(10));
		std::cout << retour << std::endl;
		std::cin >> again;
	}
	return 0;
}

How to personalize

First, test pocketSphinx with your recording device. Call the constructor with the -rawlogdir option, or uncomment that option in the default constructor.
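One quick sanity check on a saved recording is to compare its length with how long you actually spoke. The sketch below assumes the raw file holds 16-bit mono PCM samples (an assumption about the logged format, so verify it for your setup); rawDurationSeconds is a hypothetical helper, not part of the Voice class.

```cpp
#include <cassert>
#include <cstddef>

// Duration in seconds of a raw recording at a candidate sample rate,
// assuming 16-bit mono PCM (2 bytes per sample).
double rawDurationSeconds(std::size_t fileSizeBytes, int sampleRateHz)
{
    assert(sampleRateHz > 0);
    const std::size_t bytesPerSample = 2;
    return static_cast<double>(fileSizeBytes / bytesPerSample) / sampleRateHz;
}
```

If you spoke for about three seconds, the candidate rate whose computed duration is closest to three seconds is probably the rate your microphone records at.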


Then you need to use your own language model to add your own commands. To create a new language model, see: http://cmusphinx.sourceforge.net/wiki/tutoriallm

You need to create your own dictionary or use an existing one. Note that the more accurate the dictionary, the fewer false positives appear, but the less often your words are recognized.

Personally, I started from the French dictionary used by the community, looked up only the words I needed, and extracted them together with their nearest vocal neighbors to create mine.
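Pulling the needed words out of a large dictionary can be automated. The sketch below is one possible way to do it, assuming the usual one-entry-per-line layout (the word, whitespace, then its phonemes, with alternative pronunciations written as word(2), word(3), and so on). extractEntries is a hypothetical helper; it keeps exact matches only and does not add neighboring words.

```cpp
#include <istream>
#include <set>
#include <sstream>
#include <string>

// Copies only the dictionary lines whose headword is in `needed`.
// Assumes one entry per line: word, whitespace, phonemes; alternative
// pronunciations carry a "(2)"-style suffix on the word.
std::string extractEntries(std::istream& fullDictionary,
                           const std::set<std::string>& needed)
{
    std::ostringstream kept;
    std::string line;
    while (std::getline(fullDictionary, line)) {
        std::string headword = line.substr(0, line.find_first_of(" \t"));
        headword = headword.substr(0, headword.find('('));  // drop "(2)" suffix
        if (needed.count(headword) > 0)
            kept << line << '\n';
    }
    return kept.str();
}
```

Opening the full dictionary with std::ifstream and writing the returned text to a new file gives a small dictionary limited to your commands.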


You can also adapt the default acoustic model to your voice by following these instructions: http://cmusphinx.sourceforge.net/wiki/tutorialadapt.


Add a function that takes the result of the voice recognition function, parses every word, checks whether the result makes sense, and adjusts the recognized data accordingly.
