Friday, October 5th, 2007

Voice Biometrics - The future is near

I had the opportunity to work on a Voice Biometrics project for almost a year, in which my team had to evaluate the feasibility and usability of the technology and implement a workflow for using it in a banking environment. As it turns out, there are several vendors from different parts of the world: the UK, Germany, Australia, etc. ABN AMRO has since introduced the technology for its customers in the Netherlands (read more here, and here from Ars Technica).

 

In summary, ABN AMRO requires customers to first register their voice by saying their preferred PIN 3 times; they are then required to say the same string later whenever they need to be authenticated. The analog phrase is then converted through certain algorithms into a gibberish-looking string of numbers and letters like this “243ddf333480w-4443043kk0l….”, which is called a voiceprint. The conversion takes into account the nasal cavity, soft palate, vocal cords and diaphragm, and thus each voiceprint will be unique, just like thumbprints.

 

I'm sure you have at least one of the burning questions below.
1) Does the technology really work?
2) What if I have a sore throat or the place is noisy?
3) What if someone plays a tape recorder with the string?
4) Is it even reliable and credible?

 

 

1) It depends. Firstly, it depends on what you want from your solution: if you expect the technology to authenticate your voice from plain speech, without a properly registered pass phrase, it will not work well (this is called a Text Independent solution; more on that later). Next, it depends on which vendor's solution you pick. While all vendors use more or less similar algorithms, each has its own blend of them, and the onus also lies on the client (us, in this case) to tune it to the local context. Tuning here may mean finding the best balance in the scoring system, i.e. the threshold at which the false negative rate and the false positive rate are equal (the Equal Error Rate, or EER). E.g. a score of 700 out of 1000 may be good enough for you to decide that a person is who he claims to be. Say that out of 100 attempts by impostors, 2 score over 700 (false positives), and out of 100 attempts by genuine customers, 2 score under 700 (false negatives). In this case, the EER would be 2%.
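
To make the threshold tuning above concrete, here is a minimal Python sketch. It is not any vendor's algorithm: the function names and the score lists are made up for illustration, and I assume a score at or above the threshold counts as "accepted". It simply counts false positives and false negatives at each candidate threshold and picks the one where the two rates are closest.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false_accept_rate, false_reject_rate) at the given threshold.

    Assumption: a score >= threshold means the caller is accepted.
    """
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    far = false_accepts / len(impostor_scores)
    frr = false_rejects / len(genuine_scores)
    return far, frr


def find_eer_threshold(genuine_scores, impostor_scores, candidates=range(0, 1001)):
    """Scan candidate thresholds and return (threshold, far, frr)
    for the first one where the two error rates are closest."""
    best = None
    for t in candidates:
        far, frr = error_rates(genuine_scores, impostor_scores, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, t, far, frr)
    _, threshold, far, frr = best
    return threshold, far, frr


# Made-up sample data: 100 genuine attempts and 100 impostor attempts.
genuine = [650, 680] + [710 + i for i in range(98)]   # 2 genuine callers score low
impostor = [720, 750] + [300 + i for i in range(98)]  # 2 impostors score high

far, frr = error_rates(genuine, impostor, 700)
print(f"At threshold 700: FAR={far:.0%}, FRR={frr:.0%}")

threshold, far, frr = find_eer_threshold(genuine, impostor)
print(f"Closest balance at threshold {threshold}: FAR={far:.0%}, FRR={frr:.0%}")
```

With this sample data, both rates come out at 2% at a threshold of 700, which matches the example above. In practice you would feed in scores collected from your own pilot callers and handsets.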

 

2) Circumstances such as a sore throat and external noise do affect the score. In fact, as we found out, the score can differ slightly even between different mobile phones. That said, these are not major considerations. If your throat is sore, you will merely fail to be authenticated; no one else will be falsely authenticated as you. Also, how often does one get a sore throat so bad that the voice changes, and how often does one change mobile phones? External noise can also be filtered out with certain technologies today.

 

3) There are distinct and measurable differences in frequency, pitch, etc. between a wav file played back from a voice recorder and the voice of the actual person, and the biometric engine should be able to detect such intrusions.

 

4) Voice Biometrics has come a long way. Government agencies (I will not name them) started researching and using it in the 1960s, and today some police forces use Voice Biometrics to identify prank callers or to keep track of ex-convicts. It is definitely a fast-maturing technology, but again, reliability is in the eye of the beholder: it really depends on what you want to achieve with it. Agencies are also working closely with lawmakers to have Voice Biometrics evidence accepted in courts.

 

 

Text Dependent & Text Independent Authentications
If you ever get to work with Voice Biometrics, you are likely to encounter the following terms:

 

Text Dependent simply means that when a user tries to authenticate himself, he has to repeat the same phrase he registered with (preferably at least 5 - 7 seconds long; the longer the better). This means that if he forgets his password, there is no way he can gain entry into the system.

 

Text Independent, in contrast, does not require you to remember the password you registered. You speak as you normally would, and the system returns a score. Needless to say, this is a weaker form of authentication, and it is currently under heavy research. However, it has been deployed in police forces, as explained earlier, to identify prank callers. Currently it is more often used in a ‘reversal’ method than for authentication; many scientists and research agencies nevertheless believe Text Independent is the future, because of its user friendliness and the added security it offers, especially when combined with Speech Recognition technology. E.g. jumbled challenge-response questions can be implemented to authenticate a user over the phone, so that the user does not need to remember his password.
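
As a rough illustration of the jumbled challenge-response idea, here is a small Python sketch. Everything in it is hypothetical: the function names are mine, and I assume a speech recogniser hands back the recognised text while a separate biometric engine hands back a score out of 1000. The sketch only shows how the two results could be combined.

```python
import secrets


def make_challenge(num_digits=6):
    """Generate a fresh random digit string for the caller to read aloud."""
    return "".join(secrets.choice("0123456789") for _ in range(num_digits))


def verify_response(challenge, recognized_text, voice_score, threshold=700):
    """Accept only if the caller said the right digits AND sounds like the owner.

    recognized_text: what a (hypothetical) speech recogniser heard.
    voice_score:     score from a (hypothetical) biometric engine, 0 to 1000.
    threshold:       illustrative cut-off, tuned per deployment.
    """
    # Ignore spaces and filler; compare only the digits that were recognised.
    spoken_digits = "".join(ch for ch in recognized_text if ch.isdigit())
    return spoken_digits == challenge and voice_score >= threshold


challenge = make_challenge()
print("Please say the following digits:", " ".join(challenge))
```

Because the digits change on every call, a recording of an earlier session would contain the wrong string, and the caller has nothing to memorise.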

 

I will wrap up the post with this link, http://knowledge.smu.edu.sg/index.cfm?fa=viewfeature&id=1071, an article on my team’s experience in the project. It shares minimal information on VB, however, and more on consulting and innovation takeaways. Enjoy!

 


Posted by Keith Ng on October 5th, 2007 | Filed in Future, Expert, Business |


