Text-Independent Speaker Recognition Based on Neural Networks

July 15, 2010 by Luigi Rosa
Filed under: Sound technology 


Speaker recognition, or voice recognition, is the task of recognizing people from their voices. Such systems extract features from speech, model them, and use the models to recognize a person from his or her voice. Speaker recognition has a history dating back some four decades, to systems in which the outputs of several analog filters were averaged over time for matching. It relies on acoustic features of speech that have been found to differ between individuals. These acoustic patterns reflect both anatomy (e.g., size and shape of the throat and mouth) and learned behavioral patterns (e.g., voice pitch, speaking style). This incorporation of learned patterns into the voice templates (called “voiceprints”) has earned speaker recognition its classification as a “behavioral biometric.”

Speaker recognition systems employ three styles of spoken input: text-dependent, text-prompted and text-independent. Most speaker verification applications use text-dependent input, which involves selection and enrollment of one or more voice passwords. Text-prompted input is used whenever there is concern about impostors. The technologies used to process and store voiceprints include hidden Markov models, pattern matching algorithms, neural networks, matrix representation and decision trees. Some systems also use “anti-speaker” techniques, such as cohort models and world models. Ambient noise can impede collection of both the initial and subsequent voice samples. Performance degradation can result from changes in behavioral attributes of the voice and from enrolling on one telephone and verifying on another. Voice changes due to aging also need to be addressed by recognition systems.

Many companies market speaker recognition engines, often as part of larger voice processing, control and switching systems. Capture of the biometric is seen as non-invasive. The technology needs little additional hardware: it uses existing microphones and voice-transmission technology, which allows recognition over long distances via ordinary telephones (wireline or wireless).

Multi-layered networks are capable of performing just about any linear or nonlinear computation, and can approximate any reasonable function arbitrarily well. Such networks overcome the problems associated with the perceptron and linear networks. However, while the network being trained may be theoretically capable of performing correctly, backpropagation and its variations may not always find a solution. There are many types of neural networks for various applications. Multilayer perceptrons (MLPs) are feedforward networks and universal approximators; they are among the simplest and therefore most commonly used neural network architectures.
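To make the point about multi-layered networks concrete, here is a minimal sketch in Python (not the Matlab code offered on this page). It wires a two-layer network by hand so that it computes XOR, a function no single threshold neuron can represent. The weights are hand-picked for illustration rather than trained, and all function names are hypothetical. The grid search over single-neuron weights only illustrates the limitation; it is not a proof.

```python
import itertools


def step(x):
    """Hard-threshold activation: 1 if the input is positive, else 0."""
    return 1 if x > 0 else 0


def neuron(inputs, weights, bias):
    """One threshold unit: weighted sum of the inputs plus bias, then step."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)


def xor_mlp(a, b):
    """Two-layer network with hand-picked (not trained) weights."""
    h_or = neuron((a, b), (1.0, 1.0), -0.5)    # fires if a OR b
    h_and = neuron((a, b), (1.0, 1.0), -1.5)   # fires if a AND b
    return neuron((h_or, h_and), (1.0, -1.0), -0.5)  # OR but not AND


def single_neuron_fits_xor(grid):
    """Try every (w1, w2, bias) on a coarse grid for a single neuron."""
    targets = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
    for w1, w2, b in itertools.product(grid, repeat=3):
        if all(neuron(x, (w1, w2), b) == y for x, y in targets.items()):
            return True
    return False


# Candidate weights and biases from -4.0 to 4.0 in steps of 0.5.
grid = [i / 2 for i in range(-8, 9)]

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", xor_mlp(a, b))
print("single neuron can fit XOR on this grid:", single_neuron_fits_xor(grid))
```

In a real recognizer the weights would be learned, for example by backpropagation, which as noted above may not always find a solution.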

Index Terms: Matlab, speaker recognition, speaker verification, speaker matching, neural networks, feature extraction, ann, artificial neural networks, nn.

Figure 1. Speech signal

A simple and effective source code for Speaker Identification based on Neural Networks.

Demo code (protected P-files) available for performance evaluation. Matlab Signal Processing Toolbox and Matlab Neural Network Toolbox are required.

Release   Date         Major features
1.1       2006.07.12   Minor bug fixed
1.0       2006.06.14   -

We recommend verifying that your connection to PayPal is secure, in order to avoid fraud.
This donation should be considered an encouragement to improve the code itself.

Speaker Recognition System Based on ANN – Release 1.0 – Click here for your donation. To obtain the source code you have to pay a small sum: 150 euros (less than 210 U.S. dollars).
Once you have done this, please email us at luigi.rosa@tiscali.it
You will receive our new release of Speaker Recognition System Based on ANN as soon as possible (within a few days).

Alternatively, you can donate by bank transfer using our banking coordinates:

Name: Luigi Rosa
Address: Via Centrale 35 67042 L’Aquila Italy
Bank name: Poste Italiane
Bank address: Viale Europa 190 00144 Roma Italy
IBAN (International Bank Account Number): IT-50-V-07601-03600-000058177916
BIC (Bank Identifier Code): BPPIITRRXXX

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

The authors have no relationship or partnership with The Mathworks. All the code provided is written in Matlab language (M-files and/or M-functions), with no dll or other protected parts of code (P-files or executables). The code was developed with Matlab 14 SP1. Matlab Signal Processing Toolbox and Matlab Neural Network Toolbox are required. The code provided has to be considered “as is” and it is without any kind of warranty. The authors deny any kind of warranty concerning the code as well as any kind of responsibility for problems and damages which may be caused by the use of the code itself including all parts of the source code.

Copyright Protection of Digital Audio Data

February 19, 2010 by Luigi Rosa
Filed under: Signal Processing, Sound technology 


The outstanding progress of digital technology has increased the ease with which digital data is reproduced and retransmitted. However, since the advantages of such progress are broadly available, they offer equally increasing potential for both legal and unauthorized data manipulation. Consequently, the necessity arises for copyright protection of digital products against unauthorized recording attempts, known as data piracy. Current research in image, audio and video copyright protection exploits the fact that human visual and auditory perception cannot detect slight changes in certain temporal or frequency domains of the image and the audio signal, respectively. This property is called masking: a faint but otherwise perceptible signal becomes imperceptible in the presence of another signal under certain conditions. Most research methods consider a watermark signal produced in a unique way by a function of one or more input keys. These keys can be both owner and signal dependent, and they generate a signal which is embedded in the original one. The embedded signal is known as a watermark or copyright label. Temporal and frequency characteristics of the original signal should be taken into account in the watermark casting process to reduce perceptible distortions in the watermarked signal. Each individual who produces or possesses digital data owns a unique key that identifies legal possession and is required for watermark detection. Besides copyright purposes, a watermark serves authentication purposes as well.

A watermark has to be statistically undetectable by others, to thwart attempts at unauthorized removal. This condition is fulfilled if the potential number of keys that produce distinct watermarks is large enough to ensure statistical safety. The detection scheme should be as statistically reliable as possible: false rejection or false acceptance of the existence of the watermark should be minimal. Finally, a watermark has to be robust to signal manipulation and impossible to remove without significant alteration of the signal. In other words, a pirate should have to destroy the audio signal before managing to destroy the watermark. The robustness should extend to common signal processing operations, such as filtering, compression, resampling, requantization, cropping, added noise and D/A conversion.
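The key-based embedding and detection idea described above can be illustrated with a minimal sketch in Python (not the Matlab code offered on this page). It assumes a simple additive scheme: a key seeds a pseudo-random ±1 sequence, which is added to the signal at a fixed strength and later detected by correlation. The function names, the `strength` value and the half-strength threshold are all hypothetical choices. No perceptual masking is applied, so a watermark this strong would likely be audible; a practical scheme would shape it using the temporal and frequency characteristics of the host signal.

```python
import math
import random


def make_watermark(key, length):
    """Key-dependent pseudo-random sequence of +1.0 / -1.0 values."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def embed(signal, key, strength=0.1):
    """Add the key's watermark to the signal (no perceptual shaping)."""
    mark = make_watermark(key, len(signal))
    return [s + strength * m for s, m in zip(signal, mark)]


def detect(signal, key, strength=0.1):
    """Correlate the signal with the key's watermark.

    Returns (decision, correlation). The correlation is close to
    `strength` when the watermark is present and close to zero otherwise;
    the threshold at half the strength is an illustrative assumption.
    """
    mark = make_watermark(key, len(signal))
    corr = sum(s * m for s, m in zip(signal, mark)) / len(signal)
    return corr > strength / 2, corr


# Hypothetical host signal: 2.5 seconds of a 440 Hz tone sampled at 8 kHz.
host = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(20000)]
marked = embed(host, key=1234)

print("right key on marked signal:", detect(marked, key=1234))
print("wrong key on marked signal:", detect(marked, key=9999))
print("right key on original:     ", detect(host, key=1234))
```

Detection succeeds only with the embedding key. A wrong key, or the unmarked original, yields a correlation near zero, which is the sense in which false acceptance can be kept small when many distinct keys exist.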

Index Terms: Matlab, source, code, watermarking, watermark, detection, embedding, audio, copyright, protection.

Figure 1. Copyright protection

A simple and effective source code for Digital Audio Watermarking.

Demo code (protected P-files) available for performance evaluation. Matlab Signal Processing Toolbox is required.

Release   Date         Major features
1.0       2008.04.19   -

We recommend verifying that your connection to PayPal is secure, in order to avoid fraud.
This donation should be considered an encouragement to improve the code itself.

Digital Audio Watermarking – Click here for your donation. To obtain the source code you have to pay a small sum: 90 euros (less than 126 U.S. dollars).
Once you have done this, please email us at luigi.rosa@tiscali.it
You will receive our new release of Digital Audio Watermarking as soon as possible (within a few days).

Alternatively, you can donate by bank transfer using our banking coordinates:

Name: Luigi Rosa
Address: Via Centrale 35 67042 L’Aquila Italy
Bank name: Poste Italiane
Bank address: Viale Europa 190 00144 Roma Italy
IBAN (International Bank Account Number): IT-50-V-07601-03600-000058177916
BIC (Bank Identifier Code): BPPIITRRXXX

The authors have no relationship or partnership with The Mathworks. All the code provided is written in Matlab language (M-files and/or M-functions), with no dll or other protected parts of code (P-files or executables). The code was developed with Matlab 2006a. Matlab Signal Processing Toolbox is required. The code provided has to be considered “as is” and it is without any kind of warranty. The authors deny any kind of warranty concerning the code as well as any kind of responsibility for problems and damages which may be caused by the use of the code itself including all parts of the source code.

Speaker Recognition System

October 29, 2009 by Luigi Rosa
Filed under: Sound technology 


Speaker recognition is the process of automatically recognizing who is speaking on the basis of individual information included in speech waves. This technique makes it possible to use the speaker’s voice to verify their identity and control access to services such as voice dialing, banking by telephone, telephone shopping, database access services, information services, voice mail, security control for confidential information areas, and remote access to computers.

Speaker identity is correlated with the physiological and behavioral characteristics of the speaker. These characteristics exist both in the spectral envelope (vocal tract characteristics) and in the supra-segmental features (voice source characteristics and dynamic features spanning several segments).

The most common short-term spectral measurements currently used are Linear Predictive Coding (LPC)-derived cepstral coefficients and their regression coefficients. A spectral envelope reconstructed from a truncated set of cepstral coefficients is much smoother than one reconstructed from LPC coefficients. It therefore provides a more stable representation from one repetition to another of a particular speaker’s utterances. As for the regression coefficients, typically the first- and second-order coefficients are extracted at every frame period to represent the spectral dynamics. These coefficients are derivatives of the time functions of the cepstral coefficients and are called the delta- and delta-delta-cepstral coefficients, respectively.
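The regression coefficients described above can be sketched in Python as follows (not the Matlab code offered on this page). The sketch assumes the widely used linear-regression formula over a window of ±`width` frames, with the first and last frames repeated at the edges. The function name, the window width of 2 and the edge handling are illustrative assumptions. Applying the function twice gives the delta-delta coefficients.

```python
def delta(frames, width=2):
    """Regression (delta) coefficients for a sequence of cepstral frames.

    frames: list of frames, each a list of cepstral coefficients.
    For frame t and coefficient i the estimate is
        sum_k k * (c[t+k][i] - c[t-k][i]) / (2 * sum_k k^2),  k = 1..width,
    with out-of-range frame indices clamped to the first or last frame.
    """
    n = len(frames)
    denom = 2 * sum(k * k for k in range(1, width + 1))
    out = []
    for t in range(n):
        row = []
        for i in range(len(frames[t])):
            num = 0.0
            for k in range(1, width + 1):
                ahead = frames[min(t + k, n - 1)][i]
                behind = frames[max(t - k, 0)][i]
                num += k * (ahead - behind)
            row.append(num / denom)
        out.append(row)
    return out


# Toy input: coefficient 0 rises by 2 per frame, coefficient 1 is constant.
cepstra = [[2.0 * t, 1.0] for t in range(10)]
d = delta(cepstra)   # delta-cepstral coefficients
dd = delta(d)        # delta-delta-cepstral coefficients

print("delta at frame 4:      ", d[4])
print("delta-delta at frame 4:", dd[4])
```

Away from the edges, the delta of the linearly rising coefficient equals its slope and the delta-delta is zero, which matches the reading of these coefficients as first and second time derivatives.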

Index Terms: speaker, recognition, verification, sound, words.

Figure 1. Microphone

A simple and effective source code for Speaker Recognition. This code is based on Amin Koohi’s excellent submission available here and improves on its results by using an advanced metric for distance computation, which yields a better recognition rate. On the initial dataset (8 speakers) we obtain a recognition rate of 100% (the previous one was 87.5%). We achieve analogous results (100% recognition rate) on a larger dataset (11 speakers).

Demo code (protected P-files) available for performance evaluation. Matlab Signal Processing Toolbox is required.

Release   Date         Major features
1.0       2005.12.07   -

We recommend verifying that your connection to PayPal is secure, in order to avoid fraud.
This donation should be considered an encouragement to improve the code itself.

Speaker Recognition System – Release 1.0 – Click here for your donation. To obtain the source code you have to pay a small sum: 26 euros (less than 36.40 U.S. dollars).
Once you have done this, please email us at luigi.rosa@tiscali.it
You will receive our new release of Speaker Recognition System as soon as possible (within a few days).

Alternatively, you can donate by bank transfer using our banking coordinates:

Name: Luigi Rosa
Address: Via Centrale 35 67042 L’Aquila Italy
Bank name: Poste Italiane
Bank address: Viale Europa 190 00144 Roma Italy
IBAN (International Bank Account Number): IT-50-V-07601-03600-000058177916
BIC (Bank Identifier Code): BPPIITRRXXX

The authors have no relationship or partnership with The Mathworks. All the code provided is written in Matlab language (M-files and/or M-functions), with no dll or other protected parts of code (P-files or executables). The code was developed with Matlab 14 SP1. Matlab Signal Processing Toolbox is required. The code provided has to be considered “as is” and it is without any kind of warranty. The authors deny any kind of warranty concerning the code as well as any kind of responsibility for problems and damages which may be caused by the use of the code itself including all parts of the source code.
