What is the technical principle of speech recognition?
Time:2022-01-26
Speech recognition is essentially pattern recognition based on speech feature parameters: through training, the system learns to classify input speech according to known patterns and then selects the best-matching result according to a decision criterion. Most current speech recognition systems apply this pattern-matching principle.
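The matching idea can be sketched as a nearest-template classifier. This is only a minimal illustration, assuming each utterance has already been reduced to a fixed-length feature vector. The template values, the labels, and the function names `euclidean_distance` and `recognize` are hypothetical, and Euclidean distance stands in for whatever decision criterion a real system would use.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(features, templates):
    """Return the label of the reference template closest to `features`.

    `templates` maps a label to a reference feature vector obtained in training.
    """
    return min(
        templates,
        key=lambda label: euclidean_distance(features, templates[label]),
    )


# Hypothetical reference templates (made-up numbers, for illustration only).
templates = {
    "yes": [1.0, 0.2, 0.5],
    "no": [0.1, 0.9, 0.4],
}

# The input vector lies much closer to the "yes" template than to "no".
result = recognize([0.9, 0.3, 0.5], templates)
print(result)
```

Real utterances vary in length, so practical systems compare sequences of frame-level features rather than a single vector, but the principle of choosing the template with the smallest distortion is the same.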
A general pattern recognition system consists of basic modules such as preprocessing, feature extraction, and pattern matching. First, the input speech is preprocessed, which includes pre-emphasis, framing, and windowing. Second, features are extracted; because recognition accuracy depends heavily on the features, selecting appropriate feature parameters is particularly important. Common feature parameters include the pitch period, formants, short-term average energy or amplitude, short-term average zero-crossing rate, the autocorrelation function, linear prediction coefficients (LPC), linear prediction cepstral coefficients (LPCC), perceptual linear prediction (PLP) coefficients, Mel-frequency cepstral coefficients (MFCC), Gammatone frequency cepstral coefficients (GFCC), wavelet transform coefficients, and empirical mode decomposition (EMD) coefficients. During recognition, a template of the test speech is generated in the same way as in the training process and is then compared with the trained reference templates according to a distortion criterion. Common distortion criteria include Euclidean distance, covariance-matrix-based distance (such as the Mahalanobis distance), and Bayesian distance.
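The preprocessing steps and two of the simpler features listed above (short-term average energy and short-term average zero-crossing rate) can be sketched as follows. This is a rough illustration using only the Python standard library, not a production front end. The pre-emphasis coefficient 0.97, the 25 ms frame length, the 10 ms hop, the Hamming window, and the synthetic sine-wave input are all assumptions chosen for the example, and the function names are hypothetical.

```python
import math


def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass filter: y[n] = x[n] - alpha * x[n-1]."""
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))
    ]


def frame_signal(signal, frame_len, hop):
    """Split the signal into overlapping frames.

    Trailing samples that do not fill a whole frame are dropped.
    """
    return [
        signal[start:start + frame_len]
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def hamming(frame_len):
    """Hamming window coefficients for a frame of `frame_len` samples."""
    if frame_len == 1:
        return [1.0]
    return [
        0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
        for n in range(frame_len)
    ]


def short_term_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


# Synthetic input (assumption): 0.1 s of a 100 Hz sine sampled at 8 kHz.
sample_rate = 8000
signal = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(800)]

emphasized = pre_emphasis(signal)
frames = frame_signal(emphasized, frame_len=200, hop=80)  # 25 ms frames, 10 ms hop
window = hamming(200)
windowed = [[x * w for x, w in zip(frame, window)] for frame in frames]

# One (energy, zero-crossing rate) pair per frame.
features = [(short_term_energy(f), zero_crossing_rate(f)) for f in windowed]
print(len(features))
```

Cepstral features such as MFCC build on the same framed and windowed signal but add a spectral transform and filter bank, which are omitted here for brevity.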