Energy-Based Speech/Non-Speech Detector with Variable Threshold

Figure 4.1: Energy-based detector block diagram

The first stage of the process consists of an energy-based speech/non-speech detector, which can be divided into three major blocks, as seen in figure 4.1. Each of these blocks is explained below. First, the data is preprocessed using common engineering techniques with the purpose of increasing the quality of the speech signal. Then a derivative filter is applied to the energy signal. Finally, a thresholding method is used together with minimum duration enforcement via a Finite State Machine (FSM) to detect silences. This work was initiated by M. Aguilo while visiting ICSI, and was assembled into the current system by the author. For a deeper explanation of each individual module, refer to Aguilo's master's thesis (Aguilo, 2005).
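To make the flow of the last two blocks concrete, the following is a minimal sketch and not the implementation described in (Aguilo, 2005). All function names, the regression-style form of the derivative filter, the symmetric fixed threshold, the frame length and the minimum duration are assumptions chosen for illustration. The preprocessing block and the variable-threshold adaptation are omitted.

```python
import math

def log_energy(samples, frame_len):
    """Log energy (dB) of consecutive non-overlapping frames (assumed framing)."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        energies.append(10.0 * math.log10(power + 1e-10))  # floor avoids log(0)
    return energies

def derivative_filter(values, half_width=2):
    """Regression-style slope estimate of the energy contour (assumed filter shape).

    Indices are clamped at the edges of the sequence.
    """
    n = len(values)
    norm = 2.0 * sum(k * k for k in range(1, half_width + 1))
    out = []
    for t in range(n):
        acc = 0.0
        for k in range(1, half_width + 1):
            acc += k * (values[min(t + k, n - 1)] - values[max(t - k, 0)])
        out.append(acc / norm)
    return out

def detect_silences(delta, threshold, min_frames):
    """Two-state FSM over the filtered energy (hypothetical transition rules).

    A drop below -threshold enters SILENCE; a rise above +threshold returns
    to SPEECH. Silence regions shorter than min_frames are discarded.
    Returns a list of (start_frame, end_frame) pairs, end exclusive.
    """
    silences = []
    in_silence = False
    start = 0
    for t, d in enumerate(delta):
        if not in_silence and d < -threshold:
            in_silence, start = True, t
        elif in_silence and d > threshold:
            in_silence = False
            if t - start >= min_frames:
                silences.append((start, t))
    # Close a silence region that runs to the end of the signal.
    if in_silence and len(delta) - start >= min_frames:
        silences.append((start, len(delta)))
    return silences

# Toy signal: 10 loud frames, 10 quiet frames, 10 loud frames (4 samples each).
samples = [1.0] * 40 + [0.001] * 40 + [1.0] * 40
delta = derivative_filter(log_energy(samples, frame_len=4))
silences = detect_silences(delta, threshold=5.0, min_frames=5)
print(silences)  # [(8, 18)]
```

The detected region is shifted by two frames relative to the true quiet segment (frames 10 to 19) because the assumed filter spans two frames on each side of the current frame. In this sketch the minimum duration is enforced only on silence regions, and the threshold is fixed rather than variable.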


