This section analyzes the appropriateness of the different techniques implemented for the acoustic beamforming of the multiple available signals into a single ``enhanced'' signal. The experiments were conducted on both the development and evaluation sets described in section 6.1.2, except that 4 meetings from CMU were removed from the development set because they contain only a single microphone.
The comparison system in these experiments is the filter&sum (F&S) beamforming used in the RT06s NIST evaluation, which contains all the modules and algorithms described in section 5.2. The same implementation is used in the following section to test the appropriateness of all the algorithms in the single-channel diarization module. Each module is evaluated by comparing the performance of the system with and without it, keeping all other modules in place.
The metrics used in the experiments in this section are the Signal-to-Noise Ratio (SNR) and the Diarization Error Rate (DER), as described in section 6.1.3. In order to conduct a fair comparison, the F&S output signal was obtained for each beamforming configuration considered and its SNR was computed. The DER was then obtained by running the diarization module on that signal (previously parameterized) using the optimum diarization parameters according to the results in section 6.5. The TDOA values were not used in this analysis, and the speech/non-speech labels were kept fixed to those of the RT06s system (used as the baseline system), computed as explained in section 4.1.3. This ensures that any change in DER is attributable only to the change in the beamforming module.
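As a reminder of what the two metrics measure, they are commonly written in the following generic form. This is only a sketch: the symbols $P_{s}$, $P_{n}$, $T_{\mathrm{miss}}$, $T_{\mathrm{fa}}$, $T_{\mathrm{spk}}$ and $T_{\mathrm{total}}$ are introduced here for illustration, and the exact definitions and estimation procedures used in the experiments are those given in section 6.1.3.

\begin{equation}
\mathrm{SNR} = 10 \log_{10} \frac{P_{s}}{P_{n}} \quad \mathrm{[dB]}
\end{equation}

\begin{equation}
\mathrm{DER} = \frac{T_{\mathrm{miss}} + T_{\mathrm{fa}} + T_{\mathrm{spk}}}{T_{\mathrm{total}}}
\end{equation}

Here $P_{s}$ and $P_{n}$ denote the average power of the speech and noise portions of the beamformed signal, $T_{\mathrm{miss}}$ is the missed speech time, $T_{\mathrm{fa}}$ is the false alarm time, $T_{\mathrm{spk}}$ is the time assigned to the wrong speaker, and $T_{\mathrm{total}}$ is the total scored speaker time. A higher SNR and a lower DER both indicate an improvement.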
The modules within the beamforming system that were analyzed are: