Experiments from Broadcast News to Meetings

This section describes the experiments carried out to assess the performance of the system with each of the improvements proposed in the previous chapters. To do so, the following structure will be followed:

As explained above, several baseline systems are used to test the different modules proposed in this thesis, so that the performance of each module can be evaluated independently.

The system used to evaluate the speech/non-speech detector can be considered the baseline of this thesis, as it is derived directly from the broadcast news (BN) system in use at ICSI when this thesis work started. This system already contains a few improvements over the initial BN system, but these are considered core and are not evaluated.

The other baseline (in reality an intermediate system) combines the initial baseline with the beamforming system submitted to the RT06s evaluation and the hybrid speech/non-speech detector. It is used to evaluate the algorithms in the acoustic beamforming module and in the diarization module.

Table 6.2: Results for the CV-EM training algorithm in the agglomerate system
System                     DER, development set
Baseline (24 shows)        20.6%    19.04%   16.49%   18.71%
RT06s system (20 shows)    19.45%   17.65%   14.70%   17.26%

System                     DER, evaluation set
Baseline                   24.54%   26.5%    18.65%   23.23%

Table 6.2 shows the baseline scores used for comparison throughout the following sections. The baseline and the RT06s system differ in whether the 4 single-channel CMU meetings from the development set are included: the baseline covers all 24 shows, while the RT06s system covers the remaining 20. The version with 20 meeting excerpts is used to develop the beamforming system, while the complete baseline is used to evaluate it (all meetings contain more than one channel).
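The relation between the 24-show and 20-show development sets can be illustrated with a minimal sketch. It assumes the smaller set is obtained simply by dropping every single-channel meeting. The Meeting record, the multichannel_subset helper, the meeting names, the "OTHER" site label and the channel counts are all hypothetical placeholders, not the actual shows or tooling used in the experiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Meeting:
    name: str          # hypothetical identifier, not a real show name
    site: str          # recording site label (placeholder values)
    num_channels: int  # number of microphone channels available


def multichannel_subset(meetings):
    """Keep only the meetings recorded with more than one channel."""
    return [m for m in meetings if m.num_channels > 1]


# Illustrative development set: 20 multi-channel meetings plus
# 4 single-channel CMU meetings (channel counts are made up).
dev_set = (
    [Meeting(f"multi_{i:02d}", "OTHER", 4) for i in range(20)]
    + [Meeting(f"cmu_{i:02d}", "CMU", 1) for i in range(4)]
)

# Subset that a beamforming module could be developed on,
# since beamforming needs at least two channels.
beamforming_dev = multichannel_subset(dev_set)

print(len(dev_set), len(beamforming_dev))
```

Under these assumptions the filter leaves 20 of the 24 meetings, matching the show counts listed in Table 6.2.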

user 2008-12-08