Contents
List of Tables
List of Figures
Introduction
Context and Motivations of this Thesis
Definition of the Thesis Objectives
Outline of the Thesis
State of the art
Acoustic Features for Speaker Diarization
Speaker Segmentation
Metric-Based Segmentation
Non Metric-Based Segmentation
Speaker Diarization
Hierarchical Clustering Techniques
Other Clustering Techniques
Use of Support Information in Diarization
Speaker Diarization in Meetings
Current Meeting Room Research Projects
Databases
NIST RT Speaker Diarization Systems for Meetings
Multichannel Acoustic Enhancement
Introduction to Acoustic Array Processing
Microphone Array Beamforming
Time Delay of Arrival Estimation
Speaker Diarization: from Broadcast News to Meetings
The ICSI Broadcast News System
Speech/non-Speech Detection and Parameters Extraction
Clusters Initialization and Acoustic Modeling
Clusters Comparison, Pruning and Clusters Merging
Stopping Criterion and System Output
Analysis of Differences from Broadcast News to Meetings
Input Data Analysis: Broadcast News versus Meetings
Summary of Differences and Proposed Changes
Robust Speaker Diarization System for Meetings
Acoustic Signal Enhancement
Single Channel System Frontend
Speaker Clusters and Models Initialization
Clusters Merging and System Output
Acoustic Modeling Algorithms for Speaker Diarization in Meetings
Speech/Non-Speech Algorithm
Energy-Based Speech/non-Speech Detector with Variable Threshold
Model-based Speech/Non-Speech Decoder
Hybrid Speech/non-Speech Detection
Speaker Clusters Description and Modeling
Friends-and-Enemies Initialization
Clusters and Models Complexity Selection
Acoustic Modeling without Time Restrictions
Cluster Purification Algorithms
Frame-Level Cluster Purification
Segment-Level Cluster Purification
Multichannel Processing for Meetings
Multichannel Acoustic Beamforming for Meetings
Meeting Room Microphone Array Characteristics
Filter-and-Sum Beamforming
Multichannel Acoustic Beamforming System Implementation
Individual Channels Signal Enhancement
Meeting Information Extraction
TDOA Values Selection
Output Signal Generation
Use of the Estimated Delays for Speaker Diarization
TDOA Modeling and Features Fusion
Automatic Features Weight Estimation
Experiments
Meetings Domain Experiments Setup
Baseline Systems
Databases
Evaluation Metrics
Reference Segmentation Selection and Calculation
Experiments from Broadcast News to Meetings
Speech/Non-Speech Detection Block
Acoustic Beamforming Experiments
Baseline System Analysis
Reference Channel Estimation Analysis
TDOA Post-Processing Analysis
Signal Output Algorithms Analysis
Use of the Beamformed Signal for ASR
Speaker Diarization Module Experiments
Individual Algorithms Performance
Algorithms Agglomeration Performance
Overall Experiments and Analysis of Results
NIST Evaluations in Speaker Diarization
NIST Rich Transcription Evaluations in Speaker Diarization for Meetings
RT05s and RT06s Evaluation Conditions
Methodology of the Evaluations
Data used on the Speaker Diarization Evaluations
ICSI Participation in the RT Evaluations
Participation in the 2005 Spring Rich Transcription Evaluation
Participation in the 2006 Spring Rich Transcription Evaluation
Pros and Cons of the NIST Evaluations
Conclusions
Overall Thesis Final Review
Review of Objectives Completion
Possible Future Work Topics
BIC Formulation for Gaussian Mixture Models
Rich Transcription evaluation datasets
Bibliography
user 2008-12-08