Segmentation Using Other Techniques

Some speaker segmentation techniques proposed in the literature do not clearly fit any of the previous categories and are therefore described here.

Vescovi et al. (2003) and Zdansky and Nouza (2005) propose dynamic programming to find the speaker change points. Zdansky and Nouza (2005) use BIC as the marginal likelihood and solve the system via ML, considering every possible number of change points. Vescovi et al. (2003) also use BIC and explore techniques for reducing the computation.
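To make the dynamic programming idea concrete, the following is a minimal sketch of a generic BIC-penalized search over all segmentations. It is not the exact formulation of either cited paper: the function name `dp_change_points`, the single full-covariance Gaussian per segment, the minimum segment length `min_len` and the penalty weight `lam` are all assumptions made for illustration.

```python
import numpy as np

def dp_change_points(X, min_len=20, lam=1.0):
    """Return change-point frame indices minimizing a BIC-penalized cost.

    Illustrative assumption: each segment is modeled by one full-covariance
    Gaussian, and every segment pays the same BIC penalty.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape

    # Prefix sums let us get any segment's mean and covariance in O(d^2).
    S1 = np.vstack([np.zeros((1, d)), np.cumsum(X, axis=0)])
    outer = np.einsum("ti,tj->tij", X, X)
    S2 = np.concatenate([np.zeros((1, d, d)), np.cumsum(outer, axis=0)])

    def cost(i, j):
        # Negative Gaussian log-likelihood of frames i..j-1, dropping the
        # terms that sum to the same constant for every segmentation.
        m = j - i
        mean = (S1[j] - S1[i]) / m
        cov = (S2[j] - S2[i]) / m - np.outer(mean, mean) + 1e-6 * np.eye(d)
        _, logdet = np.linalg.slogdet(cov)
        return 0.5 * m * logdet

    # BIC penalty per segment: free parameters of mean plus covariance.
    penalty = lam * 0.5 * (d + 0.5 * d * (d + 1)) * np.log(n)

    best = np.full(n + 1, np.inf)   # best[j]: optimal cost of frames 0..j-1
    best[0] = 0.0
    prev = np.zeros(n + 1, dtype=int)
    for j in range(min_len, n + 1):
        for i in range(0, j - min_len + 1):
            if not np.isfinite(best[i]):
                continue            # no valid segmentation ends at frame i
            c = best[i] + cost(i, j) + penalty
            if c < best[j]:
                best[j] = c
                prev[j] = i

    # Backtrack from the last frame to recover the segment boundaries.
    cps = []
    j = n
    while j > 0:
        i = prev[j]
        if i > 0:
            cps.append(int(i))
        j = i
    return sorted(cps)
```

Because the recursion scores every admissible boundary position, the number of change points does not have to be fixed in advance. The cost is quadratic in the number of frames, which is why computation reduction is a concern in practice.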

Pwint and Sattar (2005) propose a genetic algorithm where the number of segments is estimated via Walsh basis functions and the location of change points is found using a multi-population genetic procedure.

Lathoud, McCowan and Odobez (2004) base segmentation on speaker locations estimated with multiple microphones. The difference between two locations is used as a feature, and tracking techniques are employed to estimate the change points of possibly moving speakers. Further work on using location cues for clustering is presented in the next section.
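As a rough illustration of using a location difference as a feature, the sketch below flags a change point whenever consecutive per-frame azimuth estimates differ by more than a threshold. This is a deliberate simplification and not the cited method: it assumes the azimuths (in degrees) have already been estimated from the microphone array, it replaces the tracking step with a plain threshold, and the name `location_change_points` and the default `threshold_deg` are hypothetical.

```python
import numpy as np

def location_change_points(azimuth_deg, threshold_deg=20.0):
    """Return frame indices where the estimated azimuth jumps.

    Assumption: azimuth_deg holds one location estimate per frame, in degrees.
    """
    az = np.asarray(azimuth_deg, dtype=float)
    # Wrap differences into [-180, 180) so that 350 -> 5 counts as 15 degrees.
    diff = np.abs((np.diff(az) + 180.0) % 360.0 - 180.0)
    # diff[k] compares frames k and k+1, so the new segment starts at k+1.
    return (np.flatnonzero(diff > threshold_deg) + 1).tolist()
```

A fixed threshold cannot follow a slowly moving speaker, which is the situation the tracking techniques mentioned above are meant to handle.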

