Segment-Level Cluster Purification

In some situations a cluster retains segments from more than one speaker. The segment-level cluster purification algorithm is a mechanism proposed to force such a cluster to split into two parts. It detects the segments in each cluster that are likely to belong to another speaker and reassigns one of them to a new cluster in each iteration of the agglomerative clustering algorithm. The algorithm works as follows:

  1. Find the segment that best represents each model, i.e. the one with the highest normalized likelihood. This isolates the effect of a big speaker model when trying to determine whether it contains segments from more than one speaker: the most representative segment very probably contains data from only one speaker, and comparing it with other segments of similar size is more reliable.

  2. Compute, within each cluster, the $\Delta$BIC value between the best segment (found in step 1) and each of the other segments. If all pairs have a value greater than a minimum purity (empirically set to -50), that model is labelled as ``pure'' and is not checked again in subsequent iterations.

  3. The segment that most differs from its model's best segment is assigned to a new model. All models are retrained and the data is resegmented with Viterbi.

In order to avoid instability, the algorithm is run at most $K_{init}$ times ($K_{init}$ being the number of initial clusters). This prevents clusters from repeatedly splitting off and merging back the same segments.
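
The three steps and the cap on the number of runs can be illustrated with a minimal sketch. It rests on several assumptions that are not part of the description above: each segment is a NumPy array of feature frames, a single diagonal-covariance Gaussian stands in for both the cluster model and each segment, and $\Delta$BIC is computed with a standard penalty term and signed so that larger values mean more similar segments. The names `purify_once`, `delta_bic`, `avg_loglik`, `VAR_FLOOR` and `lam` are hypothetical, and the retraining and Viterbi resegmentation of step 3 are left out.

```python
import numpy as np

VAR_FLOOR = 1e-6  # assumed floor that keeps log-variances finite


def _sum_log_var(x):
    """Sum over dimensions of the log ML variance (diagonal Gaussian)."""
    return float(np.sum(np.log(np.maximum(x.var(axis=0), VAR_FLOOR))))


def avg_loglik(seg, mean, var):
    """Per-frame (length-normalized) log-likelihood under a diagonal Gaussian."""
    ll = -0.5 * (np.log(2 * np.pi * var) + (seg - mean) ** 2 / var).sum(axis=1)
    return float(ll.mean())


def delta_bic(a, b, lam=1.0):
    """Delta-BIC of two segments; higher means more alike (assumed sign)."""
    n_a, n_b, d = len(a), len(b), a.shape[1]
    n = n_a + n_b
    # Log-likelihood of one pooled Gaussian minus that of two separate ones.
    gain = 0.5 * (n_a * _sum_log_var(a) + n_b * _sum_log_var(b)
                  - n * _sum_log_var(np.vstack([a, b])))
    penalty = lam * d * np.log(n)  # 0.5 * (2d extra parameters) * log(n)
    return gain + penalty


def purify_once(clusters, pure, min_purity=-50.0):
    """Move the worst-fitting segment of an impure cluster to a new cluster.

    clusters: list of lists of (frames x dims) arrays.
    pure: set of cluster indices already labelled pure (updated in place).
    Returns True if a segment was split off, False otherwise.
    """
    worst = None  # (delta_bic, cluster index, segment index)
    for c, segs in enumerate(clusters):
        if c in pure or len(segs) < 2:
            continue
        frames = np.vstack(segs)
        mean = frames.mean(axis=0)
        var = np.maximum(frames.var(axis=0), VAR_FLOOR)
        # Step 1: the segment with the highest normalized likelihood.
        best = max(range(len(segs)),
                   key=lambda i: avg_loglik(segs[i], mean, var))
        # Step 2: compare every other segment with the best one.
        lowest = min((delta_bic(segs[best], s), c, i)
                     for i, s in enumerate(segs) if i != best)
        if lowest[0] > min_purity:
            pure.add(c)  # all pairs above the threshold: not checked again
        elif worst is None or lowest < worst:
            worst = lowest
    if worst is None:
        return False
    # Step 3: the most deviant segment seeds a new cluster.
    _, c, i = worst
    clusters.append([clusters[c].pop(i)])
    return True


# Toy data: one cluster holding three segments of one synthetic "speaker"
# and one segment of another.
rng = np.random.default_rng(0)
spk_a = [rng.normal(0.0, 1.0, size=(200, 4)) for _ in range(3)]
spk_b = [rng.normal(6.0, 1.0, size=(200, 4))]
clusters = [spk_a + spk_b]
pure = set()
k_init = len(clusters)  # run purification at most K_init times
for _ in range(k_init):
    if not purify_once(clusters, pure):
        break
print([len(c) for c in clusters])  # prints [3, 1]
```

In this toy run the outlying segment falls far below the -50 threshold and is moved to its own cluster. A further call would label the remaining cluster as pure and leave it untouched. In a full system each split would be followed by model retraining and resegmentation, which this sketch does not attempt.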

user 2008-12-08