-
Adami, A., Burget, L., Dupont, S., Garudadri, H., Grezl, F., Hermansky, H.,
Jain, P., Kajarekar, S., Morgan, N. and Sivadas, S.: 2002, Qualcomm-ICSI-OGI
features for ASR, Proc. International Conference on Speech and Language
Processing.
-
-
Adami, A. G., Kajarekar, S. S. and Hermansky, H.: 2002, A new speaker change
detection method for two-speaker segmentation, Proc. IEEE International
Conference on Acoustics, Speech and Signal Processing, Orlando, Florida.
-
-
Aguilo, M.: 2005, Deteccion de actividad oral en un sistema de
diarizacion, Master's thesis, UPC.
-
-
Ajmera, J.: 2004, Robust Audio Segmentation, PhD thesis, Ecole
Polytechnique Federale de Lausanne.
-
-
Ajmera, J., Bourlard, H. and Lapidot, I.: 2002, Improved unknown-multiple
speaker clustering using HMM, Technical report, IDIAP.
-
-
Ajmera, J., Lathoud, G. and McCowan, I.: 2004, Clustering and segmenting
speakers and their locations in meetings, Proc. IEEE International
Conference on Acoustics, Speech and Signal Processing, Vol. 1, pp. 605-608.
-
-
Ajmera, J., McCowan, I. and Bourlard, H.: 2003, Robust speaker change
detection, Technical report, IDIAP.
-
-
Ajmera, J., McCowan, I. and Bourlard, H.: 2004, Robust speaker change
detection, IEEE Signal Processing Letters 11(8), 649-651.
-
-
Ajmera, J. and Wooters, C.: 2003, A robust speaker clustering algorithm,
IEEE Automatic Speech Recognition and Understanding Workshop, US Virgin
Islands, USA.
-
-
Anguera, X.: 2005, Xbic: Real-time cross probabilities measure for speaker
segmentation, Technical Report TR-99-2004, ICSI.
-
-
Anguera, X., Aguilo, M., Wooters, C., Nadeu, C. and Hernando, J.: 2006, Hybrid
speech/non-speech detector applied to speaker diarization of meetings,
Speaker Odyssey 06, Puerto Rico, USA.
-
-
Anguera, X. and Hernando, J.: 2004a, Evolutive speaker segmentation using a
repository system, Proc. International Conference on Speech and Language
Processing, Jeju Island, Korea.
-
Anguera, X. and Hernando, J.: 2004b, XBIC: nueva medida para segmentacion de
locutor hacia el indexado automatico de la senal de voz, III Jornadas en
Tecnologia del Habla, Valencia, Spain.
-
Anguera, X., Wooters, C. and Hernando, J.: 2005, Speaker diarization for
multi-party meetings using acoustic fusion, IEEE Automatic Speech
Recognition and Understanding Workshop, Puerto Rico, USA.
-
-
Anguera, X., Wooters, C. and Hernando, J.: 2006a, Automatic cluster complexity
and quantity selection: Towards robust speaker diarization, MLMI'06,
Washington DC, USA.
-
Anguera, X., Wooters, C. and Hernando, J.: 2006b, Frame purification for
cluster comparison in speaker diarization, MMUA'06, Toulouse, France.
-
Anguera, X., Wooters, C. and Hernando, J.: 2006c, Friends and enemies: A novel
initialization for speaker diarization, Proc. International Conference
on Speech and Language Processing, Pittsburgh, USA.
-
Anguera, X., Wooters, C. and Hernando, J.: 2006d, Purity algorithms for speaker
diarization of meetings data, Proc. IEEE International Conference on
Acoustics, Speech and Signal Processing, Toulouse, France.
-
Anguera, X., Wooters, C. and Pardo, J. M.: 2006a, Robust speaker diarization
for meetings: ICSI RT06s evaluation system, Proc. International
Conference on Speech and Language Processing, Pittsburgh, USA.
-
Anguera, X., Wooters, C. and Pardo, J. M.: 2006b, Robust speaker diarization
for meetings: ICSI RT06s meetings evaluation system, RT06s Meetings
Recognition Evaluation, Washington DC, USA.
-
Anguera, X., Wooters, C., Peskin, B. and Aguilo, M.: 2005, Robust speaker
segmentation for meetings: The ICSI-SRI spring 2005 diarization system,
RT05s Meetings Recognition Evaluation, Edinburgh, Great Britain.
-
-
Appel, U. and Brandt, A.: 1982, Adaptive sequential segmentation of piecewise
stationary time series, Inf. Sci. 29(1), 27-56.
-
-
Attias, H.: 2000, A variational Bayesian framework for graphical models,
Advances in Neural Information Processing Systems, MIT Press, Cambridge.
-
-
Augmented Multiparty Interaction (AMI) website: 2006.
URL: http://www.amiproject.org
-
-
Bakis, R., Chen, S., Gopalakrishnan, P. and Gopinath, R.: 1997, Transcription
of broadcast news shows with the IBM large vocabulary speech recognition
system, Speech Recognition Workshop, pp. 67-72.
-
-
Barras, C., Zhu, X., Meignier, S. and Gauvain, J.-L.: 2004, Improving speaker
diarization, Fall 2004 Rich Transcription Workshop (RT04), Palisades,
NY.
-
-
Basseville, M. and Nikiforov, I.: 1993, Detection of abrupt changes: theory
and application, Prentice-Hall.
-
-
Beigi, H. S. and Maes, S. H.: 1998, Speaker, channel and environment change
detection, World Congress on Automation.
-
-
Beigi, H. S., Maes, S. H. and Sorensen, J. S.: 1998, A distance measure between
collections of distributions and its application to speaker recognition,
Proc. IEEE International Conference on Acoustics, Speech and Signal
Processing, Detroit, USA.
-
-
Ben, M., Betser, M., Bimbot, F. and Gravier, G.: 2004, Speaker diarization
using bottom-up clustering based on a parameter-derived distance between
adapted GMMs, Proc. International Conference on Speech and Language
Processing, Jeju Island, Korea.
-
-
Bilmes, J. and Zweig, G.: 2002, The graphical models toolkit: an open source
software system for speech and time-series processing, Proc. IEEE
International Conference on Acoustics, Speech and Signal Processing,
Orlando, Fl, USA.
-
-
Bimbot, F. and Mathan, L.: 1993, Text-free speaker recognition using an
arithmetic-harmonic sphericity measure, Eurospeech'93, Berlin, Germany,
pp. 169-172.
-
-
Bonastre, J.-F., Delacourt, P., Fredouille, C., Merlin, T. and Wellekens, C.:
2000, A speaker tracking system based on speaker turn detection for NIST
evaluation, Proc. IEEE International Conference on Acoustics, Speech and
Signal Processing, Istanbul, Turkey, pp. 1177-1180.
-
-
Brandstein, M., Adcock, J. and Silverman, H.: 1995, A practical time-delay
estimator for localizing speech sources with a microphone array, Comput.
Speech Lang. 9, 153-159.
-
-
Brandstein, M. and Griebel, S.: 2001, Explicit Speech Modeling for
Microphone Array Applications, Springer, chapter 7.
-
-
Brandstein, M. S. and Silverman, H. F.: 1997, A robust method for speech signal
time-delay estimation in reverberant rooms, Proc. IEEE International
Conference on Acoustics, Speech and Signal Processing, Munich, Germany.
-
-
Brandstein, M. and Ward, D.: 2001, Microphone Arrays, Springer.
-
-
Burger, S., Maclaren, V. and Yu, H.: 2002, The ISL meeting corpus: The impact
of meeting type on speech style, Proc. International Conference on
Speech and Language Processing, Denver, USA.
-
-
Campbell, J. P.: 1997, Speaker recognition: a tutorial, Proceedings of the
IEEE 85(9), 1437-1462.
-
-
Canseco, L., Lamel, L. and Gauvain, J.-L.: 2005, A comparative study using
manual and automatic transcriptions for diarization, IEEE Automatic
Speech Recognition and Understanding Workshop, San Juan, Puerto Rico.
-
-
Canseco-Rodriguez, L., Lamel, L. and Gauvain, J.-L.: 2004a, Speaker
Diarization from Speech Transcripts, Proc. International Conference on
Speech and Language Processing, Jeju Island, S. Korea, pp. 1272-1275.
-
Canseco-Rodriguez, L., Lamel, L. and Gauvain, J.-L.: 2004b, Towards using STT
for Broadcast News Speaker Diarization, Proc. DARPA RT04, Palisades
NY.
-
Carter, G., Nuttall, A. H. and Cable, P. G.: 1973, The smoothed coherence
transform, Proc. IEEE (Lett.) 61, 1497-1498.
-
-
Cassidy, S.: 2004, The Macquarie speaker diarization system for RT04s,
NIST 2004 Spring Rich Transcription Evaluation Workshop, Montreal, Canada.
-
-
Cettolo, M. and Vescovi, M.: 2003, Efficient audio segmentation algorithms
based on the BIC, Proc. IEEE International Conference on Acoustics,
Speech and Signal Processing.
-
-
Champagne, B., Bedard, S. and Stephenne, A.: 1996, Performance of time-delay
estimation in the presence of room reverberation, IEEE Transactions on
Speech and Audio Processing.
-
-
Chan, W., Lee, T., Zheng, N. and Ouyang, H.: 2006, Use of vocal source features
in speaker segmentation, Proc. IEEE International Conference on
Acoustics, Speech and Signal Processing, Toulouse, France.
-
-
Chen, L., Rose, R. T., Parrill, F., Han, X., Tu, J., Huang, Z., Harper, M.,
Quek, F., McNeill, D., Tuttle, R. and Huang, T.: 2005, VACE multimodal
meeting corpus, MLMI, Edinburgh, UK.
-
-
Chen, S. S., Gales, M. J. F., Gopinath, R. A., Kanevsky, D. and Olsen, P.:
2002, Automatic transcription of broadcast news, Speech Communication
37, 69-87.
-
-
Chen, S. S. and Gopalakrishnan, P.: 1998, Clustering via the Bayesian
information criterion with applications in speech recognition, Proc.
IEEE International Conference on Acoustics, Speech and Signal Processing,
Vol. 2, Seattle, USA, pp. 645-648.
-
-
Chickering, D. M. and Heckerman, D.: 1997, Efficient approximations for the
marginal likelihood of Bayesian networks with hidden variables, Machine
Learning 29, 181-212.
-
-
CMU Meetings Corpus website: 2006.
URL: http://penance.is.cs.cmu.edu/meeting_room
-
-
Cognitive Assistant that Learns and Organizes (CALO) website: 2006.
URL: http://caloproject.sri.com/
-
-
Cohen, I. and Berdugo, B.: 2002, Speech enhancement based on a microphone array
and log-spectral amplitude estimation, 22nd Convention of Electrical and
Electronics Engineers in Israel.
-
-
Collet, M., Charlet, D. and Bimbot, F.: 2005, A correlation metric for speaker
tracking using anchor models, Proc. IEEE International Conference on
Acoustics, Speech and Signal Processing, Philadelphia, USA.
-
-
Computers in the Human Interaction Loop (CHIL) website: 2006.
URL: http://chil.server.de
-
-
Cox, H., Zeskind, R. and Kooij, I.: 1986, Practical supergain, IEEE
Transactions on Acoustics, Speech and Signal Processing
34(3), 393-397.
-
-
Cox, H., Zeskind, R. and Owen, M.: 1987, Robust adaptive beamforming, IEEE
Transactions on Acoustics, Speech and Signal Processing
35(10), 1365-1376.
-
-
DARPA Effective, Affordable, Reusable Speech-to-Text (EARS): 2004.
URL: http://www.darpa.mil/ipto/programs/ears
-
-
Delacourt, P., Kryze, D. and Wellekens, C. J.: 1999a, Detection of speaker
changes in an audio document, Eurospeech-1999, Budapest, Hungary.
-
Delacourt, P., Kryze, D. and Wellekens, C. J.: 1999b, Speaker-based
segmentation for audio data indexing, ESCA Workshop on accessing
Information in Audio Data.
-
Delacourt, P. and Wellekens, C. J.: 1999, Audio data indexing: Use of
second-order statistics for speaker-based segmentation, IEEE
International Conference on Multimedia, Computing and Systems, Florence,
Italy.
-
-
Delacourt, P. and Wellekens, C. J.: 2000, DISTBIC: A speaker-based
segmentation for audio data indexing, Speech Communication: Special
Issue in Accessing Information in Spoken Audio 32, 111-126.
-
-
Deshayes, J. and Picard, D.: 1986, Off-line statistical analysis of
change-point models using non-parametric and likelihood methods,
Springer-Verlag.
-
-
Digalakis, V., Monaco, P. and Murveit, H.: 1996, Genones: generalized mixture
tying in continuous hidden Markov model-based speech recognizers, IEEE
Transactions on Speech and Audio Processing 4(4), 281-289.
-
-
Doclo, S. and Moonen, M.: 2002, GSVD-based optimal filtering for single and
multimicrophone speech enhancement, IEEE Trans. Signal Processing
50, 2230-2244.
-
-
Duda, R. and Hart, P.: 1973, Pattern Classification and Scene Analysis,
John Wiley & Sons.
-
-
Dunn, R. B., Reynolds, D. and Quatieri, T. F.: 2000, Approaches to speaker
detection and tracking in conversational speech, Digital signal
processing 10, 93-112.
-
-
Eckart, C.: 1952, Optimal rectifier systems for the detection of steady
signals, Technical Report Rep SIO 12692, SIO Ref 52-11, 1952, Univ.
California, Scripps Inst. Oceanography, Marine Physical Lab.
-
-
Ellis, D. and Liu, J. C.: 2004, Speaker turn detection based on
between-channels differences, Proc. IEEE International Conference on
Acoustics, Speech and Signal Processing.
-
-
Reed, F., Feintuch, P. and Bershad, N.: 1981, Time delay estimation using the LMS
adaptive filter - static behavior, IEEE Transactions on Acoustics,
Speech and Signal Processing.
-
-
Fiérrez-Aguilar, J., Ortega-García, J. and González-Rodríguez, J.:
2003, Fusion strategies in multimodal biometric verification, IEEE
International Conference on Multimedia and Expo.
-
-
Fischer, S. and Kammeyer, K.-D.: 1997, Broadband beamforming with adaptive
postfiltering for speech acquisition in noisy environments, Proc. IEEE
International Conference on Acoustics, Speech and Signal Processing.
-
-
Fiscus, J. G., Ajot, J., Michet, M. and Garofolo, J. S.: 2006, The rich
transcription 2006 spring meeting recognition evaluation, NIST 2006
Spring Rich Transcription Evaluation Workshop, Washington DC, USA.
-
-
Fiscus, J. G., Garofolo, J., Ajot, J. and Michet, M.: 2006, RT-06S speaker
diarization results and speech activity detection results, NIST 2006
Spring Rich Transcription Evaluation Workshop, Washington DC, USA.
-
-
Fiscus, J. G., Radde, N., Garofolo, J. S., Le, A., Ajot, J. and Laprun, C. D.:
2005, The rich transcription 2005 spring meeting recognition evaluation,
NIST 2005 Spring Rich Transcription Evaluation Workshop, Edinburgh, UK.
-
-
Flanagan, J., Johnson, J., Kahn, R. and Elko, G.: 1994, Computer-steered
microphone arrays for sound transduction in large rooms, Journal of the
Acoustical Society of America 78, 1508-1518.
-
-
Fredouille, C., Moraru, D., Meignier, S., Besacier, L. and Bonastre, J.-F.:
2004, The NIST 2004 spring rich transcription evaluation: Two-axis merging
strategy in the context of multiple distant microphone based meeting speaker
segmentation, NIST 2004 Spring Rich Transcription Evaluation Workshop,
Montreal, Canada.
-
-
Gallardo-Antolin, A., Anguera, X. and Wooters, C.: 2006, Multi-stream speaker
diarization systems for the meetings domain, Proc. International
Conference on Speech and Language Processing, Pittsburgh, USA.
-
-
Gangadharaiah, R., Narayanaswamy, B. and Balakrishnan, N.: 2004, A novel method
for two-speaker segmentation, Proc. International Conference on Speech
and Language Processing, Jeju, S. Korea.
-
-
Garofolo, J. S., Laprun, C. D. and Fiscus, J. G.: 2004, The rich transcription
2004 spring meeting recognition evaluation, NIST 2004 Spring Rich
Transcription Evaluation Workshop, Montreal, Canada.
-
-
Gauvain, J.-L., Lamel, L. and Adda, G.: 1998, Partitioning and transcription of
broadcast news data, Proc. International Conference on Speech and
Language Processing, Vol. 4, Sydney, Australia, pp. 1335-1338.
-
-
Gish, H. and Schmidt, M.: 1994, Text-independent speaker identification,
Signal Processing Magazine, IEEE pp. 18-32.
-
-
Gish, H., Siu, M.-H. and Rohlicek, R.: 1991, Segregation of speakers for speech
recognition and speaker identification, Proc. IEEE International
Conference on Acoustics, Speech and Signal Processing, Vol. 2, Toronto,
Canada, pp. 873-876.
-
-
Griffiths, L. and Jim, C.: 1982, An alternative approach to linearly
constrained adaptive beamforming, IEEE Trans. on Antennas and
Propagation.
-
-
Hain, T., Johnson, S., Turek, A., Woodland, P. and Young, S. J.: 1998, Segment
generation and clustering in the HTK broadcast news transcription system,
DARPA Broadcast News Transcription and Understanding Workshop,
pp. 133-137.
-
-
Heck, L. and Sankar, A.: 1997, Acoustic clustering and adaptation for robust
speech recognition, Eurospeech-97, Rhodes, Greece.
-
-
Hoshuyama, O., Sugiyama, A. and Hirano, A.: 1999, A robust adaptive beamformer
for microphone arrays with a blocking matrix using coefficient-constrained
adaptive filters, IEEE Trans. on Signal Processing.
-
-
Humaine emotion research website: 2006.
URL: http://emotion-research.net/
-
-
Hung, J., Wang, H. and Lee, L.: 2000, Automatic metric based speech
segmentation for broadcast news via principal component analysis, Proc.
International Conference on Speech and Language Processing, Beijing, China.
-
-
ICSI Meeting Recorder Project: Channel skew in ICSI-recorded meetings:
2006.
URL: http://www.icsi.berkeley.edu/~dpwe/research/mtgrcdr/chanskew.html
-
-
ICSI Meetings Recorder corpus: 2006.
URL: http://www.icsi.berkeley.edu/Speech/mr
-
-
Ifeachor, E. and Jervis, B.: 1996, Digital signal processing: a practical
approach, Addison-Wesley.
-
-
Ikbal, S., Misra, H., Sivadas, S., Hermansky, H. and Bourlard, H.: 2004,
Entropy based combination of tandem representations for noise robust ASR,
Proc. International Conference on Speech and Language Processing, South
Korea.
-
-
Technical improvements of the E-HMM based speaker diarization system for meetings
records: 2006, The rich transcription 2006 spring meeting recognition
evaluation, NIST 2006 Spring Rich Transcription Evaluation Workshop,
Washington DC, USA.
-
-
Interactive Multimodal Information Management (IM2) website: 2006.
URL: http://www.im2.ch
-
-
Istrate, D., Fredouille, C., Meignier, S., Besacier, L. and Bonastre, J.-F.:
2005, NIST RT05S evaluation: Pre-processing techniques and speaker
diarization on multiple microphone meetings, NIST 2005 Spring Rich
Transcription Evaluation Workshop, Edinburgh, UK.
-
-
Janin, A., Ang, J., Bhagat, S., Dhillon, R., Edwards, J., Macias-Guarasa, J.,
Morgan, N., Peskin, B., Shriberg, E., Stolcke, A., Wooters, C. and Wrede, B.:
2004, The ICSI meeting project: Resources and research, ICASSP,
Montreal.
-
-
Janin, A., Baron, D., Edwards, J., Ellis, D., Gelbart, D., Morgan, N., Peskin,
B., Pfau, T., Shriberg, E., Stolcke, A. and Wooters, C.: 2003, The ICSI
meeting corpus, ICASSP, Hong Kong.
-
-
Janin, A., Stolcke, A., Anguera, X., Boakye, K., Cetin, O., Frankel, J. and
Zheng, J.: 2006, The ICSI-SRI spring 2006 meeting recognition system,
Proceedings of the Rich Transcription 2006 Spring Meeting Recognition
Evaluation, Washington, USA.
-
-
Jin, H., Kubala, F. and Schwartz, R.: 1997, Automatic speaker clustering,
DARPA Speech Recognition workshop, Chantilly, USA.
-
-
Jin, Q., Laskowski, K., Schultz, T. and Waibel, A.: 2004, Speaker segmentation
and clustering in meetings, NIST 2004 Spring Rich Transcription
Evaluation Workshop, Montreal, Canada.
-
-
Johnson, D. and Dudgeon, D.: 1993, Array signal processing, Prentice
Hall.
-
-
Johnson, S.: 1999, Who spoke when? - automatic segmentation and clustering for
determining speaker turns, Eurospeech-99, Budapest, Hungary.
-
-
Johnson, S. and Woodland, P.: 1998, Speaker clustering using direct
maximization of the MLLR-adapted likelihood, Proc. International
Conference on Speech and Language Processing, Vol. 5, pp. 1775-1779.
-
-
Juang, B. and Rabiner, L.: 1985, A probabilistic distance measure for hidden
Markov models, AT&T Technical Journal 64, AT&T.
-
-
Kaneda, Y.: 1991, Directivity characteristics of adaptive microphone-array for
noise reduction (AMNOR), Journal of the Acoustical Society of Japan
12(4), 179-187.
-
-
Kaneda, Y. and Ohga, J.: 1986, Adaptive microphone-array system for noise
reduction, IEEE Trans. on Acoustics, Speech, and Signal Processing.
-
-
Kass, R. E. and Raftery, A. E.: 1995, Bayes factors, Journal of the
American Statistical Association 90, 773-795.
-
-
Kataoka, A. and Ichinose, Y.: 1990, A microphone array configuration for AMNOR
(adaptive microphone-array system for noise reduction), Journal of the
Acoustical Society of Japan 11(6), 317-325.
-
-
Kemp, T., Schmidt, M., Westphal, M. and Waibel, A.: 2000, Strategies for
automatic segmentation of audio data, Proc. IEEE International
Conference on Acoustics, Speech and Signal Processing, Istanbul, Turkey,
pp. 1423-1426.
-
-
Kim, H.-G., Ertelt, D. and Sikora, T.: 2005, Hybrid speaker-based segmentation
system using model-level clustering, Proc. IEEE International Conference
on Acoustics, Speech and Signal Processing, Philadelphia, USA.
-
-
Knapp, C. H. and Carter, G. C.: 1976, The generalized correlation method for
estimation of time delay, IEEE Transactions on Acoustics, Speech and
Signal Processing ASSP-24(4), 320-327.
-
-
Kohonen, T.: 1990, The self-organizing map, Proceedings of the IEEE
78(9), 1464-1480.
-
-
Krim, H. and Viberg, M.: 1996, Two decades of array signal processing research,
IEEE Signal Processing Magazine pp. 67-94.
-
-
Kristjansson, T., Deligne, S. and Olsen, P.: 2005, Voicing features for robust
speech detection, Proc. International Conference on Speech and Language
Processing, Lisbon, Portugal.
-
-
Kubala, F., Jin, H., Matsoukas, S., Nguyen, L., Schwartz, R. and Makhoul, J.:
1997, The 1996 BBN Byblos HUB-4 transcription system, Speech
Recognition Workshop, pp. 90-93.
-
-
Lapidot, I.: 2003, SOM as likelihood estimator for speaker clustering,
Eurospeech, Geneva, Switzerland.
-
-
Lapidot, I., Gunterman, H. and Cohen, A.: 2002, Unsupervised speaker
recognition based on competition between self-organizing-maps, IEEE
Transactions on Neural Networks 13(4), 877-887.
-
-
Lathoud, G. and McCowan, I. A.: 2003, Location based speaker segmentation,
Proc. IEEE International Conference on Acoustics, Speech and Signal
Processing.
-
-
Lathoud, G., McCowan, I. and Odobez, J.: 2004, Unsupervised location-based
segmentation of multi-party speech, ICASSP-NIST Meeting Recognition
Workshop.
-
-
Lathoud, G., Odobez, J.-M. and McCowan, I.: 2004, Short-term spatio-temporal
clustering of sporadic and concurrent events, Technical Report IDIAP-RR
04-14, IDIAP.
-
-
Lee, K.-F.: 1998, Large vocabulary speaker-independent continuous speech
recognition: the SPHINX system, PhD thesis, Carnegie Mellon University,
Pittsburgh, PA, USA.
-
-
Leeuwen, D. A. V. and Huijbregts, M.: 2006, The AMI speaker diarization
system for NIST RT06s meeting data, NIST 2006 Spring Rich
Transcription Evaluation Workshop, Washington DC, USA.
-
-
Li, Q., Zheng, J., Tsai, A. and Zhou, Q.: 2002, Robust endpoint detection and
energy normalization for real-time speech and speaker recognition, IEEE
Transactions on Speech and Audio Processing 10(3).
-
-
Li, X.: 2005, Combination and Generation of Parallel Feature Streams for
Improved Speech Recognition, PhD thesis, ECE Department, CMU.
-
-
Liu, D. and Kubala, F.: 1999, Fast speaker change detection for broadcast news
transcription and indexing, Eurospeech-99, Vol. 3, Budapest, Hungary,
pp. 1031-1034.
-
-
Lopez, J. F. and Ellis, D. P. W.: 2000a, Using acoustic condition clustering to
improve acoustic change detection on broadcast news, Proc. International
Conference on Speech and Language Processing, Beijing, China.
-
Lopez, J. F. and Ellis, D. P. W.: 2000b, Using acoustic condition clustering to
improve acoustic change detection on broadcast news, Proc. International
Conference on Speech and Language Processing, Beijing, China.
-
Lu, L., Li, S. Z. and Zhang, H.-J.: 2001, Content-based audio segmentation
using support vector machines, ACM Multimedia Conference, pp. 203-211.
-
-
Lu, L. and Zhang, H.-J.: 2002a, Real-time unsupervised speaker change
detection, ICPR'02, Vol. 2, Quebec City, Canada.
-
Lu, L. and Zhang, H.-J.: 2002b, Speaker change detection and tracking in
real-time news broadcasting analysis, ACM International Conference on
Multimedia, pp. 602-610.
-
Lu, L., Zhang, H.-J. and Jiang, H.: 2002, Content analysis for audio
classification and segmentation, IEEE Transactions on Speech and Audio
Processing 10(7), 504-516.
-
-
MacKay, D. J. C.: 1997, Ensemble learning for hidden Markov models.
http://www.inference.phy.cam.ac.uk/mackay/abstracts/ensemblePaper.html.
-
-
Malegaonkar, A., Ariyaeeinia, A., Sivakumaran, P. and Fortuna, J.: 2006,
Unsupervised speaker change detection using probabilistic pattern matching,
IEEE Signal Processing Letters 13(8), 509-512.
-
-
Marro, C., Mahieux, Y. and Simmer, K.: 1998, Analysis of noise reduction and
dereverberation techniques based on microphone arrays with postfiltering,
IEEE Trans. on Speech and Audio Processing.
-
-
McCowan, I.: 2001, Robust Speech Recognition using microphone arrays, PhD
thesis, Queensland University of Technology, Australia.
-
-
McCowan, I. A., Pelecanos, J. and Sridharan, S.: 2001, Robust speaker
recognition using microphone arrays, IEEE Speaker Odyssey recognition
workshop.
-
-
McCowan, I., Gatica-Perez, D., Bengio, S., Lathoud, G., Barnard, M. and Zhang,
D.: 2005, Automatic analysis of multimodal group actions in meetings,
IEEE Trans. on Pattern Analysis and Machine Intelligence 27, 305-317.
-
-
McCowan, I., Marro, C. and Mauuary, L.: 2000, Robust speech recognition using
near-field superdirective beamforming with post-filtering, Proc. IEEE
International Conference on Acoustics, Speech and Signal Processing, Vol. 3,
pp. 1723-1726.
-
-
McCowan, I., Moore, D. and Sridharan, S.: 2000, Speech enhancement using
near-field superdirectivity with an adaptive sidelobe canceler and
post-filter, Australian International Conference on Speech Science and
Technology, pp. 268-273.
-
-
Meignier, S., Bonastre, J.-F. and Igournet, S.: 2001, E-HMM approach for
learning and adapting sound models for speaker indexing, A Speaker
Odyssey, Chania, Crete, pp. 175-180.
-
-
Meignier, S., Moraru, D., Fredouille, C., Besacier, L. and Bonastre, J.-F.:
2004, Benefits of prior acoustic segmentation for automatic speaker
segmentation, Proc. IEEE International Conference on Acoustics, Speech
and Signal Processing, Montreal, Canada.
-
-
Meinedo, H. and Neto, J.: 2003, Audio segmentation, classification and
clustering in a broadcast news task, Proc. IEEE International Conference
on Acoustics, Speech and Signal Processing, Hong Kong, China.
-
-
Metze, F., Fugen, C., Pan, Y., Schultz, T. and Yu, H.: 2004, The ISL RT-04S
meetings transcription system, Proc. IEEE International Conference on
Acoustics, Speech and Signal Processing, Montreal, Canada.
-
-
Mirghafori, N., Stolcke, A., Wooters, C., Pirinen, T., Bulyko, I., Gelbart, D.,
Graciarena, M., Otterson, S., Peskin, B. and Ostendorf, M.: 2004, From
switchboard to meetings: Development of the 2004 ICSI-SRI-UW meeting
recognition system, Proc. International Conference on Speech and
Language Processing, Jeju Island, Korea.
-
-
Mirghafori, N. and Wooters, C.: 2006, Nuts and flakes: A study of data
characteristics in speaker diarization, Proc. IEEE International
Conference on Acoustics, Speech and Signal Processing, Toulouse, France.
-
-
Misra, H., Bourlard, H. and Tyagi, V.: 2003, New entropy based combination
rules in HMM/ANN multi-stream ASR, Proc. IEEE International Conference
on Acoustics, Speech and Signal Processing, Hong Kong.
-
-
Moh, Y., Nguyen, P. and Junqua, J.-C.: 2003, Towards domain independent speaker
clustering, Proc. IEEE International Conference on Acoustics, Speech and
Signal Processing, Hong Kong.
-
-
Moraru, D., Ben, M. and Gravier, G.: 2005, Experiments on speaker tracking and
segmentation in radio broadcast news, Proc. International Conference on
Speech and Language Processing, Lisbon, Portugal.
-
-
Moraru, D., Besacier, L., Meignier, S., Fredouille, C. and Bonastre, J.-F.:
2004, Speaker diarization in the ELISA consortium over the last 4 years,
NIST 2004 Spring Rich Transcription Evaluation Workshop, Montreal,
Canada.
-
-
Moraru, D., Meignier, S., Besacier, L., Bonastre, J.-F. and Magrin-Chagnolleau,
I.: 2002, The ELISA consortium approaches in speaker segmentation during
the NIST 2002 speaker recognition evaluation, NIST 2002 Spring Rich
Transcription Evaluation Workshop.
-
-
Moraru, D., Meignier, S., Besacier, L., Bonastre, J.-F. and Magrin-Chagnolleau,
I.: 2004, The ELISA consortium approaches in speaker segmentation during
the NIST 2002 speaker recognition evaluation, Proc. IEEE International
Conference on Acoustics, Speech and Signal Processing, Montreal, Canada.
-
-
Moraru, D., Meignier, S., Fredouille, C., Besacier, L. and Bonastre, J.-F.:
2004, The ELISA consortium approaches in broadcast news speaker
segmentation during the NIST 2003 rich transcription evaluation, Proc.
IEEE International Conference on Acoustics, Speech and Signal Processing,
Montreal, Canada.
-
-
Mori, K. and Nakagawa, S.: 2001, Speaker change detection and speaker
clustering using VQ distortion for broadcast news speech recognition,
Proc. IEEE International Conference on Acoustics, Speech and Signal
Processing, Vol. 1, Salt Lake City, USA, pp. 413-416.
-
-
Multimodal Meeting Manager (M4) website: 2006.
URL: http://www.m4project.org
-
-
Nakagawa, S. and Suzuki, H.: 1993, A new speech recognition method based on
VQ-distortion and HMM, Proc. IEEE International Conference on
Acoustics, Speech and Signal Processing, Vol. 2, Minneapolis, USA,
pp. 676-679.
-
-
National Institute for Standards and Technology: 2006.
URL: http://www.nist.gov/speech
-
-
Nguyen, P.: 2003, SWAMP: An isometric frontend for speaker clustering,
NIST 2003 Rich Transcription Workshop, Boston, USA.
-
-
Nishida, M. and Kawahara, T.: 2003, Unsupervised speaker indexing using speaker
model selection based on Bayesian information criterion, Proc. IEEE
International Conference on Acoustics, Speech and Signal Processing, Hong
Kong.
-
-
NIST Fall Rich Transcription Evaluation website: 2006.
URL: http://www.nist.gov/speech/tests/rt/rt2004/fall
-
-
NIST Fall Rich Transcription on meetings 2006 Evaluation Plan: 2006.
URL: http://www.nist.gov/speech/tests/rt/rt2006/spring/docs/rt06s-meeting-eval-plan-V2.pdf
-
NIST MD-eval-v21 DER evaluation script: 2006.
URL: http://www.nist.gov/speech/tests/rt/rt2006/spring/code/md-eval-v21.pl
-
-
NIST Pilot Meeting Corpus website: 2006.
URL: http://www.nist.gov/speech/test_beds/mr_proj/meeting_corpus_1
-
-
NIST Rich Transcription evaluations website: 2006.
URL: http://www.nist.gov/speech/tests/rt
-
-
NIST Speech Recognition Evaluation: 2006.
URL: http://www.nist.gov/speech/tests/spk/index.htm
-
-
NIST Speech tools and APIs: 2006.
URL: http://www.nist.gov/speech/tools/index.htm
-
-
NIST Spring Rich Transcription Evaluation in Meetings website: 2006.
URL: http://www.nist.gov/speech/tests/rt/rt2005/spring
-
-
Omar, M. K., Chaudhari, U. and Ramaswamy, G.: 2005, Blind change detection for
audio segmentation, Proc. IEEE International Conference on Acoustics,
Speech and Signal Processing, Philadelphia, USA.
-
-
Ouellet, P., Boulianne, G. and Kenny, P.: 2005, Flavors of Gaussian warping,
Proc. International Conference on Speech and Language Processing,
Lisbon, Portugal.
-
-
Pardo, J. M., Anguera, X. and Wooters, C.: 2006a, Speaker diarization for
multi-microphone meetings using only between-channel differences, MLMI
2006.
-
Pardo, J. M., Anguera, X. and Wooters, C.: 2006b, Speaker diarization for
multiple distant microphone meetings: Mixing acoustic features and
inter-channel time differences, Proc. International Conference on Speech
and Language Processing.
-
Pattern analysis, Statistical modeling and Computational learning (Pascal)
website: 2006.
URL: http://www.pascal-network.org/
-
-
Pelecanos, J. and Sridharan, S.: 2001, Feature warping for robust speaker
verification, ISCA Speaker Recognition Workshop Odyssey, Crete, Greece.
-
-
Perez-Freire, L. and Garcia-Mateo, C.: 2004, A multimedia approach for audio
segmentation in TV broadcast news, Proc. IEEE International Conference
on Acoustics, Speech and Signal Processing, Montreal, Canada, pp. 369-372.
-
-
Pwint, M. and Sattar, F.: 2005, A segmentation method for noisy speech using
genetic algorithm, Proc. IEEE International Conference on Acoustics,
Speech and Signal Processing, Philadelphia, USA.
-
-
Rentzeperis, E., Stergiou, A., Boukis, C., Pnevmatikakis, A. and Polymenakos,
L. C.: 2006, The 2006 Athens Information Technology speech activity detection
and speaker diarization systems, NIST 2006 Spring Rich Transcription
Evaluation Workshop, Washington DC, USA.
-
-
Reynolds, D. A., Singer, E., Carlson, B. A., O'Leary, G. C., McLaughlin, J. J.
and Zissman, M. A.: 1998, Blind clustering of speech utterances based on
speaker and language characteristics, Proc. International Conference on
Speech and Language Processing, Sydney, Australia.
-
-
Reynolds, D. and Torres-Carrasquillo, P.: 2004, The MIT Lincoln Laboratories
RT-04F diarization systems: Applications to broadcast audio and telephone
conversations, Fall 2004 Rich Transcription Workshop (RT04),
Palisades, NY.
-
-
Roch, M. and Cheng, Y.: 2004, Speaker segmentation using the MAP-adapted
bayesian information criterion, Odyssey-04, Toledo, Spain,
pp. 349-354.
-
-
Rombouts, G. and Moonen, M.: 2003, QRD-based unconstrained optimal filtering for
acoustic noise reduction, IEEE Trans. Signal Processing
83(9), 1889-1904.
-
-
Rosca, J., Balan, R. and Beaugeant, C.: 2003, Multi-channel psychoacoustically
motivated speech enhancement, Proc. IEEE International Conference on
Acoustics, Speech and Signal Processing.
-
-
Ross, A., Jain, A. K. and Qian, J. Z.: 2001, Information fusion in biometrics,
3rd International Conference on Audio and Video-Based Person
Authentication.
-
-
Roth, P.: 1971, Effective measurements using digital signal analysis, IEEE
Spectrum 8, 62-70.
-
-
Rougui, J., Rziza, M., Aboutajdine, D., Gelgon, M. and Martinez, J.: 2006, Fast
incremental clustering of gaussian mixture speaker models for scaling up
retrieval in on-line broadcast, Proc. IEEE International Conference on
Acoustics, Speech and Signal Processing, Toulouse, France.
-
-
Sanchez-Bote, J., Gonzalez-Rodriguez, J. and Ortega-Garcia, J.: 2003, A
real-time auditory-based microphone array assessed with e-rasti evaluation
proposal, Proc. IEEE International Conference on Acoustics, Speech and
Signal Processing.
-
-
Sankar, A., Beaufays, F. and Digalakis, V.: 1995, Training data clustering for
improved speech recognition, Eurospeech-95, Madrid, Spain.
-
-
Sankar, A., Weng, F., Stolcke, Z. R. A. and Grande, R. R.: 1998, Development of
SRI's 1997 broadcast news transcription system, DARPA Broadcast News
Transcription and Understanding Workshop, Lansdowne, USA.
-
-
Schmidt, R.: 1986, Multiple emitter location and signal parameter estimation,
IEEE Transactions on Antennas and Propagation.
-
-
Schwarz, G.: 1971, A sequential student test, The Annals of Statistics
42(3), 1003-1009.
-
-
Schwarz, G.: 1978, Estimating the dimension of a model, The Annals of
Statistics 6, 461-464.
-
-
Shaobing Chen, S. and Gopalakrishnan, P.: 1998, Speaker, environment and
channel change detection and clustering via the bayesian information
criterion, Proceedings DARPA Broadcast News Transcription and
Understanding Workshop, Virginia, USA.
-
-
Shinozaki, T. and Ostendorf, M.: 2007, Cross-validation EM training for
robust parameter estimation, Proc. IEEE International Conference on
Acoustics, Speech and Signal Processing, submitted.
-
-
sian Cheng, S. and min Wang, H.: 2003, A sequential metric-based audio
segmentation method via the bayesian information criterion,
Eurospeech'03, Geneva, Switzerland.
-
-
sian Cheng, S. and min Wang, H.: 2004, METRIC-SEQDAC: A hybrid approach for
audio segmentation, Proc. International Conference on Speech and
Language Processing, Jeju, South Korea.
-
-
Siegler, M. A., Jain, U., Raj, B. and Stern, R. M.: 1997, Automatic
segmentation, classification and clustering of broadcast news audio,
DARPA Speech Recognition Workshop, Chantilly, pp. 97-99.
-
-
Similar Network of Excellence website: 2006.
URL: http://www.similar.cc/cms/default.asp?id=0
-
-
Sinha, R., Tranter, S. E., Gales, J. J. F. and Woodland, P. C.: 2005, The
Cambridge University March 2005 speaker diarisation system, European
Conference on Speech Communication and Technology (Interspeech), Lisbon,
Portugal, pp. 2437-2440.
-
-
Siu, M.-H., Yu, G. and Gish, H.: 1992, An unsupervised, sequential learning
algorithm for the segmentation of speech waveforms with multiple speakers,
Proc. IEEE International Conference on Acoustics, Speech and Signal
Processing, Vol. 2, San Francisco, USA, pp. 189-192.
-
-
Sivakumaran, P., Fortuna, J. and Ariyaeeinia, A.: 2001, On the use of the
bayesian information criterion in multiple speaker detection,
Eurospeech'01, Scandinavia.
-
-
Solomonov, A., Mielke, A., Schmidt, M. and Gish, H.: 1998, Clustering speakers
by their voices, Proc. IEEE International Conference on Acoustics,
Speech and Signal Processing, Vol. 2, Seattle, USA, pp. 757-760.
-
-
Speech in noisy environments: 2006.
URL: http://www.speech.sri.com/projects/spine/
-
-
Spring 2005 (RT-05S) Rich Transcription Meeting Recognition Evaluation
Plan: n.d.
URL: http://www.nist.gov/speech/tests/rt/rt2005/spring/rt05s-meeting-eval-plan-V1.pdf
-
-
Spring 2006 (RT-06S) Rich Transcription Meeting Recognition Evaluation
Plan: n.d.
URL: http://www.nist.gov/speech/tests/rt/rt2006/spring/docs/rt06s-meeting-eval-plan-V2.pdf
-
-
Stolcke, A., Anguera, X., Boakye, K., Cetin, O., Grezl, F., Janin, A., Mandal,
A., Peskin, B., Wooters, C. and Zheng, J.: 2005, Further progress in meeting
recognition: The ICSI-SRI spring 2005 speech-to-text evaluation system,
RT05s Meetings Recognition Evaluation, Edinburgh, Great Britain.
-
-
Strassel, S. and Glenn, M.: 2004, Shared linguistic resources for human
language technology in the meeting domain, ICASSP-DARPA Meetings
Diarization Workshop, Montreal, Canada.
-
-
Sturim, D., Reynolds, D., Singer, E. and Campbell, J. P.: 2001, Speaker indexing
in large audio databases using anchor models, Proc. IEEE International
Conference on Acoustics, Speech and Signal Processing, Salt Lake City, USA.
-
-
Tager, W.: 1998a,
- Etudes en traitement d'antenne pour la prise de son,
PhD thesis, Universite de Rennes.
-
Tager, W.: 1998b,
- Near field superdirectivity (nfsd), Proc. IEEE
International Conference on Acoustics, Speech and Signal Processing,
pp. 2045-2048.
-
Tranter, S.: 2005, Two-way cluster voting to improve speaker diarization
performance, Proc. IEEE International Conference on Acoustics, Speech
and Signal Processing, Montreal, Canada.
-
-
Tranter, S. and Reynolds, D.: 2004, Speaker diarization for broadcast news,
ODYSSEY'04, Toledo, Spain.
-
-
Trees, H. V.: 1968, Detection Estimation and Modulation Theory, Vol. 1,
Wiley.
-
-
Tritschler, A. and Gopinath, R.: 1999, Improved speaker segmentation and
segments clustering using the bayesian information criterion,
Eurospeech'99, pp. 679-682.
-
-
Tsai, W.-H., Cheng, S.-S., Chao, Y.-H. and Wang, H.-M.: 2005, Clustering speech
utterances by speaker using eigenvoice-motivated vector space models,
Proc. IEEE International Conference on Acoustics, Speech and Signal
Processing, Philadelphia, USA.
-
-
Tsai, W.-H., Cheng, S.-S. and Wang, H.-M.: 2004, Speaker clustering of speech
utterances using a voice characteristic reference space, Proc.
International Conference on Speech and Language Processing, Jeju Island,
Korea.
-
-
Tsai, W.-H. and Wang, H.-M.: 2006, On maximizing the within-cluster homogeneity
of speaker voice characteristics for speech utterance clustering, Proc.
IEEE International Conference on Acoustics, Speech and Signal Processing,
Toulouse, France.
-
-
Valente, F.: 2006, Infinite models for speaker clustering, Proc.
International Conference on Speech and Language Processing, Pittsburgh, USA.
-
-
Valente, F. and Wellekens, C.: 2004, Variational bayesian speaker clustering,
Speaker Odyssey, Toledo, Spain.
-
-
Valente, F. and Wellekens, C.: 2005, Variational bayesian adaptation for
speaker clustering, Proc. IEEE International Conference on Acoustics,
Speech and Signal Processing, Lisbon, Portugal.
-
-
Valin, J., Rouat, J. and Michaud, F.: 2004, Microphone array post-filter for
separation of simultaneous non-stationary sources, Proc. IEEE
International Conference on Acoustics, Speech and Signal Processing.
-
-
van Leeuwen, D.: 2005, The TNO speaker diarization system for NIST
RT05s for meeting data, NIST 2005 Spring Rich Transcription
Evaluation Workshop, Edinburgh, UK.
-
-
Vandecatseye, A. and Martens, J.-P.: 2003, A fast, accurate and stream-based
speaker segmentation and clustering algorithm, Eurospeech'03, Geneva,
Switzerland, pp. 941-944.
-
-
Vandecatseye, A., Martens, J.-P. et al.: 2004, The COST278 pan-european
broadcast news database, LREC'04, Lisbon, Portugal.
-
-
Veen, B. V. and Buckley, K.: 1988, Beamforming: A versatile approach to spatial
filtering, IEEE Transactions on Acoustics, Speech and Signal Processing.
-
-
Verlinde, P., Chollet, G. and Acheroy, M.: 2000, Multi-modal identity
verification using expert fusion, Information Fusion
1(1), 17-33.
-
-
Vescovi, M., Cettolo, M. and Rizzi, R.: 2003, A DP algorithm for speaker
change detection, Eurospeech'03.
-
-
Video analysis and content extraction for defense intelligence (ARDA-VACE
II): 2006.
URL: http://www.informedia.cs.cmu.edu/arda/vaceII.html
-
-
Wactlar, H., Hauptmann, A. and Witbrock, M.: 1996, News on-demand experiments
in speech recognition, ARPA STL Workshop.
-
-
Wegmann, S., Scattone, F., Carp, I., Gillick, L., Roth, R. and Yamron, J.:
1998, Dragon system's 1997 broadcast news transcription system, DARPA
Broadcast News Transcription and Understanding Workshop, Lansdowne, USA.
-
-
Wiener, N.: 1949, Extrapolation, Interpolation, and Smoothing of
Stationary Time Series, Wiley.
-
-
Wilcox, L., Chen, F., Kimber, D. and Balasubramanian, V.: 1994, Segmentation of
speech using speaker identification, Proc. IEEE International Conference
on Acoustics, Speech and Signal Processing, Vol. 1, Adelaide, Australia,
pp. 161-164.
-
-
Willsky, A. S. and Jones, H. L.: 1976, A generalized likelihood ratio approach
to the detection and estimation of jumps in linear systems, IEEE
Transactions on Automatic Control AC-21(1), 108-112.
-
-
Woodland, P., Gales, M., Pye, D. and Young, S.: 1997, The development of the
1996 HTK broadcast news transcription system, Speech Recognition
Workshop, pp. 73-78.
-
-
Wooters, C., Fung, J., Peskin, B. and Anguera, X.: 2004, Towards robust speaker
segmentation: The ICSI-SRI fall 2004 diarization system, Fall 2004
Rich Transcription Workshop (RT04), Palisades, NY.
-
-
Wu, T., Lu, L., Chen, K. and Zhang, H.-J.: 2003a,
- UBM-based incremental
speaker adaptation, ICME'03, Vol. 2, pp. 721-724.
-
Wu, T., Lu, L., Chen, K. and Zhang, H.-J.: 2003b,
- UBM-based real-time speaker
segmentation for broadcasting news, Proc. IEEE International Conference
on Acoustics, Speech and Signal Processing.
-
Wu, T., Lu, L., Chen, K. and Zhang, H.-J.: 2003c,
- Universal background models
for real-time speaker change detection, International Conference on
Multimedia Modeling.
-
Yamaguchi, M., Yamashita, M. and Matsunaga, S.: 2005, Spectral
cross-correlation features for audio indexing of broadcast news and meetings,
Proc. International Conference on Speech and Language Processing.
-
-
Young, S., Kershaw, D., Odell, J., Ollason, D., Valtchev, V. and Woodland, P.:
2005, The HTK Book, Cambridge University Engineering Department.
-
-
Zdansky, J. and Nouza, J.: 2005, Detection of acoustic change-points in audio
records via global BIC maximization and dynamic programming, Proc.
International Conference on Speech and Language Processing, Lisbon,
Portugal.
-
-
Zelinski, R.: 1988, A microphone array with adaptive post-filtering for noise
reduction in reverberant rooms, Proc. IEEE International Conference on
Acoustics, Speech and Signal Processing, Vol. 5, pp. 2578-2581.
-
-
Zhang, X., Hansen, J. and Rehar, K.: 2004, Speech enhancement based on a
combined multi-channel array with constrained iterative and auditory masked
processing, Proc. IEEE International Conference on Acoustics, Speech and
Signal Processing.
-
-
Zhou, B. and Hansen, J. H.: 2000, Unsupervised audio stream segmentation and
clustering via the bayesian information criterion, Proc. International
Conference on Speech and Language Processing, Vol. 3, Beijing, China,
pp. 714-717.
-
-
Zhu, X., Barras, C., Lamel, L. and Gauvain, J.-L.: 2006, Speaker diarization:
from broadcast news to lectures, NIST 2006 Spring Rich Transcription
Evaluation Workshop, Washington DC, USA.
-
-
Zhu, X., Barras, C., Meignier, S. and Gauvain, J.-L.: 2005, Combining speaker
identification and BIC for speaker diarization, Proc. International
Conference on Speech and Language Processing, Lisbon, Portugal.
-
-
Zochova, P. and Radova, V.: 2005, Modified DISTBIC algorithm for speaker
change detection, Proc. International Conference on Speech and Language
Processing, Lisbon, Portugal.
-