Rich Transcription evaluation datasets
This appendix lists all the data used for the development and
test sets in this thesis. These recordings make up the datasets
used by NIST in the RT evaluations for the conference room
recordings subdomain.
Table B.1 lists the meeting names together with some
relevant information about each meeting. The total duration
column gives the length, in seconds, of the excerpt extracted from
each meeting for use in the evaluation. The effective duration
column gives the length of the speech regions in each meeting,
as indicated by the forced-alignment reference segmentation files.
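As a rough illustration of how an effective duration could be derived from a reference segmentation, the sketch below sums speech regions given as (start, duration) pairs in seconds. The segment format, the names `merge_segments` and `effective_duration`, and the choice to count overlapping speech only once are all assumptions made for this example; they are not necessarily the procedure used to obtain the values in Table B.1.

```python
from typing import Iterable, List, Tuple

# Hypothetical segment format: (start time, duration), both in seconds.
Segment = Tuple[float, float]


def merge_segments(segments: Iterable[Segment]) -> List[Tuple[float, float]]:
    """Merge overlapping or touching segments into (start, end) regions."""
    merged: List[Tuple[float, float]] = []
    for start, duration in sorted(segments):
        end = start + duration
        if merged and start <= merged[-1][1]:
            # Overlaps the previous region: extend it instead of adding a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def effective_duration(segments: Iterable[Segment]) -> float:
    """Total length of the speech regions, counting overlapped speech once (assumption)."""
    return sum(end - start for start, end in merge_segments(segments))


if __name__ == "__main__":
    # Toy reference: two overlapping speaker turns followed by a separate one.
    reference = [(0.0, 2.0), (1.5, 1.0), (5.0, 3.0)]
    print(effective_duration(reference))  # 5.5
```

Dividing the result by the total duration of the excerpt gives the fraction of the excerpt that contains speech, which is one way to compare the two duration columns of the table.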
Table B.1: Summary of datasets used in the experiments
Filename | Dataset | total duration (s) | effective duration (s) | # speakers | # channels |
CMU_20030109-1530 | RT02s | 661.2 | 428.93 | 4 | 1 |
CMU_20030109-1600 | RT02s | 666 | 425.09 | 4 | 1 |
ICSI_20000807-1000 | RT02s | 682.15 | 443.70 | 6 | 6 |
ICSI_20011030-1030 | RT02s | 689.97 | 411.73 | 10 | 6 |
LDC_20011121-1700 | RT02s | 661.5 | 426.51 | 3 | 10 |
LDC_20011207-1800 | RT02s | 697.4 | 413.21 | 3 | 4 |
NIST_20030623-1409 | RT02s | 674 | 423.42 | 6 | 7 |
NIST_20030925-1517 | RT02s | 662.07 | 336.72 | 4 | 7 |
CMU_20020319-1400 | RT04s | 602.09 | 274.91 | 6 | 1 |
CMU_20020320-1500 | RT04s | 503.63 | 259.27 | 4 | 1 |
ICSI_20010208-1430 | RT04s | 599.85 | 369.66 | 7 | 6 |
ICSI_20010322-1450 | RT04s | 607.53 | 385.61 | 7 | 6 |
LDC_20011116-1400 | RT04s | 601.7 | 411.69 | 3 | 8 |
LDC_20011116-1500 | RT04s | 601.5 | 340.50 | 3 | 8 |
NIST_20020214-1148 | RT04s | 612.31 | 303.08 | 6 | 7 |
NIST_20020305-1007 | RT04s | 616.67 | 386.41 | 7 | 7 |
AMI_20041210-1052 | RT05s | 730.802 | 474.97 | 4 | 12 |
AMI_20050204-1206 | RT05s | 714.385 | 408.56 | 4 | 16 |
CMU_20050228-1615 | RT05s | 721.5 | 428.87 | 4 | 7 |
CMU_20050301-1415 | RT05s | 718.479 | 418.79 | 4 | 3 |
ICSI_20010531-1030 | RT05s | 731.033 | 442.07 | 7 | 6 |
ICSI_20011113-1100 | RT05s | 719.65 | 448.77 | 9 | 6 |
NIST_20050412-1303 | RT05s | 727.018 | 352.76 | 10 | 7 |
NIST_20050427-0939 | RT05s | 715.65 | 431.06 | 4 | 7 |
VT_20050304-1300 | RT05s | 718.968 | 511.54 | 5 | 2 |
VT_20050318-1430 | RT05s | 724.619 | 311.78 | 5 | 2 |
CMU_20050912-0900 | RT06s | 1071.191 | 686.02 | 4 | 2 |
CMU_20050914-0900 | RT06s | 1078.913 | 626.60 | 4 | 2 |
EDI_20050216-1051 | RT06s | 1080.065 | 578.00 | 4 | 16 |
EDI_20050218-0900 | RT06s | 1090.262 | 604.43 | 4 | 16 |
NIST_20051024-0930 | RT06s | 1089.009 | 680.43 | 9 | 7 |
NIST_20051102-1323 | RT06s | 1086.517 | 673.01 | 8 | 7 |
VT_20050623-1400 | RT06s | 1082.2 | 509.85 | 5 | 4 |
VT_20051027-1400 | RT06s | 1065.376 | 511.06 | 4 | 4 |