The whole table is extracted from [1] except where noted otherwise.
Year | Common Condition(s) | Evaluation Features
1997 | Two-handset training (involving two sessions [2]); different-number tests; 30-second durations. | Tests of 3 durations, 3 training conditions; Switchboard-2 Phase 1 data.
1998 | One-handset training (involving two sessions [2]); same-number tests; 30-second durations. | Tests of 3 durations, 3 training conditions; Switchboard-2 Phase 2 data; handset type detector info made available.
1999 | One-handset training; different-number electret tests; 15-45 second duration tests. | Added multi-speaker tasks; variable durations used in main test trials; Switchboard-2 Phase 3 data.
2000 | One-session training; different-number electret tests; 15-45 second duration tests. | Re-segmented 1997 & 1998 test data for reuse; extra test on AHUMADA Spanish data.
2001 | One-session training; different-number electret tests; 15-45 second duration tests. | Repeated 2000 main test with added trials; additional test on Switchboard cellular data; additional test allowing human or machine transcripts with extended training data.
2002 | One-session training on conv. phone data | Cellular data, alternative tests of extended training, speaker segmentation, and a limited corpus of simulated forensic data
2003 | One-session training on conv. phone data | Cellular data, extended training
2004 | Handheld landline conv. phone speech, English only | Multi-language data with bilingual speakers
2005 | English only with handheld tel. set | Included cross-channel trials with mic. test, both sides of 2-channel convs. provided
2006 | English-only trials (including mic. test trials) | Included cross-channel trials with mic. test
2008 | 8 conditions contrasting English and bilingual speakers, interview and conv. phone speech, along with cross-condition trials | Interview speech recorded over multiple mic channels and conv. phone speech recorded over mic and tel channels, multiple languages
2010 | 9 conditions contrasting tel and mic channels, interview and conv. phone speech, and high, low, and normal vocal effort | Multiple microphones, phone calls with high, low, and normal vocal effort, aging data (Greybeard), HASR
2012 | 5 conditions: interview test without noise, conv. phone test without noise, interview test with added noise, conv. phone test with added noise, conv. phone test collected in noisy environment | Target speakers specified in advance (from previous evals) with large amounts of training, some test calls collected in noisy environments, phone test data with added noise
References:
[1] www.odyssey2012.org/html/doc/martin_oddyssey12_pres.pptx
[2] Doddington, G.R., Przybocki, M.A., Martin, A.F., and Reynolds, D.A. NIST Speaker Recognition Evaluation: Overview, Methodology, Systems, Results, Perspective (Invited Paper), Speech Communication, September 2000.