Thursday, April 4, 2013

NIST SRE 1997-2012 Quick Overview

The whole table is taken from [1], except where noted otherwise.

| Year | Common Condition(s) | Evaluation Features |
|------|---------------------|---------------------|
| 1997 | Two-handset training (involving two sessions [2]); different-number tests; 30-second durations | Tests of 3 durations, 3 training conditions; Switchboard-2 Phase 1 data |
| 1998 | One-handset training (involving two sessions [2]); same-number tests; 30-second durations | Tests of 3 durations, 3 training conditions; Switchboard-2 Phase 2 data; handset type detector info made available |
| 1999 | One-handset training; different-number electret tests; 15-45 second test durations | Added multi-speaker tasks; variable durations used in main test trials; Switchboard-2 Phase 3 data |
| 2000 | One-session training; different-number electret tests; 15-45 second test durations | Re-segmented 1997 and 1998 test data for reuse; extra test on AHUMADA Spanish data |
| 2001 | One-session training; different-number electret tests; 15-45 second test durations | Repeated 2000 main test with added trials; additional test on Switchboard cellular data; additional test allowing human or machine transcripts with extended training data |
| 2002 | One-session training on conv. phone data | Cellular data; alternative tests of extended training, speaker segmentation, and a limited corpus of simulated forensic data |
| 2003 | One-session training on conv. phone data | Cellular data; extended training |
| 2004 | Handheld landline conv. phone speech, English only | Multi-language data with bilingual speakers |
| 2005 | English only with handheld tel. set | Included cross-channel trials with mic. test; both sides of 2-channel convs. provided |
| 2006 | English-only trials (including mic. test trials) | Included cross-channel trials with mic. test |
| 2008 | 8: contrasting English and bilingual speakers, interview and conv. phone speech, along with cross-condition trials | Interview speech recorded over multiple mic channels and conv. phone speech recorded over mic and tel channels; multiple languages |
| 2010 | 9: contrasting tel and mic channels, interview and conv. phone speech, and high, low, and normal vocal effort | Multiple microphones; phone calls with high, low, and normal vocal effort; aging data (Greybeard); HASR |
| 2012 | 5: interview test without noise, conv. phone test without noise, interview test with added noise, conv. phone test with added noise, conv. phone test collected in noisy environment | Target speakers specified in advance (from previous evals) with large amounts of training data; some test calls collected in noisy environments; phone test data with added noise |


[2] Doddington, G.R., Przybocki, M., Martin, A.F., and Reynolds, D.A. NIST Speaker Recognition Evaluation: Overview, Methodology, Systems, Results, Perspective (Invited Paper). Speech Communication, September 2000.
