The waveforms have SPHERE headers and are digitised, zero-meaned (using NIST's bias), and compressed (using NIST's transparent shorten algorithm, w_encode).
The filename extension for the waveforms consists of the characters "wv" followed by a 1-character code that identifies the channel. The 1024-byte NIST header for each waveform contains the following fields/types:
Field                     Type         Description - Probable defaults marked ()
------------------------  -----------  ------------------------------------------
speaker_id                string       3-char. speaker ID from filename
speaking_mode             string       speaking mode ("read-common",
                                       "read-adaptation")
recording_site            string       recording site
recording_date            string       beginning of recording date stamp of the
                                       form DD-MMM-YYYY.
recording_time            -s11 string  beginning of recording time stamp of the
                                       form HH:MM:SS.HH.
recording_environment     string       text description of recording environment
microphone                string       microphone description
utterance_id              string       utterance ID from filename of the form
                                       SSSTEEUU as described in the filenames
                                       section above.
prompt_id                 string       WSJ source sentence text ID - see .ptx
                                       description below for format
database_id               string       database (corpus) identifier
database_version          string       database (corpus) revision ("1.0")
channel_count             integer      number of channels in waveform ("1")
speaker_session_number    string       2-char. base-36 session ID from filename
sample_count              integer      number of samples in waveform
sample_max                integer      maximum sample value in waveform
sample_min                integer      minimum sample value in waveform
sample_rate               integer      waveform sampling rate ("16000")
sample_n_bytes            integer      number of bytes per sample ("2")
sample_byte_format        string       byte order (MSB/LSB -> "10",
                                       LSB/MSB -> "01")
sample_coding             string       waveform encoding
sample_checksum           integer      checksum obtained by the addition of all
                                       (uncompressed) samples into an unsigned
                                       16-bit (short) and discarding overflow.
sample_sig_bits           integer      number of significant bits in each
                                       sample ("16")
session_utterance_number  string       2-char. base-36 utterance number within
                                       session from the filename
end_head                  none         end of header identifier
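
The header fields and the sample_checksum rule above can be illustrated with a short Python sketch. It is not part of the corpus tools, and it rests on assumptions that this section does not state: that the header follows the usual SPHERE text layout (a "NIST_1A" line, a header-size line, then one "name -type value" line per field, with -i for integers, -r for reals and -sN for N-character strings, ending at "end_head"), and that the samples handed to the checksum routine have already been decompressed to raw 16-bit PCM. The function names parse_sphere_header and sample_checksum are hypothetical.

```python
import struct


def parse_sphere_header(raw: bytes) -> dict:
    """Parse SPHERE header bytes into a dict of field name -> value.

    Assumed layout (not taken from this document): line 1 is "NIST_1A",
    line 2 is the header size, then "name -type value" lines up to "end_head".
    """
    lines = raw.decode("ascii", errors="replace").split("\n")
    if lines[0].strip() != "NIST_1A":
        raise ValueError("missing NIST_1A magic line")
    fields = {}
    for line in lines[2:]:
        line = line.strip()
        if line == "end_head":
            break
        parts = line.split(None, 2)  # name, type code, rest of line
        if len(parts) < 3:
            continue
        name, type_code, value = parts
        if type_code == "-i":
            fields[name] = int(value)
        elif type_code == "-r":
            fields[name] = float(value)
        else:  # "-sN": keep the string value as written
            fields[name] = value
    return fields


def sample_checksum(pcm: bytes, byte_format: str = "01") -> int:
    """Add all 16-bit samples into an unsigned 16-bit value, discarding overflow.

    byte_format follows the table above: "01" is LSB first, "10" is MSB first.
    pcm must be uncompressed 2-byte samples.
    """
    order = "<" if byte_format == "01" else ">"
    count = len(pcm) // 2
    samples = struct.unpack(f"{order}{count}h", pcm[: 2 * count])
    # Masking the signed sum to 16 bits is equivalent to unsigned wraparound.
    return sum(samples) & 0xFFFF
```

A typical check would read the first 1024 bytes of a file, parse them, decompress the remainder with the NIST tools, and compare sample_checksum(pcm, header["sample_byte_format"]) against header["sample_checksum"].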