The NIST file format: waveforms (.wv1, .wv2)

Each waveform file has a SPHERE header. The samples are digitised, zero-meaned (using NIST's bias), and compressed (using NIST's transparent shorten algorithm, w_encode).

The filename extension for each waveform consists of the characters "wv" followed by a 1-character code identifying the channel (for example, .wv1 or .wv2). The 1024-byte NIST header for each waveform contains the following fields and types:

Field                    Type     Description (probable defaults in parentheses)
speaker_id               string   3-char. speaker ID from filename
speaking_mode            string   speaking mode ("read-common",
                                  "read-adaptation")  
recording_site           string   recording site
recording_date           string   date stamp for the start of recording, of
                                  the form DD-MMM-YYYY.
recording_time           string   time stamp for the start of recording, of
                                  the form HH:MM:SS.HH.
recording_environment    string   text description of recording environment
microphone               string   microphone description
utterance_id             string   utterance ID from filename of the form
                                  SSSTEEUU as described in the filenames 
                                  section above.
prompt_id                string   WSJ source sentence text ID - see .ptx
                                  description below for format
database_id              string   database (corpus) identifier
database_version         string   database (corpus) revision ("1.0")
channel_count            integer  number of channels in waveform ("1")
speaker_session_number   string   2-char. base-36 session ID from filename
sample_count             integer  number of samples in waveform
sample_max               integer  maximum sample value in waveform
sample_min               integer  minimum sample value in waveform
sample_rate              integer  waveform sampling rate ("16000")
sample_n_bytes           integer  number of bytes per sample ("2")
sample_byte_format       string   byte order (MSB/LSB -> "10",
                                  LSB/MSB -> "01")
sample_coding            string   waveform encoding
sample_checksum          integer  checksum: the sum of all (uncompressed)
                                  samples accumulated in an unsigned 16-bit
                                  (short) integer, discarding overflow.
sample_sig_bits          integer  number of significant bits in each sample
                                  ("16")
session_utterance_number string   2-char. base-36 utterance number within 
                                  session from the filename
end_head                 none     end of header identifier
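
The fields above can be read programmatically. The following Python sketch is illustrative only: it assumes the usual SPHERE header layout (a first line "NIST_1A", a second line giving the header size in bytes, then one "name -type value" line per field, ending at "end_head"), and the function names parse_sphere_header and sample_checksum are hypothetical rather than part of any NIST tool. The checksum function expects samples that have already been decompressed, since the stored checksum is defined over the uncompressed samples.

```python
import struct


def parse_sphere_header(raw: bytes):
    """Parse a SPHERE text header; return (header_size, fields).

    Assumes the layout "NIST_1A", header size, then "name -type value"
    lines terminated by "end_head".
    """
    lines = raw.decode("ascii", errors="replace").split("\n")
    if lines[0].strip() != "NIST_1A":
        raise ValueError("not a SPHERE header")
    header_size = int(lines[1])  # total header length in bytes, e.g. 1024
    fields = {}
    for line in lines[2:]:
        line = line.strip()
        if line == "end_head":
            break
        if not line:
            continue
        name, type_code, value = line.split(None, 2)
        if type_code == "-i":      # integer field
            fields[name] = int(value)
        elif type_code == "-r":    # real-valued field
            fields[name] = float(value)
        else:                      # "-sN": string of N characters
            fields[name] = value[: int(type_code[2:])]
    return header_size, fields


def sample_checksum(pcm: bytes, byte_format: str = "01") -> int:
    """Sum 16-bit samples into an unsigned 16-bit value, discarding overflow.

    byte_format follows sample_byte_format: "01" is LSB first, "10" is MSB first.
    """
    order = "<" if byte_format == "01" else ">"
    count = len(pcm) // 2
    samples = struct.unpack(f"{order}{count}h", pcm[: count * 2])
    return sum(samples) & 0xFFFF
```

A caller would typically read the first 1024 bytes of a .wv1 or .wv2 file, pass them to parse_sphere_header, decompress the remainder, and then compare sample_checksum of the decoded samples against the header's sample_checksum field.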

Tue Jan 17 18:52:43 GMT 1995