phono [sound] # lyze [cut]

Phonolyze™ cuts speech recordings into sounds.

Background

Segmenting audio into phones or words by hand is a time-consuming and painstaking process. It is required for a variety of applications, such as acoustic analysis studies in academia or building speech synthesis databases for high-quality speech output systems. Paid staff time for this work can quickly become prohibitively expensive, even for relatively small audio databases (although the rare graduate student might perhaps be inspired to do it at no cost).

By using the Phonolyze™ system to segment audio automatically into phones and (if a transcript is available) words, much of this painstaking work can be bypassed. What remains is the faster task of checking the output and making minor corrections, depending on the level of precision required.

Documentation

Please read the manual page, which documents the command-line interface to the phonolyze program.

Requirements

  • SNR and Language: For Phonolyze™ to achieve optimum accuracy, the signal-to-noise ratio must be adequate (30 dB), and the speech must be in a supported language (US English comes preinstalled; add-on modules for Spanish, UK English, German, and Japanese are also available). If you need to process a different language, it's worth a try; sometimes the results are better than the alternative.

  • Platform: Phonolyze is supported on Mandrake Linux 8.x, FreeBSD 4.x, Solaris 4.1.3, and Windows 2000. Support for other versions of these platforms is intended and likely but not guaranteed.
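
To get a rough idea of whether a recording is near the 30 dB signal-to-noise figure mentioned above, one common approach is to compare the average power of a stretch of speech with that of a stretch of background noise from the same recording. The sketch below is only an illustration of that arithmetic. It is not how Phonolyze itself measures SNR, and the function names mean_power and snr_db are hypothetical.

```python
import math


def mean_power(samples):
    """Return the mean squared amplitude of a sequence of PCM samples."""
    if not samples:
        raise ValueError("no samples given")
    return sum(s * s for s in samples) / len(samples)


def snr_db(speech_samples, noise_samples):
    """Estimate SNR in dB as 10 * log10(speech power / noise power).

    Assumption: the caller has already picked one region containing
    speech and one containing only background noise.
    """
    noise = mean_power(noise_samples)
    speech = mean_power(speech_samples)
    if noise == 0:
        return math.inf   # perfectly silent background
    if speech == 0:
        return -math.inf  # no signal at all
    return 10.0 * math.log10(speech / noise)
```

A result well below 30 for typical regions of a recording would suggest that accuracy may suffer, although the estimate depends heavily on which regions are chosen.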

What next

Please read our on-line documentation above. If you would like a Phonolyze™ demonstration using your own audio data before purchasing the product, please contact us to discuss your application and to request a demonstration and a quotation. For the demonstration, we will ask you for a tarball of audio samples (16000 Hz, 16-bit linear PCM audio in raw or .wav format). We will then generate phone- and/or word-level segmentations or transcriptions from your data, so that you can compare them against an independent segmentation of the same data.
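
Before packing a tarball, it may help to confirm that each .wav file matches the format requested above (16000 Hz, 16-bit linear PCM). The following is a minimal sketch using Python's standard wave module. The function name check_wav_format and the constant names are hypothetical, and the check covers only .wav headers, since raw files carry no header to inspect.

```python
import wave

EXPECTED_RATE_HZ = 16000          # sample rate requested for demonstrations
EXPECTED_SAMPLE_WIDTH_BYTES = 2   # 16-bit samples


def check_wav_format(path):
    """Return a list of format problems; an empty list means the header matches.

    The wave module reads only uncompressed PCM files and raises
    wave.Error for anything else, so compression is not checked here.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
    if rate != EXPECTED_RATE_HZ:
        problems.append(f"sample rate is {rate} Hz, expected {EXPECTED_RATE_HZ} Hz")
    if width != EXPECTED_SAMPLE_WIDTH_BYTES:
        problems.append(f"sample width is {8 * width} bits, expected 16 bits")
    return problems
```

Calling check_wav_format on each file and printing any non-empty result gives a quick list of recordings that would need resampling or conversion first.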

Thank you very much

Copyright © 1996-2005 Sprex, Inc. All rights reserved.
Date: November 20, 2008