Speech Disorders as Indicators of Potential for Lyrical Success—Ozzie Tchomzkij SpecGram Vol CLI, No 2 Contents Re-Rating the World’s Languages—Waxaklahun Ubah K’awil and José Felipe Hernandez y Fernandez

Generative Speech Recognition:
A competence model of ASR

by Stanislaus Gorky

Speech recognition has long posed a problem for speech scientists, phoneticians, and commercial speech researchers. In this paper I make observations on new developments in the generative program, which has great contributions to make towards speech recognition. My investigations on this matter are based on the foundational remarks on ASR made by Chomsky & Ladefoged (1968), to wit:

The development of a speech recognition system SRS has fundamentally to do with the study of any language L spoken by a human H at time T. (1968: 16)

Here Chomsky clearly anticipates the link between the generative program and speech recognition technology that has recently been confirmed by speech researchers. Fundamental to Chomsky’s thought on this matter, however, is the distinction between competence and performance in speech recognition.

“Recognizing a word W involves any number of complex factors C1...i: interspeaker variability, speech rate, regional difference in accent, memory capacity, the speaker’s state of mind, and many other factors related to performance. Crucially, none of these other factors has to do with the speaker’s intention IS, i.e. the expression of the speaker’s thought TS. As such, all performance factors should be excluded from consideration, until such time as the competence has been studied thoroughly.” (1968: 345, fn. 2)

Following the recent developments of the minimalist program, we recognize that every successful word recognition involves a pairing of a sound image I and a lexical meaning L. Well-formed <I, L> pairs are said to converge in derivation, while ill-formed sound-meaning pairs crash. Strong acoustic features must be checked before Spell-Out, while weak acoustic features can be checked after Spell-Out, before the lexical form (LF), of course.

The distinction between strong and weak acoustic features accounts for typological variation in speech recognition parametrically. The weak vowel features of English are not checked early in the derivation, leading to the perception of schwa in weak positions. Contrast with Spanish, where vowel features must be checked before Spell-Out, regardless of their prosodic status. Given this typological result, it is not unreasonable to expect the minimalist program to provide definitive answers to all cases of vowel reduction, at all times, and in all places (with some idealization of the data). These minimalist insights into vowel reduction can be applied directly to work in ASR.

The model was tested with the benchmark TIMIT corpus. I and several colleagues first observed the well-formed <I, L> pairs in the training data, and then intuited the response of the recognition grammar for the test data. Preliminary results indicate a robust model. My intuitions indicate that competence-based ASR can achieve in excess of 99% success in recognizing words spoken in the benchmark TIMIT corpus. Several other colleagues have reported similar intuitions. (Idiolectal variation accounts for variable estimates of competence ASR efficacy; intuitions range from 98% to 100%, σ 1.3%.) Even when we imagined the presence of noise in the data, the drop-off in performance was modest (97-99%, σ 1.2%); similar results were obtained when the recordings were imagined over a noisy com system.

These results show the clear advantage in ASR of disregarding interspeaker variability, speaker emotional status, speech rate, and other performance factors. When freed from concerns that are ultimately non-linguistic, our speech recognition system achieves results unparalleled in other research programs.
