SimPhon.Net

Usage-based approaches, computational modeling and
simulation studies in phonetics and phonology

Topics and Research Questions

Psycholinguistic/cognitive models; Neural models in speech production

  • How should a person’s individual differences (IDs, including attention and memory capacity) be built into models of the perception and production of acoustic detail?
  • How can we model the role attention and memory abilities play in the formation of accurate acoustic-phonetic representations in exemplars or exemplar clouds?
  • What potential do recurrent neural networks (reservoir computing) have for predicting prosodic parameters in speech production?
  • How can these networks be evaluated, e.g., by integrating them into a speech synthesis platform?

Exemplar-theoretic models and alternative approaches

  • How can exemplar-theoretic models be tested in the laboratory or on large speech corpora, and how can specific empirical findings feed back into computational models?
  • To what extent are episodic representations stored and transferred into long-term memory?
  • When does abstraction or generalization happen in an exemplar-theoretic model?

Biomechanical/aeroacoustic models; Prosody

  • What level of detail do self-oscillating vocal fold models require to reproduce a given phonatory effect, and how can these requirements be evaluated?
  • How do such models compare to the glottal source models used in parametric speech synthesis?
  • Can the emergence of prosodic categories be modeled in a production-perception loop using optimization or genetic algorithms?
  • Is attention modulation a useful concept in modeling the emergence of prosodic (prominence) structure?
  • Can major prosodic constituents be modeled as emerging from lower-level prosodic constituents and morphological boundaries?

Speech segmentation models

  • How is segmentation of new languages biased by native language knowledge?
  • What level of abstraction is optimal for exploiting transition probabilities, e.g., individual phones vs. broad classes or syllables?
  • How are multiple segmentation cues integrated?
  • How can the above questions be addressed in a machine learning framework? In particular, to what extent does the language of the training material influence segmentation and optimal feature selection?
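
As an illustration of the transition-probability cue raised in the questions above, the following is a minimal Python sketch, not a model proposed by the network. It estimates forward transition probabilities from bigram counts and places a boundary wherever the probability dips to a local minimum. The toy syllable stream, the function names `transition_probs` and `segment`, and the local-minimum rule are all assumptions made for this example.

```python
from collections import Counter


def transition_probs(stream):
    """Forward transition probability P(next | current) from bigram counts."""
    # The last unit has no successor, so it is excluded from the denominator.
    unit_counts = Counter(stream[:-1])
    pair_counts = Counter(zip(stream, stream[1:]))
    return {pair: n / unit_counts[pair[0]] for pair, n in pair_counts.items()}


def segment(stream, tp):
    """Split the stream wherever the transition probability is a local minimum."""
    tps = [tp[pair] for pair in zip(stream, stream[1:])]
    chunks, current = [], [stream[0]]
    for i, value in enumerate(tps):
        left = tps[i - 1] if i > 0 else float("inf")
        right = tps[i + 1] if i + 1 < len(tps) else float("inf")
        if value < left and value < right:
            # Assumed boundary rule: a dip relative to both neighbours.
            chunks.append(current)
            current = []
        current.append(stream[i + 1])
    chunks.append(current)
    return chunks


# Hypothetical toy lexicon of three trisyllabic "words" (illustrative only).
lexicon = {
    "A": ["bi", "da", "ku"],
    "B": ["pa", "do", "ti"],
    "C": ["go", "la", "bu"],
}
# Unsegmented stream: the words concatenated in a fixed order.
stream = [syl for word in "ABCACBA" for syl in lexicon[word]]

tp = transition_probs(stream)
words = ["".join(chunk) for chunk in segment(stream, tp)]
print(words)
```

Because both functions accept any sequence of hashable units, the same sketch can be run on phone labels, broad-class labels, or syllables, which is one simple way to compare levels of abstraction on the same material.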

Model comparison and integration

  • What can given computational models explain, and where do they fail?
  • How does naïve discriminative learning compare to exemplar theory in modeling speech production and perception?
  • How can computational models be integrated so that they cover a broader range of phenomena or processes than the individual models do?
  • How can the behavior of physiological models inform higher-level neural network models of speech production?
  • How can models of speech segmentation and prosody be integrated?