WorldCat Identities

Laprie, Yves

Works: 33 works in 35 publications in 2 languages and 49 library holdings
Roles: Other, Thesis advisor, Editor, Author, Opponent
Most widely held works by Yves Laprie
Proceedings of the 8th international Seminar on speech production by International Seminar on Speech Production (Book)

1 edition published in 2008 in English and held by 3 WorldCat member libraries worldwide

What makes 'mama' and 'papa' acceptable? Experiments with a replica of von Kempelen's speaking machine by Fabian Brackhane

1 edition published in 2015 in English and held by 2 WorldCat member libraries worldwide

Designing a bilingual speech corpus for French and German language learners by Jürgen Trouvain

1 edition published in 2017 in English and held by 2 WorldCat member libraries worldwide

Acoustic Evaluation of Simplifying Hypotheses Used in Articulatory Synthesis by Ioannis Douros

1 edition published in 2019 in English and held by 2 WorldCat member libraries worldwide

COSMO : un modèle bayésien des interactions sensori-motrices dans la perception de la parole by Raphael Laurent

1 edition published in 2014 in French and held by 2 WorldCat member libraries worldwide

While speech communication is a faculty that seems natural, a lot remains to be understood about the nature of the cognitive representations and processes that are involved. Central to this PhD research is the study of interactions between perception and action during the production or perception of syllables. We choose Bayesian Programming as a rigorous framework within which we provide a mathematical definition of the COSMO model ("Communicating Objects using Sensori-Motor Operations"), which allows us to formalize motor, auditory and perceptuo-motor theories of speech communication and to study them quantitatively. This approach first leads to a strong theoretical result: we prove an indistinguishability theorem, according to which, given some ideal learning conditions, motor and auditory theories make identical predictions for perception tasks, and therefore cannot be distinguished empirically. To depart from these conditions, we introduce an original "learning by accommodation" algorithm, which makes it possible to adapt to the ambient acoustic environment as well as to develop idiosyncrasies. This algorithm, which learns by mimicking acoustic targets, makes it possible to acquire motor skills from acoustic inputs only, with the remarkable property of focusing its learning on the adequate regions. We use syllables synthesized by a vocal tract model (VLAM) to analyse how the different models evolve through learning and how robust they are to degradations.
The IFCASL Corpus of French and German Non-native and Native Read Speech by Jürgen Trouvain

1 edition published in 2017 in English and held by 2 WorldCat member libraries worldwide

Production des consonnes plosives du français : du contrôle des bruits de plosion by Thibault Cattelain

1 edition published in 2019 in French and held by 2 WorldCat member libraries worldwide

Stop consonants (/p/, /b/, etc.) are of particular interest for the understanding of speech motor control. Indeed, the production of these stop consonants requires the coordination of the three production levels: breathing, vocal fold vibration and articulation. The main goal of my thesis is to study how respiratory, laryngeal and articulatory gestures coordinate to control the variation of acoustic features of stop consonants, especially of their bursts (intensity, duration, spectrum), which are crucial for stop consonant intelligibility. An important part of my thesis work also focuses on the muscular control of lip gestures in the production of bilabial stops. These goals required preliminary methodological work to compare, develop and implement different techniques for measuring and estimating the articulatory effort of speech production, physiologically and mechanically (lip movement kinematics, force sensors, orofacial electromyography). This methodological exploration has given rise to the acquisition of a large database (acoustic and physiological data) of French stop consonant productions, for twenty healthy speakers, including 2 phonation modes (modal and whispered), 2 speech rates (normal and fast) and several levels of articulatory effort. The analysis of this database has confirmed relationships already established in conversational speech between burst intensity and the maximum of intra-oral pressure (or opening velocity of the lips for labial stops), and between spectral features of the burst and articulatory parameters of tongue movements for alveolar and velar stops.
Other, new relationships have been observed in conversational speech: 1- the burst acoustic intensity increases when the lip compression and opening velocity increase (for labial stop consonants); 2- the burst acoustic intensity increases when the tangential elevation velocity of the tongue increases (for palatal stop consonants); 3- the lip compression and the lip opening and closing velocities significantly increase when the activities of the OOS (Superior Orbicularis Oris) and DLI (Depressor of the Inferior Lip) muscles increase (during the movement phase where the muscles are agonists). These relationships depend on phonation quality (in whispered speech the emphasis is on kinematic parameters at the expense of aerodynamic, articulatory and temporal ones) and on speech rate (most physiological and articulatory parameters lose efficacy for acoustic control when speech rate increases).
Contribution expérimentale et théorique à l'analyse et la modélisation de la vibration des cordes vocales by Anne Bouvet

1 edition published in 2019 in English and held by 2 WorldCat member libraries worldwide

The human voice is generated by vocal fold auto-oscillation, which results from the interaction between the air flow coming from the lungs and the elastic structure of the vocal folds. The purpose of this thesis is to carry out an experimental and theoretical study in order to improve the understanding and modelling of this phenomenon and some of its perturbations. Firstly, the MSePGG algorithm is proposed for the calibration of a non-invasive device for in vivo glottal area measurements. The algorithm is validated on mechanical replicas and illustrated for measurements on human speakers. Secondly, the vocal folds are covered by a thin layer of liquid, essential for phonation. An experimental approach is proposed to systematically study the influence of the presence of liquid on vocal fold replicas. Water spraying is shown to impact basic voice parameters as well as their perturbation. A simplified theoretical flow model accounting for the presence of both air and water is proposed and validated. Thirdly, the effect of vertical vocal fold angular asymmetry, as occurring in the case of unilateral vocal fold paralysis, on the fluid-structure interaction is experimentally assessed. It is found that loss of full vocal fold contact leads to important variation in phonation features and their perturbations. A simple theoretical model is shown to fit the increase of the auto-oscillation onset threshold pressure. For future clinical applications, the results obtained suggest further development of the MSePGG device and illustrate the multiple potential causes of voice perturbation.
Rôle des relations perception-action dans la communication parlée et l'émergence des systèmes phonologiques : étude, modélisation computationnelle et simulations by Clément Moulin-Frier

1 edition published in 2011 in French and held by 2 WorldCat member libraries worldwide

While the question of the origin of language remains difficult to approach, that of the origin of the forms of language seems more amenable to experimental investigation. Despite their infinite variety, these forms show obvious regularities: the universals of language. We study them through more general reasoning about the emergence of language, in particular the search for ontogenetic and phylogenetic precursors. We address three main themes: the situation of spoken communication, the cognitive architectures of agents, and the emergence of language universals in societies of agents. Our first contribution is a conceptual model of communicating agents in interaction, derived from our review of the literature. We then propose a Bayesian mathematical formalization of it: the model of an agent is a probability distribution, and production and perception are Bayesian inferences. This allows a formal comparison of the different theoretical currents in speech perception and production. Finally, our computer simulations of societies of agents identify the conditions that favor the appearance of language universals.
Techniques d'analyse et de synthèse de la parole appliquées à l'apprentissage des langues by Vincent Colotte (Book)

2 editions published in 2002 in French and held by 2 WorldCat member libraries worldwide

Nowadays, as exchanges between people become more and more international, command of a foreign language is becoming essential, and computer-assisted language learning is emerging as a new challenge. In particular, improving oral comprehension is one of the keys to mastering a language. To improve intelligibility, I develop a first strategy based on selective slowing down of the speech signal. The transitory parts, regions with a high concentration of acoustic cues, turn out to be prime candidates for slowing down. The detection of these regions is based on the computation of a coefficient that reflects the rate of spectral variation. I develop a second strategy that enhances the relevant events of speech, i.e. those whose amplification improves intelligibility. This strategy is based on the preservation of phonetic contrasts, in particular between voiced and unvoiced consonants. To this end, I developed an algorithm that detects unvoiced plosives and unvoiced fricatives using energy criteria. Two perception experiments were carried out to validate these intelligibility-improvement strategies: the first, a preliminary one, with French listeners on American English sentences, and the second with foreign students (learning French as a foreign language) on French sentences. Finally, to modify the prosodic elements (rhythm, intensity, fundamental frequency), my work was based on the PSOLA method (Pitch Synchronous OverLap and Add). I develop a pitch-marking algorithm and improve the accuracy of the synthesis method. These strategies are fully automatic and improve the intelligibility of the speech signal in the context of language learning.
Uncontrolled manifolds et réflexes à courte latence dans le contrôle moteur de la parole : une étude de modélisation by Andrew Szabados

1 edition published in 2017 in English and held by 2 WorldCat member libraries worldwide

This work makes use of a biomechanical model of speech production as a reference subject to address several phenomena related to the adaptability and stability of speech motor control, namely motor equivalence and postural stability. The first part of this thesis is related to the phenomenon of motor equivalence. Motor equivalence is a key feature of speech motor control, since speakers must constantly adapt to various phonetic contexts and speaking conditions. The Uncontrolled Manifold (UCM) idea offers a theoretical framework for considering motor equivalence in which coordination among motor control variables is separated into two subspaces, one in which changes in control variables modify the output and another one in which these changes do not influence the output. This concept is developed and investigated for speech production using a 2D biomechanical model. First, a representation of the linearized UCM based on orthogonal projection matrices is proposed. The UCMs of various vocal tract configurations of the 10 French oral vowels are then characterized using their command perturbation responses. It is then investigated whether each phonetic class, such as phonemes, front/back vowels or rounded/unrounded vowels, can be characterized by a unique UCM, or whether the UCMs vary significantly across representatives of these different classes. It was found that linearized UCMs, especially those computed specifically for each configuration but also those computed across many of the phonetic classes, allow for an effective command perturbation response. This suggests that similar motor equivalence strategies can be implemented within each of these classes and that UCMs provide a valid characterization of an equivalence strategy.
Further work is suggested to determine which classes might be used in practice. The second part addresses the question of the degree to which postural control of the tongue is accomplished through passive mechanisms, such as the mechanical and elastic properties of the tongue itself, or through short-latency reflexes, such as the stretch reflex. A specific external force perturbation was applied to the 2D biomechanical model, namely one in which the tongue is pulled anteriorly using a specific force profile exerted on the tongue body by a force effector attached to the superior part of the tongue blade. Simulation results were compared to experimental data collected at Gipsa-lab under similar conditions. This perturbation was simulated with various values of the model parameter modulating the reflex strength (feedback gain). The results showed that a perturbation rebound seen in simulated data is due to a reflex mechanism. Since a compatible rebound is seen in data from human subjects, this can be taken as evidence of a reflex mechanism being involved in postural stability of the tongue. The time course of the mechanisms of this reflex, including the generation of force and the movement of the tongue, was analyzed, and it was determined that the precision of the model was insufficient to draw any conclusions on the origin of this reflex (whether cortical or brainstem). Still, numerous experimental directions are proposed.
Glottal Opening Measurements in VCV and VCCV Sequences by Yves Laprie

1 edition published in 2019 in English and held by 2 WorldCat member libraries worldwide

On the assessment of computer-assisted pronunciation training tools by Jürgen Trouvain

1 edition published in 2017 in English and held by 2 WorldCat member libraries worldwide

Outils, travaux et propositions pour le décodage acoustico-phonétique by Yves Laprie (Book)

2 editions published in 1990 in French and held by 2 WorldCat member libraries worldwide

This thesis is devoted to the expert approach to acoustic-phonetic decoding of continuous speech. We first describe the software environment available to the researcher. We developed a software package, Snorri, which provides the standard tools for recording and playing back speech signals and for computing and displaying spectrograms, as well as tools specifically designed for exploring speech corpora. We then describe two algorithms for formant tracking. The first builds tracks from cepstral data or from data derived from linear predictive coding; the second interprets the results of the first in terms of formants. Finally, we propose a new approach to acoustic-phonetic decoding that uses triplets, which are prototypes of sounds in context. This decoding system operates in two stages: first, the best candidate triplets are proposed for each speech segment; the consistency of the overall solution is then improved using relaxation techniques.
Designing a Bilingual Speech Corpus for French and German Language Learners: a Two-Step Process by Camille Fauth

1 edition published in 2017 in English and held by 2 WorldCat member libraries worldwide

Perturbation de la production de la parole chez le patient atteint d'une paralysie laryngée : données acoustiques et aérodynamiques by Noé Xiu

1 edition published in 2018 in French and held by 1 WorldCat member library worldwide

Our thesis aims at studying the consequences of total or partial removal of the thyroid gland due to thyroid dysfunction, followed or not by a radiotherapy treatment, in the field of clinical phonetics. This type of intervention usually perturbs the speech production system and sometimes leads to permanent (less than 5% of cases) or transient degradation of voice quality. The work intends to be a contribution to research carried out in clinical linguistics and phonetics, more particularly in the area of compensatory or readjustment phenomena developed by patients, following perturbation provoked in their phonatory system. The work was carried out in collaboration with the Group Saint-Vincent Hospital, and more particularly with the Clinique Sainte-Anne of Strasbourg, within the department of thyroid surgery. Our study is longitudinal since we have followed a cohort of patients, who underwent thyroid gland surgery, for at least one year, acquiring acoustic and aerodynamic data every month, the postoperative examination having revealed or not a lesion in the mobility of the vocal folds. We have studied possible compensation or readjustment strategies that patients were able to deploy by themselves or with the help of speech therapy, in order to assess the flexibility of the speech production system. The purpose is thus to evaluate the flexibility of the speech production and perception system and to try to understand how this system works based on a specific dysfunction of pathological origin. It is thus a question of determining the limits of physical deviations imposed by linguistic requirements of clarity of the speech perception system. Through the various investigations that we have conducted, we have tried to account for possible viability of perceptually stable phonetic and phonological units, despite an omnipresent variability in the physical, articulatory, physiological and acoustic substrate. 
Particular attention is paid to societal dimensions related to quality of life (vocal fatigue, satisfaction with linguistic productions, self-esteem, etc.).
Towards a 3 dimensional dynamic generic speaker model to study geometry simplifications of the vocal tract using magnetic resonance imaging data by Ioannis Douros

1 edition published in 2020 in English and held by 1 WorldCat member library worldwide

In this thesis we used MRI (Magnetic Resonance Imaging) data of the vocal tract to study speech production. The first part consists of a study of the impact that the velum, the epiglottis and the head position have on the phonation of five French vowels. Acoustic simulations were used to compare the formants of the studied cases with the reference in order to measure their impact. For this part of the work, we used 3D static MR (Magnetic Resonance) images. As speech is usually a dynamic phenomenon, the question arose whether it would be possible to process the 3D data in order to incorporate dynamic information from continuous speech. The second part therefore presents some algorithms that can be used to enhance speech production data. Several image transformations were combined in order to generate estimates of vocal tract shapes that are more informative than the original ones. At this point, apart from enhancing speech production data, we envisaged creating a generic speaker model that could provide enhanced information not for a specific subject, but globally for speech. As a result, we devoted the third part to the investigation of an algorithm that can be used to create a spatiotemporal atlas of the vocal tract, which can serve as a reference or standard speaker for speech studies because it is speaker independent. Finally, the last part of the thesis discusses a selection of open questions in the field, some interesting directions in which this thesis could be extended, and some potential approaches that could help in moving towards these directions.
Etude acoustique des fricatives de l'arabe standard (locuteurs algériens) by Amel Benamrane

1 edition published in 2013 in French and held by 1 WorldCat member library worldwide

This acoustic study focuses on standard Arabic, spoken by Algerian subjects (three female and three male subjects). The main thrust of the investigations is to clarify the consonantal system of the language, and more particularly the acoustic properties of its fricatives. This system is rich in places of articulation: labiodental, interdental, alveolar, postalveolar, uvular, pharyngeal and laryngeal. It is also characterized by the phenomenon of pharyngealization, appearing in two pairs of fricatives: the interdental and alveolar fricatives. Based on our acoustic study, we have observed the properties of the frication noise of fricatives by calculating their center of gravity (CoG). We have also discussed the characteristics of the first four formants of the back vowels, short [a] and long [a:], in the vicinity of the fricatives, in CV sequences. Then, we have studied absolute and relative segmental durations, relative intensity and harmonicity (HNR) of the fourteen fricatives of our study. This analysis was carried out to target their place of articulation and the phonological voicing contrast. Finally, we have tried to address the features of posterior locations (uvular, pharyngeal and laryngeal), which proved to be relevant as contrastive phonetic cues.
Modélisation de la coarticulation labiale : mise en oeuvre sur une tête parlante by Vincent Robert

1 edition published in 2008 in French and held by 1 WorldCat member library worldwide

This thesis falls within the scope of talking heads. We are particularly interested in the prediction of labial and jaw coarticulation movements. After analyzing intra- and inter-speaker variability using two corpora, we defined a prediction algorithm for anticipatory coarticulation based on phonetic rules, which takes into account interactions between articulators. We then proposed a solution to estimate labial and jaw movements using a one-speaker corpus. It consists of concatenating elementary VC...CV sequences selected by our prediction algorithm and either extracted from the corpus or rebuilt by completion. We modeled articulatory movements using sigmoids, which offer the advantage of considerably reducing the model size and which are adaptable to speaking rate or articulatory strategies. Additionally, sigmoids are able to keep distinctive contrasts between neighboring segments as well as intrinsic characteristics of the sounds. With the aim of estimating the quality of our synthesis process, we measured differences between real and predicted data for all the sentences of the corpus and compared our solution with Cohen and Massaro's algorithm. It turns out that our solution is better for specific VCCV sequences in which anticipation is more complex.
Global active method for automatic formant tracking guided by local processing by Marie-Odile Berger (Book)

1 edition published in 1992 in English and held by 1 WorldCat member library worldwide

Audience Level
Audience level: 0.96 (from 0.92 for Proceeding ... to 0.97 for Proceeding ...)

English (12)

French (10)