Questions tagged [phoneme]

A phoneme in linguistics is the smallest unit of sound that distinguishes one word from another in a language. The word "the", for example, consists of the phonemes /ð/ and /ə/. Use this tag to ask about how phonemes can be handled in natural language processing applications, or how to identify them.

For instance, in the Speech Synthesis Markup Language (SSML), which speech synthesisers use to read text, the phoneme element specifies the pronunciation of the text it encloses:

<ssml:phoneme alphabet="x-microsoft-ups" ph="string"> </ssml:phoneme>
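As a minimal sketch of how such an element could be generated programmatically, the Python snippet below builds a small SSML document with one phoneme element using only the standard library. The helper name `phoneme_ssml` is hypothetical, and the snippet assumes the standard `ipa` alphabet rather than the vendor-specific `x-microsoft-ups` shown above; whether a particular synthesiser accepts a given alphabet is something to check in its own documentation.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SSML specification.
SSML_NS = "http://www.w3.org/2001/10/synthesis"
# Standard XML namespace, serialized by ElementTree with the "xml:" prefix.
XML_NS = "http://www.w3.org/XML/1998/namespace"


def phoneme_ssml(text, ph, alphabet="ipa", lang="en-US"):
    """Hypothetical helper: wrap `text` in an SSML <phoneme> element.

    `ph` is the phonetic transcription and `alphabet` names the phonetic
    alphabet it is written in. Returns the document as a string.
    """
    # Emit SSML as the default namespace instead of an "ns0:" prefix.
    ET.register_namespace("", SSML_NS)
    speak = ET.Element(
        f"{{{SSML_NS}}}speak",
        {"version": "1.0", f"{{{XML_NS}}}lang": lang},
    )
    phoneme = ET.SubElement(
        speak,
        f"{{{SSML_NS}}}phoneme",
        {"alphabet": alphabet, "ph": ph},
    )
    phoneme.text = text
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    # Pronounce the word "the" using the IPA transcription from the tag description.
    print(phoneme_ssml("the", "ðə"))
```

Building the element through an XML library rather than string concatenation means quotes and other special characters in the transcription are escaped automatically.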

Here is what Wikipedia has on phonemes.

53 questions
2 votes, 2 answers

Synthesize phoneme pairs on OSX

I need to create WAV files of 144 phoneme pairs, such as "Da Di Du, Beh Bi Burr, ..." Specifically, I need each one to maintain a constant pitch, so that I can pitch-shift them to make musical notes (if I could input pitch values that would be even…
P i
1 vote, 1 answer

Converting real-time audio to phonemes

I am using a microphone as the input for real-time audio. How do I extract the phoneme currently being spoken from the audio? I need it for lip-syncing 2D characters. Basically, my approach would be to: fetch the real-time audio using a microphone; detect the…
NectoJ
1 vote, 0 answers

Can you run Montreal Forced Aligner for bilingual data?

I have audio files from a bilingual speaker (who speaks German and Turkish). I am trying to get phoneme-level annotations for them using MFA. Is there any way I can get accurate output? I am getting messed-up output.
1 vote, 0 answers

Is there a way to reconstruct words or sentences from phoneme predictions?

I am using the SpeechBrain model to predict phoneme sequences based on the audio data. The output of the model looks like this: ['sil', 'dh', 'ih', 'r', 'iy', 'z', 'ah', 'n', 'z', 'f', 'er', 'dh', 'ih', 's', 'sil', 'd', 'ay', 'v', 's', 'iy', 'm',…
1 vote, 2 answers

How to get phonemes from Google Cloud API Text-to-Speech

I am following the Google Cloud API Text-to-Speech Python tutorial. I would like to know if there is a way to return the phonemes and their duration, an intermediate step in generating the interpreted speech. Is that possible? If so, can you please…
gma
1 vote, 0 answers

Improve pronunciation of a model

I fine-tuned Nvidia Tacotron2 on a dataset. While it works reasonably well, there are some mispronunciations of words (I train on a German dataset). I have another set of wave files by the same speaker with a corresponding metadata.csv. How do I filter this to…
1 vote, 1 answer

Phoneme from Jsonstream unrecognized in C#

I have a little problem with my JsonStream in C#. I am reading a JSON blob in my storage with this inside: {"id":"275177", "fremdwort":"1.FFC-Frankfurt", "ipa":"ʹeːɐ̯stɐ ɛf ɛf ʦeː ʹfraŋkfʊrt"} In C#: while (Jsonreader.Read()) …
smotorious
1 vote, 0 answers

Are SAPI (5.4) phoneme sets available for all supported languages?

SAPI 5.4 documentation mentions that English, Chinese, German, Spanish, French, and Japanese languages (are able to) use the SAPI phone set. However the SAPI 5.4 documentation includes phoneme sets for only (American) English, Chinese, and Japanese.…
Exergist
1 vote, 0 answers

SAPI Symbol Usage for Speech Dictionary Input

I've been doing some work to add words and pronunciations to the Windows speech dictionary via the SpLexicon Interface of SAPI 5.4 (which I think is the only way to do it) via the AddPronunciation function, or in my case: // Initialize SpLexicon…
1 vote, 1 answer

Poor recognition accuracy of Pocketsphinx using phoneme recognition on Android, French language

I am working on a project where I have to integrate the speech functionalities of Pocketsphinx into an Android application. In fact, I have to integrate the phoneme recognition functionality provided by Pocketsphinx that should be able to recognize…
user5794813
1 vote, 1 answer

Specifying path to the acoustic model in pocketsphinx

I want to build a little phoneme-based "dialog system" that listens to speech, converts it into a string of phonemes (however wrong, it doesn't matter), processes/stores these and plays them back at the phoneme level. I aim to use either festival /…
1 vote, 1 answer

How to get Phonemes on voice recognition?

I am working on voice recognition to display the phonemes and, if possible, their waveform, using the built-in voice recognition on Vista and Windows 7 with Delphi 2009. Other programming languages are welcome.
XBasic3000
1 vote, 1 answer

Using the SSML phoneme element

I am using Visual Basic .NET Ultimate and am developing a TTS application. May I please have some help with the phoneme element? Here is the text that I wish to speak: As you release the tension in your shoulders and neck, take another deep breath…
1 vote, 2 answers

Decode speech into Phonemes in Sphinx4

Can I use CMUSphinx4 to decode particular speech into phonemes and then use those phonemes in further implementations?
Sameera
0 votes, 0 answers

How to easily convert English audio files to IPA (phonetics) with timestamps on Windows?

How would I easily convert English audio files to IPA (phonetic alphabet) with timestamps on Windows? Everything I find is way out of date. Even similar questions here on Stack are out of date. Most stuff doesn't even work anymore, like Python's…