Ukrainian MFA dictionary v2.0.0#

  • Maintainer: Montreal Forced Aligner

  • Language: Ukrainian

  • Dialect: N/A

  • Phone set: MFA

  • Number of words: 64,017

  • Phones: b bʲː c dzʲ dʲː d̪z̪ d̪z̪ː d̪ː e f i j k l m mʲː n̪ː o p pʲː sʲː s̪ː tsʲ tsʲː tʃʲ tʃʲː tʃː tʲː t̪s̪ t̪s̪ː t̪ː u x zʲː z̪ː ç ɐ ɑ ɔ ɛ ɡ ɡː ɦ ɦː ɪ ɲ ɲː ɾ ɾʲ ɾʲː ɾː ʃ ʃʲ ʃʲː ʊ ʋ ʋʲ ʋʲː ʋː ʎ ʎː ʒ ʒʲ ʒʲː ʝ

  • License: CC BY 4.0

  • Compatible MFA version: v2.0.0

  • Citation:

@techreport{mfa_ukrainian_mfa_dictionary_2022,
	author={McAuliffe, Michael and Sonderegger, Morgan},
	title={Ukrainian MFA dictionary v2.0.0},
	address={\url{https://mfa-models.readthedocs.io/pronunciation dictionary/Ukrainian/Ukrainian MFA dictionary v2_0_0.html}},
	year={2022},
	month={Mar},
}

Installation#

Install from the MFA command line:

mfa model download dictionary ukrainian_mfa

Or download from the release page.

The dictionary available from the release page and from command line installation has pronunciation and silence probabilities estimated as part of acoustic model training (see Silence probability format and training pronunciation probabilities for more information). If you would like to use the version of this dictionary without probabilities, please see the plain dictionary.
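To make the difference between the two versions concrete, here is a minimal sketch of reading one entry from either file. It assumes (this page does not confirm it) that entries are tab-separated, that a plain entry is a word followed by a space-separated pronunciation, and that the probability version puts numeric fields between the word and the pronunciation. The function names `parse_entry` and `to_plain` are hypothetical, and the probability values in the usage example are invented for illustration only.

```python
def parse_entry(line: str):
    """Split one dictionary line into (word, probabilities, phones).

    Assumed layout (not confirmed by this page): tab-separated fields,
    word first, pronunciation last, optional numeric fields in between.
    """
    fields = line.rstrip("\n").split("\t")
    word, rest = fields[0], fields[1:]
    probabilities = []
    # Consume leading numeric fields, always leaving the last field
    # as the pronunciation.
    while len(rest) > 1:
        try:
            probabilities.append(float(rest[0]))
        except ValueError:
            break
        rest = rest[1:]
    phones = " ".join(rest).split()
    return word, probabilities, phones


def to_plain(line: str) -> str:
    """Drop any probability fields and return a plain 'word<TAB>phones' line."""
    word, _, phones = parse_entry(line)
    return word + "\t" + " ".join(phones)


# The probability values below are made up; the pronunciation is taken
# from the example for "ванна" in the consonant chart.
with_probs = "ванна\t0.99\t0.2\t1.0\t1.0\tʋ ɑ n̪ː ɐ"
plain = "ванна\tʋ ɑ n̪ː ɐ"
print(parse_entry(with_probs))
print(to_plain(with_probs) == plain)
```

Check the column layout against the Silence probability format documentation before relying on this for a real file.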

Intended use#

This dictionary is intended for forced alignment of Ukrainian transcripts.

This dictionary uses the MFA phone set for Ukrainian, and was used in training the Ukrainian MFA acoustic model. Pronunciations can be added on top of the dictionary, as long as no additional phones are introduced.
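Because added pronunciations must not introduce new phones, it can help to check them before appending them to the dictionary. The following is a minimal sketch, not part of MFA: the helper name `unknown_phones` is hypothetical, and the inventory is simply copied from the Phones field at the top of this page. If you have the dictionary file itself, the set of phones actually found in it is a more reliable inventory.

```python
import unicodedata

# Inventory copied from the "Phones" field of this page (an assumption:
# the dictionary file itself is the authoritative source).
_PHONES_FIELD = """
b bʲː c dzʲ dʲː d̪z̪ d̪z̪ː d̪ː e f i j k l m mʲː n̪ː o p pʲː sʲː s̪ː
tsʲ tsʲː tʃʲ tʃʲː tʃː tʲː t̪s̪ t̪s̪ː t̪ː u x zʲː z̪ː ç ɐ ɑ ɔ ɛ ɡ ɡː ɦ ɦː
ɪ ɲ ɲː ɾ ɾʲ ɾʲː ɾː ʃ ʃʲ ʃʲː ʊ ʋ ʋʲ ʋʲː ʋː ʎ ʎː ʒ ʒʲ ʒʲː ʝ
"""


def _normalize(text: str) -> str:
    # Normalize so that visually identical phones compare equal.
    return unicodedata.normalize("NFC", text)


DICTIONARY_PHONES = set(_normalize(_PHONES_FIELD).split())


def unknown_phones(pronunciation: str, inventory=DICTIONARY_PHONES) -> list:
    """Return the phones of a space-separated pronunciation missing from the inventory."""
    return [p for p in _normalize(pronunciation).split() if p not in inventory]


# A pronunciation from this page passes; one containing a phone outside
# the inventory (θ) is flagged.
print(unknown_phones("ʋ ɑ n̪ː ɐ"))
print(unknown_phones("ʋ ɑ θ ɐ"))
```

An empty result means the pronunciation only uses phones from the inventory and can be added safely under the rule above.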

Performance Factors#

When trying to get better alignment accuracy, adding pronunciations is generally helpful, especially for different styles and dialects. The most impactful improvements will generally be seen when adding reduced variants that delete segments or syllables, as is common in spontaneous speech. Alignment must include all phones specified in the pronunciation of a word, and each phone has a minimum duration (by default 10ms). If a speaker pronounces a multisyllabic word as a single syllable, MFA may be unable to fit all the segments into the available time, which leads to alignment errors on adjacent words as well.
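The arithmetic behind this constraint can be sketched as follows. This is an illustration only, not MFA code: the function names are hypothetical, the 40 ms interval is an invented example, and the reduced variant `b ʊ e` is made up to show the effect and does not come from the dictionary.

```python
MIN_PHONE_DURATION_MS = 10  # default minimum per phone, as stated above


def min_word_duration_ms(pronunciation: str,
                         min_phone_ms: int = MIN_PHONE_DURATION_MS) -> int:
    """Shortest stretch of audio that can hold every phone of a pronunciation."""
    return len(pronunciation.split()) * min_phone_ms


def fits_interval(pronunciation: str, available_ms: int) -> bool:
    """True if all phones can be placed within the available duration."""
    return min_word_duration_ms(pronunciation) <= available_ms


full_form = "b ʊ ʋ ɑ j e"   # буває, as listed in the charts on this page
reduced_form = "b ʊ e"      # hypothetical reduced variant, for illustration

print(min_word_duration_ms(full_form))   # 6 phones * 10 ms = 60
print(fits_interval(full_form, 40))      # False: 60 ms does not fit in 40 ms
print(fits_interval(reduced_form, 40))   # True: 30 ms fits in 40 ms
```

When the full form cannot fit, the aligner has to take time from neighboring words, which is why a listed reduced variant tends to help the surrounding alignment too.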

Ethical considerations#

Deploying any Speech-to-Text model into any production setting has ethical implications. You should consider these implications before use.

Demographic Bias#

You should assume every machine learning model has demographic bias unless proven otherwise. For pronunciation dictionaries, it is often the case that transcription accuracy and lexicon coverage are better for the prestige variety modeled in this dictionary than for other varieties. If you are using this dictionary in production, you should acknowledge this as a potential issue.

IPA Charts#

Consonants#

Within each cell, obstruent symbols on the left are unvoiced and those on the right are voiced.

Manner | Labial | Labiodental | Dental | Alveolar | Alveopalatal | Palatal | Velar | Glottal

Nasal

Occurrences:
14,009
Examples:
* томас:
[ ɔ m ɑ ]
* нум:
[ ʊ m]
* ймемо:
[i m e m ɔ]
* мойри:
[m ɔ j ɾ ɪ]
Occurrences:
1,484
Examples:
* нгамі:
[ ɦ ɑ i]
* міг:
[ i ɦ]
* умій:
[ʊ i i]
* тиміш:
[ e i ʃ]
Occurrences:
12
Examples:
Occurrences:
20,986
Examples:
* нічне:
[ɲ i e]
* нгамі:
[ ɦ ɑ i]
* налию:
[ ɐ l ɪ j ʊ]
* зночі:
[ ɔ tʃʲ i]
Occurrences:
443
Examples:
* ванна:
[ʋ ɑ n̪ː ɐ]
* цінну:
[tsʲ i n̪ː ʊ]
* панну:
[p ɑ n̪ː ʊ]
* кінну:
[c i n̪ː ʊ]
Occurrences:
6,035
Examples:
* нічне:
[ɲ i e]
* їхній:
[j i x ɲ i i]
* ніц:
[ɲ i t̪s̪]
* давні:
[ ɑ u ɲ i]
Occurrences:
1,199
Examples:
* винні:
[ʋ ɪ ɲː i]
* рання:
[ɾ ɑ ɲː ɐ]
* мення:
[m e ɲː ɐ]
* вання:
[ʋ ɑ ɲː ɐ]

Stop

Occurrences:
15,797
Examples:
* плоха:
[p l ɔ x ɐ]
* полою:
[p ɔ l ɔ j ʊ]
* плила:
[p l ɪ l ɐ]
* п'єте:
[p j ɛ e]
Occurrences:
7,063
Examples:
* буває:
[b ʊ ʋ ɑ j e]
* зруб:
[ ɾ u b]
* богам:
[b ɔ ɦ ɐ m]
* бити:
[b ɪ ɪ]
Occurrences:
3
Examples:
Occurrences:
18,356
Examples:
* томас:
[ ɔ m ɑ ]
* нести:
[ e ɪ]
* круту:
[k ɾ ʊ ʊ]
* п'єте:
[p j ɛ e]
Occurrences:
20
Examples:
* гетто:
[ɦ ɛ t̪ː ɔ]
Occurrences:
11,380
Examples:
* ззаду:
[z̪ː ɑ ʊ]
* давні:
[ ɑ u ɲ i]
* надаю:
[ ɐ ɐ j ʊ]
* дивом:
[ e ʋ ɔ m]
Occurrences:
120
Examples:
* будда:
[b u d̪ː ɐ]
* оддає:
[ɔ d̪ː ɑ j e]
* оддам:
[ɔ d̪ː ɐ m]
* міддю:
[ i d̪ː ʊ]
Occurrences:
1,147
Examples:
* кітці:
[c i ɔ tsʲː i]
* кішку:
[c i ʃ k ʊ]
* луків:
[l ʊ c i u]
* шкіру:
[ʃ c i ɾ ʊ]
Occurrences:
1
Examples:
Occurrences:
17,650
Examples:
* який:
[j ɐ k ɪ i]
* круту:
[k ɾ ʊ ʊ]
* синку:
[ ɪ k ʊ]
* отрок:
[ɔ ɾ ɔ k]
Occurrences:
8
Examples:
* мекку:
[m ɛ ʊ]
* мекка:
[m ɛ ɐ]
Occurrences:
133
Examples:
* аякже:
[ɐ ɐ ɡ ʒ ɛ]
* ґатов:
[ɡ ɐ ɔ u]
* ґміни:
[ɡ i ɪ]
* ґанок:
[ɡ ɑ ɔ k]
Occurrences:
1
Examples:
* меґґі:
[m e ɡː i]

Affricate

Occurrences:
934
Examples:
* ніц:
[ɲ i t̪s̪]
* цехом:
[t̪s̪ ɛ x ɔ m]
* оце:
[ɔ t̪s̪ ɛ]
* цезар:
[t̪s̪ ɛ ɑ ɾ]
Occurrences:
4
Examples:
* цска:
[t̪s̪ː k ɐ]
Occurrences:
412
Examples:
* гудзя:
[ɦ u d̪z̪ ɐ]
* гудзь:
[ɦ u d̪z̪]
* будз:
[b u d̪z̪]
* дзень:
[d̪z̪ ɛ ɲ]
Occurrences:
1
Examples:
Occurrences:
7,505
Examples:
* нічне:
[ɲ i e]
* учора:
[ʊ ɔ ɾ ɐ]
* очам:
[ɔ ɑ m]
* хащах:
[x ɑ ʃ ɑ x]
Occurrences:
95
Examples:
* лучче:
[l u tʃː e]
* матч:
[m ɐ tʃː]
* одчай:
[ɔ tʃː ɐ i]
* одчув:
[ɔ tʃː u u]
Occurrences:
571
Examples:
* джері:
[ ɛ ɾʲ i]
* джеря:
[ e ɾ ɐ]
* ходжу:
[x ɔ ʊ]
* воджу:
[ʋ ɔ ʊ]

Sibilant

Occurrences:
13,344
Examples:
* гусар:
[ɦ u ɐ ɾ]
* офіс:
[ɔ i ]
* томас:
[ ɔ m ɑ ]
* схилі:
[ x ɪ ʎ i]
Occurrences:
25
Examples:
* ссав:
[s̪ː ɐ u]
* ссе:
[s̪ː e]
* масса:
[m ɐ s̪ː ɐ]
* ссати:
[s̪ː ɑ ɪ]
Occurrences:
9,595
Examples:
* захар:
[ ɐ x ɑ ɾ]
* зночі:
[ ɔ tʃʲ i]
* зруб:
[ ɾ u b]
* алмаз:
[ɐ m ɑ ]
Occurrences:
25
Examples:
* ззаду:
[z̪ː ɑ ʊ]
* ззаді:
[z̪ː ɑ i]
Occurrences:
10,399
Examples:
* місію:
[ i i j ʊ]
* асія:
[ɐ i j ɐ]
* стій:
[ i i]
* сіли:
[ i l ɪ]
Occurrences:
108
Examples:
* мессі:
[m e sʲː i]
* россю:
[ɾ o sʲː u]
* отся:
[ɔ sʲː ɐ]
* отсі:
[ɔ sʲː i]
Occurrences:
1,529
Examples:
* злі:
[ ʎ i]
* зніме:
[ ɲ i m ɛ]
* возі:
[ʋ ɔ i]
* змію:
[ i j ʊ]
Occurrences:
8
Examples:
Occurrences:
5,878
Examples:
* рвеш:
[ɾ ʋ ɛ ʃ]
* хащах:
[x ɑ ʃ ɑ x]
* щоки:
[ʃ ɔ k ɪ]
* прощу:
[p ɾ ɔ ʃ ʊ]
Occurrences:
150
Examples:
* груші:
[ɦ ɾ u ʃʲ i]
* суші:
[ u ʃʲ i]
* парші:
[p ɑ ɾ ʃʲ i]
* шіня:
[ʃʲ i ɲ ɐ]
Occurrences:
3
Examples:
Occurrences:
3,249
Examples:
* вжив:
[ʋ ʒ ɪ u]
* жуков:
[ʒ u k ɔ u]
* тож:
[ ɔ ʒ]
* жнива:
[ʒ ɪ ʋ ɐ]
Occurrences:
68
Examples:
* етажі:
[e ɐ ʒʲ i]
* жін:
[ʒʲ i ]
* жінка:
[ʒʲ i k ɐ]
* жінко:
[ʒʲ i k ɔ]
Occurrences:
23
Examples:

Fricative

Occurrences:
827
Examples:
* формі:
[f ɔ ɾ m i]
* ферм:
[f ɛ ɾ m]
* айфон:
[ɐ i f ɔ ]
* рифи:
[ɾ ɪ f ɪ]
Occurrences:
334
Examples:
* офіс:
[ɔ i ]
* офісу:
[ɔ i ʊ]
* шафі:
[ʃ ɑ i]
* фішка:
[ i ʃ k ɐ]
Occurrences:
232
Examples:
* хівря:
[ç i u ɾʲ ɐ]
* тихім:
[ ɪ ç i m]
* духів:
[ ʊ ç i u]
* вхіду:
[ʋ ç i ʊ]
Occurrences:
514
Examples:
* гірко:
[ʝ i ɾ k ɔ]
* гірка:
[ʝ i ɾ k ɐ]
* гімн:
[ʝ i m ]
* гіфи:
[ʝ i f ɪ]
Occurrences:
8,196
Examples:
* гусар:
[ɦ u ɐ ɾ]
* нгамі:
[ ɦ ɑ i]
* міг:
[ i ɦ]
* богам:
[b ɔ ɦ ɐ m]
Occurrences:
5
Examples:
* реггі:
[ɾ e ɦː i]

Approximant

Occurrences:
16,468
Examples:
* буває:
[b ʊ ʋ ɑ j e]
* слову:
[ l ɔ ʋ ʊ]
* вжив:
[ʋ ʒ ɪ u]
* рвеш:
[ɾ ʋ ɛ ʃ]
Occurrences:
3,568
Examples:
* квіти:
[k ʋʲ i ɪ]
* ловів:
[l ɔ ʋʲ i u]
* вівса:
[ʋʲ i u ɐ]
* відти:
[ʋʲ i ɪ]
Occurrences:
21
Examples:
* ввів:
[ʋʲː i u]
* вві:
[ʋʲː i]
Occurrences:
46
Examples:
* ввдно:
[ʋː ɔ]
* вволю:
[ʋː ɔ ʎ ʊ]
* ввело:
[ʋː e l ɔ]
* ввесь:
[ʋː ɛ ]
Occurrences:
9,709
Examples:
* буває:
[b ʊ ʋ ɑ j e]
* діяв:
[ i j ɑ u]
* налию:
[ ɐ l ɪ j ʊ]
* мойри:
[m ɔ j ɾ ɪ]

Tap

Occurrences:
23,819
Examples:
* гусар:
[ɦ u ɐ ɾ]
* захар:
[ ɐ x ɑ ɾ]
* учора:
[ʊ ɔ ɾ ɐ]
* мойри:
[m ɔ j ɾ ɪ]
Occurrences:
2,795
Examples:
* горює:
[ɦ ɔ ɾʲ ʊ e]
* стрій:
[ ɾʲ i i]
* нарік:
[ ɑ ɾʲ i k]
* рід:
[ɾʲ i ]
Occurrences:
6
Examples:
* гаррі:
[ɦ ɐ ɾʲː i]
Occurrences:
14
Examples:
* ферро:
[f ɛ ɾː ɔ]
* гурра:
[ɦ ʊ ɾː ɐ]

Lateral

Occurrences:
15,940
Examples:
* налию:
[ ɐ l ɪ j ʊ]
* плоха:
[p l ɔ x ɐ]
* слову:
[ l ɔ ʋ ʊ]
* мало:
[m ɑ l ɔ]
Occurrences:
15
Examples:
* алмаз:
[ɐ m ɑ ]
* алвіш:
[ɐ ʋʲ i ʃ]
* алло:
[ɐ ɔ]
* аллах:
[ɐ ɑ x]
Occurrences:
5,939
Examples:
* схилі:
[ x ɪ ʎ i]
* злі:
[ ʎ i]
* конлі:
[k ɔ ɲ ʎ i]
* людей:
[ʎ u ɛ i]
Occurrences:
90
Examples:
* валль:
[ʋ ɐ ʎː]
* виллє:
[ʋ ɪ ʎː ɛ]
* ілліч:
[i ʎː i ]
* гіллі:
[ʝ i ʎː i]

Vowels#

Within each cell, vowel symbols on the left are unrounded and those on the right are rounded.

Front | Near-Front | Central | Near-Back | Back

Close

Occurrences:
31,971
Examples:
* нічне:
[ɲ i e]
* офіс:
[ɔ i ]
* нгамі:
[ ɦ ɑ i]
* діяв:
[ i j ɑ u]
Occurrences:
13,975
Examples:
* гусар:
[ɦ u ɐ ɾ]
* діяв:
[ i j ɑ u]
* зруб:
[ ɾ u b]
* давні:
[ ɑ u ɲ i]
Occurrences:
24,405
Examples:
* налию:
[ ɐ l ɪ j ʊ]
* схилі:
[ x ɪ ʎ i]
* нести:
[ e ɪ]
* мойри:
[m ɔ j ɾ ɪ]
Occurrences:
17,156
Examples:
* буває:
[b ʊ ʋ ɑ j e]
* учора:
[ʊ ɔ ɾ ɐ]
* ззаду:
[z̪ː ɑ ʊ]
* налию:
[ ɐ l ɪ j ʊ]

Close-Mid

Occurrences:
30,731
Examples:
* нічне:
[ɲ i e]
* буває:
[b ʊ ʋ ɑ j e]
* нести:
[ e ɪ]
* ймемо:
[i m e m ɔ]
Occurrences:
2,551
Examples:
* дворі:
[ ʋ o ɾʲ i]
* водій:
[ʋ o i i]
* окріп:
[o k ɾʲ i p]
* родів:
[ɾ o i u]

Open-Mid

Occurrences:
7,519
Examples:
* п'єте:
[p j ɛ e]
* нєма:
[ ɛ m ɐ]
* зніме:
[ ɲ i m ɛ]
* глек:
[ɦ l ɛ k]
Occurrences:
40,284
Examples:
* офіс:
[ɔ i ]
* учора:
[ʊ ɔ ɾ ɐ]
* томас:
[ ɔ m ɑ ]
* зночі:
[ ɔ tʃʲ i]
Occurrences:
33,085
Examples:
* гусар:
[ɦ u ɐ ɾ]
* захар:
[ ɐ x ɑ ɾ]
* учора:
[ʊ ɔ ɾ ɐ]
* налию:
[ ɐ l ɪ j ʊ]

Open

Occurrences:
22,846
Examples:
* буває:
[b ʊ ʋ ɑ j e]
* нгамі:
[ ɦ ɑ i]
* захар:
[ ɐ x ɑ ɾ]
* томас:
[ ɔ m ɑ ]
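The occurrence counts in the charts above can be approximated from the dictionary itself. The sketch below is an assumption about how such counts might be derived, since this page does not say how they were computed: it counts every phone token across a list of pronunciations. The function name `phone_occurrences` is hypothetical, and the four sample pronunciations are the examples listed for n̪ː in the consonant chart.

```python
from collections import Counter


def phone_occurrences(pronunciations):
    """Count how often each phone appears across space-separated pronunciations."""
    counts = Counter()
    for pronunciation in pronunciations:
        counts.update(pronunciation.split())
    return counts


# Example pronunciations taken from the consonant chart on this page:
# ванна, цінну, панну, кінну.
sample = [
    "ʋ ɑ n̪ː ɐ",
    "tsʲ i n̪ː ʊ",
    "p ɑ n̪ː ʊ",
    "c i n̪ː ʊ",
]

counts = phone_occurrences(sample)
print(counts["n̪ː"])  # 4: once in each of the four words
print(counts["ʊ"])    # 3
print(counts["ɑ"])    # 2
```

Run over the full dictionary, the same tally gives a quick view of which phones are rare and therefore likely to have been seen less often in acoustic model training.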