Russian MFA dictionary v3.1.0

  • Maintainer: Montreal Forced Aligner

  • Language: Russian

  • Dialect: N/A

  • Phone set: MFA

  • Number of words: 452,397

  • Phones: a b bʲː c dzʲː dʐː dʲː d̪z̪ d̪z̪ː d̪ː e f fʲː i j k m mʲː n̪ː o p pʲː r rʲː sʲː s̪ː tsʲ tɕː tʂː tʲː t̪s̪ t̪s̪ː t̪ː u v vʲː x zʲː z̪ː æ ç ɐ ɕ ɕː ə ɛ ɟ ɟː ɡ ɡː ɣ ɨ ɪ ɫ ɫː ɲ ɲː ɵ ʂ ʂː ʉ ʊ ʎ ʎː ʐ ʐː ʑː

  • License: CC BY 4.0

  • Compatible MFA version: v3.1.0

  • Citation:

@techreport{mfa_russian_mfa_dictionary_2024,
	author={McAuliffe, Michael and Sonderegger, Morgan},
	title={Russian MFA dictionary v3.1.0},
	address={\url{https://mfa-models.readthedocs.io/pronunciation dictionary/Russian/Russian MFA dictionary v3_1_0.html}},
	year={2024},
	month={Jun},
}

Installation

Install from the MFA command line:

mfa model download dictionary russian_mfa

Or download from the release page.

The dictionary available from the release page and command line installation has pronunciation and silence probabilities estimated as part of acoustic model training (see Silence probability format and training pronunciation probabilities for more information). If you would like to use the version of this dictionary without probabilities, please see the [plain dictionary](https://raw.githubusercontent.com/MontrealCorpusTools/mfa-models/main/dictionary/russian/mfa/Russian MFA dictionary v3_1_0.dict).
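As a rough illustration of how the two versions differ, the sketch below strips any numeric columns from an entry to recover a plain `word<TAB>phones` line. It assumes only that the probability columns sit between the word and the phones and parse as numbers. The sample values are invented, and the helper names (`split_entry`, `to_plain`) are hypothetical, not part of MFA.

```python
def _is_number(token):
    """Return True if the token parses as a float (i.e. a probability column)."""
    try:
        float(token)
    except ValueError:
        return False
    return True


def split_entry(line):
    """Split one dictionary line into (word, numeric fields, phones)."""
    tokens = line.split()
    word, rest = tokens[0], tokens[1:]
    numbers = []
    # Consume numeric columns until the first phone symbol is reached.
    while rest and _is_number(rest[0]):
        numbers.append(float(rest.pop(0)))
    return word, numbers, rest


def to_plain(line):
    """Drop any probability columns, keeping only the word and its phones."""
    word, _, phones = split_entry(line)
    return word + "\t" + " ".join(phones)


# Invented probability values, for illustration only.
sample = "ванна\t0.99\t0.12\t1.05\t0.97\tv a n̪ː ə"
print(to_plain(sample))
```

A line that already lacks probabilities passes through unchanged, so the same helper can be run over either version of the file.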

Intended use

This dictionary is intended for forced alignment of Russian transcripts.

This dictionary uses the MFA phone set for Russian, and was used in training the Russian MFA acoustic model. Pronunciations can be added on top of the dictionary, as long as no additional phones are introduced.
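One way to respect the "no additional phones" constraint is to check new pronunciations against the phones already used in the dictionary before adding them. The sketch below is a minimal version of that check, under the assumption that entries have been reduced to `(word, space-separated phones)` pairs. The function names are hypothetical, and the three entries stand in for the full dictionary.

```python
def phone_inventory(entries):
    """Collect every phone used in existing (word, pronunciation) pairs."""
    return {phone for _, pron in entries for phone in pron.split()}


def unknown_phones(pronunciation, inventory):
    """List phones in a candidate pronunciation that the dictionary never uses."""
    return [p for p in pronunciation.split() if p not in inventory]


# A few entries taken from the examples on this page; a real check would
# load every entry from the dictionary file instead.
existing = [
    ("ванна", "v a n̪ː ə"),
    ("покой", "p ɐ k o j"),
    ("шагом", "ʂ a ɡ ə m"),
]
inventory = phone_inventory(existing)

print(unknown_phones("p ɐ k a", inventory))  # every phone is already known
print(unknown_phones("p ɐ k ɑ", inventory))  # "ɑ" is not in the inventory
```

Building the inventory from the dictionary itself, rather than hard-coding a phone list, keeps the check in step with whichever dictionary version is installed.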

Performance Factors

To improve alignment accuracy, adding pronunciations is generally helpful, especially for different styles and dialects. The largest improvements generally come from adding reduced variants, in which segments or syllables are deleted as is common in spontaneous speech. Alignment must include every phone specified in a word's pronunciation, and each phone has a minimum duration (by default 10 ms). If a speaker pronounces a multisyllabic word as a single syllable, MFA may be unable to fit all the segments into the time available, which leads to alignment errors on adjacent words as well.
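The arithmetic behind this constraint can be sketched as follows: a pronunciation with N phones needs at least N times the per-phone minimum. The example uses the 10 ms default stated above. The reduced variant and the 55 ms interval are hypothetical, chosen only to show a case where the full form cannot fit but a shorter variant can.

```python
MIN_PHONE_MS = 10  # default per-phone minimum stated in the text above


def min_word_duration_ms(pronunciation, min_phone_ms=MIN_PHONE_MS):
    """Shortest interval that can hold every phone of a pronunciation."""
    return len(pronunciation.split()) * min_phone_ms


def fits(pronunciation, available_ms, min_phone_ms=MIN_PHONE_MS):
    """True if the spoken interval is long enough for all phones."""
    return min_word_duration_ms(pronunciation, min_phone_ms) <= available_ms


full_form = "k o r ə j ʊ"  # корою, as listed in the examples on this page
reduced = "k o r ə j"      # hypothetical reduced variant, for illustration

spoken_ms = 55  # hypothetical duration of a heavily reduced token
print(min_word_duration_ms(full_form), fits(full_form, spoken_ms))
print(min_word_duration_ms(reduced), fits(reduced, spoken_ms))
```

When only the full form is available, the aligner has to take the missing time from neighbouring words, which is why the errors spread to them.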

Ethical considerations

Deploying any Speech-to-Text model into any production setting has ethical implications. You should consider these implications before use.

Demographic Bias

You should assume every machine learning model has demographic bias unless proven otherwise. For pronunciation dictionaries, it is often the case that transcription accuracy and lexicon coverage are better for the prestige variety modeled in this dictionary than for other variants. If you are using this dictionary in production, you should acknowledge this as a potential issue.

IPA Charts

Consonants

Within each pair, obstruent symbols on the left are unvoiced and those on the right are voiced.

Manner

Labial

Labiodental

Dental

Alveolar

Retroflex

Palatal

Velar

Nasal

Occurrences:
104,592
Examples:
* шалом:
[ʂ ɐ ɫ o m]
* днём:
[ ɲ ɵ m]
* бором:
[b o r ə m]
* манят:
[m a ɲ ə ]
Occurrences:
49,477
Examples:
* химик:
[ç i ɪ k]
* минеи:
[ ɪ ɲ e ɪ]
* мерз:
[ e r ]
* мягко:
[ a x k ə]
Occurrences:
421
Examples:
* сумме:
[ u mʲː e]
* комми:
[k ɐ mʲː ɪ]
Occurrences:
632
Examples:
* сумму:
[ u ʊ]
* умма:
[u ə]
* гаммы:
[ɡ a ɨ]
* суммы:
[ u ɨ]
Occurrences:
111,963
Examples:
* умной:
[ʊ m o j]
* война:
[v ɐ j a]
* паин:
[p a ɪ ]
* едина:
[j ɪ i ə]
Occurrences:
13,428
Examples:
* пенно:
[ e n̪ː ə]
* анны:
[a n̪ː ɨ]
* танны:
[ a n̪ː ɨ]
* ванна:
[v a n̪ː ə]
Occurrences:
81,313
Examples:
* денек:
[ e ɲ ɪ k]
* данте:
[ a ɲ e]
* конни:
[k o ɲ ɪ]
* гнет:
[ɟ ɲ ɪ ]
Occurrences:
1,358
Examples:
* анни:
[ɐ ɲː ɪ]
* дэнни:
[ ɛ ɲː ɪ]
* бонне:
[b o ɲː e]
* ванне:
[v a ɲː e]

Stop

Occurrences:
118,098
Examples:
* парча:
[p ɐ r a]
* покой:
[p ɐ k o j]
* хопом:
[x ɐ p ə m]
* полли:
[p o ʎː ɪ]
Occurrences:
28,643
Examples:
* топи:
[ o ɪ]
* шляпе:
[ʂ ʎ æ e]
* пели:
[ e ʎ ɪ]
* пиу:
[ ɪ ʊ]
Occurrences:
231
Examples:
* пеппи:
[ ɪ pʲː ɪ]
* хиппи:
[ç i pʲː ɪ]
* хеппи:
[x ɨ pʲː ɪ]
* пиппи:
[ ɪ pʲː ɪ]
Occurrences:
384
Examples:
* каппы:
[k a ɨ]
Occurrences:
45,028
Examples:
* бреет:
[b e j ɪ ]
* бытья:
[b ɨ j a]
* барж:
[b a r ʂ]
* брэма:
[b ɛ m ə]
Occurrences:
17,745
Examples:
* бежал:
[ ɪ ʐ a ɫ]
* бабе:
[b a e]
* обе:
[o e]
* берне:
[ ɪ r ɲ e]
Occurrences:
85
Examples:
* лобби:
[ɫ o bʲː ɪ]
* хобби:
[x o bʲː ɪ]
Occurrences:
117
Examples:
* аббас:
[ɐ a ]
Occurrences:
152,418
Examples:
* толик:
[ ɐ ʎ i k]
* досад:
[ ɐ a ]
* мнит:
[m ɲ i ]
* мирят:
[ ɪ a ]
Occurrences:
705
Examples:
* будто:
[b u t̪ː ə]
* отток:
[ɐ t̪ː o k]
* гетто:
[ɟ e t̪ː ə]
* чемто:
[ ɪ t̪ː ə]
Occurrences:
55,249
Examples:
* инда:
[ɪ ə]
* дрянь:
[ æ ɲ]
* душу:
[ u ʂ ʊ]
* деда:
[ e ə]
Occurrences:
635
Examples:
* будды:
[b u d̪ː ɨ]
* отдал:
[ɐ d̪ː a ɫ]
* будда:
[b u d̪ː ə]
Occurrences:
0
Examples:
Occurrences:
86,410
Examples:
* стихе:
[ ɪ ç e]
* терн:
[ e r ]
* лэти:
[ɫ ɛ ɪ]
* житие:
[ʐ ɨ ɪ j e]
Occurrences:
642
Examples:
* идти:
[ɪ tʲː i]
* матти:
[m a tʲː i]
* итти:
[ɪ tʲː i]
Occurrences:
0
Examples:
Occurrences:
32,399
Examples:
* диац:
[ ɪ ɐ t̪s̪]
* диор:
[ ɪ o r]
* ходит:
[x o ɪ ]
* ведет:
[ e ɪ ]
Occurrences:
556
Examples:
* эдди:
[ɛ dʲː ɪ]
* отдел:
[ɐ dʲː e ɫ]
* будде:
[b u dʲː e]
Occurrences:
33,133
Examples:
* мягки:
[ æ ç c ɪ]
* кисни:
[c i ɲ ɪ]
* резки:
[ ɪ c i]
* китон:
[c ɪ o ]
Occurrences:
167
Examples:
* микки:
[ i ɪ]
* аккре:
[ɐ e]
Occurrences:
14,856
Examples:
* блоги:
[b ɫ o ɟ ɪ]
* мозги:
[m ɐ ɟ i]
* греть:
[ɟ e ]
* глюка:
[ɟ ʎ u k ə]
Occurrences:
73
Examples:
* аггею:
[a ɟː ɪ j ʊ]
* хигги:
[ç i ɟː ɪ]
* аггея:
[a ɟː e]
* агги:
[a ɟː ɪ]
Occurrences:
109,579
Examples:
* скупы:
[ k u p ɨ]
* корок:
[k o r ə k]
* корою:
[k o r ə j ʊ]
* плеск:
[p ʎ e k]
Occurrences:
387
Examples:
* аккад:
[ɐ ə ]
* кктай:
[ ə j]
* мокка:
[m ɐ ə]
* яакко:
[j ɪ ə]
Occurrences:
39,556
Examples:
* рогат:
[r ɐ ɡ a ]
* груды:
[ɡ r u ɨ]
* грома:
[ɡ r ɐ m a]
* шагом:
[ʂ a ɡ ə m]
Occurrences:
24
Examples:

Affricate

Occurrences:
23,078
Examples:
* борца:
[b ɐ r t̪s̪ a]
* делец:
[ ɪ ʎ e t̪s̪]
* ценят:
[t̪s̪ ɛ ɲ ə ]
* цезии:
[t̪s̪ ɛ ɪ ɪ]
Occurrences:
1,071
Examples:
* отцы:
[ɐ t̪s̪ː ɨ]
* ниццы:
[ɲ i t̪s̪ː ɨ]
* пицце:
[ i t̪s̪ː ɨ]
* поццо:
[p ɐ ɛ t̪s̪ː ə]
Occurrences:
52
Examples:
Occurrences:
9
Examples:
* цзы:
[d̪z̪ː ɨ]
Occurrences:
0
Examples:
Occurrences:
26
Examples:
* цюня:
[tsʲ ʉ ɲ ə]
* цюнь:
[tsʲ ʉ ɲ]
* цянь:
[tsʲ a ɲ]
Occurrences:
0
Examples:
Occurrences:
20
Examples:
Occurrences:
44,282
Examples:
* чёрти:
[ ɵ r ɪ]
* чун:
[ u ]
* удаче:
[ʊ a e]
* читу:
[ i ʊ]
Occurrences:
1,398
Examples:
* отчим:
[o tɕː ɪ m]
* четче:
[ e tɕː e]
* путча:
[p u tɕː ə]
* вотче:
[v o tɕː e]

Sibilant

Occurrences:
127,187
Examples:
* пьесу:
[ j e ʊ]
* лесу:
[ʎ e ʊ]
* послы:
[p ɐ ɫ ɨ]
* трусы:
[ r ʊ ɨ]
Occurrences:
7,019
Examples:
* отцом:
[ɐ t̪s̪ː o m]
* пассы:
[p ɐ s̪ː ɨ]
* пиццы:
[ i t̪s̪ː ɨ]
* касса:
[k a s̪ː ə]
Occurrences:
67,707
Examples:
* загон:
[ ɐ ɡ o ]
* зыбью:
[ ɨ j ʊ]
* зову:
[ o v ʊ]
* сдует:
[ u j ɪ ]
Occurrences:
225
Examples:
* сзади:
[z̪ː a ɪ]
* базз:
[b ɐ z̪ː]
* изза:
[ɪ ə z̪ː ə]
* цзы:
[d̪z̪ː ɨ]
Occurrences:
0
Examples:
Occurrences:
81,067
Examples:
* усищи:
[ʊ i ɕː ɪ]
* росли:
[r ɐ ʎ i]
* несем:
[ɲ ɪ ɪ m]
* неся:
[ɲ ɪ a]
Occurrences:
2,281
Examples:
* лиссе:
[ʎ i sʲː e]
* иссек:
[ɪ sʲː e k]
* кассе:
[k a sʲː e]
* ниссе:
[ɲ i sʲː e]
Occurrences:
0
Examples:
Occurrences:
14,113
Examples:
* ездя:
[j e ə]
* казне:
[k ɐ ɲ e]
* зевая:
[ ɪ v a j ə]
* взяты:
[v a ɨ]
Occurrences:
11
Examples:
Occurrences:
64,256
Examples:
* шьет:
[ʂ j ɪ ]
* спешу:
[ e ʂ ʊ]
* лужку:
[ɫ ʊ ʂ k u]
* шкуре:
[ʂ k u e]
Occurrences:
837
Examples:
* сшиб:
[ʂː ɨ p]
* сшить:
[ʂː ɨ ]
* сшила:
[ʂː ɨ ɫ ə]
* сшит:
[ʂː ɨ ]
Occurrences:
28,369
Examples:
* живы:
[ʐ ɨ v ɨ]
* даже:
[ a ʐ ɨ]
* жни:
[ʐ ɲ i]
* лежим:
[ʎ ɪ ʐ ɨ m]
Occurrences:
1,101
Examples:
* вожжи:
[v o ʐː ɨ]
* сжато:
[ʐː a ə]
* сжать:
[ʐː a ]
* жжет:
[ʐː ɨ ]
Occurrences:
50
Examples:
* исчах:
[ɪ ɕ ə x]
Occurrences:
23,278
Examples:
* спущу:
[ p ʊ ɕː u]
* вещь:
[ e ɕː]
* могущ:
[m ɐ ɡ u ɕː]
* щадят:
[ɕː ɪ a ]
Occurrences:
0
Examples:
Occurrences:
779
Examples:
* дожде:
[ ɐ ʑː e]
* езжу:
[j e ʑː ʊ]
* дождя:
[ ɐ ʑː a]
* жужжа:
[ʐ ʊ ʑː a]

Fricative

Occurrences:
60,559
Examples:
* иосиф:
[ɪ o ɪ f]
* утаив:
[ʊ ɐ i f]
* графа:
[ɡ r a f ə]
* еров:
[j ɪ r o f]
Occurrences:
6,553
Examples:
* кофий:
[k o ɪ j]
* фидаи:
[ ɪ ɐ i]
* фич:
[ i ]
* финал:
[ ɪ a ɫ]
Occurrences:
218
Examples:
Occurrences:
43
Examples:
* хоффу:
[x ɐ ʊ]
* куффы:
[k u ɨ]
* куффу:
[k u ʊ]
* хоффа:
[x ɐ ə]
Occurrences:
127,879
Examples:
* язвой:
[j a v ə j]
* вузах:
[v u ə x]
* ваша:
[v a ʂ ə]
* лавр:
[ɫ a v r]
Occurrences:
37,382
Examples:
* весей:
[ ɪ e j]
* вялых:
[ a ɫ ɨ x]
* овец:
[ɐ e t̪s̪]
* веера:
[ e j ɪ r ə]
Occurrences:
294
Examples:
* савл:
[ ɐ vʲː ɫ]
* ввело:
[vʲː ɪ ɫ o]
* вверь:
[vʲː e ]
* ввела:
[vʲː ɪ ɫ a]
Occurrences:
196
Examples:
* ввоза:
[ o ə]
* ввод:
[ o ]
* вводу:
[ o ʊ]
* вволю:
[ o ʎ ʊ]
Occurrences:
3,233
Examples:
* хит:
[ç i ]
* блохи:
[b ɫ o ç ɪ]
* ухе:
[u ç e]
* лихие:
[ʎ ɪ ç i j e]

Approximant

Occurrences:
193,567
Examples:
* уйдёт:
[ʊ j ɵ ]
* ютуб:
[j u ʊ p]
* злее:
[ ʎ e j e]
* борой:
[b o r ə j]
Occurrences:
95
Examples:
* фойе:
[f ɐ e]
* ийя:
[ɪ ə]

Trill

Occurrences:
165,446
Examples:
* раме:
[r a e]
* отпер:
[o ɪ r]
* лора:
[ɫ o r ə]
* ясара:
[j ɪ a r ə]
Occurrences:
82,753
Examples:
* ребр:
[ ɪ b r]
* трёт:
[ ɵ ]
* репу:
[ e p ʊ]
* перед:
[ e ɪ ]
Occurrences:
332
Examples:
* гарри:
[ɡ a rʲː ɪ]
* серри:
[ ɪ rʲː i]
* перри:
[ ɪ rʲː i]
* шерри:
[ʂ ɨ rʲː i]
Occurrences:
387
Examples:
* эрроу:
[ɪ o ʊ]
* мирры:
[ i ɨ]
* тррах:
[ ə x]
* мирра:
[ i ə]

Lateral

Occurrences:
90,761
Examples:
* скалу:
[ k ɐ ɫ u]
* плача:
[p ɫ a ə]
* план:
[p ɫ a ]
* ложки:
[ɫ o ʂ c ɪ]
Occurrences:
735
Examples:
* ллойд:
[ɫː o j ]
* салла:
[ ɐ ɫː ə]
* белла:
[b ɛ ɫː ə]
* алла:
[a ɫː ə]
Occurrences:
97,967
Examples:
* везли:
[ ɪ ʎ i]
* силен:
[ i ʎ ɪ ]
* эмиль:
[ɪ i ʎ]
* нежли:
[ɲ ɪ ʐ ʎ ɪ]
Occurrences:
1,404
Examples:
* галле:
[ɡ a ʎː e]
* долли:
[ o ʎː ɪ]
* холле:
[x o ʎː e]
* аллей:
[ɐ ʎː e j]

Vowels

Within each pair, vowel symbols on the left are unrounded and those on the right are rounded.

Front

Near-Front

Central

Near-Back

Back

Close

Occurrences:
86,138
Examples:
* соли:
[ ɐ ʎ i]
* играм:
[i ɡ r ə m]
* хьюиш:
[x ʊ i ʂ]
* таили:
[ ɐ i ʎ ɪ]
Occurrences:
142,185
Examples:
* биржи:
[ i r ʐ ɨ]
* улице:
[u ʎ ɪ t̪s̪ ɨ]
* густы:
[ɡ u ɨ]
* ныне:
[ ɨ ɲ e]
Occurrences:
14,136
Examples:
* цюня:
[tsʲ ʉ ɲ ə]
* утюге:
[ʊ ʉ ɟ e]
* чуев:
[ ʉ j ɪ f]
* плюща:
[p ʎ ʉ ɕː ə]
Occurrences:
39,084
Examples:
* слуха:
[ ɫ u x ə]
* южным:
[j u ʐ ɨ m]
* буры:
[b u r ɨ]
* торгу:
[ ɐ r ɡ u]
Occurrences:
428,405
Examples:
* черти:
[ e r ɪ]
* щечки:
[ɕː e c ɪ]
* низал:
[ɲ ɪ a ɫ]
* дерев:
[ ɪ e f]
Occurrences:
115,796
Examples:
* сушку:
[ u ʂ k ʊ]
* локтю:
[ɫ o c ʊ]
* фауну:
[f a ʊ ʊ]
* маму:
[m a m ʊ]

Close-Mid

Occurrences:
101,845
Examples:
* копье:
[k ɐ j e]
* вешал:
[ e ʂ ə ɫ]
* метел:
[ ɪ e ɫ]
* славе:
[ ɫ a e]
Occurrences:
15,369
Examples:
* всё:
[f ɵ]
* царём:
[t̪s̪ ɐ ɵ m]
* пчёлы:
[p ɵ ɫ ɨ]
* рёв:
[ ɵ f]
Occurrences:
68,961
Examples:
* лишён:
[ʎ ɪ ʂ o ]
* сотне:
[ o ɲ e]
* полн:
[p o ɫ ]
* свода:
[ v o ə]
Occurrences:
379,739
Examples:
* моего:
[m ə j ɪ v o]
* ипром:
[i p r ə m]
* газом:
[ɡ a ə m]
* бруса:
[b r u ə]

Open-Mid

Occurrences:
11,111
Examples:
* женин:
[ʐ ɛ ɲ ɪ ]
* стена:
[ ɛ ə]
* вебер:
[v ɛ ɪ r]
* цеху:
[t̪s̪ ɛ x ʊ]
Occurrences:
20,001
Examples:
* сияй:
[ ɪ j æ j]
* чащах:
[ æ ɕː ə x]
* ящике:
[j æ ɕː ɪ c e]
* ряде:
[ æ e]
Occurrences:
236,936
Examples:
* козлы:
[k ɐ ɫ ɨ]
* кормы:
[k ɐ r m ɨ]
* таро:
[ ɐ r o]
* огнем:
[ɐ ɟ ɲ ɪ m]

Open

Occurrences:
152,488
Examples:
* глава:
[ɡ ɫ ɐ v a]
* мкаде:
[m k a e]
* башне:
[b a ʂ ɲ e]
* мигза:
[ ɪ ɡ a]