Russian MFA dictionary v3.1.0
@techreport{mfa_russian_mfa_dictionary_2024,
author={McAuliffe, Michael and Sonderegger, Morgan},
title={Russian MFA dictionary v3.1.0},
address={\url{https://mfa-models.readthedocs.io/pronunciation dictionary/Russian/Russian MFA dictionary v3_1_0.html}},
year={2024},
month={Jun},
}
Installation
Install from the MFA command line:
mfa model download dictionary russian_mfa
Or download from the release page.
The dictionary available from the release page and from command-line installation has pronunciation and silence probabilities estimated as part of acoustic model training (see Silence probability format and training pronunciation probabilities for more information). If you would like to use the version of this dictionary without probabilities, please see the [plain dictionary](https://raw.githubusercontent.com/MontrealCorpusTools/mfa-models/main/dictionary/russian/mfa/Russian MFA dictionary v3_1_0.dict).
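To make the difference between the two versions concrete, the sketch below reduces a dictionary line to its word and phones by dropping any leading numeric columns. It assumes a whitespace-separated layout in which the word comes first, optional numeric probability fields follow, and the remaining fields are phones. The probability values in the sample line are invented for illustration, and `strip_probabilities` is a hypothetical helper, not part of MFA.

```python
def _is_number(field):
    """Return True if the field parses as a float."""
    try:
        float(field)
        return True
    except ValueError:
        return False


def strip_probabilities(line):
    """Split a dictionary line into (word, phones), dropping numeric columns.

    Assumption: the word comes first, optional numeric probability fields
    follow, and every remaining field is a phone.
    """
    fields = line.split()
    word, rest = fields[0], fields[1:]
    # Skip the leading fields that look like probabilities.
    i = 0
    while i < len(rest) and _is_number(rest[i]):
        i += 1
    return word, rest[i:]


# Invented probability values; the pronunciation comes from the chart examples.
with_probs = "война\t0.99\t0.12\t1.05\t0.98\tv ɐ j n̪ a"
plain = "война\tv ɐ j n̪ a"

print(strip_probabilities(with_probs))
print(strip_probabilities(plain))
```

Both calls yield the same word and phone list, which is the information the plain dictionary carries.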
Intended use
This dictionary is intended for forced alignment of Russian transcripts.
This dictionary uses the MFA phone set for Russian, and was used in training the Russian MFA acoustic model. Pronunciations can be added on top of the dictionary, as long as no additional phones are introduced.
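One way to respect the "no additional phones" constraint is to check each new pronunciation against the phones already in use before adding it. The following is a minimal sketch: `unknown_phones` is a hypothetical helper, and the phone set here is built from only three example pronunciations on this page, whereas a real check would collect phones from every entry in the dictionary.

```python
def unknown_phones(pronunciation, phone_set):
    """Return phones in a space-separated pronunciation that are not in phone_set."""
    return [p for p in pronunciation.split() if p not in phone_set]


# Small excerpt of pronunciations taken from the chart examples on this page.
# A real check would build the set from the whole dictionary.
known_pronunciations = [
    "v ɐ j n̪ a",   # война
    "p ɐ k o j",    # покой
    "s̪ k u p ɨ",   # скупы
]
phone_set = {p for pron in known_pronunciations for p in pron.split()}

# Every phone is already in the excerpt, so nothing is reported.
print(unknown_phones("p ɐ k o j", phone_set))
# ɔ is not in this excerpt, so it is reported as unknown.
print(unknown_phones("p ɐ k ɔ j", phone_set))
```

A pronunciation that returns an empty list introduces no new phones relative to the set it was checked against.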
Performance Factors
When trying to improve alignment accuracy, adding pronunciations is generally helpful, especially for different styles and dialects. The largest improvements generally come from adding reduced variants with deleted segments or syllables, as is common in spontaneous speech. Alignment must include every phone specified in a word's pronunciation, and each phone has a minimum duration (10 ms by default). If a speaker pronounces a multisyllabic word as a single syllable, MFA may be unable to fit all the segments in, which leads to alignment errors on adjacent words as well.
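The minimum-duration constraint can be worked through with simple arithmetic: a pronunciation with n phones cannot be aligned to an interval shorter than n times the per-phone minimum. The sketch below assumes the 10 ms default mentioned above. The helper names are hypothetical, and the reduced variant is invented purely to illustrate the calculation; it is not an entry in the dictionary.

```python
def min_duration_ms(pronunciation, min_phone_ms=10):
    """Shortest interval (in ms) into which every phone of the pronunciation fits."""
    return len(pronunciation.split()) * min_phone_ms


def fits(pronunciation, available_ms, min_phone_ms=10):
    """Return True if all phones can be placed within available_ms."""
    return min_duration_ms(pronunciation, min_phone_ms) <= available_ms


full = "ʐ ɨ tʲ ɪ j e"   # житие, six phones (from the chart examples)
reduced = "ʐ ɨ tʲ e"    # hypothetical reduced variant, for illustration only

print(min_duration_ms(full))     # 60
print(min_duration_ms(reduced))  # 40
print(fits(full, 50))            # False
print(fits(reduced, 50))         # True
```

If the speaker produced the word in about 50 ms, only the shorter variant can be placed without borrowing time from neighbouring words.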
Ethical considerations
Deploying any Speech-to-Text model into any production setting has ethical implications. You should consider these implications before use.
Demographic Bias
You should assume every machine learning model has demographic bias unless proven otherwise. For pronunciation dictionaries, transcription accuracy and lexicon coverage are often better for the prestige variety modeled in this dictionary than for other variants. If you are using this dictionary in production, you should acknowledge this as a potential issue.
IPA Charts
Consonants
In each cell, obstruent symbols on the left are unvoiced and those on the right are voiced.
Manner | Labial | Labiodental | Dental | Alveolar | Retroflex | Palatal | Velar |
|---|---|---|---|---|---|---|---|
Nasal |
Occurrences: 104,592 Examples: * шалом: [ʂ ɐ ɫ o m] * днём: [dʲ ɲ ɵ m] * бором: [b o r ə m] * манят: [m a ɲ ə t̪] Occurrences: 49,477 Examples: * химик: [ç i mʲ ɪ k] * минеи: [mʲ ɪ ɲ e ɪ] * мерз: [mʲ e r s̪] * мягко: [mʲ a x k ə] Occurrences: 421 Examples: * сумме: [s̪ u mʲː e] * комми: [k ɐ mʲː ɪ] Occurrences: 632 Examples: * сумму: [s̪ u mː ʊ] * умма: [u mː ə] * гаммы: [ɡ a mː ɨ] * суммы: [s̪ u mː ɨ] |
Occurrences: 111,963 Examples: * умной: [ʊ m n̪ o j] * война: [v ɐ j n̪ a] * паин: [p a ɪ n̪] * едина: [j ɪ dʲ i n̪ ə] Occurrences: 13,428 Examples: * пенно: [pʲ e n̪ː ə] * анны: [a n̪ː ɨ] * танны: [t̪ a n̪ː ɨ] * ванна: [v a n̪ː ə] |
Occurrences: 81,313 Examples: * денек: [dʲ e ɲ ɪ k] * данте: [d̪ a ɲ tʲ e] * конни: [k o ɲ ɪ] * гнет: [ɟ ɲ ɪ t̪] Occurrences: 1,358 Examples: * анни: [ɐ ɲː ɪ] * дэнни: [d̪ ɛ ɲː ɪ] * бонне: [b o ɲː e] * ванне: [v a ɲː e] |
||||
Stop |
Occurrences: 118,098 Examples: * парча: [p ɐ r tɕ a] * покой: [p ɐ k o j] * хопом: [x ɐ p ə m] * полли: [p o ʎː ɪ] Occurrences: 28,643 Examples: * топи: [t̪ o pʲ ɪ] * шляпе: [ʂ ʎ æ pʲ e] * пели: [pʲ e ʎ ɪ] * пиу: [pʲ ɪ ʊ] Occurrences: 231 Examples: * пеппи: [pʲ ɪ pʲː ɪ] * хиппи: [ç i pʲː ɪ] * хеппи: [x ɨ pʲː ɪ] * пиппи: [pʲ ɪ pʲː ɪ] Occurrences: 384 Examples: * каппы: [k a pː ɨ] Occurrences: 45,028 Examples: * бреет: [b rʲ e j ɪ t̪] * бытья: [b ɨ tʲ j a] * барж: [b a r ʂ] * брэма: [b rʲ ɛ m ə] Occurrences: 17,745 Examples: * бежал: [bʲ ɪ ʐ a ɫ] * бабе: [b a bʲ e] * обе: [o bʲ e] * берне: [bʲ ɪ r ɲ e] Occurrences: 85 Examples: * лобби: [ɫ o bʲː ɪ] * хобби: [x o bʲː ɪ] Occurrences: 117 Examples: * аббас: [ɐ bː a s̪] |
Occurrences: 152,418 Examples: * толик: [t̪ ɐ ʎ i k] * досад: [d̪ ɐ s̪ a t̪] * мнит: [m ɲ i t̪] * мирят: [mʲ ɪ rʲ a t̪] Occurrences: 705 Examples: * будто: [b u t̪ː ə] * отток: [ɐ t̪ː o k] * гетто: [ɟ e t̪ː ə] * чемто: [tɕ ɪ mʲ t̪ː ə] Occurrences: 55,249 Examples: * инда: [ɪ n̪ d̪ ə] * дрянь: [d̪ rʲ æ ɲ] * душу: [d̪ u ʂ ʊ] * деда: [dʲ e d̪ ə] Occurrences: 635 Examples: * будды: [b u d̪ː ɨ] * отдал: [ɐ d̪ː a ɫ] * будда: [b u d̪ː ə] |
Occurrences: 0 Examples: Occurrences: 86,410 Examples: * стихе: [sʲ tʲ ɪ ç e] * терн: [tʲ e r n̪] * лэти: [ɫ ɛ tʲ ɪ] * житие: [ʐ ɨ tʲ ɪ j e] Occurrences: 642 Examples: * идти: [ɪ tʲː i] * матти: [m a tʲː i] * итти: [ɪ tʲː i] Occurrences: 0 Examples: Occurrences: 32,399 Examples: * диац: [dʲ ɪ ɐ t̪s̪] * диор: [dʲ ɪ o r] * ходит: [x o dʲ ɪ t̪] * ведет: [vʲ e dʲ ɪ t̪] Occurrences: 556 Examples: * эдди: [ɛ dʲː ɪ] * отдел: [ɐ dʲː e ɫ] * будде: [b u dʲː e] |
Occurrences: 33,133 Examples: * мягки: [mʲ æ ç c ɪ] * кисни: [c i sʲ ɲ ɪ] * резки: [rʲ ɪ s̪ c i] * китон: [c ɪ t̪ o n̪] Occurrences: 167 Examples: * микки: [mʲ i cː ɪ] * аккре: [ɐ cː rʲ e] Occurrences: 14,856 Examples: * блоги: [b ɫ o ɟ ɪ] * мозги: [m ɐ z̪ ɟ i] * греть: [ɟ rʲ e tʲ] * глюка: [ɟ ʎ u k ə] Occurrences: 73 Examples: * аггею: [a ɟː ɪ j ʊ] * хигги: [ç i ɟː ɪ] * аггея: [a ɟː e] * агги: [a ɟː ɪ] |
Occurrences: 109,579 Examples: * скупы: [s̪ k u p ɨ] * корок: [k o r ə k] * корою: [k o r ə j ʊ] * плеск: [p ʎ e s̪ k] Occurrences: 387 Examples: * аккад: [ɐ kː ə t̪] * кктай: [kː t̪ ə j] * мокка: [m ɐ kː ə] * яакко: [j ɪ kː ə] Occurrences: 39,556 Examples: * рогат: [r ɐ ɡ a t̪] * груды: [ɡ r u d̪ ɨ] * грома: [ɡ r ɐ m a] * шагом: [ʂ a ɡ ə m] Occurrences: 24 Examples: |
||
Affricate |
Occurrences: 23,078 Examples: * борца: [b ɐ r t̪s̪ a] * делец: [dʲ ɪ ʎ e t̪s̪] * ценят: [t̪s̪ ɛ ɲ ə t̪] * цезии: [t̪s̪ ɛ zʲ ɪ ɪ] Occurrences: 1,071 Examples: * отцы: [ɐ t̪s̪ː ɨ] * ниццы: [ɲ i t̪s̪ː ɨ] * пицце: [pʲ i t̪s̪ː ɨ] * поццо: [p ɐ ɛ t̪s̪ː ə] Occurrences: 52 Examples: Occurrences: 9 Examples: * цзы: [d̪z̪ː ɨ] |
Occurrences: 0 Examples: Occurrences: 26 Examples: * цюня: [tsʲ ʉ ɲ ə] * цюнь: [tsʲ ʉ ɲ] * цянь: [tsʲ a ɲ] Occurrences: 0 Examples: Occurrences: 20 Examples: |
Occurrences: 44,282 Examples: * чёрти: [tɕ ɵ r tʲ ɪ] * чун: [tɕ u n̪] * удаче: [ʊ d̪ a tɕ e] * читу: [tɕ i t̪ ʊ] Occurrences: 1,398 Examples: * отчим: [o tɕː ɪ m] * четче: [tɕ e tɕː e] * путча: [p u tɕː ə] * вотче: [v o tɕː e] |
||||
Sibilant |
Occurrences: 127,187 Examples: * пьесу: [pʲ j e s̪ ʊ] * лесу: [ʎ e s̪ ʊ] * послы: [p ɐ s̪ ɫ ɨ] * трусы: [t̪ r ʊ s̪ ɨ] Occurrences: 7,019 Examples: * отцом: [ɐ t̪s̪ː o m] * пассы: [p ɐ s̪ː ɨ] * пиццы: [pʲ i t̪s̪ː ɨ] * касса: [k a s̪ː ə] Occurrences: 67,707 Examples: * загон: [z̪ ɐ ɡ o n̪] * зыбью: [z̪ ɨ bʲ j ʊ] * зову: [z̪ o v ʊ] * сдует: [z̪ d̪ u j ɪ t̪] Occurrences: 225 Examples: * сзади: [z̪ː a dʲ ɪ] * базз: [b ɐ z̪ː] * изза: [ɪ ə z̪ː ə] * цзы: [d̪z̪ː ɨ] |
Occurrences: 0 Examples: Occurrences: 81,067 Examples: * усищи: [ʊ sʲ i ɕː ɪ] * росли: [r ɐ sʲ ʎ i] * несем: [ɲ ɪ sʲ ɪ m] * неся: [ɲ ɪ sʲ a] Occurrences: 2,281 Examples: * лиссе: [ʎ i sʲː e] * иссек: [ɪ sʲː e k] * кассе: [k a sʲː e] * ниссе: [ɲ i sʲː e] Occurrences: 0 Examples: Occurrences: 14,113 Examples: * ездя: [j e zʲ dʲ ə] * казне: [k ɐ zʲ ɲ e] * зевая: [zʲ ɪ v a j ə] * взяты: [v zʲ a t̪ ɨ] Occurrences: 11 Examples: |
Occurrences: 64,256 Examples: * шьет: [ʂ j ɪ t̪] * спешу: [s̪ pʲ e ʂ ʊ] * лужку: [ɫ ʊ ʂ k u] * шкуре: [ʂ k u rʲ e] Occurrences: 837 Examples: * сшиб: [ʂː ɨ p] * сшить: [ʂː ɨ tʲ] * сшила: [ʂː ɨ ɫ ə] * сшит: [ʂː ɨ t̪] Occurrences: 28,369 Examples: * живы: [ʐ ɨ v ɨ] * даже: [d̪ a ʐ ɨ] * жни: [ʐ ɲ i] * лежим: [ʎ ɪ ʐ ɨ m] Occurrences: 1,101 Examples: * вожжи: [v o ʐː ɨ] * сжато: [ʐː a t̪ ə] * сжать: [ʐː a tʲ] * жжет: [ʐː ɨ t̪] |
Occurrences: 50 Examples: * исчах: [ɪ ɕ tɕ ə x] Occurrences: 23,278 Examples: * спущу: [s̪ p ʊ ɕː u] * вещь: [vʲ e ɕː] * могущ: [m ɐ ɡ u ɕː] * щадят: [ɕː ɪ dʲ a t̪] Occurrences: 0 Examples: Occurrences: 779 Examples: * дожде: [d̪ ɐ ʑː e] * езжу: [j e ʑː ʊ] * дождя: [d̪ ɐ ʑː a] * жужжа: [ʐ ʊ ʑː a] |
|||
Fricative |
Occurrences: 60,559 Examples: * иосиф: [ɪ o sʲ ɪ f] * утаив: [ʊ t̪ ɐ i f] * графа: [ɡ r a f ə] * еров: [j ɪ r o f] Occurrences: 6,553 Examples: * кофий: [k o fʲ ɪ j] * фидаи: [fʲ ɪ d̪ ɐ i] * фич: [fʲ i tɕ] * финал: [fʲ ɪ n̪ a ɫ] Occurrences: 218 Examples: Occurrences: 43 Examples: * хоффу: [x ɐ fː ʊ] * куффы: [k u fː ɨ] * куффу: [k u fː ʊ] * хоффа: [x ɐ fː ə] Occurrences: 127,879 Examples: * язвой: [j a z̪ v ə j] * вузах: [v u z̪ ə x] * ваша: [v a ʂ ə] * лавр: [ɫ a v r] Occurrences: 37,382 Examples: * весей: [vʲ ɪ sʲ e j] * вялых: [vʲ a ɫ ɨ x] * овец: [ɐ vʲ e t̪s̪] * веера: [vʲ e j ɪ r ə] Occurrences: 294 Examples: * савл: [s̪ ɐ vʲː ɫ] * ввело: [vʲː ɪ ɫ o] * вверь: [vʲː e rʲ] * ввела: [vʲː ɪ ɫ a] Occurrences: 196 Examples: * ввоза: [vː o z̪ ə] * ввод: [vː o t̪] * вводу: [vː o d̪ ʊ] * вволю: [vː o ʎ ʊ] |
Occurrences: 3,233 Examples: * хит: [ç i t̪] * блохи: [b ɫ o ç ɪ] * ухе: [u ç e] * лихие: [ʎ ɪ ç i j e] |
|||||
Approximant |
Occurrences: 193,567 Examples: * уйдёт: [ʊ j dʲ ɵ t̪] * ютуб: [j u t̪ ʊ p] * злее: [zʲ ʎ e j e] * борой: [b o r ə j] Occurrences: 95 Examples: * фойе: [f ɐ jː e] * ийя: [ɪ jː ə] |
||||||
Trill |
Occurrences: 165,446 Examples: * раме: [r a mʲ e] * отпер: [o t̪ pʲ ɪ r] * лора: [ɫ o r ə] * ясара: [j ɪ s̪ a r ə] Occurrences: 82,753 Examples: * ребр: [rʲ ɪ b r] * трёт: [t̪ rʲ ɵ t̪] * репу: [rʲ e p ʊ] * перед: [pʲ e rʲ ɪ t̪] Occurrences: 332 Examples: * гарри: [ɡ a rʲː ɪ] * серри: [sʲ ɪ rʲː i] * перри: [pʲ ɪ rʲː i] * шерри: [ʂ ɨ rʲː i] Occurrences: 387 Examples: * эрроу: [ɪ rː o ʊ] * мирры: [mʲ i rː ɨ] * тррах: [t̪ rː ə x] * мирра: [mʲ i rː ə] |
||||||
Lateral |
Occurrences: 90,761 Examples: * скалу: [s̪ k ɐ ɫ u] * плача: [p ɫ a tɕ ə] * план: [p ɫ a n̪] * ложки: [ɫ o ʂ c ɪ] Occurrences: 735 Examples: * ллойд: [ɫː o j t̪] * салла: [s̪ ɐ ɫː ə] * белла: [b ɛ ɫː ə] * алла: [a ɫː ə] |
Occurrences: 97,967 Examples: * везли: [vʲ ɪ zʲ ʎ i] * силен: [sʲ i ʎ ɪ n̪] * эмиль: [ɪ mʲ i ʎ] * нежли: [ɲ ɪ ʐ ʎ ɪ] Occurrences: 1,404 Examples: * галле: [ɡ a ʎː e] * долли: [d̪ o ʎː ɪ] * холле: [x o ʎː e] * аллей: [ɐ ʎː e j] |
Vowels
In each cell, vowel symbols on the left are unrounded and those on the right are rounded.
| | Front | Near-Front | Central | Near-Back | Back |
|---|---|---|---|---|---|
Close |
Occurrences: 86,138 Examples: * соли: [s̪ ɐ ʎ i] * играм: [i ɡ r ə m] * хьюиш: [x ʊ i ʂ] * таили: [t̪ ɐ i ʎ ɪ] |
Occurrences: 142,185 Examples: * биржи: [bʲ i r ʐ ɨ] * улице: [u ʎ ɪ t̪s̪ ɨ] * густы: [ɡ u s̪ t̪ ɨ] * ныне: [n̪ ɨ ɲ e] Occurrences: 14,136 Examples: * цюня: [tsʲ ʉ ɲ ə] * утюге: [ʊ tʲ ʉ ɟ e] * чуев: [tɕ ʉ j ɪ f] * плюща: [p ʎ ʉ ɕː ə] |
Occurrences: 39,084 Examples: * слуха: [s̪ ɫ u x ə] * южным: [j u ʐ n̪ ɨ m] * буры: [b u r ɨ] * торгу: [t̪ ɐ r ɡ u] |
||
Occurrences: 428,405 Examples: * черти: [tɕ e r tʲ ɪ] * щечки: [ɕː e tɕ c ɪ] * низал: [ɲ ɪ z̪ a ɫ] * дерев: [dʲ ɪ rʲ e f] |
Occurrences: 115,796 Examples: * сушку: [s̪ u ʂ k ʊ] * локтю: [ɫ o c tʲ ʊ] * фауну: [f a ʊ n̪ ʊ] * маму: [m a m ʊ] |
||||
Close-Mid |
Occurrences: 101,845 Examples: * копье: [k ɐ pʲ j e] * вешал: [vʲ e ʂ ə ɫ] * метел: [mʲ ɪ tʲ e ɫ] * славе: [s̪ ɫ a vʲ e] |
Occurrences: 15,369 Examples: * всё: [f sʲ ɵ] * царём: [t̪s̪ ɐ rʲ ɵ m] * пчёлы: [p tɕ ɵ ɫ ɨ] * рёв: [rʲ ɵ f] |
Occurrences: 68,961 Examples: * лишён: [ʎ ɪ ʂ o n̪] * сотне: [s̪ o tʲ ɲ e] * полн: [p o ɫ n̪] * свода: [s̪ v o d̪ ə] |
||
Occurrences: 379,739 Examples: * моего: [m ə j ɪ v o] * ипром: [i p r ə m] * газом: [ɡ a z̪ ə m] * бруса: [b r u s̪ ə] |
|||||
Open-Mid |
Occurrences: 11,111 Examples: * женин: [ʐ ɛ ɲ ɪ n̪] * стена: [s̪ t̪ ɛ n̪ ə] * вебер: [v ɛ bʲ ɪ r] * цеху: [t̪s̪ ɛ x ʊ] |
||||
Occurrences: 20,001 Examples: * сияй: [sʲ ɪ j æ j] * чащах: [tɕ æ ɕː ə x] * ящике: [j æ ɕː ɪ c e] * ряде: [rʲ æ dʲ e] |
Occurrences: 236,936 Examples: * козлы: [k ɐ z̪ ɫ ɨ] * кормы: [k ɐ r m ɨ] * таро: [t̪ ɐ r o] * огнем: [ɐ ɟ ɲ ɪ m] |
||||
Open |
Occurrences: 152,488 Examples: * глава: [ɡ ɫ ɐ v a] * мкаде: [m k a dʲ e] * башне: [b a ʂ ɲ e] * мигза: [mʲ ɪ ɡ z̪ a] |
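
The "Occurrences" figures in the charts can be approximated with a simple tally. This page does not state how the counts were computed, so the sketch below assumes each figure is the number of times a phone appears across all pronunciations in the dictionary. `count_phones` is a hypothetical helper, and the sample uses only four pronunciations from the examples above.

```python
from collections import Counter


def count_phones(pronunciations):
    """Count how often each phone occurs across space-separated pronunciations."""
    counts = Counter()
    for pron in pronunciations:
        counts.update(pron.split())
    return counts


# Four pronunciations from the chart examples: шалом, бором, маму, сумму.
sample = ["ʂ ɐ ɫ o m", "b o r ə m", "m a m ʊ", "s̪ u mː ʊ"]
counts = count_phones(sample)

print(counts["m"])   # 4
print(counts["mː"])  # 1
print(counts["ʊ"])   # 2
```

Because phones are split on whitespace, a geminate such as mː is counted separately from m, matching the separate chart entries for long and short consonants.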