Russian MFA dictionary v2.0.0a#
@techreport{mfa_russian_mfa_dictionary_2022,
author={McAuliffe, Michael and Sonderegger, Morgan},
title={Russian MFA dictionary v2.0.0a},
address={\url{https://mfa-models.readthedocs.io/pronunciation dictionary/Russian/Russian MFA dictionary v2_0_0a.html}},
year={2022},
month={May},
}
Installation#
Install from the MFA command line:
mfa model download dictionary russian_mfa
Or download from the release page.
The dictionary available from the release page and through command-line installation includes pronunciation and silence probabilities estimated as part of acoustic model training (see Silence probability format and training pronunciation probabilities for more information). If you would like to use a version of this dictionary without probabilities, please see the plain dictionary.
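To illustrate how the two versions differ, the sketch below drops the probability columns from a single dictionary line. It assumes a tab-separated layout in which the word comes first, the space-separated phones come last, and every column in between is a probability; verify this against the Silence probability format documentation before relying on it. The helper name `strip_probabilities` and the numeric values in the sample line are hypothetical.

```python
def strip_probabilities(line: str) -> str:
    """Return 'word<TAB>phones', discarding any columns in between.

    Assumption (not confirmed by this page): fields are tab-separated,
    the word is the first field and the phones are the last field.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 2:
        raise ValueError(f"expected at least a word and a pronunciation: {line!r}")
    word, phones = fields[0], fields[-1]
    return f"{word}\t{phones}"


# Illustrative line; the four numbers are made up for this example.
with_probs = "мур\t0.99\t0.05\t1.0\t1.0\tm u r"
plain = strip_probabilities(with_probs)
print(plain)
```

For real use, the published plain dictionary is the safer choice; this only shows the shape of the transformation.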
Intended use#
This dictionary is intended for forced alignment of Russian transcripts.
This dictionary uses the MFA phone set for Russian, and was used in training the Russian MFA acoustic model. Pronunciations can be added on top of the dictionary, as long as no additional phones are introduced.
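The constraint on added pronunciations can be checked mechanically. The following is a minimal sketch, not part of MFA: it gathers the phones already used in a set of entries and reports any phone in a candidate pronunciation that falls outside that set. The function names are hypothetical, the three entries are a toy stand-in for the full dictionary, and each whitespace-separated token is assumed to be one phone.

```python
def collect_phones(entries):
    """Build the set of phones used by (word, pronunciation) pairs."""
    phones = set()
    for _word, pronunciation in entries:
        phones.update(pronunciation.split())
    return phones


def unknown_phones(pronunciation, known):
    """List phones in a candidate pronunciation that are not in `known`."""
    return [p for p in pronunciation.split() if p not in known]


# Toy stand-in for the real dictionary, using entries listed in the IPA charts.
existing = [("мур", "m u r"), ("рак", "r a k"), ("мор", "m o r")]
known = collect_phones(existing)

print(unknown_phones("m a k", known))  # empty list: no new phones introduced
print(unknown_phones("m a q", known))  # 'q' is not in the toy phone set
```

An empty result means the candidate reuses only existing phones; anything else would introduce a phone the acoustic model was not trained on.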
Performance Factors#
When trying to get better alignment accuracy, adding pronunciations is generally helpful, especially for different styles and dialects. The most impactful improvements will generally come from adding reduced variants that delete segments or syllables, as is common in spontaneous speech. Alignment must include all phones specified in the pronunciation of a word, and each phone has a minimum duration (by default 10 ms). If a speaker pronounces a multisyllabic word as a single syllable, MFA can struggle to fit all the segments in, which leads to alignment errors on adjacent words as well.
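The arithmetic behind this is simple: a pronunciation with N phones needs at least N times the minimum phone duration. The sketch below works through it with the 10 ms default stated above, using integer milliseconds. The function names and the reduced variant are hypothetical, chosen only for illustration, and each whitespace-separated token is assumed to be one phone.

```python
MIN_PHONE_MS = 10  # default minimum phone duration stated in the text


def min_word_duration_ms(pronunciation, min_phone_ms=MIN_PHONE_MS):
    """Shortest duration an alignment can assign to this pronunciation."""
    return len(pronunciation.split()) * min_phone_ms


def fits(pronunciation, spoken_ms, min_phone_ms=MIN_PHONE_MS):
    """True if every phone can receive its minimum duration in `spoken_ms`."""
    return spoken_ms >= min_word_duration_ms(pronunciation, min_phone_ms)


full = "k ɐ ɫ a ʂ"     # калаш as listed in the dictionary: five phones
reduced = "k ɫ a ʂ"    # hypothetical reduced variant: four phones

print(min_word_duration_ms(full))  # 5 phones * 10 ms
print(fits(full, 45))              # the full form cannot fit in 45 ms
print(fits(reduced, 45))           # the reduced variant can
```

When the full form does not fit, the aligner has to take time from neighbouring words, which is why a reduced variant can improve alignment beyond the word itself.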
Ethical considerations#
Deploying any Speech-to-Text model into any production setting has ethical implications. You should consider these implications before use.
Demographic Bias#
You should assume every machine learning model has demographic bias unless proven otherwise. For pronunciation dictionaries, transcription accuracy and lexicon coverage are often better for the prestige variety modeled in the dictionary than for other variants. If you are using this dictionary in production, you should acknowledge this as a potential issue.
IPA Charts#
Consonants#
Obstruent symbols on the left are unvoiced and those on the right are voiced.
Manner | Labial | Labiodental | Dental | Alveolar | Retroflex | Palatal | Velar |
---|---|---|---|---|---|---|---|
Nasal |
Occurrences: 95,612 Examples: * тима: [t ʲ i m ə] * мур: [m u r] * поем: [p ɐ j e m] * мачт: [m a t ɕ t ̪] Occurrences: 45,754 Examples: * томи: [t ̪ ɐ m ʲ i] * смела: [s ̪ m ʲ e ɫ ə] * умели: [ʊ m ʲ e ʎ ɪ] * гоями: [ɡ o j ə m ʲ ɪ] Occurrences: 391 Examples: * комми: [k ɐ m ʲ ː ɪ] * лемме: [ʎ e m ʲ ː e] * сумме: [s ̪ u m ʲ ː e] * гамме: [ɡ a m ʲ ː e] Occurrences: 593 Examples: * суммы: [s ̪ u m ː ɨ] * гаммы: [ɡ a m ː ɨ] * леммы: [ʎ e m ː ɨ] * умма: [u m ː ə] |
Occurrences: 97,046 Examples: * наряд: [n ̪ ɐ r ʲ a t ̪] * бедно: [b ʲ e d ̪ n ̪ ə] * руина: [r ʊ i n ̪ ə] * прян: [p r ʲ a n ̪] Occurrences: 9,498 Examples: * манны: [m a n ̪ ː ɨ] * инна: [i n ̪ ː ə] * сонны: [s ̪ o n ̪ ː ɨ] * анной: [ɐ n ̪ ː o j] |
Occurrences: 72,787 Examples: * пыхни: [p ɨ x ɲ i] * снять: [s ʲ ɲ æ t ʲ] * стыню: [s ̪ t ̪ ɨ ɲ ʊ] * неучу: [ɲ e ʊ t ɕ ʊ] Occurrences: 1,201 Examples: * донне: [d ̪ o ɲ ː e] * пенни: [p ʲ e ɲ ː ɪ] * нэнни: [n ̪ ɛ ɲ ː ɪ] * бонне: [b o ɲ ː e] |
||||
Stop |
Occurrences: 109,088 Examples: * плота: [p ɫ ɐ t ̪ a] * репу: [r ʲ e p ʊ] * поясь: [p ɐ j æ s ʲ] * пашня: [p a ʂ ɲ ə] Occurrences: 368 Examples: * каппы: [k a p ː ɨ] * каппа: [k a p ː ə] * каппу: [k a p ː ʊ] Occurrences: 41,438 Examples: * бирав: [b ʲ ɪ r a f] * обиды: [ɐ b ʲ i d ̪ ɨ] * бзом: [b z ̪ o m] * блузу: [b ɫ u z ̪ ʊ] Occurrences: 111 Examples: * аббас: [ɐ b ː a s ̪] * аббат: [ɐ b ː a t ̪] * абба: [ɐ b ː a] * аббою: [ɐ b ː o j ʊ] |
Occurrences: 140,480 Examples: * кутай: [k u t ̪ ə j] * пяты: [p ʲ ɪ t ̪ ɨ] * алеут: [ɐ ʎ ɪ u t ̪] * слёту: [s ʲ ʎ ɵ t ̪ ʊ] Occurrences: 668 Examples: * будто: [b u t ̪ ː ə] * гетто: [ɟ e t ̪ ː ə] * нетто: [n ̪ ɛ t ̪ ː ə] * оттаю: [ɐ t ̪ ː a j ʊ] Occurrences: 50,506 Examples: * дбало: [d ̪ b a ɫ ə] * дур: [d ̪ u r] * ябеда: [j æ b ʲ ɪ d ̪ ə] * дрочи: [d ̪ r ɐ t ɕ i] Occurrences: 608 Examples: * отдач: [ɐ d ̪ ː a t ɕ] * отдав: [ɐ d ̪ ː a f] * отдул: [ɐ d ̪ ː u ɫ] * аддон: [ɐ d ̪ ː o n ̪] |
Occurrences: 29,678 Examples: * метки: [m ʲ e t ̪ c ɪ] * шутки: [ʂ u t ̪ c ɪ] * отеки: [ɐ t ʲ ɪ c i] * шажки: [ʂ ɐ ʂ c i] Occurrences: 158 Examples: * сукки: [s ̪ u c ː ɪ] * микки: [m ʲ i c ː ɪ] * аккре: [ɐ c ː r ʲ e] * сукке: [s ̪ u c ː e] Occurrences: 13,563 Examples: * сгибу: [z ̪ ɟ i b ʊ] * гению: [ɟ e ɲ ɪ j ʊ] * жгли: [ʐ ɟ ʎ i] * гейшу: [ɟ e j ʂ ʊ] Occurrences: 66 Examples: * аггею: [a ɟ ː ɪ j ʊ] * хигги: [ç i ɟ ː ɪ] * аггей: [a ɟ ː ɪ j] * агги: [a ɟ ː ɪ] |
Occurrences: 100,640 Examples: * грека: [ɟ r ʲ e k ə] * рак: [r a k] * калаш: [k ɐ ɫ a ʂ] * текло: [t ʲ ɪ k ɫ o] Occurrences: 358 Examples: * сукка: [s ̪ u k ː ə] * яакко: [j ɪ k ː ə] * дакка: [d ̪ a k ː ə] * аккра: [a k ː r ə] Occurrences: 36,509 Examples: * луга: [ɫ u ɡ ə] * влогу: [v ɫ o ɡ ʊ] * гнору: [ɡ n ̪ ə r ʊ] * ногах: [n ̪ ɐ ɡ a x] Occurrences: 23 Examples: |
|||
Affricate |
Occurrences: 21,477 Examples: * цех: [t ̪ s ̪ ɛ x] * юнец: [j ʉ ɲ e t ̪ s ̪] * працу: [p r a t ̪ s ̪ ʊ] * целью: [t ̪ s ̪ ɛ ʎ j ʊ] Occurrences: 986 Examples: * пиццы: [p ʲ i t ̪ s ̪ ː ɨ] * отцы: [ɐ t ̪ s ̪ ː ɨ] * ситца: [s ʲ i t ̪ s ̪ ː ə] * ниццу: [ɲ i t ̪ s ̪ ː ʊ] Occurrences: 52 Examples: * цзы: [d ̪ z ̪ ː ɨ] * янцзы: [j a n ̪ d ̪ z ̪ ː ɨ] Occurrences: 9 Examples: * янцзы: [j a n ̪ d ̪ z ̪ ː ɨ] * цзы: [d ̪ z ̪ ː ɨ] |
Occurrences: 39,060 Examples: * маячь: [m ɐ j æ t ɕ] * печах: [p ʲ ɪ t ɕ a x] * чету: [t ɕ ɪ t ̪ u] * вечор: [v ʲ ɪ t ɕ ɵ r] Occurrences: 1,290 Examples: * четче: [t ɕ e t ɕ ː e] * отчим: [o t ɕ ː ɪ m] * матчи: [m a t ɕ ː ɪ] * отче: [o t ɕ ː e] |
|||||
Sibilant |
Occurrences: 114,322 Examples: * узкие: [u s ̪ c ɪ j e] * анис: [ɐ ɲ i s ̪] * сопку: [s ̪ o p k ʊ] * солей: [s ̪ o ʎ ɪ j] Occurrences: 6,451 Examples: * ссать: [s ̪ ː a t ʲ] * ссуд: [s ̪ ː u t ̪] * ассам: [ɐ s ̪ ː a m] * ситца: [s ʲ i t ̪ s ̪ ː ə] Occurrences: 62,391 Examples: * диезу: [d ʲ ɪ j e z ̪ ʊ] * язвам: [j a z ̪ v ə m] * лузою: [ɫ u z ̪ ə j ʊ] * запои: [z ̪ ɐ p o ɪ] Occurrences: 198 Examples: * сзади: [z ̪ ː a d ʲ ɪ] * изза: [ɪ ə z ̪ ː ə] * янцзы: [j a n ̪ d ̪ z ̪ ː ɨ] * цзы: [d ̪ z ̪ ː ɨ] |
Occurrences: 75,705 Examples: * кисте: [c ɪ s ʲ t ʲ e] * осях: [ɐ s ʲ a x] * гусям: [ɡ ʊ s ʲ a m] * месье: [m ʲ ɪ s ʲ j e] Occurrences: 2,051 Examples: * несся: [ɲ ɪ s ʲ ː ə] * пасся: [p a s ʲ ː ə] * нёсся: [ɲ ɵ s ʲ ː ə] * месси: [m ʲ ɪ s ʲ ː i] Occurrences: 13,001 Examples: * возня: [v ɐ z ʲ ɲ a] * езиде: [j ɪ z ʲ i d ʲ e] * грезя: [ɟ r ʲ e z ʲ ə] * зимуй: [z ʲ ɪ m u j] Occurrences: 11 Examples: * оззи: [o z ʲ ː ɪ] |
Occurrences: 60,057 Examples: * нашёл: [n ̪ ɐ ʂ o ɫ] * шубой: [ʂ u b ə j] * шхун: [ʂ x u n ̪] * пашню: [p a ʂ ɲ ʊ] Occurrences: 727 Examples: * сшили: [ʂ ː ɨ ʎ ɪ] * ведши: [v ʲ e t ʂ ː ɨ] * сшило: [ʂ ː ɨ ɫ ə] * нёсши: [ɲ ɵ ʂ ː ɨ] Occurrences: 25,733 Examples: * жире: [ʐ ɨ r ʲ e] * обяжи: [ɐ b ʲ ɪ ʐ ɨ] * сражу: [s ̪ r ɐ ʐ u] * ложны: [ɫ o ʐ n ̪ ɨ] Occurrences: 1,030 Examples: * джеме: [d ʐ ː ɛ m ʲ e] * джине: [d ʐ ː ɨ ɲ e] * фиджи: [f ʲ i d ʐ ː ɨ] * беджи: [b ɛ d ʐ ː ɨ] |
Occurrences: 691 Examples: * проч: [p r ɐ t ɕ] * сечка: [s ʲ e t ɕ k ə] * пещах: [p ʲ ɪ ɕ ː a x] * урчат: [ʊ r t ɕ a t ̪] Occurrences: 19,735 Examples: * щеп: [ɕ ː e p] * овощу: [o v ə ɕ ː ʊ] * тащат: [t ̪ a ɕ ː ə t ̪] * матчи: [m a t ɕ ː ɪ] Occurrences: 0 Examples: * позже: [p o ʑ ː e] * жужжи: [ʐ ʊ ʑ ː i] * визжу: [v ʲ ɪ ʑ ː u] * дожде: [d ̪ ɐ ʑ ː e] Occurrences: 732 Examples: * вожжи: [v o ʑ ː ɪ] * жужжа: [ʐ ʊ ʑ ː a] * вожжа: [v ɐ ʑ ː a] * вожже: [v ɐ ʑ ː e] |
|||
Fricative |
Occurrences: 57,073 Examples: * девки: [d ʲ e f c ɪ] * чинив: [t ɕ ɪ ɲ i f] * флеша: [f ɫ ɛ ʂ ə] * винив: [v ʲ ɪ ɲ i f] Occurrences: 5,969 Examples: * фейку: [f ʲ e j k ʊ] * фига: [f ʲ i ɡ ə] * фишке: [f ʲ i ʂ c e] * фиуме: [f ʲ ɪ ʊ m ʲ e] Occurrences: 196 Examples: Occurrences: 38 Examples: * хоффу: [x ɐ f ː ʊ] * куффы: [k u f ː ɨ] * куффу: [k u f ː ʊ] * хоффа: [x ɐ f ː ə] Occurrences: 116,962 Examples: * свече: [s ̪ v ʲ ɪ t ɕ e] * верах: [v ʲ e r ə x] * веже: [v ʲ e ʐ ɨ] * иврит: [ɪ v r ʲ i t ̪] Occurrences: 33,769 Examples: * навел: [n ̪ ə v ʲ ɪ ɫ] * виска: [v ʲ ɪ s ̪ k a] * витаю: [v ʲ ɪ t ̪ a j ʊ] * вежей: [v ʲ e ʐ ɨ j] Occurrences: 273 Examples: * ввезу: [v ʲ ː ɪ z ̪ u] * вверх: [v ʲ ː e r x] * ввяжу: [v ʲ ː ɪ ʐ u] * введя: [v ʲ ː ɪ d ʲ a] Occurrences: 188 Examples: * ввода: [v ː o d ̪ ə] * вводи: [v ː ɐ d ʲ i] * авва: [a v ː ə] * вводя: [v ː ɐ d ʲ a] |
Occurrences: 2,991 Examples: * шхера: [ʂ ç e r ə] * ахи: [a ç ɪ] * тихим: [t ʲ ɪ ç i m] * блохе: [b ɫ ɐ ç e] |
|||||
Approximant |
Occurrences: 175,175 Examples: * боец: [b ɐ j e t ̪ s ̪] * згрою: [z ̪ ɡ r o j ʊ] * семьи: [s ʲ e m ʲ j ɪ] * думой: [d ̪ u m ə j] Occurrences: 89 Examples: * майя: [m a j ː ə] * райях: [r a j ː ə x] * майях: [m a j ː ə x] * вайю: [v a j ː ʊ] |
||||||
Trill |
Occurrences: 152,094 Examples: * жерла: [ʐ ɛ r ɫ ə] * горе: [ɡ o r ʲ e] * харчо: [x ɐ r t ɕ ɵ] * шурья: [ʂ ʊ r ʲ j a] Occurrences: 76,338 Examples: * крепя: [c r ʲ ɪ p ʲ a] * ширят: [ʂ ɨ r ʲ ə t ̪] * узрят: [u z ̪ r ʲ ə t ̪] * твари: [t ̪ v a r ʲ ɪ] Occurrences: 294 Examples: * фурря: [f u r ʲ ː ə] * гарри: [ɡ a r ʲ ː ɪ] * мирре: [m ʲ i r ʲ ː e] * сорри: [s ̪ o r ʲ ː ɪ] Occurrences: 359 Examples: * тррах: [t ̪ r ː ə x] * герру: [ɟ e r ː ʊ] * царра: [t ̪ s ̪ a r ː ə] * мирры: [m ʲ i r ː ɨ] |
||||||
Lateral |
Occurrences: 84,619 Examples: * масла: [m a s ̪ ɫ ə] * лавою: [ɫ a v ə j ʊ] * плату: [p ɫ a t ̪ ʊ] * валко: [v a ɫ k ə] Occurrences: 674 Examples: * баллу: [b a ɫ ː ʊ] * холла: [x o ɫ ː ə] * алло: [ɐ ɫ ː o] * балла: [b a ɫ ː ə] |
Occurrences: 89,260 Examples: * далле: [d ̪ a ʎ ː e] * почли: [p ɐ t ɕ ʎ i] * хвале: [x v ɐ ʎ e] * таили: [t ̪ ɐ i ʎ ɪ] Occurrences: 1,269 Examples: * аллею: [ɐ ʎ ː e j ʊ] * ралли: [r a ʎ ː ɪ] * аллеи: [ɐ ʎ ː e ɪ] * келли: [c ɪ ʎ ː ɪ] |
Vowels#
Vowel symbols on the left are unrounded and those on the right are rounded.
 | Front | Near-Front | Central | Near-Back | Back |
---|---|---|---|---|---|
Close |
Occurrences: 77,764 Examples: * санин: [s ̪ ɐ ɲ i n ̪] * физик: [f ʲ i z ʲ ɪ k] * крени: [c r ʲ ɪ ɲ i] * лубки: [ɫ ʊ p c i] |
Occurrences: 128,753 Examples: * асуры: [a s ̪ ʊ r ɨ] * спецы: [s ̪ p ʲ ɪ t ̪ s ̪ ɨ] * ширак: [ʂ ɨ r a k] * цыпы: [t ̪ s ̪ ɨ p ɨ] Occurrences: 12,332 Examples: * юге: [j ʉ ɟ e] * полюя: [p ɐ ʎ ʉ j ə] * ощупь: [o ɕ ː ʉ p ʲ] * чучой: [t ɕ ʉ t ɕ ɵ j] |
Occurrences: 35,960 Examples: * языку: [j ɪ z ̪ ɨ k u] * пожую: [p ə ʐ ʊ j u] * щипну: [ɕ ː ɪ p n ̪ u] * разуй: [r ɐ z ̪ u j] |
||
Occurrences: 391,416 Examples: * пчеле: [p t ɕ ɪ ʎ e] * идешь: [ɪ d ʲ ɪ ʂ] * частя: [t ɕ ɪ s ʲ t ʲ a] * чипку: [t ɕ ɪ p k u] |
Occurrences: 105,500 Examples: * путчу: [p u t ɕ ː ʊ] * стону: [s ̪ t ̪ o n ̪ ʊ] * неуча: [ɲ e ʊ t ɕ ə] * гидру: [ɟ i d ̪ r ʊ] |
||||
Close-Mid |
Occurrences: 89,942 Examples: * верши: [v ʲ e r ʂ ɨ] * метео: [m ʲ e t ʲ ɪ o] * отве: [ɐ t ̪ v ʲ e] * тапке: [t ̪ a p c e] |
Occurrences: 14,754 Examples: * учёте: [ʊ t ɕ ɵ t ʲ e] * йодах: [j ɵ d ̪ ə x] * куёт: [k ʊ j ɵ t ̪] * учёб: [ʊ t ɕ ɵ p] |
Occurrences: 62,139 Examples: * мор: [m o r] * троп: [t ̪ r o p] * душою: [d ̪ ʊ ʂ o j ʊ] * уклон: [ʊ k ɫ o n ̪] |
||
Occurrences: 346,275 Examples: * жеста: [ʐ ɛ s ̪ t ̪ ə] * паяя: [p ɐ j æ j ə] * тона: [t ̪ o n ̪ ə] * джута: [d ʐ ː u t ̪ ə] |
|||||
Open-Mid |
Occurrences: 10,020 Examples: * арес: [ɐ r ɛ s ̪] * шесть: [ʂ ɛ s ʲ t ʲ] * сэйди: [s ̪ ɛ j d ʲ i] * джека: [d ʐ ː ɛ k ə] |
||||
Occurrences: 18,704 Examples: * ябед: [j æ b ʲ ɪ t ̪] * вещая: [v ʲ ɪ ɕ ː æ j ə] * ятю: [j æ t ʲ ʊ] * ваяем: [v ɐ j æ j ɪ m] |
Occurrences: 217,564 Examples: * кроте: [k r ɐ t ʲ e] * брони: [b r ɐ ɲ i] * осоку: [ɐ s ̪ o k ʊ] * омыло: [ɐ m ɨ ɫ ə] |
||||
Open |
Occurrences: 141,863 Examples: * афтам: [a f t ̪ ə m] * ларях: [ɫ a r ʲ ə x] * шарже: [ʂ a r ʐ ɨ] * раца: [r ɐ t ̪ s ̪ a] |