Russian MFA dictionary v2.0.0#
@techreport{mfa_russian_mfa_dictionary_2022,
author={McAuliffe, Michael and Sonderegger, Morgan},
title={Russian MFA dictionary v2.0.0},
address={\url{https://mfa-models.readthedocs.io/pronunciation dictionary/Russian/Russian MFA dictionary v2_0_0.html}},
year={2022},
month={Mar},
}
Installation#
Install from the MFA command line:
mfa model download dictionary russian_mfa
Or download from the release page.
The dictionary available from the release page and through command line installation has pronunciation and silence probabilities estimated as part of acoustic model training (see Silence probability format and training pronunciation probabilities for more information). If you would like to use a version of this dictionary without probabilities, please see the plain dictionary.
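As a rough illustration of how the two versions differ, the sketch below separates a dictionary line into word, numeric columns, and phones. It assumes whitespace-separated lines in which any probability columns are numeric and sit between the word and the phones. The function name `split_entry` is hypothetical and the probability values in the sample line are made up.

```python
def split_entry(line):
    """Split a dictionary line into (word, probabilities, phones).

    Assumption: fields are whitespace-separated, the word comes first,
    and any probability columns are numeric tokens placed before the phones.
    """
    tokens = line.split()
    word, rest = tokens[0], tokens[1:]
    probs = []
    while rest:
        try:
            probs.append(float(rest[0]))
        except ValueError:
            break  # first non-numeric token: the phones start here
        rest = rest[1:]
    return word, probs, rest


# Made-up probability values; the pronunciation is one listed in the charts on this page.
with_probs = "роба\t0.99\t0.10\t1.0\t1.0\tr o b ə"
plain = "роба\tr o b ə"

print(split_entry(with_probs))
print(split_entry(plain))
```

Dropping the numeric columns and keeping only the word and phones gives a line in the plain style.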
Intended use#
This dictionary is intended for forced alignment of Russian transcripts.
This dictionary uses the MFA phone set for Russian, and was used in training the Russian MFA acoustic model. Pronunciations can be added on top of the dictionary, as long as no additional phones are introduced.
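One way to check that added pronunciations stay within the existing phone set is sketched below. It is a minimal example, not part of MFA: the helper names are hypothetical, the "existing" entries are a tiny stand-in taken from the charts on this page, and the two added words and their pronunciations are invented for illustration.

```python
def phone_inventory(entries):
    """Collect every phone used in a mapping of word -> list of pronunciations."""
    return {
        phone
        for prons in entries.values()
        for pron in prons
        for phone in pron.split()
    }


def unknown_phones(new_entries, inventory):
    """Return, per word, the phones that are not in the existing inventory."""
    problems = {}
    for word, prons in new_entries.items():
        extra = {p for pron in prons for p in pron.split()} - inventory
        if extra:
            problems[word] = sorted(extra)
    return problems


# Tiny stand-in for the dictionary, using entries shown in the charts on this page.
existing = {
    "роба": ["r o b ə"],
    "речь": ["rʲ e tɕ"],
    "голь": ["ɡ o ʎ"],
}
inventory = phone_inventory(existing)

# Hypothetical additions: the first reuses known phones, the second introduces "ɣ".
additions = {
    "боль": ["b o ʎ"],
    "бога": ["b o ɣ ə"],
}
print(unknown_phones(additions, inventory))  # {'бога': ['ɣ']}
```

In practice the inventory would be built from the full dictionary file rather than from a hand-written mapping.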
Performance Factors#
When trying to improve alignment accuracy, adding pronunciations is generally helpful, especially for different styles and dialects. The most impactful improvements generally come from adding reduced variants that involve the segment or syllable deletions common in spontaneous speech. Alignment must include all phones specified in the pronunciation of a word, and each phone has a minimum duration (10 ms by default). If a speaker pronounces a multisyllabic word as a single syllable, MFA can struggle to fit all the segments in, which leads to alignment errors on adjacent words as well.
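The arithmetic behind this constraint can be sketched as follows. The 10 ms minimum comes from the text above; the reduced variant and the 50 ms spoken duration are hypothetical values chosen only to show the effect, and the function names are not part of MFA.

```python
MIN_PHONE_MS = 10  # default minimum phone duration mentioned above


def min_duration_ms(pronunciation, min_phone_ms=MIN_PHONE_MS):
    """Shortest interval, in milliseconds, that can hold every phone."""
    return len(pronunciation.split()) * min_phone_ms


def fits(pronunciation, spoken_ms, min_phone_ms=MIN_PHONE_MS):
    """True if all phones fit within the spoken interval."""
    return min_duration_ms(pronunciation, min_phone_ms) <= spoken_ms


full = "tʲ ɵ ɕː ɪ j ʊ"   # тёщею, as listed in the charts on this page
reduced = "tʲ ɵ ɕː ʊ"    # hypothetical reduced variant, for illustration only
spoken_ms = 50           # hypothetical duration of a heavily reduced token

print(min_duration_ms(full), fits(full, spoken_ms))        # 60 False
print(min_duration_ms(reduced), fits(reduced, spoken_ms))  # 40 True
```

With only the full form available, the six phones need at least 60 ms, so the word has to spill into neighbouring words. A shorter variant gives the aligner a pronunciation that fits.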
Ethical considerations#
Deploying any Speech-to-Text model into any production setting has ethical implications. You should consider these implications before use.
Demographic Bias#
You should assume every machine learning model has demographic bias unless proven otherwise. For pronunciation dictionaries, transcription accuracy and lexicon coverage are often better for the prestige variety modeled in the dictionary than for other variants. If you are using this dictionary in production, you should acknowledge this as a potential issue.
IPA Charts#
Consonants#
Within each cell, unvoiced obstruents are listed before their voiced counterparts.
Manner |
Labial |
Labiodental |
Dental |
Alveolar |
Retroflex |
Palatal |
Velar |
---|---|---|---|---|---|---|---|
Nasal |
m (95,612 occurrences): разим [r ɐ zʲ i m], режим [rʲ ɪ ʐ ɨ m], карм [k a r m], фэнам [f ɛ n̪ ə m]; mʲ (45,754 occurrences): мике [mʲ i c e], змеей [z̪ mʲ e j ɪ j], мешая [mʲ ɪ ʂ a j ə], жмёте [ʐ mʲ ɵ tʲ e]; mʲː (391 occurrences): гамме [ɡ a mʲː e], сумме [s̪ u mʲː e], комми [k ɐ mʲː ɪ], лемме [ʎ e mʲː e]; mː (593 occurrences): гамму [ɡ a mː ʊ], лемму [ʎ e mː ʊ], леммы [ʎ e mː ɨ], амман [ɐ mː a n̪] |
n̪ (97,046 occurrences): янус [j a n̪ ʊ s̪], пуншу [p u n̪ ʂ ʊ], ныне [n̪ ɨ ɲ e], урну [u r n̪ ʊ]; n̪ː (9,498 occurrences): инны [ɪ n̪ː ɨ], сонны [s̪ o n̪ː ɨ], танны [t̪ a n̪ː ɨ], ценно [t̪s̪ ɛ n̪ː ə] |
ɲ (72,787 occurrences): линь [ʎ i ɲ], нюхаю [ɲ u x ə j ʊ], ныне [n̪ ɨ ɲ e], вняв [v ɲ a f]; ɲː (1,201 occurrences): ванне [v a ɲː e], бонне [b o ɲː e], анне [ɐ ɲː e], донне [d̪ o ɲː e] |
||||
Stop |
p (109,088 occurrences): плаха [p ɫ a x ə], пуншу [p u n̪ ʂ ʊ], купою [k u p ə j ʊ], пепле [pʲ e p ʎ e]; pː (368 occurrences): каппу [k a pː ʊ], каппа [k a pː ə], каппы [k a pː ɨ]; b (41,438 occurrences): роба [r o b ə], абазе [ɐ b a zʲ e], буки [b u c ɪ], болят [b ɐ ʎ a t̪]; bː (111 occurrences): аббою [ɐ bː o j ʊ], абба [ɐ bː a], аббат [ɐ bː a t̪], аббу [ɐ bː u] |
t̪ (140,480 occurrences): отрыл [ɐ t̪ r ɨ ɫ], болят [b ɐ ʎ a t̪], сдаёт [z̪ d̪ ɐ j ɵ t̪], стажу [s̪ t̪ a ʐ ʊ]; t̪ː (668 occurrences): нетто [n̪ ɛ t̪ː ə], чемто [tɕ ɪ mʲ t̪ː ə], будто [b u t̪ː ə], ротта [r o t̪ː ə]; d̪ (50,506 occurrences): сдаёт [z̪ d̪ ɐ j ɵ t̪], шкода [ʂ k o d̪ ə], вдаль [v d̪ a ʎ], следу [sʲ ʎ e d̪ ʊ]; d̪ː (608 occurrences): отдуй [ɐ d̪ː u j], отдам [ɐ d̪ː a m], отдув [ɐ d̪ː u f], отдых [o d̪ː ɨ x] |
c (29,678 occurrences): мике [mʲ i c e], буки [b u c ɪ], жидке [ʐ ɨ t̪ c e], кляпы [c ʎ a p ɨ]; cː (158 occurrences): микки [mʲ i cː ɪ], сукки [s̪ u cː ɪ], аккре [ɐ cː rʲ e], сукке [s̪ u cː e]; ɟ (13,563 occurrences): гири [ɟ i rʲ ɪ], гнёшь [ɟ ɲ ɵ ʂ], грим [ɟ rʲ i m], гипсы [ɟ i p s̪ ɨ]; ɟː (66 occurrences): аггея [a ɟː ɪ j ə], аггею [a ɟː ɪ j ʊ], регги [r ɛ ɟː ɪ], аггел [a ɟː ɪ ɫ] |
k (100,640 occurrences): купою [k u p ə j ʊ], печка [pʲ e tɕ k ə], крылу [k r ɨ ɫ u], кущею [k u ɕː ɪ j ʊ]; kː (358 occurrences): никко [ɲ ɪ kː o], мекка [mʲ e kː ə], дакка [d̪ a kː ə], аккра [a kː r ə]; ɡ (36,509 occurrences): углы [ʊ ɡ ɫ ɨ], гасну [ɡ a s̪ n̪ ʊ], голь [ɡ o ʎ], груд [ɡ r u t̪]; [symbol missing] (23 occurrences): no examples listed |
|||
Affricate |
t̪s̪ (21,477 occurrences): царил [t̪s̪ ɐ rʲ i ɫ], рацея [r ɐ t̪s̪ ɛ j ə], церий [t̪s̪ ɛ rʲ ɪ j], немца [ɲ e m t̪s̪ ə]; t̪s̪ː (986 occurrences): отцу [ɐ t̪s̪ː u], ниццу [ɲ i t̪s̪ː ʊ], ситца [sʲ i t̪s̪ː ə], отцов [ɐ t̪s̪ː o f]; [symbol missing] (52 occurrences): no examples listed; d̪z̪ː (9 occurrences): янцзы [j a n̪ d̪z̪ː ɨ], цзы [d̪z̪ː ɨ] |
tɕ (39,060 occurrences): речь [rʲ e tɕ], рачил [r ɐ tɕ i ɫ], печка [pʲ e tɕ k ə], чекой [tɕ ɪ k o j]; tɕː (1,290 occurrences): путчи [p u tɕː ɪ], четче [tɕ e tɕː e], отчет [ɐ tɕː ɪ t̪], матчи [m a tɕː ɪ] |
|||||
Sibilant |
s̪ (114,322 occurrences): янус [j a n̪ ʊ s̪], соусу [s̪ o ʊ s̪ ʊ], гасну [ɡ a s̪ n̪ ʊ], стажу [s̪ t̪ a ʐ ʊ]; s̪ː (6,451 occurrences): ссать [s̪ː a tʲ], ссуди [s̪ː ʊ dʲ i], лассо [ɫ ɐ s̪ː o], ссут [s̪ː u t̪]; z̪ (62,391 occurrences): разор [r ə z̪ ə r], сдаёт [z̪ d̪ ɐ j ɵ t̪], мызой [m ɨ z̪ ə j], змеей [z̪ mʲ e j ɪ j]; z̪ː (198 occurrences): изза [ɪ ə z̪ː ə], сзади [z̪ː a dʲ ɪ] |
sʲ (75,705 occurrences): власе [v ɫ a sʲ e], следе [sʲ ʎ e dʲ e], иксе [i c sʲ e], следу [sʲ ʎ e d̪ ʊ]; sʲː (2,051 occurrences): вёзся [vʲ ɵ sʲː ə], массе [m a sʲː e], муссе [m u sʲː e], мессе [mʲ e sʲː e]; zʲ (13,001 occurrences): разим [r ɐ zʲ i m], абазе [ɐ b a zʲ e], стезе [sʲ tʲ ɪ zʲ e], сузив [s̪ u zʲ ɪ f]; zʲː (11 occurrences): оззи [o zʲː ɪ] |
ʂ (60,057 occurrences): шипев [ʂ ɨ pʲ e f], бьёшь [bʲ j ɵ ʂ], пуншу [p u n̪ ʂ ʊ], векш [vʲ e k ʂ]; ʂː (727 occurrences): сшила [ʂː ɨ ɫ ə], сшиб [ʂː ɨ p], сшил [ʂː ɨ ɫ], сшиби [ʂː ɨ bʲ i]; ʐ (25,733 occurrences): режим [rʲ ɪ ʐ ɨ m], стажу [s̪ t̪ a ʐ ʊ], жидке [ʐ ɨ t̪ c e], нежа [ɲ e ʐ ə]; ʐː (1,030 occurrences): езжав [j ɪ ʐː a f], визжа [vʲ ɪ ʐː a], вожжу [v ɐ ʐː u], позже [p o ʐː ɨ] |
ɕ (691 occurrences): винищ [vʲ ɪ ɲ i ɕ], мощь [m o ɕ], вещ [vʲ e ɕ], дрищ [d̪ rʲ i ɕ]; ɕː (19,735 occurrences): тёщею [tʲ ɵ ɕː ɪ j ʊ], кущею [k u ɕː ɪ j ʊ], соищу [s̪ ə ɪ ɕː u], вещий [vʲ e ɕː ɪ j]; ʑː (732 occurrences): езжав [j ɪ ʑː a f], дожде [d̪ ɐ ʑː e], визжа [vʲ ɪ ʑː a], вожжу [v ɐ ʑː u] |
|||
Fricative |
f (57,073 occurrences): шипев [ʂ ɨ pʲ e f], фэнам [f ɛ n̪ ə m], кофре [k o f rʲ e], вняв [v ɲ a f]; fʲ (5,969 occurrences): впишу [fʲ pʲ ɪ ʂ u], ферзи [fʲ ɪ r zʲ i], софит [s̪ ɐ fʲ i t̪], бровь [b r o fʲ]; [symbol missing] (196 occurrences): no examples listed; fː (38 occurrences): хоффа [x ɐ fː ə], куффу [k u fː ʊ], куффы [k u fː ɨ], хоффу [x ɐ fː ʊ]; v (116,962 occurrences): оравы [ɐ r a v ɨ], власе [v ɫ a sʲ e], вуях [v u j ə x], вняв [v ɲ a f]; vʲ (33,769 occurrences): векш [vʲ e k ʂ], вешая [vʲ e ʂ ə j ə], ловим [ɫ o vʲ ɪ m], верны [vʲ e r n̪ ɨ]; vʲː (273 occurrences): введи [vʲː ɪ dʲ i], введя [vʲː ɪ dʲ a], ввяз [vʲː a s̪], ввёз [vʲː ɵ s̪]; vː (188 occurrences): аввам [a vː ə m], ввоз [vː o s̪], ввалю [vː ɐ ʎ u], аввою [a vː ə j ʊ] |
ç (2,991 occurrences): дыхи [d̪ ɨ ç ɪ], хитон [ç ɪ t̪ o n̪], бэхи [b ɛ ç ɪ], махин [m ɐ ç i n̪] |
|||||
Approximant |
j (175,175 occurrences): янус [j a n̪ ʊ s̪], тёщею [tʲ ɵ ɕː ɪ j ʊ], бьёшь [bʲ j ɵ ʂ], купою [k u p ə j ʊ]; jː (89 occurrences): райю [r a jː ʊ], райям [r a jː ə m], вайю [v a jː ʊ], майею [m a jː ɪ j ʊ] |
||||||
Trill |
r (152,094 occurrences): разим [r ɐ zʲ i m], роба [r o b ə], отрыл [ɐ t̪ r ɨ ɫ], рачил [r ɐ tɕ i ɫ]; rʲ (76,338 occurrences): речь [rʲ e tɕ], гири [ɟ i rʲ ɪ], режим [rʲ ɪ ʐ ɨ m], пырей [p ɨ rʲ e j]; rʲː (294 occurrences): герре [ɟ e rʲː e], мирре [mʲ i rʲː e], перри [pʲ ɪ rʲː i], гарри [ɡ a rʲː ɪ]; rː (359 occurrences): тррах [t̪ rː ə x], герра [ɟ e rː ə], терра [tʲ ɪ rː ə], эрроу [ɪ rː o ʊ] |
||||||
Lateral |
ɫ (84,619 occurrences): плаха [p ɫ a x ə], отрыл [ɐ t̪ r ɨ ɫ], рачил [r ɐ tɕ i ɫ], крылу [k r ɨ ɫ u]; ɫː (674 occurrences): аллах [ɐ ɫː a x], виллы [vʲ i ɫː ɨ], галлы [ɡ a ɫː ɨ], галла [ɡ a ɫː ə] |
ʎ (89,260 occurrences): линь [ʎ i ɲ], пепле [pʲ e p ʎ e], нолю [n̪ ɐ ʎ u], болят [b ɐ ʎ a t̪]; ʎː (1,269 occurrences): нелли [n̪ ɨ ʎː ɪ], колли [k o ʎː ɪ], ролле [r o ʎː e], булле [b u ʎː e] |
Vowels#
Within each cell, unrounded vowels are listed before rounded ones.
Front |
Near-Front |
Central |
Near-Back |
Back |
|
---|---|---|---|---|---|
Close |
i (77,764 occurrences): разим [r ɐ zʲ i m], гири [ɟ i rʲ ɪ], линь [ʎ i ɲ], рачил [r ɐ tɕ i ɫ] |
ɨ (128,753 occurrences): шипев [ʂ ɨ pʲ e f], отрыл [ɐ t̪ r ɨ ɫ], режим [rʲ ɪ ʐ ɨ m], ныне [n̪ ɨ ɲ e]; ʉ (12,332 occurrences): кочую [k ɐ tɕ ʉ j ʊ], юлил [j ʉ ʎ i ɫ], бюсте [bʲ ʉ sʲ tʲ e], тюлям [tʲ ʉ ʎ ə m] |
u (35,960 occurrences): пуншу [p u n̪ ʂ ʊ], купою [k u p ə j ʊ], нюхаю [ɲ u x ə j ʊ], буки [b u c ɪ] |
||
ɪ (391,416 occurrences): гири [ɟ i rʲ ɪ], тёщею [tʲ ɵ ɕː ɪ j ʊ], режим [rʲ ɪ ʐ ɨ m], буки [b u c ɪ] |
ʊ (105,500 occurrences): янус [j a n̪ ʊ s̪], тёщею [tʲ ɵ ɕː ɪ j ʊ], пуншу [p u n̪ ʂ ʊ], соусу [s̪ o ʊ s̪ ʊ] |
||||
Close-Mid |
e (89,942 occurrences): речь [rʲ e tɕ], шипев [ʂ ɨ pʲ e f], печка [pʲ e tɕ k ə], абазе [ɐ b a zʲ e] |
ɵ (14,754 occurrences): тёщею [tʲ ɵ ɕː ɪ j ʊ], бьёшь [bʲ j ɵ ʂ], сдаёт [z̪ d̪ ɐ j ɵ t̪], гнёшь [ɟ ɲ ɵ ʂ] |
o (62,139 occurrences): роба [r o b ə], соусу [s̪ o ʊ s̪ ʊ], робя [r o bʲ ə], анон [ɐ n̪ o n̪] |
||
ə (346,275 occurrences): плаха [p ɫ a x ə], роба [r o b ə], купою [k u p ə j ʊ], нюхаю [ɲ u x ə j ʊ] |
|||||
Open-Mid |
ɛ (10,020 occurrences): фэнам [f ɛ n̪ ə m], поэте [p o ɛ tʲ e], мэрия [m ɛ rʲ ɪ j ə], рацея [r ɐ t̪s̪ ɛ j ə] |
||||
æ (18,704 occurrences): ряби [rʲ æ bʲ ɪ], явит [j æ vʲ ɪ t̪], баяне [b ɐ j æ ɲ e], снясь [sʲ ɲ æ sʲ] |
ɐ (217,564 occurrences): разим [r ɐ zʲ i m], отрыл [ɐ t̪ r ɨ ɫ], рачил [r ɐ tɕ i ɫ], абазе [ɐ b a zʲ e] |
||||
Open |
a (141,863 occurrences): янус [j a n̪ ʊ s̪], плаха [p ɫ a x ə], абазе [ɐ b a zʲ e], карм [k a r m] |
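Per-phone occurrence counts and example words like those in the charts can be tallied from a list of dictionary entries. The sketch below is one possible way to do so, assuming that every phone token in every pronunciation counts as one occurrence. Whether the figures on this page were computed this way is not stated, and the function name is hypothetical.

```python
from collections import Counter, defaultdict


def phone_statistics(entries, max_examples=4):
    """Count phone tokens and keep a few example words per phone.

    entries: iterable of (word, pronunciation) pairs, where the
    pronunciation is a string of space-separated phones.
    """
    counts = Counter()
    examples = defaultdict(list)
    for word, pronunciation in entries:
        phones = pronunciation.split()
        counts.update(phones)  # every token counts as one occurrence
        for phone in set(phones):  # each word is listed once per phone
            if len(examples[phone]) < max_examples:
                examples[phone].append(word)
    return counts, examples


# A handful of entries taken from the charts on this page.
entries = [
    ("роба", "r o b ə"),
    ("соусу", "s̪ o ʊ s̪ ʊ"),
    ("анон", "ɐ n̪ o n̪"),
    ("робя", "r o bʲ ə"),
]
counts, examples = phone_statistics(entries)
print(counts["o"], examples["o"])  # 4 ['роба', 'соусу', 'анон', 'робя']
```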