Belarusian CV dictionary v2.0.0#
@misc{Ahn_Chodroff_2022,
author={Ahn, Emily and Chodroff, Eleanor},
title={VoxCommunis Corpus},
address={\url{https://osf.io/t957v}},
publisher={OSF},
year={2022},
month={Jan}
}
Installation#
Install from the MFA command line:
mfa model download dictionary belarusian_cv
Or download from the release page.
Intended use#
This dictionary is intended for forced alignment of Belarusian transcripts.
This dictionary uses the XPF phone set for Belarusian, and was used in training the Belarusian XPF acoustic model. Pronunciations can be added on top of the dictionary, as long as no additional phones are introduced.
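Before adding pronunciations, it can help to confirm that they introduce no new phones. The sketch below is one way to do that, not part of MFA itself. It assumes each dictionary line is a word, a tab, and space-separated phones; the helper names `parse_line` and `find_new_phones` are hypothetical, and the entry using `ɑ` is deliberately invalid for illustration.

```python
def parse_line(line):
    """Split one dictionary line into (word, [phones]).

    Assumes the layout 'word<TAB>phone phone ...'; adjust if your
    dictionary file carries extra columns.
    """
    word, pronunciation = line.rstrip("\n").split("\t", 1)
    return word, pronunciation.split()


def find_new_phones(base_entries, added_entries):
    """Map each added word to the phones it uses that the base dictionary lacks."""
    known = {phone for _, phones in base_entries for phone in phones}
    problems = {}
    for word, phones in added_entries:
        unknown = [phone for phone in phones if phone not in known]
        if unknown:
            problems[word] = unknown
    return problems


# A tiny stand-in for the real dictionary (entries taken from the charts on this page).
base = [parse_line(l) for l in ["твор\tt v o r", "база\tb a z a", "жах\tʒ a x"]]

# Hypothetical additions: the second one uses 'ɑ', which the base set does not contain.
added = [parse_line(l) for l in ["бор\tb o r", "бар\tb ɑ r"]]

problems = find_new_phones(base, added)
print(problems)
```

Any word reported by `find_new_phones` should be re-transcribed with phones that the dictionary already uses before it is added.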
Performance Factors#
When trying to get better alignment accuracy, adding pronunciations is generally helpful, especially for different styles and dialects. The most impactful improvements will generally be seen when adding reduced variants that involve deleting segments/syllables common in spontaneous speech. Alignment must include all phones specified in the pronunciation of a word, and each phone has a minimum duration (by default 10ms). If a speaker pronounces a multisyllabic word with just a single syllable, it can be hard for MFA to fit all the segments in, so it will lead to alignment errors on adjacent words as well.
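The effect of the minimum phone duration can be worked through with simple arithmetic: a pronunciation with N phones cannot occupy less than N times the minimum duration. The sketch below assumes the 10 ms default mentioned above; the function names, the 50 ms interval, and the shortened variant are hypothetical illustrations, not attested Belarusian reductions or MFA internals.

```python
MIN_PHONE_DURATION_MS = 10  # default minimum duration per phone, as stated above


def min_word_duration_ms(phones, min_phone_ms=MIN_PHONE_DURATION_MS):
    """Shortest stretch of audio that can hold every phone of a pronunciation."""
    return len(phones) * min_phone_ms


def fits(phones, spoken_ms, min_phone_ms=MIN_PHONE_DURATION_MS):
    """True if the pronunciation can be aligned within the spoken duration."""
    return min_word_duration_ms(phones, min_phone_ms) <= spoken_ms


full = "nʲ ɛ tʃ u j u".split()   # dictionary pronunciation of нечую (6 phones)
reduced = "nʲ ɛ tʃ u".split()    # hypothetical reduced variant (4 phones)

spoken_ms = 50  # hypothetical duration the speaker actually used for the word

print(min_word_duration_ms(full), fits(full, spoken_ms))
print(min_word_duration_ms(reduced), fits(reduced, spoken_ms))
```

When only the full form is available and it does not fit, the aligner has to take time from neighbouring words, which is why a reduced variant in the dictionary can improve the alignment of the surrounding words too.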
Ethical considerations#
Deploying any Speech-to-Text model into any production setting has ethical implications. You should consider these implications before use.
Demographic Bias#
You should assume every machine learning model has demographic bias unless proven otherwise. For pronunciation dictionaries, transcription accuracy and lexicon coverage are often better for the prestige variety modeled in the dictionary than for other varieties. If you are using this dictionary in production, you should acknowledge this as a potential issue.
IPA Charts#
Consonants#
Within each manner and place of articulation, unvoiced obstruents are listed before their voiced counterparts.
Manner | Place | Phone | Occurrences | Examples |
---|---|---|---|---|
Nasal | Labial | m | 21,952 | мачой [m a tʃ o j], цэпам [ts ɛ p a m], усяму [u sʲ a m u], комам [k o m a m] |
Nasal | Labial | mʲ | 12,176 | мёдзе [mʲ o dzʲ ɛ], найме [n a j mʲ ɛ], схеме [s xʲ ɛ mʲ ɛ], смеў [s mʲ ɛ u] |
Nasal | Alveolar | n | 45,504 | януш [j a n u ʃ], валун [v a l u n], браны [b r a n i], родны [r o d n i] |
Nasal | Alveolar | nʲ | 14,086 | нерв [nʲ ɛ r v], нечую [nʲ ɛ tʃ u j u], скіне [s kʲ i nʲ ɛ], здані [z d a nʲ i] |
Nasal | Alveolar | nʲː | 4,253 | сёння [sʲ o nʲː a], рання [r a nʲː a], панне [p a nʲː ɛ], ранні [r a nʲː i] |
Nasal | Alveolar | nː | 1,253 | вінны [vʲ i nː i], анна [a nː a], іонны [j i o nː i], ленны [lʲ ɛ nː i] |
Stop | Labial | p | 27,004 | пузе [p u zʲ ɛ], цэпам [ts ɛ p a m], топкі [t o p kʲ i], паэма [p a ɛ m a] |
Stop | Labial | b | 11,154 | кабат [k a b a t], бушуе [b u ʃ u j ɛ], браны [b r a n i], база [b a z a] |
Stop | Alveolar | t | 33,269 | твор [t v o r], кабат [k a b a t], топкі [t o p kʲ i], узлёт [u z lʲ o t] |
Stop | Alveolar | | 7 | |
Stop | Alveolar | d | 21,655 | адрыў [a d r i u], хард [x a r d], жудца [ʒ u d ts a], удалы [u d a l i] |
Stop | Velar | k | 32,671 | скуры [s k u r i], акопе [a k o pʲ ɛ], акруг [a k r u ɣ], кабат [k a b a t] |
Affricate | Alveolar | ts | 8,260 | цэхі [ts ɛ xʲ i], цэпам [ts ɛ p a m], жудца [ʒ u d ts a], перца [pʲ ɛ r ts a] |
Affricate | Alveolar | tsː | 3,252 | біцца [bʲ i tsː a], уецца [u j ɛ tsː a], мыцца [m i tsː a], мецца [mʲ ɛ tsː a] |
Affricate | Alveolar | dz | 148 | дзвюх [dz vʲ u x], дзэта [dz ɛ t a], дзвюм [dz vʲ u m], дзонг [dz o n ɣ] |
Affricate | Alveopalatal | tʃ | 15,059 | гучэў [ɣ u tʃ ɛ u], мачой [m a tʃ o j], пучкі [p u tʃ kʲ i], нечую [nʲ ɛ tʃ u j u] |
Affricate | Alveopalatal | tʃː | 23 | печчу [pʲ ɛ tʃː u], сучча [s u tʃː a], ноччу [n o tʃː u] |
Affricate | Alveopalatal | dʒ | 960 | джазу [dʒ a z u], джаза [dʒ a z a], аджаў [a dʒ a u], езджу [j ɛ z dʒ u] |
Sibilant | Alveolar | s | 33,484 | скуры [s k u r i], скуль [s k u lʲ], сцэн [s ts ɛ n], склаў [s k l a u] |
Sibilant | Alveolar | sʲ | 10,102 | усяму [u sʲ a m u], сякія [sʲ a kʲ i j a], сілу [sʲ i l u], сёння [sʲ o nʲː a] |
Sibilant | Alveolar | sʲː | 134 | рыссю [r i sʲː u], нёсся [nʲ o sʲː a], воссю [v o sʲː u], ссек [sʲː ɛ k] |
Sibilant | Alveolar | sː | 159 | ссаць [sː a tsʲ], ссала [sː a l a] |
Sibilant | Alveolar | z | 17,056 | джазу [dʒ a z u], зорах [z o r a x], база [b a z a], узлёт [u z lʲ o t] |
Sibilant | Alveolar | zʲ | 1,827 | пузе [p u zʲ ɛ], язі [j a zʲ i], вазіў [v a zʲ i u], зірне [zʲ i r nʲ ɛ] |
Sibilant | Alveolar | zʲː | 28 | ззяла [zʲː a l a], ззяе [zʲː a j ɛ], маззю [m a zʲː u], ззялі [zʲː a lʲ i] |
Sibilant | Alveolar | zː | 23 | ззаду [zː a d u] |
Sibilant | Alveopalatal | ʃ | 9,236 | януш [j a n u ʃ], шафаю [ʃ a f a j u], бушуе [b u ʃ u j ɛ], нашая [n a ʃ a j a] |
Sibilant | Alveopalatal | ʃː | 20 | пешшу [pʲ ɛ ʃː u] |
Sibilant | Alveopalatal | ʒ | 5,349 | жудца [ʒ u d ts a], жар [ʒ a r], жах [ʒ a x], біржа [bʲ i r ʒ a] |
Sibilant | Alveopalatal | ʒː | 46 | ружжо [r u ʒː o] |
Fricative | Labiodental | f | 2,407 | шафаю [ʃ a f a j u], халіф [x a lʲ i f], футу [f u t u], кофты [k o f t i] |
Fricative | Labiodental | fʲ | 1,626 | фюрэр [fʲ u r ɛ r], ферме [fʲ ɛ r mʲ ɛ], філіп [fʲ i lʲ i p], шафёр [ʃ a fʲ o r] |
Fricative | Labiodental | v | 26,867 | твор [t v o r], валун [v a l u n], нерв [nʲ ɛ r v], гвалт [ɣ v a l t] |
Fricative | Labiodental | vʲ | 7,717 | вінны [vʲ i nː i], верыш [vʲ ɛ r i ʃ], віле [vʲ i lʲ ɛ], веек [vʲ ɛ j ɛ k] |
Approximant | Palatal | j | 40,798 | мачой [m a tʃ o j], януш [j a n u ʃ], шафаю [ʃ a f a j u], бушуе [b u ʃ u j ɛ] |
Trill | Alveolar | r | 53,725 | твор [t v o r], скуры [s k u r i], адрыў [a d r i u], хард [x a r d] |
Lateral | Alveolar | l | 17,188 | валун [v a l u n], удалы [u d a l i], лодка [l o d k a], плугі [p l u ɣʲ i] |
Lateral | Alveolar | lʲ | 26,097 | скуль [s k u lʲ], галіў [ɣ a lʲ i u], шляхі [ʃ lʲ a xʲ i], узлёт [u z lʲ o t] |
Lateral | Alveolar | lʲː | 149 | галлі [ɣ a lʲː i], соллю [s o lʲː u], зелля [zʲ ɛ lʲː a], голле [ɣ o lʲː ɛ] |
Vowels#
Within each vowel height, unrounded vowels are listed before rounded vowels.
Height | Phone | Occurrences | Examples |
---|---|---|---|
Close | i | 100,163 | скуры [s k u r i], адрыў [a d r i u], цэхі [ts ɛ xʲ i], вінны [vʲ i nː i] |
Close | u | 57,198 | пузе [p u zʲ ɛ], гучэў [ɣ u tʃ ɛ u], джазу [dʒ a z u], скуры [s k u r i] |
Close-Mid | o | 26,354 | твор [t v o r], акопе [a k o pʲ ɛ], мёдзе [mʲ o dzʲ ɛ], мачой [m a tʃ o j] |
Open-Mid | ɛ | 46,580 | пузе [p u zʲ ɛ], гучэў [ɣ u tʃ ɛ u], акопе [a k o pʲ ɛ], мёдзе [mʲ o dzʲ ɛ] |
Open | a | 211,979 | джазу [dʒ a z u], акопе [a k o pʲ ɛ], адрыў [a d r i u], хард [x a r d] |