Japanese MFA dictionary v3.0.0

  • Maintainer: Montreal Forced Aligner

  • Language: Japanese

  • Dialect: N/A

  • Phone set: MFA

  • Number of words: 499,793

  • Phones: a b bʲː c d dz dzː dʑː dʲː e h i j k m mʲː n o p pʲː s t ts tsː tɕː tʲː v w z ç çː ŋ ɕ ɕː ɟ ɟː ɡ ɡː ɨ ɨː ɨ̥ ɯ ɯː ɯ̥ ɰ̃ ɲ ɲː ɴ ɴː ɸ ɸʲ ɸʲː ɸː ɾ ɾʲ ɾʲː ɾː ʑ ʔ

  • License: CC BY 4.0

  • Compatible MFA version: v3.0.0

  • Citation:

@techreport{mfa_japanese_mfa_dictionary_2024,
	author={McAuliffe, Michael and Sonderegger, Morgan},
	title={Japanese MFA dictionary v3.0.0},
	address={\url{https://mfa-models.readthedocs.io/pronunciation dictionary/Japanese/Japanese MFA dictionary v3_0_0.html}},
	year={2024},
	month={Feb},
}

Installation

Install from the MFA command line:

mfa model download dictionary japanese_mfa

Or download from the release page.

The dictionary available from the release page and from command line installation has pronunciation and silence probabilities estimated as part of acoustic model training (see Silence probability format and training pronunciation probabilities for more information). If you would like to use the version of this dictionary without probabilities, please see the [plain dictionary](https://raw.githubusercontent.com/MontrealCorpusTools/mfa-models/main/dictionary/japanese/mfa/Japanese MFA dictionary v3_0_0.dict).

Intended use

This dictionary is intended for forced alignment of Japanese transcripts.

This dictionary uses the MFA phone set for Japanese, and was used in training the Japanese MFA acoustic model. Pronunciations can be added on top of the dictionary, as long as no additional phones are introduced.
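Before adding pronunciations, it can help to check that they introduce no phones outside the dictionary's inventory. The following is a minimal sketch, not part of MFA: the helper `unknown_phones` and the candidate entries are hypothetical, and the inventory is copied from the Phones field on this page, which is assumed to be complete. If your installed dictionary contains phones not listed there, extend the set accordingly.

```python
# Phone inventory copied from the "Phones" field of this page
# (assumption: that list is complete for the installed dictionary).
MFA_JAPANESE_PHONES = set(
    "a b bʲː c d dz dzː dʑː dʲː e h i j k m mʲː n o p pʲː s t ts tsː tɕː tʲː "
    "v w z ç çː ŋ ɕ ɕː ɟ ɟː ɡ ɡː ɨ ɨː ɨ̥ ɯ ɯː ɯ̥ ɰ̃ ɲ ɲː ɴ ɴː ɸ ɸʲ ɸʲː ɸː "
    "ɾ ɾʲ ɾʲː ɾː ʑ ʔ".split()
)


def unknown_phones(pronunciation: str) -> list[str]:
    """Return the phones of a space-separated pronunciation that are
    not in the inventory above (hypothetical helper, not an MFA API)."""
    return [p for p in pronunciation.split() if p not in MFA_JAPANESE_PHONES]


# Hypothetical candidate entries: word -> space-separated phones.
candidates = {
    "かけこんで": "k a k e k o n d e",  # example taken from this page
    "テスト": "t e s θ o",              # deliberately uses a phone outside the set
}

for word, pron in candidates.items():
    bad = unknown_phones(pron)
    status = "ok" if not bad else f"unknown phones: {bad}"
    print(f"{word}\t{pron}\t{status}")
```

Entries that report unknown phones should be rewritten using phones from the existing inventory before they are appended to the dictionary.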

Performance Factors

When trying to get better alignment accuracy, adding pronunciations is generally helpful, especially for different styles and dialects. The most impactful improvements will generally be seen when adding reduced variants that involve deleting segments/syllables common in spontaneous speech. Alignment must include all phones specified in the pronunciation of a word, and each phone has a minimum duration (by default 10ms). If a speaker pronounces a multisyllabic word with just a single syllable, it can be hard for MFA to fit all the segments in, so it will lead to alignment errors on adjacent words as well.
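The minimum-duration constraint above implies a simple lower bound: a pronunciation with N phones cannot be aligned to an interval shorter than N times the minimum phone duration. The sketch below illustrates that arithmetic only. The function names are hypothetical, the 10 ms value is the default stated above, and the reduced variant and the 60 ms token length are invented for illustration rather than taken from the dictionary.

```python
MIN_PHONE_DURATION_MS = 10  # default minimum duration per phone, as stated above


def min_word_duration_ms(pronunciation: str,
                         min_phone_ms: int = MIN_PHONE_DURATION_MS) -> int:
    """Shortest interval (in ms) that can hold every phone of a
    space-separated pronunciation."""
    return len(pronunciation.split()) * min_phone_ms


def fits(pronunciation: str, spoken_ms: int) -> bool:
    """True if the pronunciation can fit inside the spoken interval."""
    return min_word_duration_ms(pronunciation) <= spoken_ms


full = "k a k e k o n d e"   # 9 phones, example from this page
reduced = "k a k o n d e"    # hypothetical reduced variant, 7 phones

spoken_ms = 60  # hypothetical duration of a heavily reduced token
for label, pron in [("full", full), ("reduced", reduced)]:
    print(label, min_word_duration_ms(pron), "ms minimum, fits:", fits(pron, spoken_ms))
```

When no listed pronunciation fits the spoken interval, the aligner has to take time from neighbouring words, which is why shorter reduced variants tend to help most on spontaneous speech.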

Ethical considerations

Deploying any Speech-to-Text model into any production setting has ethical implications. You should consider these implications before use.

Demographic Bias

You should assume every machine learning model has demographic bias unless proven otherwise. For pronunciation dictionaries, it is often the case that transcription accuracy and lexicon coverage are better for the prestige variety modeled in the dictionary than for other varieties. If you are using this dictionary in production, you should acknowledge this as a potential issue.

IPA Charts

Consonants

Within each cell, obstruent symbols on the left are unvoiced and those on the right are voiced.

Manner | Labial | Labiodental | Alveolar | Palatal | Velar | Uvular | Glottal

Nasal

Occurrences:
124,858
Examples:
* 求めたら:
[m o t o m e t a ɾ a]
* 楽しませて:
[t a n o ɕ i m a s e t e]
* まるや:
[m a ɾ ɯ j a]
* カンボジア:
[k a m b o ʑ i a]
Occurrences:
34,298
Examples:
* 中国絡み:
[ ɨː ɡ o k ɯ ɡ a ɾ a i]
* 磨きたい:
[ i ɡ a c i t a i]
* 伏見区:
[ɸ ɯ ɕ i i k ɯ]
* 妙高山:
[ k s a ɴ]
Occurrences:
766
Examples:
* 一本道:
[i o mʲː i i]
* 難民たち:
[n a mʲː i ɲ t a ]
* 近未来:
[c i mʲː i ɾ a i]
* 新民連:
[ɕ i mʲː i n ɾ e ɴ]
Occurrences:
3,663
Examples:
* 外国産米:
[ɡ a i k o k ɯ s a a i]
* 専門店:
[s e o n t e ɴ]
* 何万倍:
[n a a m b a i]
* 半面高:
[h a e ŋ k ]
Occurrences:
117,160
Examples:
* nasa:
[n a s a]
* 鳴沢村:
[n a ɾ ɯ s a w a m ɯ ɾ a]
* かけこんで:
[k a k e k o n d e]
* レバノン人:
[ɾ e b a n o ɰ̃ ɲ i ɴ]
Occurrences:
4,069
Examples:
* いろんな:
[i ɾ o a]
* 本能的:
[h o t e c i]
* 福島県内:
[ɸ ɯ k ɯ ɕ i m a k e a i]
* 天王寺区:
[t e ʑ i k ɯ]
Occurrences:
46,135
Examples:
* 二段階目:
[ɲ d a ŋ k a i m e]
* 生産量:
[s s a ɲ ɾʲ ]
* お役人:
[o j a k ɯ ɲ i ɴ]
* 三キロ:
[s a ɲ c i ɾ o]
Occurrences:
1,265
Examples:
* にんにく:
[ɲ i ɲː i k ɯ]
* 不信任:
[ɸ ɕ i ɲː i ɴ]
* 新日鉄:
[ɕ i ɲː i t e ts ɨ]
* カンニング:
[k a ɲː i ŋ ɡ ɯ]
Occurrences:
27,375
Examples:
* ドーピング:
[d i ŋ ɡ ɯ]
* 人間たち:
[ɲ i ŋ ɡ e n t a ]
* 感覚的:
[k a ŋ k a k ɯ t e c]
* 文化祭:
[b ɯ ŋ k a s a i]
Occurrences:
57,108
Examples:
* 研究員:
[k e ɲ c ɨː i ɴ]
* 歌い手さん:
[ɯ t a i t e s a ɴ]
* 鈴木君:
[s ɨ z ɨ c k ɯ ɴ]
* 未経験:
[ i k k e ɴ]
Occurrences:
9
Examples:
* うーん:
[ɴː]

Stop

Occurrences:
21,264
Examples:
* レポート:
[ɾ e p t o]
* 飼い猫ぷ:
[k a i n e k o p ɯ]
* 文法的:
[b ɯ m p t e c i]
* パロディ:
[p a ɾ o i]
Occurrences:
4,147
Examples:
* ぴったり:
[ a ɾʲ i]
* ガレスピー:
[ɡ a ɾ e s ɨ ]
* スピノザ:
[s ɨ i n o z a]
* ピント:
[ i n t o]
Occurrences:
532
Examples:
* フロッピー:
[ɸ ɯ ɾ o pʲː ]
* グッピー:
[ɡ ɯ pʲː ]
* すっぴん:
[s ɨ pʲː i ɴ]
* 年月日:
[n e ŋ ɡ a pʲː i]
Occurrences:
5,578
Examples:
* ストリップ:
[s t o ɾʲ i ]
* カップ系:
[k a ɯ k ]
* ひっ迫:
[ç i a k ɯ]
* きっぷ:
[c i ɯ]
Occurrences:
73,996
Examples:
* ブチ切れ:
[b ɯ i c i ɾ e]
* コロンブス:
[k o ɾ o m b ɯ s ɨ]
* 植物園:
[ɕ o k ɯ b ɯ ts ɨ e ɴ]
* ブレン:
[b ɯ ɾ e ɴ]
Occurrences:
15,624
Examples:
* kbs:
[k e s ɨ]
* 手引き:
[t e i c i]
* ビーバー:
[ b ]
* くびれ:
[k ɯ i ɾ e]
Occurrences:
4
Examples:
Occurrences:
44
Examples:
* web:
[w e ɯ]
* ビッベ:
[ i e]
* やばい:
[j a ]
Occurrences:
150,999
Examples:
* 短プラ:
[t a m p ɯ ɾ a]
* 取り込める:
[t o ɾʲ i k o m e ɾ ɯ]
* 採って:
[t o e]
* だして:
[d a ɕ t e]
Occurrences:
3,857
Examples:
* ティアラ:
[ i a ɾ a]
* オーティス:
[ i s ɨ]
* ティー:
[ ]
* ゴーティエ:
[ɡ i e]
Occurrences:
160
Examples:
* メノッティ:
[m e n o tʲː ]
* ブガッティ:
[b ɯ ɡ a tʲː i]
* スコッティ:
[s ɨ k o tʲː i]
Occurrences:
16,804
Examples:
* 当たったり:
[a t a a ɾʲ i]
* 突っ張って:
[ts ɨ a e]
* 目立って:
[m e d a e]
* っぽかった:
[ o k a a]
Occurrences:
68,752
Examples:
* キリン堂:
[c i ɾʲ i n d ]
* 電線向け:
[d e ɰ̃ s e ɯ k e]
* 追い出す:
[o i d a s ɨ]
* 運んだり:
[h a k o n d a ɾʲ i]
Occurrences:
3,717
Examples:
* レディース:
[ɾ e s ɨ]
* ディリップ:
[ i ɾʲ i ɯ]
* デイトン:
[ i t o ɴ]
* デュアル:
[ ɨ a ɾ ɯ]
Occurrences:
3
Examples:
Occurrences:
984
Examples:
* レッド:
[ɾ e o]
* オッド:
[o o]
* リクッド:
[ɾʲ i k ɯ o]
* ヘッド:
[h e o]
Occurrences:
83,466
Examples:
* 競争力:
[c s ɾʲ o k ɯ]
* 動き始めて:
[ɯ ɡ o c i h a ʑ i m e t e]
* 着替え:
[c i ɡ a e]
* プリキュア:
[p ɯ ɾʲ i c ɨ a]
Occurrences:
2,913
Examples:
* 日記的:
[ɲ i i t e c i]
* 六キロ:
[ɾ o i ɾ o]
* くっきり:
[k ɯ̥ i ɾʲ i]
* 消去法:
[ɕ o h ]
Occurrences:
18,348
Examples:
* 営業力:
[ ɟ ɾʲ o k ɯ]
* 岐阜県:
[ɟ i ɸ ɯ k e ɴ]
* 行革審:
[ɟ k a k ɕ i ɴ]
* 工業品:
[k ɟ ç i ɴ]
Occurrences:
10
Examples:
* マッギー:
[m a ɟː ]
Occurrences:
298,910
Examples:
* 虫除け:
[m ɯ ɕ i j o k e]
* 合言葉:
[a i k o t o b a]
* 財テク:
[dz a i t e k ɯ]
* 空気等:
[k ɯː c t ]
Occurrences:
12,247
Examples:
* コサック:
[k o s a ɯ]
* 作家様:
[s a a s a m a]
* 三日目:
[ i a m e]
* ギックリ:
[ɟ i ɯ ɾʲ i]
Occurrences:
86,852
Examples:
* 科学者:
[k a ɡ a k ɕ a]
* 人ごみ:
[ç t o ɡ o i]
* グダグダ:
[ɡ ɯ d a ɡ ɯ d a]
* 慰められる:
[n a ɡ ɯ s a m e ɾ a ɾ e ɾ ɯ]
Occurrences:
479
Examples:
* フラッグス:
[ɸ ɯ ɾ a ɡː ɯ s ɨ]
* レッグ:
[ɾ e ɡː ɯ]
* ドッグ:
[d o ɡː ɯ]
* タッグ:
[t a ɡː ɯ]
Occurrences:
207
Examples:
* ニョロッ:
[ɲ o ɾ o ʔ]
* ポチッ:
[p o i ʔ]
* キャッ:
[c a ʔ]
* しよっ:
[ɕ i j o ʔ]

Affricate

Occurrences:
55,275
Examples:
* 詰め寄った:
[ts ɨ m e j o a]
* 八カ月:
[h a k a ɡ e ts ɨ]
* 郵便物:
[j ɨː i m b ɯ ts ɨ]
* ついつい:
[ts ɨ i ts ɨ i]
Occurrences:
507
Examples:
* 引っ掴んで:
[ç i tsː ɨ k a n d e]
* しかめっ面:
[ɕ i k a m e tsː ɨ ɾ a]
* くっついて:
[k ɯ tsː ɨ i t e]
Occurrences:
10,688
Examples:
* 造幣局:
[dz h c o k ɯ]
* ズオウ:
[dz ɨ ]
* メンズ:
[m e n dz ɨ]
* 属した:
[dz o k ɯ ɕ i t a]
Occurrences:
63
Examples:
* ぜんぜん:
[dz e n dzː e ɴ]
* ドッズ:
[d o dzː ɨ]
* グッズ:
[ɡ ɯ dzː ɨ]
* レッズ:
[ɾ e dzː ɨ]
Occurrences:
47,557
Examples:
* 漁民たち:
[ɟ o i n t a ]
* 非嫡出子:
[ç i a k ɯ̥ ɕ ts ɨ ɕ i]
* 住民たち:
[ ɨː i n t a i]
* 口応え:
[k ɯ i ɡ o t a e]
Occurrences:
2,784
Examples:
* ちっちゃく:
[ i tɕː a k ɯ]
* 一人ぽっち:
[ç i t o ɾʲ i p o tɕː i]
* 殺虫剤:
[s a tɕː ɨː z a i]
* スケッチ:
[s ɨ k e tɕː i]
Occurrences:
22,009
Examples:
* 自律的:
[ i ɾʲ i ts ɨ t e c i]
* ジャスパー:
[ a s p ]
* ジレンマ:
[ i ɾ e a]
* 純文学:
[ ɨ m b ɯ ŋ ɡ a k ɯ]
Occurrences:
266
Examples:
* ビレッジ:
[ i ɾ e dʑː i]
* エッジ:
[e dʑː i]
* ヘッジ売り:
[h e dʑː i ɯ ɾʲ i]
* カレッジ:
[k a ɾ e dʑː i]

Sibilant

Occurrences:
178,174
Examples:
* 励みます:
[h a ɡ e i m a s ɨ]
* 買います:
[k a i m a s ɨ]
* 促そう:
[ɯ n a ɡ a s ]
* 現れます:
[a ɾ a w a ɾ e m a s]
Occurrences:
2,887
Examples:
* 達する:
[t a ɨ ɾ ɯ]
* ベッセマー:
[b e e m ]
* 別世界:
[b e e k a i]
* 一センチ:
[i e ɲ i]
Occurrences:
37,143
Examples:
* 行わず:
[o k o n a w a z ɨ]
* 宝塚市:
[t a k a ɾ a z ɨ k a ɕ i]
* 貴族たち:
[c i z o k t a ]
* 尋ねました:
[t a z ɨ n e m a ɕ t a]
Occurrences:
125,822
Examples:
* 小沢氏:
[o z a w a ɕ]
* 経済誌:
[k z a i ɕ i]
* 歯医者:
[h a i ɕ a]
* 観音寺市:
[k a ɰ̃ o ɲ ʑ i ɕ i]
Occurrences:
3,790
Examples:
* らっしゃい:
[ɾ a ɕː a i]
* アッシャー:
[a ɕː ]
* がっしり:
[ɡ a ɕː i ɾʲ i]
* 合衆国:
[ɡ a ɕː ɨː k o k ɯ̥]
Occurrences:
34,150
Examples:
* 持ち味:
[m o i a ʑ i]
* 王子たち:
[ ʑ i t a i]
* 平城宮:
[h ʑ c ɨː]
* グルジア語:
[ɡ ɯ ɾ ɯ ʑ i a ɡ o]

Fricative

Occurrences:
26,333
Examples:
* 含まね:
[ɸ k ɯ m a n e]
* 振り分け:
[ɸ ɯ ɾʲ i w a k e]
* 不道徳:
[ɸ ɯ d t o k ɯ]
* 後円墳:
[k e ɴ ɸ ɯ ɴ]
Occurrences:
1,793
Examples:
* アルフィー:
[a ɾ ɯ ɸʲ ]
* アラフィフ:
[a ɾ a ɸʲ i ɸ ɯ̥]
* フィア:
[ɸʲ i a]
* フィリップ:
[ɸʲ i ɾʲ i ɯ]
Occurrences:
8
Examples:
* ミッフィー:
[ i ɸʲː ]
* バッフィー:
[b a ɸʲː ]
Occurrences:
152
Examples:
* ベグリッフ:
[b e ɡ ɯ ɾʲ i ɸː ɯ]
* スタッフ:
[s ɨ t a ɸː ɯ]
* ラッフルズ:
[ɾ a ɸː ɯ ɾ ɯ z ɨ]
Occurrences:
675
Examples:
* バーサス:
[v s a s ɨ]
* ノヴェロ:
[n o v e ɾ o]
* アルヴァ:
[a ɾ ɯ v a]
* ヴァン:
[v a ɴ]
Occurrences:
281
Examples:
* ヴィーナス:
[ n a s ɨ]
* ヴィッツ:
[ i tsː ɨ]
* ヴィオラ:
[ i o ɾ a]
* ヴャジマ公:
[ a ʑ i m a k ]
Occurrences:
23,438
Examples:
* 百ドル:
[ç a k ɯ d o ɾ ɯ]
* ひとまず:
[ç t o m a z ɨ]
* 光輝いて:
[ç i k a ɾʲ i k a ɡ a j a i t e]
* 左わき:
[ç i d a ɾʲ i w a c i]
Occurrences:
8
Examples:
Occurrences:
55,324
Examples:
* 入れば:
[h a i ɾ e b a]
* 話し方:
[h a n a ɕ i k a t a]
* 情報等:
[ h t ]
* 開放的:
[k a i h t e c ]
Occurrences:
59
Examples:
* ゲルラッハ:
[ɡ e ɾ ɯ ɾ a a]
* ゼンパッハ:
[dz e m p a a]
* バッハ:
[b a a]

Approximant

Occurrences:
29,803
Examples:
* 救われる:
[s ɨ k ɯ w a ɾ e ɾ ɯ]
* 使わず:
[ts ɨ k a w a z ɨ]
* ワンちゃん:
[w a ɲ a ɴ]
* 加古川市:
[k a k o ɡ a w a ɕ i]
Occurrences:
1
Examples:
Occurrences:
53,417
Examples:
* 易しく:
[j a s a ɕ i k ɯ]
* ヤフー:
[j a ɸ ɯː]
* 吉賀町:
[j o ɕ i k a ]
* 偏った:
[k a t a j o a]
Occurrences:
36,021
Examples:
* 万円強:
[m a ɰ̃ e ɲ c ]
* ラーメン屋:
[ɾ m e ɰ̃ j a]
* ヴァンス:
[v a ɰ̃ s ɨ]
* 本社屋:
[h o ɰ̃ ɕ a j a]

Tap

Occurrences:
189,362
Examples:
* 序列的:
[ o ɾ e ts ɨ t e c i]
* 早良区:
[s a w a ɾ a k ɯ]
* よろこび:
[j o ɾ o k o i]
* エレアコ:
[e ɾ e a k o]
Occurrences:
65,243
Examples:
* がっかり:
[ɡ a a ɾʲ i]
* 支給率:
[ɕ i c ɨː ɾʲ i ts ɨ]
* 建てたり:
[t a t e t a ɾʲ i]
* チューリヒ:
[ ɨː ɾʲ i ç i]
Occurrences:
13
Examples:
Occurrences:
35
Examples:
* バガレッラ:
[b a ɡ a ɾ e ɾː a]

Vowels

Within each cell, vowel symbols on the left are unrounded and those on the right are rounded.

Front | Near-Front | Central | Near-Back | Back

Close

Occurrences:
477,954
Examples:
* 命じられた:
[m ʑ i ɾ a ɾ e t a]
* カチッと:
[k a i o]
* いりません:
[i ɾʲ i m a s e ɴ]
* 国東市:
[k ɯ ɲ i s a c i ɕ i]
Occurrences:
21,669
Examples:
* チーム名:
[ m ɯ m ]
* 引いた:
[ç t a]
* リール:
[ɾʲ ɾ ɯ]
* チーム力:
[ m ɯ ɾʲ o k ɯ]
Occurrences:
1,339
Examples:
* 開放的:
[k a i h t e c ]
* 生徒たち:
[s t o t a ]
* 落とし:
[o t o ɕ ]
* 支払う:
[ɕ h a ɾ a ɯ]
Occurrences:
145,870
Examples:
* イラついて:
[i ɾ a ts ɨ i t e]
* 脱ぎます:
[n ɯ ɟ i m a s ɨ]
* ひきずって:
[ç i c i z ɨ e]
* 支払います:
[ɕ i h a ɾ a i m a s ɨ]
Occurrences:
35,411
Examples:
* 支給額:
[ɕ i c ɨː ɡ a k ɯ]
* 充電池:
[ ɨː d e ɲ i]
* 遊歩道:
[j ɨː h o d ]
* 真最中:
[m a a i ɨː]
Occurrences:
1,306
Examples:
* やります:
[j a ɾʲ i m a s ɨ̥]
* みます:
[ i m a s ɨ̥]
* 特殊的:
[t o k ɕ ɨ̥ t e c i]
* 出資者:
[ɕ ɨ̥ ɕː ɕ a]
Occurrences:
278,480
Examples:
* パスカル:
[p a s k a ɾ ɯ]
* ふれさせ:
[ɸ ɯ ɾ e s a s e]
* 近づく:
[ i k a z ɨ k ɯ]
* 買い込む:
[k a i k o m ɯ]
Occurrences:
5,603
Examples:
* 制空権:
[s k ɯː k e ɴ]
* トゥルー:
[t ɯ ɾ ɯː]
* クーン:
[k ɯː ɴ]
* ムーン:
[m ɯː ɴ]
Occurrences:
1,202
Examples:
* 九カ国:
[c ɨː k a k o k ɯ̥]
* 食ってる:
[k ɯ̥ e ɾ ɯ]
* ガクガク:
[ɡ a k ɯ ɡ a k ɯ̥]
* サクサク:
[s a k ɯ̥ s a k ɯ̥]

Close-Mid

Occurrences:
248,778
Examples:
* 渡された:
[w a t a s a ɾ e t a]
* 追い掛ける:
[o i k a k e ɾ ɯ]
* 挙げられる:
[a ɡ e ɾ a ɾ e ɾ ɯ]
* かかって:
[k a k a e]
Occurrences:
57,741
Examples:
* チェーホフ:
[ h o ɸ ɯ]
* 脱力系:
[d a ts ɨ ɾʲ o k ɯ k ]
* 生徒さん:
[s t o s a ɴ]
* 十世紀:
[ ɨː s c i]
Occurrences:
305,162
Examples:
* 同じく:
[o n a ʑ i k]
* キリスト:
[c i ɾʲ i s ɨ t o]
* アストル:
[a s ɨ t o ɾ ɯ]
* 越生町:
[k o ɕ i o ]
Occurrences:
155,521
Examples:
* 武豊町:
[t a k e t o j o ]
* 衝動的:
[ɕ d t e c i]
* 甲良町:
[k ɾ a ]
* し庁内:
[ɕ i n a i]

Open-Mid

Open

Occurrences:
608,366
Examples:
* 共和党寄り:
[c w a t j o ɾʲ i]
* つくられた:
[ts ɨ k ɯ ɾ a ɾ e t a]
* 五十人弱:
[ɡ o ʑ ɨː ɲ i ɲ a k ɯ]
* 七メートル:
[n a n a m t o ɾ ɯ]
Occurrences:
19,644
Examples:
* オギャー:
[o ɟ ]
* フッカー:
[ɸ ɯ ]
* アメーバー:
[a m b ]
* ターナー:
[t n ]