Japanese MFA dictionary v2.0.1a#

  • Maintainer: Montreal Forced Aligner

  • Language: Japanese

  • Dialect: N/A

  • Phone set: MFA

  • Number of words: 21,175

  • Phones: a b c d dz dzː dʑː e h i j k m mʲː n o p pʲː s t ts tsː tɕː tʲː v w z ç ŋ ɕ ɕː ɟ ɟː ɡ ɡː ɨ ɨː ɨ̥ ɯ ɯː ɯ̥ ɰ̃ ɲ ɲː ɴ ɴː ɸ ɸʲ ɸʲː ɸː ɾ ɾʲ ʑ ʔ

  • License: CC BY 4.0

  • Compatible MFA version: v2.1.0

  • Citation:

@techreport{mfa_japanese_mfa_dictionary_2023,
	author={McAuliffe, Michael and Sonderegger, Morgan},
	title={Japanese MFA dictionary v2.0.1a},
	address={\url{https://mfa-models.readthedocs.io/pronunciation dictionary/Japanese/Japanese MFA dictionary v2_0_1a.html}},
	year={2023},
	month={Jan},
}

Installation#

Install from the MFA command line:

mfa model download dictionary japanese_mfa

Or download from the release page.

The dictionary available from the release page and through command line installation has pronunciation and silence probabilities estimated as part of acoustic model training (see Silence probability format and training pronunciation probabilities for more information). If you would like to use the version of this dictionary without probabilities, please see the [plain dictionary](https://raw.githubusercontent.com/MontrealCorpusTools/mfa-models/main/dictionary/japanese/mfa/Japanese MFA dictionary v2_0_1a.dict).
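To make the difference between the two versions concrete, here is a minimal Python sketch that reduces a line with probabilities to a plain entry. It assumes each line is tab-separated, with the word first, the space-separated phones last, and any numeric probability fields in between; check the Silence probability format documentation for the authoritative layout. The names `Entry`, `parse_line`, and `to_plain` are hypothetical, and the probability values in the example are made up for illustration.

```python
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    word: str
    probabilities: List[float]  # empty for plain-dictionary lines
    phones: List[str]


def parse_line(line: str) -> Optional[Entry]:
    """Parse one tab-separated dictionary line (assumed layout, see above)."""
    fields = [f.strip() for f in line.rstrip("\n").split("\t")]
    if len(fields) < 2 or not fields[0]:
        return None  # blank or malformed line
    # First field is the word, last field is the pronunciation,
    # anything in between is treated as a numeric probability column.
    word, *middle, pronunciation = fields
    probabilities = [float(f) for f in middle]
    return Entry(word, probabilities, pronunciation.split())


def to_plain(line: str) -> Optional[str]:
    """Drop the probability columns, keeping only word and phones."""
    entry = parse_line(line)
    if entry is None:
        return None
    return f"{entry.word}\t{' '.join(entry.phones)}"


# Illustrative line: the four numbers are invented, not taken from the dictionary.
example = "とんでも\t0.99\t0.12\t1.0\t1.0\tt o n d e m o"
print(to_plain(example))
```

A line that already has only a word and a pronunciation passes through `to_plain` unchanged, so the same function can be run over either version of the file.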

Intended use#

This dictionary is intended for forced alignment of Japanese transcripts.

This dictionary uses the MFA phone set for Japanese, and was used in training the Japanese MFA acoustic model. Pronunciations can be added on top of the dictionary, as long as no additional phones are introduced.
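The constraint on added pronunciations can be checked mechanically before alignment. The sketch below builds a phone inventory from existing entries and reports any phone in a candidate pronunciation that falls outside it. The three existing entries are copied from the examples on this page; the two candidate variants and the function names are hypothetical, and in practice the inventory should be built from the full dictionary rather than a handful of words.

```python
from typing import Iterable, List, Set


def phone_inventory(pronunciations: Iterable[List[str]]) -> Set[str]:
    """Collect every phone that appears in the given pronunciations."""
    return {phone for pron in pronunciations for phone in pron}


def unknown_phones(candidate: List[str], inventory: Set[str]) -> List[str]:
    """Return the phones in `candidate` that are not in `inventory`."""
    return [p for p in candidate if p not in inventory]


# Entries copied from the examples on this page.
existing = {
    "とんでも": "t o n d e m o".split(),
    "まるで": "m a ɾ ɯ d e".split(),
    "パンク": "p a ŋ k ɯ".split(),
}
inventory = phone_inventory(existing.values())

# Hypothetical additions, for illustration only.
ok_variant = "t o n d e".split()       # uses only known phones
bad_variant = "t o n d ə m o".split()  # "ə" is not in the phone set

print(unknown_phones(ok_variant, inventory))   # []
print(unknown_phones(bad_variant, inventory))  # ['ə']
```

An empty result means the pronunciation introduces no new phones and can be added safely with respect to this constraint.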

Performance Factors#

When trying to get better alignment accuracy, adding pronunciations is generally helpful, especially for different styles and dialects. The most impactful improvements will generally be seen when adding reduced variants that delete segments or syllables, as is common in spontaneous speech. Alignment must include all phones specified in the pronunciation of a word, and each phone has a minimum duration (by default 10 ms). If a speaker pronounces a multisyllabic word as a single syllable, MFA may be unable to fit all the segments into the available time, which leads to alignment errors on adjacent words as well.
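The duration constraint can be illustrated with simple arithmetic: a pronunciation with N phones needs at least N times the minimum phone duration. The sketch below uses the 10 ms default mentioned above and integer milliseconds to avoid floating-point comparisons. The function names, the reduced variant, and the 60 ms stretch of speech are all hypothetical; this is not how MFA itself selects pronunciations, only a way to see why a shorter variant can help.

```python
from typing import List

MIN_PHONE_MS = 10  # default minimum phone duration stated above


def min_word_duration_ms(phones: List[str], min_phone_ms: int = MIN_PHONE_MS) -> int:
    """Shortest duration into which all phones of a pronunciation can fit."""
    return len(phones) * min_phone_ms


def variants_that_fit(variants: List[List[str]], available_ms: int) -> List[List[str]]:
    """Keep only the variants whose minimum duration fits the available time."""
    return [v for v in variants if min_word_duration_ms(v) <= available_ms]


full = "t o n d e m o".split()  # とんでも, as listed on this page
reduced = "t o n d e".split()   # hypothetical reduced variant

print(min_word_duration_ms(full))     # 70
print(min_word_duration_ms(reduced))  # 50

# If the speaker spent only 60 ms on the word, only the reduced variant fits.
print(variants_that_fit([full, reduced], available_ms=60))
```

Without the reduced variant in the dictionary, the aligner would have to take the missing 10 ms from neighboring words, which is the source of the adjacent-word errors described above.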

Ethical considerations#

Deploying any Speech-to-Text model into any production setting has ethical implications. You should consider these implications before use.

Demographic Bias#

You should assume every machine learning model has demographic bias unless proven otherwise. For pronunciation dictionaries, it is often the case that transcription accuracy and lexicon coverage are better for the prestige variety modeled in this dictionary than for other variants. If you are using this dictionary in production, you should acknowledge this as a potential issue.

IPA Charts#

Consonants#

Within each cell, obstruent symbols on the left are unvoiced and those on the right are voiced.

Manner | Labial | Labiodental | Alveolar | Palatal | Velar | Uvular | Glottal

Nasal

Occurrences:
6,024
Examples:
* まくる:
[m a k ɯ ɾ ɯ]
* 近江八幡:
[ i h a i m a ɴ]
* とんでも:
[t o n d e m o]
* 如きもの:
[ɡ o t o c i m o n o]
Occurrences:
1,452
Examples:
* 海沿い:
[ɯ i z o i]
* 南相馬:
[ i n a i s m a]
* 見守り:
[ i m a m o ɾʲ i]
* 明るみ:
[a k a ɾ ɯ i]
Occurrences:
43
Examples:
* シンミリ:
[ɕ i mʲː i ɾʲ i]
Occurrences:
119
Examples:
* まんま:
[m a a]
* ジレンマ:
[ i ɾ e a]
* 真ん前:
[m a a e]
* ヴェンメン:
[v e e ɰ̃]
Occurrences:
5,210
Examples:
* モーガン:
[m ɡ a n]
* マグネット:
[m a ɡ ɯ n e o]
* guns:
[ɡ a n dz ɨ]
* 狙って:
[n e ɾ a e]
Occurrences:
168
Examples:
* すんな:
[s ɨ a]
* 大なれ:
[d a a ɾ e]
* まん中:
[m a a k a]
* どんな:
[d o a]
Occurrences:
2,562
Examples:
* ワクチン:
[w a k ɯ̥ i ɲ]
* アンチ:
[a ɲ i]
* のんき:
[n o ɲ c i]
* シーニック:
[ɕ ɲ i ɯ]
Occurrences:
79
Examples:
* ニンニク:
[ɲ i ɲː i k]
* コンニャク:
[k o ɲː a k ɯ]
* にんにく:
[ɲ i ɲː i k ɯ]
* なんに:
[n a ɲː i]
Occurrences:
2,101
Examples:
* スーザン:
[s ɨː z a ŋ]
* パンク:
[p a ŋ k ɯ]
* カンニング:
[k a ɲː i ŋ ɡ ɯ]
* クウェーン:
[k ɯ w ŋ]
Occurrences:
2,625
Examples:
* 一トン:
[i o ɴ]
* オリン:
[o ɾʲ i ɴ]
* アメリカン:
[a m e ɾʲ i k a ɴ]
* シャイアン:
[ɕ a i a ɴ]
Occurrences:
5
Examples:
* うーん:
[ɴː]

Stop

Occurrences:
1,115
Examples:
* スポンサー:
[s ɨ p o ɰ̃ s ]
* ピレネー:
[ i ɾ e ]
* ピオリア:
[ i o ɾʲ i a]
* パトリシア:
[p a t o ɾʲ i ɕ i a]
Occurrences:
305
Examples:
* 酔っぱらい:
[j o a ɾ a i]
* ストップ:
[s ɨ̥ t o ɯ̥]
* ホップ:
[h o ɯ̥]
* いっぺん:
[i e ɰ̃]
Occurrences:
2,817
Examples:
* バルーン:
[b a ɾ ɯː ŋ]
* ビッグ:
[ i ɡː ɯ]
* アリバイ:
[a ɾʲ i b a i]
* ブレーク:
[b ɯ ɾ k ɯ̥]
Occurrences:
1
Examples:
* やばい:
[j a ]
Occurrences:
6,811
Examples:
* ニチイ:
[ɲ i ]
* 子育て:
[k o s o d a t e]
* 行って:
[i e]
* とろろ:
[t o ɾ o ɾ o]
Occurrences:
1,060
Examples:
* のぼった:
[n o b o a]
* 貯まった:
[t a m a a]
* まちがって:
[m a i ɡ a e]
* ありったけ:
[a ɾʲ i a k e]
Occurrences:
2,931
Examples:
* デビュー:
[d e ɨː]
* タンザニア:
[t a n dz a ɲ i a]
* ジャリ:
[ a ɾʲ i]
* まるで:
[m a ɾ ɯ d e]
Occurrences:
29
Examples:
* ピラミッド:
[ i ɾ a i o]
* デッドキー:
[d e o c ]
* レッド:
[ɾ e o]
* ゴッド:
[ɡ o o]
Occurrences:
4,140
Examples:
* スキイ:
[s ɨ c i i]
* sqb:
[e s ɨ c ɨː ]
* メキシコ:
[m e c ɕ k o]
* 教科書:
[c k a ɕ o]
Occurrences:
142
Examples:
* すっきり:
[s ɨ̥ i ɾʲ i]
* ひっきり:
[ç i i ɾʲ i]
* 大っきく:
[o i k ɯ]
* っきり:
[ i ɾʲ i]
Occurrences:
604
Examples:
* ギレスピー:
[ɟ i ɾ e s ɨ ]
* ぎょっと:
[ɟ o o]
* ギテガ:
[ɟ i t e ɡ a]
* つなぎ:
[ts ɨ n a ɟ i]
Occurrences:
1
Examples:
* マッギー:
[m a ɟː ]
Occurrences:
14,171
Examples:
* 政財界:
[s z a i k a i]
* クール:
[k ɯː ɾ ɯ]
* コンウェイ:
[k o ɰ̃ w e i]
* 替わる:
[k a w a ɾ ɯ]
Occurrences:
747
Examples:
* どっかり:
[d o a ɾʲ i]
* 引っかき:
[ç i a c i]
* 撤回し:
[t e a i ɕ i]
* スナック:
[s ɨ n a ɯ̥]
Occurrences:
3,340
Examples:
* 大げさ:
[ ɡ e s a]
* 手掛かり:
[t e ɡ a k a ɾʲ i]
* モデルガン:
[m o d e ɾ ɯ ɡ a m]
* すぐれ:
[s ɨ ɡ ɯ ɾ e]
Occurrences:
21
Examples:
* すごい:
[s ɨ ɡː ]
* エッグ:
[e ɡː ɯ]
* ウィッグ:
[w i ɡː ɯ]
* ドッグ:
[d o ɡː ɯ]
Occurrences:
17
Examples:
* わるっ:
[w a ɾ ɯ ʔ]
* ギュッ:
[ɟ ɨ ʔ]
* おおっ:
[ ʔ]
* うまい:
[ɯ m a ʔ]

Affricate

Occurrences:
3,406
Examples:
* そいつ:
[s o i ts ɨ̥]
* つけよう:
[ts ɨ k e j ]
* 手伝った:
[t e ts ɨ d a a]
* アーツ:
[ ts ɨ]
Occurrences:
47
Examples:
* 突っつく:
[ts ɨ tsː ɨ k ɯ]
* やっつけ:
[j a tsː ɨ̥ k e]
* フィッツ:
[ɸʲ i tsː]
* いっつ:
[i tsː ɨ]
Occurrences:
649
Examples:
* づきあい:
[dz ɨ c i a i]
* 半蔵門:
[h a n dz m o ɴ]
* ずっしり:
[dz ɨ ɕː i ɾʲ i]
* ざんまい:
[dz a a i]
Occurrences:
3
Examples:
* キッズ:
[c i dzː ɨ]
* グッズ:
[ɡ ɯ dzː ɨ]
* ぜんぜん:
[dz e n dzː e ɴ]
Occurrences:
2,443
Examples:
* リチャード:
[ɾʲ i d o]
* 賃上げ:
[ i ɰ̃ a ɡ e]
* チューリヒ:
[ ɨː ɾʲ i ç i]
* 飛ばっちり:
[t o b a tɕː i ɾʲ i]
Occurrences:
154
Examples:
* 一長一短:
[i tɕː i a ŋ]
* マッチング:
[m a tɕː i ŋ ɡ ɯ]
* 小っちゃい:
[ tɕː ]
* スイッチ:
[s ɨ i tɕː]
Occurrences:
984
Examples:
* ジェリー:
[ e ɾʲ ]
* ジグソー:
[ i ɡ ɯ s ]
* ジェラルド:
[ e ɾ a ɾ ɯ d o]
* ジャック:
[ a ɯ]
Occurrences:
9
Examples:
* カレッジ:
[k a ɾ e dʑː i]
* ヘッジ:
[h e dʑː i]
* ビレッジ:
[ i ɾ e dʑː i]
* レッジ:
[ɾ e dʑː i]

Sibilant

Occurrences:
8,206
Examples:
* マスタード:
[m a s ɨ t d o]
* マックス:
[m a ɯ s ɨ]
* スポーン:
[s ɨ p ɴ]
* ストーカー:
[s ɨ̥ t k ]
Occurrences:
231
Examples:
* くっついて:
[k ɯ tsː ɨ i t e]
* 一センチ:
[i e ɲ i]
* フィッツ:
[ɸʲ i tsː]
* 没する:
[b o ɨ ɾ ɯ]
Occurrences:
1,651
Examples:
* ブラザーズ:
[b ɯ ɾ a z z ɨ]
* 恥ずかしい:
[h a z ɨ k a ɕ ]
* バンザイ:
[b a n dz a i]
* バイザー:
[b a i z ]
Occurrences:
6,728
Examples:
* シュー:
[ɕ ɨː]
* 台無し:
[d a i n a ɕ i]
* ピンチ:
[ i ɲ ]
* チャーター:
[ t ]
Occurrences:
265
Examples:
* ボッシ:
[b o ɕː i]
* スケッチ:
[s k e tɕː ]
* フィッシュ:
[ɸʲ i ɕː ɨ]
* 独りぼっち:
[ç i t o ɾʲ i b o tɕː i]
Occurrences:
1,814
Examples:
* 初めて:
[h a ʑ i m e t e]
* ngo:
[e n ɯ ʑ ]
* おやじ:
[o j a ʑ i]
* じょうぶ:
[ʑ b ɯ]

Fricative

Occurrences:
1,494
Examples:
* 踏んで:
[ɸ ɯ n d e]
* アウフゲ:
[a ɯ ɸ ɯ ɡ e]
* ファスト:
[ɸ a s ɨ t o]
* ファイヴ:
[ɸ a i v ɯ]
Occurrences:
69
Examples:
* フィル:
[ɸʲ i ɾ ɯ]
* フィジー:
[ɸʲ i ʑ ]
* フィッシュ:
[ɸʲ i ɕː ɨ]
* カダフィ:
[k a d a ɸʲ i]
Occurrences:
1
Examples:
* ダッフィ:
[d a ɸʲː i]
Occurrences:
10
Examples:
* ベグリッフ:
[b e ɡ ɯ ɾʲ i ɸː ɯ̥]
* ラッフルズ:
[ɾ a ɸː ɯ ɾ ɯ z ɨ]
* スタッフ:
[s t a ɸː]
Occurrences:
32
Examples:
* ヴャジマ:
[ a ʑ i m a]
* レヴァン:
[ɾ e v a ɴ]
* ヴェーザー:
[v z ]
* ヴィシュヌ:
[ i ɕ ɨ n ɯ]
Occurrences:
19
Examples:
* catv:
[ɕ i]
* リヴィウ:
[ɾʲ i i ɯ]
* ヴャジマ:
[ a ʑ i m a]
* ヴィシュヌ:
[ i ɕ ɨ n ɯ]
Occurrences:
1,151
Examples:
* ひろげ:
[ç i ɾ o ɡ e]
* ひいて:
[ç t e]
* ヒーター:
[ç t ]
* 引いた:
[ç t a]
Occurrences:
2,516
Examples:
* はいり:
[h a i ɾʲ i]
* ほんと:
[h o n t o]
* 法省令:
[h s ɾ ]
* 懸け橋:
[k a k e h a ɕ i]
Occurrences:
2
Examples:
* ゼンパッハ:
[dz e m p a a]
* バッハ:
[b a a]

Approximant

Occurrences:
1,474
Examples:
* 申し訳:
[m ɕ i w a k e]
* スウェット:
[s ɨ w e o]
* まわって:
[m a w a e]
* 壊れる:
[k o w a ɾ e ɾ ɯ]
Occurrences:
2,406
Examples:
* やたら:
[j a t a ɾ a]
* 見よう:
[ i j ]
* 沈みゆく:
[ɕ i z ɨ i j ɨ k ɯ]
* かゆい:
[k a j ɨ i]
Occurrences:
2,755
Examples:
* エチレン:
[e i ɾ e ɰ̃]
* ベルグソン:
[b e ɾ ɯ ɡ ɯ s o ɰ̃]
* エッセンス:
[e e ɰ̃ s ɨ]
* オレゴン:
[o ɾ e ɡ o ɰ̃]

Tap

Occurrences:
6,087
Examples:
* ラテン:
[ɾ a t e ɲ]
* 繰返さ:
[k ɯ ɾʲ i k a e s a]
* イコール:
[i k ɾ ɯ]
* エリオット:
[e ɾʲ i o o]
Occurrences:
2,568
Examples:
* チャリング:
[ a ɾʲ i ŋ ɡ ɯ]
* エミリー:
[e i ɾʲ ]
* 盛りあげる:
[m o ɾʲ i a ɡ e ɾ ɯ]
* 送り状:
[o k ɯ ɾʲ i ʑ ]

Vowels#

Within each cell, vowel symbols on the left are unrounded and those on the right are rounded.

Front | Near-Front | Central | Near-Back | Back

Close

Occurrences:
19,507
Examples:
* 追い詰め:
[o i ts ɨ m e]
* 拾った:
[ç i ɾ o a]
* 東広島:
[ç i ɡ a ɕ i ç i ɾ o ɕ i m a]
* ワサビ:
[w a s a i]
Occurrences:
879
Examples:
* リピート:
[ɾʲ i t o]
* 新しい:
[a t a ɾ a ɕ ]
* ディーラー:
[ ɾ ]
* アリーナ:
[a ɾʲ n a]
Occurrences:
2,349
Examples:
* 執よう:
[ɕ ts ɨ j ]
* 然らざれ:
[ɕ k a ɾ a z a ɾ e]
* 押し掛け:
[o ɕ k a k e]
* きたして:
[c t a ɕ t e]
Occurrences:
5,721
Examples:
* すっごい:
[s ɨ ɡː o i]
* 起こす:
[o k o s ɨ̥]
* けん銃:
[k e ɲ ɨː]
* リンス:
[ɾʲ i ɰ̃ s ɨ]
Occurrences:
1,542
Examples:
* ヒューズ:
[ç ɨː z ɨ]
* コミューン:
[k o ɨː ŋ]
* 富士通:
[ɸ ɯ ʑ i ts ɨː]
* スーダン:
[s ɨː d a ɴ]
Occurrences:
1,709
Examples:
* ホノリウス:
[h o n o ɾʲ i ɯ s ɨ̥]
* スペシャル:
[s ɨ̥ p e ɕ a ɾ ɯ]
* ハリス:
[h a ɾʲ i s ɨ̥]
* マスタード:
[m a s ɨ̥ t d o]
Occurrences:
9,690
Examples:
* 炒める:
[i t a m e ɾ ɯ]
* 受け取る:
[ɯ k e t o ɾ ɯ]
* うなずい:
[ɯ n a z ɨ i]
* ぬるぬる:
[n ɯ ɾ ɯ n ɯ ɾ ɯ]
Occurrences:
232
Examples:
* クープ:
[k ɯː p ɯ]
* 封じ込めれ:
[ɸ ɯː ʑ i k o m e ɾ e]
* ループ:
[ɾ ɯː p ɯ]
* フード:
[ɸ ɯː d o]
Occurrences:
2,264
Examples:
* 福知山:
[ɸ ɯ̥ k ɯ i j a m a]
* 振って:
[ɸ ɯ̥ e]
* 読みふける:
[j o i ɸ ɯ̥ k e ɾ ɯ]
* 草分け:
[k ɯ̥ s a w a k e]

Close-Mid

Occurrences:
10,685
Examples:
* けんすい:
[k e ɰ̃ s ɨ i]
* メージャー:
[m ʑ ]
* ゆがめ:
[j ɨ ɡ a m e]
* はかれ:
[h a k a ɾ e]
Occurrences:
2,247
Examples:
* ケージ:
[k ʑ i]
* ベーシス:
[b ɕ i s ɨ]
* 安芸高田:
[a ɡ a t a]
* グレーター:
[ɡ ɯ ɾ t ]
Occurrences:
12,811
Examples:
* 乏しい:
[t o b o ɕ ]
* クボタ:
[k ɯ b o t a]
* フォルク:
[ɸ o ɾ ɯ k ɯ]
* スローガン:
[s ɨ ɾ ɡ a ɰ̃]
Occurrences:
5,471
Examples:
* レチノール:
[ɾ e i n ɾ ɯ]
* なろう:
[n a ɾ ]
* きのう:
[c i n ]
* コービー:
[k ]

Open-Mid

Open

Occurrences:
26,310
Examples:
* 日高川:
[ç i d a k a k a w a]
* せがむ:
[s e ɡ a m ɯ]
* ハーベイ:
[h b e i]
* 誤って:
[a j a m a e]
Occurrences:
954
Examples:
* ミスター:
[ i s ɨ t ]
* 棚上げ:
[t a n ɡ e]
* オファー:
[o ɸ ]
* ワーナー:
[w n ]