Hubert-common_voice-phonemes-debug

This model is a fine-tuned version of rinna/japanese-hubert-base on the Japanese (ja) subset of the mozilla-foundation/common_voice_13_0 dataset. It achieves the following results on the evaluation set:

Loss: 0.4214
Wer: 0.9845
Cer: 0.1934

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 16
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 32
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 12500
num_epochs: 30.0
mixed_precision_training: Native AMP
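
The hyperparameters above interact in two ways that are easy to miss, and the following minimal sketch makes them explicit. It assumes a single training device (consistent with 16 x 2 = 32) and a schedule with linear warmup followed by cosine decay, which is how Transformers normally implements lr_scheduler_type cosine. The value of 376 steps per epoch is not stated in this card; it is inferred from the results table (step 9400 at epoch 25.0). The function names are hypothetical.

```python
import math

# Values taken from the hyperparameter list above.
LEARNING_RATE = 3e-4
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 2
WARMUP_STEPS = 12_500
NUM_EPOCHS = 30

# Assumption: inferred from the results table (step 9400 at epoch 25.0).
STEPS_PER_EPOCH = 376


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Samples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accum_steps * num_devices


def lr_at(step, total_steps, peak=LEARNING_RATE, warmup=WARMUP_STEPS):
    """Linear warmup to `peak`, then cosine decay to zero."""
    if step < warmup:
        return peak * step / max(1, warmup)
    progress = (step - warmup) / max(1, total_steps - warmup)
    return peak * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


total_steps = NUM_EPOCHS * STEPS_PER_EPOCH
batch = effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS)
final_lr = lr_at(total_steps, total_steps)

print(f"effective batch size: {batch}")
print(f"total optimizer steps: {total_steps}")
print(f"warmup finishes before training ends: {WARMUP_STEPS < total_steps}")
print(f"learning rate at the last step: {final_lr:.4e}")
```

Under these assumptions the run ends after about 11,280 optimizer steps, before the 12,500 warmup steps complete. The learning rate would therefore still be rising at the end of training and the cosine decay phase would never start.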

Training results

Training Loss	Epoch	Step	Validation Loss	Wer	Cer
No log	0.2660	100	18.5364	1.0645	1.8292
No log	0.5319	200	8.2791	1.0	0.9813
No log	0.7979	300	7.0224	1.0	0.9813
No log	1.0638	400	6.3106	1.0	0.9813
8.9892	1.3298	500	5.5223	1.0	0.9813
8.9892	1.5957	600	4.7121	1.0	0.9813
8.9892	1.8617	700	4.0028	1.0	0.9813
8.9892	2.1277	800	3.4755	1.0	0.9813
8.9892	2.3936	900	3.1988	1.0	0.9813
3.7187	2.6596	1000	3.0792	1.0	0.9813
3.7187	2.9255	1100	3.0459	1.0	0.9813
3.7187	3.1915	1200	3.0360	1.0	0.9813
3.7187	3.4574	1300	3.0084	1.0	0.9813
3.7187	3.7234	1400	2.4956	1.0	0.9343
2.783	3.9894	1500	1.4418	1.0	0.3331
2.783	4.2553	1600	1.0228	1.0	0.2753
2.783	4.5213	1700	0.8218	1.0	0.2532
2.783	4.7872	1800	0.7084	1.0	0.2433
2.783	5.0532	1900	0.6306	1.0	0.2337
0.8659	5.3191	2000	0.5934	1.0	0.2310
0.8659	5.5851	2100	0.5648	1.0	0.2284
0.8659	5.8511	2200	0.5330	1.0	0.2214
0.8659	6.1170	2300	0.5139	1.0	0.2209
0.8659	6.3830	2400	0.4907	1.0	0.2159
0.5271	6.6489	2500	0.4640	1.0	0.2160
0.5271	6.9149	2600	0.4609	1.0	0.2112
0.5271	7.1809	2700	0.4550	1.0001	0.2097
0.5271	7.4468	2800	0.4601	0.9992	0.2100
0.5271	7.7128	2900	0.4290	0.9953	0.2051
0.4244	7.9787	3000	0.4256	0.9971	0.2024
0.4244	8.2447	3100	0.4135	0.9999	0.2014
0.4244	8.5106	3200	0.4125	0.9956	0.1999
0.4244	8.7766	3300	0.3886	0.9942	0.1927
0.4244	9.0426	3400	0.3833	1.0006	0.1911
0.3373	9.3085	3500	0.3611	1.0364	0.1887
0.3373	9.5745	3600	0.3585	1.0080	0.1843
0.3373	9.8404	3700	0.3562	0.9981	0.1855
0.3373	10.1064	3800	0.3412	0.9883	0.1799
0.3373	10.3723	3900	0.3561	0.9835	0.1846
0.2779	10.6383	4000	0.3482	0.9772	0.1798
0.2779	10.9043	4100	0.3266	0.9795	0.1793
0.2779	11.1702	4200	0.3484	0.9792	0.1789
0.2779	11.4362	4300	0.3378	0.9992	0.1799
0.2779	11.7021	4400	0.3330	0.9764	0.1795
0.2409	11.9681	4500	0.3208	0.9781	0.1792
0.2409	12.2340	4600	0.3602	0.9757	0.1805
0.2409	12.5	4700	0.3363	0.9939	0.1788
0.2409	12.7660	4800	0.3253	0.9732	0.1795
0.2409	13.0319	4900	0.3285	0.9711	0.1762
0.2104	13.2979	5000	0.3233	0.9729	0.1769
0.2104	13.5638	5100	0.3363	0.9775	0.1827
0.2104	13.8298	5200	0.3371	0.9684	0.1759
0.2104	14.0957	5300	0.3464	0.9731	0.1778
0.2104	14.3617	5400	0.3450	0.9777	0.1783
0.1947	14.6277	5500	0.3442	0.9681	0.1773
0.1947	14.8936	5600	0.3346	0.9858	0.1780
0.1947	15.1596	5700	0.3524	0.9732	0.1771
0.1947	15.4255	5800	0.3414	0.9782	0.1774
0.1947	15.6915	5900	0.3438	1.0019	0.1766
0.1892	15.9574	6000	0.3391	0.9706	0.1802
0.1892	16.2234	6100	0.3505	0.9782	0.1803
0.1892	16.4894	6200	0.3467	0.9736	0.1767
0.1892	16.7553	6300	0.3681	0.9946	0.1792
0.1892	17.0213	6400	0.3557	1.0104	0.1769
0.1749	17.2872	6500	0.3446	0.9770	0.1787
0.1749	17.5532	6600	0.3496	0.9839	0.1803
0.1749	17.8191	6700	0.3585	1.0012	0.1806
0.1749	18.0851	6800	0.3562	0.9717	0.1799
0.1749	18.3511	6900	0.3722	1.0504	0.1835
0.1717	18.6170	7000	0.3554	0.9772	0.1809
0.1717	18.8830	7100	0.3678	0.9684	0.1788
0.1717	19.1489	7200	0.4938	1.0419	0.1854
0.1717	19.4149	7300	0.3926	0.9827	0.1805
0.1717	19.6809	7400	0.3581	1.0001	0.1819
0.1715	19.9468	7500	0.3569	0.9929	0.1840
0.1715	20.2128	7600	0.3911	0.9969	0.1814
0.1715	20.4787	7700	0.3973	1.0017	0.1808
0.1715	20.7447	7800	0.3943	0.9724	0.1839
0.1715	21.0106	7900	0.3984	0.9764	0.1823
0.1667	21.2766	8000	0.4306	1.0500	0.1840
0.1667	21.5426	8100	0.3794	0.9728	0.1882
0.1667	21.8085	8200	0.3966	0.9913	0.1834
0.1667	22.0745	8300	0.3981	0.9745	0.1838
0.1667	22.3404	8400	0.4328	0.9926	0.1826
0.1625	22.6064	8500	0.4087	0.9710	0.1835
0.1625	22.8723	8600	0.4149	1.0062	0.1861
0.1625	23.1383	8700	0.4107	0.9921	0.1875
0.1625	23.4043	8800	0.4140	0.9835	0.1869
0.1625	23.6702	8900	0.4087	0.9918	0.1890
0.1647	23.9362	9000	0.4083	0.9842	0.1870
0.1647	24.2021	9100	0.4006	0.9858	0.1847
0.1647	24.4681	9200	0.4137	1.0015	0.1850
0.1647	24.7340	9300	0.4107	0.9994	0.1906
0.1647	25.0	9400	0.4209	0.9843	0.1912
0.1667	25.2660	9500	0.4373	0.9957	0.1893
0.1667	25.5319	9600	0.4390	0.9822	0.1890
0.1667	25.7979	9700	0.4539	0.9857	0.1964
0.1667	26.0638	9800	0.4381	1.0037	0.1933
0.1667	26.3298	9900	0.4227	0.9875	0.1865
0.1644	26.5957	10000	0.4802	1.0266	0.1884
0.1644	26.8617	10100	0.4389	0.9950	0.1958
0.1644	27.1277	10200	0.4744	0.9828	0.1939
0.1644	27.3936	10300	0.4494	1.0006	0.1983
0.1644	27.6596	10400	0.4414	0.9963	0.1961
0.1742	27.9255	10500	0.4668	0.9764	0.1932
0.1742	28.1915	10600	0.4284	0.9720	0.1878
0.1742	28.4574	10700	0.4258	1.0279	0.1944
0.1742	28.7234	10800	0.4251	1.0024	0.1892
0.1742	28.9894	10900	0.4597	1.0201	0.1978
0.1669	29.2553	11000	0.4414	0.9879	0.1919
0.1669	29.5213	11100	0.4473	0.9772	0.1909
0.1669	29.7872	11200	0.4527	0.9944	0.1933
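
In the table above, Wer stays close to 1.0 for the whole run while Cer falls to about 0.18, and both exceed 1.0 at step 100. The sketch below is a simplified illustration of how such values can arise. It is not the evaluation code used for this model, and the function names and example strings are hypothetical. Both metrics are an edit distance divided by the reference length, so they can exceed 1.0 when the hypothesis is much longer than the reference. One possible explanation for the high Wer, which this card does not confirm, is that the target transcripts contain few or no spaces, so that almost any character error counts as a full word error.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(ref, hyp):
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = ref.split()
    return edit_distance(ref_words, hyp.split()) / len(ref_words)


def cer(ref, hyp):
    """Character error rate: character-level edit distance over reference length."""
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical transcript without spaces: one wrong character out of ten.
reference = "konnichiwa"
hypothesis = "konnichiha"
print(wer(reference, hypothesis))  # the single "word" is wrong
print(cer(reference, hypothesis))  # only one character in ten is wrong

# A hypothesis longer than the reference pushes the rate above 1.0.
print(wer("a", "b c"))
```

With an unsegmented reference the whole utterance is one word, so a single character error gives a word error rate of 1.0 and a character error rate of 0.1.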

Framework versions

Transformers 4.47.0.dev0
Pytorch 2.5.1+cu124
Datasets 3.1.0
Tokenizers 0.20.3
