bennexx/cl-tohoku-bert-base-japanese-v3-jlpt-classifier

Text Classification


bennexx committed on Jan 29, 2024

Commit 33ea441 · verified · 1 Parent(s): 5efa696

Update README.md

Files changed (1)

README.md +15 -14


@@ -11,20 +11,6 @@ This is a text classifier for assigning a [JLPT level](https://www.jlpt.jp/e/abo
 A pre-trained [cl-tohoku-bert-japanese-v3](https://huggingface.co/cl-tohoku/bert-base-japanese-v3) is finetuned on ~5000k labeled sentences obtained from language learning websites.
 Performance on same distribution data is good.
-```
-              precision    recall  f1-score   support
-          N5       0.62      0.66      0.64       145
-          N4       0.34      0.36      0.35       143
-          N3       0.33      0.67      0.45       197
-          N2       0.26      0.20      0.23       192
-          N1       0.59      0.08      0.15       202
-    accuracy                           0.38       879
-   macro avg       0.43      0.39      0.36       879
-weighted avg       0.42      0.38      0.34       879
-```
-But on test data consisting of official JLPT material it is not so good.
 ```
               precision    recall  f1-score   support
           N5       0.88      0.88      0.88        25
@@ -39,4 +25,19 @@ weighted avg       0.85      0.84      0.84       260
+But on test data consisting of official JLPT material it is not so good.
+```
+              precision    recall  f1-score   support
+          N5       0.62      0.66      0.64       145
+          N4       0.34      0.36      0.35       143
+          N3       0.33      0.67      0.45       197
+          N2       0.26      0.20      0.23       192
+          N1       0.59      0.08      0.15       202
+    accuracy                           0.38       879
+   macro avg       0.43      0.39      0.36       879
+weighted avg       0.42      0.38      0.34       879
+```
 Still, it can give a ballpark estimation of sentence difficulty, altough not very precise.
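
As a sanity check on the report for the official JLPT material, the summary rows can be recomputed from the per-class rows. The sketch below assumes the table follows the usual scikit-learn `classification_report` layout (macro average as the unweighted mean over classes, accuracy as support-weighted recall); the `rows` dictionary and the helper names are illustrative and are not part of the repository.

```python
# Per-class rows copied from the report on official JLPT material:
# label -> (precision, recall, f1-score, support)
rows = {
    "N5": (0.62, 0.66, 0.64, 145),
    "N4": (0.34, 0.36, 0.35, 143),
    "N3": (0.33, 0.67, 0.45, 197),
    "N2": (0.26, 0.20, 0.23, 192),
    "N1": (0.59, 0.08, 0.15, 202),
}


def macro_avg(table, col):
    """Unweighted mean of one metric column over all classes."""
    return sum(r[col] for r in table.values()) / len(table)


def accuracy_from_recall(table):
    """Accuracy equals recall weighted by class support.

    The inputs are rounded to two decimals, so the result is approximate.
    """
    total = sum(r[3] for r in table.values())
    correct = sum(r[1] * r[3] for r in table.values())
    return correct / total


total_support = sum(r[3] for r in rows.values())

print("support:", total_support)
print("macro precision:", round(macro_avg(rows, 0), 2))
print("macro recall:", round(macro_avg(rows, 1), 2))
print("macro f1:", round(macro_avg(rows, 2), 2))
print("accuracy:", round(accuracy_from_recall(rows), 2))
```

Because the per-class values are already rounded, recomputed figures may differ from the reported ones in the last digit. The low N1 recall (0.08) next to the high N3 recall (0.67) suggests that many harder sentences are predicted as mid-level, which fits the "ballpark estimation" caveat.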