Update README.md
README.md
CHANGED
@@ -18,12 +18,12 @@ Here are the initial learning rates required to continue training at each checkp
 
 - **[Doge-20M](https://huggingface.co/SmallDoge/Doge-20M-checkpoint)**: 8e-3
 - **[Doge-60M](https://huggingface.co/SmallDoge/Doge-60M-checkpoint)**: 6e-3
-- **[Doge-160M](
-- **Doge-320M**: 2e-3
+- **[Doge-160M](https://huggingface.co/SmallDoge/Doge-160M-checkpoint)**: 4e-3
+- **[Doge-320M](https://huggingface.co/SmallDoge/Doge-320M-checkpoint)**: 2e-3
 
 | Model | Learning Rate | Schedule | Warmup Steps | Stable Steps |
 |-------|---------------|----------|--------------|--------------|
-| Doge-20M | 8e-3 | wsd_scheduler | 800 | 6400 |
-| Doge-60M | 6e-3 | wsd_scheduler | 1600 | 12800 |
-| Doge-160M | 4e-3 | wsd_scheduler | 2400 | 19200 |
-| Doge-320M | 2e-3 | wsd_scheduler | 3200 | 25600 |
+| [Doge-20M](https://huggingface.co/SmallDoge/Doge-20M-checkpoint) | 8e-3 | wsd_scheduler | 800 | 6400 |
+| [Doge-60M](https://huggingface.co/SmallDoge/Doge-60M-checkpoint) | 6e-3 | wsd_scheduler | 1600 | 12800 |
+| [Doge-160M](https://huggingface.co/SmallDoge/Doge-160M-checkpoint) | 4e-3 | wsd_scheduler | 2400 | 19200 |
+| [Doge-320M](https://huggingface.co/SmallDoge/Doge-320M-checkpoint) | 2e-3 | wsd_scheduler | 3200 | 25600 |
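
The table's columns can be read as the parameters of a warmup-stable-decay (WSD) learning-rate schedule. The sketch below illustrates that shape only; it is not the project's `wsd_scheduler`. It assumes a linear warmup from zero, treats "Stable Steps" as the length of the constant phase, and uses a linear decay to zero. The function name `wsd_lr`, the `SCHEDULES` dictionary, and the `decay_steps` value are hypothetical, since the README gives no decay length.

```python
# Hypothetical sketch of a warmup-stable-decay (WSD) schedule.
# Assumptions (not from the README): linear warmup from 0, linear decay to 0,
# and a caller-supplied decay_steps, which the table does not specify.

# (peak learning rate, warmup steps, stable steps) copied from the table.
SCHEDULES = {
    "Doge-20M": (8e-3, 800, 6400),
    "Doge-60M": (6e-3, 1600, 12800),
    "Doge-160M": (4e-3, 2400, 19200),
    "Doge-320M": (2e-3, 3200, 25600),
}


def wsd_lr(step: int, peak_lr: float, warmup_steps: int,
           stable_steps: int, decay_steps: int) -> float:
    """Return the learning rate at a given optimizer step."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to peak_lr.
        return peak_lr * step / warmup_steps
    if step < warmup_steps + stable_steps:
        # Stable: hold the peak learning rate.
        return peak_lr
    # Decay: ramp linearly from peak_lr down to 0, then stay at 0.
    progress = (step - warmup_steps - stable_steps) / decay_steps
    return peak_lr * max(0.0, 1.0 - progress)


if __name__ == "__main__":
    peak, warmup, stable = SCHEDULES["Doge-20M"]
    decay = 800  # hypothetical value, chosen only for illustration
    for step in (0, 400, 800, 7199, 7600, 8000):
        print(step, wsd_lr(step, peak, warmup, stable, decay))
```

Under these assumptions, a run resumed from a checkpoint saved during the stable phase would continue at the peak rate listed in the table, which is consistent with the README describing these values as the initial learning rates for continued training.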