---
datasets:
- EleutherAI/pile
language:
- en
---
A Based model variant that uses LayerNorm in place of the `QK.sum(-1)` normalization, for better hardware efficiency.
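
The difference between the two normalizations can be sketched in plain Python. This is a minimal illustration under stated assumptions, not this model's implementation: the `elu(x) + 1` feature map, the function names, and the parameter-free LayerNorm (no learned scale or bias) are all hypothetical choices made for the example. It shows causal linear attention where the output is either divided by the summed query-key weights (the `QK.sum(-1)` term) or passed through LayerNorm instead.

```python
import math

def feature_map(x):
    # Hypothetical positive feature map, elu(x) + 1; the real model's map may differ.
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def layer_norm(x, eps=1e-5):
    # Parameter-free LayerNorm over one vector (assumption: no learned scale/bias).
    mean = sum(x) / len(x)
    var = sum((xi - mean) ** 2 for xi in x) / len(x)
    return [(xi - mean) / math.sqrt(var + eps) for xi in x]

def causal_linear_attention(q, k, v, norm="sum"):
    """q, k: lists of T vectors; v: list of T value vectors.

    norm="sum" divides by the summed query-key weights;
    any other value applies LayerNorm to the unnormalized output.
    """
    outputs = []
    for i in range(len(q)):
        qi = feature_map(q[i])
        # Unnormalized attention weights over positions j <= i.
        weights = [dot(qi, feature_map(k[j])) for j in range(i + 1)]
        # Weighted sum of values (the numerator).
        num = [sum(w * v[j][c] for j, w in enumerate(weights))
               for c in range(len(v[0]))]
        if norm == "sum":
            denom = sum(weights)  # the QK.sum(-1) term
            outputs.append([x / denom for x in num])
        else:
            outputs.append(layer_norm(num))
    return outputs

q = [[0.1, -0.2], [0.3, 0.4]]
k = [[0.5, 0.1], [-0.3, 0.2]]
v = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
out_sum = causal_linear_attention(q, k, v, norm="sum")
out_ln = causal_linear_attention(q, k, v, norm="layernorm")
```

In the sum-normalized path the denominator has to be computed alongside the numerator at every position, whereas the LayerNorm path only needs the numerator and then normalizes each output vector independently.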