---
license: llama2
---

These are exl2 quants of [Goliath-longLORA-120b-rope8-32k-fp16](https://huggingface.co/grimulkan/Goliath-longLORA-120b-rope8-32k-fp16).
Main branch includes `measurement.json` files for the new default dataset and the one used by [goliath-120b-exl2-rpcal](https://huggingface.co/Panchovix/goliath-120b-exl2-rpcal).
# Available versions
- [2.65bpw](https://huggingface.co/aikitoria/Goliath-longLORA-120b-rope8-32k-exl2/tree/2.65bpw) using the default dataset
- [4.35bpw](https://huggingface.co/aikitoria/Goliath-longLORA-120b-rope8-32k-exl2/tree/4.35bpw) using the default dataset
- [4.35bpw-rpcal](https://huggingface.co/aikitoria/Goliath-longLORA-120b-rope8-32k-exl2/tree/4.35bpw-rpcal) using the PIPPA dataset
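Each version lives on its own branch of the repo. A minimal way to fetch just one of them (a sketch, assuming `git` and `git-lfs` are installed; swap in whichever branch name you want from the list above):

```shell
# Clone only the 2.65bpw branch; --single-branch avoids pulling the
# other quants' history. Note the weights are large (tens of GiB).
git clone --branch 2.65bpw --single-branch \
  https://huggingface.co/aikitoria/Goliath-longLORA-120b-rope8-32k-exl2
```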
# Memory usage tests
### 2.65bpw
- context 16k, cache 16: 46.9 GiB (fits in 2x 3090)
- context 32k, cache 8: 47 GiB (fits in 2x 3090)
### 4.35bpw
- context 16k, cache 16: 70.1 GiB (fits in 3x 3090)
- context 32k, cache 8: 70.3 GiB (fits in 3x 3090)
- context 32k, cache 16: 78.7 GiB (fits in an A100 80GB)
# Super epic scientific test results
The 2.65bpw version suffered greatly: it's not completely broken, but it's not good either.
The 4.35bpw version is worse than normal Goliath, but better than Goliath with rope scaling applied for long context.
The version using the PIPPA dataset produces worse results than the one using the default dataset at any context length.