Update README.md
README.md CHANGED
@@ -8,9 +8,6 @@ https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct
 Mostly an experiment trying to completely uncensor the model; it doesn't seem to be nearly as good as the original in reasoning and knowledge. It is, however, pretty good for RP.
 
 
-Will soon have quants uploaded here on HF and have it up on https://awanllm.com for anyone to try.
-
-
 Training:
 - 4096 sequence length, while the base model is 8192 sequence length. From testing, it still performs just fine at the full 8192 context.
 - Training duration was around 3 days on an RTX 4090, using 4-bit loading and QLoRA (rank 64, alpha 128), resulting in ~2% trainable weights.
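The training notes above describe a QLoRA fine-tune: the base model loaded in 4-bit, LoRA rank 64 with alpha 128, and roughly 2% trainable weights. Below is a minimal sketch of how such a setup might look with the usual transformers + peft + bitsandbytes stack; the target modules, dropout, and everything beyond the stated rank/alpha/4-bit settings are assumptions for illustration, not the author's actual training script.

```python
# Hypothetical QLoRA setup matching the description above
# (4-bit loading, LoRA rank 64, alpha 128). Not the author's script.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"

# Load the base model in 4-bit (NF4) so it fits on a single 24 GB RTX 4090.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# LoRA adapters: rank 64 and alpha 128 as stated in the README.
# Target modules and dropout are assumptions (projecting onto all
# attention and MLP projections is a common choice).
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# The README reports roughly ~2% trainable weights with this configuration.
model.print_trainable_parameters()

# The 4096-token sequence length would be applied when tokenizing/packing
# the training data, which is omitted here.
```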