SandLogicTechnologies committed (verified) · Commit 99941f6 · 1 parent: 72cec4e

Update README.md

Files changed (1): README.md (+113 -2)
---
language:
- en
pipeline_tag: text-generation
tags:
- meta
- Llama3
- pytorch
---
# SandLogic Technology - Quantized Meta-Llama3-8b-Instruct Models

## Model Description

We have quantized the Meta-Llama3-8b-Instruct model into three variants:

1. Q5_KM
2. Q4_KM
3. IQ4_XS

These quantized models offer improved efficiency while maintaining performance.

## Original Model Information

- **Name**: Meta-Llama3-8b-Instruct
- **Developer**: Meta
- **Release Date**: April 18, 2024
- **Model Type**: Auto-regressive language model
- **Architecture**: Optimized transformer with Grouped-Query Attention (GQA)
- **Parameters**: 8 billion
- **Context Length**: 8k tokens
- **Training Data**: New mix of publicly available online data (15T+ tokens)
- **Knowledge Cutoff**: March 2023

## Model Capabilities

Llama 3 is designed for multiple use cases, including:

- Responding to questions in natural language
- Writing code
- Brainstorming ideas
- Content creation
- Summarization

The model understands context and responds in a human-like manner, making it useful for a wide range of applications.

## Use Cases

1. **Chatbots**: Enhance customer service automation
2. **Content Creation**: Generate articles, reports, blogs, and stories
3. **Email Communication**: Draft emails and maintain a consistent brand tone
4. **Data Analysis Reports**: Summarize findings and create business performance reports
5. **Code Generation**: Produce code snippets, identify bugs, and provide programming recommendations

## Model Variants

We offer three quantized versions of the Meta-Llama3-8b-Instruct model:

1. **Q5_KM**: 5-bit quantization using the KM method
2. **Q4_KM**: 4-bit quantization using the KM method
3. **IQ4_XS**: 4-bit quantization using the IQ4_XS method

These quantized models aim to reduce model size and improve inference speed while keeping performance as close to the original model as possible.

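All three variants are block-wise quantization schemes: weights are grouped into small blocks, and each block stores low-bit integers plus per-block scale data. The sketch below illustrates the general idea only; it is a toy model, not the actual GGML k-quant implementation, and the function names are our own:

```python
def quantize_block(weights, bits=4):
    """Toy block-wise quantization: one float scale per block,
    one low-bit signed integer per weight."""
    qmax = 2 ** (bits - 1) - 1  # e.g. 7 for 4-bit signed values
    largest = max(abs(w) for w in weights)
    scale = largest / qmax if largest else 1.0
    quants = [round(w / scale) for w in weights]
    return scale, quants

def dequantize_block(scale, quants):
    """Recover approximate weights from the stored scale and integers."""
    return [scale * q for q in quants]

block = [0.12, -0.53, 0.98, -0.07]
scale, quants = quantize_block(block, bits=4)
approx = dequantize_block(scale, quants)
# Each reconstructed weight lies within half a quantization
# step (scale / 2) of the original value.
```

In this simplified picture, a 5-bit variant like Q5_KM keeps more levels per weight than the 4-bit variants, which is why Q4_KM and IQ4_XS files are smaller but slightly less accurate.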
## Usage

```bash
pip install llama-cpp-python
```

Please refer to the llama-cpp-python [documentation](https://llama-cpp-python.readthedocs.io/en/latest/) to install with GPU support.

### Basic Text Completion

Here's an example demonstrating how to use the high-level API for basic text completion:

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./models/7B/llama-model.gguf",
    verbose=False,
    # n_gpu_layers=-1,  # Uncomment to use GPU acceleration
    # n_ctx=2048,       # Uncomment to increase the context window
)

output = llm(
    "Q: Name the planets in the solar system? A: ",  # Prompt
    max_tokens=32,      # Generate up to 32 tokens
    stop=["Q:", "\n"],  # Stop generating just before a new question
    echo=False,         # Don't echo the prompt in the output
)

print(output["choices"][0]["text"])
```
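Because this is the Instruct variant, the model was trained with a specific chat template; raw prompts like the one above work, but results are usually better when the prompt follows Meta's published Llama 3 chat format. A minimal helper (the `format_llama3_prompt` name is our own, illustrative choice) might look like:

```python
def format_llama3_prompt(system: str, user: str) -> str:
    """Wrap a system and a user message in the Llama 3 Instruct
    chat template, ending with an open assistant turn."""
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = format_llama3_prompt(
    "You are a concise assistant.",
    "Name the planets in the solar system.",
)
```

The resulting string can be passed to `llm(...)` together with `stop=["<|eot_id|>"]`. Alternatively, `llm.create_chat_completion(messages=[...])` applies the model's own template automatically.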
## Download

You can download `Llama` models in `gguf` format directly from Hugging Face using the `from_pretrained` method. This feature requires the `huggingface-hub` package.

To install it, run: `pip install huggingface-hub`

```python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="SandLogicTechnologies/Meta-Llama-3-8B-Instruct-GGUF",
    filename="*Meta-Llama-3-8B-Instruct.Q5_K_M.gguf",
    verbose=False
)
```

By default, `from_pretrained` will download the model to the Hugging Face cache directory. You can manage installed model files using the `huggingface-cli` tool.
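For example, the CLI that ships with `huggingface-hub` can list what is in the cache and free up disk space (note that a full GGUF download runs to several gigabytes):

```shell
# List cached repos with their size, revisions, and last-used time
huggingface-cli scan-cache

# Interactively select cached revisions to delete
huggingface-cli delete-cache
```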

## License

A custom commercial license is available at: https://llama.meta.com/llama3/license

## Acknowledgements

We thank Meta for developing and releasing the original Llama 3 model.

## Contact

For any inquiries or support, please contact us at **[email protected]** or visit our [support page](https://www.sandlogic.com/LingoForge/support).