---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
---

# Model Card for Xylaria-1.4-smol

## Model Details

### Model Description

**Xylaria-1.4-smol** is a highly compact Recurrent Neural Network (RNN) that takes up just **1 MB of storage** and has roughly **2 million parameters**. It is designed for efficiency and optimized for resource-constrained environments.

- **Developed by:** Sk Md Saad Amin
- **Model type:** Recurrent Neural Network (RNN)
- **Parameters:** Approximately 2 million
- **Storage size:** 1 MB
- **Language(s):** English
- **License:** Apache-2.0
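
As a rough cross-check of these figures, the parameter count of an embedding, stacked LSTM, and linear output layer can be worked out by hand. The sketch below assumes the configuration used in the example code later in this card (`vocab_size=108`, `embedding_dim=50`, `hidden_dim=128`, `num_layers=2`) and the PyTorch `nn.LSTM` layout (four gates and two bias vectors per layer). The helper name `count_params` is hypothetical.

```python
def count_params(vocab_size, embedding_dim, hidden_dim, num_layers):
    """Count parameters of embedding -> stacked LSTM -> linear (PyTorch layout)."""
    embedding = vocab_size * embedding_dim
    lstm = 0
    for layer in range(num_layers):
        # The first layer reads embeddings; later layers read hidden states.
        input_size = embedding_dim if layer == 0 else hidden_dim
        # 4 gates, each with input weights, recurrent weights, and two biases.
        lstm += 4 * (input_size * hidden_dim + hidden_dim * hidden_dim + 2 * hidden_dim)
    linear = hidden_dim * vocab_size + vocab_size
    return embedding + lstm + linear


total = count_params(vocab_size=108, embedding_dim=50, hidden_dim=128, num_layers=2)
print(f"Parameters: {total}")
print(f"float32 size: {total * 4 / 1024 / 1024:.2f} MB")
```

With that example configuration the count comes to 243,588 parameters, or about 0.93 MB in float32, which is in line with the 1 MB storage figure. It is well below the approximate 2 million listed above, so the released checkpoint may use a different configuration than the example.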

### Direct Use

Xylaria-1.4-smol is ideal for:

- Research
- Education
- Hobby projects

### Downstream Use

The model can be fine-tuned for various tasks such as:

- Lightweight text generation
- Simple sequence prediction
- Embedded system applications
- Educational demonstrations of efficient neural network design
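
For sequence prediction, fine-tuning data is typically prepared as input/target pairs in which the target is the input shifted by one character. The sketch below illustrates that idea in plain Python. The function name `make_next_char_pairs` and the window length are assumptions for illustration and are not part of the released code.

```python
def make_next_char_pairs(text, seq_len):
    """Slice text into (input, target) windows; the target is the input shifted by one."""
    pairs = []
    for start in range(0, len(text) - seq_len):
        inputs = text[start:start + seq_len]
        targets = text[start + 1:start + seq_len + 1]
        pairs.append((inputs, targets))
    return pairs


pairs = make_next_char_pairs("hello world", seq_len=4)
print(pairs[0])    # first window and its shifted target
print(len(pairs))  # number of training windows
```

Each character in the pair would then be mapped to its vocabulary index before being fed to the model.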

### Out-of-Scope Use

- High-complexity natural language processing tasks
- Applications requiring extensive computational resources
- Tasks demanding state-of-the-art accuracy in complex domains
- Heavy workloads in general, since the model is made for educational and research purposes only

## Bias, Risks, and Limitations

- Limited capacity due to the compact design
- Potential performance trade-offs as task complexity grows
- May not perform as well as larger models on nuanced tasks
- Extremely small vocabulary of 108 character-level tokens
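
Because the vocabulary is so small, any character outside it has to be mapped to an unknown token. Below is a minimal sketch of that encoding step, assuming a `char_to_idx` mapping with `<PAD>` and `<UNK>` entries like the one in the example code in this card. The tiny mapping and the `encode` helper are illustrative only.

```python
# Tiny illustrative vocabulary; the mapping in the example code has 108 entries.
char_to_idx = {"<PAD>": 0, "a": 1, "b": 2, "c": 3, "<UNK>": 4}


def encode(text, char_to_idx, max_len):
    """Map characters to indices, truncate to max_len, then right-pad with <PAD>."""
    unk = char_to_idx["<UNK>"]
    pad = char_to_idx["<PAD>"]
    ids = [char_to_idx.get(ch, unk) for ch in text[:max_len]]
    return ids + [pad] * (max_len - len(ids))


print(encode("abz", char_to_idx, max_len=5))  # 'z' is out of vocabulary
```

Text with many out-of-vocabulary characters collapses into runs of `<UNK>`, which is worth checking before relying on the model for a given input source.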

### Recommendations

- Carefully evaluate performance for your specific use case
- Take the model's limitations into account in critical applications
- Consider transfer learning or fine-tuning to adapt the model to a task
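
One common way to evaluate a character-level model on a specific use case is the average negative log-likelihood per character and the corresponding perplexity. The sketch below assumes you already have the probability the model assigned to each correct next character. The function name and the sample numbers are made up for illustration.

```python
import math


def perplexity(true_char_probs):
    """Perplexity = exp(mean negative log-probability of the correct characters)."""
    nll = -sum(math.log(p) for p in true_char_probs) / len(true_char_probs)
    return math.exp(nll)


# Hypothetical probabilities assigned to the correct next character.
probs = [0.5, 0.25, 0.5, 0.25]
print(f"Perplexity: {perplexity(probs):.3f}")
```

A model that guesses uniformly over the 108-character vocabulary would score a perplexity of 108, so lower values indicate that the model has learned something about the text.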

### Model Architecture and Objective

- **Architecture:** Compact Recurrent Neural Network (the example code in this card uses an LSTM)
- **Objective:** Efficient sequence processing
- **Key Features:**
  - Minimal parameter count
  - Reduced storage footprint
  - Low computational requirements

#### Hardware

- Suitable for:
  - Microcontrollers
  - Mobile devices
  - Edge computing platforms

#### Software

- Compatible with:
  - TensorFlow Lite
  - PyTorch Mobile

## Citation

If you find my work helpful, please consider citing it.

**BibTeX:**

```bibtex
@misc{xylaria2024smol,
  title={Xylaria-1.4-smol: A Compact Efficient RNN},
  author={Sk Md Saad Amin},
  year={2024}
}
```

## You can include the Xylaria code like this

```python
import torch
import torch.nn as nn


class XylariaSmolRNN(nn.Module):
    """Character-level model: embedding -> stacked LSTM -> linear output layer."""

    def __init__(self, config):
        super().__init__()

        self.vocab_size = config['vocab_size']
        self.embedding_dim = config['embedding_dim']
        self.hidden_dim = config['hidden_dim']
        self.num_layers = config['num_layers']
        self.char_to_idx = config['char_to_idx']

        # Character embeddings; the <PAD> row stays at zero.
        self.embedding = nn.Embedding(
            num_embeddings=self.vocab_size,
            embedding_dim=self.embedding_dim,
            padding_idx=self.char_to_idx['<PAD>']
        )

        self.rnn = nn.LSTM(
            input_size=self.embedding_dim,
            hidden_size=self.hidden_dim,
            num_layers=self.num_layers,
            batch_first=True
        )

        # Projects each hidden state to vocabulary logits.
        self.fc = nn.Linear(self.hidden_dim, self.vocab_size)
        self.dropout = nn.Dropout(0.3)

    def forward(self, x):
        embedded = self.embedding(x)                  # (batch, seq, embedding_dim)
        rnn_out, (hidden, cell) = self.rnn(embedded)  # (batch, seq, hidden_dim)
        rnn_out = self.dropout(rnn_out)
        output = self.fc(rnn_out)                     # (batch, seq, vocab_size)
        return output, (hidden, cell)


def generate_text(model, start_char, max_length=100):
    """Sample up to max_length characters, one at a time, starting from start_char."""
    model.eval()
    # Build the reverse mapping once instead of on every step.
    idx_to_char = {idx: char for char, idx in model.char_to_idx.items()}
    unk_idx = model.char_to_idx['<UNK>']
    current_char = torch.tensor([[model.char_to_idx.get(start_char, unk_idx)]])

    state = None  # nn.LSTM starts from a zero state when given None
    generated_text = [start_char]

    with torch.no_grad():
        for _ in range(max_length - 1):
            embedded = model.embedding(current_char)
            rnn_out, state = model.rnn(embedded, state)
            output = model.fc(rnn_out)

            # Sample the next character from the predicted distribution.
            probabilities = torch.softmax(output[0, -1], dim=0)
            next_char_idx = torch.multinomial(probabilities, 1).item()
            next_char = idx_to_char.get(next_char_idx, '<UNK>')

            # Stop on special tokens instead of emitting them.
            if next_char in ('<UNK>', '<PAD>'):
                break

            generated_text.append(next_char)
            current_char = torch.tensor([[next_char_idx]])

    return ''.join(generated_text)


def demonstrate_xylaria_model():
    model_config = {
        "vocab_size": 108,
        "embedding_dim": 50,
        "hidden_dim": 128,
        "num_layers": 2,
        "char_to_idx": {
            "<PAD>": 0, " ": 1, "!": 2, "\"": 3, "#": 4, "$": 5, "%": 6, "&": 7, "'": 8, "(": 9,
            ")": 10, "*": 11, "+": 12, ",": 13, "-": 14, ".": 15, "/": 16, "0": 17, "1": 18,
            "2": 19, "3": 20, "4": 21, "5": 22, "6": 23, "7": 24, "8": 25, "9": 26, ":": 27,
            ";": 28, "<": 29, "=": 30, ">": 31, "?": 32, "A": 33, "B": 34, "C": 35, "D": 36,
            "E": 37, "F": 38, "G": 39, "H": 40, "I": 41, "J": 42, "K": 43, "L": 44, "M": 45,
            "N": 46, "O": 47, "P": 48, "Q": 49, "R": 50, "S": 51, "T": 52, "U": 53, "V": 54,
            "W": 55, "X": 56, "Y": 57, "Z": 58, "[": 59, "\\": 60, "]": 61, "^": 62, "_": 63,
            "a": 64, "b": 65, "c": 66, "d": 67, "e": 68, "f": 69, "g": 70, "h": 71, "i": 72,
            "j": 73, "k": 74, "l": 75, "m": 76, "n": 77, "o": 78, "p": 79, "q": 80, "r": 81,
            "s": 82, "t": 83, "u": 84, "v": 85, "w": 86, "x": 87, "y": 88, "z": 89, "{": 90,
            "}": 91, "°": 92, "²": 93, "à": 94, "á": 95, "æ": 96, "é": 97, "í": 98, "ó": 99,
            "ö": 100, "–": 101, "‘": 102, "’": 103, "“": 104, "”": 105, "…": 106, "<UNK>": 107
        }
    }

    model = XylariaSmolRNN(model_config)
    model.eval()  # disable dropout for the demo

    total_params = sum(p.numel() for p in model.parameters())
    trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

    print(f"Total Parameters: {total_params}")
    print(f"Trainable Parameters: {trainable_params}")
    # Assumes 4 bytes (float32) per parameter.
    print(f"Model Size Estimate: {total_params * 4 / 1024 / 1024:.2f} MB")

    # Run a forward pass on a random batch of character indices.
    batch_size = 1
    sequence_length = 20
    x = torch.randint(0, model_config['vocab_size'], (batch_size, sequence_length))

    with torch.no_grad():
        output, (hidden, cell) = model(x)
    print("Model Output Shape:", output.shape)
    print("Hidden State Shape:", hidden.shape)
    print("Cell State Shape:", cell.shape)

    # Try to export the model with TorchScript for deployment.
    try:
        scripted_model = torch.jit.script(model)
        scripted_model.save("xylaria_smol_model.pt")
        print("Model exported for deployment")
    except Exception as e:
        print(f"Export failed: {e}")

    print("\nText Generation Example:")
    generated = generate_text(model, 'A')
    print(generated)


if __name__ == "__main__":
    demonstrate_xylaria_model()
```

PS: The code may be a bit off, so adjust it as needed.

## More Information

Xylaria-1.4-smol is a step towards ultra-efficient neural network design, showing what a recurrent model of about 1 MB can do with minimal computational resources.