Model Card for Xylaria-1.4-smol
Model Details
Model Description
Xylaria-1.4-smol is a highly compact Recurrent Neural Network (RNN) with approximately 2 million parameters and a storage footprint of about 1 MB. Designed for efficiency, it targets resource-constrained environments where larger models are impractical.
- Developed by: Sk Md Saad Amin
- Model type: Recurrent Neural Network (RNN)
- Parameters: 2 million (approx)
- Storage Size: 1 MB
- Language(s): English
- License: Apache-2.0
Direct Use
Xylaria-1.4-smol is ideal for:
- Research
- Education
- Hobby projects
Downstream Use
The model can be fine-tuned for various tasks (a minimal fine-tuning sketch follows this list), such as:
- Lightweight text generation
- Simple sequence prediction
- Embedded system applications
- Educational demonstrations of efficient neural network design
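As a rough illustration of downstream fine-tuning, the sketch below trains the `XylariaSmolRNN` class (defined in the code section later in this card) on next-character prediction. The corpus, hyperparameters, and helper names here are illustrative assumptions, not part of the released training setup.

```python
import torch
import torch.nn as nn

# Hypothetical fine-tuning sketch: next-character prediction on a tiny corpus.
# Assumes XylariaSmolRNN and model_config (with its char_to_idx vocabulary)
# are available as shown in the code section of this card.

def encode(text, char_to_idx):
    # Map each character to its index, falling back to <UNK>.
    return torch.tensor([[char_to_idx.get(c, char_to_idx['<UNK>']) for c in text]])

def fine_tune(model, corpus, char_to_idx, epochs=10, lr=1e-3):
    criterion = nn.CrossEntropyLoss(ignore_index=char_to_idx['<PAD>'])
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for epoch in range(epochs):
        total_loss = 0.0
        for line in corpus:
            ids = encode(line, char_to_idx)            # (1, seq_len)
            inputs, targets = ids[:, :-1], ids[:, 1:]  # shift by one character
            logits, _ = model(inputs)                  # (1, seq_len - 1, vocab_size)
            loss = criterion(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            total_loss += loss.item()
        print(f"epoch {epoch + 1}: loss {total_loss / len(corpus):.4f}")

# Example usage (illustrative corpus):
# model = XylariaSmolRNN(model_config)
# fine_tune(model, ["hello world", "tiny models are fun"], model_config["char_to_idx"])
```

Because the vocabulary is character-level, no tokenizer is needed; text is encoded directly through `char_to_idx`.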
Out-of-Scope Use
- High-complexity natural language processing tasks
- Applications requiring extensive computational resources
- Tasks demanding state-of-the-art accuracy in complex domains
- Computationally heavy tasks; the model is intended for educational and research purposes
Bias, Risks, and Limitations
- Limited capacity due to compact design
- Performance trade-offs on complex tasks as a consequence of the compact design
- May not perform as well as larger models in nuanced tasks
- Extremely small, character-level vocabulary (108 tokens)
Recommendations
- Carefully evaluate performance for specific use cases
- Consider model limitations in critical applications
- Consider transfer learning or fine-tuning to adapt the model to specific tasks
Model Architecture and Objective
- Architecture: Compact Recurrent Neural Network
- Objective: Efficient sequence processing
- Key Features:
- Minimal parameter count
- Reduced storage footprint
- Low computational requirements
Hardware
- Suitable for:
- Microcontrollers
- Mobile devices
- Edge computing platforms
Software
- Compatible with:
- TensorFlow Lite
- PyTorch Mobile (a minimal export sketch is shown below)
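As a hedged sketch of the PyTorch Mobile path (the function name and output file name below are illustrative), the model can be scripted, optimized, and saved for the Lite interpreter. A TensorFlow Lite deployment would additionally require a conversion step (for example via ONNX) that is not shown here.

```python
import torch
from torch.utils.mobile_optimizer import optimize_for_mobile

def export_for_mobile(model: torch.nn.Module, path: str = "xylaria_smol_mobile.ptl") -> None:
    """Script, optimize, and save a model for the PyTorch Lite interpreter."""
    model.eval()
    scripted = torch.jit.script(model)            # TorchScript graph of the model
    mobile_model = optimize_for_mobile(scripted)  # mobile-oriented graph passes
    mobile_model._save_for_lite_interpreter(path)

# Example (assumes XylariaSmolRNN and model_config from the code section below):
# export_for_mobile(XylariaSmolRNN(model_config))
```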
Citation (if you find this work helpful, please consider citing it)
BibTeX:
```bibtex
@misc{xylaria2024smol,
  title={Xylaria-1.4-smol: A Compact Efficient RNN},
  author={Sk Md Saad Amin},
  year={2024}
}
```
The Xylaria model architecture can be defined and demonstrated as follows:
```python
import torch
import torch.nn as nn


class XylariaSmolRNN(nn.Module):
    """Compact character-level LSTM language model."""

    def __init__(self, config):
        super(XylariaSmolRNN, self).__init__()
        self.vocab_size = config['vocab_size']
        self.embedding_dim = config['embedding_dim']
        self.hidden_dim = config['hidden_dim']
        self.num_layers = config['num_layers']
        self.char_to_idx = config['char_to_idx']

        # Character embedding; the <PAD> index is excluded from gradient updates.
        self.embedding = nn.Embedding(
            num_embeddings=self.vocab_size,
            embedding_dim=self.embedding_dim,
            padding_idx=self.char_to_idx['<PAD>']
        )
        # Stacked LSTM over the embedded character sequence.
        self.rnn = nn.LSTM(
            input_size=self.embedding_dim,
            hidden_size=self.hidden_dim,
            num_layers=self.num_layers,
            batch_first=True
        )
        # Projection back to vocabulary logits.
        self.fc = nn.Linear(self.hidden_dim, self.vocab_size)
        self.dropout = nn.Dropout(0.3)

    def forward(self, x):
        embedded = self.embedding(x)
        rnn_out, (hidden, cell) = self.rnn(embedded)
        rnn_out = self.dropout(rnn_out)
        output = self.fc(rnn_out)
        return output, (hidden, cell)


def demonstrate_xylaria_model():
    model_config = {
        "vocab_size": 108,
        "embedding_dim": 50,
        "hidden_dim": 128,
        "num_layers": 2,
        # Character-level vocabulary of 108 symbols (indices 0-107).
        "char_to_idx": {
            " ": 1, "!": 2, "\"": 3, "#": 4, "$": 5, "%": 6, "&": 7, "'": 8,
            "(": 9, ")": 10, "*": 11, "+": 12, ",": 13, "-": 14, ".": 15, "/": 16,
            "0": 17, "1": 18, "2": 19, "3": 20, "4": 21, "5": 22, "6": 23, "7": 24,
            "8": 25, "9": 26, ":": 27, ";": 28, "<": 29, "=": 30, ">": 31, "?": 32,
            "A": 33, "B": 34, "C": 35, "D": 36, "E": 37, "F": 38, "G": 39, "H": 40,
            "I": 41, "J": 42, "K": 43, "L": 44, "M": 45, "N": 46, "O": 47, "P": 48,
            "Q": 49, "R": 50, "S": 51, "T": 52, "U": 53, "V": 54, "W": 55, "X": 56,
            "Y": 57, "Z": 58, "[": 59, "\\": 60, "]": 61, "^": 62, "_": 63,
            "a": 64, "b": 65, "c": 66, "d": 67, "e": 68, "f": 69, "g": 70, "h": 71,
            "i": 72, "j": 73, "k": 74, "l": 75, "m": 76, "n": 77, "o": 78, "p": 79,
            "q": 80, "r": 81, "s": 82, "t": 83, "u": 84, "v": 85, "w": 86, "x": 87,
            "y": 88, "z": 89, "{": 90, "}": 91, "°": 92, "²": 93, "à": 94, "á": 95,
            "æ": 96, "é": 97, "í": 98, "ó": 99, "ö": 100, "–": 101,
            # Curly quotes, kept as unicode escapes to avoid encoding issues.
            "\u2018": 102, "\u2019": 103, "\u201c": 104, "\u201d": 105, "…": 106,
            "<PAD>": 0, "<UNK>": 107
        }
    }

    model = XylariaSmolRNN(model_config)

    # Report parameter count and a float32 storage estimate.
    total_params = sum(p.numel() for p in model.parameters())
    trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    print(f"Total Parameters: {total_params}")
    print(f"Trainable Parameters: {trainable_params}")
    print(f"Model Size Estimate: {total_params * 4 / 1024 / 1024:.2f} MB")

    # Run a forward pass on a random batch to check the output shapes.
    batch_size = 1
    sequence_length = 20
    x = torch.randint(0, model_config['vocab_size'], (batch_size, sequence_length))
    with torch.no_grad():
        output, (hidden, cell) = model(x)
    print("Model Output Shape:", output.shape)
    print("Hidden State Shape:", hidden.shape)
    print("Cell State Shape:", cell.shape)

    # Export to TorchScript for deployment; fall back gracefully if scripting fails.
    try:
        scripted_model = torch.jit.script(model)
        scripted_model.save("xylaria_smol_model.pt")
        print("Model exported for deployment")
    except Exception as e:
        print(f"Export failed: {e}")

    def generate_text(model, start_char, max_length=100):
        """Sample characters one at a time, feeding each prediction back in."""
        current_char = torch.tensor([[model.char_to_idx.get(start_char, model.char_to_idx['<UNK>'])]])
        hidden, cell = None, None
        generated_text = [start_char]
        idx_to_char = {idx: char for char, idx in model.char_to_idx.items()}
        for _ in range(max_length - 1):
            with torch.no_grad():
                embedded = model.embedding(current_char)
                if hidden is None:
                    rnn_out, (hidden, cell) = model.rnn(embedded)
                else:
                    rnn_out, (hidden, cell) = model.rnn(embedded, (hidden, cell))
                output = model.fc(rnn_out)
                probabilities = torch.softmax(output[0, -1], dim=0)
                next_char_idx = torch.multinomial(probabilities, 1).item()
            next_char = idx_to_char.get(next_char_idx, '<UNK>')
            # Stop if the model samples a special token.
            if next_char in ('<UNK>', '<PAD>'):
                break
            generated_text.append(next_char)
            current_char = torch.tensor([[next_char_idx]])
        return ''.join(generated_text)

    print("\nText Generation Example:")
    generated = generate_text(model, 'A')
    print(generated)


if __name__ == "__main__":
    demonstrate_xylaria_model()
```
PS: The code may need minor adjustments for your setup.
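After running the demonstration script, the exported TorchScript file can be loaded back for inference. A minimal sketch, assuming the file name used above:

```python
import torch

# Load the TorchScript export produced by demonstrate_xylaria_model().
model = torch.jit.load("xylaria_smol_model.pt")
model.eval()

# Dummy batch of 20 character indices in [0, 108).
x = torch.randint(0, 108, (1, 20))
with torch.no_grad():
    logits, (hidden, cell) = model(x)
print(logits.shape)  # expected: torch.Size([1, 20, 108])
```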
More Information
Xylaria-1.4-smol is a step towards ultra-efficient neural network design, demonstrating that useful sequence models can be built and run with minimal computational resources.