---
library_name: transformers
license: apache-2.0
language:
- en
- ja
base_model: Qwen/QwQ-32B-Preview
---
# KARAKURI LM 32B Thinking 2501 Experimental
## Model Details
### Model Description
- **Developed by:** [KARAKURI Inc.](https://about.karakuri.ai/)
- **Model type:** Causal language model
- **Languages:** Japanese
- **License:** Apache 2.0
- **Finetuned from model:** [Qwen/QwQ-32B-Preview](https://huggingface.co/Qwen/QwQ-32B-Preview)
- **Contact**: For questions and comments about the model, please email `[email protected]`
- **Demo**: https://lm.karakuri.cc/
## Usage
### Run the model
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "karakuri-ai/karakuri-lm-32b-thinking-2501-exp"

# Load the weights in their stored dtype and place them on the available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

messages = [
    {"role": "user", "content": "こんにちは。"}
]

# Render the conversation with the model's chat template and tokenize it.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=512)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:]))
```
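The last line of the snippet above decodes only part of `outputs[0]`: for a causal language model, `generate` returns the prompt tokens followed by the continuation, so the code slices from `input_ids.shape[-1]` onward. The following is a minimal, dependency-free sketch of that slicing, using plain Python lists in place of tensors. The helper name `strip_prompt` and the token IDs are hypothetical and chosen only for illustration.

```python
def strip_prompt(output_ids, prompt_len):
    """Return only the tokens that follow the first `prompt_len` prompt tokens."""
    return output_ids[prompt_len:]


# Hypothetical token IDs: 3 prompt tokens followed by 2 generated tokens.
prompt_ids = [101, 102, 103]
output_ids = prompt_ids + [7, 8]

new_tokens = strip_prompt(output_ids, len(prompt_ids))
print(new_tokens)  # [7, 8]
```

Without the slice, the decoded string would repeat the rendered prompt before the model's reply.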
## Training Details
### Training Infrastructure
- **Hardware**: The model was trained on 16 nodes of Amazon EC2 trn1.32xlarge instances.
- **Software**: We use code based on [neuronx-nemo-megatron](https://github.com/aws-neuron/neuronx-nemo-megatron).
## Acknowledgments
This work was supported by the Ministry of Economy, Trade and Industry (METI) and the New Energy and Industrial Technology Development Organization (NEDO) through the [Generative AI Accelerator Challenge (GENIAC)](https://www.meti.go.jp/policy/mono_info_service/geniac/index.html).
## Citation
```
@misc{karakuri_lm_32b_thinking_2501_exp,
    author    = { {KARAKURI} {I}nc. },
    title     = { {KARAKURI} {LM} 32{B} {T}hinking 2501 {E}xperimental },
    year      = { 2025 },
    url       = { https://huggingface.co/karakuri-ai/karakuri-lm-32b-thinking-2501-exp },
    publisher = { Hugging Face },
    journal   = { Hugging Face repository }
}
```