Sentence-Transformers Compatible Kanana-Nano-2.1b-Embedding

This repository contains a sentence-transformers compatible version of the Kanana-Nano-2.1b-Embedding model developed by Kakao.

For detailed information about the model architecture, training methodology, and comprehensive performance benchmarks, please refer to the original model repository and the Kanana technical report.

Key Adaptations

This version has been modified to work with the sentence-transformers library. The changes are:

Implemented KananaEmbeddingWrapper module to enable loading via SentenceTransformer
Added L2 normalization within the KananaEmbeddingWrapper's forward method
Fixed max_seq_length at 8192
Embedded the query prompt template in the model, so queries can be encoded with prompt_name="query"
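
To illustrate the L2 normalization step listed above, here is a minimal pure-Python sketch. It is not the wrapper's actual code: the helper names l2_normalize and dot are hypothetical, and the vectors are toy values. It only shows that once vectors have unit L2 norm, a plain dot product equals their cosine similarity.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm (hypothetical helper, not part of the wrapper)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)  # leave an all-zero vector unchanged
    return [x / norm for x in vec]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Toy 2-dimensional "embeddings"; real embeddings have many more dimensions.
u = l2_normalize([3.0, 4.0])
v = l2_normalize([4.0, 3.0])

unit_norm = math.sqrt(dot(u, u))  # length of u after normalization
cosine = dot(u, v)                # dot product of unit vectors = cosine similarity
```

This is why the similarity scores later in this document can be computed with a matrix product alone.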

Usage

Installation

pip install sentence-transformers

Basic Usage

from sentence_transformers import SentenceTransformer

# Load the model
model = SentenceTransformer("datalama/kanana-nano-2.1b-embedding", device="cpu", trust_remote_code=True)

# Encode sentences
sentences = [
    "이 문장은 한국어로 작성되었습니다.",
    "This sentence is written in English."
]

embeddings = model.encode(sentences)

Advanced Usage with Query/Passage Format

Queries can be encoded either with the predefined prompt (prompt_name) or with an explicit prompt string (prompt).

import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("datalama/kanana-nano-2.1b-embedding", device="cpu", trust_remote_code=True)

# For retrieval tasks
instruction = "Given a question, retrieve passages that answer the question"
queries = [
    "are judo throws allowed in wrestling?", 
    "how to become a radiology technician in michigan?",
]


# Encode the queries with the predefined prompt template via prompt_name.
embedding_a = model.encode(queries, prompt_name="query")

# Alternatively, pass the prompt string directly.
prompt_template = """Instruct: {instruction}\nQuery: """
embedding_b = model.encode(queries, prompt=prompt_template.format(instruction=instruction))

# Check that both approaches produce the same embeddings.
np.allclose(embedding_a, embedding_b)
# True
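
The two calls agree because sentence-transformers prepends the prompt string to each input text before tokenization. The sketch below reproduces only that string construction in plain Python, without loading the model. build_inputs is a hypothetical helper, and the template is the one used in the example above.

```python
def build_inputs(texts, prompt=""):
    """Prepend a prompt to every text (hypothetical helper for illustration)."""
    return [prompt + text for text in texts]


instruction = "Given a question, retrieve passages that answer the question"
prompt_template = "Instruct: {instruction}\nQuery: "
queries = ["are judo throws allowed in wrestling?"]

# Queries receive the instruction prefix; passages are passed through unchanged.
query_inputs = build_inputs(queries, prompt_template.format(instruction=instruction))
passage_inputs = build_inputs(
    ["Yes, judo throws are allowed in freestyle and folkstyle wrestling."]
)

print(query_inputs[0])
```

Passages are encoded without a prompt, so only the query side carries the instruction.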

Comparing Embeddings with the Original Code

import torch.nn.functional as F
import numpy as np
from transformers import AutoModel
from sentence_transformers import SentenceTransformer

# For retrieval tasks
instruction = "Given a question, retrieve passages that answer the question"
queries = [
    "are judo throws allowed in wrestling?", 
    "how to become a radiology technician in michigan?",
]

passages = [
    "Since you're reading this, you are probably someone from a judo background or someone who is just wondering how judo techniques can be applied under wrestling rules. So without further ado, let's get to the question. Are Judo throws allowed in wrestling? Yes, judo throws are allowed in freestyle and folkstyle wrestling. You only need to be careful to follow the slam rules when executing judo throws. In wrestling, a slam is lifting and returning an opponent to the mat with unnecessary force.",
    "Below are the basic steps to becoming a radiologic technologist in Michigan:Earn a high school diploma. As with most careers in health care, a high school education is the first step to finding entry-level employment. Taking classes in math and science, such as anatomy, biology, chemistry, physiology, and physics, can help prepare students for their college studies and future careers.Earn an associate degree. Entry-level radiologic positions typically require at least an Associate of Applied Science. Before enrolling in one of these degree programs, students should make sure it has been properly accredited by the Joint Review Committee on Education in Radiologic Technology (JRCERT).Get licensed or certified in the state of Michigan.",
]

# Load the original model (model_a) and this sentence-transformers version (model_b).
model_a = AutoModel.from_pretrained("kakaocorp/kanana-nano-2.1b-embedding", trust_remote_code=True).to("cpu")
model_b = SentenceTransformer("datalama/kanana-nano-2.1b-embedding", device="cpu", trust_remote_code=True)

# Original encoding method; L2 normalization is applied manually below.
max_length = 512
query_embeddings = model_a.encode(queries, instruction=instruction, max_length=max_length)
passage_embeddings = model_a.encode(passages, instruction="", max_length=max_length)

query_embeddings = F.normalize(query_embeddings, p=2, dim=1)
passage_embeddings = F.normalize(passage_embeddings, p=2, dim=1)

scores_a = (query_embeddings @ passage_embeddings.T) * 100

# sentence-transformers compatible encoding method; embeddings are already normalized.
query_embeddings = model_b.encode(queries, prompt_name="query")
passage_embeddings = model_b.encode(passages)

scores_b = (query_embeddings @ passage_embeddings.T) * 100

# Check that both models produce the same similarity scores.
np.allclose(scores_a.cpu().numpy(), scores_b)
# True

Note: Unlike the original model, you don't need to manually perform L2 normalization as this is handled by the KananaEmbeddingWrapper module during the forward pass.
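
Because the embeddings are unit-length, each entry of a score matrix such as scores_b is a cosine similarity scaled by 100, with rows indexed by query and columns by passage. A minimal sketch of turning such a matrix into a per-query ranking follows. The numbers are made-up placeholders, not model outputs, and best_passage_per_query is a hypothetical helper.

```python
def best_passage_per_query(scores):
    """Return, for each query row, the index of the highest-scoring passage."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


# Placeholder scores (2 queries x 2 passages); NOT produced by the model.
example_scores = [
    [80.0, 10.0],
    [5.0, 75.0],
]

best = best_passage_per_query(example_scores)
print(best)
```

With real outputs, a NumPy score array can be converted with .tolist() before being passed to a helper like this.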

License

This model is licensed under CC-BY-NC-4.0.

Citation

If you use this model, please cite the original work:

@misc{kananallmteam2025kananacomputeefficientbilinguallanguage,
      title={Kanana: Compute-efficient Bilingual Language Models}, 
      author={Kanana LLM Team and Yunju Bak and Hojin Lee and Minho Ryu and Jiyeon Ham and Seungjae Jung and Daniel Wontae Nam and Taegyeong Eo and Donghun Lee and Doohae Jung and Boseop Kim and Nayeon Kim and Jaesun Park and Hyunho Kim and Hyunwoong Ko and Changmin Lee and Kyoung-Woon On and Seulye Baeg and Junrae Cho and Sunghee Jung and Jieun Kang and EungGyun Kim and Eunhwa Kim and Byeongil Ko and Daniel Lee and Minchul Lee and Miok Lee and Shinbok Lee and Gaeun Seo},
      year={2025},
      eprint={2502.18934},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.18934}, 
}

Acknowledgements

Original model developed by the Kanana LLM Team at Kakao
Adaptation to sentence-transformers format by datalama