distilbert-base-turkish-cased trained on AllNLI Turkish translated triplets

This is a sentence-transformers model finetuned from dbmdz/distilbert-base-turkish-cased. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: dbmdz/distilbert-base-turkish-cased
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Language: tr
  • License: apache-2.0

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DistilBertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
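
The Pooling module above has only pooling_mode_mean_tokens enabled, so a sentence embedding is the average of the token embeddings. As a rough, library-free sketch (not the library's own implementation, which works on batched tensors), mean pooling with a padding mask could look like the following. The function name mean_pool and the toy 4-dimensional vectors are illustrative assumptions; the real model uses 768 dimensions.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1, ignoring padding."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    count = max(count, 1)
    return [total / count for total in summed]


# Toy input (hypothetical values): three token vectors, the last one is padding.
tokens = [
    [1.0, 2.0, 3.0, 4.0],
    [3.0, 2.0, 1.0, 0.0],
    [9.0, 9.0, 9.0, 9.0],
]
mask = [1, 1, 0]

sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)  # [2.0, 2.0, 2.0, 2.0]
```

The padded token is excluded from both the sum and the count, which is why the 9.0 values do not affect the result.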

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("orhanxakarsu/sentence-distilbert-turkish")
# Run inference
sentences = [
    "İki kadın, Çin'deki bir markette bir ürüne bakıyor.",
    'Alışveriş yapan iki kadın',
    'Kadınlar bir spor salonunda çalışıyorlar.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
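
The similarity function listed for this model is cosine similarity, so the matrix returned by model.similarity holds the pairwise cosine of every pair of embeddings. A minimal sketch of that computation, written without any library and using made-up 2-dimensional vectors in place of real embeddings (the helper names are assumptions for illustration):

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors):
    """Pairwise cosine similarities; entry [i][j] compares vectors i and j."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Hypothetical toy embeddings standing in for model.encode(...) output.
toy_embeddings = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
matrix = similarity_matrix(toy_embeddings)

for row in matrix:
    print([round(value, 4) for value in row])
```

The diagonal is 1 because each vector is compared with itself, and the matrix is symmetric. Vector length does not matter, only direction, which is why [0.0, 2.0] and [1.0, 0.0] score 0.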

Evaluation

Metrics

Triplet

Metric Value
cosine_accuracy 0.9802
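
For reference, triplet cosine accuracy is the fraction of (anchor, positive, negative) triplets in which the anchor is closer, by cosine similarity, to the positive than to the negative. The sketch below illustrates the metric on hypothetical 2-dimensional vectors; it is not the evaluator's actual code, and the function names are assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_cosine_accuracy(triplets):
    """Share of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    correct = sum(
        1
        for anchor, positive, negative in triplets
        if cosine_similarity(anchor, positive) > cosine_similarity(anchor, negative)
    )
    return correct / len(triplets)


# Hypothetical toy triplets: (anchor, positive, negative).
toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),   # ranked correctly
    ([0.0, 1.0], [0.1, 0.9], [1.0, 0.0]),   # ranked correctly
    ([1.0, 1.0], [1.0, -1.0], [1.0, 0.9]),  # ranked incorrectly
    ([1.0, 0.0], [1.0, 0.2], [-1.0, 0.0]),  # ranked correctly
]

accuracy = triplet_cosine_accuracy(toy_triplets)
print(accuracy)  # 0.75
```

Under this reading, the reported value of 0.9802 means that roughly 98% of the evaluation triplets are ranked correctly.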

Training Details

Training Dataset

Unnamed Dataset

  • Size: 814,596 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    column    type    min       mean          max
    anchor    string  3 tokens  18.16 tokens  91 tokens
    positive  string  4 tokens  10.54 tokens  136 tokens
    negative  string  4 tokens  10.73 tokens  29 tokens
  • Samples:
    • anchor: Beyaz gömlekli ve güneş gözlüklü bir kadın, kucağında bir bebekle dışarıda bir sandalyede oturuyor.
      positive: Bebek yerden yukarıda oturuyor
      negative: Adam bir top atıyor
    • anchor: Mavi yakalı gömlek giyen ve kazaklı bir adam ve beyaz gömlek giyen hasır şapka takan bir kadın.
      positive: Yan yana bir erkek ve bir kadın var.
      negative: Evli bir çift akşam yemeği yiyor.
    • anchor: Adam içeride.
      positive: Siyah fötr şapkalı bir adam bir arenada boğaya biniyor.
      negative: Yeşil üniforma giyen beş subayla birlikte taş bir binanın önünde cep telefonuyla konuşan bir papaz; ikisi ayakta, diğerleri oturuyor.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
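
As a rough illustration of what MultipleNegativesRankingLoss does with these parameters: for each anchor, every positive and negative in the batch is treated as a candidate, cosine similarities are multiplied by the scale (20.0), and a cross-entropy loss pushes the anchor's own positive to score highest. The following is a library-free sketch under those assumptions, with hypothetical 3-dimensional vectors and an illustrative function name; the real loss operates on batched tensors.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mnr_loss(anchors, positives, negatives, scale=20.0):
    """Mean cross-entropy where anchor i must pick positive i among all candidates."""
    # Candidates: every positive in the batch, followed by every negative.
    candidates = positives + negatives
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine_similarity(anchor, c) for c in candidates]
        # Numerically stable log-sum-exp for the softmax normaliser.
        top = max(logits)
        log_norm = top + math.log(sum(math.exp(x - top) for x in logits))
        # Negative log-probability of the correct candidate (index i).
        losses.append(log_norm - logits[i])
    return sum(losses) / len(losses)


# Hypothetical toy batch of two triplets.
anchors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
positives = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]]
negatives = [[0.0, 0.0, 1.0], [0.1, 0.0, 1.0]]

well_separated = mnr_loss(anchors, positives, negatives)
swapped = mnr_loss(anchors, negatives, positives)  # wrong targets on purpose
print(well_separated, swapped)
```

The loss is near zero when each anchor is far more similar to its own positive than to any other candidate, and grows large when the targets are swapped. Because other triplets' positives also act as negatives, avoiding duplicate texts within a batch matters for this loss.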
    

Evaluation Dataset

Unnamed Dataset

  • Size: 8,229 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    column    type    min       mean          max
    anchor    string  4 tokens  17.91 tokens  80 tokens
    positive  string  4 tokens  10.62 tokens  35 tokens
    negative  string  4 tokens  11.01 tokens  33 tokens
  • Samples:
    • anchor: Patlamanın büyüklüğünün güçlü bir örneği, Haragosha Tapınağı'nda bulunur, burada tapınağın kemerinin üst crosebar'ını görebilirsiniz, geri kalanı sertleşmiş lav tarafından batırılmıştır.
      positive: Patlamanın büyüklüğünün sonucu Haragosha Tapınağı'nda görülüyor.
      negative: Haragosha Tapınağı bu güne kadar tamamen sağlamdır.
    • anchor: Arkeolojik kazı yapan iki kişi.
      positive: Kazı yapan insanlar var.
      negative: Kimse kazmıyor.
    • anchor: İşçiler, Martins'in ünlü Louisiana sosis satıcısı çadırının önünde sıraya giren müşterilere hizmet veriyor
      positive: Müşteriler bir satıcı çadırının önünde sıraya giriyor.
      negative: Pamuk şeker yiyen bir grup insan var.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • learning_rate: 2e-05
  • num_train_epochs: 10
  • warmup_ratio: 0.1
  • fp16: True
  • batch_sampler: no_duplicates
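
To make the learning-rate settings concrete: with a linear scheduler and warmup_ratio 0.1, the learning rate rises linearly from 0 to 2e-05 over the first 10% of the steps and then decays linearly back to 0. The sketch below reproduces that shape in plain Python. The function name and the round total of 1000 steps are assumptions for illustration; the actual run logged roughly 127,000 steps.

```python
import math


def linear_schedule_lr(step, total_steps, base_lr=2e-5, warmup_ratio=0.1):
    """Learning rate at a given step for linear warmup followed by linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: ramp up from 0 to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: ramp down from base_lr to 0 at the final step.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical total, chosen only to keep the numbers easy to read.
total_steps = 1000
for step in (0, 50, 100, 550, 1000):
    print(step, linear_schedule_lr(step, total_steps))
```

The peak is reached at the end of warmup (step 100 in this toy setting), and the rate is half the peak midway through the decay phase.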

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss all-nli-turkish-dev_cosine_accuracy
0 0 - - 0.5808
0.0786 1000 3.5327 1.9481 0.7607
0.1571 2000 1.5833 1.2787 0.8260
0.2357 3000 1.2338 1.0960 0.8533
0.3142 4000 1.1031 0.9897 0.8695
0.3928 5000 0.998 0.9077 0.8793
0.4714 6000 0.9412 0.8434 0.8914
0.5499 7000 0.8703 0.7904 0.8982
0.6285 8000 0.8094 0.7311 0.9068
0.7070 9000 0.7653 0.6894 0.9086
0.7856 10000 0.7248 0.6509 0.9162
0.8642 11000 0.673 0.6145 0.9205
0.9427 12000 0.6514 0.5762 0.9273
1.0213 13000 0.6259 0.5463 0.9334
1.0999 14000 0.5874 0.5276 0.9332
1.1784 15000 0.5518 0.5053 0.9366
1.2570 16000 0.5277 0.4783 0.9391
1.3355 17000 0.5075 0.4571 0.9419
1.4141 18000 0.4906 0.4379 0.9454
1.4927 19000 0.475 0.4234 0.9465
1.5712 20000 0.447 0.4046 0.9499
1.6498 21000 0.4307 0.3908 0.9508
1.7283 22000 0.4126 0.3773 0.9548
1.8069 23000 0.3985 0.3654 0.9564
1.8855 24000 0.3748 0.3582 0.9560
1.9640 25000 0.3675 0.3449 0.9581
2.0426 26000 0.3545 0.3390 0.9586
2.1211 27000 0.3456 0.3335 0.9595
2.1997 28000 0.3295 0.3255 0.9626
2.2783 29000 0.3198 0.3146 0.9624
2.3568 30000 0.3107 0.3101 0.9642
2.4354 31000 0.3139 0.3014 0.9665
2.5139 32000 0.2982 0.3005 0.9659
2.5925 33000 0.2903 0.2891 0.9663
2.6711 34000 0.2778 0.2859 0.9662
2.7496 35000 0.2731 0.2812 0.9667
2.8282 36000 0.2613 0.2757 0.9677
2.9067 37000 0.2566 0.2680 0.9689
2.9853 38000 0.2488 0.2674 0.9699
3.0639 39000 0.2434 0.2594 0.9694
3.1424 40000 0.2375 0.2574 0.9705
3.2210 41000 0.2295 0.2553 0.9706
3.2996 42000 0.223 0.2501 0.9703
3.3781 43000 0.2209 0.2455 0.9719
3.4567 44000 0.2211 0.2409 0.9711
3.5352 45000 0.2097 0.2396 0.9728
3.6138 46000 0.2068 0.2345 0.9734
3.6924 47000 0.1994 0.2298 0.9731
3.7709 48000 0.1986 0.2299 0.9730
3.8495 49000 0.1878 0.2271 0.9728
3.9280 50000 0.1872 0.2244 0.9739
4.0066 51000 0.1821 0.2249 0.9734
4.0852 52000 0.1823 0.2188 0.9739
4.1637 53000 0.1736 0.2176 0.9748
4.2423 54000 0.1691 0.2152 0.9745
4.3208 55000 0.1665 0.2148 0.9753
4.3994 56000 0.1663 0.2133 0.9748
4.4780 57000 0.1666 0.2123 0.9755
4.5565 58000 0.1589 0.2082 0.9758
4.6351 59000 0.155 0.2053 0.9762
4.7136 60000 0.155 0.2037 0.9762
4.7922 61000 0.1536 0.2031 0.9764
4.8708 62000 0.1443 0.2020 0.9759
4.9493 63000 0.146 0.1999 0.9752
5.0279 64000 0.1417 0.1969 0.9764
5.1064 65000 0.1407 0.1966 0.9761
5.1850 66000 0.1342 0.1981 0.9757
5.2636 67000 0.1342 0.1933 0.9768
5.3421 68000 0.1312 0.1944 0.9758
5.4207 69000 0.1329 0.1932 0.9772
5.4993 70000 0.1304 0.1908 0.9768
5.5778 71000 0.1247 0.1880 0.9772
5.6564 72000 0.1221 0.1861 0.9779
5.7349 73000 0.1225 0.1831 0.9784
5.8135 74000 0.1205 0.1854 0.9790
5.8921 75000 0.1152 0.1815 0.9789
5.9706 76000 0.1161 0.1827 0.9782
6.0492 77000 0.1151 0.1819 0.9781
6.1277 78000 0.113 0.1818 0.9780
6.2063 79000 0.1102 0.1823 0.9784
6.2849 80000 0.1067 0.1798 0.9780
6.3634 81000 0.1067 0.1782 0.9790
6.4420 82000 0.1116 0.1779 0.9782
6.5205 83000 0.107 0.1752 0.9782
6.5991 84000 0.1039 0.1739 0.9792
6.6777 85000 0.1013 0.1728 0.9789
6.7562 86000 0.1029 0.1713 0.9786
6.8348 87000 0.0972 0.1721 0.9791
6.9133 88000 0.0991 0.1703 0.9790
6.9919 89000 0.0955 0.1708 0.9791
7.0705 90000 0.097 0.1715 0.9786
7.1490 91000 0.0941 0.1716 0.9793
7.2276 92000 0.0922 0.1712 0.9795
7.3062 93000 0.0921 0.1706 0.9789
7.3847 94000 0.091 0.1691 0.9793
7.4633 95000 0.0942 0.1689 0.9787
7.5418 96000 0.0905 0.1678 0.9790
7.6204 97000 0.0871 0.1664 0.9792
7.6990 98000 0.0859 0.1666 0.9793
7.7775 99000 0.0876 0.1656 0.9785
7.8561 100000 0.084 0.1643 0.9795
7.9346 101000 0.0853 0.1654 0.9795
8.0132 102000 0.083 0.1640 0.9789
8.0918 103000 0.0849 0.1637 0.9795
8.1703 104000 0.0816 0.1626 0.9797
8.2489 105000 0.0803 0.1627 0.9796
8.3274 106000 0.0802 0.1623 0.9796
8.4060 107000 0.0808 0.1622 0.9798
8.4846 108000 0.0836 0.1632 0.9792
8.5631 109000 0.0791 0.1612 0.9796
8.6417 110000 0.0761 0.1609 0.9798
8.7202 111000 0.0782 0.1604 0.9797
8.7988 112000 0.0784 0.1604 0.9803
8.8774 113000 0.0737 0.1600 0.9804
8.9559 114000 0.0762 0.1602 0.9799
9.0345 115000 0.0764 0.1597 0.9802
9.1130 116000 0.0761 0.1600 0.9799
9.1916 117000 0.0729 0.1592 0.9797
9.2702 118000 0.0728 0.1595 0.9803
9.3487 119000 0.0722 0.1590 0.9798
9.4273 120000 0.0745 0.1591 0.9797
9.5059 121000 0.0741 0.1591 0.9798
9.5844 122000 0.0715 0.1587 0.9797
9.6630 123000 0.0719 0.1581 0.9799
9.7415 124000 0.0716 0.1578 0.9799
9.8201 125000 0.0714 0.1582 0.9801
9.8987 126000 0.0712 0.1579 0.9803
9.9772 127000 0.0707 0.1581 0.9802

Framework Versions

  • Python: 3.12.4
  • Sentence Transformers: 3.3.1
  • Transformers: 4.44.2
  • PyTorch: 2.4.1+cu124
  • Accelerate: 0.33.0
  • Datasets: 3.1.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}