Model Description

Whisper-tiny fine-tuned on the SwissDial-ZH dataset for speech recognition of Swiss German dialects.

Model Details

Training

  • Duration: 4 hours
  • Hardware: NVIDIA RTX 3080
  • Batch Size: 32
  • Train/Test Split: 90%/10% (specific sentence selection)

Performance

  • Word error rate (WER): ~37% on the held-out 10% test set
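
For readers unfamiliar with the metric, WER is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The sketch below is a plain-Python illustration of that definition. This card does not say how the reported figure was computed, so the function name `word_error_rate` and the whitespace tokenization without any text normalization are assumptions, not the evaluation script used here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words.

    Assumption: words are split on whitespace, with no casing or
    punctuation normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the"):
# 2 errors over 6 reference words.
score = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"WER: {score:.1%}")
```

A WER of about 37% therefore means roughly one word-level error for every three reference words, under whatever normalization the evaluation applied.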

Usage

from transformers import WhisperForConditionalGeneration, WhisperProcessor

# Load the fine-tuned checkpoint and its matching processor from the Hub
model_name = "nizarmichaud/whisper-tiny-swiss-german"
model = WhisperForConditionalGeneration.from_pretrained(model_name)
processor = WhisperProcessor.from_pretrained(model_name)

# Whisper expects mono audio sampled at 16 kHz, as a 1-D float array
audio_input = ...  # Your audio input here
inputs = processor(audio_input, sampling_rate=16000, return_tensors="pt")

# Generate token IDs, then decode them to text (one string per input clip)
generated_ids = model.generate(inputs["input_features"])
transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)

print(transcription)
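
The snippet above leaves `audio_input` open. As one possible way to fill it, the sketch below reads a 16 kHz, mono, 16-bit PCM WAV file using only the Python standard library and scales the samples to floats. The helper name `load_wav_mono_16k` and the file name in the usage note are hypothetical. The helper does no resampling or channel mixing, so audio in other formats would need a library such as librosa or torchaudio instead.

```python
import array
import sys
import wave


def load_wav_mono_16k(path: str) -> list[float]:
    """Read a 16 kHz mono 16-bit PCM WAV file as floats in [-1.0, 1.0).

    Hypothetical helper for illustration; it rejects anything that
    would need resampling or downmixing.
    """
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != 16000:
            raise ValueError(f"expected 16 kHz audio, got {wav.getframerate()} Hz")
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected mono 16-bit PCM audio")
        frames = wav.readframes(wav.getnframes())

    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    # Scale integer samples to the float range the feature extractor expects.
    return [s / 32768.0 for s in samples]
```

With a suitable file on disk, `audio_input = load_wav_mono_16k("sample.wav")` can replace the placeholder line. The Whisper feature extractor is documented to accept a list of floats; wrapping the result in `numpy.asarray(..., dtype="float32")` is a safe alternative if a NumPy array is preferred.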

License: MIT

Model size: 37.8M parameters (Safetensors, F32 tensors)