whisper-lang-id

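This model is a fine-tuned version of openai/whisper-tiny on the mozilla-foundation/common_voice_11_0 dataset, trained for spoken language identification. After one epoch of training it reaches a validation loss of 0.0148, an accuracy of 0.995, and an F1 score of 0.9950.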

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

mozilla-foundation/common_voice_11_0

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: linear
num_epochs: 1
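
As a rough illustration of what lr_scheduler_type: linear implies, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (the card lists none) and takes the total of 175 steps from the results table; the function name linear_lr and its parameters are illustrative and not taken from the training script.

```python
def linear_lr(step, base_lr=2e-5, total_steps=175, warmup_steps=0):
    """Learning rate at `step` under a linear schedule with optional warmup.

    Assumption: warmup_steps=0, since the card does not list a warmup setting.
    """
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 50, 100, 175):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at 2e-05 and reaches zero at step 175, the end of the single epoch.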

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
No log	1.0	175	0.0148	0.995	0.9950
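
The card does not say how Accuracy and F1 were computed. As a minimal sketch, assuming F1 is macro-averaged over the class labels, both metrics can be derived from lists of reference and predicted labels as follows; the function name accuracy_and_macro_f1 is hypothetical.

```python
def accuracy_and_macro_f1(y_true, y_pred):
    """Return (accuracy, macro-averaged F1) for two equal-length label lists."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and the same length")

    # Accuracy: fraction of predictions that match the reference label.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    accuracy = correct / len(y_true)

    # Per-class F1 = 2*TP / (2*TP + FP + FN), then an unweighted mean over classes.
    f1_scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)

    return accuracy, sum(f1_scores) / len(f1_scores)
```

If the reported F1 was weighted by class frequency instead, the final mean would need per-class weights; with roughly balanced classes the two variants give similar values.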

Framework versions

Transformers 4.48.0.dev0
Pytorch 2.5.1+cu121
Datasets 3.2.0
Tokenizers 0.21.

Example Usage

Here is an example of how to use the model for language identification with Gradio:

from transformers import pipeline
import gradio as gr

# Use a pipeline as a high-level helper
pipe = pipeline("audio-classification", model="Lingalingeswaran/whisper-lang-id")

def identify_language(audio_file):
    """Identifies the language of an audio file."""
    try:
        result = pipe(audio_file)
        predicted_label = result[0]['label']
        score = result[0]['score']

        # Map the raw class ids to language names; unknown labels pass through unchanged
        label_names = {"LABEL_0": "Tamil", "LABEL_1": "English"}
        predicted_label = label_names.get(predicted_label, predicted_label)

        return f"Predicted Language: {predicted_label}, Score: {score:.4f}"
    except Exception as e:
        return f"Error during language identification: {e}"

# Gradio interface
def create_gradio_interface():
    with gr.Blocks() as demo:
        gr.Markdown("### Language Identification from Audio File")
        gr.Markdown("Upload an audio file or use your microphone to detect the language spoken.")

        # Accept audio from the microphone or an uploaded file, passed to the model as a file path
        audio_input = gr.Audio(sources=["microphone", "upload"], type="filepath", label="Record or Upload Audio")
        result_output = gr.Textbox(label="Language Identification Result", interactive=False)

        # Submit button
        submit_btn = gr.Button("Submit")
        submit_btn.click(identify_language, inputs=audio_input, outputs=result_output)

        # Clear button
        clear_btn = gr.Button("Clear")
        clear_btn.click(lambda: (None, None), outputs=[audio_input, result_output])  # Clear audio and result

    demo.launch()

# Run the Gradio interface
create_gradio_interface()
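
For use outside Gradio, the post-processing step can be separated from the pipeline call. The sketch below operates on the list of label/score dictionaries that the example reads from pipe(audio_file). The label names come from the example above, while top_language and the min_score cutoff are illustrative assumptions.

```python
# Label mapping taken from the Gradio example; min_score is a hypothetical cutoff.
LABEL_NAMES = {"LABEL_0": "Tamil", "LABEL_1": "English"}


def top_language(results, min_score=0.5):
    """Return (language, score) for the best prediction.

    `results` is a list of {"label": str, "score": float} dicts.
    The language is None when the best score is below `min_score`.
    """
    if not results:
        raise ValueError("results is empty")
    # Pick the highest-scoring entry rather than relying on list order.
    best = max(results, key=lambda r: r["score"])
    if best["score"] < min_score:
        return None, best["score"]
    # Unknown labels are returned unchanged.
    return LABEL_NAMES.get(best["label"], best["label"]), best["score"]
```

With the pipeline from the example, this would be called as top_language(pipe("sample.wav")), where sample.wav is a placeholder path.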
