---
license: mit
language:
- as
base_model:
- openai/whisper-medium
---

# Assamese Dialect Classification Model

This repository contains a trained model for classifying Assamese dialects based on speech inputs. The model was developed to assist in identifying and understanding regional variations of the Assamese language.

---

## **Model Purpose**

The purpose of this model is to classify different dialects of Assamese speech. It is useful for linguistic research, speech analysis, and creating dialect-aware applications in natural language processing (NLP) and automatic speech recognition (ASR).

---

## **Dialects Recognized**

The model is trained to recognize the following four dialects of the Assamese language:

1. **Darangia**
2. **Kamrupia**
3. **Nalbaria**
4. **Upper Assam**

---

## **Training Dataset**

The model was trained on a dataset of 300 speech samples, curated to include diverse speakers, phrases, and dialect features. The dataset includes:

- **Diverse Data**: Various accents, speaker genders, and age groups.
- **Metadata**: Information about speaker age, gender, district, and speech duration.
- **Common Phrases**: Speech samples based on frequently used phrases in Assamese.

---
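
To make the metadata description above more concrete, the following is a minimal sketch of how one sample record and the four dialect labels could be represented. The class name `SpeechSample`, its field names, the label index order, and the example values are all assumptions for illustration; they are not the dataset's actual schema.

```python
from dataclasses import dataclass

# The four dialects named in this card. The index order is an assumption,
# not necessarily the order used by the trained model.
DIALECTS = ["Darangia", "Kamrupia", "Nalbaria", "Upper Assam"]
LABEL_TO_ID = {name: idx for idx, name in enumerate(DIALECTS)}


@dataclass
class SpeechSample:
    """Hypothetical metadata record for one speech sample."""

    audio_path: str          # path to the audio file (placeholder)
    dialect: str             # one of DIALECTS
    speaker_age: int         # speaker age in years
    speaker_gender: str      # speaker gender as recorded
    district: str            # speaker's district
    duration_seconds: float  # length of the recording

    def label_id(self) -> int:
        """Return the integer class label for this sample's dialect."""
        if self.dialect not in LABEL_TO_ID:
            raise ValueError(f"Unknown dialect: {self.dialect!r}")
        return LABEL_TO_ID[self.dialect]


# Example record with placeholder values (not taken from the real dataset).
sample = SpeechSample(
    audio_path="sample_001.wav",
    dialect="Kamrupia",
    speaker_age=34,
    speaker_gender="female",
    district="Kamrup",
    duration_seconds=4.2,
)
print(sample.label_id())
```

Keeping the label mapping in one place helps ensure that training and inference code agree on which integer corresponds to which dialect.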