Collections
Collections including paper arxiv:2006.11477
- Latent Positional Information is in the Self-Attention Variance of Transformer Language Models Without Positional Embeddings
  Paper • 2305.13571 • Published • 2
- BERTs are Generative In-Context Learners
  Paper • 2406.04823 • Published • 1
- Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
  Paper • 2412.13663 • Published • 121
- wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
  Paper • 2006.11477 • Published • 5

- WavLLM: Towards Robust and Adaptive Speech Large Language Model
  Paper • 2404.00656 • Published • 10
- Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization
  Paper • 2404.09956 • Published • 11
- Long-form music generation with latent diffusion
  Paper • 2404.10301 • Published • 24
- wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
  Paper • 2006.11477 • Published • 5

- WhisperX: Time-Accurate Speech Transcription of Long-Form Audio
  Paper • 2303.00747 • Published • 4
- Custom Data Augmentation for low resource ASR using Bark and Retrieval-Based Voice Conversion
  Paper • 2311.14836 • Published • 2
- SONAR: Sentence-Level Multimodal and Language-Agnostic Representations
  Paper • 2308.11466 • Published • 1
- W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training
  Paper • 2108.06209 • Published • 1