- UniAudio: An Audio Foundation Model Toward Universal Audio Generation
  Paper • 2310.00704 • Published • 21
- Structural Similarities Between Language Models and Neural Response Measurements
  Paper • 2306.01930 • Published • 2
- Streaming Transformer ASR with Blockwise Synchronous Beam Search
  Paper • 2006.14941 • Published • 2
- NU-GAN: High resolution neural upsampling with GAN
  Paper • 2010.11362 • Published • 2
Collections
Collections including paper arxiv:2006.11477
- Robust Speech Recognition via Large-Scale Weak Supervision
  Paper • 2212.04356 • Published • 25
- Conformer: Convolution-augmented Transformer for Speech Recognition
  Paper • 2005.08100 • Published
- wav2vec: Unsupervised Pre-training for Speech Recognition
  Paper • 1904.05862 • Published
- wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
  Paper • 2006.11477 • Published • 5
- Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation
  Paper • 2211.06687 • Published • 3
- EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio Captioning
  Paper • 2401.17690 • Published • 5
- Amphion: An Open-Source Audio, Music and Speech Generation Toolkit
  Paper • 2312.09911 • Published • 53
- Audiobox: Unified Audio Generation with Natural Language Prompts
  Paper • 2312.15821 • Published • 13
- facebook/wav2vec2-large-960h-lv60-self
  Automatic Speech Recognition • Updated • 1.07M • 141
- facebook/wav2vec2-large-960h
  Automatic Speech Recognition • Updated • 64.3k • 28
- facebook/wav2vec2-base-960h
  Automatic Speech Recognition • Updated • 1.57M • 312
- facebook/wav2vec2-base-100h
  Automatic Speech Recognition • Updated • 1.67k • 6