Speaker-Reasoner: Scaling Interaction Turns and Reasoning Patterns for Timestamped Speaker-Attributed ASR
ASLP-lab
AI & ML interests
None yet
Recent Activity
Speaker-Reasoner updated a collection about 1 hour ago
Speaker-Reasoner updated a model about 2 hours ago
ASLP-lab/Speaker-Reasoner
Organizations
None yet
Speaker-Reasoner
Speaker-Reasoner: Scaling Interaction Turns and Reasoning Patterns for Timestamped Speaker-Attributed ASR
OmniCodec
Low Frame Rate Universal Audio Codec with Semantic-Acoustic Disentanglement
YingMusic-Singer-Plus
YingMusic-Singer-Plus: Controllable Singing Voice Synthesis with Flexible Lyric Manipulation and Annotation-free Melody Guidance
WenetSpeech-Wu
VoiceSculptor
An instruct text-to-speech model developed by ASLP.
SenSE
Easy Turn
SongFormer
WenetSpeech-Chuan
A large-scale open-source corpus with a full processing pipeline and benchmarks for ASR and TTS
WenetSpeech-Yue
A Large-scale Cantonese Speech Corpus with Multi-dimensional Annotation
OSUM
OSUM: Open Speech Understanding Model, open-sourced by ASLP@NPU.
LLaSE
C2SER
DiffRhythm