ASLP-lab

http://www.nwpu-aslp.org/

AI & ML interests

None yet

Recent Activity

updated a dataset 10 days ago

ASLP-lab/UrduSpeech

liked a model 10 days ago

Soul-AILab/SoulX-Transcriber

liked a dataset 16 days ago

ASLP-lab/HumDial-EIBench

ASLP-lab's collections (15)

FMSU

Towards Fine-Grained Multi-Dimensional Speech Understanding: Data Pipeline, Benchmark, and Model

ASLP-lab/FMSU-Bench

Updated 24 days ago • 77 • 1
ASLP-lab/FM-Speech

Audio Classification • Updated 23 days ago • 2

OmniCodec

Low Frame Rate Universal Audio Codec with Semantic–Acoustic Disentanglement

ASLP-lab/OmniCodec

Feature Extraction • Updated Apr 8 • 1

WenetSpeech-Wu

ASLP-lab/WenetSpeech-Wu-Speech-Understanding

Updated Feb 2 • 2
ASLP-lab/WenetSpeech-Wu-Bench

Viewer • Updated Feb 8 • 242 • 427 • 4
ASLP-lab/WenetSpeech-Wu-Speech-Generation

Text-to-Speech • Updated Feb 1 • 3
ASLP-lab/WenetSpeech-Wu

Updated Feb 5 • 79 • 1

SenSE

ASLP-lab/SenSE

Updated Oct 16, 2025 • 10 • 7

SongFormer

SongFormer 🎵 • 22
State-of-the-art music analysis with multi-scale datasets
ASLP-lab/SongFormer

0.7B • Updated about 1 month ago • 286 • 17
ASLP-lab/SongFormDB

Updated 30 days ago • 6.96k • 9
ASLP-lab/SongFormBench

Viewer • Updated about 1 month ago • 3.82k • 531 • 3

WenetSpeech-Yue

A Large-scale Cantonese Speech Corpus with Multi-dimensional Annotation

WenetSpeech Yue 🔥 • 12
Large-Scale Cantonese Speech Corpus
ASLP-lab/WenetSpeech-Yue

Updated Feb 5 • 450 • 42
ASLP-lab/WSYue-ASR-eval

Viewer • Updated Sep 8, 2025 • 7.8k • 127 • 3
ASLP-lab/WSYue-TTS-eval

Viewer • Updated Sep 8, 2025 • 1.31k • 41 • 1

LLaSE

ASLP-lab/LLaSE-G1

Audio-to-Audio • Updated Mar 14, 2025 • 28

DiffRhythm

ASLP-lab/DiffRhythm-vae

Updated May 8, 2025 • 42
ASLP-lab/DiffRhythm-base

Updated Mar 26, 2025 • 26 • 171
ASLP-lab/DiffRhythm-full

Updated Mar 26, 2025 • 26 • 50
ASLP-lab/DiffRhythm-1_2

Updated May 8, 2025 • 33 • 17

Speaker-Reasoner

Speaker-Reasoner: Scaling Interaction Turns and Reasoning Patterns for Timestamped Speaker-Attributed ASR

ASLP-lab/Speaker-Reasoner-4194h

32B • Updated Apr 24 • 142 • 1
ASLP-lab/Speaker-Reasoner

32B • Updated Apr 24 • 18 • 2

YingMusic-Singer-Plus

YingMusic-Singer-Plus: Controllable Singing Voice Synthesis with Flexible Lyric Manipulation and Annotation-free Melody Guidance

YingMusic-Singer-Plus 🎤 • 8
Edit lyrics, keep the melody
ASLP-lab/YingMusic-Singer-Plus

0.7B • Updated Apr 9 • 968 • 7
ASLP-lab/LyricEditBench

Viewer • Updated Apr 9 • 7.2k • 502 • 2

VoiceSculptor

An instruct text-to-speech model developed by ASLP.

ASLP-lab/VoiceSculptor-VD

Text-to-Speech • 4B • Updated Feb 26 • 18 • 18
VoiceSculptor 📚 • 1

Easy Turn

ASLP-lab/Easy-Turn

Updated Oct 11, 2025 • 48 • 15
ASLP-lab/Easy-Turn-Trainset

Viewer • Updated Oct 18, 2025 • 1.91k • 639 • 9
ASLP-lab/Easy-Turn-Testset

Updated Sep 30, 2025 • 646 • 7

WenetSpeech-Chuan

A large-scale open-source corpus with a full processing pipeline and benchmarks for ASR and TTS

ASLP-lab/WSChuan-ASR

Automatic Speech Recognition • Updated Jan 9 • 6
ASLP-lab/WSChuan-TTS

Updated Sep 24, 2025 • 4
ASLP-lab/WSC-Train

Preview • Updated Apr 21 • 359 • 125
ASLP-lab/WSC-Eval

Viewer • Updated Dec 10, 2025 • 1.19k • 1.61k • 7

OSUM

OSUM: Open Speech Understanding Model, open-sourced by ASLP@NPU.

ASLP-lab/OSUM

Updated Feb 16, 2025 • 12

C2SER

ASLP-lab/Emotion2Vec-S

Updated Feb 27, 2025 • 4
ASLP-lab/C2SER-LLM

Updated Mar 3, 2025
ASLP-lab/Emo-Emilia

Viewer • Updated Feb 27, 2025 • 1.4k • 376 • 10
