AraMix is a SOTA Arabic pretraining dataset
Sultan Alrashed PRO
SultanR
AI & ML interests
Smol language modelling and Arabic!
Recent Activity
liked a dataset about 14 hours ago: Zyphra/dclm-dedup
published a dataset about 15 hours ago: SultanR/dclm-edu-ar-500k
updated a dataset 1 day ago: SultanR/dclm-edu-ar-500k