Recent Activity
textcleanlm/essentialweb-1.0-10B-clean-content
• 9.32M • 37
textcleanlm/essentialweb-1.0-10B-raw-content
• 9.32M • 50
textcleanlm/essentialweb-1.0-sample-10B
• 9.32M • 61
Viewer
• Updated
• 2.98M • 18
textcleanlm/med-domain-5b
• 4.07M • 8
textcleanlm/med-domain-data-sample1
• 814k • 4
textcleanlm/med-domain-data-sample
• 8.1k • 4
textcleanlm/fineweb-sample-10BT
• 14.9M • 23
textcleanlm/training-data-2
• 66.3k • 6
textcleanlm/textclean-10B
• 9.77M • 96
textcleanlm/textclean-2B-raw-cleaned
• 1.95M • 17
textcleanlm/textclean-2B-raw-sample
• 100 • 4
textcleanlm/textclean-2B-raw
• 1.97M • 7
textcleanlm/textclean-sft
• 894k • 8
Viewer
• Updated
• 91.7k • 6
textcleanlm/textclean-200M
• 581k • 15
textcleanlm/100M-raw-webtext-to-denoised-text
• 179k • 104
textcleanlm/annotation_example
• 1.82k • 99
Viewer
• Updated
• 1.82k • 107
textcleanlm/textclean-20M
• 18.3k • 167
textcleanlm/textclean-corpus-10M-deepseek-ablation
• 18.1k • 5
textcleanlm/textclean-corpus-1M-variant-ablation-research
• 1.82k • 86
textcleanlm/textclean-corpus-1M-old
• 1.82k • 82 • 1
textcleanlm/textclean-corpus-1M-o4-mini
• 1.82k • 82