BigScience Workshop

non-profit

https://bigscience.huggingface.co

bigscience-workshop

AI & ML interests

A one-year-long research workshop on large language models: the Summer of Language Models 21 🌸

Recent Activity

NohTow authored a paper 6 days ago

ColBERT-Zero: To Pre-train Or Not To Pre-train ColBERT models

nihalnayak authored a paper 8 days ago

A Critical Look at Targeted Instruction Selection: Disentangling What Matters (and What Doesn't)

nihalnayak submitted a paper 9 days ago

A Critical Look at Targeted Instruction Selection: Disentangling What Matters (and What Doesn't)

submitted a paper to Daily Papers 26 days ago

Shaping capabilities with token-level data filtering

Paper • 2601.21571 • Published 27 days ago • 27

afaji

authored a paper 29 days ago

PingPong: A Natural Benchmark for Multi-Turn Code-Switching Dialogues

Paper • 2601.17277 • Published Jan 24 • 6

shubhamagarwal92

authored a paper about 2 months ago

BhashaKritika: Building Synthetic Pretraining Data at Scale for Indic Languages

Paper • 2511.10338 • Published Nov 13, 2025

in bigscience/bloomz-560m 3 months ago

Fails to load with transformers v4.57+

#14 opened 3 months ago by

authored a paper 3 months ago

Economies of Open Intelligence: Tracing Power & Participation in the Model Ecosystem

Paper • 2512.03073 • Published Nov 27, 2025 • 6

posted an update 3 months ago

PatchDNA, a DNA foundation model based on Meta's BLT tokenization strategy https://www.biorxiv.org/content/10.1101/2025.11.28.691095v1

in bigscience/petals-api 3 months ago

Bloom

#2 opened 3 months ago by

authored a paper 4 months ago

Grounding Computer Use Agents on Human Demonstrations

Paper • 2511.07332 • Published Nov 10, 2025 • 106

authored a paper 4 months ago

OpenSIR: Open-Ended Self-Improving Reasoner

Paper • 2511.00602 • Published Nov 1, 2025 • 21

Zaid

authored a paper 4 months ago

Global PIQA: Evaluating Physical Commonsense Reasoning Across 100+ Languages and Cultures

Paper • 2510.24081 • Published Oct 28, 2025 • 19

authored a paper 4 months ago

The German Commons - 154 Billion Tokens of Openly Licensed Text for German Language Models

Paper • 2510.13996 • Published Oct 15, 2025 • 9

authored 8 papers 4 months ago

Unveiling and Consulting Core Experts in Retrieval-Augmented MoE-based LLMs

Paper • 2410.15438 • Published Oct 20, 2024

PosterSum: A Multimodal Benchmark for Scientific Poster Summarization

Paper • 2502.17540 • Published Feb 24, 2025 • 3

Self-Training Large Language Models for Tool-Use Without Demonstrations

Paper • 2502.05867 • Published Feb 9, 2025

Q-Filters: Leveraging QK Geometry for Efficient KV Cache Compression

Paper • 2503.02812 • Published Mar 4, 2025 • 10

Parameter-Efficient Fine-Tuning of LLaMA for the Clinical Domain

Paper • 2307.03042 • Published Jul 6, 2023

An Analysis of Decoding Methods for LLM-based Agents for Faithful Multi-Hop Question Answering

Paper • 2503.23415 • Published Mar 30, 2025 • 1

MedDistant19: Towards an Accurate Benchmark for Broad-Coverage Biomedical Relation Extraction

Paper • 2204.04779 • Published Apr 10, 2022

PiCSAR: Probabilistic Confidence Selection And Ranking

Paper • 2508.21787 • Published Aug 29, 2025 • 4