the-hf-stack/dagster-hf-datasets-examples
Infrastructure, integrations and tooling for the Hugging Face ecosystem.
HFStack is an open-source organization focused on building reproducible ML infrastructure around the Hugging Face stack.
HFStack focuses on the systems surrounding modern ML:
Building reproducible workflows around Hugging Face Datasets and HF Buckets.
Experiment tracking, artifact lineage, and reproducible evaluation pipelines using Trackio.
Inference benchmarking, optimization workflows, and runtime evaluation tooling.
Composable integrations with tools like Dagster and ecosystem-native ML workflows.
The goal is to make Hugging Face workflows easier to build and operationalize.
dagster-hf-datasets: integrates Hugging Face Datasets with Dagster for building reproducible, observable data pipelines.
Load datasets directly as Dagster assets, apply transformations, and publish results back to the Hub.
Contributions and ecosystem collaborations are welcome.