geronimo

g-ronimo

https://medium.com/@geronimo7

geronimi73

AI & ML interests

fafo

Recent Activity

liked a dataset 19 days ago

tencent/MegaStyle-1.4M

upvoted an article 23 days ago

Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective

commented on an article 23 days ago

Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective



upvoted an article 23 days ago

Article

Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective

Jan 27 • 75

upvoted 3 articles 3 months ago

Article

PRX Part 3 — Training a Text-to-Image Model in 24h!

Photoroom • Mar 3 • 64

Article

Training Design for Text-to-Image Models: Lessons from Ablations

Photoroom • Feb 3 • 73

Article

Custom Kernels for All from Codex and Claude

burtenshaw, sayakpaul, ariG23498, evalstate • Feb 13 • 77

upvoted 2 articles 4 months ago

Article

Swift Transformers Reaches 1.0 – and Looks to the Future

pcuenq, FL33TW00D-HF, mattt, reach-vb • Sep 26, 2025 • 43

Article

Small Yet Mighty: Improve Accuracy In Multimodal Search and Visual Document Retrieval with Llama Nemotron RAG Models

nvidia • Jan 6 • 28

upvoted 4 articles 6 months ago

Article

Continuous batching from first principles

ror, ArthurZ, mcpotato • Nov 25, 2025 • 385

Article

Text-to-image Architectural Experiments

Photoroom • Nov 13, 2025 • 57

Article

We’re open-sourcing our text-to-image model and the process behind it

Photoroom • Nov 12, 2025 • 99

Article

Diffusers welcomes FLUX-2

YiYiXu, dg845, sayakpaul, OzzyGT, dn6, ariG23498, linoyts, multimodalart • Nov 25, 2025 • 191

upvoted a collection 8 months ago

Granite Embedding

Collection

Embedding models (bi‑encoders and rerankers) for RAG, semantic search, and retrieval tasks. • 9 items • Updated 18 days ago • 44

upvoted a paper 10 months ago

MetaCLIP 2: A Worldwide Scaling Recipe

Paper • 2507.22062 • Published Jul 29, 2025 • 37

upvoted 2 articles 10 months ago

Article

Extending Transformer layers as Painters to DiT's

NagaSaiAbhinay • Aug 31, 2024 • 16

Article

LeRobot.js

NERDDISCO • Jul 14, 2025 • 16

upvoted an article 11 months ago

Article

Learn the Hugging Face Kernel Hub in 5 Minutes

drbh, danieldk, Narsil, pcuenq, pagezyhf, merve, reach-vb • Jun 12, 2025 • 164

upvoted 2 articles 12 months ago

Article

KV Cache from scratch in nanoVLM

ariG23498, kashif, lusxvr, andito, pcuenq • Jun 4, 2025 • 119

Article

nanoVLM: The simplest repository to train your VLM in pure PyTorch

ariG23498, lusxvr, andito, sergiopaniego, merve, pcuenq, reach-vb • May 21, 2025 • 258

upvoted a paper about 1 year ago

OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning

Paper • 2505.04601 • Published May 7, 2025 • 29

upvoted a collection about 1 year ago

Vision

Collection

164 items • Updated Mar 18 • 1

upvoted an article about 1 year ago

Article

Remote VAEs for decoding with Inference Endpoints 🤗

hlky, sayakpaul • Feb 24, 2025 • 41
