RedHatAI/Phi-3-medium-128k-instruct-quantized.w8a8
Text Generation • 14B • Updated • 21 • 2
OpenSource and AI
SNLP: Layer-Parallel Inference via Structured Newton Corrections
S2D2: Fast Decoding for Diffusion LLMs via Training-Free Self-Speculation