🚀 JoyAI-LLM-Flash Official Collection Notice

We are excited to announce that the official JoyAI-LLM-Flash collection by jdopensource now provides a complete lineup of models designed to support diverse hardware environments, from high-performance GPUs to memory-constrained edge devices.

All models below are part of the official Hugging Face collection.

🔷 Core Models

  1. jdopensource/JoyAI-LLM-Flash
    Full-precision flagship model for high-quality inference and advanced reasoning (see the loading sketch after this list).
    👉 https://huggingface.co/jdopensource/JoyAI-LLM-Flash

  2. jdopensource/JoyAI-LLM-Flash-Base
    Streamlined base version for rapid experimentation and foundational deployment.
    👉 https://huggingface.co/jdopensource/JoyAI-LLM-Flash-Base
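
Both core checkpoints are regular Hugging Face model repositories, so a standard Transformers loading flow applies. Below is a minimal sketch, not an official recipe: it assumes the checkpoint is compatible with `AutoModelForCausalLM`, that `accelerate` is installed for `device_map="auto"`, and that you only add `trust_remote_code=True` if the repo ships custom modeling code.

```python
# Minimal sketch: load the full-precision flagship with Hugging Face Transformers.
# Assumption: the checkpoint works with AutoModelForCausalLM and fits your GPU memory.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jdopensource/JoyAI-LLM-Flash"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's native precision
    device_map="auto",    # requires `accelerate`; spreads layers across available devices
)

prompt = "Explain what the JoyAI-LLM-Flash collection offers."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```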

🔶 Optimized & Quantized Variants

  1. jdopensource/JoyAI-LLM-Flash-FP8
    FP8 quantization β€” strong balance between performance and efficiency.
    👉 https://huggingface.co/jdopensource/JoyAI-LLM-Flash-FP8

  2. jdopensource/JoyAI-LLM-Flash-INT4
    Ultra-compact INT4 version for extremely limited VRAM environments.
    👉 https://huggingface.co/jdopensource/JoyAI-LLM-Flash-INT4

  3. jdopensource/JoyAI-LLM-Flash-Block-INT8
    Block-wise INT8 quantization for improved memory efficiency.
    👉 https://huggingface.co/jdopensource/JoyAI-LLM-Flash-Block-INT8

  4. jdopensource/JoyAI-LLM-Flash-Channel-INT8
    Channel-wise INT8 optimization for balanced throughput and accuracy.
    👉 https://huggingface.co/jdopensource/JoyAI-LLM-Flash-Channel-INT8

  5. jdopensource/JoyAI-LLM-Flash-GGUF
    GGUF format for broad compatibility (e.g., llama.cpp, Ollama, CPU inference).
    👉 https://huggingface.co/jdopensource/JoyAI-LLM-Flash-GGUF
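
For the GGUF variant, one possible CPU-only route is the `llama-cpp-python` bindings, which can fetch a GGUF file from the Hub and run it locally. This is only a sketch under assumptions: the quantization filename pattern below is a placeholder (check the repo's file list for the files that actually exist), and `llama-cpp-python` plus `huggingface_hub` must be installed.

```python
# Minimal sketch: CPU inference on the GGUF variant via llama-cpp-python.
# The filename glob is hypothetical; pick a file that exists in the repo
# and fits your memory budget.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="jdopensource/JoyAI-LLM-Flash-GGUF",
    filename="*Q4_K_M.gguf",  # placeholder pattern; verify against the repo
    n_ctx=4096,               # context window; tune to your use case
)

out = llm("Summarize the JoyAI-LLM-Flash model lineup in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```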

🎯 Deploy Anywhere

Whether you are running:

  1. High-throughput GPU servers

  2. Consumer-grade GPUs

  3. Low-memory edge devices

  4. CPU-only local inference

The official JoyAI-LLM-Flash collection provides a tailored option for your hardware stack.

Choose the precision. Match your memory budget. Maximize performance.
