Instructions for using jdopensource/JoyAI-LLM-Flash with libraries, inference providers, notebooks, and local apps. Follow the sections below to get started.
- Libraries
  - Transformers
How to use jdopensource/JoyAI-LLM-Flash with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="jdopensource/JoyAI-LLM-Flash", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("jdopensource/JoyAI-LLM-Flash", trust_remote_code=True, dtype="auto")
```
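The direct-load snippet only instantiates the model. Below is a minimal generation sketch, assuming the repository ships a tokenizer with a chat template; the sampling settings are illustrative, not the model's recommended defaults:

```python
# Minimal sketch: tokenize a chat message and generate a reply.
# Assumes the repo provides a chat template; generation settings are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jdopensource/JoyAI-LLM-Flash"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True, dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Who are you?"}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```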
- Notebooks
  - Google Colab
  - Kaggle
- Local Apps
  - vLLM
How to use jdopensource/JoyAI-LLM-Flash with vLLM:
Install from pip and serve the model
```sh
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "jdopensource/JoyAI-LLM-Flash"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "jdopensource/JoyAI-LLM-Flash",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
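Because the server exposes an OpenAI-compatible API, you can also call it from Python with the official `openai` client. A sketch, assuming the server above is running on the default port 8000 without authentication:

```python
# Query the local vLLM server through its OpenAI-compatible endpoint.
# Assumes the server from the snippet above is on localhost:8000 with no auth.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
response = client.chat.completions.create(
    model="jdopensource/JoyAI-LLM-Flash",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)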
Use Docker

```sh
docker model run hf.co/jdopensource/JoyAI-LLM-Flash
```
  - SGLang
How to use jdopensource/JoyAI-LLM-Flash with SGLang:
Install from pip and serve the model
```sh
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "jdopensource/JoyAI-LLM-Flash" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "jdopensource/JoyAI-LLM-Flash",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
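SGLang can also run the model in-process, without an HTTP server, via its offline engine API. A sketch, assuming SGLang supports this model's architecture; the sampling parameters are illustrative:

```python
# Offline inference with SGLang's engine API; no server required.
# Assumes SGLang supports this model's architecture; settings are illustrative.
import sglang as sgl

llm = sgl.Engine(model_path="jdopensource/JoyAI-LLM-Flash")
prompts = ["What is the capital of France?"]
sampling_params = {"temperature": 0.7, "max_new_tokens": 128}
outputs = llm.generate(prompts, sampling_params)
print(outputs[0]["text"])
```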
Use Docker images

```sh
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
  --model-path "jdopensource/JoyAI-LLM-Flash" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "jdopensource/JoyAI-LLM-Flash",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```

  - Docker Model Runner
How to use jdopensource/JoyAI-LLM-Flash with Docker Model Runner:
```sh
docker model run hf.co/jdopensource/JoyAI-LLM-Flash
```
Question regarding high accuracy on the GSM8K dataset
Hello! I noticed that the model achieves high accuracy on the GSM8K benchmark, and I have a few questions about this result:
Training Data Usage
Could you please clarify whether the model was trained on the GSM8K training split? If so, could you share details of the specific training setup?
Evaluation Rigor
If the reported accuracy was obtained after training on GSM8K data, does that mean the evaluation was not performed on a fully independent test set? Could this introduce a risk of overfitting and undermine the credibility of the results?
Experimental Setup
Could you please share the complete experimental setup? For example, did you follow the standard train/validation/test split to ensure the validity of the results?
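For context, GSM8K as distributed on the Hub ships only official train and test splits. A quick sketch (a hypothetical check, assuming the `openai/gsm8k` dataset ID) of loading them and confirming that the test questions are disjoint from the training split:

```python
# Load the official GSM8K splits and confirm they are disjoint, i.e. that an
# evaluation on the test split does not reuse training questions.
from datasets import load_dataset

gsm8k = load_dataset("openai/gsm8k", "main")  # splits: train (7,473), test (1,319)
train_questions = {ex["question"] for ex in gsm8k["train"]}
test_questions = {ex["question"] for ex in gsm8k["test"]}
print(len(train_questions & test_questions))  # expected: 0
```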
Thank you for your help!