Libraries

How to use iboing/CorDA_IPA_math_finetuned_math with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="iboing/CorDA_IPA_math_finetuned_math", trust_remote_code=True)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("iboing/CorDA_IPA_math_finetuned_math", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("iboing/CorDA_IPA_math_finetuned_math", trust_remote_code=True)
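
The loading snippet above stops short of producing text. Below is a minimal sketch of one way to go from a math question to a generated answer. The Alpaca-style `PROMPT_TEMPLATE`, the helper names `build_prompt` and `generate_answer`, and the greedy decoding settings are assumptions made for illustration. This page does not state the prompt format used during fine-tuning, so check the CorDA repo before relying on it.

```python
MODEL_ID = "iboing/CorDA_IPA_math_finetuned_math"

# Assumption: an Alpaca-style instruction template. The model card does not
# document the training prompt, so treat this as a placeholder.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:"
)


def build_prompt(question: str) -> str:
    """Wrap a question in the (assumed) instruction template."""
    return PROMPT_TEMPLATE.format(instruction=question.strip())


def generate_answer(question: str, max_new_tokens: int = 256) -> str:
    """Load the model and greedily decode an answer to one question."""
    # Imported lazily so build_prompt can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)

    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)

    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_answer("What is 12 * 7?")` downloads the weights on first use, so it needs network access and enough memory for a 7B model.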

Local Apps

vLLM

How to use iboing/CorDA_IPA_math_finetuned_math with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "iboing/CorDA_IPA_math_finetuned_math"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "iboing/CorDA_IPA_math_finetuned_math",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
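
The same request can be sent from Python instead of curl. The sketch below uses only the standard library and mirrors the JSON body shown above. The helper names `build_completion_request` and `complete` are hypothetical, and the response parsing assumes the server returns the usual OpenAI-style `choices[0].text` field.

```python
import json
import urllib.request

MODEL_ID = "iboing/CorDA_IPA_math_finetuned_math"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build a POST request for the OpenAI-compatible /v1/completions route."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request to a running server and return the generated text."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumption: OpenAI-style response shape with a "choices" list.
    return body["choices"][0]["text"]
```

With a server running, `complete("Once upon a time,")` returns the completion text. Passing `base_url="http://localhost:30000"` targets a server listening on port 30000 instead.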

Use Docker

docker model run hf.co/iboing/CorDA_IPA_math_finetuned_math

SGLang

How to use iboing/CorDA_IPA_math_finetuned_math with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "iboing/CorDA_IPA_math_finetuned_math" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "iboing/CorDA_IPA_math_finetuned_math",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "iboing/CorDA_IPA_math_finetuned_math" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "iboing/CorDA_IPA_math_finetuned_math",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use iboing/CorDA_IPA_math_finetuned_math with Docker Model Runner:
```
docker model run hf.co/iboing/CorDA_IPA_math_finetuned_math
```

license: llama2

This is the LLaMA-2-7b model fine-tuned on the math task using CorDA in IPA (instruction-previewed adaptation) mode with MetaMath.

| Method | TriviaQA | NQ open | GSM8k | Math |
|---|---|---|---|---|
| LoRA | 44.17 | 1.91 | 42.68 | 5.92 |
| CorDA (KPA with nqopen) | 45.23 | 10.44 | 45.64 | 6.94 |
| CorDA (IPA with MetaMath) | - | - | 54.59 | 8.54 |
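
To make the comparison concrete, the short sketch below computes the absolute gain of each CorDA variant over the LoRA baseline on the two math benchmarks. The scores are copied from the table above. The `scores` dictionary and the `gain_over_baseline` helper are illustrative names, not part of the CorDA code.

```python
# GSM8k and Math scores copied from the table above.
scores = {
    "LoRA": {"GSM8k": 42.68, "Math": 5.92},
    "CorDA (KPA with nqopen)": {"GSM8k": 45.64, "Math": 6.94},
    "CorDA (IPA with MetaMath)": {"GSM8k": 54.59, "Math": 8.54},
}


def gain_over_baseline(method, benchmark, baseline="LoRA"):
    """Absolute score difference between a method and the baseline."""
    return round(scores[method][benchmark] - scores[baseline][benchmark], 2)


for method in ("CorDA (KPA with nqopen)", "CorDA (IPA with MetaMath)"):
    for benchmark in ("GSM8k", "Math"):
        print(f"{method} on {benchmark}: +{gain_over_baseline(method, benchmark)}")
```

For this model (IPA with MetaMath), the gains over LoRA are 11.91 points on GSM8k and 2.62 points on Math.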

To evaluate the model's performance, follow Step 3 in the CorDA GitHub repo.

Note: The model trained with the CorDA adapter relies on customized code. To restore the original LLaMA architecture, run merge_adapter_for_corda.py from the CorDA GitHub repo.