Instructions to use TildeAI/TildeOpen-30b with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use TildeAI/TildeOpen-30b with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("text-generation", model="TildeAI/TildeOpen-30b")# Load model directly from transformers import AutoTokenizer, AutoModelForCausalLM tokenizer = AutoTokenizer.from_pretrained("TildeAI/TildeOpen-30b") model = AutoModelForCausalLM.from_pretrained("TildeAI/TildeOpen-30b") - Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use TildeAI/TildeOpen-30b with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "TildeAI/TildeOpen-30b" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "TildeAI/TildeOpen-30b", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker
docker model run hf.co/TildeAI/TildeOpen-30b
- SGLang
How to use TildeAI/TildeOpen-30b with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "TildeAI/TildeOpen-30b" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "TildeAI/TildeOpen-30b", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "TildeAI/TildeOpen-30b" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "TildeAI/TildeOpen-30b", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }' - Docker Model Runner
How to use TildeAI/TildeOpen-30b with Docker Model Runner:
docker model run hf.co/TildeAI/TildeOpen-30b
New all layer 160M token finetune tildeopen-30b-mu-instruct
SPACE: https://huggingface.co/spaces/martinsu/tildeopen-30b-mu-instruct-space
Probably will run with resetting quotas at midnight.
MODEL: https://huggingface.co/martinsu/tildeopen-30b-mu-instruct
Kudos to the Tilde team for a great base model and for taming that large LUMI beast β that was probably a crazy journey!
This is a fine-tuned 30B multilingual instruction model. It shows strong performance on the EuroBlocks multilingual evaluation compared to similarly sized models, with notably concise outputs.
These benchmarks for now are basically smoke tests to verify I didn't create a disaster. Always run your own evaluations for your specific use case.
I'll definitely use it for LV language work as a Gemma 3 replacement; it seems more capable. It seems to have acquired proper alignment from broad training sets too, at least at a basic level.
I'll run and publish more tests, perhaps using quantization.
On top of this fine-tune, one can use a lighter touch to nudge the model toward the right predictions.
Any suggestions? Ideas? As for me, i now have new local LV LLM workhorse, hope others too will have this useful.
Hyperparameters:
LR: 2e-5, cosine schedule, 3% warmup
Batch: 24 effective
Seq length: 4096
Weight decay: 0.01, grad clip: 1.0
Steps: 7,514 (1 epoch)
Data (163M tokens, 181K examples):
HuggingFaceH4/ultrachat_200k (20% sampling) β 41.6K examples, 59.7M tokens
utter-project/EuroBlocks-SFT-Synthetic-1124 (20% sampling) β 85K examples, 58.6M tokens
galileo-ai/ragbench all 12 subsets (30% sampling) β 22K examples, 26.4M tokens
Subsets: covidqa, cuad, delucionqa, emanual, expertqa, finqa, hagrid, hotpotqa, msmarco, pubmedqa, tatqa, techqa
martinsu/latvian-wikipedia-qa-gemma3 (20% sampling, filtered) β 22.3K examples, 16.7M tokens
yahma/alpaca-cleaned (20% sampling) β 10.4K examples, 2.5M tokens
Language breakdown (163M tokens across 25 languages):
English: 117.7M (72%) - primary language
Latvian: 16.7M (10%) - European focus
Chinese: 10.1M (6%) - Asian coverage
Portuguese: 3.0M (2%) - Romance
Italian: 2.3M (1.4%) - Romance
Spanish: 2.1M (1.3%) - Romance
Hindi: 2.0M (1.2%)
French: 1.8M (1.1%) - Romance
German: 1.4M (0.8%) - Germanic
Dutch: 1.1M (0.7%) - Germanic
Plus 15 more: Japanese, Ukrainian, Swedish, Hungarian, Polish, Czech, Russian, Korean, Romanian, Finnish, Greek, Slovak, Norwegian, Slovenian, Estonian (4.9M combined, 3%)
Response-only training: Custom collator masks user/system messages, loss only on assistant responses.
ChatML Template Format.
Hi, Martins!
Thanks for the great work on the model, it is quite impressive! It is nice to see the base model being put to a real use by the community.
If you want to get in touch for some feedback and exchange experience in greater detail, please feel free to contact me: martins.kronis@tilde.lv
Best,
Martins from Tilde Open Team