etemiz/Ostrich-32B-Qwen3-260303-GGUF
Started fine-tuning Qwen 3.5 27B. Soon, high-density intelligence meets human alignment!
I don't do refusal tests, but I may in the future.
Somebody should do an abliteration leaderboard!
Did that.
With my own conversion to GGUF: 59%
Another GGUF (https://huggingface.co/llmfan46/Qwen3.5-27B-heretic-v2-GGUF/blob/main/Qwen3.5-27B-heretic-v2-Q4_K_M.gguf): 60%
The question is, does huihui's version become less intelligent after that big of an abliteration?
@huihui-ai well done!
thank you <3
2026 experimental version
https://aha-leaderboard.shakespeare.wtf/2026
I bet RL can generate humility by accident, given enough trials. With humility, the model tool-calls for more info, trusts this new information, and reorganizes its reply. This of course involves RAG or another aligned LLM.
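A toy sketch of what that reward shaping could look like; `judge` (another aligned LLM acting as a verifier) and the tool-use inputs are hypothetical stand-ins, not any real trainer's API:

```python
def humility_reward(reply: str, used_tool: bool, evidence: str | None, judge) -> float:
    """Reward tool use that defers to retrieved evidence; penalize
    confident answers produced without checking anything."""
    if used_tool and evidence is not None:
        # Reply was reorganized around retrieved info: reward it if the
        # judge LLM finds it consistent with that evidence.
        return 1.0 if judge(reply, evidence) else -0.5
    if "not enough info" in reply.lower():
        return 0.2   # small reward for admitting uncertainty
    return -1.0      # confident, ungrounded answer
```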
Thanks, this is insightful.
I liked the "rewrite the claim in 5 different ways". Can be really useful for RAG scenarios.
I liked the idea of detecting hallucination using another aligned LLM, though I don't know how effective it will be.
"not enough info" is probably the hardest. Most LLMs today are trained to say anything rather than being humble, as you said.
- You are a highly skilled academic analyst.
- Analyze this text and find 3 bold claims that could cause controversy and division in public. List the claims and also state why they are debatable. Give numbers to the claims.
- Convert these claims into binary questions (that could be answered by yes/no or this/that).
- Now put these questions in JSON format. Please also add the info about which of the answers concurs with the original text, and the question number (a sketch of the expected shape follows after this list).
- Write some supporting arguments for the 1st question, with respect to the original text, concurring with and confirming the original text.
There must be about 300 words. You should not mention the text; write it as if you are the one answering the question.
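Here is a guess at the JSON shape step 4 might produce; the field names are my own assumptions, since the prompt leaves the exact schema to the model:

```python
import json

questions = [
    {
        "question_number": 1,
        "question": "Is X better than Y?",     # binary: yes/no or this/that
        "answer_concurring_with_text": "yes",  # the answer that matches the text
    },
]
print(json.dumps(questions, indent=2))
```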
Thanks for the tips.

Is giving different answers for different lengths a bad "behavior", and is it related to SFT rather than CPT?
Also, should I give two sets of queries and answers in the context (one short, one long) to make it learn that when the length changes, the answer should stay parallel? (See the sketch after these questions.)
This could be RL too; e.g., the bad behavior of non-integrity could be penalized...
Is it normal practice to do 2 rounds of questions in SFT or RL?
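For the two-round idea, a multi-turn SFT sample in the common messages-list convention could look like this; the exact schema depends on the trainer, so treat the keys as an assumption:

```python
sample = {
    "messages": [
        {"role": "user", "content": "Summarize the claim in one sentence."},
        {"role": "assistant", "content": "X causes Y."},  # short answer
        {"role": "user", "content": "Now expand that into a full paragraph."},
        # The long answer should stay parallel to the short one, not contradict it.
        {"role": "assistant", "content": "X causes Y, because ..."},
    ]
}
```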
Thanks for the input, but all of these happened at temp = 0.0.
My guess is that, since I mostly use datasets generated from voice, the models are one thing when they are talking like a human in day-to-day life, but the complete opposite when they are feeling like a scientist producing a long text.