Post
72
I dropped a new scheduler I created last week, called the Lucky Pick Scheduler, without much of an explanation of what it was or how it worked. It was just a Modal-ready app that anyone could launch and troubleshoot their way around.
I've decided I'm going to enter it into the AMD hackathon. Today I started putting together a GitHub repo with a few extra additions to the scheduler itself.
Essentially it's a training scheduler that randomly drops layers/heads/channels every ~50 steps during fine-tuning, holds the topology frozen, then reshuffles. In theory the model has to build distributed representations because it never trains through the same compute path for long.
And with less gradient memory, bigger models are able to fit on smaller hardware.
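The core loop described above can be sketched in a few lines. This is an illustrative mockup, not the repo's actual API: the class name, parameters, and the idea of representing the dropped units as a set of indices are all my assumptions; the real scheduler operates on layers/heads/channels of a live model.

```python
import random

class LuckyPickSketch:
    """Minimal sketch of the scheduling logic: every `reshuffle_every`
    steps, randomly pick a new subset of units (layers/heads/channels)
    to drop, and keep that topology frozen until the next reshuffle."""

    def __init__(self, num_units, drop_fraction=0.25, reshuffle_every=50, seed=0):
        self.num_units = num_units
        self.num_dropped = int(num_units * drop_fraction)
        self.reshuffle_every = reshuffle_every
        self.rng = random.Random(seed)
        self.dropped = set()  # indices of currently dropped units

    def step(self, global_step):
        # Reshuffle only at window boundaries; in between, the
        # topology stays frozen so the model trains through a
        # stable compute path for ~reshuffle_every steps.
        if global_step % self.reshuffle_every == 0:
            self.dropped = set(
                self.rng.sample(range(self.num_units), self.num_dropped)
            )
        return self.dropped

sched = LuckyPickSketch(num_units=12, drop_fraction=0.25, reshuffle_every=50)
mask_step_0 = sched.step(0)    # new random topology
mask_step_10 = sched.step(10)  # same frozen topology within the window
mask_step_50 = sched.step(50)  # window boundary: reshuffled
```

In a real training loop, the returned index set would be used to skip those layers' forward/backward passes (or zero those heads/channels), which is where the gradient-memory savings come from.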
It's now close to fully capable of automatically configuring itself to any language model. I've tested it on:
-Qwen-2.5-3b-Instruct
-Falcon-E-3B-Instruct
-SmolLM2-360M
-Ministral-3-3B-Instruct-2512
-Doge-320M
-Llama-3.2-3b
-Gemma-4-e4b
-Phi-4-mini
-OLMo-2-0425-1B
-Phi-tiny-MoE-instruct
Feel free to check it out on GitHub: https://github.com/JuiceB0xC0de/lucky-pick-scheduler.git