Unsloth

Overview

Unsloth is an open-source tooling project focused on efficient LLM fine-tuning and quantization. In 2026 it became a prominent provider of GGUF quantizations for community open-weight models, known for publishing builds within the same week as the upstream model release.

Timeline

2026-05-12-AI-Digest — Unsloth released GGUF builds of Qwen3.6-27B and Qwen3.6-35B-A3B with the multi-token-prediction layer preserved, enabling speculative-style MTP inference. Users still need to build llama.cpp from the open MTP PR, but the ready-made GGUFs lower the barrier for the local-inference community to benchmark real-world MTP speed gains. The release continues Unsloth’s pattern of same-week GGUF delivery for new open-weight releases.

Key Developments

Qwen3.6 MTP GGUFs: First provider to ship GGUF builds of Qwen3.6-27B and Qwen3.6-35B-A3B with the multi-token-prediction layer intact. The ready-made GGUFs lower the barrier to benchmarking speculative-inference throughput gains, although users still need to build llama.cpp from the open MTP PR to run them.