🤗 HuggingFace · Significant · Lewis Tunstall
Qwen3-35B-A3B: Running a 35B Parameter MoE Model on Local Hardware
Qwen3-35B-A3B is a new model in Alibaba's Qwen3 series. This tweet announces that it can be run locally using llama.cpp (the popular inference engine built around the GGUF format) combined with Unsloth's 4-bit quantization, enabling a…
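
As a rough illustration of why 4-bit quantization matters for local hardware, the back-of-the-envelope sketch below estimates the memory needed for the weights alone. The 35e9 parameter count is read off the "35B" in the model name, and `weight_memory_gb` is a hypothetical helper written for this example, not part of llama.cpp or Unsloth. Real 4-bit GGUF files typically keep some tensors at higher precision and store per-block scales, so actual sizes are likely somewhat larger, and the KV cache and activations are not counted here.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes.

    Hypothetical helper for illustration: ignores quantization metadata,
    the KV cache, and activation memory.
    """
    bytes_total = n_params * bits_per_weight / 8  # 8 bits per byte
    return bytes_total / 1e9


# Assumption: total parameter count taken from the "35B" in the model name.
TOTAL_PARAMS = 35e9

if __name__ == "__main__":
    for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
        size = weight_memory_gb(TOTAL_PARAMS, bits)
        print(f"{label}: ~{size:.1f} GB of weights")
```

Under these assumptions the weights drop from roughly 70 GB at 16-bit to roughly 17.5 GB at a uniform 4 bits per weight, which is the difference between needing multi-GPU server hardware and fitting in the memory of a single high-end consumer machine.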