quantization - Creative AI News

News May 28, 2026

NVIDIA ships an NVFP4 4-bit quantized build of Qwen3.6-35B-A3B, cutting GPU memory use 3x with less than 1% accuracy loss across eight benchmarks.
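For a rough sense of where a figure like 3x can come from, here is a back-of-envelope sketch of weight storage only. Everything in it is an assumption rather than a detail from the announcement: a 16-bit baseline, roughly 35 billion parameters (read off the model name), and a 4-bit layout that stores one 8-bit scale per block of 16 values, as NVFP4 is commonly described. The helper name `weight_gib` is hypothetical.

```python
# Back-of-envelope weight-memory estimate. All figures are illustrative
# assumptions, not numbers taken from NVIDIA's release.

def weight_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB for a given average bit width."""
    return num_params * bits_per_weight / 8 / 2**30

NUM_PARAMS = 35e9   # assumed from the "35B" in the model name
BF16_BITS = 16.0    # assumed 16-bit baseline

# Assumed 4-bit block format: 4 bits per value plus one 8-bit scale
# shared by each block of 16 values, so 4 + 8/16 = 4.5 bits on average.
FOUR_BIT_BITS = 4.0 + 8.0 / 16.0

baseline = weight_gib(NUM_PARAMS, BF16_BITS)
quantized = weight_gib(NUM_PARAMS, FOUR_BIT_BITS)
ratio = baseline / quantized

print(f"16-bit weights: {baseline:.1f} GiB")
print(f"4-bit weights:  {quantized:.1f} GiB")
print(f"reduction:      {ratio:.2f}x")
```

Under these assumptions the weight-only ceiling comes out a little above 3.5x. A reported 3x would be consistent with some tensors staying in higher precision, though the summary does not say how the figure was measured.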