Explanation of the GPTQ Algorithm

While researching optimisation techniques for reducing the computational load and memory footprint of ML models, I came across many incorrect or confusing explanations of the GPTQ algorithm. In this blog post, I hope to explain it more clearly.

I’ll write this up whenever I’m not overloaded with uni work!