Neuron-by-Neuron Quantization for Efficient Low-Bit QNN Training

by Limonova, Elena; Nikolaev, Dmitry; Sher, Artem; Trusov, Anton; Arlazarov, Vladimir V.

in Accuracy / Datasets / Efficiency / Field programmable gate arrays / Internet of Things / layer-by-layer / Linear programming / low-bit quantization / Measurement / Methods / Neural networks / neuron-by-neuron training / Neurons / quantized neural network / Training

2023

Journal Article

Overview

Quantized neural networks (QNNs) are widely used to achieve computationally efficient solutions to recognition problems. Eight-bit QNNs generally have almost the same accuracy as full-precision networks while running several times faster. However, networks with lower bit widths are less accurate than their full-precision counterparts. To address this, a number of quantization-aware training (QAT) approaches have been proposed. In this paper, we study QAT approaches for two- to eight-bit linear quantization schemes and propose a new combined QAT approach: neuron-by-neuron quantization with straight-through estimator (STE) gradient forwarding. It is suitable for bit widths from two to eight and eliminates significant accuracy drops during training, which results in better accuracy of the final QNN. We experimentally evaluate our approach on CIFAR-10 and ImageNet classification and show that it is comparable to other approaches for four to eight bits and outperforms some of them for two to three bits while being easier to implement. For example, with three-bit quantization on the CIFAR-10 dataset, the proposed approach reaches 73.2% accuracy, while the direct and layer-by-layer baselines reach 71.4% and 67.2%, respectively. For two-bit quantization of ResNet18 on the ImageNet dataset, our approach reaches 63.69% accuracy and the direct baseline 61.55%.
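
The overview names three ingredients: linear quantization, STE gradient forwarding, and quantizing a layer one neuron at a time. The NumPy sketch below illustrates each in its simplest form. It is not the authors' implementation: the min-max scaling rule, the clipped STE mask, the treatment of one weight-matrix row as one neuron, and the function names `linear_quantize`, `ste_grad`, and `quantize_first_neurons` are all assumptions made for illustration.

```python
import numpy as np

def linear_quantize(x, bits):
    """Min-max linear quantization to 2**bits levels (assumed scheme).

    Returns the dequantized values and the integer codes.
    """
    levels = 2 ** bits - 1
    lo, hi = float(np.min(x)), float(np.max(x))
    # Guard against a constant input, where hi == lo.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = np.clip(np.round((x - lo) / scale), 0, levels).astype(np.int64)
    return codes * scale + lo, codes

def ste_grad(grad_out, x, lo, hi):
    """Straight-through estimator: treat rounding as the identity in the
    backward pass, and zero the gradient outside the clipping range."""
    return grad_out * ((x >= lo) & (x <= hi))

def quantize_first_neurons(W, n_quantized, bits):
    """Quantize only the first n_quantized rows (one row = one neuron,
    an assumption); the remaining rows stay in full precision."""
    Wq = W.copy()
    for i in range(n_quantized):
        Wq[i], _ = linear_quantize(W[i], bits)
    return Wq
```

In a training loop, one plausible use is to call `quantize_first_neurons` with a growing `n_quantized` and fine-tune the still-unquantized rows between calls, so that the network never has a whole layer quantized in a single step.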

Publisher

MDPI AG

Subject

Accuracy

/ Datasets

/ Efficiency

/ Field programmable gate arrays

/ Internet of Things

/ layer-by-layer

/ Linear programming