Benchmarking and Analysing the NIST PQC Finalist Lattice-Based Signature Schemes on the ARM Cortex M7


March 31, 2022







James Howe, Bas Westerbaan


This paper presents an analysis of the two lattice-based digital signature schemes, Dilithium and Falcon, which have been chosen by NIST for standardisation, on the ARM Cortex M7 using the STM32F767ZI NUCLEO-144 development board. This research is motivated by the ARM Cortex M7 device being the only processor in the Cortex-M family to offer a double precision (i.e., 64-bit) floating-point unit, making Falcon's implementations, requiring 53 bits of double precision, able to fully run native floating-point operations without any emulation. When benchmarking natively, Falcon shows significant speed-ups between 6.2-8.3x in clock cycles, 6.2-11.8x in runtime, and Dilithium does not show much improvement other than those gained by the slightly faster processor. We then present profiling results of the two schemes on the ARM Cortex M7 to show their respective bottlenecks and operations where the improvements are and can be made. This demonstrates, for example, that some operations in Falcon's procedures observe speed-ups by an order of magnitude. Finally, since Falcon's use of floating points is so rare in cryptography, we test the native FPU instructions on 4 different STM32 development boards with the ARM Cortex M7 and also a Raspberry Pi 3 which is used in some of Falcon's official benchmarking results. We find constant-time irregularities in all of these devices, which makes Falcon insecure on these devices for applications where signature generation can be timed by an attacker.

Download Paper
Back to all publications