algorithm - Fastest Cortex M0+ Thumb 32x32=64 multiplication function?
Does anyone have (or can anyone write) an optimal inline assembly function for the ARM Cortex-M0+ processor in Thumb mode that multiplies two 32-bit numbers and returns a 64-bit number?
As the M0+ does not have a long multiply instruction, the only way this can be accomplished is through primitive multiplication, and the compiler calls __aeabi_lmul,
which performs a 64x64=64 multiplication in 34 instructions. I'm hoping a faster algorithm exists, given that the inputs are only 32 bits.
So are we talking about unsigned or signed multiplication? If signed, you are doing 64x64=64 anyway, not 32x32=64. If unsigned, take the source code for the GCC library function and modify it, since you know the upper halves of the operands are zero.
Or look at Hacker's Delight (hackersdelight.org) and see if there is an algorithm that implements this faster than the GCC library.
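For the unsigned case, here is a minimal C sketch of the schoolbook approach: split each operand into 16-bit halves so that every partial product fits in the 32-bit result the M0+ multiply instruction provides. The function name `umul32x32_64` is made up for illustration and is not a library routine. This shows the arithmetic only. Whether it beats __aeabi_lmul depends on the code the compiler generates, which has not been measured here.

```c
#include <stdint.h>

/* Hypothetical helper: unsigned 32x32 -> 64 multiply built from four
 * 16x16 -> 32 partial products. No intermediate sum exceeds 32 bits. */
static uint64_t umul32x32_64(uint32_t a, uint32_t b)
{
    uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    /* Low x low: bottom 16 bits are final, top 16 bits carry upward. */
    uint32_t t  = a_lo * b_lo;
    uint32_t w3 = t & 0xFFFFu;
    uint32_t k  = t >> 16;

    /* High x low plus carry: at most 0xFFFEFFFF, so it cannot overflow. */
    t = a_hi * b_lo + k;
    uint32_t w2 = t & 0xFFFFu;
    uint32_t w1 = t >> 16;

    /* Low x high plus the middle digit: at most 0xFFFF0000. */
    t = a_lo * b_hi + w2;
    k = t >> 16;

    /* High word collects high x high and both carries. */
    uint32_t hi = a_hi * b_hi + w1 + k;
    /* Low word: low 16 bits of the middle sum, then the lowest digit. */
    uint32_t lo = (t << 16) + w3;

    return ((uint64_t)hi << 32) | lo;
}
```

Calling `umul32x32_64(0xFFFFFFFFu, 0xFFFFFFFFu)` exercises the largest operands and every carry path. A hand-written Thumb version would follow the same four multiplies and could keep the high and low words in separate registers rather than assembling a 64-bit value.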