c++ - Properly specify constraint for rotate?


I'm investigating potential speedups with respect to a constant-time rotate that does not violate the standards.

A rotate on x86/x64 has the following properties. For simplicity, I'm going to discuss rotating a byte (so we don't get tangled in immediate-8 versus 16, 32 or 64):

  • the "value" can be in a register or in memory
  • the "count" can be in a register or an immediate

The processor expects the count in CL when using a register. The processor performs the rotate after masking off all but the lower 5 bits of the count.
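For comparison, here is a minimal sketch of the same rotate written in plain C++, with the count reduced before shifting so that no shift amount is ever out of range. The name rotl8_portable is hypothetical and not part of the original code. It relies on the fact that for a byte rotate the result depends only on the count modulo 8, which is consistent with the hardware masking described above.

```cpp
#include <cstdint>

// Hypothetical helper (not from the original post): rotate an 8-bit value
// left by any count without an out-of-range shift.
// - (y & 7u) reduces the count modulo 8.
// - (8u - (y & 7u)) & 7u keeps the right-shift amount in [0, 7], so a
//   count of 0 (or any multiple of 8) is handled without shifting by 8.
constexpr std::uint8_t rotl8_portable(std::uint8_t x, unsigned int y) {
    return static_cast<std::uint8_t>(
        (x << (y & 7u)) | (x >> ((8u - (y & 7u)) & 7u)));
}

// Compile-time sanity checks for the sketch.
static_assert(rotl8_portable(0x81, 1) == 0x03, "high bit wraps around to bit 0");
static_assert(rotl8_portable(0xA5, 8) == 0xA5, "a full rotation is the identity");
```

Whether a compiler turns this pattern into a single rol instruction depends on the compiler and optimization level, so it is worth checking the generated assembly before relying on it.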

Below, the value is x, and the count is y.

    template<> inline byte rotleft<byte>(byte x, unsigned int y)
    {
        __asm__ __volatile__("rolb %b1, %0" : "=mq" (x) : "ci" (y), "0" (x));
        return x;
    }

Since x is both read and written, I think I should be using + somewhere. But I can't get the assembler to take it.

My question is, are the constraints represented correctly?


Edit: based on Jester's feedback, the function was changed to:

    template<> inline byte rotleft<byte>(byte x, unsigned int y)
    {
        __asm__ __volatile__("rolb %b1, %0" : "+mq" (x) : "ci" (y));
        return x;
    }


You should use the correctly sized type for the operands rather than trying to force the register to the correct size with an operand modifier. In this case it also truncates an immediate operand to the correct size if it's too big. Also, as David Wohlferd said, you don't want to make the asm statement volatile, since that prevents the optimizer from removing it if it's unused.

    template<> inline byte rotleft<byte>(byte x, unsigned int y)
    {
        asm ("rolb %1, %0" : "+mq" (x) : "ci" ((byte)y));
        return x;
    }
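To try the specialization on its own, it needs a definition of byte and a primary template to specialize. Neither appears in the post, so the sketch below assumes byte is an 8-bit unsigned type and declares a bare primary template. The preprocessor guard and the shift-based fallback are also additions for illustration, so the snippet still compiles on targets without x86 inline assembly.

```cpp
// Assumption: `byte` in the post is an 8-bit unsigned type.
typedef unsigned char byte;

// Assumed primary template, declared only so the specialization compiles.
template <class T> T rotleft(T x, unsigned int y);

template <> inline byte rotleft<byte>(byte x, unsigned int y)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    // "+mq": x is read and written, in memory or a byte-addressable register.
    // "ci":  the count is either in CL or an immediate; the (byte) cast gives
    //        the operand the right size, so no %b modifier is needed.
    // No volatile: the result is the only effect, so the optimizer may drop
    // the statement when x is unused.
    asm ("rolb %1, %0" : "+mq" (x) : "ci" ((byte)y));
    return x;
#else
    // Illustrative fallback for other targets: rotate using masked shifts.
    return (byte)((x << (y & 7u)) | (x >> ((8u - (y & 7u)) & 7u)));
#endif
}
```

For example, rotleft<byte>(0x81, 1) moves the high bit around to bit 0 and yields 0x03.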
