2021/10/24
Is this an Undefined Behavior?
Automated tests of Gauche HEAD on Windows platform started failing since several days ago. The log showed SHA1 digest result didn't match. It's weird, for I haven't touched that part of code for long time.
I isolated the reproducible condition. It happens with the fairly
new gcc (11.2.0) with -O2
. It doesn't exhibit without optimization,
nor with gcc 10.2.0 or other previous versions of gcc I have.
The problematic code is Aaron D. Gifford's SHA implementation sha2.c ( http://www.aarongifford.com/computers/sha.html ). It was last updated in January 2004, so it's pretty old, but I think it's still widely used.
I narrowed down the problem to around here:
/* Set the bit count: */ #if BYTE_ORDER == LITTLE_ENDIAN /* Convert FROM host byte order */ REVERSE64(context->s1.bitcount,context->s1.bitcount); #endif void *buf56 = &context->s1.buffer[56]; *(sha_word64*)buf56 = context->s1.bitcount; /* Final transform: */ SHA1_Internal_Transform(context, (sha_word32*)context->s1.buffer);
In our case, BYTE_ORDER
is LITTLE_ENDIAN
. REVERSE64
is a macro to swap the byte order of a 64bit word. context->s1.bitcount
is uint64_t
, and context->s1.buffer
is an array of unsigned chars.
What it does is to store 64bit-word of bitcount
into the buffer
from 56th octet in the network byte order, and calls
SHA1_Internal_Transform
.
It compiles to this code with optimization:
25ca75e9d: 48 8b 53 18 mov 0x18(%rbx),%rdx 25ca75ea1: 48 0f ca bswap %rdx 25ca75ea4: 48 89 53 18 mov %rdx,0x18(%rbx) 25ca75ea8: 48 89 d9 mov %rbx,%rcx 25ca75eab: 4c 89 e2 mov %r12,%rdx 25ca75eae: e8 8d fa ff ff call 25ca75940 <SHA1_Internal_Transform>
Here, %rbx
contains the pointer to context
, and %r12
to
context->s1.buffer
. The first three instructions swap the
64bit word. (By the way, REVERSE64
macro is written with shifts and
bitmasks. Gcc cleverly figures out its intent and
replaces the whole expression by a bswap
instrucion.)
The next three instruction is the calling sequence of
SHA1_Internal_Transform
.
Wait. There're no instructions emitted to store bitcount
into *buf56
. I checked the assembly after this but there're no
instructions for the store either.
If I insert a dummy external function call before
SHA1_Internal_Transform
like this:
/* Set the bit count: */ #if BYTE_ORDER == LITTLE_ENDIAN /* Convert FROM host byte order */ REVERSE64(context->s1.bitcount,context->s1.bitcount); #endif void *buf56 = &context->s1.buffer[56]; *(sha_word64*)buf56 = context->s1.bitcount; puts("foo"); /* Final transform: */ SHA1_Internal_Transform(context, (sha_word32*)context->s1.buffer);
Then the storing to *buf56
appears (mov %rdx, 0x58(%rbx)
):
25ca75e9d: 48 8b 53 18 mov 0x18(%rbx),%rdx 25ca75ea1: 48 0f ca bswap %rdx 25ca75ea4: 48 89 53 18 mov %rdx,0x18(%rbx) 25ca75ea8: 48 8d 0d 8f c7 00 00 lea 0xc78f(%rip),%rcx # 25ca8263e <.rdata+0x9e> 25ca75eaf: 48 89 53 58 mov %rdx,0x58(%rbx) 25ca75eb3: e8 60 2e 00 00 call 25ca78d18 <puts> 25ca75eb8: 4c 89 e2 mov %r12,%rdx 25ca75ebb: 48 89 d9 mov %rbx,%rcx 25ca75ebe: e8 7d fa ff ff call 25ca75940 <SHA1_Internal_Transform>
Now, accessing type punned pointer can break the strict aliasing rule.
The gcc might have figured the storing
into *buf56
had nothing to do with SHA1_Internal_Transform
.
But I feel there still needs to be
a leap that it completely eliminates the store instruction.
Does *(sha_word64*)buf56 = context->s1.bitcount
triggers
Undefined Behavior? That's why gcc is entitled to remove that code?
Tags: gcc, Optimization, UndefinedBehavior
Gambiteer (2021/10/25 22:25:35):
shiro (2021/11/09 07:37:58):