Gauche Devlog

< Better test failure report | Running gosh without installing >

2021/10/24

Is this an Undefined Behavior?

Automated tests of Gauche HEAD on Windows platform started failing since several days ago. The log showed SHA1 digest result didn't match. It's weird, for I haven't touched that part of code for long time.

I isolated the reproducible condition. It happens with the fairly new gcc (11.2.0) with -O2. It doesn't exhibit without optimization, nor with gcc 10.2.0 or other previous versions of gcc I have.

The problematic code is Aaron D. Gifford's SHA implementation sha2.c ( http://www.aarongifford.com/computers/sha.html ). It was last updated in January 2004, so it's pretty old, but I think it's still widely used.

I narrowed down the problem to around here:

        /* Set the bit count: */
#if BYTE_ORDER == LITTLE_ENDIAN
        /* Convert FROM host byte order */
        REVERSE64(context->s1.bitcount,context->s1.bitcount);
#endif
        void *buf56 = &context->s1.buffer[56];
        *(sha_word64*)buf56 = context->s1.bitcount;

        /* Final transform: */
        SHA1_Internal_Transform(context, (sha_word32*)context->s1.buffer);

In our case, BYTE_ORDER is LITTLE_ENDIAN. REVERSE64 is a macro to swap the byte order of a 64bit word. context->s1.bitcount is uint64_t, and context->s1.buffer is an array of unsigned chars. What it does is to store 64bit-word of bitcount into the buffer from 56th octet in the network byte order, and calls SHA1_Internal_Transform.

It compiles to this code with optimization:

   25ca75e9d:   48 8b 53 18             mov    0x18(%rbx),%rdx
   25ca75ea1:   48 0f ca                bswap  %rdx
   25ca75ea4:   48 89 53 18             mov    %rdx,0x18(%rbx)
   25ca75ea8:   48 89 d9                mov    %rbx,%rcx
   25ca75eab:   4c 89 e2                mov    %r12,%rdx
   25ca75eae:   e8 8d fa ff ff          call   25ca75940 <SHA1_Internal_Transform>

Here, %rbx contains the pointer to context, and %r12 to context->s1.buffer. The first three instructions swap the 64bit word. (By the way, REVERSE64 macro is written with shifts and bitmasks. Gcc cleverly figures out its intent and replaces the whole expression by a bswap instrucion.) The next three instruction is the calling sequence of SHA1_Internal_Transform.

Wait. There're no instructions emitted to store bitcount into *buf56. I checked the assembly after this but there're no instructions for the store either.

If I insert a dummy external function call before SHA1_Internal_Transform like this:

        /* Set the bit count: */
#if BYTE_ORDER == LITTLE_ENDIAN
        /* Convert FROM host byte order */
        REVERSE64(context->s1.bitcount,context->s1.bitcount);
#endif
        void *buf56 = &context->s1.buffer[56];
        *(sha_word64*)buf56 = context->s1.bitcount;

        puts("foo");

        /* Final transform: */
        SHA1_Internal_Transform(context, (sha_word32*)context->s1.buffer);

Then the storing to *buf56 appears (mov %rdx, 0x58(%rbx)):

   25ca75e9d:   48 8b 53 18             mov    0x18(%rbx),%rdx
   25ca75ea1:   48 0f ca                bswap  %rdx
   25ca75ea4:   48 89 53 18             mov    %rdx,0x18(%rbx)
   25ca75ea8:   48 8d 0d 8f c7 00 00    lea    0xc78f(%rip),%rcx        # 25ca8263e <.rdata+0x9e>
   25ca75eaf:   48 89 53 58             mov    %rdx,0x58(%rbx)
   25ca75eb3:   e8 60 2e 00 00          call   25ca78d18 <puts>
   25ca75eb8:   4c 89 e2                mov    %r12,%rdx
   25ca75ebb:   48 89 d9                mov    %rbx,%rcx
   25ca75ebe:   e8 7d fa ff ff          call   25ca75940 <SHA1_Internal_Transform>

Now, accessing type punned pointer can break the strict aliasing rule. The gcc might have figured the storing into *buf56 had nothing to do with SHA1_Internal_Transform. But I feel there still needs to be a leap that it completely eliminates the store instruction.

Does *(sha_word64*)buf56 = context->s1.bitcount triggers Undefined Behavior? That's why gcc is entitled to remove that code?

Tags: gcc, Optimization, UndefinedBehavior

Past comment(s)

Gambiteer (2021/10/25 22:25:35):

It may violate strict-aliasing. Try adding -fno-strict-aliasing to the compilation. If you want to do something like this, use a union.

shiro (2021/11/09 07:37:58):

@Gambiteer Right, it is a violation of strict aliasing. I was not sure if it immediately triggers UB, but it is indeed the case. ( https://www.jpcert.or.jp/sc-rules/c-exp39-c.html )

Post a comment

Name: