Problem of sign extension

Tags: / c /

About sign extension, with an example C snippet.

Consider the following program:

#include <stdio.h>

int main(void)
{
    char c = 255;   /* plain char is signed on many platforms, so this may store -1 */
    int a = (int)(unsigned char)c;
    int b = (int)c;
    printf("a = %d\n", a);
    printf("b = %d\n", b);
    return 0;
}

On a platform where plain char is signed (for example, GCC on x86), the output is interesting:

a = 255
b = -1

This is because of sign extension.

In b = (int)c, c was assigned 255. As char is (usually) 8 bits wide, the binary representation in memory would be:

1111 1111

When c is cast to int (which means signed int), this 1111 1111 is treated as a signed value, because plain char is a signed type here (whether char is signed is implementation-defined). Like this:

1 111 1111

with the first bit denoting the sign of the number.

Since the sign bit is 1 here, c is assumed to be negative.

sizeof(int) is greater than sizeof(char), so there are extra high-order bits to fill in the int representation of c.

The conversion has to preserve the sign as well as the value of the number. So the lower bits of c copied into b are unchanged, and the ‘vacant’ upper bits of the int representation are filled with the value of the sign bit.

i.e., the binary representation of b would be (assuming sizeof(int) is 4):

1111 1111 1111 1111 1111 1111 1111 1111

which is the binary for -1 (in 2’s complement form) when considered as a signed int.

Whereas in a = (int)(unsigned char)c;, c is first cast to unsigned char. Its top bit is therefore no longer treated as a sign bit when the value is cast to a signed int afterwards, and hence sign extension won’t be done.

i.e., the binary representation of a would be

0000 0000 0000 0000 0000 0000 1111 1111

as the sign bit of c won’t be copied to the ‘vacant’ bits of a.

See: https://en.wikipedia.org/wiki/Sign_extension