Bitwise Operators in C – A Short Overview

The logical operators, logical AND (&&), logical OR (||) and logical NOT (!) are some of the most fundamental in the C programming language, and their use is typically demonstrated almost as soon as one has seen the if statement. Their use allows multiple statement evaluations within a single if, while, do … while or for loop, which is clearly very useful.

C has another set of operators which do Boolean evaluations and operations, but at a bitwise level. The so-called bitwise operators require two integer variables (i.e. variables of type char, short, int, long int, or in ISO C99 and the GNU extensions to C, long long int). Each bit in both of the integer variables is then checked, and the Boolean operation corresponding to the bitwise operator is then performed.

It is more difficult to explain why we might use these operations than it is to demonstrate them, so I will begin by demonstrating these bitwise operators. There are six bitwise operators: bitwise AND, bitwise OR, bitwise XOR (exclusive-or), the one’s complement operator (corresponding to bitwise NOT), left shift and right shift.

The Boolean AND operation is performed between two binary operands, whereby the values of both are checked. Each of these binary operands can have, as the name suggests, one of two values: 0 or 1 (or on and off, or in a computer, high voltage and low voltage). In the AND operation, if both binary operands are 1, the result is 1; otherwise, the result is 0. This is straightforward enough, and the results are illustrated here in a truth table using the symbol for the C bitwise AND operator, which is &, a single ampersand:

& |0 1
------
0 |0 0
  |
1 |0 1

This operation is performed in the C language like so:

#include <stdio.h>

int main(void)
{
    char i = 75;
    char j = 25;
    printf("i & j == %d\n", i & j); /* Using the & operator here */
    return 0;
}

This returns the following:

i & j == 9

This result will make more sense if we look at the binary calculations involved in this calculation. 75 in base 10 corresponds to 1001011 in base 2, while 25 in base 10 corresponds to 11001 in base 2. We will pad these values to eight bits to correspond to the eight-bit nature of a char variable in C. Having done this, we can then illustrate the bitwise operator using these binary operands:

  01001011
& 00011001
----------
  00001001
      *  *
in base 10: 8 + 1 = 9

Note that the bits in the result become 1 only if both of the corresponding bits in the operands are 1. This corresponds exactly to the definition of the Boolean AND operator that we wrote above in the truth table, and therefore, we get the expected answer.

Another of these operators is the bitwise OR operator. In this circumstance, the bits of the binary operands are checked, and if one or both of the values correspond to 1, the corresponding bit in the result will be 1. The truth table for this operation looks like this:

| |0 1
------
0 |0 1
  |
1 |1 1

The symbol for the bitwise OR is |, a single pipe symbol. The bitwise OR operator is demonstrated below:

#include <stdio.h>

int main(void)
{
    char i = 75;
    char j = 25;
    printf("i | j == %d\n", i | j); /* Using the OR operator here */
    return 0;
}

The result for this program is:

i | j == 91

Again, this answer can be better explained by showing the operation in binary.

  01001011
| 00011001
----------
  01011011
   * ** **
in base 10: 64 + 16 + 8 + 2 + 1 = 91

Again, we can see that the OR operator works as expected: The result has bits with value 1 in each of the places where either or both of the operands have bits with value 1. The next of the binary bitwise operators, the XOR (or exclusive-or) operator, works very similarly to the previous two operators, and especially like the OR operator. The difference between OR and XOR is that while the result for each bit operated upon by bitwise OR will equal 1 if one or both of the bits in the operands equals 1, the result of bitwise XOR will only equal 1 if one, but -not both- of the bits in the operands equals 1. The truth table for XOR looks like this:

^ | 0 1
-------
0 | 0 1
  |
1 | 1 0

We will use the same format that we have previously used to demonstrate the bitwise XOR operator.

#include <stdio.h>

int main(void)
{
char i = 75;
char j = 25;
printf(“i ^ j == %d\n”, i ^ j);
return 0;
}

The result from this program is:

i ^ j == 82

The binary calculations corresponding to this operation are as follows:

  01001011
^ 00011001
----------
  01010010
   * *  *
In base 10: 64 + 16 + 2 = 82

Once again, the operation has come up with the expected result. These three operators correspond to three of the standard Boolean operations. The fourth operator, the one’s complement operator, corresponds to the NOT Boolean operation. Unlike the previous three operators, the one’s complement operator is not a binary operator, but instead a unary operator, working on a single operand, and examines each bit in the operand. If the value of a bit in the operand is 1, the one’s complement operator changes it to a 0 and vice-versa. The truth table for this operation is as follows:

~ | 0 1
-------
R | 1 0

Using our previously defined values for the i and j variables, we can demonstrate the one’s complement operator below:

#include <stdio.h>

int main(void)
{
    char i = 75;
    char j = 25;
    printf("~ i == %d\n", ~i);
    printf("~ j == %d\n", ~j);
    return 0;
}

This program executes with the following results:

~ i == -76
~ j == -26

Before we do the binary proof for both of these values, note that the values are the negative value of the number minus one. This results from the format of the char variable in the C standard library of the computer this was written on; the char variable in this computer, an x86_64 AMD Athlon 64 X2 running on 64-bit Linux, is a signed integer value, and the most significant bit has been toggled with both of these operations, as will be shown below.

The reason that the values for both of these operations are equal to the negative value of the number -minus one- is because while this operation takes the one’s complement of the integral operand, the mathematics in C, as with most programming languages and computers, works on the two’s complement system, which can be explained by the use of a one’s complement operation on a positive variable and adding 1 to that value. (In fact, the x86 architecture has a machine instruction directly corresponding to two’s complement negation, and therefore, the explanation is merely metaphorical.)

Let’s do the binary operations on both of these values.

~ 01001011
----------
  10110100
  * ** *

~ 00011001
----------
  11100110
  *** **

As mentioned above, the most significant bit has been toggled in both of these cases. With a signed variable, this makes both variables correspond to negative values, while with unsigned values, the values would become the maximum values that the variables can have (in the case of an unsigned char, 255) minus the previous value. The details of the mathematics behind these operations is beyond the scope of this tutorial.

That leaves two bitwise operators to cover, and both of them are very similar. The left and right shift operators, which are represented by the respective symbols << and >>, are both binary operators, with the first operand being an integral variable or constant which is to be operated on, and the second operand is the number of bits to shift the first operand. Given the following binary value, 00011000, which corresponds to 24 in base 10, and which we will represent as a char variable, we can see the results of left-shifting and right-shifting by a number of places:

00011000 << 1 == 00110000
00011000 >> 1 == 00001100
00011000 << 2 == 01100000
00011000 >> 2 == 00000110
00011000 << 3 == 11000000
00011000 >> 3 == 00000011

Bit-shifting this value by up to three places in either direction causes no problems. In the case of left-shifting, the bits on the right which have been vacated have been replaced by zeroes; correspondingly, with right-shifting, the bits on the left that were vacated have been replaced by zeroes. The problem arises when we try to bit-shift this value by four or more places in either direction. Let’s see what happens if we try to do that:

00011000 << 4 == 0000000110000000 /* Promotion to short int */
00011000 >> 4 == 00000001 /* A bit has moved out of range */

Our bit-shifting operation has pushed a bit out of range in both occasions, replacing the vacated bit with a zero. In the case of the left-shifting operation, the result is correct, as the value is promoted to a short integer, filling 16 bits rather than 8. The problem is with the right shift, where the value is shifted out of range. Performing another right-shift on the second result by one more place will move the remaining bit out of range, giving a result of zero, as illustrated below:

10000000 << 1 == 0000001100000000 /* This is OK */
00000001 >> 1 == 00000000 /* Zero! Probably not what we want! */

This result is most likely not what we want, but it is defined behaviour for the shift operators in C. There are further rules for the operation of the shift operators defined in the ISO C standard, but instead of spending too long discussing this one set of operators, let’s write a program to show the results of the operations we’ve just shown above:

#include <stdio.h>
int main(void)
{
    unsigned char k = 24;
    int count;
    for (count = 1; count <= 5; count++) {
        printf("24 << %d == %d\n", count, k << count);
        printf("24 >> %d == %d\n", count, k >> count);
    }
    return 0;
}

The result from this program is the following:

24 << 1 == 48
24 >> 1 == 12
24 << 2 == 96
24 >> 2 == 6
24 << 3 == 192
24 >> 3 == 3
24 << 4 == 384
24 >> 4 == 1
24 << 5 == 768
24 >> 5 == 0

Notice that each time we perform a shift operation, the value of the number is equivalent to a multiplication or division by two. The question is why we would use these operators rather than simply multiplying or dividing a value by two. The answer is that unlike a multiplication or division procedure, which is often very complex for a machine and requires several machine instructions to perform, a processor has a shifting unit built into the machine, making it quicker and less complex to use a single machine instruction for shifting rather than five or six for multiplying or dividing.

Similarly, we can answer the question posed above about why we would use the other bitwise operators by noting their potential efficiency. In fact, these operations are typically used for programming device drivers, microcontrollers and other systems where low-level access is either necessary or very desirable. As well as this, bitwise operators are often used to simulate several Boolean “flags” inside a single integral variable, creating up to eight “flags”, for instance, in a single unsigned char variable.

The use of these techniques creates highly idiomatic code which is often tied to a limited set of hardware, but which can run very efficiently. These techniques are therefore often foregone for programs which are not heavily dependent on exact timing or efficient code, such as applications for home computers, and it is difficult to demonstrate a non-trivial application of these operations.

The following piece of source code from Quake III Arena demonstrates one of the idiomatic pieces of code where bitwise operations can really come in handy. The function described here is an application of the so-called “fast inverse square root” procedure, which manages to return a relatively accurate approximation to the inverse of a square root several times faster than previously-defined implementations, and was used for generating light rays in the game.

/* Original source: id Tech 3 game engine */
/* Source code released under the GNU General Public License in 2005. */

float Q_rsqrt( float number )
{
long i;
float x2, y;
const float threehalfs = 1.5F;

x2 = number * 0.5F;
y = number;
i = * ( long * ) &y; // evil floating point bit level hacking
i = 0x5f3759df - ( i >> 1 ); // what the fuck?
y = * ( float * ) &i;
y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
// y = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration,
// this can be removed
return y;
}

The bitwise right-shift operation is demonstrated in the line of code aptly commented with the words, “what the fuck?”. In order to actually understand the code, one has to know various principles of mathematics, the IEEE 754 floating point specification, et cetera, which are difficult to understand and even more difficult to explain. All that really needs to be understood is that the number in y is arbitrarily treated as an integer, subtracted from the “magic number” in the “what the fuck?” line and then changed back into a floating point number. This could hardly be described as an intuitive operation, but it worked.

While that example demonstrated what could be done usefully with bitwise operations, it is not a terribly illuminating example unless you understand all of the principles illustrated by it. In an attempt to demonstrate some sort of understandable use of bitwise operators, we can look at the following example, which is not incredibly useful, but which can be explained more easily.

As we have previously established, text is generally stored in a computer in the ASCII, EBCDIC or Unicode systems, which are ways of encoding a character in a binary format so that a computer can read and store them. Knowing this, we can extract the individual bits from a character and print them to the screen.

#include <stdio.h>
#define CHAR_BITS 8

/* Function prototype */
int expt(int n, int power);

int main(void)
{
    char bits[CHAR_BITS]; /* Used to store the values of each bit */
    char sample='T'; /* A sample ASCII character */
    int i;

    for (i = 0; i < CHAR_BITS; i++) {
        /* Define each element of bits[] to be the corresponding bit */
        bits[i] = (sample & expt(2, 7-i)) >> 7 - i;
    }

    for (i = 0; i < CHAR_BITS; i++) {
        /* Print each bit */
        printf("%d ", bits[i]);
    }
    printf("\n");
    return 0;
}

int expt(int n, int power)
{
    int result = 1;
    for (power; power > 0; power--) {
        result *= n;
    }
    return result;
}

The result of this operation is:

0 1 0 1 0 1 0 0

Converting this value into base 10, we get 64 + 16 + 4 = 84, which is 65 + 19. ‘T’ is the twentieth letter in the alphabet, and as ‘A’ is defined in ASCII as 65 in base 10, we get the expected answer.

The program works in the following manner: An array of char variables is defined, followed by a sample character (in this case, ‘T’). A for loop counts through each variable in the array, assigning a value using the bitwise AND operator which will be equal to the corresponding bit in the ASCII binary representation for the character. The resultant number is then bit-shifted a number of places so that the answer for that element in the character array will be 1 if the bit is present in the ASCII code for the sample character, and 0 if it is not.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: