Binary Arithmetic (Integers)
Addition of binary numbers is very straightforward.
0 0 1 1 10 11 10 11 0 1 0 1 1 1 10 10 0 1 1 10 11 100 100 101
For computer arithmetic, we need to know the word length. If we have an 8-bit word then our answer must be held in 8 bits.
00001110 00001011 00011001
10101010 11111111 10101001the carry from the last addition is lost. The answer is WRONG! That is because 8 bits is not large enough to represent the sum. This is an overflow problem.
We can’t do subtraction until we have a representation for negative numbers.
There are three ways to get a negative expressed in binary number systems – sign magnitude, excess notation, and two’s complement. First we will look at sign magnitude, the simplest of the three systems.
Sign magnitude uses the leftmost bit for the sign – 0 is positive and 1 is negative. For simplicity, let’s assume a 3-bit word.
The big problem here is two representations for zero; one is positive, the other is negative. If we could ignore that, we still have trouble with arithmetic operations.
2 010 +(-1) 101 1 111 = (-3) not 1
This means we need a more complex algorithm for our representation.
Excess notation uses subtraction to determine the value of the integers. Again, let’s use a 3-bit word. What is the largest power of two expressable with 3 bits? 22 = 4. So we are going to use excess-4 notation. Subtract 4 from each number to get the new value.
This looks promising. There is only 1 zero. But does the arithmetic work correctly?
2 110 +(-1) 011 1 001 = (-3) not 1
That leaves us with two’s complement.
In today’s equipment, the two’s complement system is used with 32-bit words. This is too big for us, so we will use a 4-bit word.
Here is a list of the values:
There is a method to this pattern. First, the left most bit is the sign bit. For the positive bit patterns, just use the ‘regular’ binary system. The negative values are different.
To complement means to change the 0’s to 1’s and the 1’s to 0’s.
To find -1, use 0001. Copy until we have written the first 1: 1. Then we complement the remaining digits. Since they are all 0’s, we get 1111.
For larger words, we use the same algorithm. Find the two’s
complement for 01011100.
Note that 4 bits gives us 7 through -8. There is no +8. We can have numbers only in the range from +7 through -8. But this does give us easy subtraction.
3 3 0011 - 2 is really +(-2) 1110 1 0001Yeah! Finally the correct answer! And, yes, the last carry is lost. But this overflow is not a problem.
How about a harder problem:
3 3 0011 -(-2) becomes +2 0010 5 0101
Because we are using only 4-bit precision, we can have carry that is a problem.
4 0100 + 5 0101 9 1001 = -7OOPS!
Remember, the largest value we can express is +7. When two positive numbers add to a negative value (in two’s complement), there is a carry problem.
We need more bits to express our numbers. Using 32 bits gives us 2,147,483,647 as the largest value. Two billion is pretty big, but not large enough for all types of problems. We need a system that allows us to use really large values and fractional values. For that we need floating point representation.