Binary, Bits, Bytes and Hex

From ELC Wiki

Encountered for the first time, the words in the title can seem daunting, and working with them even more so. Hopefully the following notes help demystify them a little.

Base Counting

Binary is a format for numbers. Any binary number can be shown as a decimal number, which is the format we are most used to. The best way to understand the difference is to visualize the numbers changing, so first we look at the familiar decimal format:

Decimal (Base 10)
Value x10 x1
0 0 0
1 0 1
2 0 2
3 0 3
4 0 4
5 0 5
6 0 6
7 0 7
8 0 8
9 0 9
10 1 0
11 1 1
12 1 2
13 1 3
14 1 4
15 1 5
16 1 6

As we can see, there is no single digit for "10"; it is actually 2 digits: (1x10) (0x1). Now we try it with base 2, or binary counting:

Binary (Base 2)
Value x16 x8 x4 x2 x1
0 0 0 0 0 0
1 0 0 0 0 1
2 0 0 0 1 0
3 0 0 0 1 1
4 0 0 1 0 0
5 0 0 1 0 1
6 0 0 1 1 0
7 0 0 1 1 1
8 0 1 0 0 0
9 0 1 0 0 1
10 0 1 0 1 0
11 0 1 0 1 1
12 0 1 1 0 0
13 0 1 1 0 1
14 0 1 1 1 0
15 0 1 1 1 1
16 1 0 0 0 0

When counting in base 2, or binary, there is no digit to represent "2", so if we want to write "2" in binary, again it's 2 digits: "10", (1x2) (0x1). In the same way, when counting in decimal and we go past "99", we add another digit at the start for the next power of ten, \(10^2 = 10 \times 10 = 100\). So if we want to write "105", it's actually 3 digits: (1x100) (0x10) (5x1).
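The rows of the counting tables above can be reproduced by repeatedly dividing by the base and collecting the remainders. The following is a minimal Python sketch of that idea; the function name `to_digits` is made up for illustration and is not part of any library.

```python
def to_digits(value, base):
    """Return the digits of a non-negative whole number in the given base,
    most significant (leftmost) digit first."""
    if value == 0:
        return [0]
    digits = []
    while value > 0:
        # divmod gives the quotient and the remainder in one step;
        # the remainder is the next digit, starting from the right.
        value, remainder = divmod(value, base)
        digits.append(remainder)
    return digits[::-1]  # reverse so the leftmost digit comes first


print(to_digits(105, 10))  # [1, 0, 5]
print(to_digits(2, 2))     # [1, 0]
print(to_digits(16, 2))    # [1, 0, 0, 0, 0]
```

Each time the value grows past what the existing digits can hold, the list simply gains one more digit at the front.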

Each digit further to the left has a multiplying factor that is the next power of the base we are counting in. So in normal numbers (base 10, or decimal) the factors are \(10^0=1\), \(10^1=10\), \(10^2=100\), \(10^3=1000\), \(10^4=10000\), etc.

If we apply the same rule in base 2, or binary, the factors increase with the powers of 2: \(2^0=1\), \(2^1=2\), \(2^2=4\), \(2^3=8\), \(2^4=16\), etc.
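As a rough illustration of the rule, the sketch below lists the multiplying factors for a given base and uses them to turn a row of digits back into its value. The helper names `place_values` and `from_digits` are assumptions chosen for this example only.

```python
def place_values(base, count):
    """Multiplying factors for `count` digit positions, leftmost first."""
    return [base ** power for power in range(count - 1, -1, -1)]


def from_digits(digits, base):
    """Add up (digit x factor) for every position."""
    factors = place_values(base, len(digits))
    return sum(digit * factor for digit, factor in zip(digits, factors))


print(place_values(10, 5))              # [10000, 1000, 100, 10, 1]
print(place_values(2, 5))               # [16, 8, 4, 2, 1]
print(from_digits([1, 0, 5], 10))       # 105
print(from_digits([0, 1, 1, 0, 1], 2))  # 13
```

The last line is the "13" row of the binary table: (1x8) + (1x4) + (0x2) + (1x1).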

Hexadecimal

If you're still with me, hexadecimal, or hex, is another form of base counting, but this time the base is 16. We use the same concept of putting a new digit at the start when we reach the base value. There is a slight problem: we don't have single digits for the values 10 to 15, so instead we substitute the letters A to F for the values above 9. Just pretend they are new digits.

Hexadecimal (Base 16)
Value x16 x1
0 0 0
1 0 1
2 0 2
3 0 3
4 0 4
5 0 5
6 0 6
7 0 7
8 0 8
9 0 9
10 0 A
11 0 B
12 0 C
13 0 D
14 0 E
15 0 F
16 1 0
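The hex table can be generated the same way as the others, with a lookup string supplying the extra digit labels. This is only a sketch; `HEX_DIGITS` and `to_hex` are names invented here, while `format` and `int` are Python built-ins.

```python
HEX_DIGITS = "0123456789ABCDEF"  # one symbol for each value 0..15


def to_hex(value):
    """Write a non-negative whole number in base 16 using the labels above."""
    if value == 0:
        return "0"
    text = ""
    while value > 0:
        value, remainder = divmod(value, 16)
        text = HEX_DIGITS[remainder] + text  # new digits go on the left
    return text


for n in (9, 10, 15, 16):
    print(n, to_hex(n))  # 9 9, then 10 A, then 15 F, then 16 10

# Python's built-in conversions give the same results:
print(format(15, "X"))  # F
print(int("10", 16))    # 16
```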

Binary Data

You've probably heard the words Bits and Bytes but what do these things mean exactly and why?

Bits

In the digital world things are based on logic, and a fundamental concept of logic is having a clear-cut answer to any question, whether that answer is "yes" or "no", or "on" or "off". We can picture this as a light bulb: if the bulb is on the answer is "1", if the bulb is off the answer is "0". This is simple enough.

If the digital world used decimal we would have 10 possible answers to display. Do we turn the bulb on a bit to indicate a "1", then a bit more to indicate a "2", then a bit more for a "3"? However good you are at judging light levels against each other, with 10 possible answers it becomes very difficult to work out precisely which one is being shown.

In digital electronics and logic, there is no room for "maybe". This is where binary data is incredibly useful because as soon as we know how to count in base 2 or binary format, we can display any numerical value exactly as long as we have enough "bulbs".

Of course, we don't need to use actual bulbs to represent this information. We can use binary digits, which are called Bits.

Bytes and Nibbles

Now that we can represent any number exactly as long as we have enough bits, we need to arrange this data in a sensible way. For example, if we go back to our light bulbs, we know each light bulb costs money, so if we only ever want to display numbers from 0-255 then we only need 8 of them. We know this because \(2^8=256\), which gives 256 different on/off patterns, exactly enough for the values 0 to 255. So if I'm never going to need to display a value higher than this, why buy more bulbs?
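The bulb count for any maximum value can be checked with a couple of lines of Python. This is a small sketch; `bulbs_needed` is a hypothetical helper, while `int.bit_length()` is a standard Python method.

```python
def bulbs_needed(max_value):
    """Number of bits (bulbs) needed to show every whole number
    from 0 up to max_value."""
    # bit_length() of 0 is 0, but we still need one bulb to show "0".
    return max(1, max_value.bit_length())


print(2 ** 8)             # 256 different patterns from 8 bulbs
print(bulbs_needed(255))  # 8
print(bulbs_needed(256))  # 9, one more bulb is needed
```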

In addition to this, if I want to store several numerical values, I might need several sets of these bulbs, so the more numbers I want to store, the more sets of bulbs I have to buy.

Now I have a whole wall of bulbs for storing lots of different numbers. Certain sets of bulbs are displaying certain numerical values, but now I don't know where one value starts and another ends. I need to arrange the bulbs to show me which bulbs represent which values. So it makes sense to use the same amount of bulbs for each value. This is where the Nibble and the Byte come in.

The way to decide how many bulbs to use to store a value is to choose an amount that gives you enough resolution but also makes your life mathematically easy, e.g. it's easier to count in 10's than it is in 13's.

I used 8 bulbs earlier as being able to display a reasonably large range of numbers. 7 bulbs would allow me to display \(2^7=128\) different values (0 to 127), which also provides a decent amount of resolution, but sets of 7 bulbs don't arrange into a wall very nicely. The people who invented this stuff could have decided on any number of bits in a byte, but it would have made mathematical calculations more difficult, especially if they had decided on 7 (a prime number) or 9 (an odd number). It just makes things more complicated, so they chose 8.

If we arrange the bits in groups of 4, this is called a Nibble. Grouping binary data by nibble can make it easier to read.
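To see the effect on readability, the sketch below prints a value as binary text split into nibbles. The function name `group_nibbles` is made up for this example, and it assumes the bit count is a multiple of 4 and large enough to hold the value.

```python
def group_nibbles(value, bits=8):
    """Format a value as binary text, split into groups of 4 bits.

    Assumes `bits` is a multiple of 4 and big enough for `value`.
    """
    text = format(value, "0{}b".format(bits))  # zero-padded binary digits
    return " ".join(text[i:i + 4] for i in range(0, len(text), 4))


print(group_nibbles(13))        # 0000 1101
print(group_nibbles(200))       # 1100 1000
print(group_nibbles(4660, 16))  # 0001 0010 0011 0100
```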

Hexadecimal in the Digital World

Now we know about Bytes, Nibbles, and Bits, why do we care about Hexadecimal counting in the digital world? The answer is: for convenience. Every value that can be held in an 8 digit binary number can be written as a 2 digit hexadecimal number, because each nibble (4 bits, values 0 to 15) corresponds to exactly one hex digit.

So if the result of some digital operation gives me 01001110, that takes a long time to write, and it's difficult to read. If we write the same number in hex it is 4E. This is much easier for humans to read, especially if you are looking at many different values. It's also quicker to convert to a decimal value: (4x16)+(Ex1)=64+14=78 (remembering that the hex digit E has the decimal value 14).
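The same conversion can be followed step by step in Python. This is just an illustrative sketch using the built-in `int` and `format` functions; the variable names are arbitrary.

```python
bits = "01001110"

# Split the byte into its two nibbles and convert each to one hex digit.
high, low = bits[:4], bits[4:]
hex_text = format(int(high, 2), "X") + format(int(low, 2), "X")
print(hex_text)  # 4E

# (4 x 16) + (E x 1), with E standing for 14
value = 4 * 16 + 14 * 1
print(value)     # 78

# All three notations describe the same number.
print(int(bits, 2) == int(hex_text, 16) == value)  # True
```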

In the example I just gave, it's obvious that 4E is a hex value, because it contains a letter. So what happens if we have a hex value of 28, for example? This is not obviously hex and could easily be misinterpreted as a decimal value. For that reason, when we write hex values we normally add the prefix 0x. This way we know that decimal 28 is not the same as hexadecimal 0x28 (which is 40 in decimal).
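Many programming languages use the same 0x convention for numbers written in source code. As a quick illustration in Python (chosen here only as an example language):

```python
decimal_value = 28
hex_value = 0x28  # the 0x prefix tells Python the digits are hexadecimal

print(decimal_value)   # 28
print(hex_value)       # 40, because (2 x 16) + (8 x 1) = 40
print(int("28", 16))   # 40, converting hex text without a prefix
print(hex(40))         # 0x28, Python adds the prefix when printing hex
```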

Converting between decimal, binary and hex values is something that you have to do quite frequently when developing embedded software, and although it is a good idea to be able to understand how to convert manually, there are plenty of tools to do it quickly for you such as the calculator built into Windows which has a "programmer" mode.

One more thing

If you're not totally confused at this point, read on. If you are confused, read it through again and try a few things out for yourself or look online for some more resources that explain things better than I have.

There is one more thing to get your head around; if you are familiar with "orders of magnitude" you'll know that when you put "K" in front of a unit it multiplies it by 1,000. If you put "M" in front of the units, it multiplies it by 1,000,000. Now in the world of bits and bytes, this is not always true, although it has now been standardized.

The issue arises from the digital memory industry. Memory is normally manufactured in power-of-2 sizes as it is designed to store bytes of information. This means the sizes might be 1024, 2048, 65536 etc., which are numbers that don't look that elegant. Once these values get really large the problem is even worse, e.g. \(2^{20}=1,048,576\), and it's much easier to just say it's approximately 1MB, which technically would be 1,000,000 Bytes.

As I've said, this has officially been standardized now (according to Wikipedia), so if you mean 1024 Bytes, you're supposed to write 1 KiB, short for kibibyte. This applies through all of the orders of magnitude, i.e. MiB, GiB etc. However, people have not always used this notation. It's good to be aware that if people are talking about digital memory (like how much flash memory an IC has), then they are probably counting in orders of magnitude based on 1024 rather than 1000.
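The size of the gap between the two conventions is easy to check. The constant names below are chosen for this sketch only; the values follow from the definitions above (1000 for the decimal prefixes, 1024 for the binary ones).

```python
KB = 1000        # kilobyte, decimal prefix
KIB = 1024       # kibibyte, binary prefix (2 ** 10)
MB = 1000 ** 2   # megabyte
MIB = 1024 ** 2  # mebibyte (2 ** 20)

print(MIB)           # 1048576
print(MIB - MB)      # 48576 bytes "missing" if 1 MiB is called 1 MB
print(65536 // KIB)  # 64, so a 65536 byte memory is exactly 64 KiB
print(65536 / KB)    # 65.536, the same memory measured in decimal kB
```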