1

I'm still confused about bits and bytes, even after searching the internet. Is one ASCII character = 1 byte = 8 bits? So 8 bits give 256 unique patterns, which cover all the ASCII codes. In what form is that stored in our computer?

And if I type "Hello", does that mean it consists of 5 bytes?

Newbie

2 Answers

1

Yes to everything you wrote. A "bit" is a binary digit: a 0 or a 1. Historically there were bytes of other sizes; now "byte" almost always means "8 bits of information", or a number between 0 and 255.
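
As a small illustration of "8 bits = a number between 0 and 255", here is a minimal Python sketch. The variable names are made up for the example; it only shows how one character's number maps to an 8-bit pattern, not how any particular hardware stores it.

```python
# Number of distinct patterns that 8 bits can take.
patterns_per_byte = 2 ** 8

# 'H' as a number, and that number written out as an 8-bit pattern.
h_value = ord("H")
h_bits = format(h_value, "08b")

print(patterns_per_byte)  # 256
print(h_value, h_bits)    # 72 01001000
```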

Amadan
  • What about the form of the bits stored in our computer? How much storage does it take? – Newbie May 11 '16 at 02:17
  • One bit of information takes up one bit of storage. Not sure how else to answer that. In RAM, a bit is a capacitor that can be full or not. In hard drives, it is a little piece of magnet whose North/South orientation is changed. In CDs, it's a small burn on the surface, or absence of it. – Amadan May 11 '16 at 02:36
  • Thanks for the explanation, although I'm confused by the other explanation below. Have a nice day today! :) – Newbie May 11 '16 at 03:33
  • @TomBlodget's explanation is more exhaustive and precise than mine regarding the relationship between ASCII and bytes. – Amadan May 11 '16 at 03:34
  • Oh I see, I'm trying to digest it as best I can. Thanks! – Newbie May 11 '16 at 03:58
0

No. ASCII is a character set with 128 codepoints stored as the values 0-127. Modern computers predominantly address 8-bit memory and disk locations so a 7-bit ASCII value takes up 8 bits.
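
To make the 7-bit point concrete, here is a minimal Python sketch (variable names are illustrative only). It relies on the fact that the first 128 Unicode code points coincide with ASCII, so `ord` gives the ASCII value for these characters.

```python
text = "Hello"

# Code points of each character; for ASCII characters these equal the ASCII values.
code_points = [ord(ch) for ch in text]

# Every ASCII value is below 128, so the top bit of each stored byte is 0.
ascii_bytes = text.encode("ascii")
all_seven_bit = all(b < 128 for b in ascii_bytes)

# A character outside ASCII cannot be encoded with the ASCII codec at all.
try:
    "é".encode("ascii")
    non_ascii_ok = True
except UnicodeEncodeError:
    non_ascii_ok = False

print(code_points)       # [72, 101, 108, 108, 111]
print(len(ascii_bytes))  # 5
print(all_seven_bit)     # True
print(non_ascii_ok)      # False
```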

There is no text but encoded text. An encoding maps a member of a character set to one or more bytes. Unless you absolutely know you are using ASCII, you probably aren't. There are quite a few character sets with encodings that cover all 256 byte values and use any combination of byte values to encode a string. There are several character sets that are similar but have slightly fewer than 256 characters. Others use more than one byte to encode a codepoint and don't use every combination of byte values.

Just so you know, Unicode is the predominant character set except in very specialized situations. It has several encodings. UTF-8 is often used for storage and streams. UTF-16 is often used in memory, particularly in Java, .NET, JavaScript, XML, …. When text is communicated between systems, there has to be an agreement, specification, standard, or indication about which character set and encoding it uses so a sequence of bytes can be interpreted as characters.
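
A short Python sketch of why that agreement matters: the same two bytes read back as different characters depending on which encoding the reader assumes. The choice of "é" and of Latin-1 as the "wrong" encoding is just for illustration.

```python
original = "é"

# UTF-8 stores this one character as two bytes.
data = original.encode("utf-8")

# Decoding with the agreed encoding recovers the character;
# decoding the same bytes as Latin-1 yields two different characters.
as_utf8 = data.decode("utf-8")
as_latin1 = data.decode("latin-1")

print(data)       # b'\xc3\xa9'
print(as_utf8)    # é
print(as_latin1)  # Ã©
```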

To add to the confusion, programming languages have data types called char, Character, etc. You have to look at the specific language's reference manual to see what they mean. For example, in C, char is simply an integer type sized to hold the encoding of a character in the character set used by that C implementation. (C also calls this a "byte", and it is not necessarily 8 bits. In all other contexts, people mean 8 bits when they say "byte". If they want to be completely unambiguous, they might say "octet".)

"Hello" is five characters. In a specific character set, it is five codepoints. In a specific encoding for that character set, it could be 5, 10 or 20, or ??? bytes.
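
The 5, 10 and 20 figures can be checked with a few lines of Python. This is only a sketch using Python's built-in codecs; the "-le" variants are chosen here so that no byte order mark is counted.

```python
text = "Hello"

# Byte counts for the same five characters under three Unicode encodings.
sizes = {name: len(text.encode(name)) for name in ("utf-8", "utf-16-le", "utf-32-le")}

# Python's plain "utf-16" codec prepends a 2-byte byte order mark.
with_bom = len(text.encode("utf-16"))

# A non-ASCII character takes more than one byte in UTF-8.
accented = len("Héllo".encode("utf-8"))

print(sizes)     # {'utf-8': 5, 'utf-16-le': 10, 'utf-32-le': 20}
print(with_bom)  # 12
print(accented)  # 6
```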

Also, in the source code of a specific language, a literal string like that might be "null-terminated". This means that you could say it is 6 "characters". Other languages might store a string as a counted sequence of code units. Again, you have to look at the language reference to know the underlying data structure of strings. Or, if the language and the libraries used with it are sufficiently high-level, you might never need to know such internals.
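
As a rough illustration of the difference, Python's standard `ctypes` module can build a C-style null-terminated buffer, which can be compared with Python's own counted string. This is a sketch for comparison only, not a statement about how any given compiler lays out literals.

```python
import ctypes

# A C-style buffer: the five bytes of "Hello" plus a terminating NUL byte.
buf = ctypes.create_string_buffer(b"Hello")
c_size = ctypes.sizeof(buf)
raw = buf.raw

# Python's own str is a counted sequence; its length excludes any terminator.
py_len = len("Hello")

print(c_size)  # 6
print(raw)     # b'Hello\x00'
print(py_len)  # 5
```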

Tom Blodget
  • Hmm, thanks a lot. Even though I read through it again and again, I somehow didn't manage to understand all of it. So is a codepoint a totally different thing from ASCII? I saw that it is for Unicode, like 'a' as a codepoint is 0061 but in ASCII is 65. And is 'encoding' a noun or a verb? I learnt on Google that it means converting something from one form to another. The rest is okay for me. Thanks! – Newbie May 11 '16 at 03:55
  • Just take it a "bit" at a time. My basic message is that the terms have very specific meanings and people make mistakes by reading too much into them. If you have a specific question, I'd be glad to clarify my answer. – Tom Blodget May 11 '16 at 03:59