In this class we are going to learn how to program in C/C++. C has been around since the 1970s and C++ since the early 1980s. They are general-purpose programming languages, just like Java and Python, meaning you can use them to write any kind of software. This is as opposed to a domain-specific language like SQL, which is used specifically to query databases, or HTML, which is used to mark up web pages. Going forward, I am going to say C when I am referring to the language we are using in this class.
Our Text Book: One of the best CS books ever written. It's a classic.
The fundamental object of computer hardware is a bit. In hardware, a bit is an ACTUAL switch that can be turned on and off. Inside modern computers there are millions, if not billions, of switches.
Bits are just switches. Groups of switches, that is, bit patterns, set in whatever random positions, have no inherent meaning or significance by themselves. However, humans have been able to superimpose meaning onto bit patterns by using the binary numeric system. With binary, we can overlay numeric values onto groups of switches.
Human thought, output, and communication are facilitated and documented using words, numbers, symbols, sounds, pictures, and manufactured objects. Objects are often depicted using pictures. Words are made of letters, sounds are made from combinations of frequencies, and pictures are made from the value and filtration of light. Objects have a position in space. Math is symbols and numbers. These atomic components of human output can all be represented by numeric values.
In current computer systems switches are grouped into sets of 8. These are called bytes.
Each value can represent some THING.
In order to communicate, the communicators have to agree on the language they are using. In computer systems we have come up with standards that superimpose meaning on numbers. For example, Intel, which created the x86 CPU, has given numeric values to all the instructions that are available in the CPU. These are called opcodes. These numeric values trigger the CPU to set and unset other switches in the computer. That's pretty META!
When we program, we input code into a computer, and that code eventually gets turned into data and instructions that a CPU can process.
To input and edit our code we need to use a very simple program called a text editor. A text editor is a type of computer program that is used to input, edit, and save plain text.
The code we type is just plain text.
Plain text is text that isn't formatted. It does not contain any special formatting, such as varying fonts, font sizes, bold, or italics. No special margins, headers, footers, etc. The set of plain text characters are
Everything we store in computers is stored as numbers. Plain text is no different. Over the years, standards have been established as to which number represents which character in the computer. The most ubiquitous standard up until recently has been ASCII. ASCII is a one-byte standard, meaning that we represent each character using a single byte of memory: 8 switches. How many characters can we represent in a single byte?
A single byte has 8 switches, so it can hold 2^8 = 256 unique values. ASCII itself defines only 128 characters (values 0 to 127, which need just 7 of the 8 bits); various extended 8-bit character sets use the remaining 128 values. Either way, one byte is nowhere near enough to represent all the languages of the world, some of which have thousands of characters, so the Unicode standard and its encodings, such as UTF-8 and UTF-16, were developed. These are variable-length encodings: UTF-8 uses 1 to 4 bytes per character, which takes more processing to encode and decode. Emoji are part of Unicode, and UTF-8 can encode every character Unicode defines, which covers almost all characters and symbols in the world.
A file is an array of bytes stored on disk. A plain text file is an array of ASCII characters, one per byte, from byte 0 to the last character. That's it: no other information is stored in the file.
We create a plain text file by opening our text editor and typing some characters on the keyboard. We then save the file.
In this case we are going to create a file called example.txt and type the characters below. Please note that for this example I purposely set up my text editor to leave out end-of-line characters.
Hello World!
If we use the cat command to view the file, cat will interpret the value of each byte of the file as an ASCII character and print out the appropriate character according to the ASCII standard. Notice there is no end-of-line character.
If we use the ls -l command in Linux, it will show us how many bytes are in the file.
We can use the hexdump utility to see exactly what numeric value is stored in each byte of the file.
You can think of this as viewing the bit pattern of the entire file.
You can also think of it as how the switches are set for each byte.
Many text editors that are designed to work with source code have additional functionality, such as visual cues that help the programmer organize and interact with their text in a meaningful way. Some of this functionality is
But all that is absolutely necessary is a simple plain text file.
Our code, our program, which is input into a text editor as plain text, then needs to be translated into a language that the CPU can understand. That is, it needs to be translated into an executable file.
We will use the programs gcc and g++ to take us through the entire compilation process. These programs come from the GNU Project, which provides free and open source software (FOSS) to the world. gcc is for compiling C code; g++ is for compiling C++ code. Each stage of the compilation process is handled by these programs. The compilation process targets the instruction set of a specific CPU and a specific OS.
The translation process is a multistep process. We call the entire process the compilation process, but in actuality compilation is just one stage of it. The first stage of the translation process is called the preprocessor stage. The preprocessor is a program that does text substitution in your file before any of the other steps happen.
As Jenny said in the previous video, any line starting with a # is a preprocessor instruction (a directive). Remember, all the preprocessor does is text substitution. So when you type #include "something", the plain text from the file something is pasted into your file where the #include statement is. When you write #define MACRO_NAME followed by some replacement text, every later occurrence of MACRO_NAME in your file is replaced with whatever comes after it in the #define. You can see the output of the preprocessor by using the gcc -E option (capital E). Let's watch a little more. Remember, preprocessing happens prior to compilation.
When reading other people's source code you will see many preprocessor directives. One common use is as a file guard (often called an include guard), which keeps the text of a header file from being pasted into the same file twice. There are also predefined preprocessor macros that you can read about here. We will watch one more video about preprocessors and move on to the next stage of compilation.
So the preprocessor is a program that replaces text in your source code file prior to any further compilation. Once the preprocessor is finished, all the directive lines starting with # will have been replaced or removed and all the comments will have been removed. (The output of gcc -E does contain lines such as # 1 "file-name.c"; these are line markers that gcc adds for later stages, not directives.) We can examine the results of the preprocessor phase by calling:
gcc -E "file-name.c"
In many C programs you are going to #include the .h files from the C standard library. Remember, that just means the plain text of the file is copied and pasted into your file. Depending on your OS, these files may be stored in various places. Let's take a look at stdio.h. We can use the find command on a Linux system to find the location of this file.
stdio.h
Once the preprocessor is done, every directive line starting with # will have been replaced with the proper text and all the comments will have been removed.