C/C++

Welcome

In this class we are going to learn how to program in C/C++. C has been around since the 1970s and C++ since the early 1980s. They are general-purpose programming languages, just like Java and Python, meaning you can use them to write any kind of software. This is as opposed to a domain-specific language like SQL, which is used specifically to query databases, or HTML, which is used to mark up web pages. Going forward, I am going to say C when I am referring to the language we are using in this class.

Our Textbook: One of the best CS books ever written. It's a classic.

Bits and information

The computer as switching network

The fundamental object of computer hardware is a bit. In hardware, a bit is an ACTUAL switch that can be turned on and off. Inside a modern computer there are millions, if not billions, of switches.

Bit patterns as information

Bits are just switches. Groups of switches, that is, bit patterns, set in whatever random positions, have no inherent meaning or significance by themselves. However, humans have been able to superimpose meaning onto bit patterns by using the binary numeric system. Each switch stands for one binary digit (off is 0, on is 1), so a group of switches can be read as a numeric value.
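
To make the idea of overlaying a number onto a group of switches concrete, here is a minimal sketch. The function name bits_to_value is my own invention for illustration, and I am assuming the switch positions are written as a string of '0' and '1' characters with the most significant switch on the left.

```c
#include <stddef.h>

/* Read a string of '0' and '1' characters as a binary number.
   The leftmost character is the most significant switch.
   Any other character (for example a space used for grouping) is skipped.
   bits_to_value is a hypothetical helper, not a standard library function. */
unsigned long bits_to_value(const char *bits) {
    unsigned long value = 0;
    for (size_t i = 0; bits[i] != '\0'; i++) {
        if (bits[i] == '0' || bits[i] == '1') {
            /* Shift what we have so far one place left, then add the new digit. */
            value = value * 2 + (unsigned long)(bits[i] - '0');
        }
    }
    return value;
}
```

With this sketch, the pattern "00000101" reads as 5 and "11111111" reads as 255.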

Information as numbers

Human thought, output, and communication are facilitated and documented using words, numbers, symbols, sounds, pictures, and manufactured objects. Objects are often depicted using pictures. Words are made of letters, sounds are made from combinations of frequencies, and pictures are made from the value and filtration of light. Objects have position in space. Math is symbols and numbers. These atomic components of human output can all be represented by numeric values.

Bytes

In current computer systems switches are grouped into sets of 8. These are called bytes.

  • 1 byte, 8 switches, can represent 256 numeric values
  • 2 bytes, 16 switches, can represent 65,536 numeric values
  • 3 bytes, 24 switches, can represent 16,777,216 numeric values
  • 4 bytes, 32 switches, can represent 4,294,967,296 numeric values
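
The counts in the list above all come from the same rule: every extra switch doubles the number of patterns, so n bytes give 2 to the power 8 times n values. Here is a small sketch of that arithmetic. The name value_count is my own, and I am assuming we only ask about 1 to 7 bytes so the answer fits in a 64-bit integer.

```c
#include <stdint.h>

/* Number of distinct bit patterns that n_bytes bytes can hold: 2^(8 * n_bytes).
   value_count is a hypothetical helper for illustration.
   Only valid for 1 to 7 bytes, so the result fits in 64 bits. */
uint64_t value_count(unsigned n_bytes) {
    /* Shifting 1 left by k places multiplies it by 2, k times. */
    return (uint64_t)1 << (8 * n_bytes);
}
```

Calling value_count(1) gives 256 and value_count(4) gives 4,294,967,296, matching the list.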

Each value can represent some THING.

Information as numbers

In order to communicate, the communicators have to agree on the language they are using. In computer systems we have come up with standards that superimpose meaning on numbers. For example, Intel, the designer of the x86 CPU, has given numeric values to all the instructions that are available in the CPU. These are called opcodes. These numeric values trigger the CPU to set and unset other switches in the computer. That's pretty META!

Programming

Text Editor

When we program, we input code into a computer that eventually gets turned into data and instructions that a CPU can process.

To input and edit our code we need to use a very simple program called a text editor. A text editor is a type of computer program that is used to input, edit, and save plain text.

The code we type is just plain text.

What is Plain Text ?

Plain text is text that isn't formatted. It does not contain any special formatting, such as varying fonts, font sizes, bold, or italics. No special margins, headers, footers, etc. The set of plain text characters is:

  • Upper Case Letters
  • Lower Case Letters
  • Numbers
  • Punctuation Characters
  • White space characters, such as tab, space, end of line, carriage return

How is plain text stored in a computer

Everything we store in computers is stored as numbers. Plain text is no different. Over the years, standards have been established as to which number represents which character in the computer. The most ubiquitous standard up until recently has been ASCII. ASCII assigns each character a 7-bit code, and in practice each character is stored in a single byte of memory, 8 switches. How many characters can we represent in ASCII?

American Standard Code for Information Interchange
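
In C a character is just a small number, so we can ask the language directly which number stands for which character. The sketch below uses two helper names I made up, char_code and code_char, and it assumes your system uses an ASCII-compatible character set, which is the case on essentially every machine you are likely to use in this class.

```c
/* Return the number stored in the byte that holds character c.
   char_code is a hypothetical helper; it assumes an ASCII-compatible system. */
int char_code(char c) {
    return (int)(unsigned char)c;
}

/* Go the other way: return the character that a number stands for.
   code_char is also a hypothetical helper. */
char code_char(int code) {
    return (char)code;
}
```

On an ASCII system, char_code('H') is 72, char_code(' ') is 32, and code_char(97) is 'a'.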

Other character encoding standards

ASCII defines only 128 characters, and even using all 8 bits of a byte would allow only 256. In order to represent all the languages of the world, some of which have thousands of characters, the Unicode standard and its encodings UTF-8 and UTF-16 were developed. These are variable-length encodings: a character takes 1 to 4 bytes in UTF-8 and 2 or 4 bytes in UTF-16, so they take a little more processing to encode and decode. Emoji are part of Unicode, which covers almost all characters and symbols in the world.
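
Variable length means a program has to look at the first byte of a character to find out how many bytes follow. Here is a simplified sketch of that check for UTF-8. The name utf8_length is my own, and the function only looks at the leading bit pattern; it does not validate the bytes that follow or reject every illegal first byte.

```c
/* Given the first byte of a UTF-8 encoded character, return how many
   bytes the whole character occupies (1 to 4).
   Returns 0 if the byte cannot start a character.
   utf8_length is a hypothetical helper and a simplification:
   it checks only the leading bits of the first byte. */
int utf8_length(unsigned char first) {
    if (first < 0x80)           return 1;  /* 0xxxxxxx: same as ASCII   */
    if ((first & 0xE0) == 0xC0) return 2;  /* 110xxxxx                  */
    if ((first & 0xF0) == 0xE0) return 3;  /* 1110xxxx                  */
    if ((first & 0xF8) == 0xF0) return 4;  /* 11110xxx                  */
    return 0;                              /* 10xxxxxx is a continuation
                                              byte, not a first byte    */
}
```

For example, 'A' takes 1 byte, a character whose first byte is 0xE2 (such as the euro sign) takes 3, and a character whose first byte is 0xF0 (such as most emoji) takes 4.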

What is a plain text file ?

A file is an array of bytes stored on disk. A plain text file is an array of ASCII characters, 1 per byte, from byte 0 to the last character. That's it. No other information is stored in the file.

Let's create a plain text file

We create a plain text file by opening our text editor and typing some characters on the keyboard. We then save the file.

In this case we are going to create a file called example.txt and type the characters below. Please note that for this example I purposely set up my text editor to leave out end-of-line characters.

		Hello World!
	  

view file using cat

If we use the cat command to view the file, cat copies the bytes of the file to the terminal, and each byte is displayed as the character assigned to it by the ASCII standard. Notice there is no end-of-line character.

How many bytes are in the file?

If we use the ls -l command in Linux, it will show us how many bytes are in the file.
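
Since a file is just an array of bytes, we can also count them ourselves. This is a minimal sketch, not how ls gets its answer (ls asks the operating system for the size). The function name count_bytes is my own.

```c
#include <stdio.h>

/* Count the bytes in a file by reading it one byte at a time.
   Returns -1 if the file cannot be opened.
   count_bytes is a hypothetical helper for illustration. */
long count_bytes(const char *path) {
    FILE *file = fopen(path, "rb");   /* "rb": read the raw bytes, unchanged */
    if (file == NULL) {
        return -1;
    }
    long count = 0;
    while (fgetc(file) != EOF) {      /* fgetc returns EOF when no bytes are left */
        count++;
    }
    fclose(file);
    return count;
}
```

For the example.txt file above, which holds Hello World! with no end-of-line character, count_bytes("example.txt") would return 12, one byte per character.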

What is actually stored in each byte

We can use the hexdump utility to see exactly what numeric value is stored in each byte of the file.

You can think of this as viewing the bit pattern of the entire file as well.

You can also think of it as how the switches are set for each byte.
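
Here is a sketch of the two views of a single byte: the two hexadecimal digits a tool like hexdump prints, and the eight switch positions behind them. The names byte_to_hex and byte_to_bits are my own, and this is only an illustration of the idea, not how hexdump itself is written.

```c
/* Write the two hexadecimal digits of a byte into out, followed by '\0'.
   out must have room for 3 characters.
   byte_to_hex is a hypothetical helper for illustration. */
void byte_to_hex(unsigned char byte, char out[3]) {
    const char digits[] = "0123456789abcdef";
    out[0] = digits[byte >> 4];     /* upper 4 switches */
    out[1] = digits[byte & 0x0F];   /* lower 4 switches */
    out[2] = '\0';
}

/* Write the 8 switch positions of a byte into out as '0' and '1'
   characters, most significant switch first, followed by '\0'.
   out must have room for 9 characters.
   byte_to_bits is a hypothetical helper for illustration. */
void byte_to_bits(unsigned char byte, char out[9]) {
    for (int i = 0; i < 8; i++) {
        /* 0x80 >> i selects one switch at a time, starting from the left. */
        out[i] = (byte & (0x80 >> i)) ? '1' : '0';
    }
    out[8] = '\0';
}
```

For the first byte of our file, the letter H, byte_to_hex gives "48" and byte_to_bits gives "01001000". Both are the number 72.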

text editors

  • Sublime
  • Atom
  • Vim
  • neovim
  • notepad
  • notepad++
  • Emacs: my text editor
  • Visual Studio Code (visual studio code is a text editor, visual studio is an ide)

Source Code editors

Many text editors that are designed to work with source code have additional functionality, such as visual cues that help the programmer organize and interact with their text in a meaningful way. Some of this functionality is:

  • Syntax Highlighting
  • Code Completion
  • Error Detection
  • Brace Matching
  • Auto Indentation

But all that is absolutely necessary is a simple plain text file.

Once we input our code as plain text, what happens next?

Our code, our program, which is input into a text editor as plain text, then needs to be translated into a language that the CPU can understand. That is, it needs to be translated into an executable file.

gcc and g++

We will use the programs gcc or g++ to take us through the entire compilation process. These programs are part of GCC, the GNU Compiler Collection, written by the GNU Project, which provides FOSS software to the world. gcc is for compiling C code. g++ is for compiling C++ code. Each stage/phase of the compilation process is handled by these programs. The compilation process targets the instruction set of a specific CPU and a specific OS.

What is the compilation process?

The translation process is a multistep process. We call the entire process the compilation process, but in actuality compilation is just one stage of it. The first stage of the translation process is called the preprocessor stage. The preprocessor is a program that does text substitution in your file before any of the other steps happen.

PreProcessor Directives

As Jenny said in the previous video, any line starting with a # is a preprocessor instruction. Remember, all the preprocessor does is text substitution. So when you type #include "something", the plain text from the file something is pasted into your file where the #include statement is. When you do #define MACRO_NAME followed by some replacement text, every later occurrence of MACRO_NAME in your file will be replaced with that text. You can see the output of the preprocessor by using the gcc -E option. Let's watch a little more. Remember, preprocessing happens prior to compilation.
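
Here is a small sketch of what "just text substitution" means in practice. All the macro and function names below are made up for illustration. Notice that the naive macro gives a surprising answer, because the preprocessor pastes the text in without knowing anything about arithmetic.

```c
#define BUFFER_SIZE 64               /* every BUFFER_SIZE below becomes 64      */
#define DOUBLE_NAIVE(x) x + x        /* pasted as-is, no parentheses are added  */
#define DOUBLE_SAFE(x) ((x) + (x))   /* parentheses keep the pasted text intact */

/* After preprocessing this reads: return 64; */
int buffer_size(void) {
    return BUFFER_SIZE;
}

/* After preprocessing this reads: return 3 + 3 * 2;
   Multiplication happens first, so the result is 9, not 12. */
int naive_times_two(void) {
    return DOUBLE_NAIVE(3) * 2;
}

/* After preprocessing this reads: return ((3) + (3)) * 2;
   which is 12, as intended. */
int safe_times_two(void) {
    return DOUBLE_SAFE(3) * 2;
}
```

The compiler never sees the names BUFFER_SIZE, DOUBLE_NAIVE, or DOUBLE_SAFE. By the time compilation starts they have already been replaced by their text.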

File Guards

When reading other people's source code you will see many preprocessor directives. One common use is as a file guard (also called an include guard), which keeps the contents of a header file from being pasted into the same source file more than once. There are also predefined preprocessor macros that you can read about here. We will watch one more video about the preprocessor and move on to the next stage of compilation.
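
To show what a file guard does, the sketch below pastes the contents of an imaginary header, point.h, into one file twice, the way two #include lines would. The names POINT_H, struct point, and point_sum are all made up for this example.

```c
/* ---- first copy of the imaginary header point.h ---- */
#ifndef POINT_H          /* POINT_H is not defined yet, so keep these lines */
#define POINT_H          /* define it, so later copies are skipped          */
struct point {
    int x;
    int y;
};
#endif

/* ---- second copy, as if point.h were included again ---- */
#ifndef POINT_H          /* POINT_H is already defined, so the preprocessor */
#define POINT_H          /* removes everything down to the #endif           */
struct point {
    int x;
    int y;
};
#endif

/* Only one definition of struct point reaches the compiler. */
int point_sum(struct point p) {
    return p.x + p.y;
}
```

Without the #ifndef lines, the compiler would see struct point defined twice and report an error.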

preprocessor results

So the preprocessor is a program that replaces text in your source code file prior to any further compilation. Once the preprocessor is finished, all the lines starting with # will have been replaced or removed and all the comments will have been removed. We can examine the results of the preprocessor phase by calling

gcc -E "file-name.c"

stdio.h

In many C programs you are going to #include the .h files from the C standard library. Remember that just means copy the plain text and paste it into your file. Depending on your OS, these files may be stored in various places. Let's take a look at stdio.h. We can use the find command on a Linux system to find the location of this file.


state when preprocessor is done

Once the preprocessor is done any line with a # will have been replaced with the proper text and all the comments will have been removed.