Introduction to C

History

The C language is closely tied to the design of the UNIX system at Bell Labs in the 1970s. Its development was influenced by two languages:

  • BCPL, developed in 1966 by Martin Richards
  • B, developed in 1970 at Bell Labs

The first version of the C compiler was written in 1972 by Dennis Ritchie. From there, the language grew in popularity alongside UNIX systems, and several versions of the C language followed:

  • In 1978: K&R C, an informal specification based on the book The C Programming Language written by Brian Kernighan and Dennis Ritchie.
  • In 1989: ANSI C or C89, the first official C standard.
  • In 1990: ISO C or C90, the same language as C89 but published as an ISO standard.
  • In 1999: C99, major extensions to the standard C language.
  • In 2011: C11, minor new language features.
  • In 2017: C17, minor corrections of C11 without adding new features.
  • In 2023: C23, minor new language features, modernizations and cleanup.

A compiler is the tool used to translate source code into an executable program. You will study compilers and their inner workings at length during your ING1.

At EPITA, we will be using the C99 standard.

The C programming language has consistently ranked among the most popular programming languages since the 1980s according to the TIOBE index. Because of its long history and low-level design, C is known as a mature, very portable and extremely efficient language. In the industry, it is widely used for operating system and kernel development, compiler and interpreter design, libraries, embedded systems, database management systems, and more.

Syntax reminders

Comments

In C there are two types of comments: single-line comments and multi-line comments. See the examples for the syntax of each type of comment.

// I am a single-line comment

/* I am a single-line comment in the multi-line style */

/*
** I am a multi-line comment authorized by the EPITA standard
*/

/*
I am also a multi-line comment, but not authorized by the coding style
*/

Variables

A variable consists of:

  • A type, which is either one of the built-in C types or a user-defined type.

  • An identifier (a name) that must respect the following naming conventions:

    • Start with a letter or an underscore ('_').
    • Consist of a sequence of letters, digits or underscores.
    • Be different from C keywords.

danger

Starting with '_' is forbidden by the coding style.

  • Possibly a value.

int i;
int j = 3;
char c = 'a';
float f = 42.42;

You can then use declared variables in the program by using their identifiers.

int a = 1;
int b = 41;

int sum = a + b; // sum == 42

Predefined types

Basic data types of C

  • void: means "having no type". A variable cannot have this type; it is used for procedures (see below).
  • char: a character (which is actually a number) stored in a single byte.
  • int: an integer whose size depends on the architecture of the machine (typically 2 bytes on 16-bit architectures, 4 bytes on 32-bit and 64-bit architectures).
  • float: a single-precision floating point number (4 bytes).
  • double: a double-precision floating point number (8 bytes).

It is possible to apply a number of qualifiers to these data types. The following apply to integers:

Name              Bytes    Possible values ($-2^{n-1}$ to $2^{n-1}-1$)
short (int)       2        -32 768 to 32 767
int               2 or 4   $-2^{15}$ to $2^{15}-1$ or $-2^{31}$ to $2^{31}-1$
long (int)        4 or 8   $-2^{31}$ to $2^{31}-1$ or $-2^{63}$ to $2^{63}-1$
long long (int)   8        $-2^{63}$ to $2^{63}-1$

Note that the size of long depends on your architecture and platform: it is 4 bytes long on 32-bit architectures, and 8 bytes long on most 64-bit Unix-like systems such as Linux.
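
Because these sizes vary from one machine to another, a program can query them with the sizeof operator rather than assume them. The sketch below is only an illustration: sizeof is standard C, but the helper names are made up for this example.

```c
#include <stddef.h>

// Each helper returns the size, in bytes, of one integer type
// on the machine the program is compiled for.
size_t size_of_short(void)
{
    return sizeof(short);
}

size_t size_of_int(void)
{
    return sizeof(int);
}

size_t size_of_long(void)
{
    return sizeof(long);
}

size_t size_of_long_long(void)
{
    return sizeof(long long);
}
```

On a typical 64-bit Linux machine these helpers would be expected to return 2, 4, 8 and 8, but other platforms may give different values.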

tip

A bit can have two values, 0 or 1. A byte is 8 bits long, thus having values from 0 to 255 (11111111 in binary).

For example:

short int shortvar;
long int counter;

In that case, int is optional.

By default, integer types are signed, which means that variables of these types can hold negative or positive values. It is also possible to use unsigned types with the unsigned keyword. The signed keyword exists as well, but since integers are signed by default, it is rarely used.

danger
  • The signed and unsigned qualifiers only apply to integer types (char and int).
  • The char type is either signed or unsigned by default: it depends on your compiler.
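
As a small sketch of these two points, the header limits.h (standard C) exposes the limits of each integer type. The function names below are hypothetical, and the wrap-around shown for unsigned types is a property of the language that the text above does not cover.

```c
#include <limits.h>

// Returns 1 if plain char is signed with this compiler, 0 otherwise.
int char_is_signed(void)
{
    return CHAR_MIN < 0;
}

// An unsigned variable cannot hold a negative value:
// going one below 0 wraps around to the largest value.
unsigned int one_below_zero(void)
{
    unsigned int u = 0;

    u -= 1; // well defined for unsigned types: wraps to UINT_MAX
    return u;
}
```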

Booleans

A boolean is a type that can be evaluated as either true or false. They are used in control structures.

In the beginning, there was no boolean type in C and integer types were used instead:

  • 0 stands for false.
  • Any other value stands for true.
note

The C99 standard introduced the _Bool type, which can contain the values 0 and 1. The header stdbool.h was also added: it defines the bool type, a shortcut for _Bool, and the values true and false. You will learn more about headers later.
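
A minimal sketch of stdbool.h in use, with hypothetical function names: the first function returns the result of a comparison as a bool, the second shows that any non-zero integer converted to bool becomes true.

```c
#include <stdbool.h>

// Returns true when n is even, false otherwise.
bool is_even(int n)
{
    return n % 2 == 0;
}

// Converting an integer to bool yields false for 0
// and true (stored as 1) for any other value.
bool to_bool(int n)
{
    return n;
}
```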

Typecast (implicit type conversion)

When an expression involves data of different but compatible types, one can wonder about the result's type. The C compiler automatically performs conversion of "inferior" types to the biggest type used in the expression.

int i = 42;
int j = 4;
float k = i / j; // k equals 10.0

Both i and j have the int type, so the division is an integer division whose result is 10; this value is only converted to float (10.0) when it is stored in k. However, we want a float result, so we first store i in a float variable:

int i = 42;
int j = 4;
float t = i;
float k = t / j; // k equals 10.5

t being of float type, the result's type becomes implicitly float and the value 10.5 is stored in k.
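
An intermediate variable is one way to trigger the conversion. C also has an explicit cast, written (type)expression, which the text above does not describe; the sketch below relies on it, with function names chosen for illustration only.

```c
// Both operands are int: the division is an integer division,
// and the truncated result is converted to float afterwards.
float integer_divide(int a, int b)
{
    return a / b;
}

// The cast converts a to float first, so b is implicitly converted
// as well and the division is done on floating point values.
float real_divide(int a, int b)
{
    return (float)a / b;
}
```

With 42 and 4, integer_divide gives 10.0 while real_divide gives 10.5.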

Operators

Binary operators

Arithmetic operators

For arithmetic operations, the usual operators are available:

Operation        Operator
addition         +
subtraction      -
multiplication   *
division         /
remainder        %
danger

The result of a division between two integers is truncated.

Example:

float i = 5 / 2;    // i == 2.0
float j = 5. / 2.; // j == 2.5, note that 5. is equivalent to 5.0
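
Integer division and the remainder operator are often used together. As an illustrative sketch (the function names are hypothetical), a duration in seconds can be split into whole minutes and leftover seconds:

```c
// Number of whole minutes contained in a duration given in seconds.
int whole_minutes(int seconds)
{
    return seconds / 60; // integer division: the fractional part is dropped
}

// Seconds left over once the whole minutes have been removed.
int leftover_seconds(int seconds)
{
    return seconds % 60;
}
```

For 125 seconds, this gives 2 minutes and 5 seconds.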

Comparison operators

These operators compare two values. The result is an int that is either 1 (true) or 0 (false), depending on whether the comparison holds:

Operation               Operator
equality                ==
inequality              !=
greater than            >
greater than or equal   >=
less than               <
less than or equal      <=
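
In C, a comparison yields an int: 1 when it holds and 0 otherwise. The sketch below (a made-up helper, not part of any library) takes advantage of this by adding the results of several comparisons.

```c
// Counts how many of the three values are strictly greater than threshold.
// Each comparison evaluates to 1 (true) or 0 (false), so the results
// can simply be added together.
int count_greater(int a, int b, int c, int threshold)
{
    return (a > threshold) + (b > threshold) + (c > threshold);
}
```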

Logical operators

  • Logical OR ||:

    condition1 || condition2 || ... || conditionN

    The previous expression will be true if at least one of the conditions is true, false otherwise.

  • Logical AND &&:

    condition1 && condition2 && ... && conditionN

    The previous expression will be true if all conditions are true, false otherwise.

Conditions are evaluated from left to right, and each one is only evaluated when necessary (this is called lazy, or short-circuit, evaluation). For example, with two conditions separated by &&, if the first one is false, then the second one will not be evaluated (because the result is already known: false). The same goes for a true condition on the left of a ||: the result is necessarily true.

Example:

int a = 42;
int b = 0;
(a == 1) && (b = 42);
// b equals 0, and not 42, because 'b = 42' has not been evaluated
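
Lazy evaluation is commonly used to guard an operation that would otherwise be invalid. The following sketch uses hypothetical function names; it only relies on the behavior of && and || described above.

```c
// Returns 1 if a is a multiple of b, 0 otherwise.
// When b is 0, the left condition is false, so a % b is never
// computed and the division by zero is avoided.
int is_multiple(int a, int b)
{
    return b != 0 && a % b == 0;
}

// Returns 1 if n is outside the range [low, high], 0 otherwise.
// When n < low is true, the right condition is not evaluated.
int is_outside(int n, int low, int high)
{
    return n < low || n > high;
}
```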

Assignment Operators

  • Classical assignment: =. This operator assigns a value to a variable. The expression var = 4 + 2 evaluates to 6 (the assigned value). This property allows you to chain assignments:

int i, j, k;
i = j = k = 42; // i, j and k equal 42

Note that the coding style requires one declaration per line.

tip

a += b is a shortcut for a = a + b; the same goes for -=, *=, /= and %=.

int a = 5;
int b = 33;
a += b; // a == 38
int c += a; // does not compile: c is being declared and has no value to add to
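
To see the five compound assignment operators side by side, here is a short sketch (the function is hypothetical) where each line is annotated with its long form:

```c
// Applies each compound assignment operator in turn to x.
int apply_operations(int x)
{
    x += 3; // x = x + 3
    x *= 2; // x = x * 2
    x -= 4; // x = x - 4
    x /= 3; // x = x / 3 (integer division)
    x %= 5; // x = x % 5
    return x;
}
```

Starting from 5, x successively becomes 8, 16, 12, 4 and 4.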

Unary operators

Negation

The - operator is used to negate a numeric value. It is the same as a multiplication by -1.

int i = 2;
int j = -i; // j == -2

Increment/Decrement

In C you can use the ++ and the -- operators to respectively increment and decrement by 1 a variable.

When the ++ (or --) operator is placed before the variable, it is called pre-increment (or pre-decrement): the variable is first modified and then used in the expression.

On the other hand, when the ++ (or --) operator is placed after the variable, it is called post-increment (or post-decrement): the variable is first used in the expression and then modified.

int i = 2;
int j;
int k;

j = i++; // j == 2 and i == 3
k = j + ++i; // k == 6 and i == 4

Not

The ! operator is used with a boolean condition. Its effect is to reverse the value of the condition:

  • if CONDITION is true, then !CONDITION is false;
  • if CONDITION is false, then !CONDITION is true.

Priorities

The following operators are given from highest to lowest priority. Their associativities are also given: left or right. This is not a list of ALL operators in C, rather the most common ones.

Category      Operators          Associativity
parentheses   ()                 Left
unary         + - ++ -- !        Right
arithmetic    * / %              Left
arithmetic    + -                Left
comparisons   < <= > >=          Left
comparisons   == !=              Left
logical       &&                 Left
logical       ||                 Left
ternary       ?:                 Right
assignment    = += -= *= /= %=   Right

In programming languages, associativity is to be understood as operator associativity. When two operators are of the same precedence, in order to determine how to resolve the order of execution, we look at their respective associativity.

Left associativity indicates that operations are resolved left to right.

Right associativity indicates that operations are resolved right to left.

Example

int a = 1;
int result = ! -- a == 3 / 3;

The following rules will be applied in this order to resolve priority issues:

  • The unary operators ! and -- have the highest priority. As both of them are right-associative, -- is resolved before !.
  • The arithmetic division, /, is now the operator with highest priority, so the next operation will be 3 / 3.
  • Finally the ==, with the lowest priority, will be executed.

We could rewrite this whole operation as:

int result = (!(--a)) == (3 / 3);
tip

Associativity is not always obvious: do not hesitate to add parentheses, even if they are not required, to make some operator priorities explicit and ensure the code is easily readable.
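
The effect of priority, parentheses and associativity can be checked on small expressions. The three functions below are illustrative helpers, not part of any library.

```c
// * has a higher priority than +: evaluated as a + (b * c).
int multiply_first(int a, int b, int c)
{
    return a + b * c;
}

// Parentheses force the addition to happen first.
int add_first(int a, int b, int c)
{
    return (a + b) * c;
}

// - is left-associative: evaluated as (a - b) - c.
int subtract_chain(int a, int b, int c)
{
    return a - b - c;
}
```

With 2, 3 and 4, multiply_first gives 14 whereas add_first gives 20; subtract_chain(10, 4, 3) gives 3, not 9.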

ASCII

The American Standard Code for Information Interchange (abbreviated ASCII) is one of the most widely used encoding standards in the world. It was developed in the 1960s and maps 128 characters based on the English alphabet to numerical values.

tip

You can see the ASCII table by typing man ascii in your terminal.

You should really take a look at the ASCII table and notice a few things:

  • Characters are sorted logically, 'a' to 'z' are contiguous, as well as 'A' to 'Z' and '0' to '9'.
  • The character '0' does not have the value 0.
  • Some characters cannot be printed (for example ESC or DEL).

In C, a variable of type char can at least take values from 0 to 127, where each value in this range corresponds to a character following the ASCII table. The value of a char variable being a number, numerical operations can be performed on this variable.

#include <stdio.h>

int main(void)
{
    char c = 'A';
    c += 32;

    if (c >= 97 && c <= 122)
        puts("'c' has become a lowercase character!");

    return 0;
}

However, this version is hard to read because of its magic numbers. We prefer the following:

#include <stdio.h>

int main(void)
{
    char c = 'A';
    c += 'a' - 'A';

    if (c >= 'a' && c <= 'z')
        puts("'c' has become a lowercase character!");

    return 0;
}

You might wonder what the ASCII value of a given letter is. In practice it does not matter: you should always use the character itself when performing operations on characters.
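
Two more operations can be written with character constants only, never with raw ASCII values. This is a sketch with hypothetical function names; it assumes an ASCII-compatible character set, where letters and digits are contiguous.

```c
// Converts a lowercase letter to uppercase; other characters are unchanged.
char my_to_upper(char c)
{
    if (c >= 'a' && c <= 'z')
        c -= 'a' - 'A';
    return c;
}

// Converts a digit character to its numerical value ('7' gives 7).
// Returns -1 if c is not a digit.
int digit_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return -1;
}
```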

Control structures

Instructions and blocks

A block groups several instructions or expressions. It creates a scope in which the variables it declares can "live". It is delimited by { and }. A function body is a special kind of block. Blocks may be nested, and may be empty.
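
The notion of scope can be sketched with a nested block. The function and variable names below are made up for the example.

```c
int scope_demo(void)
{
    int total = 0; // visible in the whole function body

    {
        int inner = 5; // only exists inside this nested block
        total += inner;
    }
    // inner is out of scope here: using it would not compile

    return total;
}
```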

If ... else

Conditions allow the program to execute different instructions based on the result of an expression.

if (expression)
{
    instr1;
}
else
{
    instr2;
}

For example:

if (a > b)
    a = b;
else
    a = 0;
tip

You can see that there are no braces here: when a block contains only one instruction, the braces may be omitted.
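
Conditions can also be chained with else if when more than two outcomes are possible. A minimal sketch, with a hypothetical function name:

```c
// Returns -1, 0 or 1 depending on the sign of n.
int sign(int n)
{
    if (n < 0)
        return -1;
    else if (n == 0)
        return 0;
    else
        return 1;
}
```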

Ternary operator

This operator evaluates a condition and yields one of two values. It is a compact version of if ... else.

condition ? exp1 : exp2

It reads as follows:

"if" condition "then" exp1 "else" exp2

Example:

int i = 42;
int j = (i == 42) ? 43 : 42; // j equals 43
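
The ternary operator fits well in short functions that pick one of two values. The helpers below are illustrative (the standard library has its own abs, hence the my_ prefix).

```c
// Absolute value of n.
int my_abs(int n)
{
    return (n < 0) ? -n : n;
}

// Largest of two values.
int my_max(int a, int b)
{
    return (a > b) ? a : b;
}
```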

While

A loop repeats its instructions while the condition is met.

while (condition)
{
    instr;
}
tip

Braces are mandatory only if instr is made of several instructions.

Example:

int i = 0;

while (i < 100)
{
    i++;
}
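
A while loop is a natural choice when the number of iterations is not known in advance. As a sketch, here is Euclid's algorithm for the greatest common divisor, which is not part of the original text and is given only as an illustration:

```c
// Greatest common divisor of two non-negative integers (Euclid's algorithm).
int gcd(int a, int b)
{
    while (b != 0)
    {
        int remainder = a % b;

        a = b;
        b = remainder;
    }
    return a;
}
```

For example, gcd(48, 18) goes through the pairs (18, 12), (12, 6) and (6, 0) before returning 6.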

Do ... while

The condition is checked only after the first run of the loop. Hence, instr is always executed at least once.

do {
    instr;
} while (condition);

Example:

int i = 0;

do {
    i++;
} while (i < 100);
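
The guarantee of at least one iteration is useful in practice. In the following sketch (a hypothetical helper), counting the digits of a number needs one pass even when the number is 0:

```c
// Number of decimal digits of n (the sign is ignored).
int count_digits(int n)
{
    int count = 0;

    do {
        count++; // executed at least once, so 0 has one digit
        n /= 10;
    } while (n != 0);

    return count;
}
```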

For

Prefer the more compact for loop syntax when you need to repeat the same instructions a known number of times.

for (initialization; condition; increment)
{
    instr;
}

Example:

for (int i = 0; i < 10; i++)
{
    // do something 10 times
}
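
A slightly larger sketch of a for loop, where the number of iterations depends on a parameter. The factorial function is an example chosen for illustration:

```c
// Computes n! = 1 * 2 * ... * n; by convention 0! is 1.
// The result overflows quickly: keep n small (n <= 12 is always safe).
unsigned long factorial(unsigned int n)
{
    unsigned long result = 1;

    for (unsigned int i = 2; i <= n; i++)
    {
        result *= i;
    }
    return result;
}
```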

Break, continue

  • break: exits the current loop.
  • continue: skips the current iteration of a loop and goes directly to the next iteration.

Example:

for (int i = 0; i < 10; i++)
{
    if (i == 2 || i == 4)
        continue;
    else if (i == 6)
        break;
    puts("I am looping!");
}

// The text "I am looping!" will only be printed 4 times.
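
The two keywords can also be sketched separately, each in a function that returns a value. Both functions are hypothetical helpers written for this example.

```c
// Sum of the odd numbers in [0, limit): even numbers are skipped.
int sum_odd_below(int limit)
{
    int sum = 0;

    for (int i = 0; i < limit; i++)
    {
        if (i % 2 == 0)
            continue; // go straight to the next iteration
        sum += i;
    }
    return sum;
}

// Smallest divisor of n greater than 1 (n itself when n is prime).
// Assumes n >= 2.
int smallest_divisor(int n)
{
    int divisor = n;

    for (int i = 2; i < n; i++)
    {
        if (n % i == 0)
        {
            divisor = i;
            break; // found it: no need to look any further
        }
    }
    return divisor;
}
```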

Switch

The switch statement executes different instructions depending on the value of an expression. It is more elegant than a series of if ... else when dealing with a large number of possible values for one expression.

switch (expression)
{
case value:
    instr1;
    break;
/* ... */
default:
    instrn;
}

Detail:

  • value is an integer constant expression (for instance a numerical constant, a character constant or an enumeration value).
  • expression must have integer or enumeration type.

It is important to put a break at the end of each case: otherwise, execution continues into the instructions of the following cases until a break (or the end of the switch) is reached. The default case is optional. It is used to perform an action if none of the previous values match.

Example:

switch (a)
{
case 1:
    b++;
    break;
case 2:
    b--;
    break;
default:
    b = 0;
}
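
Omitting break is sometimes done on purpose, to let several values share the same instructions. The sketch below illustrates this with a hypothetical function; it is one possible use of the behavior, not a recommendation from the coding style.

```c
// Number of days in a month (1 = January ... 12 = December) of a
// non-leap year. Returns 0 for an invalid month.
int days_in_month(int month)
{
    int days;

    switch (month)
    {
    case 4:
    case 6:
    case 9:
    case 11:
        days = 30; // cases 4, 6 and 9 fall through to this line
        break;
    case 2:
        days = 28;
        break;
    case 1:
    case 3:
    case 5:
    case 7:
    case 8:
    case 10:
    case 12:
        days = 31;
        break;
    default:
        days = 0;
    }
    return days;
}
```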

Functions

Definition

A function can be defined as a reusable and customizable piece of source code, that may return a result. In C, there is barely any difference between functions and procedures. Procedures can be seen as functions that do not have a return value (void).

Use

A function is made of a prototype and a body.

Prototypes follow this syntax:

type my_func(type1 var1, ...);
  • type is the return type of the function (void in case of a procedure).
  • my_func is the name of the function (or symbol) and follows the same rules as variable names.
  • (type1 var1, ...) is the list of parameters passed to the function.

If the function has no parameter, you have to put the void keyword instead of the parameters list:

type my_func2(void);

Definition of the body:

type my_func(type1 var1, type2 var2...)
{
    /* code ... */
    return val;
}

The execution of the return instruction stops the execution of the function. If the function's return type is not void, return is mandatory: reaching the end of the function without one and then using its result causes undefined behavior. If the return type is void, return takes no value (return;) and its only use is to end the function's execution early.
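
Both uses of return can be sketched as follows. The names clamp and print_if_positive are made up for this example.

```c
#include <stdio.h>

// Restricts value to the range [low, high].
// As soon as a return is executed, the rest of the function is skipped.
int clamp(int value, int low, int high)
{
    if (value < low)
        return low;
    if (value > high)
        return high;
    return value;
}

// A procedure: nothing is returned, return only ends the execution early.
void print_if_positive(int n)
{
    if (n <= 0)
        return;
    puts("n is positive");
}
```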

danger

When a function has no parameter, forgetting the void keyword can lead to bugs.

Notice the difference between type my_func(void) and type my_func():

  • The type my_func(void) syntax indicates that the function is taking no arguments.

  • In C99, the type my_func() syntax means that the function takes an unspecified number of arguments (zero or more). You must avoid using this syntax.

When a function takes arguments, declare them; if it takes no arguments, use void.

Here is an example showing the risk of forgetting the void keyword.

int foo()
{
    if (foo(42))
        return 42;
    else
        return foo(0);
}

If you test this code, you will see that it compiles, but running it causes undefined behavior. However, if you use int foo(void), the calls foo(42) and foo(0) generate a compilation error.

Function call

In order to use a function, you need to call it, using this syntax:

my_fct(arg1, ...)

Arguments can either be variables or literal values.

Example:

int sum(int a, int b)
{
    return a + b;
}

int a = 43;
int c = sum(a, 5); // c == 48
tip

If you want to call a function that does not take any argument, just leave the parentheses empty.

Arguments of a function are always passed by copy, which implies that their modification will not have an impact outside the function.

#include <stdio.h>

void modif(int i)
{
    i = 0;
}

int main(void)
{
    int i;

    i = 42;
    modif(i);
    if (i == 42)
        puts("Not modified");
    else
        puts("Modified");
    return 0;
}

The previous example displays "Not modified".
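
Since the function only works on a copy, one way to get a modified value back to the caller is to return it and assign the result. This is a sketch with hypothetical function names:

```c
// Works on a copy of the argument and returns the new value.
int add_one(int n)
{
    n += 1;
    return n;
}

// The returned value is ignored: the caller's variable is unchanged.
int demo_discarded(void)
{
    int value = 41;

    add_one(value);
    return value; // still 41
}

// The returned value is assigned back to the caller's variable.
int demo_assigned(void)
{
    int value = 41;

    value = add_one(value);
    return value; // now 42
}
```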

Recursion

It is possible for a function to be recursive, that is, to call itself. The following example returns the sum of the numbers from 0 to i (for i >= 0).

int recurse(int i)
{
    if (i)
        return i + recurse(i - 1);
    return 0;
}
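
Every recursive function needs a stop condition, otherwise it would call itself forever. Another sketch following the same pattern, with a hypothetical power function:

```c
// Computes base raised to the power exponent.
// Only meant for small results that fit in an int.
int power(int base, unsigned int exponent)
{
    if (exponent == 0)
        return 1; // stop condition: anything to the power 0 is 1
    return base * power(base, exponent - 1);
}
```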

Forward declaration

Sometimes, it is necessary to use a function before its definition (before its code). In this case, it is enough to write the function's prototype above the location where we want to make the function call, outside of any block. This is the same as declaring the function (to declare that the function exists) without defining it (implementing its body). Hence, the compiler will know that the function exists but that its implementation will be given later.

Example (note the ; at the end of the prototype):

int my_fct(int arg1, float arg2);

int my_fct2(int arg1)
{
    return my_fct(arg1, 0.3);
}

int my_fct(int arg1, float arg2)
{
    // returns something (placeholder body so that the example compiles)
    return arg1 + (int)arg2;
}

Without the forward declaration, the compiler would tell you it does not know the function my_fct.
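
A forward declaration becomes unavoidable when two functions call each other, since one of them must be used before it is defined. The pair below is a classic illustration; it is written for this example and is deliberately inefficient.

```c
// Forward declaration: is_odd is defined further down.
int is_odd(unsigned int n);

// Returns 1 if n is even, 0 otherwise.
int is_even(unsigned int n)
{
    if (n == 0)
        return 1;
    return is_odd(n - 1);
}

// Returns 1 if n is odd, 0 otherwise.
int is_odd(unsigned int n)
{
    if (n == 0)
        return 0;
    return is_even(n - 1);
}
```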