Numeric Algebra

Numeric algebra, as used here, deals with how numbers are written and represented. In computer science, the same number can be represented in several different systems. Understanding these systems makes it easier to read and write code effectively.

Decimal System

In school, we learn arithmetic using the decimal system. It is written with Hindu-Arabic numerals, which are now the standard for numeric representation. The system is base 10: it has ten digit symbols (0-9), and the position of a digit within a number determines its weight.

Digit Names

Interestingly, the names for numbers differ across languages. Here is a comparison of the numbers 0 through 10 and their names in English and several Romance languages.

Decimal	English	Romanian	Italian	Spanish	French
0	zero	zero	zero	cero	zéro
1	one	unu	uno	uno	un
2	two	doi	due	dos	deux
3	three	trei	tre	tres	trois
4	four	patru	quattro	cuatro	quatre
5	five	cinci	cinque	cinco	cinq
6	six	şase	sei	seis	six
7	seven	şapte	sette	siete	sept
8	eight	opt	otto	ocho	huit
9	nine	nouă	nove	nueve	neuf
10	ten	zece	dieci	diez	dix

Large Numbers

For large numbers, we group digits in threes for readability. This can be tricky, as the convention for the thousands separator varies geographically. Many European countries use a dot, while the United States uses a comma.

European number: 24.220.421
American number: 24,220,421

In standard decimal notation, leading zeros are not significant and are typically omitted. For example, the number 024.220.421 is understood as 24.220.421.
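The grouping conventions above can be reproduced in code. The following is a minimal sketch in Python; the variable names are illustrative, and the dot-grouped form is produced by a simple character swap rather than by real locale handling.

```python
# Format an integer with a thousands separator.
n = 24220421

american = f"{n:,}"                    # comma as the grouping character
european = american.replace(",", ".")  # swap in a dot for the dot-grouping convention

print(american)  # 24,220,421
print(european)  # 24.220.421

# Leading zeros do not change the value when the text is parsed.
print(int("024220421") == n)  # True
```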

Small Numbers (Decimals)

For fractional numbers, we use a decimal separator. This convention also varies by region. Americans use a dot (.), while Europeans often use a comma (,).

European small number: 0,22
American small number: 0.22

In many contexts, trailing zeros after the decimal point do not change a number's value (e.g., 0.22 is the same as 0.220). However, in scientific and engineering fields, trailing zeros are often used to indicate a specific level of precision.
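One way to see both points at once is Python's decimal module, which compares numbers by value but remembers the digits as written. This is only a sketch; the names a and b are illustrative.

```python
from decimal import Decimal

a = Decimal("0.22")
b = Decimal("0.220")

print(a == b)  # True: both have the same numeric value
print(str(a))  # 0.22
print(str(b))  # 0.220 (the trailing zero is preserved as written)
```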

Scientific Notation

This notation is used to represent very large or very small numbers concisely. It takes the form m × 10ⁿ, where the mantissa 'm' satisfies 1 ≤ |m| < 10 and the exponent 'n' is an integer. Example:

1.22 × 10¹² = 1,220,000,000,000

Engineering Notation

Similar to scientific notation, but the exponent 'n' is always a multiple of 3, so it lines up with the metric prefixes (kilo, mega, giga, and so on). The mantissa 'm' then satisfies 1 ≤ |m| < 1000.

12.12 × 10¹² = 12,120,000,000,000
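As a small illustration, Python's decimal module can print a value in engineering notation through Decimal.to_eng_string(). The sketch below assumes the value is supplied with a positive exponent; a plain integer string would simply be printed in full.

```python
from decimal import Decimal

# The same quantity as in the example above, entered in scientific form.
value = Decimal("1.212E13")

print(str(value))             # 1.212E+13 (scientific form)
print(value.to_eng_string())  # 12.12E+12 (exponent is a multiple of 3)
```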

Exponential Notation

Computer science uses "E" notation as a substitute for × 10ⁿ, because it needs only ASCII characters. The format is "mEn" or "men", meaning m × 10ⁿ. Values smaller than 1 use a negative exponent: "mE-n".

Earth mass: 5.9724E24 kg
One inch is: 2.54E1 mm (or 25.4 mm)
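Most languages accept E notation directly. Here is a short Python sketch that parses it from text, writes it as a literal, and formats numbers back into it; the variable names are illustrative.

```python
earth_mass_kg = float("5.9724E24")  # parse E notation from text
inch_mm = 2.54E1                    # E notation also works as a literal

print(inch_mm == 25.4)              # True
print(f"{1220000000000:.2e}")       # 1.22e+12
print(f"{0.00022:.1E}")             # 2.2E-04 (negative exponent for a small value)
```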

Roman System

The Roman numeral system is historically interesting but is not used in computer science for computation because it lacks a representation for zero and is inefficient for arithmetic.

In this system, numbers are represented by combinations of letters from the Latin alphabet. It can be a fun exercise to write a program that converts a decimal number into a Roman numeral. Here are the basic symbols:

Symbols used:

1	5	10	50	100	500	1000
I	V	X	L	C	D	M

Count to 10:

I	II	III	IV	V	VI	VII	VIII	IX	X
1	2	3	4	5	6	7	8	9	10

Numbers are formed by combining symbols and adding their values. The symbols I, X, C, and M can be repeated up to three times in a row; V, L, and D are never repeated. To avoid repeating a symbol four times (e.g., IIII for 4), a subtractive principle is used: a smaller symbol placed before a larger one is subtracted.

Correct (Subtraction)	Incorrect (Addition)	Decimal
IV	IIII	4
IX	VIIII	9
XL	XXXX	40
XC	LXXXX	90
CD	CCCC	400
CM	DCCCC	900
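The conversion exercise mentioned earlier can be sketched as follows. This is one possible approach, not the only one: a greedy walk over a table that lists the subtractive pairs alongside the basic symbols. The function name to_roman and the 1 to 3999 range limit are assumptions made for this example.

```python
# Value/symbol pairs in descending order, including the subtractive forms.
ROMAN_PAIRS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]

def to_roman(n):
    """Convert an integer from 1 to 3999 into a Roman numeral string."""
    if not 1 <= n <= 3999:
        raise ValueError("n must be between 1 and 3999")
    parts = []
    for value, symbol in ROMAN_PAIRS:
        count, n = divmod(n, value)  # how many times this symbol fits
        parts.append(symbol * count)
    return "".join(parts)

print(to_roman(4))     # IV
print(to_roman(90))    # XC
print(to_roman(1994))  # MCMXCIV
```

Because the subtractive pairs are part of the table, the loop never needs a special case for 4, 9, 40, 90, 400, or 900.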

Binary System

The fundamental storage unit in a computer is a bit, which can hold a value of either 0 or 1. By grouping bits, we can represent larger numbers. A system with two digits {0, 1} is called a "binary" system.

Counting from 0 to 10

Representing the numbers 0 through 10 in binary requires 4 bits.

0000 = 0
0001 = 1
0010 = 2
0011 = 3
0100 = 4
0101 = 5
0110 = 6
0111 = 7
1000 = 8
1001 = 9
1010 = 10
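The table above can be generated, and read back, with Python's built-in conversions. A minimal sketch:

```python
# Print 0 through 10 as 4-bit binary strings.
for n in range(11):
    print(format(n, "04b"), "=", n)

print(int("1010", 2))     # 10: parse a binary string
print((10).bit_length())  # 4: bits needed to represent 10
```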

Octal System

If we group 3 bits, we can represent 8 distinct values {0,1,2,3,4,5,6,7}. This base-8 representation is called the "octal" system. It is less common than hexadecimal in modern computing, though it survives in places such as Unix file permissions.

Hexadecimal System

Using 4 bits allows for 16 combinations. To represent these values, we use the 10 decimal digits plus the first 6 letters of the alphabet: {0, 1, ..., 9, A, B, C, D, E, F}. This base-16 representation is called "hexadecimal". Since one hexadecimal digit corresponds directly to 4 bits (a nibble), it provides a human-friendly way to represent binary-coded values.
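To make the bit grouping concrete, the sketch below prints one value in binary, octal, and hexadecimal, then rebuilds the hexadecimal form one nibble at a time. The sample value is arbitrary.

```python
n = 0b1010_1111  # 175 in decimal

print(format(n, "08b"))  # 10101111
print(format(n, "o"))    # 257 (groups of 3 bits: 10 101 111)
print(format(n, "X"))    # AF  (groups of 4 bits: 1010 1111)

# Each 4-bit group maps to exactly one hexadecimal digit.
bits = format(n, "08b")
nibbles = [bits[i:i + 4] for i in range(0, len(bits), 4)]
hex_digits = "".join(format(int(nibble, 2), "X") for nibble in nibbles)
print(nibbles, hex_digits)  # ['1010', '1111'] AF
```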

Memory Addresses

A group of 8 bits is called a byte. A byte can represent 2⁸ = 256 different values, from 0 to 255 in decimal, or 00 to FF in hexadecimal.

A word is a common unit of data, though its size can vary by architecture. For a 16-bit word, we can represent 2¹⁶ = 65,536 values (0 to 65,535).

Memory Capacity

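Memory is measured in bytes and their multiples. The list below uses the binary convention (factors of 1024) that is traditional for RAM; the IEC names for these units are KiB, MiB, GiB, and so on, while the SI prefixes strictly denote factors of 1000.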

1 B = 1 Byte = 8 bits
1 KB (Kilobyte) = 1024 B
1 MB (Megabyte) = 1024 KB
1 GB (Gigabyte) = 1024 MB
1 TB (Terabyte) = 1024 GB
1 PB (Petabyte) = 1024 TB
1 EB (Exabyte) = 1024 PB
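A byte count can be turned into one of these units by dividing by 1024 repeatedly. The helper below is a hypothetical sketch (human_size is not a standard function) and follows the 1024-based units listed above.

```python
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]

def human_size(num_bytes):
    """Express a byte count in the largest unit that keeps the value at 1 or more."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop when the value fits in this unit, or when no larger unit exists.
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1024

print(human_size(500))        # 500 B
print(human_size(1536))       # 1.5 KB
print(human_size(1024 ** 3))  # 1 GB
```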

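A 32-bit system can address 2³² = 4,294,967,296 unique memory locations. With one byte per location, that is 4 GB, which is why 32-bit operating systems are generally limited to about 4 GB of RAM (often less in practice, because part of the address space is reserved for hardware).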

A 64-bit system can theoretically address 2⁶⁴ bytes, which is 16 exabytes in the 1024-based units above. In practice, current hardware and operating systems have lower limits; for example, common AMD64 implementations use 48-bit virtual addresses, which corresponds to 256 TB.
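The figures above are plain powers of two and can be checked directly. In this sketch the constants GB, TB, and EB are illustrative names, and the 48-bit line assumes that the 256 TB limit comes from a 48-bit address width.

```python
GB = 1024 ** 3
TB = 1024 ** 4
EB = 1024 ** 6

print(2 ** 32)        # 4294967296 addressable locations
print(2 ** 32 // GB)  # 4   (GB with one byte per location)
print(2 ** 64 // EB)  # 16  (EB)
print(2 ** 48 // TB)  # 256 (TB, assuming 48 address bits)
```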

Endian Encoding

The order in which bytes are arranged to form a larger numerical value in memory is called endianness. For now, it is sufficient to know that this ordering can differ between computer architectures (e.g., PC vs. older Mac systems).

BE (Big-Endian): Stores the most significant byte at the smallest memory address.

LE (Little-Endian): Stores the least significant byte at the smallest memory address.

Endianness also applies to the order of bits transmitted over a communication channel.
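The two byte orders are easy to compare in Python, where int.to_bytes takes the order as an argument. A minimal sketch with an arbitrary 32-bit value:

```python
import sys

value = 0x12345678

big = value.to_bytes(4, byteorder="big")
little = value.to_bytes(4, byteorder="little")

print(big.hex())     # 12345678 (most significant byte first)
print(little.hex())  # 78563412 (least significant byte first)

# Reading the bytes back with the matching order recovers the value.
print(int.from_bytes(little, byteorder="little") == value)  # True

# Byte order of the machine running this code: "little" or "big".
print(sys.byteorder)
```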

Type Systems

Programming languages must deal with data. To do this, data is classified by category. This classification and its associated rules are known as a "type system" in computer science.

Immutable vs. Mutable Data

Data can be constant (immutable) or changeable (mutable). Data embedded directly in a program's source code is typically immutable. External data, stored on a read/write device like a Hard Disk (HDD), is usually mutable. Data on a Read-Only Memory (ROM) chip is immutable.

Variables

A variable is a name that refers to data stored in RAM (Random-Access Memory). The value of a variable can be altered easily. While at a low level data is stored as bits, high-level languages allow us to work with abstract data types like "numeric," "string," "boolean," "date," and "time."

Measurement Units

Most computer languages do not have a built-in notion of physical measurement units. A number can represent anything—width, height, weight, etc. Date and time types are a notable exception. Adding a full measurement unit system is often considered to add too much complexity to a general-purpose language.

Static vs. Dynamic Typing

Static Typing: A language uses static typing when the data type of a variable is fixed when it is declared and cannot be changed during runtime. The type is checked at compile-time.

Dynamic Typing: A language uses dynamic typing when a variable's type is determined at runtime and can change as different values are assigned to it. This does not mean the variable lacks a type, only that the type is flexible.
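Python is dynamically typed, so it can illustrate the second definition. In the sketch below the name x is rebound to values of different types, yet every value still has a type that the language enforces at runtime.

```python
x = 42
print(type(x).__name__)  # int

x = "forty-two"          # the same name now refers to a string
print(type(x).__name__)  # str

# The value is still typed: mixing str and int raises an error at runtime.
try:
    x + 1
except TypeError as err:
    print("TypeError:", err)
```

In a statically typed language, the second assignment would typically be rejected before the program runs.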

Identifiers

When you create a program, you give names to program elements like variables, data structures, and sub-programs. These names are symbolic representations that allow you to refer to data and logic easily.

Example:

In this Python example, we define a list of numbers and check if a given number is a member of the list. Here, v and x are both identifiers.

v = [1, 4, 6, 12, 32]
x = int(input("check this number: "))
if x in v:
    print("found")
else:
    print("not found")

Sigil

Some languages use a special prefix character, called a "sigil," to indicate the type or scope of an identifier. For example, in Ruby, a global variable starts with $, while an instance variable starts with @. In PHP, all variables start with $.

Data Literals

Data literals are notations for representing fixed values in source code. For example, 100 is the literal for the decimal number 100. There are various notations for different data types.

Example	Description
'a'	Single-quoted character or string
"string"	Double-quoted string
0b10110	Binary integer literal
1234	Decimal integer literal
0xFFFF	Hexadecimal integer literal
U+FFFF	Unicode code-point
0.05	Floating-point literal
1E10	Floating-point literal in E notation
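Most of the notations in the table are valid Python literals. The sketch below prints their values; U+FFFF is not a literal in Python, so the string escape "\uFFFF" is used here as the nearest equivalent.

```python
print(0b10110)  # 22
print(1234)     # 1234
print(0xFFFF)   # 65535
print(0.05)     # 0.05
print(1E10)     # 10000000000.0

# Python makes no distinction between single and double quotes.
print('a' == "a")  # True

# A Unicode escape inside a string stands for one code point.
print(ord("\uFFFF") == 0xFFFF)  # True
```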

Expressions

An expression is a combination of values, variables, operators, and functions that a language interprets and computes to produce another value. There are three common notations for expressions:

+ x y  # Prefix
x + y  # Infix
x y +  # Postfix

In these examples, x and y are operands and + is the operator. Expressions can be as simple as a single literal (e.g., 4) or a combination of many elements (e.g., (a + b) / 2).
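Python itself is infix, but the other two notations can be imitated. In this sketch a function call stands in for prefix notation, and a small stack-based evaluator handles postfix. The helper eval_postfix is hypothetical and supports only the four basic operators on space-separated tokens.

```python
import operator

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def eval_postfix(tokens):
    """Evaluate a postfix expression given as a list of tokens."""
    stack = []
    for token in tokens:
        if token in OPS:
            right = stack.pop()  # operands come off in reverse order
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(float(token))
    return stack.pop()

x, y = 3, 4
print(operator.add(x, y))                 # 7   (operator first, like prefix)
print(x + y)                              # 7   (infix)
print(eval_postfix("3 4 +".split()))      # 7.0 (postfix)
print(eval_postfix("3 4 + 2 /".split()))  # 3.5
```

Postfix needs no parentheses: the order of the tokens alone fixes the order of evaluation.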

Infix Expressions

Most programming languages use infix notation, where operators are written between their operands.

x + y + z
a + b / 2 + c * 2
(a + b) / (a * b)
x != y
a <= 5
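With infix notation, operator precedence and parentheses decide the order of evaluation. The sketch below evaluates expressions like the ones above in Python; the sample values are arbitrary.

```python
a, b, c = 4, 4, 1

print(a + b / 2)          # 6.0: division binds tighter than addition
print((a + b) / 2)        # 4.0: parentheses force the addition first
print(a + b / 2 + c * 2)  # 8.0
print((a + b) / (a * b))  # 0.5
print(a != c)             # True
print(a <= 5)             # True
```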

Namespaces & Scope

The "scope" of an identifier is the region of the code where it is visible and can be used. Each sub-program usually defines its own "local scope."

If an identifier is visible throughout the entire program, it is called a "global" identifier. A program can have a single global scope or multiple module-level scopes.

Scope Model

Early programming languages often used "Dynamic Scoping." Modern languages predominantly use "Static Scoping" (also called Lexical Scoping), which is considered superior for code readability and maintainability.

Dynamic Scope

In dynamic scoping, the scope of an identifier is determined by the execution path of the program. A function can access the variables of the function that called it. This can make code difficult to reason about, as function behavior can change depending on where it is called from.

Static (Lexical) Scope

In static scoping, the scope of an identifier is determined by its position in the source code. An inner function can access variables from its outer (enclosing) functions. This allows for powerful features like "closures," where a function remembers the environment in which it was created.
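A closure can be sketched in Python, which uses lexical scoping. The function names make_counter and increment are illustrative.

```python
def make_counter():
    count = 0  # lives in the enclosing function's scope

    def increment():
        nonlocal count  # refer to the enclosing variable, not a new local
        count += 1
        return count

    return increment

counter = make_counter()
print(counter())  # 1
print(counter())  # 2

# Each call to make_counter creates a separate environment.
other = make_counter()
print(other())    # 1
```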

Shadowing Effect

If you declare a variable in a local scope with the same name as a variable in an outer scope, the local variable "shadows" the outer one. Within the local scope, the name will refer to the local variable, effectively hiding the outer one.
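A short Python sketch of shadowing, with illustrative names:

```python
name = "outer"

def show():
    name = "local"  # a new local variable that shadows the outer one
    return name

print(show())  # local
print(name)    # outer (the outer variable is unchanged)
```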

Dot notation

When you define a data structure (like a class or object), its members (attributes or methods) can be accessed using "dot notation".

Example:

class Person:
    name = "Barbu"
    age = 22

# using dot notation to access class attributes
print(Person.name) # Barbu
print(Person.age) # 22

Note: The example above defines a class with class attributes. These attributes are shared by all instances of the class. They can be accessed directly on the class itself without creating an instance.
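The sharing described in the note can be observed by creating two instances. This sketch repeats the class so that it runs on its own; the names first and second are illustrative.

```python
class Person:
    name = "Barbu"
    age = 22

first = Person()
second = Person()

# Both instances see the class attributes through dot notation.
print(first.name, second.name)  # Barbu Barbu

# Changing the class attribute is visible through every instance.
Person.age = 23
print(first.age, second.age)    # 23 23

# Assigning through an instance creates an instance attribute instead,
# which hides the class attribute for that instance only.
first.age = 30
print(first.age, second.age, Person.age)  # 30 23 23
```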
