Numeric Algebra
Decimal System
In school, we learn arithmetic using the decimal system. It is based on the Hindu-Arabic numerals, now the worldwide standard for numeric representation, and uses a unique symbol for each of the ten digits (0-9).
Digit Names
Interestingly, the names for numbers differ across languages. Here is a comparison of the digits (plus ten) and their names in English and several Romance languages.
| Decimal | English | Romanian | Italian | Spanish | French |
|---|---|---|---|---|---|
| 0 | zero | zero | zero | cero | zéro |
| 1 | one | unu | uno | uno | un |
| 2 | two | doi | due | dos | deux |
| 3 | three | trei | tre | tres | trois |
| 4 | four | patru | quattro | cuatro | quatre |
| 5 | five | cinci | cinque | cinco | cinq |
| 6 | six | șase | sei | seis | six |
| 7 | seven | șapte | sette | siete | sept |
| 8 | eight | opt | otto | ocho | huit |
| 9 | nine | nouă | nove | nueve | neuf |
| 10 | ten | zece | dieci | diez | dix |
Large Numbers
For large numbers, we group digits in threes for readability. This can be tricky, as the convention for the thousands separator varies geographically. Many continental European countries use a dot (or a space), while Americans use a comma.
- European number: 24.220.421
- American number: 24,220,421
Small Numbers (Decimals)
For fractional numbers, we use a decimal separator. This convention also varies by region, and it mirrors the thousands separator: Americans use a dot (.), while many Europeans use a comma (,).
- European small number: 0,22
- American small number: 0.22
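Both conventions can be reproduced with a short Python sketch. The helper names `format_american` and `format_european` are hypothetical, chosen for this illustration; the European variant simply swaps the two separators produced by Python's built-in format specification.

```python
def format_american(value: float, decimals: int = 2) -> str:
    """Comma as thousands separator, dot as decimal separator."""
    return f"{value:,.{decimals}f}"


def format_european(value: float, decimals: int = 2) -> str:
    """Dot as thousands separator, comma as decimal separator."""
    american = format_american(value, decimals)
    # Swap "," and "." in a single pass.
    return american.translate(str.maketrans(",.", ".,"))


print(format_american(24220421, 0))  # 24,220,421
print(format_european(24220421, 0))  # 24.220.421
print(format_american(0.22))         # 0.22
print(format_european(0.22))         # 0,22
```

For production code, the standard `locale` module is usually a better fit, since it follows the conventions configured on the user's system.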
Scientific Notation
This notation is used to represent very large or very small numbers concisely. It takes the form m × 10ⁿ, where the mantissa 'm' is at least 1 and less than 10, and 'n' is an integer exponent. Example:
- 1.22 × 10¹² = 1,220,000,000,000
Engineering Notation
Similar to scientific notation, but the exponent 'n' is always a multiple of 3, so the mantissa 'm' is at least 1 and less than 1000. This lines up with metric prefixes such as kilo, mega, and giga.
- 12.12 × 10¹² = 12,120,000,000,000
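As a rough sketch, the mantissa and exponent of the engineering form can be computed by rounding the base-10 exponent down to a multiple of 3. The function name `to_engineering` is an assumption made for this example, not a standard library call.

```python
import math


def to_engineering(value: float) -> tuple[float, int]:
    """Return (mantissa, exponent) with the exponent a multiple of 3."""
    if value == 0:
        return 0.0, 0
    exponent = math.floor(math.log10(abs(value)))
    exponent -= exponent % 3  # round down to the nearest multiple of 3
    return value / 10**exponent, exponent


print(to_engineering(12_120_000_000_000))  # (12.12, 12)
```

Because the calculation uses floating-point logarithms, values that sit extremely close to a power of ten may need extra care in real code.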
Exponential Notation
Computer science uses "E" notation as a substitute for × 10ⁿ, as it only uses ASCII characters. The format is "mEn" or "men". Numbers smaller than 1 use a negative exponent: "mE-n".
- Earth mass: 5.9724E24 kg
- One inch is: 2.54E1 mm (or 25.4 mm)
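Python accepts E notation both in source code and when parsing text, so the examples above can be checked directly. The variable names here are illustrative only.

```python
earth_mass_kg = float("5.9724E24")  # upper-case E
inch_mm = float("2.54e1")           # lower-case e works as well
small = float("1.5E-3")             # negative exponent for a small number

print(earth_mass_kg)                # 5.9724e+24
print(inch_mm)                      # 25.4
print(small)                        # 0.0015

# Going the other way: format a number in E notation.
print(f"{1220000000000:.2e}")       # 1.22e+12
```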
Roman System
The Roman numeral system is historically interesting but is not used in computer science for computation because it lacks a representation for zero and is inefficient for arithmetic.
In this system, numbers are represented by combinations of letters from the Latin alphabet. It can be a fun exercise to write a program that converts a decimal number into a Roman numeral. Here are the basic symbols:
Symbols used:
| 1 | 5 | 10 | 50 | 100 | 500 | 1000 |
|---|---|---|---|---|---|---|
| I | V | X | L | C | D | M |
Count to 10:
| I | II | III | IV | V | VI | VII | VIII | IX | X |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
Numbers are formed by combining symbols, written from largest to smallest, and adding their values. The symbols I, X, C, and M can be repeated up to three times in a row; V, L, and D are never repeated. To avoid repeating a symbol four times (e.g., IIII for 4), a subtractive principle is used: a smaller symbol placed before a larger one is subtracted from it.
| Correct (Subtraction) | Incorrect (Addition) | Decimal |
|---|---|---|
| IV | IIII | 4 |
| IX | VIIII | 9 |
| XL | XXXX | 40 |
| XC | LXXXX | 90 |
| CD | CCCC | 400 |
| CM | DCCCC | 900 |
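The conversion exercise mentioned earlier can be sketched as a greedy algorithm: treat the six subtractive pairs from the table as symbols in their own right, then repeatedly take the largest value that still fits. The names `ROMAN_SYMBOLS` and `to_roman` are hypothetical, and the 1 to 3999 range is an assumption that follows from the "at most three repetitions" rule.

```python
# Values in descending order, including the subtractive pairs.
ROMAN_SYMBOLS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(number: int) -> str:
    """Convert an integer in the range 1..3999 to a Roman numeral."""
    if not 1 <= number <= 3999:
        raise ValueError("number must be between 1 and 3999")
    parts = []
    for value, symbol in ROMAN_SYMBOLS:
        count, number = divmod(number, value)  # how many times the value fits
        parts.append(symbol * count)
    return "".join(parts)


print(to_roman(4))     # IV
print(to_roman(90))    # XC
print(to_roman(1994))  # MCMXCIV
```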
Binary System
The fundamental storage unit in a computer is a bit, which can hold a value of either 0 or 1. By grouping bits, we can represent larger numbers. A system with two digits {0, 1} is called a "binary" system.
Counting from 0 to 10
Representing the numbers 0 through 10 in binary requires 4 bits.
- 0000 = 0
- 0001 = 1
- 0010 = 2
- 0011 = 3
- 0100 = 4
- 0101 = 5
- 0110 = 6
- 0111 = 7
- 1000 = 8
- 1001 = 9
- 1010 = 10
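The list above can be reproduced with Python's format specification, where `04b` means "binary, zero-padded to 4 digits". This is only a small illustration of the mapping, not a description of how the hardware stores the bits.

```python
# Print 0 through 10 as 4-bit binary numbers.
for number in range(11):
    print(f"{number:04b} = {number}")

# Convert a binary string back to an integer.
print(int("1010", 2))      # 10

# Number of bits needed to represent 10.
print((10).bit_length())   # 4
```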
Octal System
If we group 3 bits, we can represent 8 distinct values {0,1,2,3,4,5,6,7}. This base-8 representation is called the "octal" system. It is less common than hexadecimal in modern computing, though it survives in places such as Unix file permissions.
Hexadecimal System
Using 4 bits allows for 16 combinations. To represent these values, we use the 10 decimal digits plus the first 6 letters of the alphabet: {0, 1, ..., 9, A, B, C, D, E, F}. This base-16 representation is called "hexadecimal". Since one hexadecimal digit corresponds directly to 4 bits (a nibble), it provides a human-friendly way to represent binary-coded values.
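To make the grouping concrete, here is a short Python sketch that prints one value in octal and hexadecimal, and then converts a bit string to hexadecimal one nibble at a time. The helper `nibbles_to_hex` is a hypothetical name used only for this example.

```python
value = 0b11111111            # eight bits, all set

print(format(value, "o"))     # 377
print(format(value, "X"))     # FF
print(oct(8), hex(255))       # 0o10 0xff


def nibbles_to_hex(bits: str) -> str:
    """Translate a bit string into hex, one digit per group of 4 bits."""
    width = -(-len(bits) // 4) * 4          # round length up to a multiple of 4
    padded = bits.zfill(width)              # left-pad with zeros
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join(format(int(group, 2), "X") for group in groups)


print(nibbles_to_hex("10110"))     # 16
print(nibbles_to_hex("11111111"))  # FF
```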
Memory Addresses
A group of 8 bits is called a byte. A byte can represent 2⁸ = 256 different values, from 0 to 255 in decimal, or 00 to FF in hexadecimal.
A word is a common unit of data, though its size can vary by architecture. For a 16-bit word, we can represent 2¹⁶ = 65,536 values (0 to 65,535).
Memory Capacity
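Memory is measured in bytes and their multiples. The list below uses the binary convention, in which each step is a factor of 1024; these units are also written KiB, MiB, GiB, and so on, to distinguish them from the decimal (factor of 1000) units often used for disk capacity.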
- 1 B = 1 Byte = 8 bits
- 1 KB (Kilobyte) = 1024 B
- 1 MB (Megabyte) = 1024 KB
- 1 GB (Gigabyte) = 1024 MB
- 1 TB (Terabyte) = 1024 GB
- 1 PB (Petabyte) = 1024 TB
- 1 EB (Exabyte) = 1024 PB
A 32-bit system can address 2³² = 4,294,967,296 unique memory locations. With one byte per location, that is exactly 4 GB, which is why 32-bit operating systems are limited to at most 4 GB of directly addressable RAM, and often somewhat less in practice because part of the address space is reserved for hardware.
A 64-bit system can theoretically address 2⁶⁴ bytes, which is 16 exabytes. In practice, current hardware and operating systems have lower limits; for example, many AMD64 implementations use 48-bit virtual addresses, which span 256 TB.
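The arithmetic behind these limits can be checked with a small sketch. The function `human_size` and the `UNITS` list are hypothetical names for this example, and the code assumes the factor-of-1024 units listed above.

```python
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]


def human_size(byte_count: int) -> str:
    """Express a byte count in the largest unit that keeps the value >= 1."""
    size = byte_count
    for unit in UNITS:
        # Stop when the value fits, or when there is no larger unit left.
        if size < 1024 or unit == UNITS[-1]:
            return f"{size:g} {unit}"
        size /= 1024


print(human_size(2**32))  # 4 GB
print(human_size(2**48))  # 256 TB
print(human_size(2**64))  # 16 EB
```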
Endian Encoding
The order in which bytes are arranged to form a larger numerical value in memory is called endianness. For now, it is sufficient to know that this ordering can differ between computer architectures (e.g., PC vs. older Mac systems).
BE (Big-Endian): Stores the most significant byte at the smallest memory address.
LE (Little-Endian): Stores the least significant byte at the smallest memory address.
Endianness also applies to the order of bits transmitted over a communication channel.
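Python's `int.to_bytes` and `int.from_bytes` make the two byte orders easy to compare. In this sketch the 32-bit value `0x12345678` is an arbitrary example; `sys.byteorder` reports the native order of the machine running the code.

```python
import sys

value = 0x12345678

big = value.to_bytes(4, byteorder="big")
little = value.to_bytes(4, byteorder="little")

print(big.hex())     # 12345678  (most significant byte first)
print(little.hex())  # 78563412  (least significant byte first)

# Reading the bytes back with the matching order restores the value.
print(int.from_bytes(little, byteorder="little") == value)  # True

# Native byte order of this machine: "little" or "big".
print(sys.byteorder)
```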
Type Systems
Programming languages must deal with data. To do this, data is classified by category. This classification and its associated rules are known as a "type system" in computer science.
Immutable vs. Mutable Data
Data can be constant (immutable) or changeable (mutable). Data embedded directly in a program's source code is typically immutable. External data, stored on a read/write device like a Hard Disk (HDD), is usually mutable. Data on a Read-Only Memory (ROM) chip is immutable.
Variables
A variable is a name that refers to data stored in RAM (Random-Access Memory). The value of a variable can be altered easily. While at a low level data is stored as bits, high-level languages allow us to work with abstract data types like "numeric," "string," "boolean," "date," and "time."
Measurement Units
Most computer languages do not have a built-in notion of physical measurement units. A number can represent anything—width, height, weight, etc. Date and time types are a notable exception. Adding a full measurement unit system is often considered to add too much complexity to a general-purpose language.
Static vs. Dynamic Typing
Static Typing: A language uses static typing when the data type of a variable is fixed when it is declared and cannot be changed during runtime. The type is checked at compile-time.
Dynamic Typing: A language uses dynamic typing when a variable's type is determined at runtime and can change as different values are assigned to it. This does not mean the variable lacks a type, only that the type is flexible.
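Python is dynamically typed, so it can illustrate the second definition. In the sketch below, the same name is bound first to an integer and then to a string; the final step shows that the value still carries a type, because mixing incompatible types raises an error at runtime.

```python
value = 42
first_type = type(value).__name__

value = "forty-two"              # the same name now refers to a string
second_type = type(value).__name__

print(first_type, second_type)   # int str

try:
    value + 1                    # a str and an int cannot be added
except TypeError:
    print("cannot add a str and an int")
```

In a statically typed language, the second assignment would typically be rejected by the compiler before the program runs.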
Identifiers
When you create a program, you give names to program elements like variables, data structures, and sub-programs. These names are symbolic representations that allow you to refer to data and logic easily.
Example:
In this Python example, we define a list of numbers and check if a given number is a member of the list. Here, v and x are both identifiers.
v = [1, 4, 6, 12, 32]
x = int(input("check this number: "))
if x in v:
    print("found")
else:
    print("not found")
Sigil
Some languages use a special prefix character, called a "sigil," to indicate the type or scope of an identifier. For example, in Ruby, a global variable starts with $, while an instance variable starts with @. In PHP, all variables start with $.
Data Literals
Data literals are notations for representing fixed values in source code. For example, 100 is the literal for the decimal number 100. There are various notations for different data types.
| Example | Description |
|---|---|
| 'a' | Single-quoted character or string |
| "string" | Double-quoted string |
| 0b10110 | Binary integer literal |
| 1234 | Decimal integer literal |
| 0xFFFF | Hexadecimal integer literal |
| U+FFFF | Unicode code-point |
| 0.05 | Floating-point literal |
| 1E10 | Scientific notation for floats |
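Most of the notations in the table are valid Python literals, so they can be tried out directly. One caveat, specific to Python: it has no separate character type, and a Unicode code point such as U+FFFF is written as the escape `\uFFFF` inside a string.

```python
print(0b10110)                  # 22
print(1234)                     # 1234
print(0xFFFF)                   # 65535
print(0.05)                     # 0.05
print(1E10)                     # 10000000000.0

# Single and double quotes both produce a str in Python.
print('a' == "a")               # True

# A code point written as an escape sequence inside a string.
print("\uFFFF" == chr(0xFFFF))  # True
```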
Expressions
An expression is a combination of values, variables, operators, and functions that a language interprets and computes to produce another value. There are three common notations for expressions:
+ x y # Prefix
x + y # Infix
x y + # Postfix
In these examples, x and y are operands and + is the operator. Expressions can be as simple as a single literal (e.g., 4) or a combination of many elements (e.g., (a + b) / 2).
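Postfix notation is easy to evaluate with a stack: operands are pushed, and each operator pops its two operands and pushes the result. The following is a minimal sketch under that assumption; `eval_postfix` and `OPERATORS` are hypothetical names, and tokens are expected to be separated by spaces.

```python
import operator

OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_postfix(expression: str) -> float:
    """Evaluate a space-separated postfix expression."""
    stack = []
    for token in expression.split():
        if token in OPERATORS:
            right = stack.pop()   # the second operand was pushed last
            left = stack.pop()
            stack.append(OPERATORS[token](left, right))
        else:
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


print(eval_postfix("3 4 +"))      # 7.0
print(eval_postfix("3 5 + 2 /"))  # 4.0, the postfix form of (3 + 5) / 2
```

Note that postfix needs no parentheses: the order of the tokens alone determines the order of evaluation.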
Infix Expressions
Most programming languages use infix notation, where operators are written between their operands.
x + y + z
a + b / 2 + c * 2
(a + b) / (a * b)
x != y
a <= 5
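Infix notation relies on operator precedence and parentheses to decide the order of evaluation. The sketch below evaluates some of the expressions above in Python with arbitrary sample values; other languages may differ in details such as integer division.

```python
a, b, c = 4, 4, 3
x, y = 1, 2

# Division and multiplication bind tighter than addition:
# a + (b / 2) + (c * 2) = 4 + 2.0 + 6
print(a + b / 2 + c * 2)   # 12.0

# Parentheses override the default precedence: 8 / 16
print((a + b) / (a * b))   # 0.5

# Comparison operators produce boolean values.
print(x != y)              # True
print(a <= 5)              # True
```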
Namespaces & Scope
If an identifier is visible throughout the entire program, it is called a "global" identifier. A program can have a single global scope or multiple module-level scopes.
Scope Model
Early programming languages often used "Dynamic Scoping." Modern languages predominantly use "Static Scoping" (also called Lexical Scoping), which is considered superior for code readability and maintainability.
Dynamic Scope
In dynamic scoping, the scope of an identifier is determined by the execution path of the program. A function can access the variables of the function that called it. This can make code difficult to reason about, as function behavior can change depending on where it is called from.
Static (Lexical) Scope
In static scoping, the scope of an identifier is determined by its position in the source code. An inner function can access variables from its outer (enclosing) functions. This allows for powerful features like "closures," where a function remembers the environment in which it was created.
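A closure can be sketched in Python as follows. The names `make_counter` and `increment` are illustrative; the point is that the inner function keeps access to `count` even after the outer function has returned.

```python
def make_counter():
    count = 0  # lives in the enclosing (outer) scope

    def increment():
        nonlocal count  # refer to the variable of the enclosing function
        count += 1
        return count

    return increment


counter = make_counter()
print(counter())  # 1
print(counter())  # 2

# Each call to make_counter creates a separate environment.
other = make_counter()
print(other())    # 1
```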
Shadowing Effect
If you declare a variable in a local scope with the same name as a variable in an outer scope, the local variable "shadows" the outer one. Within the local scope, the name will refer to the local variable, effectively hiding the outer one.
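A short Python illustration of shadowing, with hypothetical names: the assignment inside the function creates a new local variable and leaves the outer one untouched.

```python
name = "outer"


def show():
    name = "inner"  # local variable that shadows the outer `name`
    return name


print(show())  # inner
print(name)    # outer
```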
Dot notation
When you define a data structure (like a class or object), its members (attributes or methods) can be accessed using "dot notation".
Example:
class Person:
    name = "Barbu"
    age = 22

# using dot notation to access class attributes
print(Person.name)  # Barbu
print(Person.age)   # 22
Note: The example above defines a class with class attributes. These attributes are shared by all instances of the class. They can be accessed directly on the class itself without creating an instance.
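The sharing described in the note can be observed with a small extension of the example. The instance names and the value "Ana" are made up for illustration; assigning through an instance creates an attribute on that instance only, which then shadows the class attribute.

```python
class Person:
    name = "Barbu"
    age = 22


first = Person()
second = Person()

# Both instances see the shared class attribute.
print(first.name, second.name)  # Barbu Barbu

# Assigning through an instance creates an instance attribute
# that hides the class attribute for that instance only.
second.name = "Ana"

print(first.name)   # Barbu
print(second.name)  # Ana
print(Person.name)  # Barbu
```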