Sage-Code Laboratory

Numeric Algebra

Numeric algebra is the science of numbers. In computer science a number can be represented in several different systems, and you have to learn these representations in order to understand existing code and to write new code.


Decimal System

In school you learn arithmetic using the decimal system. It is based on old Arabic numerals, now considered the most convenient numeric representation. It uses a single symbol for each digit from 0 to 9; the number 10 is the first one written with two symbols.

Digit Names

We have a "name" for each digit. Funny though, numbers are translated different in different languages.

Decimal  English  Romanian  Italian  Spanish  French
0        zero     zero      zero     cero     zéro
1        one      unu       uno      uno      une
2        two      doi       due      dos      deux
3        three    trei      tre      tres     trois
4        four     patru     quattro  cuatro   quatre
5        five     cinci     cinque   cinco    cinq
6        six      şase      sei      seis     six
7        seven    şapte     sette    siete    sept
8        eight    opt       otto     ocho     huit
9        nine     nouă      nove     nueve    neuf
10       ten      zece      dieci    diez     dix

Large Numbers

For large numbers we use more digits, grouped by three. The groups of 3 digits are separated using a dot or a comma. This is a tricky business, because Europeans use the dot as the group separator while Americans use the comma.

The first digit of a large number cannot be 0, because leading zeros are not significant. Therefore 024.220.421 is not a correctly written decimal number.

Small Numbers

For small numbers we use decimals. These are the digits that follow the "." (dot) in American notation. In Europe we use the "," (comma) to write decimal fractions. So now you can be totally confused, because we are so divided even though we all live on the same planet.

The last digit of a decimal fraction cannot be 0, because trailing zeros are not significant. Therefore 0.220 is not a correctly written decimal number; it should be written 0.22.

Scientific Notation

We multiply a small number "m", with 1 ≤ m < 10, by 10 raised to the power "n" (m × 10ⁿ). This notation is used to display very large numbers in scientific papers. For example, the speed of light, 299,792,458 m/s, is written as 2.99792458 × 10⁸ m/s.

Engineering Notation

We multiply a number "m" by 10 raised to a power "n" that is a multiple of 3 (m × 10ⁿ). This notation is used in engineering because the exponent maps directly to unit prefixes. For example, 45,000,000 Hz is written as 45 × 10⁶ Hz, that is 45 MHz.

Exponential Notation

We are using "E" notation in Computer Science because we are using ASCII symbols and there is no support for superscript. So we use letter "e" and "E" to express power of 10 without explicit using number 10. This notation looks like: "###E##" or "###e##" where "#" is a digit. Fractions are using negative exponent: "###E-##".

Roman System

Believe it or not, the Romans were not very good at mathematics. They knew how to count only up to about 5,000, using stick-like symbols, and they had no representation for zero. Therefore this numeric system is not actually used in computer science.

In this system, numbers are represented using stick-like symbols. It is a fun exercise to write a program that converts a decimal number into a Roman number; later we will use this exercise to learn programming languages (a small sketch follows after the tables below). Here are some of the Roman numerals:

Symbols used:

1  5  10  50  100  500  1000
I  V  X   L   C    D    M

Count to 10:

I  II  III  IV  V  VI  VII  VIII  IX  X
1  2   3    4   5  6   7    8     9   10

To create numbers you can use addition or subtraction, so you can create both correct and incorrect forms. Sometimes this can be quite difficult, and you need a lot of practice to read large Roman numbers.

Numerals can be added together to form new numbers (e.g., III = I + I + I = 3), but no more than three of the same numeral can be added together.

In addition, to form any numbers ending with 4 or 9 or the numbers 40, 90, 400, and 900, you must subtract from the larger unit because you cannot add more than three of the same numeral. For example, IV = V − I = 5 − 1 = 4.

Correct (using subtraction)   Incorrect (using addition)   Decimal
IV                            IIII                         4
IX                            VIIII                        9
XL                            XXXX                         40
XC                            LXXXX                        90
CD                            CCCC                         400
CM                            DCCCC                        900
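
As a preview of the exercise mentioned above, here is a minimal Python sketch that converts a decimal number into a Roman number using the subtraction pairs from the table (the function name to_roman is just an illustrative choice):

# decimal → Roman conversion using value/symbol pairs, largest first
PAIRS = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
         (100, "C"),  (90, "XC"),  (50, "L"),  (40, "XL"),
         (10, "X"),   (9, "IX"),   (5, "V"),   (4, "IV"), (1, "I")]

def to_roman(number):
    result = ""
    for value, symbol in PAIRS:
        while number >= value:   # repeat each symbol as long as it still fits
            result += symbol
            number -= value
    return result

print(to_roman(1994))  # MCMXCIV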

Binary System

A single storage unit is called a bit. It can store exactly one of two values: 0 or 1. Using multiple storage units grouped together we can store combinations of 0 and 1. For example, with two bits we can represent 4 combinations: {00, 01, 10, 11}. This is called the "binary" system. It has two digits: {0, 1}.

Counting from 1 to 10

It takes 4 bits to be able to count from 0 to 10. This is one of the reasons computers organize memory in multiples of 4 bits. Now let's learn how to count using binary:
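
A quick way to see this is the built-in bin-formatting in Python; a minimal sketch:

# counting from 0 to 10 in binary (4 bits are enough for 0..10)
for n in range(11):
    print(f"{n:2d} = {n:04b}")   # e.g.  5 = 0101, 10 = 1010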

Octal System

If we group 3 bits together we can represent 8 digits: {0, 1, 2, 3, 4, 5, 6, 7}. This kind of representation is called the "octal" system, and it is very rarely used.
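
In Python an octal literal uses the 0o prefix, and oct() converts back; for example:

# 3 bits per octal digit: 0b111 == 0o7 == 7
print(0o17)        # 15
print(oct(15))     # 0o17
print(f"{15:o}")   # 17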

Hexadecimal System

We can use 4 bits and increase the number of combinations to 16 digits: {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F}. The first is {0001 = 1} and the last is {1111 = F}. This representation is called "hexadecimal". One hexadecimal digit occupies exactly 4 bits, so it maps very well onto the binary system.

As you can see, 16 combinations cannot be written with only 10 digits. Therefore new digits were invented: {A, B, C, D, E, F}. Sometimes these digits are written with lowercase characters: {a, b, c, d, e, f}.
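
In Python, hexadecimal literals use the 0x prefix; hex() and int(..., 16) convert in both directions:

# one hexadecimal digit covers exactly 4 bits
print(0b1111)           # 15
print(hex(15))          # 0xf
print(int("FF", 16))    # 255
print(f"{255:X}")       # FF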

Memory Addresses

Computer science uses groups of 8 bits to represent memory addresses between (0..255) ≡ (00..FF). Such a group is called a byte: 1 byte = 8 bits.

Two bytes are used to represent a word: 1 word = 16 bits ≡ (0000..FFFF). On 16 bits we can represent 2¹⁶ = 65,536 addresses, that is, numbers from 0 to 65,535.
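
These ranges are easy to verify; for instance in Python:

# value ranges for 1 byte (8 bits) and 1 word (16 bits)
print(2**8  - 1, hex(2**8  - 1))   # 255 0xff
print(2**16 - 1, hex(2**16 - 1))   # 65535 0xffff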

Memory Capacity

In computer science we measure memory using Bytes (B), Kilobytes (KB), Megabytes (MB), then Gigabytes (GB), Terabytes (TB), Petabytes (PB) and Exabytes (EB).

  1. 1 B = 1 Byte = 8 bit
  2. 1 KB = 1024 B
  3. 1 MB = 1024 KB
  4. 1 GB = 1024 MB
  5. 1 TB = 1024 GB
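
These relations can be checked with a short calculation, since every step is a factor of 1024; a minimal Python sketch:

# binary memory units, each step is a factor of 1024 (2**10)
KB = 1024
MB = 1024 * KB
GB = 1024 * MB
TB = 1024 * GB
print(MB)           # 1048576 bytes
print(GB == 2**30)  # True
print(TB == 2**40)  # True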

On 32 bits we can represent 2³² = 4,294,967,296 addresses, that is, numbers from 0 to 4,294,967,295. Therefore 32-bit operating systems have a limited memory capacity of less than 4 GB of RAM.

On 64 bits we can represent 2⁶⁴ = 18,446,744,073,709,551,616 addresses. This is a very large number, corresponding to a 16 exabyte address space. Of course, in practical applications the maximum is much lower; for example, the AMD64 standard allows 256 TB of RAM.
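
Python integers have arbitrary precision, so these limits can be computed directly:

# number of distinct addresses on 32 and 64 bits
print(2**32)                 # 4294967296 (~4 GB address space)
print(2**64)                 # 18446744073709551616
print(2**64 // 2**60, "EB")  # 16 EB (exabytes)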

Endian Encoding

The internal representation of numbers and symbols differs depending on the computer type, operating system and device. For now it suffices to know that, for example, "PC" encoding is different from "MAC" encoding.

In computing, endianness is the order or sequence of bytes of a word of digital data in computer memory. Endianness is primarily expressed as big-endian (BE) or little-endian (LE).

BE: A big-endian system stores the most significant byte of a word at the smallest memory address and the least significant byte at the largest.

LE: A little-endian system, in contrast, stores the least-significant byte at the smallest address.

Endianness may also be used to describe the order in which the bits are transmitted over a communication channel, e.g., big-endian in a communications channel transmits the most significant bits first.
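
A minimal sketch using Python's standard struct module shows both byte orders for the same 16-bit word (sys.byteorder reports the order used by the machine running the code):

import struct
import sys

# the same 16-bit word 0x1234 stored in the two byte orders
print(struct.pack(">H", 0x1234).hex())  # 1234 (big-endian: most significant byte first)
print(struct.pack("<H", 0x1234).hex())  # 3412 (little-endian: least significant byte first)
print(sys.byteorder)                    # byte order of the current machine, e.g. 'little'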

Identifiers

When you create a program, you give names to program elements. These names are symbolic representations of data elements, structures, groups of statements or sub-programs. You need a name so that you can refer to a data element more easily, without repeating the data symbols themselves.

Example:

In this example we define a vector "v" using a Python single-dimension list, then check whether an element is a member of this list. "v" and "x" are both identifiers.

v = [1, 4, 6, 12, 32]                   # "v" is a list (vector) of integers
x = int(input("check this number: "))   # input() returns a string, so convert it to int
if x in v:
    print("found")
else:
    print("not found")

Sigil:

Sometimes a language uses a prefix for variable names. This prefix is called a "sigil" and its purpose is to differentiate identifiers by role. For example, in Ruby global variables use the sigil "$": all variables that start with "$" are global, while the sigil "@" marks an object attribute.

In PHP all variables start with "$", global or not. Some developers find this rule annoying. In my languages Bee and EVE I use the sigil "$" for global system constants and "@" for global system variables.

Data Literals

Data literals are symbols or groups of symbols that represent constant data. For example, 100 represents the number one hundred written in decimal. There are numerous other notations for numbers, representing different data types.

A group of multiple data elements, like a list or a data set, can have a special literal created with alphanumeric symbols, delimiters and separators. Once you have learned these conventions, most languages will be easier to grasp, since they all use the same conventions.

Example   Description
'a'       Single quoted string
"string"  Double quoted string
0b10110   Binary number
1234      Integer number
0xFFFF    Hexadecimal number
U+FFFF    Unicode code-point
0.05      Decimal number
1E10      Scientific notation (float number)
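
Most of these notations are accepted by Python directly (note that Python treats single and double quoted strings the same, and a Unicode code point is written inside a string as \uXXXX):

print(0b10110)     # 22   (binary literal)
print(0xFFFF)      # 65535 (hexadecimal literal)
print(1234)        # integer literal
print(0.05)        # decimal (floating point) literal
print(1E10)        # 10000000000.0 (scientific notation)
print("\u00E9")    # é  (Unicode code point inside a string)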

Expressions

You should be familiar with this concept from mathematics. In computer science there are 3 kinds of expressions: infix, postfix and prefix. It is easiest to demonstrate the differences by looking at examples of operators that take two operands:

Expressions types:

+ x y : Prefix
x + y : Infix
x y + : Postfix

In these expressions x and y are operands, while "+" is an operator. The simplest expressions use a single operator and one operand: for example "-4" is an expression, while "4" alone is not an expression but a constant literal. "2 + 4", however, is an expression even if there is no variable involved.

Expressions can be combined into larger expressions using operators. The order of execution can be controlled using operator precedence and round parentheses, as in (x+y). We will investigate this in the following examples and in the snippet after them:

Infix Expressions:

x+y+z
a + b / 2 + c * 2
(a + b) / (a * b)
x ≠ y
a ≤ 5
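
Python, like most languages, uses infix notation with the usual operator precedence; a minimal sketch with made-up values for a, b and c:

a, b, c = 6, 4, 2
print(a + b / 2 + c * 2)   # 12.0       -> / and * bind tighter than +
print((a + b) / (a * b))   # 0.4166...  -> parentheses are evaluated first
print(a != b, c <= 5)      # True True  -> ≠ and ≤ are written != and <= in code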

Most computer languages use infix expressions. You will learn the details about literals and expressions in the next course: CSP: Programming. We usually describe literals and expressions as basic language concepts in the first or second tutorial article about every computer language.

Type systems

Programming languages must deal with data. To do this, data is classified into categories, and for each category there are rules of representation and manipulation. In computer science this classification and its rules are called a "type system".

Immutable Data

Data can be embedded into the program itself or it can be external. When embedded in the program, data is immutable: it is constant, and you can change this kind of data only by changing the program.

Mutable Data

External data is usually mutable, if it is stored on a device with read & write (R/W) capability, like a HDD (hard disk) or a RW-CD (rewritable compact disc). Sometimes a device stores external data in read-only mode: for example an optical disc or ROM (Read Only Memory) can store data that becomes immutable.

Variables

A variable represents data stored in RAM (Random Access Memory). We can alter this kind of data very easily, many times, without wearing out the storage. In low-level computation data is stored as bits, "0" or "1", but in a high-level language you can store and manipulate abstract data types, for example: "numeric", "string", "boolean", "date", "time".
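
For example, in Python you can store these abstract types directly (date and time come from the standard datetime module; the values below are made up):

from datetime import date, time

amount  = 100.50             # numeric
name    = "Alice"            # string
active  = True               # boolean
start   = date(2023, 1, 15)  # date
meeting = time(14, 30)       # time
print(amount, name, active, start, meeting)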

Measurement Units

Most computer languages have no notion of measurement units, so a number can represent any kind of physical or abstract quantity, for example width, height or weight. Only "date" and "time" have measurement units. Computer scientists believe that a language dealing with measurement units would become too complex.

Static typing

In Computer Science we say a programming language is using "static typing" when the data type for a variable or a parameter must be defined in the same time with the variable and can not be changed during run-time. You can of course change the declaration in source code and then a variable will have a new type but this is a permanent change.

Dynamic typing

In Computer Science we say a programming language is using "dynamic typing" when the data type for a variable or a parameter is not defined and can be changed. This do not means a variable do not have a type. It means the type can change when you change the value dynamically at run-time.

Namespaces & Scope

Each sub-program usually has its own "local scope". Sometimes, in the local scope, you can define nested sub-programs. Most languages allow the creation of variables and constants in the local scope.

When you define identifiers you must know where each identifier is visible. If an identifier is visible in the whole program, it is called a "global" identifier. The area of visibility is called "scope": a program can have a single "global scope" and many "module scopes" or "package scopes".

Scope Model

Older programming languages have "dynamic scoping". At the time this scope model was created people did not know any better, so the terminology was introduced later, after "static scoping" was discovered. New programming languages use a "static scope", which is considered superior to a "dynamic scope".

Dynamic Scope

It is used by structured and imperative programming languages. In this model you can design global variables that are used in many functions and can be modified anywhere in the program. Therefore function results can be influenced from outside, and the same arguments can produce different results.

Dynamic scope is created on the stack. Every time a function is called, its local variables are created and pushed on the stack. When the function ends execution, all variables defined in its local scope are removed from memory.

Static Scope

In this model the scope is created on the heap. A function has a unique scope that is resident in memory. When a function is called, its variables are already initialized; only the parameters are pushed on the stack. Therefore a function behaves like a first-class object, and this is how functions can have state. In imperative programming this was known as "static variables".

Execution Context

In local scoping there is a concept of "execution context": the outer scope of a function. In statically scoped languages, the outer scope variables are bound to the inner functions. This feature enables functional programming languages to create "closures".
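
A minimal Python sketch of a closure, where the inner function keeps using the variable from its outer (execution) context even after the outer function has finished:

def make_adder(n):          # outer scope: n lives in the execution context
    def add(x):
        return x + n        # the inner function is bound to the outer n
    return add

add5 = make_adder(5)        # the closure captures n = 5
print(add5(10))             # 15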

Later mutation of the external variables will not influence the function result: the function keeps using the initial value that was captured when the function was created, and subsequent modifications are ignored. This reinforces the encapsulation concept and makes functions more independent.

Shadowing Effect

Shadowing is a side effect of most programming languages that support local scope. If you define a variable in the local scope with the same name as a variable defined in an outer scope, the two names collide. To avoid the collision, languages usually hide the external variable and give access only to the local one. This effect is called "shadowing". Parameters also shadow outer variables or parameters.
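
A small Python example of shadowing, where a local variable hides a global one with the same name:

count = 10                  # global variable

def report():
    count = 1               # local variable shadows the global one
    print(count)            # 1  -> the local value wins inside the function

report()
print(count)                # 10 -> the global variable is untouched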

Dot notation

When you define a data structure, its elements can be public or private. If the elements are public, the usual notation, called "dot notation", enables you to access a member of the structure by name.

Example:

class Person:
    name = "Barbu"      # class attribute
    age  = 22           # class attribute
# end class

# using dot notation
print(Person.name)   # Barbu
print(Person.age)    # 22

Note: In the previous example we define a "structure class" in Python, also known as a "data class". This kind of class does not need to be instantiated and behaves like a singleton: an object with a single instance, also known as a "static class".


Read next: Programming Paradigms