Programming Concepts

This article is intended for those new to computer science. We begin by exploring the fundamental concepts that form the foundation of programming and software engineering. These concepts are essential for creating new software, solving problems, and supporting business operations. Understanding them will help you build a strong base for further study and creation of practical applications.

Computer programs

Computer programs, often called applications, are fundamentally a series of step-by-step instructions executed by a computer. These instructions are organized into statements, which are then grouped into modules or packages. A single program can consist of numerous packages and may include embedded data or rely on external files for data storage.

Application Purpose

Applications are built using one or more programming languages and are typically designed to solve a specific problem. They often include multiple features that allow users to interact with the system, sending or receiving information through input/output devices such as a console display or a printer.

Application Execution

The execution of a program aims to resolve a problem and communicate the result, or it can produce a physical effect, such as displaying a message on a monitor or controlling a robotic arm. A program can be designed to be executed a single time or multiple times as needed.

Code base

Computer programs consist of files stored in folders, which we call a "project" or code base. A project contains text files with language-specific extensions (source code), and may also include data files, images, audio, and video. Usually, one file is designated as the main entry point that utilizes the other files. A project should also contain documentation.

Sub-programs

Sub-programs serve the principle of "separation of concerns," meaning each sub-program is designed to do one small task. The main program then combines the effects of multiple sub-programs to implement the system's end-to-end functionality. Sub-programs can be contained within a larger program or delivered as reusable components or libraries.

Depending on the computer language, sub-programs are called sub-routines, procedures, functions, methods, or rules. What they have in common is that they all encapsulate specific functionality. The main program orchestrates the execution of sub-programs by transferring control to sub-processes, which can be executed synchronously or asynchronously (sometimes in parallel).

Formal parameters

Sub-programs can receive parameters and can compute results. Sometimes we call these "formal parameters" or "receiver variables". Usually parameters for sub-programs are immutable during sub-program execution.

Parameters are similar to local variables. They are known only inside the sub-routine. Once the subroutine is finished, the value of the parameters is lost or can be propagated back to the caller. This depends on the declaration syntax and purpose of the parameter.

Subprogram Results

A subprogram can produce one or more results, which are returned to the part of the program that initiated the call. This is often accomplished using a return statement. The data type of the result must match the type declared in the subprogram's signature. Some subprograms, known as procedures or void functions, do not return a value and are executed only for their side effects, such as printing to a console or modifying a file.

Programming Symbols

A program is made of symbols that are used to create expressions. Most languages are using expressions that are similar to the ones we learn in mathematics. Sometimes beginners have hard time understanding the difference between "expressions", "symbols" and relation with "functions". We will explain these things here.

Symbols can be single character or multi-character. Single character symbols can be {"digits", "letters", "ASCII", "UNICODE"}. We use symbols to create: operators, punctuation marks, numeric literals, string literals, identifiers, enumerations and data collections.

In general a program manipulate "data". This is also made of symbols. We learn in school to count using Arabic digital symbols: 0,1,2,3,4,5,6,7,8,9. We also learn how to read and write text using Latin alphabet: Aa, Bb, Cc, Dd, Ee, Ff, Gg ... Zz. This is not the case in all cultures around the globe. Some cultures are using other symbols to represent data.

Encoding

The way we represent data using the symbols is called encoding. The most popular encoding system known as Unicode. Before Unicode was invented we used different encodings system known as: ASCII. For different countries there is an Extended ASCII table that has diacritics and special characters. Also Unicode have several variations: UTF16, UTF32 and UTF8.

Binary

A computer uses binary representations of "1" and "0" for encoding symbols. The "1" typically represents a positive electrical charge, while "0" represents no charge. It is similar to a light bulb: when the light is on, we can consider it to represent the logical value "1"; when the light is off, we consider it to represent "0".

Storage

Electronic devices store data using specialized components. There are several physical methods, including "electro-static," "electro-magnetic," "optic," "magneto-optic," and "solid-state electronics." The fundamental point is that all modern computation is digital and based on a binary system.

Logic Science

Logic was discovered by the ancient Greeks and formalized by Aristotle. The word itself means "thought" or "reason." Initially a branch of philosophy, it has since evolved into a formal branch of mathematics. Logic is the science of truth, and its purpose is to provide a framework for reasoning.

Fundamental Concepts

To understand logic, we must first define some foundational words and rules. A firm grasp of these definitions is essential for having meaningful conversations and making sound arguments in computer science. This section provides an introduction to these core concepts.

Inference

Inference, also known as deduction or implication, is a core concept in logic. It is the conclusion reached after studying the arguments or premises within a sentence. Inference is a fundamental method of thinking and reasoning.

Logical Inference: Using valid arguments in a sentence allows us to derive a logical conclusion. Sound arguments will produce the same conclusion regardless of who makes the arguments or derives the conclusion.

Inference and reason may not operate the same way in all contexts. For example, individuals from different philosophical or religious backgrounds might interpret a sentence in different ways, leading to different conclusions.

Sentences

Sentences are sequences of words that together express an idea, describe a situation, give a command, or make an inquiry. Sentences can be classified into four main categories:

  • Imperative sentences (orders/commands)
  • Interrogative sentences (inquiries)
  • Declarative sentences (statements)
  • Exclamatory sentences (expressing surprise)

Elements

A sentence can have five elements: { subject, verb, object, complement, and adjunct } (SVOCA). The subject is the performer of an action. It usually appears at the beginning of a sentence and is represented by a noun or its equivalent (e.g., a pronoun, a noun phrase). The study of sentence structure is known as grammar.

Statements

Statements are declarative sentences that express a fact, idea, or opinion. Statements do not make requests, give commands, or express surprise. A statement usually ends with a period (.).

Some statements are neither true nor false but have an undetermined value, such as “maybe” or “probable.” There is a special branch of logic for handling such statements.

Propositions

Propositions are statements that can be either true or false, but not both at the same time. Propositions are the building blocks of propositional logic.

There are examples of declarative sentences that are not propositions. For example, the sentence “This sentence is false” is a well-known paradox. If we assign it the truth value 'True,' we imply the sentence is false. If we assign it 'False,' we imply the sentence is not false, creating a contradiction.

Fundamental principle

It is important to know that formal logic derives valid inferences from declarative sentences, not from imperative or interrogative ones. An imperative sentence is a command from an authority, such as a ruler, government, or deity.

Info: Drawing inferences from imperative sentences can lead to irrational beliefs, unjust laws, and social injustice. Logic must be based on facts and solid arguments, not on commands that bypass reason.

Composition rules

Statements can be simple or composite. Composite statements can include arguments. A composite statement is made of simple statements. Most statements are true or false.

Let's say we have an argument or a premise and a sentence. Following are some logic rules that can apply to these two:

  • Identity rule: An argument is identical with itself and not something else;
  • Contradiction rule: An argument cannot be true and false in same sentence;
  • Validity rule: An argument is valid if the truth of premises lead to truth of the conclusion.

Validity and Soundness

A deductive argument is said to be valid if takes a form that makes it impossible for the premises to be true and the conclusion nevertheless to be false. Otherwise, a deductive argument is said to be invalid.

A deductive argument is sound if and only if it is both valid, and all of its premises are true. Otherwise, a deductive argument is unsound. Unsound arguments are invalid.

Propositional Logic

Propositional logic studies the relationships between declarative sentences (propositions). These propositions can have Boolean values of True or False. This branch of logic does not concern itself with the content of the propositions, but only with the logical relationships between them.

In these rules, we use variables like A, B, and P to represent logical propositions. The core of propositional logic involves operators like AND, OR, and NOT, which have precise rules that make it a reliable and foundational theory in computer science.

Fundamental rules:

Several logical rules are universal and cannot be broken, regardless of culture or language. Understanding these rules is a crucial first step toward understanding computer science.

  • If A and B are both true, then P = A and B is true.
  • If either A or B (or both) are false, then P = A and B is false.
  • If either A or B (or both) are true, then P = A or B is true.
  • If both A and B are false, then P = A or B is false.
  • If A is true, then P = not A is false.
  • If A is false, then P = not A is true.

Additionally, the "or" and "and" operators are commutative, meaning the order of the propositions does not affect the result:

  • (A and B) is equivalent to (B and A)
  • (A or B) is equivalent to (B or A)

De Morgan's laws:

Sometimes, you can transform one logical expression into an equivalent one without knowing the value of the arguments. These rules, known as De Morgan's laws, are very useful in computer science for optimizing Boolean expressions.

  • The negation of a disjunction is the conjunction of the negations.
  • The negation of a conjunction is the disjunction of the negations.

In terms of expressions, this means:

  • not (not A) == A (Double Negation)
  • not (A or B) == (not A) and (not B)
  • not (A and B) == (not A) or (not B)

Boolean algebra

Boolean algebra is the branch of algebra in which the values of the variables are the truth values true and false, usually denoted 1 and 0 respectively. This contrasts with general algebra that study other numbers not only the value 1 and 0.

Logical Operations:

I have found these are the most suitable symbols for logical operators. Though no programming language yet uses them.

symbol alternative meaning notes
¬ ! NOT negation
& AND conjunction
| OR disjunction
^ XOR exclusive disjunction
  NOR p ↓ q = ¬ (p ∨ q)
  NAND p ↑ q = ¬ (p ∧ q)

The table of truth:

Next table shows all possible combinations and the result for logical operators.

p q ¬ p ¬ q p ⊕ q p ∧ q p ∨ q
1 1 0 0 0 1 1
1 0 0 1 1 0 1
0 1 1 0 1 0 1
0 0 1 1 0 0 0

Predicate logic

Predicate logic is an extension of propositional logic. In propositional logic we look at propositions and arguments. That can be true or false but in reality there are many other things that are not true or false. In predicate logic we use terms and relations between terms to establish the value of truth of a proposition. Predicate logic is essential in programming, to deal with numbers and data sets.

Use case

Predicate logic define new terminology and notations. Understanding this terminology will help you read complex documentation used in programming. This may be a language specification or high level requirement that you can receive on the job. You can be invited to elaborate the requirements in detailed design that require logic.

Predicate

A predicate is a property or a relation between one or more terms. One term can be a variable or a constant. A predicate is not a logic entity but it can be evaluated to value True, False or invalid, depending on the range of predicate arguments.

Predicates with one argument, are called monadic. Predicates with more arguments are called dyadic, n-adic or poly-adic. Some languages are using monadic predicates while other languages can accept poly-adic predicates.

Arguments

The arguments are specific values, used to evaluate a predicate. Sometimes the arguments are called: predicate variables or parameters. In programming and mathematics a predicate is a function or an expression that can be evaluated to True or False depending on the arguments.

Expressions

Simplest predicates are the ones expressing properties of things. We use uppercase letters to express a predicate and lowercase letters to express arguments. Some examples of predicates:

  • A(x): x is tall
  • B(x,y): x owes money to y
  • C(x,y,z): x borrowed y from z

Domain

A variable used in a predicate can be bound to a specific domain. When a variable is not bound, is called “free variable”. This is not an argument but it can influence the result of predicate. In logic papers you may encounter the name “universe of discourse” (UD) that is actually a domain or range of values from a particular data set.

Data set

A data set is a range of values. It can be numeric or can represent symbols or objects. A data set can be empty, or can have one or more elements. Some data sets can have an infinite number of elements. In a data-set the elements are unique. The elements in a data-set are usually ordered. A data set is represented as an enumeration of symbols separated by comma and enclosed in brackets, like:

  • A = {a, b, c}
  • N = {1,2,3}

Relations

Elements in a datasets can be associated using a relation. A relation between two arguments is called binary relation. The elements of a binary relation is the Cartesian product (x). A relation R is a predicate with (at least) two arguments:

R = A x N = {(a,1), (a,2), (a,3), (b,1), (b,2), (b,3), (c,1), (c,2), (c,3)}

In logic, sometimes we use symbol “∼” to define relations: For example: x∼y means x is related to y. Also we can say that pair (x,y) belong to relation “∼”. Most of the time a relation is represented by a binary operator. For example we can use { =, <, > } to express the relation between numbers.

Quantifiers

In logic we use quantifiers to refer to a set of elements. The quantifier is a symbol that specify one or more elements we are referring to. So far I have not found programming languages that are using quantifiers. Usually a quantifier is implemented using one or more statements.

  • ∀ = all (universal quantifier)
  • ∃ = exist (existential quantifier)
Note: Qualifiers can be combined together to create meaningful predicate expressions. Most common we can use for all elements in domain there is or exist x that satisfy a relation R(x).

In programming, the exist "∃" can be implemented by searching for a particular element in a data set. If the element satisfy the relation the predicate is True, otherwise False.

Belonging

We can express if an element belong or does not belong to a particular domain or dataset. This is very helpful to establish meaningful logic expressions involving data sets.

  • ∈ means: “belong”
  • ∉ means: “does not belong”

Sometimes we factor out a qualifier to refer to more than one element. For example the expression: ∃(x,y,z) ∈ {0,1,2,3,4,5,6,7,8,9}, it means exist a pair of 3 single digit numbers.

Note: In programming we can refer to a range of numbers using square brackets. [x-y] or [x..y] or [x: y]. This convention is not the same in all languages. You must learn the specific syntax.

Identity

Identity is an equivalence relation. We have seen this symbol before, used for propositional logic: { “=” , “≡”}. This symbol is polymorphic. It can be used with any kind of elements not only propositions but also predicates or terms. It’s because identity relation is symmetric, transitive and reflexive.

Properties

Understanding relations requires knowing their properties. The following three properties are especially important in logic and mathematics:

  • Reflexive: A relation '∼' on a set A is reflexive if every element is related to itself.
    ∀a ∈ A, a ∼ a
  • Symmetric: A relation '∼' on a set A is symmetric if the order of the elements does not matter. If a is related to b, then b is related to a.
    ∀a, b ∈ A, (a ∼ b) → (b ∼ a)
  • Transitive: A relation '∼' on a set A is transitive if a direct relationship between the first and second element and the second and third element implies a relationship between the first and third.
    ∀a, b, c ∈ A, ((a ∼ b) ∧ (b ∼ c)) → (a ∼ c)

An "equivalence relation" is a relation that is reflexive, symmetric, and transitive. The equality (=) operator is a common example of an equivalence relation.

Order

An set or range of elements is usually ordered. When a set is ordered you can apply special relations between consecutive elements.

For example we can study numbers and relation between numbers using predicate logic. A number can be greater, equal or less than other number. Also a number can be different or not equal to other number. In the next table we show you the most usual notation for comparison operators.

Math CS Description True = 1 False = 0
= == Equal 1 == 1  1 == 0
!= Not equal 1 != 0  1 != 1
> > Greater than 2 > 1 5 > 5 + 1
< < Less than 0 < 1 1 < 0
>= Greater than or equal to 1 >= 0 1 >= 2
<= Less than or equal to 1 <= 1 1 <= 0

Lists & Sets

To understand the predicate logic better we need to grasp a sets and collections of things. A set is a group of things that are unique represented and not duplicated. All things in a set can be similar or can have characteristics in common. We call members of a set: "elements".

Set Examples

Examples of 2 Sets

Sometimes we need to make a collection of elements that can be duplicated. In this case we have a "list of elements" not a sets because some of the elements are not unique. We refer in computer science to a list or a set of element by using the term: collection.

Predicate logic study the elements of collections and the relations between them. For example an element can belong to a set or do not belong to a set. Two sets can contain the same elements or different elements. Then the sets are equal or not equal.

Operations

Operations between sets can produce new sets or logic results. Also it is possible to make operations between one set and one value to check if the value belong to a set. In next table we use Unicode symbols for operators between sets:

symbol example meaning
R = A ∩ B Intersection between two sets => new set
R = A ∪ B Union between two sets => new set
b = A ⊂ B Set is A included in superset B: => boolean
b = A ⊃ B Set A contain subset B: => boolean
Δ Δ = A - B Set difference, => new set
b = x ∈ A Belong: check if element belong to collection => boolean
b = A ≡ B Equivalent: check if A has same elements as B => boolean
∀ x ∈ A Any element: used in collection qualification => boolean
∃ x ∈ A Exist: used in collection qualification => boolean

Intersection

Intersection between two sets A, B will be a smaller set R that will contain all common elements of A and B. All other elements will not be included.
Intersection

Set Intersection

Union

Union between two sets A, B will be a larger set R that contain all elements of A and all elements of B, but duplicate elements will be included only once.
Union

Set Union

Difference

Difference between two sets A, B will be a set C that contain all elements of A but not elements that are common with B. There is a second difference D that we can make between B and A.
Difference

Set Difference

Vector & Matrix

In mathematics there are 2 significant numeric collections: Vectors and Matrices. In Python we define a "List" that can have one or more dimensions and can hold a Vector or a Matrix. Some languages use term: "Array". Sometimes "Array" and "List" are different things with similar properties.

Fuzzy Logic

Fuzzy logic is an attempt to incorporate nature's inherent fuzziness into technology. This logic works not only with values 1=True and 0=False but we some values in between. Not everything is black or white but there is a gray area. This may be related to probabilistic values that are values between 0 and 1. Fuzzy Logic is used in neural networks and machine learning.

The theory of Quantum Computing is using a similar concept called "Quantum Logic". This is based on quantum bits (qbits). The difference is that number of states between {0,1} in quantum computing is limited while in fuzzy logic the number of states between {0,1} are infinite.

State Machines

A "state machine", is a mathematical model of computation. It is an abstract machine that can be in exactly one of a finite number of states at any given time. We call this also a "Finite State Machine" FSM.

Moore machines

Moore machine was invented by Edward Moore in 1956 and is called Moore machine. Moore machines consist of states and transitions. States are able to produce outputs, and the output is determined solely by the current state, not by any input.

Mealy machines

Mealy machines were invented by George H. Mealy in 1955. In comparison with Moore machines, Mealy machines produce outputs only on transitions and not in states. This often results in state diagrams with fewer states because more logic can be put on transitions.

Harel state charts

A state chart is a visual formalism for complex systems. Basically Harel state charts are Mealy/Moore state machines extended by further concepts that allow us to model complex systems in a practical way.

David Harel did his PhD in 1978 at the Massachusetts Institute of Technology in Cambridge. He then became a Professor for Computer Science at the Weizmann Institute in Jerusalem in 1980.

David Harel:  A complex system cannot be beneficially described in this naive fashion, because of the unmanageable, exponentially growing multitude of states, all of which have to be arranged in a flat non stratified fashion, resulting in an unstructured, unrealistic, and chaotic state diagram.

UML state-chart

UML stands for "Unified Modeling Language". UML state machines have the characteristics of both Mealy machines and Moore machines.UML state machine is an object-based variant of Harel state chart.

Usability

State machines are useful to analyze and theorize the computation using abstract notations. In practice there are efforts to create various code generators to translate the state charts into source code.

Representation

State machines can be represented in form of tables and diagrams. States are represented in diagrams by round circles, or shapes and transitions between states by arc of circles with arrow at end. A state machine diagram is usually oriented from top down or from left to right. The entry point is usually represented by a black dot. Some states are marked as "accepting" states with double line circle.

state-machine

State Machine

Turing Machine

The Turing Machine is a theoretical model of computation invented by Alan Turing in 1936, predating modern computers. It was not a physical machine for code-breaking, but rather an abstract concept that defines the limits of what is computable. This model was fundamental to the development of computer science.

Separately, during World War II, Alan Turing designed an electro-mechanical machine called the Bombe, which was instrumental in breaking the German Enigma code. However, the Bombe and the theoretical Turing Machine are not the same thing.

Today, we use the term "Turing Complete" to classify a programming language or system that has a computational power equivalent to a universal Turing machine. Theoretically, such a system can solve any problem that a computer can solve, given enough time and memory. A Turing machine can be represented as a finite state machine, but a conceptual diagram is often easier to grasp:

Turing Machine

Turing Machine

Artificial Intelligence

Artificial intelligence is often used to describe machines (or computers) that mimic cognitive functions usually associated with the human mind, such as learning and reasoning.

Machine Learning

While true artificial intelligence has not yet been achieved, we have created machine learning, a field of AI that uses statistical techniques to enable computer systems to "learn" from data. It is built on the idea that by analyzing vast amounts of data, a system can identify trends and patterns, which can be used to make predictions or recognize new patterns, thereby simulating a form of intelligence.

Example:

Statistic Trend

Statistic Chart
Linear Trend

Neural Network

Neural networks are a key component of machine learning, in which a computer learns to perform a task by analyzing training examples. Typically, these examples have been hand-labeled in advance.

Most modern neural networks are organized into layers of nodes and are "feed-forward," meaning data moves through them in only one direction. An individual node might be connected to several nodes in the layer from which it receives data and several nodes in the layer to which it sends data.

Neural Network

Neural Network
Conceptual Representation

The concept of neural networks was first proposed in 1944 by Warren McCullough and Walter Pitts, two University of Chicago researchers who later moved to MIT in 1952 as founding members of what is sometimes called the first cognitive science department.

DEI in AI

DEI in AI stands for diversity, equity, and inclusion. It is the practice of ensuring that AI systems are fair and inclusive, and that they do not discriminate against people based on their race, gender, religion, or other protected characteristics.

DEI in AI is important because AI systems are increasingly being used to make decisions that affect people's lives, such as whether they get a job or a loan. If these systems are biased, they can perpetuate discrimination and inequality.

There are a number of ways to promote DEI in AI, such as:

  • Using diverse data sets to train AI systems.
  • Developing AI systems that are fair and unbiased.
  • Training AI developers on DEI principles.
  • Monitoring AI systems for bias and discrimination.
  • Taking steps to mitigate bias and discrimination in AI systems.

DEI in AI is a complex and challenging issue, but it is important to address it in order to ensure that AI systems are fair and inclusive.

What is LLM?

LLM stands for **Large Language Model**. It is a type of artificial intelligence (AI) that is trained on a massive amount of text data. This data can include books, articles, code, and other forms of written language. LLMs are able to learn the patterns and relationships between words and phrases in this data, which allows them to generate new text that is similar to the text they were trained on.

LLMs are used in a variety of applications, including:

  • Machine translation: LLMs can be used to translate text from one language to another. For example, Google Translate uses an LLM to translate text between over 100 languages.
  • Text summarization: LLMs can be used to summarize long pieces of text into shorter summaries. This can be useful for quickly getting the main points of a document or article.
  • Chatbots: LLMs can be used to create chatbots that can have natural conversations with humans. This can be used for customer service applications, or for providing information or assistance to users.
  • Creative writing: LLMs can be used to generate creative text formats, like poems, code, scripts, musical pieces, email, letters, etc. This can be used for entertainment purposes, or for generating new ideas.

LLMs are still under development, but they have already learned to perform many kinds of tasks. As LLMs continue to develop, they are likely to become even more powerful and versatile.

Here are some examples of LLMs:

  • GPT-3: GPT-3 is an LLM developed by OpenAI. It has 175 billion parameters and is one of the most powerful LLMs in the world. GPT-3 can be used for a variety of tasks, including machine translation, text summarization, and creative writing.
  • LaMDA: LaMDA is an LLM developed by Google AI. It is similar to GPT-3 in terms of its capabilities, but it has been specifically designed to be more factual and less likely to generate offensive or harmful content.
  • Megatron-Turing NLG: Megatron-Turing NLG is an LLM developed by Google AI and NVIDIA. It is the largest LLM in the world, with 530 billion parameters. Megatron-Turing NLG is still under development, but it has already shown promising results in machine translation and text summarization.

LLMs are a powerful new tool that has the potential to revolutionize the way we interact with computers. As LLMs continue to develop, they are likely to become even more important in our lives.

Generative AI

Generative AI is a branch of artificial intelligence (AI) that focuses on creating new content, such as text, images, audio, and code. It is able to learn the patterns and structures of existing data and then use this knowledge to generate new data that is similar in style or form. This makes it a powerful tool for a variety of applications, including art, design, and product development.

Disclaimer

This website was improved using AI. You can use AI for free using Google's web application. All you need is a Google account. This service is in public beta. While AI can assist with programming, its suggestions should be carefully reviewed for correctness and security.

Google AI: Gemini

Warning: AI can make mistakes. It may generate multiple responses, and you can select and vote on which is better. AI learns from user feedback. If you use AI-generated content, you must verify and validate the response.

Read next: Numeric Algebra