If you work in a traditional software company probably you will work in a team. Depending how the project is organized (scrum, agile or kanban) you may be asked to create requirements and presentations for the project.
If the project use large amount of data requirements will include a database and will make recommendations about the database structure but will not go in much details. Sometimes requirements will include a "Conceptual Model" (CM).
A database model is a graphical representation of database elements. This is used by database designers to communicate high level design of database structure. There are 3 model types you can make before you implement a large database. You need to understand how to read these models if you wish to create a large project.
Each model has its own merits but the most valuable is the Conceptual Model. This is the only model that I will show in details how to make using entry level tools.
Most database designers are using tools to create a database model diagrams. These tool are based on IBM UML notation: (Unified Modeling Language) and can connect to existing databases. You can use them to generate code for you. However some companies that are less traditional and more efficient are using paper and pencil for design.
On-line tools are implemented as SaaS (Software as a Service) platform. Yes there are websites that enable you to design on-line for a small price.One of these is DB Designer. Honestly I have no intention to use this tool for my open source projects. It can not export your diagram as SVG so I will pass thank you.
To represent an entity use a rectangle. To represent relations use a line. If you wish to represent the relation cardinality, use a conceptual notation for the line end. This consist of a bar, a circle or a crow foot like in the picture below:
Conceptual Symbols
One of the most strong principles in relational database design is the principle of normalization. This is the process of organizing data in a database. There are two goals of the normalization process:
Edgar Codd is the inventor of the relational model (RM). He has introduced the concept of normalization and what we now know as the First normal form (1NF) in 1970. Codd went on to define the Second normal form (2NF) and Third normal form (3NF) in 1971
References to Wikipedia
The normalization process is using several normalization rules:
First normal form (1NF): This is a property of a relation in a relational database. A relation is in first normal form if and only if the domain of each attribute contains only atomic indivisible values, and the value of each attribute contains only a single value from that domain.
First normal form enforces these criteria:
Second normal form (2NF): A table that is in first normal form (1NF) must meet additional criteria for second normal form. Specifically: a table is in 2NF if it is in 1NF and no non-prime attribute is dependent on any proper subset of any candidate key of the table.
A non-prime attribute of a table is an attribute that is not a part of any candidate key of the table. Attributes that are part of a candidate kye are called prime attributes.
Third normal form (3NF): is a normal form that is used in normalizing a database design to reduce the duplication of data and ensure referential integrity by ensuring that:
1. the entity is in second normal form
2. all the attributes in a table are determined only by the candidate keys of that table and not by any non-prime attributes.
Most of applications use Third Normal Form (3NF).
These principles are very scientific and difficult to understand. However a good designer must know how to apply these principles. This comes within experience, with many try and errors.
In Relational Algebra the concepts are Logical or Abstract. In a Relational database, we implement the Logical or Conceptual model using Tables. So here are some example of conceptual diagrams patterns that you can use do make a database model:
1:1 Relations
1:M Relations
Association
Inheritance
Extension
Note: Detail design of database do not contain description of stored procedures. This design is too complex to be included in database design document. So most Architects let the development team to decide what packages to implement.
For implementing a database, developers will create DDL scripts. This is a subset of SQL that means: "Data Definition Language". This language enable creation and modification of objects in a database.
The order of scripts is very important. You can not create indexes for a table that do not exist. So here is the natural order of making objects in a database. First you make all the tables with primary keys and check constraints. Second you make the indexes. Third you make the foreign key constraints.
Security: Critical databases must have good design for data access rules. This is done using a special script that is using DCL (Data Control Language). You will learn this language in our Programming Language tutorial for SQL.
Maintenance: Database administrator will perform no optimizations if this is not required by software development team or database architect. Therefore a good design document should include directives for database maintenance tasks, like frequency for index rebuild, and frequency for compacting and reorganization of table data to reset the free space.
In an agile company one team is assigned a single project. A project consist of several epics and each epic is represented by several stories. One story may have several tasks to be completed. A developer can work in a single team and can be assigned to resolve multiple stories. A story need a resolver, so when we create a story it must be assigned immediately, otherwise we can not create stories.
Logic Model
At first we design tables with primary key and foreign keys. This is the high level design of database table. Tables may have additional attributes that will be part of Physical Model.
Table Type | Name | Primary Key | Foreign Keys |
---|---|---|---|
Entity | PROJECT | ID | OWNER_ID |
Entity | DEVELOPER | ID | PROJECT_ID |
Association | TEAM | ID | MASTER_ID |
Detail | EPIC | ID | PROJECT_ID |
Association | STORY | ID | EPIC_ID |
REZOLVER_ID | |||
Detail | TASK | ID | STORY_ID |
REVIEWER_ID |
These relations are implementing several fundamental Agile rules. For some of the rules we need to implement triggers and attribute constraints. This can be done in the physical model. We do not show you the physical model in this article.
ER diagram is used to show attributes, primary keys and foreign keys. This is also called physical model. This diagram is used by developers to create the database structure. Also can be used by developers who use the database to create SQL queries for applications.
ER-Diagram
This has demonstrate how to make a data model to resolve a real problem. We have shown the Logical Model, and the Physical Model. Using these models you can understand the design of a database. Having a model will help you build better SQL to analyze your data.
Read next:Cybersecurity