Database Design Best Practices
how to design relational database that are easy to work with and simple to maintain
Why Good Database Design is Important?
Good database design ensures data to be consistent, minimizes duplication, allows high-performance access as well as simplifies maintenance.
Define Your Entities
First, choose entities you will store in your database from target domain.
For example, if you are designing a CRM application, your entities might include:
Plan For Relationships
Analyze your target domain and discover relationships between entities. Note, there could be several relationship types:
In the case of a one-to-one relationship you often won't need a separate table, so consider keeping data for both entities in one table.
In this case, a record in one table can be related to several records in another table. For our CRM example, you could find that
customers, so one customer can have multiple orders. But usually, one order cannot be placed from more than one customer. This means
orderstable should have
customers_idcolumn (foreign key)
Sometimes both entities could be linked in many-to-many ways, i.e. records from both tables could be linked to many records in another table. For our CRM example,
productshave a many-to-many relationship, because one order can have several products and one product can be included in several orders. To solve this, you could create
orders_products"link" table that will store which product included in which order (as well as additional information could be stored like ordered amount and price)
Design your database to be in normal form using some formal rules like:
- the table should have a Primary key (one or multiple fields)
- all other fields must depend on the whole primary key, not just part of it
- each field should be used to store single value
- if more than one value must be stored - move to a separate table with proper relationship type
- when one field value depends on another (non-primary key) field value - it's redundant and can either be removed or moved to its own table (for example, this usually happens to look-ups)
There are several "normal" forms, but usually, you want to end up with 4NF or 5NF.
Note, in some cases, it's worth denormalizing DB structure (for example to increase performance), but that's out of the scope of this article.
There are several recommendations about standard fields/columns and standard names:
- case - we prefer lowercase with underscores, but it's not really important, just keep it consistent.
- table name - use plural form:
- for link tables use "entity1_entity2" form, for example:
- for link tables use "entity1_entity2" form, for example:
id- first field should be an auto-incremental field with a simple "id" name. It should also be a
PRIMARY KEY(unless you have other strong reasons for a different primary key). Exception: link tables don't require to have an
- "system" fields - status, added/updated. These fields help to track who/when changed the record. Also, a status field can be used to set a record as "inactive" (record can be hidden in some lists) or "deleted" (to prevent accidental deletion and ability to restore)
icode- optional field, allows having record code which is not
id. For example, can be used for special backend processing. Or just a key in target domain entity (like "product number").
iname- this field usually present in every table and should be used for storing short titles/names.
idesc- optional field, large text type. Should be used for storing large descriptions/details.
- add comments for all non-standard fields. For fields with a predefined set of values include a comment with all the values and their meaning
- if SQL engine allows - include defaults for add_time, upd_time or other system fields
NULLis not really necessary for the field, add
NOT NULLto reduce uncertainty
- for strings/varchars add
default '', so strings in applications at least will be zero-length string and not undefined/null/nothing (helps to avoid unnecessary errors)
- do not use spaces or non-alphanumeric symbols in table name or field name
Create Table Example (SQL Server style):
CREATE TABLE products ( id INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, icode NVARCHAR(64) NOT NULL default '', -- product number iname NVARCHAR(64) NOT NULL default '', -- product title idesc NVARCHAR(MAX), -- product description status TINYINT NOT NULL DEFAULT 0, -- 0-ok, 10-inactive, 127-deleted add_time DATETIME2 NOT NULL DEFAULT getdate(), add_users_id INT NOT NULL DEFAULT 0, upd_time DATETIME2, upd_users_id INT NOT NULL DEFAULT 0 );
- in most cases you want your data to be consistent and complete, therefore foreign keys are strongly recommended. (You might want to drop foreign keys to gain some performance or scale data to multiple servers, but that's a more complex case)
NOT NULLif NULL is not possible in data
- use proper defaults
- use appropriate field types (for example, store dates in datetime columns, not int as seconds since 1970)
- indexes - Add only necessary indexes. Too many indexes will drop insert data performance and waste server's memory.
- queries - Test your queries in Query Analyser (SQL Server) or
- views - are a convenient way to build complex subqueries, however, try to avoid
order byand subqueries for better performance.
To keep things simple, put all your database structures into one file
It should have all the code necessary to initialize your database from scratch.
Can include not only
CREATE TABLE statements, but also:
DROP TABLE- helps to clean up the database if you need to re-initialize DB during development
INSERT INTO- prefills your table with required initial data (if there are a lot of such data - you might have a separate file
CREATE INDEX- creates additional indexes to improve performance in particular cases
Things to Avoid
Avoid different names for the same type of fields. Keep field names consistent.
Avoid using spaces or non-alphanumeric characters (underscore is OK) for table and field names. Such names require additional quoting to work with. Also, some systems/frameworks might not work well with such names.
Avoid naming tables/columns/etc as reserved words. While it's possible with proper quoting, it's better just add some suffix or prefix. For example, instead of
namecolumn, you could name it as
Avoid business logic stored on the DB server:
Sometimes its worth to have logic above in the database. However, it's better to use a database just for data and keep all the business logic on the application level for easier development and maintenance.
- Avoid too many indexes per table.
Each index requires some efforts for the DB server when data added or updated. Also to be used efficiently, indexes need to fit into the server's memory. Therefore the more indexes you have, the more server resources required.
If you need to have your SQL Server, MySQL, MS Access or SQLite database to be properly designed - Contact Us and we will help you.