Database Design Best Practices

How to design relational databases that are easy to work with and simple to maintain

Why Good Database Design is Important?

Good database design ensures data is consistent, minimizes duplication, allows high-performance access, and simplifies maintenance.

Define Your Entities

How to start a database design? First, decide what you will need to store.
Choose entities from your target domain.
For example, if you are designing a CRM application, your entities might include: contacts, customers, orders, products

Plan For Relationships

Analyze your target domain and discover relationships between entities.
What are relations in a database? Database relations are associations or links between tables, where the value in one's table field references to (usually) primary key in another table.
Note, there could be 3 relationship types:

  1. one-to-one
    In the case of a one-to-one relationship you often won't need a separate table, consider keeping data for both entities in one table.

  2. one-to-many
    In this case, a record in one table can be related to several records in another table. For our CRM example, you could find that orders are related to customers, as in one customer can have multiple orders, but one order is not usually placed by more than one customer. This means orders table should have customers_id column (foreign key)

  3. many-to-many
    Sometimes both entities could be linked in many-to-many ways, IE records from both tables could be linked to many records in another table. For our CRM example, orders and products have a many-to-many relationship, because one order can have several products, and one product can be included in several orders. To solve this, you could create orders_products "link" table that will store which product is included in which order (as well as storing additional information like ordered amount and price)

Normalization

Design your database to be in normal form using some formal rules like:

  • The table should have a Primary key (one or multiple fields)
  • All other fields must depend on the whole primary key, not just part of it
  • Each field should be used to store a single value
  • If more than one value must be stored in a single field - move such values to a separate "sub-table" with proper relationship type
  • When one field value depends on another (non-primary key) field value - it's redundant and can either be removed or moved to its own table (for example, this usually happens to look-ups)
  • There are several "normal" forms, but usually, you want to end up with 4NF or 5NF.

Note, in some cases, it's worth denormalizing DB structure (for example to increase performance), but that's out of the scope of this article.

CREATE TABLE template

Naming conventions

Having proper naming conventions in your database (and in all your databases/projects) is one of the most important during database design.

Rule number one - whatever naming approach you selected - stick to it. Have names consistent across the database. You, your developers, DB admins who will need to develop and maintain the database will be happy.

Below there are just several recommendations and best practices from our experience about standard fields/columns and standard names:

  • Case - we prefer lowercase with underscores, but it's not really important, just keep it consistent.
  • Table name - use plural form: products, orders, records, users
    • For link tables use "entity1_entity2" form, for example: orders_products, clients_orders
  • id - first field should be an auto-incremental field with a simple "id" name. It should also be a PRIMARY KEY (unless you have other strong reasons for a different primary key). Exception: link tables don't require to have an id field.
  • "System" fields - status, added/updated. These fields help to track who changed the record and when. Also, a status field can be used to set a record as "inactive" (record can be hidden in some lists) or "deleted" (to prevent accidental deletion and ability to restore)
  • icode - optional field, allows having record code which is not id. For example, can be used for special backend processing. Or just a key in target domain entity (like "product number").
  • iname - this field is usually present in every table and should be used for storing short titles or names.
  • idesc - optional field, large text type. Should be used for storing large descriptions or details.
  • Add comments for all non-standard fields. For fields with a predefined set of values include a comment with all the values and their meaning
  • If SQL engine allows - include defaults for add_time, upd_time or other system fields
  • If NULL is not really necessary for the field, add NOT NULL to reduce uncertainty
  • For strings/varchars add default '', so strings in applications will at least be zero-length string and not undefined/null/nothing (helps to avoid unnecessary errors)
  • Do not use spaces or non-alphanumeric symbols in table name or field name

Why to use "i" prefix for icode/iname/idesc? Just because "code", "name" could be a reserved word in your database engine, addition of prefix simplifies the work with it. And "idesc" just to keep naming consistent.

Create Table Example (SQL Server style):

CREATE TABLE products (
  id INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,

  icode                 NVARCHAR(64) NOT NULL default '', -- product number
  iname                 NVARCHAR(64) NOT NULL default '', -- product title
  idesc                 NVARCHAR(MAX),                    -- product description

  status                TINYINT NOT NULL DEFAULT 0,        -- 0-ok, 10-inactive, 127-deleted
  add_time              DATETIME2 NOT NULL DEFAULT getdate(),
  add_users_id          INT NOT NULL DEFAULT 0,
  upd_time              DATETIME2,
  upd_users_id          INT NOT NULL DEFAULT 0
);

Data integrity

  • In most cases you want your data to be consistent and complete, therefore foreign keys are strongly recommended. (You might want to drop foreign keys to gain some performance or scale data to multiple servers, but that's a more complex case)
  • Use NOT NULL if NULL is not possible in data
  • Use proper defaults
  • Use appropriate field types, for example, store dates in datetime columns, not int as seconds since 1970

Performance

  • Indexes - Add only necessary indexes. Too many indexes will drop insert data performance and waste server's memory.
  • Queries - Test your queries in Query Analyser (SQL Server) or EXPLAIN (MySQL).
  • Views - are a convenient way to build complex subqueries, however, try to avoid group by, order by, and subqueries for better performance.

Maintenance

To keep things simple, put all your database structures into one file database.sql. It should have all the code necessary to initialize your database from scratch, and can include not only CREATE TABLE statements, but also:

  • DROP TABLE - helps to clean up the database if you need to re-initialize DB during development
  • INSERT INTO - prefills your table with required initial data (if there are a lot of such data - you might have a separate file data.sql)
  • CREATE INDEX - creates additional indexes to improve performance in particular cases

Things to Avoid

  1. Avoid different names for the same type of fields. Keep field names consistent.

  2. Avoid using spaces or non-alphanumeric characters (underscore is OK) for table and field names. Such names require additional quoting to work with. Also, some systems/frameworks might not work well with such names.

  3. Avoid naming tables/columns/etc as reserved words. While it's possible with proper quoting, it's better just add a suffix or prefix. For example, instead of name column, you could name it as iname.

  4. Avoid business logic stored on the DB server:

  • Procedures
  • Functions
  • Triggers

Sometimes it's worth it to have logic above in the database. However, it's better to use a database just for data and keep all the business logic on the application level for easier development and maintenance.

  1. Avoid too many indexes per table.

Each index requires some effort from the DB server when data is added or updated. Also to be used efficiently, indexes need to fit into the server's memory. Therefore the more indexes you have, the more server resources are required.


If you need to have your SQL Server, MySQL, MS Access, or SQLite database properly designed - Contact Us and we will help you.

written by
Oleg Savchuk

Software Developer with 20+ years of experience and a passion to build good, practical, working software solutions.