Database Design Best Practices
How to design relational databases that are easy to work with and simple to maintain
Why Good Database Design is Important?
Good database design ensures data is consistent, minimizes duplication, allows high-performance access, and simplifies maintenance.
Define Your Entities
How to start a database design? First, decide what you will need to store.
Choose entities from your target domain.
For example, if you are designing a CRM application, your entities might include:
Plan For Relationships
Analyze your target domain and discover relationships between entities.
What are relations in a database? Database relations are associations or links between tables, where the value in one's table field references to (usually) primary key in another table.
Note, there could be 3 relationship types:
In the case of a one-to-one relationship you often won't need a separate table, consider keeping data for both entities in one table.
In this case, a record in one table can be related to several records in another table. For our CRM example, you could find that
ordersare related to
customers, as in one customer can have multiple orders, but one order is not usually placed by more than one customer. This means
orderstable should have
customers_idcolumn (foreign key)
Sometimes both entities could be linked in many-to-many ways, IE records from both tables could be linked to many records in another table. For our CRM example,
productshave a many-to-many relationship, because one order can have several products, and one product can be included in several orders. To solve this, you could create
orders_products"link" table that will store which product is included in which order (as well as storing additional information like ordered amount and price)
Design your database to be in normal form using some formal rules like:
- The table should have a Primary key (one or multiple fields)
- All other fields must depend on the whole primary key, not just part of it
- Each field should be used to store a single value
- If more than one value must be stored in a single field - move such values to a separate "sub-table" with proper relationship type
- When one field value depends on another (non-primary key) field value - it's redundant and can either be removed or moved to its own table (for example, this usually happens to look-ups)
- There are several "normal" forms, but usually, you want to end up with 4NF or 5NF.
Note, in some cases, it's worth denormalizing DB structure (for example to increase performance), but that's out of the scope of this article.
CREATE TABLE template
Having proper naming conventions in your database (and in all your databases/projects) is one of the most important during database design.
Rule number one - whatever naming approach you selected - stick to it. Have names consistent across the database. You, your developers, DB admins who will need to develop and maintain the database will be happy.
Below there are just several recommendations and best practices from our experience about standard fields/columns and standard names:
- Case - we prefer lowercase with underscores, but it's not really important, just keep it consistent.
- Table name - use plural form:
- For link tables use "entity1_entity2" form, for example:
- For link tables use "entity1_entity2" form, for example:
id- first field should be an auto-incremental field with a simple "id" name. It should also be a
PRIMARY KEY(unless you have other strong reasons for a different primary key). Exception: link tables don't require to have an
- "System" fields - status, added/updated. These fields help to track who changed the record and when. Also, a status field can be used to set a record as "inactive" (record can be hidden in some lists) or "deleted" (to prevent accidental deletion and ability to restore)
icode- optional field, allows having record code which is not
id. For example, can be used for special backend processing. Or just a key in target domain entity (like "product number").
iname- this field is usually present in every table and should be used for storing short titles or names.
idesc- optional field, large text type. Should be used for storing large descriptions or details.
- Add comments for all non-standard fields. For fields with a predefined set of values include a comment with all the values and their meaning
- If SQL engine allows - include defaults for add_time, upd_time or other system fields
NULLis not really necessary for the field, add
NOT NULLto reduce uncertainty
- For strings/varchars add
default '', so strings in applications will at least be zero-length string and not undefined/null/nothing (helps to avoid unnecessary errors)
- Do not use spaces or non-alphanumeric symbols in table name or field name
Why to use "i" prefix for icode/iname/idesc? Just because "code", "name" could be a reserved word in your database engine, addition of prefix simplifies the work with it. And "idesc" just to keep naming consistent.
Create Table Example (SQL Server style):
CREATE TABLE products ( id INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, icode NVARCHAR(64) NOT NULL default '', -- product number iname NVARCHAR(64) NOT NULL default '', -- product title idesc NVARCHAR(MAX), -- product description status TINYINT NOT NULL DEFAULT 0, -- 0-ok, 10-inactive, 127-deleted add_time DATETIME2 NOT NULL DEFAULT getdate(), add_users_id INT NOT NULL DEFAULT 0, upd_time DATETIME2, upd_users_id INT NOT NULL DEFAULT 0 );
- In most cases you want your data to be consistent and complete, therefore foreign keys are strongly recommended. (You might want to drop foreign keys to gain some performance or scale data to multiple servers, but that's a more complex case)
NOT NULLif NULL is not possible in data
- Use proper defaults
- Use appropriate field types, for example, store dates in datetime columns, not int as seconds since 1970
- Indexes - Add only necessary indexes. Too many indexes will drop insert data performance and waste server's memory.
- Queries - Test your queries in Query Analyser (SQL Server) or
- Views - are a convenient way to build complex subqueries, however, try to avoid
order by, and subqueries for better performance.
To keep things simple, put all your database structures into one file
database.sql. It should have all the code necessary to initialize your database from scratch, and can include not only
CREATE TABLE statements, but also:
DROP TABLE- helps to clean up the database if you need to re-initialize DB during development
INSERT INTO- prefills your table with required initial data (if there are a lot of such data - you might have a separate file
CREATE INDEX- creates additional indexes to improve performance in particular cases
Things to Avoid
Avoid different names for the same type of fields. Keep field names consistent.
Avoid using spaces or non-alphanumeric characters (underscore is OK) for table and field names. Such names require additional quoting to work with. Also, some systems/frameworks might not work well with such names.
Avoid naming tables/columns/etc as reserved words. While it's possible with proper quoting, it's better just add a suffix or prefix. For example, instead of
namecolumn, you could name it as
Avoid business logic stored on the DB server:
Sometimes it's worth it to have logic above in the database. However, it's better to use a database just for data and keep all the business logic on the application level for easier development and maintenance.
- Avoid too many indexes per table.
Each index requires some effort from the DB server when data is added or updated. Also to be used efficiently, indexes need to fit into the server's memory. Therefore the more indexes you have, the more server resources are required.
If you need to have your SQL Server, MySQL, MS Access, or SQLite database properly designed - Contact Us and we will help you.