How to Create a Relational Data Model
Don't worry about tables just yet. Worry about entity relationships. Be prepared to wage a solo war, with only you dedicated to quality normalization. Once it's time to write tables, concentrate on lookup and type tables (zip codes, statuses...
Step-by-Step Guide
Step 1: Don't worry about tables just yet.
Step 2: Worry about entity relationships.
Step 3: Be prepared to wage a solo war, with only you dedicated to quality normalization.
Step 4: Once it's time to write tables, concentrate on lookup and type tables (zip codes, statuses, product categories, ...).
Step 5: As a rule of thumb, don't store data that can be inferred from other fields.
Step 6: No nulls.
Step 7: Contradiction: NULL values are in themselves useful to identify attributes that have not yet been populated by users.
Step 8: NULL/NOT NULL: check any database forum and this is a hot topic, with advocates on both sides and advantages and disadvantages for each.
Step 9: Get comfortable with intersect (many-to-many) tables.
Step 10: Use a good naming convention.
Step 11: If you're going to have replication or log shipping, try to have that set up as you develop so you can see how it works.
Step 12: Inner joins are great, but there's probably a lot of LEFT OUTER JOIN statements that you'll be doing as well.
Step 13: If you have to deal with a legacy application, build your schema independently of its schema (don't even look at it).
Step 14: Migrating from your legacy systems into a tighter model with proper normalization is difficult, but can be made a little more manageable by using temporary tables for your imports.
Detailed Guide
It's obvious that you're building a database, and databases (relational ones, anyway) are primarily made up of tables, which are made up of rows and columns (tuples and attributes, if you're really into it). Even so, don't start with the tables.
Your first goal is to map out the relationships that different business objects have.
This is the "logical modeling" portion.
The "physical model" is the actual implementation.
Confuse / combine the two at your peril.
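One way to keep the logical model separate from the physical one is to write the relationships down as plain data before any table exists. The sketch below is only an illustration: the `Relationship` class, the `department` entity, and the camelCase naming of the intersect candidate are assumptions made for the example, not part of any tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """One business relationship, described before any table exists."""
    left: str         # entity on one side, e.g. "teacher"
    right: str        # entity on the other side
    cardinality: str  # "1:1", "1:N", or "M:N"


# Hypothetical entities and relationships for a small school system.
LOGICAL_MODEL = [
    Relationship("teacher", "student", "M:N"),
    Relationship("department", "teacher", "1:N"),
]


def needs_intersect_table(rel: Relationship) -> bool:
    """Only many-to-many relationships need a table of their own."""
    return rel.cardinality == "M:N"


# Candidate intersect tables, named by joining the two entity names.
intersect_candidates = [
    rel.left + rel.right.capitalize()
    for rel in LOGICAL_MODEL
    if needs_intersect_table(rel)
]
print(intersect_candidates)
```

Nothing here says how the tables will be implemented; that decision belongs to the physical model.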
The requirements are hard to get, and gathering them is painful.
A talented business analyst at this stage would be heaven-sent.
Most databases are pieces of garbage because the people who design them are lazy and "just want to get something out there. We can always fix it later." Yeah, right.
You'll need lookup and type tables for foreign key relationships on your "real" tables.
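As a rough sketch of why the lookup tables come first, here is a status lookup table with a "real" table pointing at it through a foreign key. It uses Python's built-in `sqlite3` module; the table and column names (`order_status`, `customer_order`, `status_code`) are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE order_status (
    status_code TEXT PRIMARY KEY NOT NULL,
    description TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    status_code TEXT NOT NULL REFERENCES order_status (status_code)
);
INSERT INTO order_status VALUES ('NEW', 'Order received'), ('SHIP', 'Order shipped');
""")

# A status that exists in the lookup table is accepted.
conn.execute("INSERT INTO customer_order (order_id, status_code) VALUES (1, 'NEW')")

# A status that is not in the lookup table is refused by the database itself.
try:
    conn.execute("INSERT INTO customer_order (order_id, status_code) VALUES (2, 'LOST')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)
```

The point of the design is that the list of valid statuses lives in one place, and the database enforces it rather than the application.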
Building the lookup and type tables first also gives you a little warm-up before you get into the core transactional tables.
If you know the date of birth and some start date, then you also know the age at the start date, so don't include this age in the table.
In the "no nulls" view, a null value represents an undefined attribute of an entity.
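To make the date-of-birth rule above concrete: store the two dates and compute the age when you need it. This is a minimal sketch; the `age_on` helper and the `employee` record are hypothetical names used only for illustration.

```python
from datetime import date


def age_on(date_of_birth: date, on_date: date) -> int:
    """Whole years between date_of_birth and on_date."""
    years = on_date.year - date_of_birth.year
    # Subtract one if the birthday has not yet occurred in on_date's year.
    if (on_date.month, on_date.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


# Hypothetical employee record: only the two dates are stored, never the age.
employee = {"date_of_birth": date(1990, 6, 15), "start_date": date(2020, 6, 14)}
age_at_start = age_on(employee["date_of_birth"], employee["start_date"])
print(age_at_start)
```

A stored age column would have to be kept in sync with both dates; a computed one cannot drift.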
If entities can have or not have a particular attribute, then it needs to be handled via an intersect table.
On the other hand, NULL values are useful for identifying attributes that users have not yet populated. This is especially useful when a user needs to select a default value in order to determine the proper business rules to apply.
Case 2: how would you design an address table where Address1 is populated and Address2 is not required, but if Address2 is populated it must conform to the business rules of the field?
Sure, you could default it to an empty space, but is that better than knowing the user did not edit the field?
Try 3rd normal form on an international address...
Can it be done? Probably, but look at the complexity of restructuring the data in a meaningful way.
However, all agree that you should never allow nulls in key variables.
These are fields that are being used to identify a record uniquely, e.g. a customer identification number.
The null school says that you should use nulls freely in all other fields.
For instance, customers are not obliged to have a cell phone, nor to tell you their number.
Using a null, and nothing but a null, is the most efficient way to record that a cell phone number is not available.
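Here is a small sketch of both halves of that advice, using Python's built-in `sqlite3` module: the key column refuses NULL, while the optional phone column accepts it. The `customer` table and its column names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        customer_id TEXT PRIMARY KEY NOT NULL,  -- key field: never null
        name        TEXT NOT NULL,
        cell_phone  TEXT                        -- optional: NULL means "not available"
    )
""")
conn.executemany(
    "INSERT INTO customer (customer_id, name, cell_phone) VALUES (?, ?, ?)",
    [("C001", "Ann", "555-0100"), ("C002", "Bob", None)],
)

# NULL is matched with IS NULL, not with "= NULL".
no_phone = [row[0] for row in conn.execute(
    "SELECT customer_id FROM customer WHERE cell_phone IS NULL"
)]

# A NULL in the key column is rejected outright.
try:
    conn.execute("INSERT INTO customer (customer_id, name) VALUES (NULL, 'Eve')")
    key_null_rejected = False
except sqlite3.IntegrityError:
    key_null_rejected = True

print(no_phone, key_null_rejected)
```

No placeholder such as "000-0000" or "N/A" is needed; the absence of a value is recorded as exactly that.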
If it is really important to know why it is not there, it is better to introduce a new variable that states the reason than to store fancy codes in the placeholder for cell phone numbers.
Be reluctant to add fields like these, because a) the customer is also not obliged to tell you why he is not giving his cell phone number, the question does not make for nice conversation, and he is unlikely to volunteer his reasons, and b) nobody will ever look at them because of a).
"Why missing" variables generally just waste time.
Be aware that yes/no variables (booleans) often can't hold a null.
Therefore they often contain useless information, such as "either he was a Republican, or he refused to answer".
You'll use intersect (many-to-many) tables everywhere if you built things right.
One example would be a high school database where one table is a list of teachers and another a list of students.
Students have more than one teacher, and teachers have more than one student, so the intersect table, separate from 'teacher' and 'student', would have two columns: foreign keys pointing at both of those tables.
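The teacher and student example can be sketched with Python's built-in `sqlite3` module. The intersect table name `teacherStudent` and the sample rows are assumptions chosen for illustration; the composite primary key is one common way to make each pairing unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE teacher (teacher_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Intersect table: two foreign keys, nothing else.
CREATE TABLE teacherStudent (
    teacher_id INTEGER NOT NULL REFERENCES teacher (teacher_id),
    student_id INTEGER NOT NULL REFERENCES student (student_id),
    PRIMARY KEY (teacher_id, student_id)
);

INSERT INTO teacher VALUES (1, 'Ms. Ortiz'), (2, 'Mr. Lee');
INSERT INTO student VALUES (10, 'Sam'), (11, 'Kim');
INSERT INTO teacherStudent VALUES (1, 10), (1, 11), (2, 10);
""")

# All teachers of student 10, found by going through the intersect table.
teachers_of_sam = [row[0] for row in conn.execute("""
    SELECT t.name
    FROM teacher AS t
    INNER JOIN teacherStudent AS ts ON ts.teacher_id = t.teacher_id
    WHERE ts.student_id = 10
    ORDER BY t.name
""")]
print(teachers_of_sam)
```

Neither `teacher` nor `student` carries a column about the other; the pairing lives entirely in the intersect table.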
The primary key would then be the combination of the two.
As for naming: put invoices in a table called "invoice".
Products go in "product".
The intersect would be "invoiceProduct" or "productInvoice", depending upon which table is really the center of the relationship.
Get used to the different join statements (except UNION).
With a legacy application, focus on the business rules and relationships that it is trying to enforce; you can get distracted if you look at the way that someone set it up.
Refer to step #4.
Also, keep tabs on the legacy IDs for people to search by.
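The advice on joins, temporary import tables, and legacy IDs can be sketched together using Python's built-in `sqlite3` module. Everything named here (`legacy_import`, `status`, `customer`, the `legacy_id` column, the sample rows) is hypothetical and only illustrates the idea under those assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE status (status_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    legacy_id   TEXT UNIQUE,   -- old system's ID, kept so people can search by it
    name        TEXT NOT NULL,
    status_id   INTEGER NOT NULL REFERENCES status (status_id)
);
INSERT INTO status VALUES (1, 'active'), (2, 'closed');

-- Temporary table holding the raw legacy export, free-text status and all.
CREATE TEMP TABLE legacy_import (legacy_id TEXT, name TEXT, status_text TEXT);
INSERT INTO legacy_import VALUES
    ('A-17', 'Ann', 'active'),
    ('B-02', 'Bob', 'closed'),
    ('C-99', 'Cy',  'on hold');
""")

# LEFT OUTER JOIN keeps every legacy row, so rows with no matching
# lookup value show up with a NULL status_id instead of vanishing.
unmatched = [row[0] for row in conn.execute("""
    SELECT li.legacy_id
    FROM legacy_import AS li
    LEFT OUTER JOIN status AS s ON s.name = li.status_text
    WHERE s.status_id IS NULL
""")]

# INNER JOIN moves only the rows that map cleanly onto the new model.
conn.execute("""
    INSERT INTO customer (legacy_id, name, status_id)
    SELECT li.legacy_id, li.name, s.status_id
    FROM legacy_import AS li
    INNER JOIN status AS s ON s.name = li.status_text
""")

migrated = [row[0] for row in conn.execute(
    "SELECT legacy_id FROM customer ORDER BY legacy_id"
)]
print(unmatched, migrated)
```

The unmatched list is the work queue: each entry is a legacy value that needs a business decision before it can enter the normalized tables.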
About the Author
Brittany Wilson
A seasoned expert in lifestyle and practical guides, Brittany Wilson combines 6 years of experience with a passion for teaching. Brittany's guides are known for their clarity and practical value.