They say there are two hard things in software development: cache invalidation, naming things, and off-by-one errors. Even though it seems like a simple thing, naming tables, columns, and stored procedures is hard when designing a SQL database. There are three simple rules I like to abide by when designing schema: give things meaningful names, be consistent, and favor verbosity over obscurity. Let’s expand each one of those points, and then I’ll cover some MySQL specific addendums.
The first part of this idea is to give objects unique, specific names. A table that is named Entities is going to confuse everyone. You want to name your table with the specific category of data or noun it represents. You can do tragical things to your database like have one giant table that’s just EntityID, Key, Value, but I don’t know why you’d use a relational database in that case!
Think of a table in terms of nouns and their attributes. Since you’re developing an application on your database, what are the nouns that get tossed around? Imagine this feature request: “When the user places an order, I want the application to send an email to the address they signed-up with.” This sentence tells me we should have a Users table with an email column and an Orders table, as well as something like TransactionalEmails to record what emails the user received. You want to steer clear of codenames even if they come up in design conversations: using a code name like Zebras or Bananas for your orders table is a bad idea (I’ve never seen anything this ridiculous, but I have seen something in this vein).
You should also create unique names. A simple example is to use FirstName and LastName vs Name and Surname. The first set avoids confusion over what is in the Name column. When picking a label, ask yourself, is there another table that’s labelled something similar?
Although it’s pretty common practice in some circles, you should avoid any temptation to use Hungarian notation. That’s the practice of appending things like “tbl” before table names or adding type information to columns as in “AmountDecimal”. This adds extra noise without adding extra information, since type information is obvious with a Describe Table command. Similarly, don’t add meaningless suffixes like “info” or “data” or “records”. Of course the table has info, data, and records, that what it’s for!
You need to be careful to make sure the contents of the table actually maps to the real world concept. If you name a table that stores phone numbers something like Contacts or EmailAddresses, your fellow developers will have a hard time finding it, and it adds another mental burden when you have to map the table name to the actual noun you should’ve used (in this case, you have to remember that Contacts has email addresses). While you would doubtfully ever call a table that stores phone numbers EmailAddresses since we’re all familiar with that type of data, if you don’t fully understand the application’s domain, then it’s easy to make that kind of mistake.
When you are creating a schema and naming your objects, some decisions will be purely arbitrary. That doesn’t mean you should just make those calls willy-nilly. Come up with conventions for basic things, such as:
- View prefix (okay, Hungarian still has its uses, such as when you want to make it clear that the object is a view not a true table)
- Stored procedure names (they should include their actions in the name, like SelectUsers or UpdateProduct)
- Stored routine prefix (whether you want to use fn or sp before a stored procedure [I vote no, but just be consistent])
- Booleans (decide if you want the Has or Is prefix [IsActive, HasChildren])
- Index / constraint names
- Primary / foreign key names (like, TableName_pk or TableName_fk_UserID)
- Mapping table names (these are tables that allow many-to-many relationships between tables)
- Pluralization (should the table that stores logins be called Users or User)
- Surrogate keys (TableNameID or just naked ID [the former make natural joins easier])
- Table inheritance (if Student is the base table, do you say ElementaryStudent or StudentElementary)
You need to set a convention in a document, and if you’re on a team, share it with everyone and make sure they understand and agree to it. Write it so it’s easy to understand and apply, lest it is ignored in the face of deadlines or laziness.
Consistency is important for two reasons: first, it lets all developers (including your future self) understand what things will be named. You can find things fast, and you can use automated tools to help you find objects in your database or to make changes en masse. Second, it avoids any analysis paralysis: the decision on how to name something is done already.
Favor verbosity over obscurity
When faced with a new database, its names are really important for quickly understanding the system. If you have a bunch of names that are full of abbreviated words or have vowels missing, it adds cognitive overhead to reading the names of the tables, so spell everything out. Likewise, don’t be afraid of adding an extra word if you think it provides needed clarity. A good example is when doing table inheritance. Your primary table is Student, but you put extra, unique information in KindergartenStudent or MiddleSchoolStudent instead of Kindergartner or MiddleSchooler.
Beyond these ideas, here’s a good list of banned practices:
- Don’t drop vowels out of words
- Don’t drop syllables from words (unless it’s a common abbreviation)
- Don’t use single-letter names (no, really, don’t)
- Don’t use obscure acronyms (common acronyms are okay)
Even though you probably have the hot breath of a project manager on your neck, asking you when feature X will be finished, help your future self and your fellow developer by taking the time to name things well. It’s a necessary part of design: you’re going to do it regardless, but you have the choice to do well or badly. Sensible and easy-to-understand table and column names make developing so much easier.
MySQL Specific Notes
As a postscript, there are a few notes about naming things in MySQL.
- Since MySQL is case sensitive with table names on Linux, use Pascal casing (TransactionalEmails) and avoid lowercase and underscores
- Never use a name that’s a reserved keyword (escaping is painful in MySQL, so here’s the reserved keyword list http://dev.mysql.com/doc/refman/5.5/en/reserved-words.html)
If you’ve made it this far, you should sign-up for updates.
Want to hear more about what I’ve learned about designing databases in MySQL? You can subscribe below and I’ll send you updates on a new book I’m writing that helps you navigate past the gotchas in MySQL. If you sign-up, I’ll also send out stuff on MySQL like tips and tutorials.