Designing Multi-Tenanted Databases – Anti-patterns

In this post, I’m going to briefly cover the signs that you’re doing multi-tenancy wrong. Some of these practices are entrenched in software: there are gems in Ruby on Rails, for instance, use the first anti-pattern to achieve multi-tenancy. Listen, you can drive a car with a flat tire and you can eat yogurt with a fork. People have made these solutions work, but there’s a better way.

Flat tire

Creating tables or schemas per customer

If you find yourself running DDL (Create Table…) for each new company or user that you add to your system, most likely you’re committing a pretty big anti-pattern. Now every time you update the table definition or need to update data across all tables, you’ll have to use a script to generate the SQL for you. Those updates will take longer and it’s much more prone to failure.

If you’re doing this for performance reasons, you have two options in most database systems to physically separate customers while maintaining the same logical tables: indexes and partitioning. Indexes, obviously, are the lighter option. As mentioned in my first post on multi-tenancy, you’ll often search the tables by the tenant id (whatever that is), so it’s often wise to include it an index or just to index it by itself. You can extend this idea by using the tenant id as partition key, so the database system would create multiple physical tables beneath the surface while you enjoy the benefits of having one logical table.

Multiple accounts for the same user

Another common mistake is to tie a login or user to one tenant. In many circumstances, you’ll want to give a user access to multiple tenants. In the previous post on many-to-many mapping, I talked about creating mapping tables that model these relationships. If you find yourself creating multiple accounts for the same person, this is a sign you need to use a many-to-many mapping.

Even if there’s a possibility of a user accessing multiple tenants, include it in your initial design. You could keep your database table as a one-user-one-tenant, and then provide an API function that you can use to check whether a user has access to a company’s data:

boolean doesUserHaveAccessToCompany (User user, Company company) { … }

Maybe you never go to many-to-many, but if you do, this will allow you to change your code in a two places (this function and the function that retrieves your User data from the database). Most likely, you will have this need from the get go. Customer support personnel and relationship managers often need access to multiple accounts from the very beginning.

Duplicating data

Here’s a killer anti-pattern: syncing data back and forth between two tenants or users. I’ve seen batch jobs that check to make sure that if a change is made to tenant X’s data, it is then applied to tenant Y. This is no way to live, and it indicates that you need many-to-many mapping. I’ve also seen this done manually, where an account manager has to enter or update the records for tenant X, Y, and Z at the same time. When you have to create the same data many times, the only difference being the tenant id, you have a problem and you need to think in terms of multi-tenancy.

This is an instance of the granddaddy of modeling mistakes – that is, denormalization. Like other denormalization, it is fraught with issues. If your job or manual process doesn’t work perfectly, your data could be correct in one place but not the other. You’ll also be storing the data multiple times, creating more load for your system as well as burdening your file system unnecessarily.

The pain of multi-tenancy

When you have multiple users in your application, you will have to design your system to be multi-tenanted. You can do it well, or badly. Putting a spot of thought into your design can save you pain later.

I cover many more anti-patterns, along with the right way to do things in my book MySQL Simplified. Sign-up to hear when it’s going to come out!

Want to hear more about database design, specifically with MySQL? Well, I’m writing a book about it! You can subscribe below and I’ll send you updates. I’ll also send out stuff on MySQL like tips and tutorials.



 No spam


Tagged with:
Posted in Databases, IT
5 comments on “Designing Multi-Tenanted Databases – Anti-patterns
  1. Jan Steinman says:

    Your first link ( multi-tenancy ) is 404.

  2. Noel says:

    Fixed. Thanks!

  3. Baron says:

    After reading this post, I get a strong feeling you have not built many large multi-tenant applications with high data volume and velocity. Your material is good but too insistent that you know the one right way to do it. I would suggest to share your experiences without the message that others are Doing It Wrong. Of course a post on anti-patterns is hard to write without that message.

  4. Noel says:

    I respect what you have to say. I have built several large systems with hundreds of thousands of users, 10k requests per second, billions of rows, etc. While my experience pales in comparison to yours, I do have some knowledge in this field. However, this is not about me (hopefully). What would you change about the article(s) or what’s inconsistent? I sincerely appreciate it.

  5. Baron says:

    There are lots of people who make my experience look minimal too. But I’ve found that in many multi-tenant situations with lots of tenants and data, schema- or table-per-tenant is the only viable pattern. I think this varies depending on the characteristics of the data (not just its size and rate). And lots of obnoxious things have to be done to make it work, as you mentioned. For example, dealing with schema versioning. I’ve worked in situations where it’s completely impractical to change schema at all, and this is a common theme in large apps. Schema changes are either impossible or are one of the harder things that is done. I know of some apps that have to be able to deal with older schema versions forever, while supporting new versions too.

    I think what I would suggest is just to be less absolute that there’s a right and wrong way to do it. (I should listen to this advice too.)

    For example, I remember writing this white paper http://www.percona.com/about-us/mysql-white-paper/scaling-mysql-with-virident-flash-drives-and-multiple-instances-of-percona-server We wrote about ways to get more out of hardware, but we were not thrilled with the required techniques. Then some people who helped build large cloud apps (the ones who made SaaS a thing, for example) tweeted expressions of ickiness at me. So their experience showed it can be done in one way, while ours was looking at how to get the most out of tens of thousands of dollars worth of hardware. I think a lot of times it comes down to how much money and time the company has to spare. Money solves a lot of problems “nicely” but it’s also possible to solve those problems efficiently even if it doesn’t seem as nice.

    There’s probably other spectrums to think about but that’s one I’ve run into.

Leave a Reply

Your email address will not be published. Required fields are marked *

*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>