Developer-Driven Databases

Even though I have come late to the party of professional development, relatively speaking, I am acutely aware of the conflict that seems to pervade the developer-DBA relationship. Here is what I gather about why: DBAs used to be paid better than developers, often because they were able to reduce the overall license and hardware costs of large database installations. Both the size and the proprietary nature of databases made them incredibly expensive, so paying an individual gobs of money to make sure they ran efficiently and that the data was preserved was worth it.

Several trends have changed the playing field. The first is the arrival of small, commodity server hardware that makes mainframes or large servers unnecessary. This pushed costs down dramatically for most installations, while salaries for trained database professionals remained the same.

The second is the explosion of databases that do not (in theory) require a DBA. The most marked example is MySQL, which allowed developers to write simple SQL but was tiny and open-source (read: free). That was a first for a database that used SQL, and it was considered a boon to developers. A developer could install MySQL on his box and have no contact with a DBA until deployment time. This has changed a bit, and there are now MySQL DBAs and developers, although they are by far not as plentiful as SQL Server or Oracle professionals. The direct appeal to the developer has continued with Postgres, and then wandered into non-SQL territory with databases like MongoDB, CouchDB, Redis, Riak, and Cassandra. Once again, the developer can install these on his local box, and only tell the operations department when it’s time to deploy.

Database technology is now focused on making administration almost zero and on making the development process easier. The DBAs are not making the decision about what database to use; the developers are. The new breed of databases was made to compete in this environment, and even entrenched players had to play along.

There are some fundamental shifts in thinking with this latest batch of developer-driven databases (labeled NoSQL or NewSQL). The first is the idea that the team that built a solution will also maintain it (it has been called devops, but this is a trend even outside of the buzzword), and so there is less emphasis on the traditional DBA-role. The second is that SQL is not the best way to describe and access data, that a general-purpose language should be used instead (most commonly JavaScript). The third is that the most important part of a database is availability in the face of network outages or high usage.

The first trend of devops is good if it means that teams take responsibility for how their software works in the wild. If they are under the mistaken belief that software systems can perpetually run by themselves, then they are in for a rude awakening. There will always be administration (unless there’s a great leap forward in computing) in addition to coding.

The abandonment of SQL is, I believe, both sad and probably ephemeral. To think that data is not bound by a schema is to miss a very important part of software design: each record must be similar to others, or else there’s no point in writing software. I agree that SQL needs major additions in order to cover more types of data, but it is an evolving standard just like every other language.

The final worrying trend is the idea that availability needs to come before all else. Not enforcing durability or replica consistency is an inconvenience for customers and a nightmare for the software developers who have to deal with those confused customers. I have worked on systems that accept eventual replica consistency, and it took several weeks just to write the logic to deal with this basic design shortcoming. Even so, the customers still see the issue, just not as often. The other issue is durability. Most data needs to be saved; that’s why you’re using a database, and not a cookie in the browser. Trouble is bound to happen when a system relies on the idea that some data loss is acceptable, or that replication to another live system is a replacement for writing to disk or making backups. Once again, if availability comes before all else, the most vital component of infrastructure is unstable and filled with problems that were solved years ago.
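To give a feel for the kind of logic an eventually consistent replica pushes onto the application, here is a minimal sketch of one common workaround, "read your own writes". It is not the code from the systems I mentioned; the `Store`, `Client`, and `replicate` names are hypothetical, and the two stores are plain in-memory dictionaries standing in for a primary and a lagging replica.

```python
class Store:
    """Hypothetical in-memory stand-in for a database node."""

    def __init__(self):
        self.data = {}  # key -> (version, value)


def replicate(primary, replica):
    """Simulate asynchronous replication catching up."""
    replica.data.update(primary.data)


class Client:
    """Reads from the replica, but never returns data older than its own writes."""

    def __init__(self, primary, replica):
        self.primary = primary
        self.replica = replica
        self.last_written = {}  # key -> version this client last wrote

    def write(self, key, value):
        # Writes always go to the primary and bump the version.
        version = self.primary.data.get(key, (0, None))[0] + 1
        self.primary.data[key] = (version, value)
        self.last_written[key] = version

    def read(self, key):
        needed = self.last_written.get(key, 0)
        version, value = self.replica.data.get(key, (0, None))
        if version >= needed:
            return value  # replica is fresh enough for this client
        # Replica is behind this client's own write: fall back to the primary.
        return self.primary.data[key][1]
```

Even this toy version only protects the client that made the write. Any other client reading from the replica can still see the old value until replication catches up, which is exactly the confusion that reaches customers.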

These two trends run against the basic idea of a DBA: a dedicated professional devoted to making sure that the system is available, backed-up, and secure. The newer database systems do not have the feature-set that allows for true administration, and they are developed for an environment where the developer is managing the system in his spare time.

Database professionals do have much to offer to software engineering: they can provide experience with long-term support of applications, dealing with all sorts of black swan events, providing dramatic improvements in development time, and significant performance increases in terms of both throughput and latency. They can also provide knowledge and experience with many different systems, selecting the right tool for the job.

However, the relationship between developers and DBAs must be repaired. I think that the basic issue at hand is the fact that DBAs want everything locked down and developers want every single privilege possible. Devs often feel impotent, and DBAs feel that they have just allowed their database to be published on the Internet. Developers want things to be quickly done and DBAs want to take their time to make sure that changes won’t affect current systems. DBAs and developers also run the gamut of talent and attitude, and often they stereotype each other based on one bad experience.

The resolution is an honest talk about what is necessary in terms of security. Database designers need to make sure that their databases can be secured (for instance, credit cards are kept separate from other data). Development environments must have lots of data, but this data needs to be scrubbed. And the developers need full control of at least one environment. DBAs need to understand unit and integration testing, and need to have the ability to load test to reduce their fears about scaling. Finally, the database administrator knows a lot about databases, and so he needs to share his knowledge, research ways to make development faster, and use out-of-the-box functionality to remove boring coding like report-writing or basic ETL solutions.
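As a rough illustration of what scrubbing data for a development environment might look like, here is a minimal sketch. The field names (`name`, `email`, `card_number`) and the `scrub_record` function are assumptions made for the example, not a real schema or tool, and a real scrubbing job would need to cover every sensitive column.

```python
import hashlib
import re


def scrub_record(record: dict) -> dict:
    """Return a copy of a (hypothetical) customer record that is safe for development use."""
    scrubbed = dict(record)

    # Replace the email with a stable pseudonym so uniqueness and joins still behave.
    digest = hashlib.sha256(record["email"].lower().encode("utf-8")).hexdigest()[:12]
    scrubbed["email"] = f"user_{digest}@example.invalid"

    # Keep only the last four digits of the card number.
    digits = re.sub(r"\D", "", record["card_number"])
    scrubbed["card_number"] = "*" * (len(digits) - 4) + digits[-4:]

    # Real names are not needed to test the application.
    scrubbed["name"] = "Test User"
    return scrubbed
```

Hashing the email rather than randomizing it is a deliberate choice in this sketch: the same customer always maps to the same fake address, so developers keep realistic data volumes and relationships without seeing the real values.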

DBAs and developers need to be on the same team, because it takes a team to create large software applications. Applications change, technologies change, and salaries change, but the basic fact remains that groups of diverse people who put their common goals above their short-term self-interest will win out over homogeneous individuals acting alone. The greatest impediment to our own progress is our simple-minded attitudes. This change does not begin with the other guy; it begins with you.

Posted in Databases, IT
