Wednesday, 7 September 2016

Slow Database

As we move to larger datasets we have improved processors, disk I/O and CPU, however, we are still held back mostly by our own code.

We know that full table scans are bad and we do our best to avoid them, most of the time, and if you have a good DBA he/she will find any that come up over time, but there is another thing that can slow down the data when a large number of queries are overlapping.

Now locks are a perfectly natural thing in a database for data consistency and that is a good thing, nevertheless, this makes sense when changing data it's unnecessary for retrieval of information such as selects that normally make up the bulk of database queries.

SELECT doesn't hold the exclusive lock on pages rather it sets shared lock(S) on the pages to read and other transactions can't modify the data while shared locks exist(but can read the data by placing another shared lock). So it is expected that your SELECT blocks any updates.

So let's paint an example, if you have a website with 100,000 users some 10,000 might be online at the same time so that is some overlapping selects as many of the users view or query the same information, now there are many caching and other smart things you can do at the middleware layer to reduce this however at some point the query will still get to the database and at that point you don't want them waiting because the row is locked by update or insert is running and the rows/table/pages are locked.

So the pain here is when an update and a select are running the select uses a shared lock but the update and insert use exclusive lock.

Now one option would be to use nolock on all of your select statements problem solved right ?
Well not really as now you have reads on incomplete data and also its bad habit to get into as once you start using them you might put one on a "insert/update/delete" statement and then you have a lovely corrupt database.
Also if you have a good DBA he/she will have flagged any statement with nolock in the same way they will flag select * as they are not best practice.

So what to do?  Well, the answer is more simple... drum roll, please... ISOLATION LEVEL now this gives you the more options to read data both committed or uncommitted depending on your need, MySQL, Oracle and Microsoft SQL all support Isolation Level, so you can now control what needs locks and what doesn't.

This is not only best practice but the way you should use your database.

No comments: