Sunday, 30 November 2008

File fragmentation

Here is something administrators and system architects often overlook: file fragmentation. Many believe it doesn't impact their systems enough to be worth factoring into the design, but the long-term effect can have a high impact on performance and can even prevent service.

To prevent this we first have to understand how it happens. Luckily this isn't rocket science; in fact it's quite simple. Imagine that your hard disk is a bookshelf. When all is working well the books are in alphabetical order, but if there isn't room in the place where a book should go, it takes the first available space on the shelf instead. After this happens a few hundred times with reads and writes of data, you end up with many books in places that make them harder to find, which pushes seek times up. With large files, because a file can end up scattered over many parts of the drive, you can even hit I/O timeouts on file read operations.
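The bookshelf behaviour can be sketched in a few lines. This is a toy first-fit allocator, not how NTFS or any real file system actually works; the file names, sizes and 12-block "disk" are invented purely to show how a delete followed by a larger write splits a file across the free gaps.

```python
# Toy first-fit allocator: the disk is a list of block owners (None = free).
# Illustrative only -- real file systems use extents and smarter placement.

def allocate(disk, name, size):
    """Write `size` blocks for `name` into the first free blocks found,
    splitting the file across gaps when no single gap is big enough."""
    placed = 0
    for i, owner in enumerate(disk):
        if owner is None:
            disk[i] = name
            placed += 1
            if placed == size:
                return
    raise RuntimeError("disk full")

def fragments(disk, name):
    """Count the contiguous runs of blocks belonging to `name`."""
    runs, prev = 0, None
    for owner in disk:
        if owner == name and prev != name:
            runs += 1
        prev = owner
    return runs

disk = [None] * 12
allocate(disk, "A", 4)                            # A A A A . . . . . . . .
allocate(disk, "B", 4)                            # A A A A B B B B . . . .
disk = [None if b == "A" else b for b in disk]    # delete A, leaving a gap
allocate(disk, "C", 6)                            # C fills A's old gap...
print(fragments(disk, "C"))                       # ...and spills past B: 2 pieces
```

File C is bigger than the gap A left behind, so first-fit splits it into two runs of blocks, and every read of C now needs an extra seek.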

On SQL Server it can lead to tasks and operations such as backups failing; on file servers it can lead to files not opening on the first request, and to general complaints from users about server performance.

One of the quickest ways to prevent this from happening is not to mix the data types and applications that have access to a volume.

If you have many applications placing data on a volume, you can quickly end up with a fragmented drive. One of the best examples I have come across was a SQL backup kept on the same volume as the database itself. This doesn't sound like much of an issue at first, but if I explain, all will become clear.
The database is growing and the backup is recreated every day. Imagine there is not enough free space for the database to grow at the end of its file, so the new data is placed just after the end of the backup. Later that day the backup file is deleted and recreated; it is also larger, so it fills its old space plus some more after the end of the database, again.
This results in what I call the leap-frog effect, as data is always being added at the end. The volume quickly became 98% fragmented, and SQL Server hit I/O errors with operations timing out because they were taking longer than 15 seconds.
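The leap-frog pattern above can be simulated with the same kind of toy first-fit model. All the block counts below are invented (a 10-block database growing by 2 blocks a day, a backup that grows too); the point is only the interleaving: each day's database growth is stranded beyond the previous backup, so the database gains one more extent per day.

```python
# Sketch of the database/backup "leap frog" on one shared volume.
# Block counts are made up; the interleaving pattern is the point.

def first_fit(disk, name, size):
    """Write `size` blocks into free slots in order, growing the volume
    at the end if free space runs out (keeps the demo simple)."""
    for i in range(len(disk)):
        if size == 0:
            return
        if disk[i] is None:
            disk[i] = name
            size -= 1
    if size:
        disk.extend([name] * size)

def extents(disk, name):
    """Count contiguous runs of blocks belonging to `name`."""
    return sum(1 for i, b in enumerate(disk)
               if b == name and (i == 0 or disk[i - 1] != name))

disk = []
first_fit(disk, "db", 10)                # initial database file
first_fit(disk, "bak", 10)               # first backup, right behind it
for day in range(5):
    first_fit(disk, "db", 2)             # db growth lands past the backup
    disk = [None if b == "bak" else b for b in disk]   # old backup deleted
    first_fit(disk, "bak", 12 + 2 * day) # bigger backup refills the gap
                                         # and spills past the new db extent
print(extents(disk, "db"))               # database is now in 6 extents
```

After only five days the database is split into six extents; run the loop for months and you get the 98% fragmented volume described above.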

I know what you're thinking: this was some old hardware and that is why it made such a big difference... well, you'd be wrong. This was new sixth-generation hardware, and the data was on SAN drives, which as you know can be up to twenty times faster than local drives. And we are talking about a database that was only 120 GB; bearing in mind I've seen 2.8 TB SQL databases on the same hardware without issue, this really begins to show how much fragmentation can impact a server.

SQL Server has encountered 21 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [E:\mssql\data\testdb_log.ldf] in database [testdb]

On file servers the same thing can happen, appearing as SCSI I/O errors where Windows is not getting the data back as quickly as it expects. On file servers the problem is more complicated: a file share is a file share, so there isn't a lot that can be done... or is there?

Well, you can take some steps. Apart from regular defragmentation of the volume, you can also split the shares so that application shares and user shares sit on separate volumes.

So, some quick bullet points for you:

- Separate the system, data, log and backup files for SQL servers onto different volumes.
- Separate file shares by the type of share, and defragment those volumes regularly.
- Create volumes based on their use, and consider how the data will behave under many write operations.
