Top 10 Best Practices for SQL Server Maintenance
Perform a full database backup daily
- Technically there are no problems to backing up SAP databases online. This means that end users or nightly batch jobs can continue to use SAP applications without problems. SQL Server Backup consumes few CPU resources. However, SQL Server Backup does require I/O bandwidth because SQL Server will try to read every used extent to the backup device. Everything that is required for SAP (business data, metadata and ABAP applications etc) is included in one database named “<SID>”. Sometimes the time needed to take a full backup (generally a few hours) might become a problem, especially in SQL Server 2000 where no transaction log backups can be made while an Online Database Backup was performed. SQL Server 2005 does not have this issue.
- To create faster online backups using SAN Technology, SQL Server offers interfaces for SAN vendors to perform a Snapshot Backup or to create clones of a SQL Server database. However, backing up terabytes of data every night may overload the backup infrastructure. Another possibility would be to do differential backups of the SAP database on a daily basis and do a full database backup on the weekend only.
Perform transaction log backup Every 10 to 30 minutes
- In case of a disaster happening on the production server, it is vital that the most recent status can be restored using online or differential database backups plus a series of transaction log backups which ideally cover as close as possible to the time of the disaster. For this purpose it is vital to perform transaction log backups on a regular basis. If you only create a transaction log backup every two hours, the in the case of a disaster, up to two hours of committed business transactions would not be able to be restored. Therefore it is vital to do transaction log backups often enough to reduce the risk of losing a large number of committed business transactions in case of a disaster. In many productive customer scenarios, a time frame of 10-30 minutes proved to be an acceptable frequency. However, in combination with SQL Server log shipping, you can create SQL Server transaction log backups even every two or five minutes. The finest granularity achievable is to perform SQL Server transaction log backups scheduled by SQL Agent every minute. Besides reducing the risk of losing business transactions, transaction log backups also truncate log data in the SQL Server transaction log, and reducing the possibility of the transaction log becoming full.
Back up system partition in case of configuration changes
- Back up the system partition after any configuration changes. Use Windows Server 2003 Automated System Recovery (ASR), or other tools such as Symantec Ghost or SAN boot to restore the system partitions.
Back up system databases in case of configuration changes
- Back up the system databases (master, msdb, model) after any configuration changes. In SQL Server 2005, the resource database does not need to be backed up because it does not experience any changes and is installed with the SQL Server 2005 installation.
Run DBCC CHECKDB periodically (ideally before the full database backup)
- Ideally, a consistency check using DBCC CHECKDB sould be run before performing an online database backup. However, please note that DBCC CHECKDB is a very time and resource consuming activity that puts heavy workload on SAP production systems, especially on databases over one terabyte. On commodity hardware with a good I/O subsystem, I/O throughputs in the range of 100-150 GB/h can be achieved. Given such I/O throughputs, and the fact that there are many SAP databases up to 10 terabytes or more, it is clear that running a DBCC CHECKDB on a production system is not always practical. Therefore, many people choose not to run DBCC CHECKDB. Although all components of hardware and software have become more reliable over the last decade, physical corruptions can still happen. One reason for physical corruptions is a catastrophic power outage without having battery backup for hardware components. Another reason could be physical damage to connections or hardware components. In massive cases there is no other way than to go back to a backup and restore the SAP database and then apply all the transaction logs up to the most recent. However, to detect physical inconsistencies at an early state, or to know that the backup method is reliable, or to minimize impact of physical corruptions, the following three major measures should be considered:
- Consider running DBCC CHECKDB on a regular basis. This could be on a sandbox system that runs a restored image of the production environment. On such a system, time and resource consumption of DBCC CHECKDB would not be a concern and would not affect production users.
- Test actually restoring the SAP database from an online or differential and transaction log backup. The fact that a backup is on tape does not necessarily mean that it is consistent on tape or that it can be read from tape. Tape hardware or tape cassettes may fail over the years, and you do not want to be in a position where you have tapes that cannot be read anymore. Having a backup in a vault does not say anything about the ability to be able to restore in case of a disaster. The backup must also be proven to be readable.
- For databases with terabytes of volume, maintain a second copy of the database at the most recent status, using either log shipping or database mirroring. Both of these high-availability methods will de-couple hardware components and hence may provide a physical consistent image of the production database at a secondary site.
Evaluate security patches monthly (and install them if they are necessary)
- For most of SAP customers, availability is the most important requirement. Especially if they need to serve a single SAP instance globally, they don’t want to stop and restart the SAP servers to apply security patches. Plus, some testing in these environments is definitely necessary before installing the security patches. Therefore one of realistic scenarios for SAP customers is carefully evaluating patches and reducing the frequency of patch installations, hopefully almost to zero. Filtering unnecessary packets, disabling unnecessary services, and so forth are good security measures.
- If you have real time anti-virus monitoring, it is recommended that you exclude the SQL Server database files (including data files, transaction log files, tempdb and other system database files) from real time monitoring. If you perform backups to disks, exclude database backup files as well as transaction log backup files.
Evaluate update modules of hardware drivers and firmwares and install if necessary
- There have been critical issues due to bugs in hardware drivers and firmwares within the commodity servers. It is sometimes difficult to find this kind of issue within Microsoft, and furthermore hardware companies sometimes don’t provide enough support services to commodity server customers. So it is a customer’s responsibility to manage updates on drivers and firmwares regularly. Before updating the drivers on production commodity servers, thorough tests must be conducted on test and sandbox systems. Like nearly no other software component, a little flaw in a driver of an Host Bus Adapter (HBA) or SCSI card can be responsible for physical inconsistency within a database.
Update statistics on the largest tables weekly or monthly
- SQL Server provides two options to keep the statistics current: auto create statistics and auto update statistics. These two options are ON by default. SAP recommends keeping them ON. There may be some cases where auto update statistics may not be able to provide satisfactory performance. A specific case came up in SAP BW. The issue was resolved by the functionality in SAP BW that is documented in SAP OSS note #849062. Please keep in mind that auto update statistics is run only on tables with more than 500 rows. In some very specific cases of data developing into one direction, it is recommended to explicitly run update statistics on specific columns of the table on a scheduled basis. However, you should not perform a general manual update statistics. If performance problems are analyzed and the root cause is found in an index, or some column statistics not being recent enough, then the solution often is simply to have a certain column or index statistics updated on a more frequent basis.
Rebuild or defrag the most important indexes
- The impact of reorganizing tables and indexes on performance is highly dependent on the type of query that is executed and the I/O bandwidth that is available on the system. Simply going along measures like (1) Average page density < 80 percent or (2) Logical scan fragmentation > 40 percent as thresholds to start reorganizing are a waste of time and resources. Reasons are:
- Some SAP queuing tables will always show up as being highly fragmented
- A query reading a single row or a small number of rows which represents the majority of SAP queries do not benefit from reorganizing a table.
- If there is enough I/O bandwidth and memory for SQL Server on the database server, the impact of table fragmentation might be limited.
- Many people never reorganize tables to speed up query performance. However, there are also people who reorganize tables to compress them after they archived SAP data. Not all the tables are organized or sorted according to the archiving criteria. Hence it can happen that despite deleting 25 percent of a table, the table only decreased its volume by 10 percent. To maximize space reduction after archiving, you can run DBCC INDEXDEFRAG on the affected tables. DBCC INDEXDEFRAG will compress the data content on the pages of a table. DBCC INDEXDEFRAG treats every move of a bunch of rows to one page as a single transaction. Hence DBCC INDEXDEFRAG will result in many small transactions as opposed to creating an index which treats the entire index creation task as one large transaction. DBCC INDEXDEFRAG does not consume much CPU resources, but it does create significant I/O traffic. Therefore do not run too many DBCC INDEXDEFRAG commands in parallel. Completely reorganizing tables by re-creating their clustered indexes should not be done on large tables because this will generate huge amount of transaction log.
Use a health check monitoring tool for performance, availability, and so forth
- Unplanned downtime depends on how quickly system failures are notified to administrators and how soon they can start the recovery process. For availability, SAP administrators should be aware that automatic failover mechanism of Microsoft Clustering Services (MSCS) or database mirroring (DBM) is able to provide continuous availability of the system. However, a failover itself will cause rollback of open transactions on the database side which again will cause rollbacks on business transactions on the SAP side. The impact of these batch processes breaking and data not being available might be serious (for example, with a payroll calculation). Therefore monitoring the system and notification after failovers can be vital to having interrupted SAP Business processes restarted as quickly as possible.