I ran into an issue with a vCenter database recently, where I couldn’t see historical performance data anymore in the past week-, month- and year-view. When investigating it, it turned out that I couldn’t see any performance statistics older than 24hrs. I also took a look at the database and it turned out that the database grew to over 200GB!
Both must be related. Performance samples probably weren’t processed anymore. Since the database server suffered from a couple of crashed, my first though was that this was probably due to a corruption in the database. I started monitoring the database more thoroughly and witnessed it growing 10GB in a week!.
First thing I did was lowering the statistics level from 3 to 1 for both the 5min. and 30min. sample intervals.
To do this go to Administration -> VirtualCenter Management Server Configuration and select Statistics. Also note the “estimated space required” value of 9.07GB. I guess this database grew just a little bit larger
Digging on the Internet, I found VMware KB article 1004382 , which describes the rollup job error. I didn’t knew of the existence of these SQL agent jobs and was a bit surprised. As mentioned in the possible circumstances, this problem was introduced by moving the database to a different SQL server and recreation of the SQL agent jobs was simply forgotten. Recreation of these SQL agent jobs is also necessary if you migrate from SQL Express to SQL Standard or Enterprise.
Recreating these SQL agent jobs will resolve the database growth problem, but will it get rid of all the historical data inside the database?. I also found VMware KB article 1000125 which describes how to purge historical data from your database (tasks, events, and performance data) using a provided SQL (or Oracle) purge script.
I made up the following resolution plan and checked it with VMware support.
Step1. Recreate SQL agent jobs
- Backup vCenter database
- Recreate SQL agent jobs
Step2. Purge old data
- Backup vCenter database
- Stop vCenter Server
- Start SQL purge script
- Shrink SQL database
- Start vCenter Server
We decided to plan step1 during an evening on a weekday and step2 during the weekend.
After implementing step1, the “past day stats rollup” job kept running until I killed it after 2 days, because I wanted to run the purge script to purge data from the database as soon as possible. I started step2 on a Friday evening and decided to remove everything older than 30 days. On Monday, the purge script was still running and I had to kill that too, to get the vCenter Server back into production.
It turned out that the purge script had deleted 112million rows over the weekend and that only 10GB of free space was available in the database. This is roughly only 6%. I guess this database grew extremely large and according to these numbers the database must hold about 1.8billion records. This is scary! It would mean that we probably need another 30 days to complete purging the database and of course that is no option. So in the end I decided it is better to start with a fresh database and copy some of the customizations I made over time using some PowerCLI magic. Maish Saidel-Keesing over at Technodrone created a nice PowerCLI script where some of these tasks are automated. I used his script as a starting point for creating my own script. Whenever it’s finished I’ll definitely share it with you
So what did we learn from this?
- Monitor your vCenter database size
- Whenever moving your vCenter SQL database to a different SQL server don’t forget to recreate your SQL agent scripts.
This is also applicable when you migrate your database from SQL Express to SQL Standard or Enterprise, because the SQL Server Agent is not available in Microsoft SQL Server 2005 Express and historical data is managed by the vCenter Server service in this case. For instructions on how these jobs can be created see VMware KB article 1004382
- It’s better to run the purge script on a regular basis to keep your database as small as possible. For instructions see VMware KB article 1000125.
Funny enough VMware KBTV released a video just recently on how to purge old data from the vCenter database.
- Virtualization links for this week Tweet I’ve been really busy this week patching my ESX hosts troubleshooting ballooning problems moving ESX hosts to other SAN switches troubleshooting network/vlan problems putting some effort in helping people...
- VMware vCenter Server Heartbeat Tweet VMware just released the VMware vCenter Server Heartbeat product which was first publicly announced at VMworld Europe 2009. VMware vCenter Server Heartbeat is an addon which creates high availability...
- Export and import customization profiles using Powershell Tweet One great thing in the automation of VM deployments is the use of customization profiles. These profiles are stored inside the vCenter Server database. However, when you loose the...
- VMware vCenter Data Recovery Tweet When taking the VMware vCenter Data Recovery lab on VMworld Europe 2009 I was totally surprised about this new product. The product, which will be part of the new...