Saturday, August 21, 2004

Expand Your Focus for Performance

Hopefully this isn't considered a violation of my NDA or anything along those lines, but I just had to share. I left anything too touchy out so it is pretty vauge technically.

At work we have an older program that is very critical for many of our internal customers. It uses and older database backend (similar to MS Access). There is a very important project coming up, and they need it to generate some reports. Unfortunately some of the reports were running for a very long period of time (more then 12 hours) and still not completing. It is a very complex process to generate the data needed for the reports. The program starts generating the data and then eventually hits a point where the CPU utilization and memory consumption are very low and the hard drive access light just goes solid.

Well, because of the importance of this project we were all told to make it our top priority to improve the performance of these reports. I am fairly new to the company so I have had no prior experience with this program. Steve had previously optimized the program as much as possible given the current design - he didn't think a change to program logic was the answer.

We were sitting around discussing what we could do when Steve asked if the cache of the database had been increased - turns out this database is very limited in this regard and its various settings had been explored with very little success. This got me thinking. I asked how big the datafiles were. Turns out it is only 10 to 40 megabytes. I suggested putting the database on a ramdrive - effectively caching the whole thing.

Brian installed a RAM drive and the database was moved over. The windows temp directory was also redirected there since the database used that to store temporary files. Logging was also turned off, but the log could have been redirected to the temp folder (theoretically).

The result was that these reports were done in close to a hundredth of the time. It is hard to get an accurate measure since we were testing on different systems and didn't bring our stop watches. Maybe come Monday we will have some hard data, but I am sure it will be a very substantial improvement.

Using a modern, higher-end database system could also fix the issue, but that is an architectural change.

The moral of the story is to think outside the box. When your only tool is a hammer everything looks like a nail, and when your only tool is the source code then that is where you try to get everything done.

This is could be the performance stack for a program:

  • Hardware
    • CPU
    • RAM
    • Hard drive
  • Operating System
  • Drivers
  • API Library
    • Service Packs
    • Shared DLL's (.NET Framework, DirectX, etc.)
  • Database
    • Data model
  • Development Libraries
  • Source Code
  • Compiler / Linker
    • Debug options
    • Shared libraries vs. Monolithic binary
    • executable compression
  • Deployment
  • User
    • Training
    • Expectations
    • Support
    • Other "unrelated" programs

Your bottleneck can exist anywhere in this chain, and this chain could be significantly more complex depending on the type of application you are running. For example a web based application. If you don't look at the whole chain then you could end up spending a lot of time with very little (if any) return. But making the right change at the right point can have a huge impact.

No comments: