Friday, November 26, 2010

SQuirreL away that data

The most useful language any programmer should know is T-SQL - structured query language.  When you have a solid understanding of relational database architecture and the tools to manipulate it, extracting handy information from your business data becomes almost effortless.

Relational DB systems abound: Access, SQL Server, Oracle, MySQL and PostgreSQL are a few examples.  You can even use SQL to intelligently compile data right out of your Excel worksheets.

As macro developers, we often rely on canned reports produced by corporate systems to feed us data.  This is usually in the form of .csv text files, web scrapings, or stacks of steaming, reeking workbooks that take hours to recalculate.  Sometimes, especially if data volumes are large, moving everything into Access and pulling it back into Excel using SQL Select statements can be the best way to carve out the specific information required.

In a recent project, a client was mining several years worth of member profile data to compile statistics and spot geographic trends.  The prototype system, using a small subset of data and a few pivot tables, appeared to get the job done.  But the workbook was already topping 20MB in size - a real pain to shuttle around.  Then the full data set accumulated to over 65,000 records - more than the number of rows in an Excel 03 (client's spec) worksheet... time to step up to MDB.

I massaged the data in pieces in Excel, imported it into an Access database, and was stunned to find the .mdb file to be only 3.2 MB in size!  That's sweet compression.  Smaller than my prototype workbook was, zipped, but holding 20 times the data.

Next, do away with pivot tables.  Nothing personal. I actually think pt's are a pretty good idea, but they can be unwieldy, and many end users fear them.  Programming them takes you into the outer wastelands of VBA - sketchy knowledge for most of us, (but examples are out there,) and getting exactly the result you need might not be possible - this has been my experience with most of Excel's special power features.

I think SQL is considerably more powerful, and a lot less complicated.  With a simple* piece of code I can build a mini pivot table that doesn't pivot, because it's already pivoted to what the user wants to see.  Here's an example:

SELECT TOP 10 ZipCode, AVG(Revenue) 
FROM MemberView GROUP BY ZipCode ORDER BY 2 DESC

This gives us a nice little Answer, suitable for publication in this years Annual Report - a breakdown of the top 10 zip codes by average revenue.

The key to this is setting up a View, which is another more complex SELECT query that joins data from different tables.  This View can also pre-compute some answers, convert ugly binary 1's and 0's to pleasant things like "Yes" and "No", and limit the data set as needed.  It sits in the DB and looks just like another table.  This brings all the data we need to a single point of light.

I wanted to let users pick the field to zip-rank.  For this I linked a drop-down control to a list of cells containing the fields in my View, and the report macro simply subs the user choice into the Select statement.  With this dead-simple, pre-pivoted table maker, users can't mess up.  Instead, they grind out report after report, all day long, grinning like madmen.

Doing some of the heavy lifting in the database itself means the macros and formulas can be much simpler, and the JET database engine can crunch the numbers a lot faster than Excel anyway.  I love that my Business Reporting System workbook is only 100Kb.  Not having data to lug around means macro updates and bug fixes don't impact the client, which helps keep me off-site.  When the data set scales up even further, I'm well positioned to migrate to an enterprise system like SQL server, with minimal impact on the Excel code.

In a Previous Post I showed you some VBA code to support Access DB connectivity and  pulling data using SQL Select statements.  This should be enough to get you started.  I'd love to hear about any nifty tricks or nasty pitfalls you encounter.

No comments:

Post a Comment