|
In the business world, the individual who builds a particular spreadsheet is rarely the only person to use it.
As either a spreadsheet builder or a spreadsheet user, it is often convenient to have the capability of producing a summary of a particular workbook. For those who build spreadsheets in the first place, such a facility would help to detect errors before releasing models. For end-users, it provides an overview of the logic of the model.
It would also be extremely valuable if you could state in a few words how complex a given workbook is. This information can help you to estimate how long it might take for the average user to learn to use the workbook effectively.
Of course, Excel workbooks can vary enormously in both size and complexity. Depending on the spacing between formulas, a workbook which contains many worksheets can be simpler than a workbook containing a single worksheet. Thus it is difficult to judge the complexity of a given workbook from its size alone.
Furthermore, even if you know the number of formula cells in a given workbook, you would still need some measure how many different, or distinct, formulas are present. Even nicer would be a measure of the complexity of the individual distinct formulas; this would allow a reasonable estimate of how difficult the workbook might be to understand.
The OAK4 | Workbook or Worksheet | Summarize command provides you with just this kind of information about an Excel workbook.
OAK WORKBOOK / WORKSHEET SUMMARY
The OAK4 | Workbook or Worksheet | Summarize command allows you to establish key facts about the size and complexity of a workbook.
This command can generate three reports:
| • | a workbook summary that lists all the selected worksheets contained within the workbook, with detailed statistics for all the unprotected worksheets; |
| • | a list of all distinct formulas (on unprotected worksheets) in the selected worksheets, along with details for each, including a measure of their complexity; |
| • | a list of all referred constants in the selected worksheets, along with details for each. |
Note that if you want OAK to provide statistics for the protected worksheets in a workbook, you must first unprotect them.
Why would you want to use it?
The Summarize command is useful in the following situations:
| • | when you first encounter a model, the workbook summary gives a measure of how large and complex it is and allows you to estimate how long it will take to understand it; |
| • | similarly, when you first encounter a model, the formula list helps you to understand the complexity of individual formulas and to identify those that may require particular attention (i.e. those that are the longest); |
| • | when you are producing model documentation, you may require an overview of the model structure and a list of all the formulas contained in the model; |
| • | when you are ready to release a model, you will need to confirm that it contains no unknown error cells. |
| • | when you are reconciling model inputs with model documentation. |
How to use it
When you choose the Summarize command from the OAK menu, a dialog box is displayed, giving you a few options to choose from.
The Summarize Workbook, List Distinct formulas and List Referred Constants generate different reports. An additional option, Output in Same Workbook, simply defines where to place the reports. The lists allow you to choose which workbook, or which individual sheets within that workbook to summarize.
Workbook Summary
The workbook summary report lists the sheets selected to be summarized and analyzes the unprotected worksheets for the following information:
| • | numbers of non-blank rows and columns, which are multiplied together to give the effective number of cells in use (this is not the number of non-blank cells); |
| • | an indication of whether the used range (located using Ctrl+End or the Edit | Go To | Special | Last Cell command, see Chapter 4) is significantly outside the range of non-blank cells; |
| • | number and percentage of the cells in use containing formulas; |
| • | number and percentage of the formula cells that contain distinct formulas; a distinct formula is one which is inconsistent with the formula in the cell to its left and with the formula in the cell above (see Chapter 4 for details); |
| • | number and percentage of the formula cells which contain error values; |
| • | number and percentage of the cells in use containing constants; |
| • | number of individual arrays, number of array cells, and the percentage of the cells in use which are array cells; |
| • | number of merged cells and the percentage of cells in use which are merged |
| • | Last Cell Location. This column on the workbook summary report identifies worksheets whose used range, as defined by Excel, is much larger than the region on the worksheet which actually contains information. The discussion on used range explains how this can happen, why it's inconvenient, and what you can do about it. |
| • | Formula Cells, Distinct Formulas, Constant Cells, Array Cells. These columns provide you with a rough idea of the overall logic and complexity of the model. A high proportion of constants on a worksheet indicate that the sheet is an input sheet. By contrast, a high proportion of formulas indicate that the worksheet is dominated by calculations. Finally, a high proportion of distinct formulas and arrays indicate that the worksheet is complicated. |
| • | Error Cells. This column indicates the presence of Excel error cells and can be used to confirm that a workbook does not contain any error cells when it is released. These cells are either constant or formula cells which result in one of the following: #NULL!, #DIV/0!, #VALUE!, #REF!, #NAME?, #NUM!, or #N/A. |
| • | Merged Cells. This column tells you if there are any merged cells on the worksheet. Operis finds the presence of merged cells in a spreadsheet to be inconvenient. This column gives an indication of how affected by this issue are the worksheets being studied. |
Notation
If the used range includes too many blank rows (at the bottom) or too many blank columns (on the right), this is indicated using the following text strings:
| • | “>Rows”, which means there are between 10 and 20 blank rows at the bottom of the used range; |
| • | “>>Rows”, which means there are more than 20 blank rows at the bottom of the used range; |
| • | “>Columns”, which means there are between 10 and 20 blank columns on the right-hand edge of the used range; |
| • | “>>Columns”, which means there are more than 20 blank columns on the right-hand edge of the used range. |
Distinct Formula List
The distinct formulas report lists all of the distinct formulas in the workbook or selected worksheets. A distinct formula is one which is inconsistent with the formula in the cell to its left and with the formula in the cell above (see Chapter 4 for details). The report provides the following information for each formula:
| • | the worksheet which contains the formula; |
| • | the address of the first cell which contains the formula; |
| • | the complexity of the formula; |
| • | the number of cells which contain the formula; |
| • | a flag to indicate whether the formula refers to a range on a different worksheet (“Off Sheet”) or workbook (“Book Link”); see the “External Refs” section below; |
The most interesting columns are those with the headings Complexity, Frequency, and External Refs.
| • | Complexity. This column shows the number of elements which make up the formula, and is therefore useful in estimating how complicated each formula is. Sorting the list in descending order by this column identifies the most difficult formulas in the workbook, and the ones which are the most likely to contain errors. |
| • | Frequency. This column simply tells you how many cells contain the same formula. If your worksheets follow good practice in developing models, this column will tend to contain two values: 1 and N, where N is the width of your model. |
| • | External Refs. This column tells you if the formula refers to a cell on another worksheet or workbook. The flag “Off Sheet” is used to indicate that the formula refers to a range on another worksheet using cell references or local names, but not if it does so using global names. The flag “Book Link” is used to indicate that the formula refers to a range in a different workbook. |
Referred Constants List
The referred constants report lists all of the referred constants in the workbook or selected worksheets. A referred constant is a cell that contains an alpha/numeric value that is referred to by at least one other cell. The report provides the following information for each constant cell:
| • | the worksheet which contains the formula; |
| • | the address of the cell which contains the constant; |
| • | the value of the constant. |
This allows the constants used in the model to be checked against related documentation.
Wrinkles
Summarize workbook command cannot analyze worksheets which are protected. Neither the Workbook Summary nor Distinct Formulas reports will provide information about protected worksheets.
Operis notes that there are several different approaches to distinct formulas in cases where formulas are identical but located in different parts of a worksheet or in different worksheets. It may be the case in other packages that an aggressive approach is used, where if a formula is identical to another but separated by a blank line, they still count as one distinct formula. An even more aggressive approach may also used, where provided a formula is identical to another formula in a different part of the worksheet they are counted as only one distinct formula, no matter where on a worksheet that formula may be located. OAK, however, uses a prudent approach, where as long as the formula is dissimilar to the formula in the cell above it or to the left of it, it is counted as a distinct formula.
Due to the limitations of Excel, this report cannot analyze very long formulas. Instead, the following message is displayed:
“***** WARNING: FORMULA TOO COMPLEX TO RETURN *****”. Formulas reported in this way should be modified if at all possible.
What you could do if you didn't have OAK
OAK is not the only package to offer a function equivalent as Summarize Workbook. Many other Excel packages will be able to perform a similar task.
Things that are distinctive about OAK's Summarize Workbook are
| • | that the reports are prepared exceptionally fast: generally a few seconds. Some packages, including versions of OAK before 3.5, can take over an hour to summarize a large spreadsheet. |
| • | the cautious approach that OAK takes in deciding which formulas are distinct |
It is also possible in Excel to be able to review the formulas in each cell by using Excel's Tools | Options | View | Formulas command. The shortcut for this command is Ctrl+`. (On a British keyboard, this is an obscure key between the Tab and Escape keys, to the left of the number 1. There is never any occasion other than this short cut to press it, at least not in Windows.)
|