What's safe in simplification |
This page in 2003 Top Previous Next |
|
Safe simplification Some of the actions that OAK can take to simplify formulas can be performed completely safely; that is, the simplified formula will always give the same answer as the original formula. For example, =A1++B1 can be rewritten as =A1+B1, and depended on to give the same answer. The rule that OAK follows to make transformation just shown is that x++y maps to x+y. This rule is always safe, by which we mean that the transformation it describes can be made without there being the slightest risk that the meaning of the formula, or the results it generates, will be altered. This is true whatever x and y are. They can be •cell references, as in the =A1++B1 example with which we have started •ranges, as in =A5:A9++B5:A9 •ExcelNames, as in =Revenue++Costs OAK terms the transformation from x++y to x+y as conservative, in that it is safe under any conditions. Unsafe simplification 1 The example with which we introduced expression simplification had the pattern =IF(x>y,x,y) mapping to MAX(x,y). That simplification is fine so long as x and y are coordinate references. But if x and y are name references, there's a good chance that the two formulas will give different answers. =IF(TaxDue>Threshold, TaxDue, Threshold) is not at all the same as MAX(TaxDue,Threshold). The reason has to do with Excel's propensity to pick out single cells from ranges by matching rows and columns. Similarly, the expression =AVERAGE(B5+0,B8*1) is equivalent to =AVERAGE(B5,B8). But it does not follow from that that we can substitute x whenever we encounter x+0, or y whenever we encounter y*1. =AVERAGE(B5:D5+0,B8:D8+0) is not at all the same as AVERAGE(B5:D5,B8:D8). In the first, Excel will use column matching to pick out two cells to average; in the second, it will calculate the average of six cells. The rules Excel follows in such circumstances are in Lessons I, J and K of the Excel names tutorial in this help. Unsafe simplification 2 The expression =IF(A1>B2,C3,0) can be rewritten as =(A1>B2)*C3. Not everyone would think that the second formulation was any clearer, but some people like it. But it is not safe to infer a rule that =IF(x,y,0) can map to =x*y. Such a rule would work splendidly if x evaluates to TRUE or FALSE, since those truth values are coerced by Excel into the integers 1 and 0 the moment they are involved in any maths. But what happens x is not a truth value? Suppose instead it was an expression that evaluated to 4. Just as Excel will turn truth values into 1 and 0 integers when the context demands it, so it will turn numbers into truth values. Since Excel expects the first parameter of the IF() function to be a truth value, it will coerce the 4 into one. The convention is that 0 is treated as FALSE, and any other value is taken as TRUE. IF(4,y,0) therefore evaluates to IF(TRUE,y,0), which in turn is y. If we apply our rule, we will map this expression from =IF(x,y,0) to x*y, which will come out as 4*y, which is not the same answer as the untransformed expression. OAK terms the transformations shown here as aggressive. They will work in the majority of real world conditions, but it can't be relied upon to be safe whatever x is. |