Google

Learn Excel from MrExcel Video Podcast

EXCEL FUNCTIONS

Excel Programming

Excel Charting

Excel VBA

Friday, March 2, 2007

SUMPRODUCT

Multiple Condition Tests

One of the most basic functions in any spreadsheet is to return an answer based upon some condition. This becomes especially useful when counting or summing based upon that condition. One condition is useful, but multiple conditions extend the functionality and flexibility, so that you can count say the number of items sold by part number AND by month. There are a number of ways that this can be achieved within Excel, but this paper is focussing on one particular function, the SUMPRODUCT function, which by creative use has evolved a flexibility undreamt of by its originators in Microsoft.

SUMPRODUCT is one of the most versatile functions provided in Excel. In its most basic form, SUMPRODUCT multiplies corresponding members in given arrays, and returns the sum of those products. This page discusses the classic use of SUMPRODUCT, how creativity and inbuilt flexibility has enabled it to evolve into a far more useful function, and explains some of the techniques being deployed. Finally, some examples of SUMPRODUCT show its versatility.

Standard Use of SUMPRODUCT
Evolving Use of SUMPRODUCT
Advantages of SUMPRODUCT
SUMPRODUCT Explained
Format of SUMPRODUCT
Examples
References
Acknowledgments

Standard Use of SUMPRODUCT

In it's classic form, SUMPRODUCT multiplies each value in one array by the corresponding value in another array, and returns the summed result. As an example, if cells A9:A11 contain the values 1,2,3 and B9:B11 contain 10,20,30, then

=SUMPRODUCT(A9:A11,B9:B11)

returns 140, or (1*10)+(2*20)+(3*30)=10+40+90=140.

This is a useful function, but nothing more than that. A further, more 'creative' use of SUMPRODUCT has evolved, and is still evolving, driven as far as I can see mainly by the regular contributors of the Microsoft Excel newsgroups. This has been a creative and productive process that has significantly increased the useability of SUMPRODUCT, but in a way that you will not find documented in Excel's Help.

Evolving Use of SUMPRODUCT

Within Excel, there are two very useful functions that support conditional counting and summing, namely COUNTIF and SUMIF. Very useful functions, but limited in that they can only evaluate a single test range, and only a single test condition. Multiple conditions are so useful to test ranges (say between two dates), and double tests (one array = A and another = B), and whilst this can be managed using array functions
=SUM(IF(test_A,IF(test_B, etc.,
this is somewhat unwieldy, and is an array formula. And there is a better way, using SUMPRODUCT.

Note that in this section, all formulae given are using the '*' (multiply) operator format, but this in itself is one of the biggest discussion points around the SUMPRODUCT function, one which is discussed below.

To understand how SUMPRODUCT can be used, first consider the following data.

A

B

C

1

Make

Month

Price

2

Ford

June

7,500

3

Ford

June

8,300

4

Ford

May

6,873

5

Ford

June

11,200

6

Renault

June

13,200

7

Renault

June

14,999

8

BMW

June

17,500

9

BMW

May

23,500

10

BMW

June

18,000

Table 1.

We can easily count the number of Fords with
=COUNTIF(A1:A10,"Ford")
which returns 4.

Similalrly, it is straight-forward to get the value of Fords sold, using
=SUMIF(A1:A10,"Ford",C1:C10),
which gives 33,873.

How do we get a count of how many Fords are sold in June, or the value of them? The number can be calculated with
=SUM(IF(A1:A10="Ford",IF(B1:B10="June",1,0),0)),
which is an array formula so is committed with Ctrl-Shift-Enter, not just Enter. Similarly, the value is obtained with
=SUM(IF(A1:A10="Ford",IF(B1:B10="June",C1:C10,0),0)),
also an array formula.

But as this page is about SUMPRODUCT, you would expect that we could use that function in this case, and we can. The solution for the number of Fords sold in June using this function is
=SUMPRODUCT((A1:A10="Ford")*(B1:B10="June")).
The value is obtained with
=SUMPRODUCT((A1:A10="Ford")*(B1:B10="June")*(C1:C10))
In my view, this formula more readily shows what the objective is.

As a further extension of its use, we can use the '+' (plus) operator to count OR conditions, such as how many cars sold were either Fords, or were sold in June. This formula shows how
=SUMPRODUCT((A1:A10="Ford")+(B1:B10="June"))

Although array formulae are mentioned here, they are not explained. For a detailed discussion, see Chip Pearson's Array Formulas web page.

So far, so good, in that we have a versatile function that can do any number of conditional tests, and has an inbuilt flexibility that provides extensibility. Its power is augmented when combined with other functions, such as can be found in the examples below.

Advantages of SUMPRODUCT

Multiple conditional tests are a major advantage of the SUMPRODUCT function as descibed above, but it has two other considerable advantages. The first is that it can function with closed workbooks, and the second is that the handling of text values can be tailored to the requirement.

In the case of another workbook, the SUMIF function can be used to calculate a value, such as in
=SUMIF('[Nowfal Rates.xls]RATES'!$K$11:$K$13,">1")

This is fine in itself, and the value remains if the other workbook is closed, but as soon as the sheet is re-calculated, the formula returns #VALUE. Similarly, if the formula is entered with the other workbook already closed, a #VALUE is immediately returned.

SUMPRODUCT, however, overcomes this problem. The formula
=SUMPRODUCT(--('[Nowfal Rates.xls]RATES'!$K$11:$K$13>1),--('[Nowfal Rates.xls]RATES'!$K$11:$K$13))
returns the same result, but it will still work when the other workbook is closed and the sheet is re-calculated, and can be initially entered referencing the closed workbook, without a #VALUE error.

The second major advantage is being able to handle text in numeric columns differently. Consider the follwoing dataset, as shown in Table 2.

A

B

1

Item

Number

2

x

1

3

y

2

4

x

3

Table 2.

If we are looking at rows 1:4. we can see that we have a text value in B1 In this case it is simply a heading row, but the principle applies to a text value in any row.

Using SUMPRODUCT, we can either return an error, or ignore the text. This can be useful if we want to ignore errors, or if we want to trap the error (and presumably correct it later).

Errors will be returned if we use this version
=SUMPRODUCT((A1:A4="x")*(B1:B4))

To ignore errors, use this amended version which uses the double unary operator (see SUMPRODUCT Explained below for details)
=SUMPRODUCT(--(A1:A4="x"),(B1:B4))

But how does it work?

SUMPRODUCT Explained

To understand how SUMPRODUCT works will help you to determine where to use it, how you can construct your formula, and thus how you can extend it.

Table 3. below shows an example data set that we will use. In this example, the problem is to find how many Fords with a category of "A" were sold. A9:A20 holds the make, B9:B20 has the category, and C9:C20 has the number sold. The formula to get this result is
=SUMPRODUCT((A9:A20="Ford")*(B9:B20="A")*(C9:C20)).

The first part of the formula (A9:A20="Ford") checks the array of makes for a value of Ford. This returns an array of TRUE/FALSE, in this case it is
{TRUE,FALSE,TRUE,TRUE,TRUE,TRUE,TRUE,FALSE,TRUE,TRUE,TRUE,TRUE}

Similarly, the categories are checked for the vale A with (B9:B20="A"). Again, this returns an array of TRUE/FALSE, or
{FALSE,FALSE,TRUE,TRUE,FALSE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE}

And finally, the numbers are not checked but taken as is, that is (C9:C20), which returns an array of numbers
{3,4,2,1,4,3,2,8,6,8,7,6}

So now we have three arrays, two of TRUE/FALSE values, one of numbers. This is showm in Table 4.

And this is where it gets interesting. SUMPRODUCT usually works on arrays of numbers, but we have arrays of TRUE/FALSE here as well as an array of numbers. By using the '*' (multiply) operator, we can get numeric values that can be summed. '*' has the effect of coercing these two arrays into a single array of 1/0 values. Multiplying TRUE by TRUE returns 1 (try it, enter =TRUE*TRUE in a cell and see the result), any other combination returns 0. Therefore, when both conditions are satisfied, we get a 1, whereas if any or both conditions are not satisfied, we get a 0. Multiplying the first array of TRUE/FALSE values by the second array of TRUE/FALSE values returns a composite array of 1/0 values, or
{0,0,1,1,0,1,1,0,1,1,1,1}.
This new array of 1/0 values is then multiplied by the array of numbers sold to give another array of numbers sold that satisfy the two test conditions. SUMPRODUCT then sums the members of this array to give us a count.

Table 4. below shows the values that the conditional tests break down to before being acted upon by the '*' operator. Table 5. shows a virtual representation of those TRUE/FALSE values as their numerical equivalents of 1/0 and the individual multiplication results. From this, you should be able to see how SUMPRODUCT arrives at its result, namely 35.

A

B

C

9

Ford

B

3

10

Vauxhall

C

4

11

Ford

A

2

12

Ford

A

1

13

Ford

D

4

14

Ford

A

3

`5

Ford

A

2

16

Renault

A

8

17

Ford

A

6

18

Ford

A

8

19

Ford

A

7

20

Ford

A

6

Table 3.

A

B

C

9

TRUE

*

FALSE

*

3

10

FALSE

*

FALSE

*

4

11

TRUE

*

TRUE

*

2

12

TRUE

*

TRUE

*

1

13

TRUE

*

FALSE

*

4

14

TRUE

*

TRUE

*

3

15

TRUE

*

TRUE

*

2

16

FALSE

*

TRUE

*

8

17

TRUE

*

TRUE

*

6

18

TRUE

*

TRUE

*

8

19

TRUE

*

TRUE

*

7

20

TRUE

*

TRUE

*

6

Table 4.

A

B

C

9

1

*

0

*

3

0

10

0

*

0

*

4

0

11

1

*

1

*

2

2

12

1

*

1

*

1

1

13

1

*

0

*

4

0

14

1

*

1

*

3

3

15

1

*

1

*

2

2

16

0

*

1

*

8

0

17

1

*

1

*

6

6

18

1

*

1

*

8

8

19

1

*

1

*

7

7

20

1

*

1

*

6

6

35

Table 5.

Table 6. below shows you the same virtual representation of 1/0 numerical values without the numbers sold column, that is using SUMPRODUCT to count the number of rows satisfying the two conditions, or
=SUMPRODUCT((A9:A20=A1)*(B9:B20="A"))

No comments: