EXCELcises as of Nov 15, 2009


The assignment is not in final form until it is check-marked

Attach the assigned EXCELcises to the regular homework securely; be sure to label them carefully & include your name. Color coding applies as in the rest of the assigned problems.

SAVE THE TREES: Print compactly, on the fewest sheets possible.

1.1  (a) Using the necessary formulas in column C and in rows 16-19, construct the spreadsheet in this picture (click here);  (b) Print;  (c) Display all formulas [Press CTRL + ` (grave accent)] and print again.

1.2  (a) Last week's payroll for our firm's part-timers looks like this (click here);  using the data on the right of the table (new hours worked), prepare this week's payroll in Excel. (b) Print;  (c) Display all formulas as in 1.1 (c) and print again.

3 Simulate 900 flips of two fair coins in columns A&B of an Excel sheet as in our class exercise. Place the sum of A and B in column C. Use the COUNTIF( ) function to count the values in columns A and C (the frequency), and bar-graph the respective relative frequencies of A & C. Fit the upper part of the simulation in one page and print normally and, again, in "show-formulas" mode  [CTRL + `]; use proper zoom and column width to fit the page;  (print a total of 2 pages); no need to show all 900 lines and there is no need to show the graphs on the second sheet.(save the file and modify to use in 4, below)

4 Repeat the previous exercise for 900 rolls of two fair dice in columns A&B. Use the FREQUENCY function to count the values in columns A and C (the frequencies), and do a column graph of the respective relative frequencies of A & C. Fit the upper part of the simulation in one page and print normally and, again, in "show-formulas" mode  [CTRL + `]; use proper zoom and column width to fit the page;  (again, print a total of 2 pages); no need to show all 900 lines and there is no need to show the graphs on the second sheet.
Comment:
(a) What proportion of "3"s would you expect in Col. A? What did you get?
(b) What proportion of "3"s would you expect in Col. C? What do you actually get? (Hand-print these answers and circle the values very neatly on the first sheet)

 

5. Why Does Sampling Work? Sampling is used widely in business, manufacturing, science, social research and elsewhere. Why? In order to represent a much larger set,  such as the population of a state, the totality of items manufactured, the quality of the corn grown in large area, etc. It may be too expensive or even impossible to check every one of millions of items or persons; so we examine a few. How few? If we ask only one person how many cups of coffee they drink in one day, we learn next to nothing about the rate of coffee consumption in NJ. But if we ask five, fifty, or 500 randomly selected folks the same question, we begin to get an estimate of the answer with an increasing sense of accuracy.

In order to see the effect of increasing sample size n on the accuracy of the estimate, we use Excel to simulate random observations, =5+10*RAND(), and plot the frequency of the average (for n=1, 2, ...5) on a comparative graph, like this.

Generate 1000 sets of five-tuples in columns A-E (the example is for 500 sets of observations). Column G (AVERAGE($A1:A1)) is identical to A (the average of one number equals the number itself), col. H=AVERAGE($A1:B1), ..., K=AVERAGE($A1:E1). These are averages of random samples of a single observation, 2 observations, etc. Show the respective frequency distributions (N-R) with bins in M. Add a combined graph of the frequencies. (the column graph is for n=1; show it similarly or via a line graph.
Using "Landscape" orientation, print three (or four) single pages to display
(a) the graph,
(b) columns A-R (as in the example), and
(c) the formulas in columns (A-R)
[Press CTRL + ` (grave accent)]; use appropriate zoom, e.g. 50% to help fit these columns; if necessary, widen columns so the equations show entirely. You may need 2 pages for Part (c)-- no need to show all 1,000 samples, however (see example)
(d) Explain, briefly, what the comparative graph tells us about sampling and the relative precision of the procedure.

 6: Another Sampling Simulation?

A manufacturing process produces computer chips using a totally new method. As is often the case in such production situations, not all of the product is usable. A proportion p of it turns out to be defective. Samples of 20 chips are tested every hour to ascertain that the process is stable. We simulate 600 samples of 20 chips each with p=.4 (value in cell A1). The formula to provide 0's and 1's in the correct proportion is  = IF(RAND()<$A$1,1,0). The above figure shows a representative portion of the spread-sheet which is similar to EXCELcise 5, above.

The sampling distribution (column Z and this graph) shows the variation in sample proportion (p-hat) due to chance alone.
{ Optional Question: If a sample shows too many defectives (if, say, p-hat is more than .55) the manufacturing process is stopped and examined for the causes of high defective rate. What is you best estimate of the probability that this happens due to chance alone? Use the above figures for this challenge }

 

7.  Descriptive statistics: A. Using the following 25x10 block of data (Set A) find its quartiles, median, mean, etc as in the example. Some applicable formulas are given. Fill in the rest and also do the relative frequency and graph with proper x-axis (bins provided).  (a) Comment briefly: Is the distribution symmetric/skewed (b) What are the outliers (if any)? B. Repeat for Set B. [Hints: cut-and -paste the data from here into Excel, do not retype. You can use the same sheet for both parts A and B and change the data or, alternatively,  you can copy-and-paste the worksheet (A) in Sheet 1 into Sheet 2 (B) and then change the data. Print each of A & B in one page, if possible. No need to print the formulas. ]

        Set A                   Set B          
22.11 31.5 24.96 24.08 17.1 23.46 32.38 16.29 19.62 27.88   20.66 22.53 30.84 19.97 19.31 39.03 26.06 21.62 16.65 48.48
33.41 28.07 23.29 31.8 24.45 26.95 19.32 31.27 16.93 26.63   18.6 19.95 25.89 31.33 25.77 20.51 35.17 18.18 19.68 24.85
21.71 21.58 30.73 24.8 21.44 32.51 19.14 22.09 21.58 32.76   26.81 32.59 27.86 19.93 32.77 23.13 29.2 33.06 27.03 15.9
29.82 27.67 23.4 22.89 22.86 21.21 26.83 28.55 27.21 24.5   23.72 28.22 33.78 22.33 25.72 16.35 22.9 31.2 26.52 26.59
20.7 26.34 32.69 24.51 23.27 21.36 21.89 20.16 18.01 24.24   16.87 20.08 17.39 30.28 25.56 23.15 30.01 34.16 32.6 27.94
18.02 27.22 27.07 25.31 34.64 14.96 26.97 26.61 31.73 29.06   23.91 20.41 22.36 22.7 20.84 30.81 25.14 15.57 23.46 23.85
27.23 15.65 30.71 14.98 20.27 27.95 16.38 28.76 26.19 22.22   26.63 19.99 19.42 26.27 17.08 23.58 18.93 30.62 32.73 28.15
28.03 26.91 27.82 7.88 21.66 26.26 19.71 31.35 20.66 18.71   26.87 25.87 20.89 19.22 25.32 34.64 26.19 29.44 23.89 31.03
20.15 28.84 30 30.19 19.63 22.18 22.48 26.12 22.37 41.31   28.43 22.23 31.7 33.81 21.26 29.25 23.67 19.21 28.7 26.45
28.58 17.07 30.44 21.59 24.8 17.53 27.79 29.45 32.36 24.21   25.02 25.86 29.01 26.68 29.92 23.36 18.54 12.59 21.62 19.86
14.1 31.25 21.7 25.09 25.48 18.96 24.85 24.81 15.51 23.09   35.24 27.43 18.77 20.83 22.63 26.09 22.63 32.38 20.43 24.04
31.67 23.39 25.32 24.58 14.04 26.85 20.93 18.75 22.62 27.7   29.85 16.11 25.27 24.5 22.52 19.09 19.47 24.16 27.15 27.36
19.26 23.29 23.02 20.55 27.52 19.51 13.15 25.87 18.88 14.1   19.86 35.96 19.61 29.72 44.44 25.22 31.24 24.36 29.45 17.94
31.3 28.17 17.65 23.64 31.6 24.72 22.03 24.24 26.15 22.25   26.72 33.23 24.31 23.06 24.64 25.92 30.51 17.93 26.22 17.12
33.57 27.74 29.24 22 17.4 32.04 26.1 23.7 16.42 20.84   26.11 20.67 22.42 29.26 26.76 26.82 28.58 24.15 26.8 17.81
24.96 20.07 32.25 21.36 33.8 22.22 25.69 23.12 18.2 22.54   33.96 19.94 17.19 33.66 24.89 26.77 14.26 25.85 19.89 27.11
30.12 26.68 25.1 26.96 23.42 24.37 18 18.04 19.5 21.64   17.75 13.05 33.61 30.24 23.29 32.17 16.62 22.19 28.24 22.06
29.98 35.37 24.17 21.47 17.84 25.65 23.14 30.27 12.33 27.74   32.21 18.99 29.79 32.11 26.23 27.8 26.05 26.58 24.82 16.33
35.42 27.5 17.86 20.48 20.07 33.02 20.15 23.95 29.32 22.52   21.47 26.53 26.87 20.97 25.14 21.08 22.01 24.87 19.14 26.23
25.62 29.85 24.43 20.67 14.43 15.93 23.86 21.05 24.64 29.44   34.48 32.13 21.53 28.95 28.2 21.98 27.82 29.55 20.56 23.92
23.02 32.54 21.87 25.42 18.55 22.96 26.95 19.07 21.49 23.33   19.72 23.51 24.57 22.59 20.45 22.56 28.1 29.97 26.51 29.22
22.25 27.43 21.11 32.1 18.4 22.16 27.83 21.2 28.96 32.65   27.43 28.79 23.95 27.51 21 19.12 27.61 24.13 22.76 27.83
30.41 19.08 27.06 22.99 33.32 27.66 22.65 25.46 15.82 25.46   26.53 19.84 24.2 17.35 34.81 26.49 22.11 25.95 17.59 18.82
30.14 29.69 30.47 23.61 20.77 24.21 29.23 34.28 32.9 23.85   23.41 27.27 32.08 32.96 20.7 16.71 25.35 23.45 17.93 16.1
30.84 30.54 23.92 24.25 24.37 21.91 30.27 23.17 23.27 26.25   43.43 21.66 28.24 30.54 35.57 32.35 16.87 22.17 23.26 24.71


(the following is an example; your data will be Set A and Set B, respectively. In the formulas, "D" indicates the data block)

8.  Probability meets Statistics (and Normality and Simulation and Excel)...

The data of the preceding Excelcise is normally distributed, specifically, from N(25,6). It was generated by NORMINV(RAND(),25,6). In this instance, we use the same formula to simulate voter samples from the distribution N(43,7) as described in homework problem 13.20. (a) Duplicate Excelcise 7 with these parameters for 1000 observations and show the relative frequency as in col. AD (and graph) in the adjoining figure.
(b) Superimpose the normal curve
N(43,7) to see how well the data fits it. That is, in col. AA, use
NORMDIST(AB2,m,s,FALSE)*q
(the bins in column AB are s/4=1.75=q units apart; that is why the values in col AA need to be normalized by a factor q); graph as shown;
(c) What formula would you use to answer the question in problem 13.20? Show the formula and use it in cell AF22 for the numerical value.