Example: Coffee Taste Test ========================== Problem Description ------------------- A coffee taste test was conducted at the Stat-Ease office to improve the taste of the coffee [#]_. This example uses a simplified version of that experiment. We will look at 5 input factors: * Amount of Coffee (2.5 to 4.0 oz.) * Grind size (8-10mm) * Brew time (3.5 to 4.5 minutes) * Grind Type (burr vs blade) * Coffee beans (light vs dark) With one output, or `response`, variable: * Average overall liking (1-9) The liking is an average of the scores of a panel of 5 office coffee drinkers. Design Choice ------------- A `full factorial `_, that is, running all combinations of lows and highs, would take 2\ :sup:`5` = 32 taste tests. We want to add 8 center point runs to check for curvature, bringing the total number of runs up to 36. We can only do 3 per day, so as not to over-caffienate our testers, and can only do the tests on days when all 5 testers are in the office. That means the test will probably take a month or so. We can calculate the power of a full factorial using dexpy, assuming a signal to noise ratio of 2: .. code:: python import dexpy.factorial import dexpy.power import pandas as pd import numpy as np coffee_design = dexpy.factorial.build_factorial(5, 2**5) center_points = [ [0, 0, 0, -1, -1], [0, 0, 0, -1, 1], [0, 0, 0, 1, -1], [0, 0, 0, 1, 1] ] coffee_design = coffee_design.append(pd.DataFrame(center_points * 2, columns=coffee_design.columns)) coffee_design.index = np.arange(0, len(coffee_design)) sn = 2.0 alpha = 0.05 factorial_power = dexpy.power.f_power(model, coffee_design, sn, alpha) factorial_power.pop(0) # remove intercept factorial_power = ['{0:.2f}%'.format(i*100) for i in factorial_power] # convert to % ========== ====== Term Power ========== ====== amount 99.98% grind_size 99.98% brew_time 99.98% grind_type 99.99% beans 99.99% ========== ====== This means we have a 99.97% chance of detecting a change of 2 taste rating, assuming a standard deviation of 1 taste rating for the experiment. This is high enough that we decide to run a fraction instead, and get the experiment done more quickly. We can create a 2\ :sup:`5-1` fractional factorial, which will have 16 runs, along with the 4 center points for a total of 20. As you can see the power is still quite good. .. code:: python coffee_design = dexpy.factorial.build_factorial(5, 2**(5-1)) coffee_design.columns = ['amount', 'grind_size', 'brew_time', 'grind_type', 'beans'] center_points = [ [0, 0, 0, -1, -1], [0, 0, 0, -1, 1], [0, 0, 0, 1, -1], [0, 0, 0, 1, 1] ] coffee_design = coffee_design.append(pd.DataFrame(center_points * 2, columns=coffee_design.columns)) coffee_design.index = np.arange(0, len(coffee_design)) model = ' + '.join(coffee_design.columns) sn = 2.0 alpha = 0.05 factorial_power = dexpy.power.f_power(model, coffee_design, sn, alpha) factorial_power.pop(0) # remove intercept factorial_power = ['{0:.2f}%'.format(i*100) for i in factorial_power] # convert to % ========== ====== Term Power ========== ====== amount 96.55% grind_size 96.55% brew_time 96.55% grind_type 99.61% beans 99.61% ========== ====== We can also check the power for the interaction model: .. code:: python twofi_model = "(" + '+'.join(coffee_design.columns) + ")**2" sn = 2.0 alpha = 0.05 factorial_power = dexpy.power.f_power(twofi_model, coffee_design, sn, alpha) factorial_power.pop(0) factorial_power = ['{0:.2f}%'.format(i*100) for i in factorial_power] # convert to % ===================== ====== Term Power ===================== ====== amount 93.67% grind_size 93.67% brew_time 93.67% grind_type 98.91% beans 98.91% amount:grind_size 93.67% amount:brew_time 93.67% amount:grind_type 93.67% amount:beans 93.67% grind_size:brew_time 93.67% grind_size:grind_type 93.67% grind_size:beans 93.67% brew_time:grind_type 93.67% brew_time:beans 93.67% grind_type:beans 98.91% ===================== ====== Run the Experiment ------------------ We can build the 2\ :sup:`5-1` design using :ref:`build_factorial `, then appending the 8 center point runs. We actually already did this to evaluate the power, but here is the code again. .. code:: python coffee_design = dexpy.factorial.build_factorial(5, 2**(5-1)) coffee_design.columns = ['amount', 'grind_size', 'brew_time', 'grind_type', 'beans'] center_points = [ [0, 0, 0, -1, -1], [0, 0, 0, -1, 1], [0, 0, 0, 1, -1], [0, 0, 0, 1, 1] ] coffee_design = coffee_design.append(pd.DataFrame(center_points * 2, columns=coffee_design.columns)) coffee_design.index = np.arange(0, len(coffee_design)) It is convenient to print out the design in actual values, rather than the coded -1 and +1 values, for when we make the coffee. .. code:: python actual_lows = { 'amount' : 2.5, 'grind_size' : 8, 'brew_time': 3.5, 'grind_type': 'burr', 'beans': 'light' } actual_highs = { 'amount' : 4, 'grind_size' : 10, 'brew_time': 4.5, 'grind_type': 'blade', 'beans': 'dark' } actual_design = dexpy.design.coded_to_actual(coffee_design, actual_lows, actual_highs) +-----+--------+------------+-----------+------------+-------+ | run | amount | grind_size | brew_time | grind_type | beans | +=====+========+============+===========+============+=======+ | 0 | 2.5 | 8 | 3.5 | burr | dark | +-----+--------+------------+-----------+------------+-------+ | 1 | 2.5 | 8 | 3.5 | blade | light | +-----+--------+------------+-----------+------------+-------+ | 2 | 2.5 | 8 | 4.5 | burr | light | +-----+--------+------------+-----------+------------+-------+ | 3 | 2.5 | 8 | 4.5 | blade | dark | +-----+--------+------------+-----------+------------+-------+ | 4 | 2.5 | 10 | 3.5 | burr | light | +-----+--------+------------+-----------+------------+-------+ | 5 | 2.5 | 10 | 3.5 | blade | dark | +-----+--------+------------+-----------+------------+-------+ | 6 | 2.5 | 10 | 4.5 | burr | dark | +-----+--------+------------+-----------+------------+-------+ | 7 | 2.5 | 10 | 4.5 | blade | light | +-----+--------+------------+-----------+------------+-------+ | 8 | 4 | 8 | 3.5 | burr | light | +-----+--------+------------+-----------+------------+-------+ | 9 | 4 | 8 | 3.5 | blade | dark | +-----+--------+------------+-----------+------------+-------+ | 10 | 4 | 8 | 4.5 | burr | dark | +-----+--------+------------+-----------+------------+-------+ | 11 | 4 | 8 | 4.5 | blade | light | +-----+--------+------------+-----------+------------+-------+ | 12 | 4 | 10 | 3.5 | burr | dark | +-----+--------+------------+-----------+------------+-------+ | 13 | 4 | 10 | 3.5 | blade | light | +-----+--------+------------+-----------+------------+-------+ | 14 | 4 | 10 | 4.5 | burr | light | +-----+--------+------------+-----------+------------+-------+ | 15 | 4 | 10 | 4.5 | blade | dark | +-----+--------+------------+-----------+------------+-------+ | 16 | 3.25 | 9 | 4 | burr | light | +-----+--------+------------+-----------+------------+-------+ | 17 | 3.25 | 9 | 4 | burr | dark | +-----+--------+------------+-----------+------------+-------+ | 18 | 3.25 | 9 | 4 | blade | light | +-----+--------+------------+-----------+------------+-------+ | 19 | 3.25 | 9 | 4 | blade | dark | +-----+--------+------------+-----------+------------+-------+ | 20 | 3.25 | 9 | 4 | burr | light | +-----+--------+------------+-----------+------------+-------+ | 21 | 3.25 | 9 | 4 | burr | dark | +-----+--------+------------+-----------+------------+-------+ | 22 | 3.25 | 9 | 4 | blade | light | +-----+--------+------------+-----------+------------+-------+ | 23 | 3.25 | 9 | 4 | blade | dark | +-----+--------+------------+-----------+------------+-------+ All that is left is to drink 24 pots of coffee and record the results. Note that, while the tables in this example are in a sorted order, the actual experiment was run in random order. This is done to reduce the possibility of incidental variables influencing the results. For example, if the temperature in the office for the first 8 runs was cold, the testers may have rated the taste higher. Hot coffee being more pleasing in a cold environment. If the first 8 runs were the only runs where amount was at its low setting, as it is in the sorted table above, we would confound the low amount effect with the effect of the cold office, and incorrectly conclude that a lower amount of coffee is better. +-----+------+-----+------+------+--------+------+ | run | hank | joe | neal | mike | martin | mean | +=====+======+=====+======+======+========+======+ | 0 | 4 | 4 | 4 | 5 | 5 | 4.4 | +-----+------+-----+------+------+--------+------+ | 1 | 3 | 3 | 3 | 3 | 1 | 2.6 | +-----+------+-----+------+------+--------+------+ | 2 | 2 | 3 | 3 | 2 | 2 | 2.4 | +-----+------+-----+------+------+--------+------+ | 3 | 9 | 9 | 9 | 8 | 8 | 8.6 | +-----+------+-----+------+------+--------+------+ | 4 | 1 | 3 | 1 | 2 | 1 | 1.6 | +-----+------+-----+------+------+--------+------+ | 5 | 2 | 6 | 1 | 2 | 3 | 2.8 | +-----+------+-----+------+------+--------+------+ | 6 | 7 | 6 | 6 | 8 | 9 | 7.2 | +-----+------+-----+------+------+--------+------+ | 7 | 1 | 4 | 2 | 5 | 5 | 3.4 | +-----+------+-----+------+------+--------+------+ | 8 | 5 | 7 | 8 | 6 | 8 | 6.8 | +-----+------+-----+------+------+--------+------+ | 9 | 2 | 6 | 3 | 5 | 1 | 3.4 | +-----+------+-----+------+------+--------+------+ | 10 | 2 | 5 | 4 | 5 | 3 | 3.8 | +-----+------+-----+------+------+--------+------+ | 11 | 9 | 9 | 9 | 9 | 9 | 9 | +-----+------+-----+------+------+--------+------+ | 12 | 8 | 3 | 4 | 4 | 7 | 5.2 | +-----+------+-----+------+------+--------+------+ | 13 | 4 | 6 | 2 | 4 | 2 | 3.6 | +-----+------+-----+------+------+--------+------+ | 14 | 9 | 8 | 8 | 8 | 8 | 8.2 | +-----+------+-----+------+------+--------+------+ | 15 | 7 | 6 | 5 | 8 | 9 | 7 | +-----+------+-----+------+------+--------+------+ | 16 | 7 | 6 | 4 | 5 | 5 | 5.4 | +-----+------+-----+------+------+--------+------+ | 17 | 7 | 7 | 7 | 7 | 6 | 6.8 | +-----+------+-----+------+------+--------+------+ | 18 | 7 | 2 | 2 | 4 | 3 | 3.6 | +-----+------+-----+------+------+--------+------+ | 19 | 6 | 6 | 4 | 5 | 6 | 5.4 | +-----+------+-----+------+------+--------+------+ | 20 | 7 | 4 | 4 | 3 | 6 | 4.8 | +-----+------+-----+------+------+--------+------+ | 21 | 6 | 7 | 5 | 7 | 6 | 6.2 | +-----+------+-----+------+------+--------+------+ | 22 | 5 | 4 | 3 | 6 | 4 | 4.4 | +-----+------+-----+------+------+--------+------+ | 23 | 7 | 6 | 4 | 5 | 7 | 5.8 | +-----+------+-----+------+------+--------+------+ We'll store the mean for later as another column in the DataFrame. .. code:: python coffee_design['taste_rating'] = [ 4.4, 2.6, 2.4, 8.6, 1.6, 2.8, 7.2, 3.4, 6.8, 3.4, 3.8, 9.0, 5.2, 3.6, 8.2, 7.0, 5.4, 6.8, 3.6, 5.4, 4.8, 6.2, 4.4, 5.8 ] Fit a Model ----------- The statsmodels package has an OLS fitting routine that takes a patsy formula: .. code:: python from statsmodels.formula.api import ols lm = ols("taste_rating ~" + twofi_model, data=coffee_design).fit() print(lm.summary2()) +-----------------------+---------+----------+---------+------------+---------+---------+ | term | Coef. | Std.Err. | t | P>|t| | [0.025 | 0.975] | +=======================+=========+==========+=========+============+=========+=========+ | Intercept | 5.1 | 0.1434 | 35.5718 | 0 | 4.7694 | 5.4306 | +-----------------------+---------+----------+---------+------------+---------+---------+ | amount | 0.875 | 0.1756 | 4.9831 | **0.0011** | 0.4701 | 1.2799 | +-----------------------+---------+----------+---------+------------+---------+---------+ | grind_size | -0.125 | 0.1756 | -0.7119 | 0.4968 | -0.5299 | 0.2799 | +-----------------------+---------+----------+---------+------------+---------+---------+ | brew_time | 1.2 | 0.1756 | 6.8339 | **0.0001** | 0.7951 | 1.6049 | +-----------------------+---------+----------+---------+------------+---------+---------+ | grind_type | -0.1333 | 0.1434 | -0.93 | 0.3796 | -0.4639 | 0.1973 | +-----------------------+---------+----------+---------+------------+---------+---------+ | beans | 0.45 | 0.1434 | 3.1387 | **0.0138** | 0.1194 | 0.7806 | +-----------------------+---------+----------+---------+------------+---------+---------+ | amount:grind_size | 0.25 | 0.1756 | 1.4237 | 0.1923 | -0.1549 | 0.6549 | +-----------------------+---------+----------+---------+------------+---------+---------+ | amount:brew_time | -0.075 | 0.1756 | -0.4271 | 0.6806 | -0.4799 | 0.3299 | +-----------------------+---------+----------+---------+------------+---------+---------+ | amount:grind_type | -0.175 | 0.1756 | -0.9966 | 0.3481 | -0.5799 | 0.2299 | +-----------------------+---------+----------+---------+------------+---------+---------+ | amount:beans | -1.325 | 0.1756 | -7.5458 | **0.0001** | -1.7299 | -0.9201 | +-----------------------+---------+----------+---------+------------+---------+---------+ | grind_size:brew_time | 0.375 | 0.1756 | 2.1356 | 0.0652 | -0.0299 | 0.7799 | +-----------------------+---------+----------+---------+------------+---------+---------+ | grind_size:grind_type | -0.725 | 0.1756 | -4.1288 | **0.0033** | -1.1299 | -0.3201 | +-----------------------+---------+----------+---------+------------+---------+---------+ | grind_size:beans | 0.375 | 0.1756 | 2.1356 | 0.0652 | -0.0299 | 0.7799 | +-----------------------+---------+----------+---------+------------+---------+---------+ | brew_time:grind_type | 0.75 | 0.1756 | 4.2712 | **0.0027** | 0.3451 | 1.1549 | +-----------------------+---------+----------+---------+------------+---------+---------+ | brew_time:beans | 0.15 | 0.1756 | 0.8542 | 0.4178 | -0.2549 | 0.5549 | +-----------------------+---------+----------+---------+------------+---------+---------+ | grind_type:beans | 0.0833 | 0.1434 | 0.5812 | 0.5771 | -0.2473 | 0.4139 | +-----------------------+---------+----------+---------+------------+---------+---------+ We can reduce this model by keeping only terms that have a p-value below 0.05 (bolded in the table above): .. code:: python # this gets the pvalues dataframe from the RegressionResults # and slices out the rows with p < 0.05 pvalues = lm.pvalues[1:] reduced_model = '+'.join(pvalues.loc[pvalues < 0.05].index) # you could also just specify it by hand: # reduced_model = "amount + brew_time + beans + amount:beans + " \ # "grind_size:grind_type + brew_time:grind_type" lm = ols("taste_rating ~" + reduced_model, data=coffee_design).fit() print(lm.summary2()) +-----------------------+----------+----------+---------+--------+---------+---------+ | | Coef. | Std.Err. | t | P>|t| | [0.025 | 0.975] | +=======================+==========+==========+=========+========+=========+=========+ | Intercept | 5.1 | 0.1659 | 30.7405 | 0 | 4.75 | 5.45 | +-----------------------+----------+----------+---------+--------+---------+---------+ | amount | 0.875 | 0.2032 | 4.3063 | 0.0005 | 0.4463 | 1.3037 | +-----------------------+----------+----------+---------+--------+---------+---------+ | brew_time | 1.2 | 0.2032 | 5.9058 | 0 | 0.7713 | 1.6287 | +-----------------------+----------+----------+---------+--------+---------+---------+ | beans | 0.45 | 0.1659 | 2.7124 | 0.0148 | 0.1 | 0.8 | +-----------------------+----------+----------+---------+--------+---------+---------+ | amount:beans | -1.325 | 0.2032 | -6.5209 | 0 | -1.7537 | -0.8963 | +-----------------------+----------+----------+---------+--------+---------+---------+ | grind_size:grind_type | -0.725 | 0.2032 | -3.5681 | 0.0024 | -1.1537 | -0.2963 | +-----------------------+----------+----------+---------+--------+---------+---------+ | brew_time:grind_type | 0.75 | 0.2032 | 3.6911 | 0.0018 | 0.3213 | 1.1787 | +-----------------------+----------+----------+---------+--------+---------+---------+ .. [#] http://www.statease.com/publications/newsletter/stat-teaser-09-16#article1