Homework #5 - due Thursday, November 6 by 5pm
Problems students are to do:
- UG students: please do problems 1 and 2 below
- G students: please do problems 1-4 below
As always, list necessary assumptions, and include respective p-values in parentheses next to your conclusion - e.g., one might conclude, "the data suggested a difference in the treatments (p = 0.0013)."
1. Huet, Bouvier, et al. (Statistical Tools for Nonlinear Regression, p.2) use the Pasture Regrowth data from Ratkowsky (Nonlinear Regression, p.88) to fit a four-parameter sigmoidal growth model. In the dataset, Y = pasture regrowth (since last grazing) and X = time; for our present purposes, we can assume that the data are independent measurements. The model function these authors used is rather complicated, and good starting values for the model parameters are not easy to find; they can be obtained only once we understand the roles the parameters play.
(a) Look at this SAS Program/Output, write down the assumed model function, and discuss the roles of the model parameters that determine the upper and lower asymptotes of the model function; assume that θ4 is positive. Next, look only at the plot to obtain a guess of these asymptotes, and discuss how to get good starting values for θ1 and θ2. Next, and most challenging, get starting values for θ3 and θ4 by showing how the fitted linear regression is related to the original model function.
(b) Look at the NLIN output and comment on the estimated upper and lower asymptotes (using the parameter estimates). Also, based on the output, do a (two-tailed) Wald test that θ4 = 3 using α = 1%. Redo this (Wald) test using α = 5%. Clearly report your conclusions in each case.
(c) Repeat your tests in part (b) using LR tests.
(d) Examine the residuals, and comment on your findings. If the NLIN were to be rerun with the third point (x = 21) removed, would the estimate of the lower asymptote increase or decrease? Why?
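As a reminder of the mechanics of the Wald test in part (b), here is a minimal Python sketch. The function names and all numbers are hypothetical and are not taken from the NLIN output; the critical value must be looked up in a t table for the error degrees of freedom reported by NLIN.

```python
def wald_t_statistic(estimate, std_error, null_value):
    """Wald statistic: (estimate - hypothesized value) / standard error."""
    if std_error <= 0:
        raise ValueError("std_error must be positive")
    return (estimate - null_value) / std_error


def reject_two_tailed(t_stat, critical_value):
    """Reject H0 when |t| exceeds the two-tailed critical value."""
    return abs(t_stat) > critical_value


# Hypothetical numbers for illustration only (NOT from the SAS output).
t_stat = wald_t_statistic(estimate=2.5, std_error=0.25, null_value=3.0)
print(t_stat)  # -2.0

# 2.571 is the 0.975 quantile of a t distribution with 5 df (assumed df).
print(reject_two_tailed(t_stat, critical_value=2.571))  # False
```

Changing the significance level only changes the critical value; the test statistic itself stays the same.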
2. In Nonlinear Regression Analysis and its Applications (1988, p.269), Bates and Watts report data from Treloar (1974) regarding the "velocity" of an enzymatic reaction. The number of counts per minute of radioactive product from the reaction was measured as a function of substrate concentration (ppm), and from these counts the initial rate, or "velocity," of the reaction was calculated (counts/min²). The experiment was conducted once with the enzyme treated with Puromycin (treated = "yes") and once with the enzyme untreated (treated = "no"). The velocity is assumed to depend on the substrate concentration according to the Michaelis-Menten (MM2) equation. It has been hypothesized that the "ultimate velocity parameter" (θ1) should be affected by introduction of the Puromycin, but not necessarily the "half-velocity parameter" (θ2). This SAS program/output may help us answer these questions.
(a) Give estimates for the MM2 model parameters for both the treated and untreated curves; display these in a 2x2 box.
(b) Using Wald hypothesis tests, test the relevant hypotheses, reporting test statistics, p-values and conclusions. Approximate p-values as best you can here and in the next part.
(c) Using a full-and-reduced (likelihood-based) F-test, test whether the half-velocity parameters are the same, reporting the test statistic, p-value and conclusion.
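The full-and-reduced F-test in part (c) compares the error sums of squares of the two fits. The sketch below shows the arithmetic only; the function name and the numbers are hypothetical, and the actual SSE and error df values must be read from the SAS output. The p-value is then approximated from an F table with (df_reduced - df_full, df_full) degrees of freedom.

```python
def extra_ss_f_statistic(sse_reduced, df_reduced, sse_full, df_full):
    """F = [(SSE_R - SSE_F) / (df_R - df_F)] / [SSE_F / df_F]."""
    if df_reduced <= df_full:
        raise ValueError("the reduced model must have more error df")
    numerator = (sse_reduced - sse_full) / (df_reduced - df_full)
    denominator = sse_full / df_full
    return numerator / denominator


# Hypothetical values for illustration only (NOT from the SAS output).
f_stat = extra_ss_f_statistic(
    sse_reduced=100.0, df_reduced=21, sse_full=80.0, df_full=20
)
print(f_stat)  # 5.0
```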
3. In "Calibration and assay development using the four-parameter logistic curve" (Chem. Intell. Lab. Systems, 1993, p.97), O'Connell et al. fit the LL4 (four-parameter log-logistic) model function to radioimmunoassay (RIA) data. The data and SAS analysis are provided in the linked program/output. This program first uses PROC NLIN and then three PROC NLMIXEDs.
(a) Comment on the necessary assumptions and the roles of the parameters for the model fit in the NLIN. Look at the residual plot and comment on whether all necessary assumptions (of the NLIN) appear to be met. The first PROC NLMIXED just fits the same homoskedastic (assumed constant variance) model.
(b) Explain what is being done in the second PROC NLMIXED in terms of the new "model" and the roles of the parameters, and note the value of -2LL. Perform a likelihood-based test of whether the extra parameter (ρ) in the variance is needed, writing out your hypotheses, test statistic, p-value, and conclusion.
(c) It turns out that the third PROC NLMIXED involves another, more appropriate, way of modelling the variance. Comparing this latter NLMIXED with the first NLMIXED, perform a likelihood-based hypothesis test for homoskedasticity, again writing out your hypotheses, test statistic, p-value, and conclusion. Finally, compare the parameter estimates and SEs for this latter (third) and the first NLMIXED - what has changed?
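For the likelihood-based tests in parts (b) and (c), the test statistic is the drop in -2LL from the reduced (homoskedastic) fit to the fuller fit. The sketch below assumes the two models differ by exactly one variance parameter and uses the usual large-sample chi-square approximation with 1 df; the function name and the -2LL values are hypothetical, not read from the NLMIXED output.

```python
import math


def lr_test_one_df(neg2ll_reduced, neg2ll_full):
    """Likelihood ratio test for one extra parameter.

    Returns the statistic (difference in -2LL) and its chi-square(1)
    p-value. For 1 df, P(chi2 > x) = erfc(sqrt(x / 2)).
    """
    stat = neg2ll_reduced - neg2ll_full
    if stat < 0:
        raise ValueError("the fuller model cannot have the larger -2LL")
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical -2LL values for illustration only (NOT from the SAS output).
stat, p_value = lr_test_one_df(neg2ll_reduced=110.0, neg2ll_full=106.0)
print(stat)               # 4.0
print(round(p_value, 4))  # 0.0455
```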
4. Examine this SAS program/output and determine what model is being fit. Write down the model function. What is the relevance of the parameter named PHI? Write down its estimate, do a (Wald) t-test that the true value of PHI is equal to -0.40 (using α = 5%), and write down the 95% Wald Confidence Interval (WCI) for this parameter. Next, do the likelihood test of whether PHI is equal to -0.40 (using α = 5%). It turns out that the true 95% confidence interval for PHI here is very different from the WCI; in your opinion is the true CI shifted to the right or to the left of the WCI? Why?
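The Wald confidence interval has the form estimate ± (critical value) × (standard error), so it is symmetric about the estimate by construction. A minimal sketch, with a hypothetical function name and made-up inputs rather than values from the SAS output:

```python
def wald_confidence_interval(estimate, std_error, critical_value):
    """Return (lower, upper) = estimate -/+ critical_value * std_error."""
    if std_error <= 0:
        raise ValueError("std_error must be positive")
    half_width = critical_value * std_error
    return estimate - half_width, estimate + half_width


# Hypothetical inputs for illustration only (NOT from the SAS output);
# the critical value would come from a t table at the error df.
lower, upper = wald_confidence_interval(
    estimate=1.5, std_error=0.25, critical_value=2.0
)
print(lower, upper)  # 1.0 2.0
```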