Homework # 7 – due Thursday December 4 by 5pm
Problems students are to do
· Undergraduate students: please do problems A and C below
· Graduate students: please do problems A - D below
As always, list necessary assumptions, and include respective p-values in parentheses next to your conclusion - e.g., one might conclude, "the data suggested a difference in the treatments (p = 0.0013)." Again, please note that some of the following SAS program outputs are *very* lengthy, so you are warned to print them out at your own risk - rather, just copy down what you need and print out the essential sections! Do not include the computer output (from the computer programs) in your homework submissions except to indicate relevant test statistics, p-values, outliers, etc.
A. Diggle, Liang & Zeger
(1994) give an example of repeated measurements on the "size" (by
convention this is log height plus twice log diameter) of 79
(1) For each part (MANOVA, SP, each of the Mixed's), list the necessary assumptions,
(2) In the MANOVA approach, comment on the significance of the fact that the "time_6" contrast for "treat" is significant (p = 0.0031),
(3) Comment on the usefulness
or otherwise of the results of the Split Plot (SP) approach,
(4) Identify (and justify!!) which of the three covariance structures used
in Proc Mixed is most appropriate for these data. "Justify"
here means report the relevant test statistic(s) and results. How many
variance components need to be estimated for each of the three runs of Proc
Mixed?
(5) Using the appropriate analysis, comment on whether you feel the profiles for the two treatment curves are the same, reporting the relevant test statistic and p-value. Give your final conclusion.
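For the last question in part (4), the number of covariance parameters depends on the structure chosen and, for some structures, on the number of measurement occasions. The following is a minimal sketch only. It assumes three commonly used structures (compound symmetry, first-order autoregressive, unstructured), which may or may not be the three used in the assigned Proc Mixed runs, and `t` is a hypothetical number of occasions per subject.

```python
def n_cov_params(structure: str, t: int) -> int:
    """Count free parameters in a t x t within-subject covariance matrix.

    The structure names are illustrative labels, not read from the SAS output.
    """
    if structure == "CS":
        # Compound symmetry: one common variance and one common covariance.
        return 2
    if structure == "AR1":
        # First-order autoregressive: one variance and one correlation.
        return 2
    if structure == "UN":
        # Unstructured: every variance and every covariance is free.
        return t * (t + 1) // 2
    raise ValueError(f"unknown structure: {structure}")


if __name__ == "__main__":
    t = 5  # hypothetical number of occasions, for illustration only
    for s in ("CS", "AR1", "UN"):
        print(s, n_cov_params(s, t))
```

The count for the unstructured case grows quadratically in `t`, which is one reason simpler structures are often compared against it.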
B. On page 18 of Davidian & Giltinan, Nonlinear Models for Repeated Measurement Data, the authors present the data of Kwan et al. (1976), and fit a mixed nonlinear model of the form
E(Y) = b1*exp(-b2*X) + b3*exp(-b4*X),
in which Y = plasma concentration (of a drug called cefamandole) and X = time (in minutes post dose). For this study, a dose of 15 mg/kg body weight of the drug was administered by ten-minute intravenous infusion to six healthy volunteers. The data are input and analyzed in SAS here. The program first graphs the data then runs four Proc NLMixed's in SAS.
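To get a feel for the two-term exponential mean function above, it can be evaluated directly. The sketch below uses the parameter values quoted in part (1) of this problem. The grid of X values is hypothetical and chosen only for illustration, and the helper names `expected_conc` and `half_life` are not taken from the SAS program.

```python
import math

# Parameter values quoted in part (1) of this problem.
B1, B2, B3, B4 = 2.7733, 2.8139, 0.7870, 0.4195


def expected_conc(x, b1=B1, b2=B2, b3=B3, b4=B4):
    """Mean function: b1*exp(-b2*x) + b3*exp(-b4*x)."""
    return b1 * math.exp(-b2 * x) + b3 * math.exp(-b4 * x)


def half_life(rate):
    """Time for a single term exp(-rate*x) to fall to half its value."""
    return math.log(2) / rate


if __name__ == "__main__":
    # Hypothetical X grid, for illustration only.
    for x in (0.0, 0.25, 0.5, 1.0, 2.0, 4.0):
        print(f"X = {x:4.2f}   E(Y) = {expected_conc(x):.4f}")
    print("half-life of first term :", half_life(B2))
    print("half-life of second term:", half_life(B4))
```

At X = 0 the curve equals b1 + b3, and the term with the larger rate (b2) dies out first, so the tail of the curve is governed by b3 and b4.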
(1) Using the values of b1 = 2.7733, b2 = 2.8139, b3 = 0.7870, b4 = 0.4195, graph the above curve, and comment on the role of each of the four parameters (this part has been done for students and discussed in class – so no need to do it!).
(2) Listing all necessary assumptions, explain what is being done in each of the NLMixed runs, commenting on which model is being fit and identifying the underlying assumptions. Be specific. Also, identify which models are special cases of others (i.e., which are nested, and in which models they are nested).
(3) Choose the NLMixed analysis and model which best describes these data, and give specific reasons for why you chose the model you did, and why you rejected the others. Give test statistics, degrees of freedom and p-values to justify your claims.
(4) The parameters b2 and b4 are important since they address the rate of decrease of the expected concentration function. Contrasting the estimated standard errors of these two parameters for the first NLMixed with those for the second through fourth NLMixed's, why are these SEs lower for the latter three NLMixed's than for the first one?
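When two of the fitted models are nested and both were fit by maximum likelihood, the comparison asked for in part (3) is often made with a likelihood ratio test: the difference in -2 log-likelihood is referred to a chi-square distribution whose degrees of freedom equal the number of extra parameters. The sketch below is one way to do that arithmetic with the Python standard library. The -2LL values in the example are hypothetical, not taken from the SAS output, and the chi-square reference is the usual large-sample approximation (it tends to be conservative when the reduced model sets a variance component to zero, since that value lies on the boundary).

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(chi-square with df degrees of freedom > x).

    Uses the closed-form series for positive integer df.
    """
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction series.
    term, total = 1.0, 0.0
    for k in range((df - 1) // 2):
        if k > 0:
            term *= x / (2 * k + 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2 * x / math.pi) * math.exp(-half) * total


def lrt(neg2ll_reduced: float, neg2ll_full: float, df: int):
    """Return (test statistic, p-value) for reduced vs. full nested models."""
    stat = neg2ll_reduced - neg2ll_full
    return stat, chi2_sf(stat, df)


if __name__ == "__main__":
    # Hypothetical -2LL values and df, for illustration only.
    stat, p = lrt(neg2ll_reduced=120.0, neg2ll_full=110.0, df=2)
    print(f"LRT statistic = {stat:.2f}, p = {p:.4f}")
```

Models that are not nested cannot be compared this way; information criteria such as AIC or BIC are the usual fallback in that case.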
C. Reanalyze the Sitka89 data
from exercise A using the SAS program/output here.
This program runs one NLIN and four
NLMixeds. The NLIN and the first NLMixed are run "by
treatment," so each produces two sub-outputs.
The respective outputs are identified by a corresponding title.
(1) Identify the models that are fit in each of these runs and the underlying assumptions.
(2) Focusing on the NLIN in output # 1, make a 2x4 table of parameter estimates and comment on which parameters seem close (guesses can be made using approx. CIs) for the two treatments.
(3) It turns out that outputs # 2 and # 3 are virtually identical: the -2LL's for output # 2 sum to -206.1 + (-492.8) = -698.9, almost that of output # 3 (-697.2). In model # 3, what is the role of the "add" terms (i.e., as in "th1add", "th2add", etc.)?
(4) Of the models #1, #3, # 4, and # 5 –
which are special cases of others (nested)?
(5) What hypothesis can be tested by comparing outputs # 3 and # 4?
Clearly list the hypotheses, test statistic, p-value and your conclusion.
Which parameters are random in these models? Do these models assume that these random parameters have the same variance or different variances?
(6) Answer the questions in (5) but comparing
outputs #4 and #5.
(7) Using the model you feel best
describes these data, summarize the data with your model.
In comparing the NLIN output and the last NLMIXED, comment on the differences
in the estimated variability associated with the LD50 parameter for each of
these models.
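Names such as "th1add" in part (3) suggest an offset parameterization, in which each parameter for the second treatment is written as a baseline value plus an additive term. That reading is an assumption here, since the SAS program is not reproduced in this handout. Under that assumption, the idea can be sketched as follows; the function name `treatment_params` and all numeric values are hypothetical.

```python
def treatment_params(base, add, treat):
    """Return treatment-specific parameters base[k] + add[k] * treat.

    treat is an indicator: 0 for the reference treatment, 1 for the other.
    """
    if treat not in (0, 1):
        raise ValueError("treat must be 0 or 1")
    return [b + a * treat for b, a in zip(base, add)]


if __name__ == "__main__":
    base = [1.0, 2.0]    # hypothetical baseline values (th1, th2)
    add = [0.5, -0.25]   # hypothetical offsets (th1add, th2add)
    print("treat = 0:", treatment_params(base, add, 0))
    print("treat = 1:", treatment_params(base, add, 1))
```

With this form, fixing an offset at zero forces the two treatments to share that parameter, so a model with some offsets removed is a special case of the model that keeps them all.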
D. A patient swallows a tablet of Zantac, which enters the patient's gut and begins entering the patient's bloodstream at time t = 0. Blood samples are then taken every half-hour until hour 16, and the concentrations of Zantac in the patient's serum are recorded and analyzed in the SAS program and output here.
(1) Describe the model(s) that are being fit in the NLIN and the NLMixed, and the implicit assumptions. Are they fitting the same model? What are the model parameters, and what are their roles in each model?
(2) The residual plot after the NLIN highlights a problem with one of the
implicit assumptions. Which one (assumption) is it, and what is wrong?
What are the usual ramifications of the violation of this (or these)
assumption(s)?
(3) Explain what is being done in the IML procedure, focusing on what is being minimized in the "neg2lla" and "neg2llb" functions. The latter function introduces an additional parameter - which one is it and what is its role? Is it significant? (Listing your null and alternative hypotheses, report the relevant test statistic, distribution, degrees of freedom and p-value.) Give reliable 90%, 95% and 99% CIs for this parameter.
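If "reliable" intervals in part (3) are taken to mean likelihood-based (profile) intervals rather than Wald intervals, which is an assumption about what is intended, then each interval collects the parameter values whose -2 log-likelihood lies within a chi-square (1 df) cutoff of the minimized value. The sketch below computes those cutoffs with the Python standard library. The helper names are hypothetical, and the -2LL values passed to `in_profile_interval` would come from the minimization output.

```python
from statistics import NormalDist


def chi2_1df_cutoff(level: float) -> float:
    """Chi-square (1 df) quantile at the given confidence level.

    A chi-square variable with 1 df is the square of a standard normal,
    so its quantile is the squared two-sided normal quantile.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return z * z


def in_profile_interval(neg2ll_at_value: float, neg2ll_min: float, level: float) -> bool:
    """True if a candidate parameter value lies inside the profile interval."""
    return (neg2ll_at_value - neg2ll_min) <= chi2_1df_cutoff(level)


if __name__ == "__main__":
    for level in (0.90, 0.95, 0.99):
        print(f"{level:.0%} cutoff on the -2LL scale: {chi2_1df_cutoff(level):.4f}")
```

In practice the candidate parameter is fixed at a grid of values, the remaining parameters are re-estimated at each one, and the interval endpoints are where the profiled -2LL crosses the minimum plus the cutoff.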