$ontext Non-Linear Least Squares Regression example Erwin Kalvelagen, may 2008 Reference: http://www.statsci.org/data/general/troutpcb.html PCB Concentrations in Lake Trout -------------------------------------------------------------------------------- Description Data on the concentration of polychlorinated biphenyl (PCB) residues in a series of lake trout from Cayuga Lake, NY, were reported in Bache et al (1972). The ages of the fish were accurately known, because the fish were annually stocked as yearlings and distinctly marked as to year class. Each whole fish was mechanically chopped, ground, and thoroughly mixed, and 5-gram samples taken. The samples were treated and PCB residues in parts per million (ppm) were estimated using column chromatography. Bates and Watts (1988) use a linear model log(PCB) = b1 + b2 Age^1/3 but they remark that the nonlinear model log(PCB) = b1 + b2 Age^q is slightly better. -------------------------------------------------------------------------------- Variable Description -------------------------------------------------------------------------------- Age Age of trout (years) PCB PCB concentration (ppm) -------------------------------------------------------------------------------- Source Bache, C. A., Serum, J. W., Youngs, W. D., and Lisk, D. J. (1972). Polychlorinated biphenyl residues: Accumulation in Cayuga Lake trout with age. Science 117, 1192-1193. Bates, D. M., and Watts, D. G. (1988). Nonlinear Regression Analysis and Its Applications. Wiley, New York. Smyth, G. K. (2002). Nonlinear Regression. In: Encyclopedia of Environmetrics, A. H. El-Shaarawi and W. W. Piegorsch (eds.), Wiley, Chichester, Volume 3, pages 1404-1411. Analysis > troutpcb <- read.table("troutpcb.txt",header=T) > attach(troutpcb) > out <- nls(log(PCB)~cbind(1,Age^theta),start=list(theta=1),algorithm="plinear") > summary(out) Formula: log(PCB) ~ cbind(1, Age^theta) Parameters: Value Std. Error t value theta 0.196867 0.273931 0.718673 -4.864660 8.424260 -0.577458 4.701570 8.272060 0.568367 Residual standard error: 0.503198 on 25 degrees of freedom Correlation of Parameter Estimates: theta 0.997 -0.998 -1.000 > plot(Age,log(PCB)) > lines(Age,fitted(out)) $offtext set i /i1*i28/; table data(i,*) Age PCB i1 1 0.6 i2 1 1.6 i3 1 0.5 i4 1 1.2 i5 2 2 i6 2 1.3 i7 2 2.5 i8 3 2.2 i9 3 2.4 i10 3 1.2 i11 4 3.5 i12 4 4.1 i13 4 5.1 i14 5 5.7 i15 6 3.4 i16 6 9.7 i17 6 8.6 i18 7 4 i19 7 5.5 i20 7 10.5 i21 8 17.5 i22 8 13.4 i23 8 4.5 i24 9 30.4 i25 11 12.4 i26 12 13.4 i27 12 26.2 i28 12 7.4 ; parameters Age(i) 'Age of trout (years)' PCB(i) 'PCB concentration (ppm)' ; Age(i) = data(i,'Age'); PCB(i) = data(i,'PCB'); variables a 'parameter to be estimated' b 'parameter to be estimated' c 'parameter to be estimated' sse 'sum of squared errors' ; equation fit1(i) 'linear fit' fit2(i) 'non-linear fit' sumsq 'dummy objective' ; sumsq.. sse =n= 0; fit1(i).. log(pcb(i)) =e= a + b * Age(i)**1/3; fit2(i).. log(pcb(i)) =e= a + b * Age(i)**c; options nlp=nls, lp=ls; models m1 /sumsq,fit1/ m2 /sumsq,fit2/ ; solve m1 minimizing sse using lp; c.l=1/3; solve m2 minimizing sse using nlp;