
Hypothesis testing

 

 


General description

Hypothesis testing is performed according to Akaike's Information Theory, using the corrected Akaike's Information Criterion (AICc). For more details see: Motulsky, H. and Christopoulos, A., Fitting Models to Biological Data Using Linear and Nonlinear Regression: A Practical Guide to Curve Fitting. 1st ed. 2004: Oxford University Press, USA. 352 p.
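For orientation, the sketch below shows the least-squares form of AICc as it is commonly written in the curve-fitting literature cited above. It is an illustration only and not the IDAP code: the function names `aicc` and `probability_second_model` are hypothetical, and it assumes the fit quality is summarized by the sum of squared residuals.

```python
import math


def aicc(sum_squares, n_points, n_params):
    """Corrected AIC for a least-squares fit (illustrative sketch).

    sum_squares: sum of squared residuals of the fit
    n_points:    number of data points N
    n_params:    number of fitted model parameters
    """
    # K counts the fitted parameters plus one for the estimated variance.
    k = n_params + 1
    if n_points - k - 1 <= 0:
        raise ValueError("AICc requires n_points > n_params + 2")
    aic = n_points * math.log(sum_squares / n_points) + 2 * k
    # Small-sample correction; it vanishes as n_points grows.
    return aic + 2 * k * (k + 1) / (n_points - k - 1)


def probability_second_model(aicc_first, aicc_second):
    """Relative probability that the second model is the better one."""
    half_delta = 0.5 * (aicc_second - aicc_first)
    if half_delta > 700:  # avoid overflow in exp(); probability is ~0
        return 0.0
    return 1.0 / (1.0 + math.exp(half_delta))


# Hypothetical numbers: a 3-parameter and a 5-parameter model on 20 points.
simple = aicc(sum_squares=10.0, n_points=20, n_params=3)
complex_ = aicc(sum_squares=8.0, n_points=20, n_params=5)
print(simple, complex_, probability_second_model(simple, complex_))
```

The model with the lower AICc is preferred. Note that `n_points` enters both the fit term and the correction term, which is the source of the problem described in the note that follows.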

 

Important note: There is an unresolved issue with using Akaike's Information Criterion to evaluate fitting of NMR line shape data. The problem is that AIC requires knowledge of the number of points in the dataset. In classical statistical analysis and in Akaike's Information Theory it is implicitly assumed that all points of the dataset represent meaningful signal. This is not the case for NMR line shape data, where the number of points is an arbitrary setting of the acquisition and processing protocols and only indirectly correlates with the number of points spanning the signal envelope. Most of the points in an NMR spectrum reside in the noise floor. The tails of the peak may be extended all the way to the edges of the spectral window, which brings the number of points in the line shape fit to a very large value (say, 2048), while the actual signal may be covered by only 5-10 points. Including these extra points is clearly meaningless but, at this time, there is no rigorous theoretical basis for excluding them from the test.
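The mismatch between total points and signal-bearing points can be illustrated with a small sketch. Everything in it is an assumption chosen for illustration: a single Lorentzian peak of unit height, a half-width of 1.5 points, and a noise level of 10% of the peak height used as the cutoff. The function names are hypothetical and not part of IDAP.

```python
def lorentzian(x, center, hwhm):
    """Unit-height Lorentzian line; hwhm is the half width at half maximum."""
    return 1.0 / (1.0 + ((x - center) / hwhm) ** 2)


def count_signal_points(n_points, center, hwhm, threshold):
    """Count points whose noise-free intensity exceeds the threshold."""
    return sum(
        1 for i in range(n_points) if lorentzian(i, center, hwhm) > threshold
    )


total_points = 2048  # size of the spectral window, in points
signal_points = count_signal_points(
    total_points, center=1024, hwhm=1.5, threshold=0.1
)
print(signal_points, total_points, signal_points / total_points)
```

With these assumed settings only a handful of the 2048 points rise above the cutoff, yet a criterion that takes the dataset size at face value would treat all 2048 as informative.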

In its default form, AIC is therefore not meaningful for NMR line shape fitting, because the number of points can be arbitrary, depending on how much of the line shape tails is included. At the same time, an excessive number of points in line shape datasets supports the inclusion of a very large number of extra parameters in the model without a significant penalty. This is because what is counted is not the number of fitting parameters but the number of remaining degrees of freedom, which is calculated as [number of points]-[number of fitting parameters]. The fractional decrease in the number of degrees of freedom in line shape datasets becomes very small when long tails are included. For example, from the formal statistical viewpoint, a line shape with just 10 points spanning the peak envelope but 200 total points may easily be fit with a model of 20 parameters.
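The arithmetic behind this argument can be checked with a short sketch. The 200-point dataset and the 20-parameter model come from the example above; the 4-parameter baseline model and the 30-point truncated dataset are hypothetical values added for comparison, and the function name is illustrative.

```python
def fractional_df_loss(n_points, n_params_simple, n_params_complex):
    """Fraction of degrees of freedom lost when moving to the larger model.

    Degrees of freedom are [number of points] - [number of fitting parameters].
    """
    df_simple = n_points - n_params_simple
    df_complex = n_points - n_params_complex
    if df_complex <= 0:
        raise ValueError("the larger model leaves no degrees of freedom")
    return (df_simple - df_complex) / df_simple


# Same pair of models, evaluated on a dataset with long tails (200 points)
# and on a dataset truncated closer to the peak (30 points, hypothetical).
for n_points in (200, 30):
    loss = fractional_df_loss(n_points, n_params_simple=4, n_params_complex=20)
    print(n_points, round(loss, 4))
```

Adding 16 parameters removes about 8% of the degrees of freedom in the 200-point dataset but over 60% in the 30-point one, even though the peak itself is the same in both cases.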

Therefore, the current implementation of Akaike's Information Criterion for hypothesis testing is NOT RECOMMENDED for data analysis with IDAP.

 


 


Specific modules

DatasetAIC - comparison of alternative fitting models for the Dataset object

TotalFitAIC - comparison of alternative fitting models for the TotalFit object