Statistical analysis in LineShapeKin

 


Statistical comparison of individual and global fitting results

    If the changes you observe in the NMR spectra are caused by a single binding event, then all residues should display comparable kinetics. Indeed, if you fit the residues individually you may see a similar koff for a group of residues. You need statistical testing to find out which hypothesis is more likely: a single Kd and koff shared by all residues, or individual values for each residue. From a statistical point of view it is the individual hypothesis that is under examination, because it has more parameters than the global one (single Kd, koff). The global hypothesis is the simplest and therefore serves as the null hypothesis.

    NOTE: Here you can only test models in which all parameters are global. For example, the three-parameter model Kd-Koff-w0B will not work because w0B is individual for each data set. You can work around this by inserting w0B into manual_w0.txt for each residue and switching back to the Kd-Koff model. If you need to decide which model (that is, which set of parameters held constant or varied) is more appropriate for the same dataset, see "Determining the likelihood of different models" below.

    For statistical testing LineShapeKin uses Akaike's information criterion (AIC). To test global versus individual fitting modes:

    1. Select the list of residues you want to test and put it into setup.m.
    2. Run both the individual and the global fittings.
      (Each fit generates resname_x_w?_SS.txt files containing the number of titration points for each residue and the sum of squares.)
      IMPORTANT: Before running the global fitting and the hypothesis testing, you must insert the Scale Factor values determined in the individual run and RE-RUN the individual fitting so that the sums of squares are computed with the new scaling factors.


    3. Edit AIC_setup.m to insert the same residue list into its resnames variable.
    4. Look up the composite residue name corresponding to the global fitting results in grid_results_...long-composite-name...dat and insert it into the global_resname variable of compare.m.
    5. Issue compare on the Matlab command line.
    6. The results are displayed on screen and saved in a text file. For the interpretation of the Akaike test results, see Examples.
    7. If you would like to use the Fisher test instead, use the sums of squares (SS), N (number of data points) and K (number of parameters + 1) to calculate the Fisher statistic yourself.
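
    The comparison in the steps above can be illustrated with a small sketch. It is not the code of compare.m. It assumes the widely used least-squares form of the corrected Akaike criterion, AICc = N ln(SS/N) + 2K + 2K(K+1)/(N-K-1), with K defined as the number of fitted parameters + 1, as in step 7. The function names and all numbers are hypothetical. For the individual hypothesis the sketch assumes that the per-residue sums of squares are added together and the per-residue parameters are counted together.

```python
import math

def aicc(ss, n, n_params):
    """Corrected Akaike criterion for a least-squares fit (assumed form).

    ss       -- sum of squared residuals
    n        -- number of data points
    n_params -- number of fitted parameters (K = n_params + 1)
    """
    k = n_params + 1
    if n - k - 1 <= 0:
        raise ValueError("too few data points for the AICc correction")
    return n * math.log(ss / n) + 2 * k + 2 * k * (k + 1) / (n - k - 1)

def compare_models(ss_simple, p_simple, ss_complex, p_complex, n):
    """Return (delta, probability that the more complex model is correct)."""
    delta = aicc(ss_complex, n, p_complex) - aicc(ss_simple, n, p_simple)
    # Akaike weight of the complex model, written to avoid overflow
    if delta > 0:
        e = math.exp(-0.5 * delta)
        prob_complex = e / (1.0 + e)
    else:
        prob_complex = 1.0 / (1.0 + math.exp(0.5 * delta))
    return delta, prob_complex

# Hypothetical numbers: 5 residues, 10 points each (N = 50).
# Global fit: Kd and koff shared (2 parameters), SS = 12.0.
# Individual fits: Kd and koff per residue (10 parameters), summed SS = 9.0.
delta, prob_individual = compare_models(12.0, 2, 9.0, 10, 50)
print("delta AICc (individual - global):", delta)
print("probability that the individual hypothesis is correct:", prob_individual)
```

    A positive delta favours the global (simpler) hypothesis; a negative delta favours the individual one.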

     


Determining the likelihood of different models


The previous section discussed applying the same mathematical model either to each residue-specific dataset individually or to a combined dataset of spectral data from all residues. A different task is deciding whether to use a 1-, 2-, 3-parameter (and so on) model for the experimental data sets.

When you fit the same data set with different models, you need to determine which model is more likely to be correct. The model with the larger number of fitting parameters will almost always produce a lower norm. However, you need to determine whether this reduction in the sum of squared residuals is larger than statistically expected from the additional parameters alone. If it is larger than expected, the more complex model describes the data better because the extra parameter accounts for some specific feature of the data. Otherwise you should stay with the simpler model.
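
As an illustration of "more than statistically expected", the classical extra-sum-of-squares F ratio for two nested models can be computed as sketched below. This is a generic textbook formula, not LineShapeKin code. The function name and the numbers are hypothetical, and the sketch assumes that the degrees of freedom are N minus the number of fitted parameters.

```python
def f_statistic(ss_simple, p_simple, ss_complex, p_complex, n):
    """Extra-sum-of-squares F ratio for two nested least-squares models.

    Returns (F, numerator degrees of freedom, denominator degrees of freedom).
    """
    df_simple = n - p_simple
    df_complex = n - p_complex
    if df_complex <= 0 or df_simple <= df_complex:
        raise ValueError("the complex model must have more parameters "
                         "and fewer than n of them")
    # Relative drop in SS per extra parameter ...
    numerator = (ss_simple - ss_complex) / (df_simple - df_complex)
    # ... compared with the residual variance of the complex model
    denominator = ss_complex / df_complex
    return numerator / denominator, df_simple - df_complex, df_complex

# Hypothetical numbers: 23 data points, 2-parameter vs 3-parameter model
f_value, df_num, df_den = f_statistic(12.0, 2, 10.0, 3, 23)
print(f_value, df_num, df_den)
```

An F ratio near 1 means the extra parameter reduced the sum of squares by about as much as chance alone would. A much larger ratio, judged against the F distribution with the returned degrees of freedom, supports the more complex model.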

 

Example 1: Adding variable frequency of the bound state

In some cases we may suspect that the titration was not completed and that the final spectrum does not represent the chemical shifts of the pure state B. In this case we can use a model with a variable w0(B) to compensate for the incomplete titration.

We fit the data in 'single' mode using the 2-parameter and 3-parameter models:

Model 1: Kd and Koff;  Model 2: Kd, Koff and w0(B)

 

 

Example 2: Adding correction for concentrations

LPcorrection/

 
