Copyright 2014-2016 by Evgueni Kovriguine

Back to main topic

Processing of NMR 2D data for NMR line shape analysis

Contents

   Introduction
   Organizing 2D data
   Conversion to NMRPipe format
   Automated processing
   Sparky project
   Inside of the hypercomplex processing
   References
   Introduction

An NMR titration produces a series of 2D datasets that must be correctly processed to allow for line shape analysis. It is important to note that most "standard" 2D processing protocols used in routine NMR titration experiments are designed to increase resolution in the spectra by applying resolution-enhancement window functions such as the shifted sine-bell, Gaussian, Kaiser, etc. To make NMR data suitable for fitting with the Bloch-McConnell equations, the raw NMR datasets must be processed using exponential window functions to obtain Lorentzian line shapes. This leads to a visible loss of resolution, but the Lorentzian line shape is a requirement in the current version of IDAP. For a thorough description of window functions as well as the basics of 2D data processing, see standard NMR texts such as Cavanagh et al. [1].
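For illustration only (a minimal sketch with arbitrary parameter values, not the actual contents of the ProcessAll scripts), the apodization step of an NMRPipe script would use the exponential EM function instead of a resolution-enhancing function such as the shifted sine-bell SP:

  # resolution-enhancement apodization (not suitable for line shape fitting):
  # nmrPipe -fn SP -off 0.5 -end 0.98 -pow 2 -c 0.5 \
  # exponential apodization giving Lorentzian line shapes (-lb is line broadening in Hz):
  nmrPipe -fn EM -lb 3.0 -c 0.5 \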

One difficulty in line shape analysis of titration data is the sheer number of datasets that need to be processed. A large number of similar datasets leads to labeling errors in the results, which are difficult to detect and prevent if processing is done one dataset at a time. To make analysis of multiple 2D datasets error-proof, I am including the Python program ProcessAll.py, whose purpose is to direct consistent processing of the entire titration series. The program code and usage are described in its header and highlighted below.

System requirements: Unix, Linux, or OS X. The workflow may be adaptable to Windows, but this has never been tried.

 

Back to Contents


 


 Organizing 2D data

A typical series of 2D NMR datasets contains 5-16 folders originating from the spectrometer. The ProcessAll.v8 folder with its contents should be copied to the same location.

The NMR data folders may have arbitrary names. The best practice is to add a numerical prefix to the folder names to indicate the position of each dataset in the titration series and to enter the dataset names in data_name_list.txt to direct the ProcessAll program to the proper folders. In the next step, the IDAP 1D NMR extension will use this prefix to automatically assign the correct index to each spectrum in the Sparky project. Example:
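A hypothetical layout might look like the one below; the folder names are made up, and the exact format of data_name_list.txt (assumed here to be one dataset name per line) is documented in the ProcessAll.py header. The renamed folders double as the entries of data_name_list.txt:

  1.protein-ligand-eq0.fid
  2.protein-ligand-eq0.5.fid
  3.protein-ligand-eq1.0.fid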

However, if your experiments already have numerical names and you do not want to change them, just enter the dataset names in data_name_list.txt the way they are. For instance, experiments #1-3 might be 1D datasets, with the 2D series beginning at #4. In that case you will have to enter the indices manually in Sparky later to indicate the sequence of the 2D spectra in the titration.
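With such numbering (again hypothetical), data_name_list.txt would simply list the 2D dataset folders under their original spectrometer names:

  4
  5
  6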


 

 

Back to Contents



  Conversion to NMRPipe format

The ProcessAll program directs data processing using NMRPipe, which must be installed on your system. Prior to running ProcessAll, you have to convert the raw spectrometer data into the NMRPipe format. Typically, all of your experiments were acquired with the same spectrometer parameters (which is the recommended mode, except for the number of transients for signal averaging, which may be increased for the low-sensitivity spectra at intermediate titration points). If all datasets were acquired with the same parameters, you can use the simple batch-processing workflow described below (the same steps are also rendered as Terminal commands in the sketch after the list):

  1. Go to the first folder with 2D data (the first titration dataset).
  2. Issue varian, bruker, or the appropriate conversion utility to create a fid.com file that will correctly convert the raw 2D data to an NMRPipe-format matrix.
  3. Copy the two processing scripts process_batch.com and process.com from ProcessAll.v8/batch_fid_processing into this first dataset folder.
  4. Edit process.com to include all of the folder names with 2D data to be converted (including the name of the current folder).


  5. Issue process.com on the Unix command line.
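Rendered as Terminal commands, the same sequence might look like this (hypothetical dataset folder name; adjust the paths to your own layout):

  cd 1.protein-ligand-eq0.fid
  varian
  cp ../ProcessAll.v8/batch_fid_processing/process_batch.com .
  cp ../ProcessAll.v8/batch_fid_processing/process.com .
  # edit process.com to list all 2D data folders to be converted
  ./process.com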

The scripts will run the fid.com that you created in each of the data folders listed in process.com. After that, the NMR data are ready for processing with ProcessAll.

 

 

 

Back to Contents



  Automated processing

  1. In the Terminal, change directory to ProcessAll.v8.
  2. Read the usage description of ProcessAll.py (open its text in an editor or issue ProcessAll.py on the command line). What I give below are only my remarks, not a description of the program!
  3. Edit ProcessAll.ini to enter all of the information relevant for the titration series. For example, the name of the protein goes into the field <Experiment_type> and the name of the ligand into <name_suffix>; they are appended together when creating new names for the datasets. A hypothetical sketch of this file is shown after this list.



    NOTE: The phases for the first and second dimensions will be applied to all datasets unless a plain-text file 'dataset_name.phases' is created in the same folder as the 2D data folders (see ProcessAll.py for details).
  4. I usually run processing of one dataset first to see whether it results in a satisfactory spectrum.

    1. Example:
      ProcessAll.py     -1    1.XlnB2-E87A-xylobiose-eq0.fid
      ProcessAll creates a folder Protocol_1 and places three datasets there with .ft2, .sp, and .for_nmrproc2 extensions.
    2. Run NMRDraw to inspect the file with the .ft2 extension.
  5. Optimize the basic processing parameters:

  6. Adjust the phases in ProcessAll.ini as needed. Rerun the processing of one dataset.
  7. Once satisfied with the phases, check in NMRDraw whether the peak envelopes have an optimal number of points across each peak (5-10). Change the multiplier in the ZF function in nmrproc1.P1.com if needed (see the sketch after this list). Rerun and check again.
  8. Run processing of all datasets with this protocol:
    ProcessAll.py -all
  9. Check the phases in all ft2 files created by the processing. It is not uncommon for the spectrometer phase to drift over time; in that case you will have to create local-phase files for each dataset in question. ProcessAll.py is designed to shift phases in the datasets very quickly without rerunning the lengthy processing (add the -ps switch):
    ProcessAll.py -all -ps
    As it runs, the program reports whether it used global or local phases for each dataset.
  10. After you are happy with the results of processing all files, issue clean.com to remove the (very large) intermediate files. If you want to know more about their origin, see Inside of the hypercomplex processing.
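Regarding step 3 above, the relevant fields of ProcessAll.ini might look like the following hypothetical fragment (using the protein and ligand names from the example dataset in step 4; the authoritative list of fields and their exact syntax is given in ProcessAll.py and in the ProcessAll.ini shipped with ProcessAll.v8):

  <Experiment_type> XlnB2-E87A
  <name_suffix> xylobiose
  <protocols_num> 1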
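Regarding step 7, the zero-filling step inside nmrproc1.P1.com is an NMRPipe ZF call; adjusting its multiplier might look like this minimal sketch (arbitrary value; the actual script may use different ZF options):

  # -zf 2 doubles the data size twice before Fourier transformation
  nmrPipe -fn ZF -zf 2 \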

 

 

Back to Contents



  Sparky project

Now you can create your Sparky project. I recommend the automated workflow outlined below. For it to work, you need to copy two helper Python programs, make_save_files.v2.py and unscrew_savefiles.py (in NMRLineShapes1D/NMR_series_processing/Python/), onto your system Python path. The Kovrigin submenu in Sparky must also be installed.

  1. Create a save-file template, template.save, that will be used to make identical views with identical settings for all of the titration points.
  2. In the Terminal, issue make_save_files.v2.py template.save to create the project with all windows synchronized.
  3. Open the project.

You are ready to begin line shape data extraction with the IDAP 1D NMR extension.

 

NOTES:

  1. If you find that you need to reprocess the raw data, quit Sparky, reprocess the data, and then reopen Project.proj. Sparky will display the updated spectral data.
  2. Sparky has a glitch: sometimes it distorts the paths stored in the .save and .proj files. In such a case, Sparky will complain that it cannot open the project. To fix this:
    1. Quit Sparky.
    2. In the Terminal, in the folder with the .save files, issue
      unscrew_savefiles.py ../
      It should report that the files were fixed.
      Important: this fix requires that all .sp files be in the same location! (../ or another)
  3. Because of this glitch, I do not recommend adding views to this project from save files located in other folders. If you need a spectrum for comparison, copy its .sp file into the Protocol_1 folder and its .save file into the Sparky folder, then issue unscrew_savefiles.py ../

 

Back to Contents



 Inside of the hypercomplex processing

This section is not essential for the line shape analysis; it details the principles underlying the operation of ProcessAll.

Processing of NMR datasets by NMRPipe is sequential (vector after vector) and, in most cases, linear (the order of operations does not matter). Some of the processing operations may take significant time when applied to a large titration series. It therefore makes sense to perform the time-consuming operations first, produce an intermediate file, and then apply "polishing" functions such as phase shifting, extraction, and baseline correction (which normally need their parameters adjusted multiple times before the result is deemed satisfactory). To achieve this, two NMRPipe scripts are executed in sequence: nmrproc1.P1.com (and nmrproc1.P2.com), followed by nmrproc2.com. The nmrproc1.PX.com scripts include the "less variable" processing steps (apodization, zero-filling, Fourier transformation), while nmrproc2.com does the phase shifting, baseline correction, and spectral-range extraction.

In the first stage, ProcessAll.py runs the nmrproc1 script(s), producing hypercomplex datasets (hypercomplex because the imaginaries are not discarded, as is usually done in standard NMRPipe scripts). These datasets (.for_nmrproc2 extension) are very large (about 4x the size of the final files). In the second stage, the nmrproc2 script starts from the hypercomplex data, performs all of the "adjustment" operations, and creates the final ft2 and Sparky-format sp files.
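As a minimal illustration of the difference (arbitrary phase values, not the actual contents of the ProcessAll scripts): a standard NMRPipe phase-correction step usually deletes the imaginary part with the -di flag, whereas keeping the hypercomplex intermediate amounts to omitting -di so that the phases can still be adjusted in the second stage:

  # typical "standard" step: phase and delete imaginaries
  nmrPipe -fn PS -p0 45.0 -p1 0.0 -di \
  # hypercomplex variant: keep the imaginaries for later phase adjustment
  nmrPipe -fn PS -p0 0.0 -p1 0.0 \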

Because the processing is split into two stages, iterative adjustment of the phases, spectral range, or baseline does not require rerunning the most time-consuming operations. Instead, ProcessAll.py (with the -ps switch) only runs nmrproc2.com; therefore, adjusting 10-15 datasets takes very little time.

To make sure disk space is not consumed by leftover hypercomplex files, run the clean.com script to clean up all of the hypercomplex and ft2 data.

ProcessAll.py is capable of running two parallel processing protocols encoded in nmrproc1.P1.com and nmrproc1.P2.com. This way one may examine the effect of different data-processing routines on the results of the analysis. By default, ProcessAll.py uses just one protocol (nmrproc1.P1.com). To enable two parallel protocols, set <protocols_num> 2 in ProcessAll.ini and create nmrproc1.P2.com (duplicate the P1 script and adjust its processing routine). The results of the second protocol will be deposited in the Protocol_2 folder.
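For example, the duplicate can be made with a plain copy before editing the processing steps inside it:

  cp nmrproc1.P1.com nmrproc1.P2.com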

 

Back to Contents



   References

[1] Cavanagh, J., Fairbrother, W. J., Palmer III, A. G., Rance, M., and Skelton, N. J. (2006) Protein NMR Spectroscopy: Principles and Practice, 2nd ed., Academic Press.