Open Access

PREdator: a python based GUI for data analysis, evaluation and fitting

  • Christoph Wiedemann (1),
  • Peter Bellstedt (1) and
  • Matthias Görlach (1), corresponding author
Source Code for Biology and Medicine 2014, 9:21

https://doi.org/10.1186/1751-0473-9-21

Received: 27 June 2014

Accepted: 17 September 2014

Published: 24 September 2014

Abstract

The analysis of series of experimental data is an essential procedure in virtually every field of research. The information contained in the data is extracted by fitting the experimental data to a mathematical model. The type of mathematical model (linear, exponential, logarithmic, etc.) reflects the physical laws that underlie the experimental data. Here, we aim to provide a readily accessible, user-friendly Python script for data analysis, evaluation and fitting. PREdator is demonstrated using the example of NMR paramagnetic relaxation enhancement analysis.

Keywords

Python3, Matplotlib, Paramagnetic relaxation enhancement, Data fitting

Introduction

In nearly all fields of physical, chemical or biological research it is required to convert experimental data into mathematical expressions. In particular, determining a "best fit" of a series of data points to a mathematical model is a pivotal and potentially time-consuming step in the extraction of results and in data evaluation.

Nuclear magnetic resonance (NMR) spectroscopy provides not only structural information on biological macromolecules at the atomic scale but also information on their dynamics, and hence a more complete description of the system under investigation. Furthermore, dynamics parameters may also contribute to the understanding of proteins and their interactions with other proteins, nucleic acids or small ligands. The determination of longitudinal ($R_1$) or transverse ($R_2$) relaxation rates of protons in biological macromolecules delivers valuable information on the molecular dynamics of the system. For example, this information can be used to determine the interaction interface between individual domains or subunits on the basis of surface accessibility studies in situations where no other NMR parameters, e.g. nuclear Overhauser enhancement (NOE) or chemical shift perturbation data, are observable. In such cases, surface accessibility studies can be performed using chemically inert paramagnetic probes, e.g. paramagnetic metals, oxygen or nitroxides, as cosolvents [1]. Residues located in the interior of a protein or at the interaction interface are shielded from the paramagnetic agent and experience a weak paramagnetic relaxation enhancement (PRE). In contrast, residues located at the solvent-accessible surface experience a strong PRE.

PREs can be derived experimentally from longitudinal ($R_1$) or transverse ($R_2$) relaxation rate measurements. A sensitive and reliable measure of transverse PREs can be obtained from cross-peak intensities for the state with and without the paramagnetic cosolvent. Relaxation rates are measured by a series of 2D saturation-recovery spectra (1H, 13C-HMBC or 1H, 15N-CRINEPT [2]), in which the time delay during which relaxation takes place is gradually increased. The experiments are repeated with different concentrations of the paramagnetic agent. To extract the relaxation rates, the signal intensities are fitted to $I = I_0 (1 - e^{-R_i t})$, where $I_0$ is the intensity after infinite recovery delay, $R_i$ is the longitudinal or transverse relaxation rate and $t$ is the time. The PRE is then given by the slope of the relaxation rate as a function of the concentration of the paramagnetic agent [3–5].
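The saturation-recovery fit described above can be sketched with SciPy as follows. This is a minimal illustration, not the PREdator source: the function name `saturation_recovery`, the delay values and the synthetic, noise-free intensities are assumptions made for the example.

```python
import numpy as np
from scipy.optimize import curve_fit


def saturation_recovery(t, i0, rate):
    """Model I(t) = I0 * (1 - exp(-R * t))."""
    return i0 * (1.0 - np.exp(-rate * t))


# Hypothetical recovery delays (s) and noise-free intensities generated
# from the model itself with I0 = 1000 and R = 1.5 s^-1.
delays = np.array([0.05, 0.1, 0.2, 0.4, 0.8, 1.6, 3.2])
intensities = saturation_recovery(delays, 1000.0, 1.5)

# Start from the largest observed intensity and a rate of 1 s^-1.
popt, pcov = curve_fit(
    saturation_recovery, delays, intensities, p0=[intensities.max(), 1.0]
)
i0_fit, rate_fit = popt
print(f"I0 = {i0_fit:.1f}, R = {rate_fit:.3f} s^-1")
```

With real spectra, `intensities` would hold the cross-peak intensities of one residue at each recovery delay, and the fit would be repeated per residue and per cosolvent concentration.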

A variety of tools (e.g. MATLAB 8.0 and Statistics Toolbox 8.1 (The MathWorks, Inc., Natick, MA, US), GNU Octave [6] or R [7]) and NMR software suites (NMRView [8], CCPN [9], ROTDIF [10]) are available for the extraction and fitting of relaxation data. Nevertheless, we provide here a straightforward Python3-based application with a graphical user interface, not only for the extraction of relaxation data but also for the calculation of PREs. The script should also be useful for fitting and evaluating virtually any set of data series.

Implementations and results

PREdator was initially conceived for the analysis of PREs. The application was written in a Mac OS X environment, but it can be run under any operating system for which a Python3 interpreter is available. Python3 and the packages Matplotlib [11], SciPy/NumPy [12] and dill [13] are required to run PREdator.py. Matplotlib [11] is used for data visualisation. All generated plots can be saved as either raster (PNG) or vector (PDF or EPS) files. PREdator also provides the option to save the current session and to restore it later. The dill package [13] is used for data serialization.

The input file has to contain comma separated data (see example files provided with the download package).

In an initial dialogue the user can choose a predefined fitting function from a list or enter a self-defined fitting function with up to three fitting parameters. The use of NumPy makes it possible to build self-defined fitting functions from predefined mathematical expressions (e.g. sin, cos or tan).
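One way to turn a typed expression into a callable fitting function is sketched below. How PREdator does this internally is not described here, so the helper `make_fit_function`, the whitelist `ALLOWED` and the use of `eval` are assumptions of this sketch; only the variable `x` and the parameter names a, b and c follow the interface description.

```python
import numpy as np

# Names an expression may use: a few NumPy functions plus the constant pi.
# (Hypothetical whitelist, chosen for this example.)
ALLOWED = {name: getattr(np, name)
           for name in ("sin", "cos", "tan", "exp", "log", "sqrt")}
ALLOWED["pi"] = np.pi


def make_fit_function(expression):
    """Turn a string such as 'a*sin(b*x)+c' into f(x, a, b, c)."""
    code = compile(expression, "<fit function>", "eval")

    def fit_function(x, a=0.0, b=0.0, c=0.0):
        # Expose only x, the parameters and the whitelisted names.
        scope = dict(ALLOWED, x=np.asarray(x, dtype=float), a=a, b=b, c=c)
        return eval(code, {"__builtins__": {}}, scope)

    return fit_function


model = make_fit_function("a*sin(b*x)+c")
values = model(np.array([0.0, np.pi / 2]), 2.0, 1.0, 0.5)
print(values)
```

Emptying `__builtins__` limits what an expression can reach but does not make `eval` safe for untrusted input. When such a function is passed to `scipy.optimize.curve_fit`, supplying `p0` with as many entries as the expression has parameters keeps unused parameters out of the fit.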

PREdator provides an initial estimate of the parameters to be fitted. If the user knows the order of magnitude of the fitting parameters and the experimental error, these values can be entered as initial fitting and error parameters. Data fitting is performed with the curve_fit function implemented in the SciPy package (module: scipy.optimize) [12]. For visual inspection the fitted curve and the original data points are shown as a graph (Figure 1).
Figure 1

PREdator interface elements. The four interactive analysis windows in PREdator. A) Graphical representation of the original data and the fitted curve. B) Window for selection or deselection of data points that are considered for fitting, drop-down menu for fitting function selection, fields for entering initial fitting parameters and experimental error, and entry fields for the axis labels of A). C) Graphical summary of one of the selectable fitting parameters over the range of submitted data (e.g. amino acid residues). D) Summary of the resulting fitting parameters, which can be saved as a text file, is shown in the upper text field. The fitting parameter (a, b or c) selected here is displayed over the data range in C). Entry fields to adjust the title and the axis labels for C) are also provided.

An operating window is provided to re-adjust the fitting function and/or the fitting parameters. Obvious data outliers can be deselected so that they are not considered for fitting. The change in the fitting outcome upon selection and deselection of data points gives a qualitative estimate of the robustness of the fit. A summary of the fitting results and errors is given in a second window. Fitting errors are reported as one standard deviation. The user has the option to save the results to a text file.
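The effect of deselecting an outlier, and the way one-standard-deviation errors are commonly derived from `curve_fit`, can be sketched as follows. The data, the injected outlier, the assumed experimental error of 2.0 per point and the choice of `absolute_sigma=True` are all assumptions of this example, not settings taken from PREdator.

```python
import numpy as np
from scipy.optimize import curve_fit


def recovery(t, a, b):
    """Two-parameter saturation recovery, a * (1 - exp(-b * t))."""
    return a * (1.0 - np.exp(-b * t))


t = np.array([0.1, 0.2, 0.4, 0.8, 1.6, 3.2])
y = recovery(t, 100.0, 2.0)
y[3] += 40.0                   # inject one obvious outlier (hypothetical)
sigma = np.full_like(y, 2.0)   # assumed experimental error per point


def fit(mask):
    """Fit only the selected points; return parameters and 1-sigma errors."""
    popt, pcov = curve_fit(recovery, t[mask], y[mask], p0=[90.0, 1.0],
                           sigma=sigma[mask], absolute_sigma=True)
    return popt, np.sqrt(np.diag(pcov))


everything = np.ones(t.size, dtype=bool)
selected = everything.copy()
selected[3] = False            # deselect the outlier

popt_all, perr_all = fit(everything)
popt_sel, perr_sel = fit(selected)
print("all points:       ", popt_all, "+/-", perr_all)
print("outlier deselected:", popt_sel, "+/-", perr_sel)
```

Comparing the two parameter sets mirrors the qualitative robustness check described above: a large shift after deselecting a single point indicates that the point dominates the fit.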

For the calculation of the residue-specific PRE, the relaxation rate ($R_1$ or $R_2$) is first obtained for each residue at each concentration of the paramagnetic cosolvent. The concentration-dependent relaxation rates of each residue are then correlated in a second fit. The slope of the function fitted in this second step yields the PRE of the respective residue.
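The second fitting step can be sketched as a straight-line fit of rate against cosolvent concentration, one fit per residue. The linear model, the residue labels, the concentrations and the rates below are hypothetical values chosen for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit


def line(conc, slope, intercept):
    """Relaxation rate as a linear function of cosolvent concentration."""
    return slope * conc + intercept


# Hypothetical cosolvent concentrations (mM) and per-residue rates (s^-1)
# as they would come out of the first fitting step.
concentrations = np.array([0.0, 1.0, 2.0, 4.0])
rates = {
    "residue 10": np.array([1.0, 1.5, 2.0, 3.0]),    # exposed: steep slope
    "residue 42": np.array([1.2, 1.25, 1.3, 1.4]),   # buried: shallow slope
}

pre = {}
for residue, r in rates.items():
    popt, _ = curve_fit(line, concentrations, r, p0=[0.0, r[0]])
    pre[residue] = popt[0]     # the slope is taken as the PRE

for residue, value in pre.items():
    print(f"{residue}: PRE = {value:.3f} s^-1 mM^-1")
```

In this toy data set the larger slope of "residue 10" corresponds to the strong PRE expected for a solvent-accessible residue.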

PREdator delivers fitting parameters in a first step (e.g. $R_1$ of an individual residue of a protein at different concentrations of the paramagnetic cosolvent). In a second step it allows the user to correlate the fitting parameters obtained under different conditions. The principle of such an analysis is not restricted to the evaluation of PREs; it is applicable to all kinds of experimental data sets in which one type of measurement is repeated under different conditions. Examples include the analysis of fluorescence recovery after photobleaching (FRAP) in a living cell as a function of temperature, or the assessment of a DNA-protein interaction under different salt, pH or temperature conditions in order to compute properly fitted binding curves. The binding curves can in turn be used to derive the condition-dependent affinity parameter $K_d$ (equilibrium dissociation constant).

Conclusions

In summary, PREdator is a time-saving tool for visual inspection, fitting and analysis of series of data points. The application is freely accessible at http://nmr.fli-leibniz.de/nmrsoftware.shtml and can be adapted to user requirements.

Availability and requirements

Project name: PREdator
Project homepage: http://nmr.fli-leibniz.de/nmrsoftware.shtml
Direct download link: http://nmr.fli-leibniz.de/PREdator/PREdator.zip
Operating systems: Linux, Mac OS X and Windows
Programming language: Python3
Other requirements: Matplotlib, SciPy/NumPy, dill
License: GNU GPL v3
Any restrictions to use by non-academic users: no licenses are required

Declarations

Acknowledgements

We thank Georg Peiter for technical support and Dr. Peter Hemmerich (FLI Jena) for providing experimental data. CW was supported by the Leibniz Graduated School on Ageing and Age-Related Diseases (LGSA). The FLI is a member of the Science Association ’Gottfried Wilhelm Leibniz’ (WGL) and is financially supported by the Federal Government of Germany and the State of Thuringia.

Authors’ Affiliations

(1)
RG Biomolecular NMR Spectroscopy at the Leibniz Institute for Age Research - Fritz Lipmann Institute

References

  1. Bertini I, McGreevy KS, Parigi G: NMR of Biomolecules: Towards Mechanistic Systems Biology. 2012, Weinheim, Germany: John Wiley & Sons.
  2. Riek R, Wider G, Pervushin K, Wüthrich K: Polarization transfer by cross-correlated relaxation in solution NMR with very large molecules. Proc Natl Acad Sci. 1999, 96 (9): 4918-4923. 10.1073/pnas.96.9.4918.
  3. Respondek M, Madl T, Göbl C, Golser R, Zangger K: Mapping the orientation of Helices in Micelle-Bound peptides by paramagnetic relaxation waves. J Am Chem Soc. 2007, 129 (16): 5228-5234. 10.1021/ja069004f.
  4. Madl T, Bermel W, Zangger K: Use of relaxation enhancements in a paramagnetic environment for the structure determination of proteins using NMR spectroscopy. Angewandte Chemie Int Edition. 2009, 48 (44): 8259-8262. 10.1002/anie.200902561.
  5. Madl T, Güttler T, Görlich D, Sattler M: Structural analysis of large protein complexes using solvent paramagnetic relaxation enhancements. Angewandte Chemie Int Edition. 2011, 50 (17): 3993-3997. 10.1002/anie.201007168.
  6. Eaton JW, Bateman D, Hauberg S: GNU Octave Version 3.0.1 Manual: a High-level Interactive Language for Numerical Computations. 2009, CreateSpace Independent Publishing Platform, http://www.gnu.org/software/octave/doc/interpreter.
  7. R Core Team: R: A Language and Environment for Statistical Computing. 2014, Vienna, Austria: R Foundation for Statistical Computing, http://www.R-project.org.
  8. Johnson BA, Blevins RA: NMR View: a computer program for the visualization and analysis of NMR data. J Biomolecular NMR. 1994, 4 (5): 603-614. 10.1007/BF00404272.
  9. Vranken WF, Boucher W, Stevens TJ, Fogh RH, Pajon A, Llinas M, Ulrich EL, Markley JL, Ionides J, Laue ED: The CCPN data model for NMR spectroscopy: development of a software pipeline. Proteins: Struct Function Bioinformatics. 2005, 59 (4): 687-696. 10.1002/prot.20449.
  10. Berlin K, Longhini A, Dayie TK, Fushman D: Deriving quantitative dynamics information for proteins and RNAs using ROTDIF with a graphical user interface. J Biomolecular NMR. 2013, 57 (4): 333-352. 10.1007/s10858-013-9791-1.
  11. Hunter JD: Matplotlib: A 2D graphics environment. Comput Sci Eng. 2007, 9 (3): 90-95.
  12. Oliphant TE: Python for scientific computing. Comput Sci Eng. 2007, 9 (3): 10-20.
  13. McKerns MM, Strand L, Sullivan T, Fang A, Aivazis MAG: Building a framework for predictive science. Proceedings of the 10th Python in Science Conference. Edited by: Millman J, van der Walt Se. 2011, 67-78. http://arxiv.org/pdf/1202.1056.

Copyright

© Wiedemann et al.; licensee BioMed Central Ltd. 2014

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
