Determines the alignment tensor of a molecule from user-supplied residual dipolar couplings (RDCs) and the atom coordinates of one or more conformers. The RDCs can be for multiple types of bond vectors and nuclear spin pairs.
Accessibility
The ALTENS module is accessible from the Beta section of the main menu.
Basic Usage
The purpose of the module is to determine the molecular alignment tensor using experimental RDC data measured for one or more types of bond vectors (for example, N-H, Cα-Hα, N-C) and nuclear spins (e.g., 1H, 15N, 13C)[1,2]. This analysis can also be used to assess how well the molecular structure agrees with the experimental RDC data.
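As an illustration of how an alignment tensor can be obtained from RDCs and bond-vector orientations, the following is a minimal sketch of a linear least-squares fit. It is not the ALTENS source code. The function name `fit_alignment_tensor` is hypothetical, and the sketch assumes a single bond-vector type, unit bond vectors already computed from the coordinates, and the dipolar coupling constant absorbed into the tensor so that its elements are in Hz.

```python
import numpy as np

def fit_alignment_tensor(unit_vectors, rdcs):
    """Least-squares fit of a traceless, symmetric 3x3 tensor to RDCs.

    Assumption (illustrative only): rdc_k = v_k^T A v_k for unit vector v_k,
    with the coupling constant folded into A (Hz).
    """
    v = np.asarray(unit_vectors, dtype=float)
    d = np.asarray(rdcs, dtype=float)
    x, y, z = v[:, 0], v[:, 1], v[:, 2]
    # Five independent elements: Axx, Ayy, Axy, Axz, Ayz (Azz = -Axx - Ayy).
    design = np.column_stack(
        [x * x - z * z, y * y - z * z, 2 * x * y, 2 * x * z, 2 * y * z]
    )
    params, *_ = np.linalg.lstsq(design, d, rcond=None)
    axx, ayy, axy, axz, ayz = params
    tensor = np.array(
        [[axx, axy, axz],
         [axy, ayy, ayz],
         [axz, ayz, -axx - ayy]]
    )
    # Principal values and axes, ordered by increasing absolute value.
    eigvals, eigvecs = np.linalg.eigh(tensor)
    order = np.argsort(np.abs(eigvals))
    back_calculated = design @ params
    return tensor, eigvals[order], eigvecs[:, order], back_calculated
```

At least five RDCs with sufficiently different bond-vector orientations are needed for the five tensor elements to be determined; the returned back-calculated values can be compared directly with the experimental input.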
Notes
Typical usage is to input RDC data and atom coordinates (PDB) for a protein or nucleic acid of interest.
The RDC data can correspond to various pairs of atoms connected by a bond (e.g., N-H, Cα-Hα, N-C) and spin-1/2 nuclei (1H, 13C, 15N). The module can analyze RDC data for multiple types of bonds/atoms and multiple chain IDs simultaneously.
The input atom coordinates are supplied in PDB format and can correspond to a single structure/conformer or to multiple conformers of the same molecule (for example, an MD trajectory in DCD format). The molecular structure/coordinates can include more than one chain.
There is no limit to the number of proteins or nucleic acids in the structure/complex or the number of atom-pair (bond vector) types.
If the atom pair includes a hydrogen, the input PDB file must contain the coordinates of that hydrogen atom.
The RDC data file must contain header line(s) starting with the "#" character followed by the chain ID and the names of atom 1 and atom 2, which together define the type of bond vector and atom pair. If the input file contains more than one type of vector/atom pair, the RDC data for each type must be preceded by the corresponding header line. Note that the chain ID and the atom names should be consistent with the input PDB file.
The input RDC data file must contain at least two columns: the residue number (first column) and the experimental RDC value (second column). An optional third column gives the experimental error of the RDC value. If the error column is not provided, the module assumes an error of 1 Hz for all RDCs.
An optional exclusion-list file can be used to iteratively refine the analysis by excluding specific residues/atom pairs from the input RDC data set.
The output is in terms of the alignment tensor, its principal values (Axx, Ayy, Azz), and eigenvectors. The output also contains the axial and rhombic (Aax and Arh) components of the tensor and the Euler angles (Alpha, Beta, Gamma) representing its orientation with respect to the coordinate frame in which the molecule is defined.
The module also visualizes and quantitates the agreement between the experimental RDC data and back-calculated RDCs.
In the case of multiple conformers, the module returns the alignment tensor for each individual conformer as well as the average alignment tensor.
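The RDC data file layout described in the notes above can be illustrated with a small reader. This is only a sketch, not the module's own parser: the function name `parse_rdc_text` is hypothetical, and it assumes whitespace-separated fields, which the notes do not state explicitly.

```python
def parse_rdc_text(text, default_error=1.0):
    """Parse RDC data laid out as '# chain atom1 atom2' headers plus data rows.

    Assumption (illustrative only): fields are separated by whitespace.
    Each data row is: residue number, RDC value, optional error.
    """
    records = []
    current = None  # (chain ID, atom 1, atom 2) from the latest header line
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            fields = line.lstrip("#").split()
            if len(fields) < 3:
                raise ValueError(f"incomplete header line: {raw!r}")
            current = tuple(fields[:3])
            continue
        if current is None:
            raise ValueError("data line found before any header line")
        cols = line.split()
        if len(cols) < 2:
            raise ValueError(f"expected at least two columns: {raw!r}")
        # Fall back to the 1 Hz default when no error column is present.
        error = float(cols[2]) if len(cols) > 2 else default_error
        records.append(
            {
                "chain": current[0],
                "atom1": current[1],
                "atom2": current[2],
                "resid": int(cols[0]),
                "rdc": float(cols[1]),
                "error": error,
            }
        )
    return records
```

For instance, a file with a `# A N H` header followed by rows such as `2 -3.1 0.5` and `3 7.4` would give two N-H records for chain A, the second with the default error of 1 Hz. The chain ID and atom names used here are made-up values for illustration.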
Screen Shots and Description of Input Fields
This example calculates the alignment tensor and Euler angles of the ubiquitin monomer (1D3Z_1model.pdb) using the experimental RDC data [4] and performs a Monte Carlo analysis.
run name: User-defined name of folder that will contain the results.
pdb file input: PDB file with naming information and coordinates of the starting structure.
pdb/dcd file input: PDB or DCD file with coordinates that will be used to calculate molecular orientation(s).
RDC data file: The experimental RDC data file.
Monte Carlo sampling option: Check box to perform a Monte Carlo analysis of the errors in the alignment tensor eigenvalues and Euler angles by generating synthetic input data. The synthetic RDC data are generated by adding Gaussian noise (scaled by the experimental errors) to the input RDC data.
enter number of Monte Carlo steps: The number of Monte Carlo steps for generating synthetic data per residue/atom pair (default = 500).
residue list file option: Check box to enter a list of residues/atom pairs to be excluded from the ALTENS analysis.
exclusion list file: The file name to define the exclusion list.
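The synthetic-data step of the Monte Carlo option can be sketched as follows. This is an illustration under stated assumptions rather than the module's implementation: the function name `synthetic_rdc_sets` is hypothetical, and the noise is assumed to be zero-mean Gaussian with a standard deviation equal to each experimental error.

```python
import numpy as np

def synthetic_rdc_sets(rdcs, errors, n_steps=500, seed=None):
    """Return an (n_steps, n_rdcs) array of noisy copies of the input RDCs.

    Assumption (illustrative only): each synthetic value is the input RDC
    plus zero-mean Gaussian noise whose standard deviation is that RDC's
    experimental error.
    """
    rng = np.random.default_rng(seed)
    rdcs = np.asarray(rdcs, dtype=float)
    errors = np.asarray(errors, dtype=float)
    # One row of independent noise per Monte Carlo step.
    noise = rng.standard_normal((n_steps, rdcs.size)) * errors
    return rdcs + noise
```

Each row would then be fitted in the same way as the experimental data, and the spread of the resulting eigenvalues and Euler angles gives an estimate of their uncertainties.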
Example Output
Results are written to a new directory inside the directory named by "run name", as noted in the output. For example, the figure shows that the RDCs, the alignment tensor, and the Euler angles were saved to the following files in the "run name" directory of the current project directory:
run_0/altens/run_0_results_00001.txt -- summary of the ALTENS run
run_0/altens/run_0_results_average.txt -- average results over models
run_0/altens/run_0_results_per_model.txt -- results of each model
run_0/altens/run_0_calc_rdc_00001.txt -- experimental and back-calculated RDC data
run_0/altens/run_0_calc_rdc_average.txt -- averaged back-calculated RDC data over models
run_0/altens/run_0_mc_00001.txt -- trajectory of Monte Carlo sampling
run_0/altens/run_0_mc_per_model.txt -- Monte Carlo sampling results of each model
run_0/altens/run_0_single_model_first.html -- html file for users to regenerate plots
run_0/altens/run_0_mc_histogram.html -- html file for users to regenerate plots
run_0/altens/run_0_residue_list.txt -- list of actual residues
run_0/altens/run_0_exclusion_list.txt -- template to define the exclusion list refinements
Visualization
The plots below illustrate the screen output for a single model/conformer when the Monte Carlo option is selected.
The top row panels show the following. Left panel: input RDCs as a function of residue number. Middle panel: the residuals of fit (experiment - calculation) as a function of residue number. Right panel: the correlation between the experimental and back-calculated RDCs; the straight line on this plot corresponds to absolute agreement.
The type of vector/atom pair and the color used for its data are indicated in the upper left corner of each plot. Shown at the top of the right panel are parameters quantitating the agreement: the R-factor (defined in [3]) and the Pearson’s correlation coefficient (Corr. coeff). By placing the cursor over these plots, the user can obtain information about a specific data point or bar.
The middle and the bottom row panels appear only when the Monte Carlo sampling option is checked. They depict histograms of the distributions of the eigenvalues (Axx, Ayy, Azz) and the Euler angles (Alpha, Beta, Gamma) for the Monte-Carlo generated synthetic RDC data.
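The residuals and the correlation coefficient shown in the top-row panels can be reproduced from the experimental and back-calculated RDC columns with a few lines of code. The sketch below is illustrative only: the function name `agreement_summary` is hypothetical, and it reports a plain RMSD of the residuals rather than the R-factor, whose definition is given in [3] and is not reproduced here.

```python
import numpy as np

def agreement_summary(experimental, calculated):
    """Return residuals, Pearson correlation, and RMSD for two RDC series.

    Residuals follow the convention of the middle panel:
    experiment minus calculation.
    """
    exp = np.asarray(experimental, dtype=float)
    calc = np.asarray(calculated, dtype=float)
    residuals = exp - calc
    # Off-diagonal element of the 2x2 correlation matrix is Pearson's r.
    corr = float(np.corrcoef(exp, calc)[0, 1])
    rmsd = float(np.sqrt(np.mean(residuals ** 2)))
    return residuals, corr, rmsd
```

A correlation coefficient close to 1 together with small residuals corresponds to points lying near the straight line in the right panel.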
This example refines the results of the previous run with the ubiquitin monomer (1D3Z_1model.pdb) by excluding residues 75 and 76, located at the flexible C terminus.
The resulting correlation coefficient is 0.991 and the R-factor is 0.089, indicating an improvement in the overall agreement between experiment and back-calculation.
Example 2: Averaging over multiple models/conformers
This example calculates the alignment tensors for 10 conformations of the monomeric protein ubiquitin (1D3Z_10models.pdb).
Visualization
The figure below is a snapshot of the bottom set of plots, which show the model-dependent outputs. This figure appears only when the pdb/dcd input file contains more than one model/conformer.
Left panel: plot of the R-factor vs. the correlation coefficient for each model/conformer analyzed (red circles), together with the average of these values over all models/conformers (blue cross).
Middle panel: Eigenvalues of the alignment tensor as a function of model/conformer number.
Right panel: Euler angles for the alignment tensor orientation as a function of model/conformer number.
The data for a specific model of interest can be seen by moving the mouse over its data point in the figure. To review the back-calculated RDC data and the MC analysis results for a specific model (e.g., model number 6, which has the greatest deviation from the average in the left panel above), click the data point to open a new tab containing plots for the selected model only, as seen in the figure below.