Helix and bending analysis of nucleic acid double helix structures

 

Summary

In this part of the IMB Jena Image Library we offer tools for the analysis and visualization of nucleic acid double helix structures. This service is available for all regular A- and B-helices from structures deposited in the Protein Data Bank (PDB) and in the Nucleic Acid Database (NDB). The analysis of helix structure comprises an analysis of helical parameters and a bending analysis. The results are presented in tabular form and as plots. In addition, still images and interactive models are provided.

 

1. Introduction

The double helix is the predominant conformation of DNA in cells, and several RNA molecules also fold into domains with a double-helical conformation. Watson-Crick helices are often thought to be rather regular and uniform, but atomic resolution structures from X-ray crystallography and NMR have revealed a wealth of structural deviations from the canonical A- and B-type conformations. The fine details of the helical structure are sequence dependent, but they can also be influenced by external factors such as proteins, ligands, ions, and crystal packing effects.

While the canonical A- and B-helices derived from fiber diffraction are straight, several nucleic acid structures solved at atomic resolution show a bending of the helix. Pronounced DNA bending is found especially in DNA/protein complexes. The interaction of proteins and nucleic acids is a central problem in current molecular biology. Proteins control the expression of genes, the replication, and the packing and arrangement of the DNA in cells. Many proteins recognize structural variations of their binding sites rather than reading the base sequence directly. Binding of a protein can be achieved either by shape complementarity or by an induced fit mechanism in which the shape of the nucleic acid changes upon binding.

Perhaps the most prominent example of protein-induced DNA bending is the formation of chromatin. But DNA bending also plays a crucial role in the regulation of gene expression. DNA binding proteins like IHF and transcription factors like CAP or TBP introduce severe bends into the DNA. This can bring into close proximity DNA sites that are several hundred base pairs apart in sequence. In this way a single protein can interact with several DNA sites at the same time. In particular, this mechanism may explain how enhancers and upstream and downstream regulatory DNA elements can interact with a transcription initiation complex located at the promoter of a gene.

With the increasing number of protein/DNA structures solved at atomic resolution, the origin of protein-induced DNA bending can be studied in greater detail. A rigorous analysis of DNA structure and geometry is required to understand the details of protein/DNA interaction. Here we offer an analysis of nucleic acid double helix structure with two approaches. First, we analyze the structure in terms of helical parameters. The programs CURVES and FREEHELIX are used to calculate the displacement parameters between adjacent base pairs. These parameters describe the local conformation at each dinucleotide step. The second approach is a bending analysis, where the shape of the helical axis is analyzed. This yields parameters like kink angles or radius of curvature, which directly quantify the amount of global bending.
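To make the two bending parameters named above concrete, the following is a minimal sketch of how a bend angle and a radius of curvature could be computed from three points on a helical axis. It is an illustration only, not the procedure used by this service; the function names are hypothetical, and the three-point circle construction is an assumption chosen for simplicity.

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _norm(v):
    """Euclidean length of a 3D vector."""
    return math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)


def _cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def bend_angle_deg(p0, p1, p2):
    """Angle in degrees between the axis segments p0->p1 and p1->p2.

    A straight axis gives 0; hypothetical helper, not part of CURVES.
    """
    u, v = _sub(p1, p0), _sub(p2, p1)
    dot = sum(a * b for a, b in zip(u, v))
    return math.degrees(math.atan2(_norm(_cross(u, v)), dot))


def radius_of_curvature(p0, p1, p2):
    """Radius of the circle through three axis points (inf if collinear).

    Uses R = a*b*c / (4 * area) for a triangle with side lengths a, b, c.
    """
    a = _norm(_sub(p1, p0))
    b = _norm(_sub(p2, p1))
    c = _norm(_sub(p2, p0))
    twice_area = _norm(_cross(_sub(p1, p0), _sub(p2, p0)))
    if twice_area == 0.0:
        return math.inf
    return a * b * c / (2.0 * twice_area)
```

The radius is returned in the units of the input coordinates, which are Ångström for PDB coordinate files. A small radius or a large angle indicates strong bending; a straight axis yields an angle of 0 and an infinite radius.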

Both approaches are complementary because one emphasizes local and the other global properties. The numerical analysis is supported by a visualization of the structures. The nucleic acid double helices are shown in three orthogonal views, superimposed on their helical axis. Orthogonal views are most informative if the structure is oriented appropriately, and we have developed an algorithm to do this automatically. The helical axis is first determined with the CURVES program. The algorithm then calculates the principal axes of inertia of the helical axis and aligns them with the coordinate axes of the viewing windows. This orientation is especially useful for the evaluation of bending because the front view always shows the largest bend. Models of the helix and the helical axis are also offered for viewers of 3-dimensional molecular structures like RASMOL and CHIME. Users are meant to compare the results obtained from the different analysis procedures and for different structures. Therefore, all images, plots and tables are arranged consistently. In particular, the 5'-end of strand 1 is always located at the left-hand side, in the plots of data as well as in the images of structures.
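The orientation step can be sketched as follows. This is a minimal illustration using NumPy, not the code behind this service. It assumes that the helical axis is given as a list of 3D points with equal weights, and that the direction of largest extent is mapped to x, the second largest to y, and the smallest to z (the viewing direction of the front view). The function name is hypothetical.

```python
import numpy as np


def orient_by_principal_axes(axis_points):
    """Rotate helical-axis points into their principal-axis frame.

    Assumptions of this sketch: unit mass at every axis point; largest
    extent along x, second largest along y, smallest along z.
    Returns the oriented points and the 3x3 rotation matrix (columns
    are the principal axes).
    """
    pts = np.asarray(axis_points, dtype=float)
    centered = pts - pts.mean(axis=0)

    # Second-moment matrix of the points. For unit masses it has the
    # same eigenvectors as the inertia tensor, so its eigenvectors are
    # the principal axes of inertia.
    moments = centered.T @ centered
    eigvals, eigvecs = np.linalg.eigh(moments)  # eigenvalues ascending

    # Reorder so that the axis of largest extent comes first.
    order = np.argsort(eigvals)[::-1]
    axes = eigvecs[:, order]

    # Keep a right-handed frame so the structure is rotated, not mirrored.
    if np.linalg.det(axes) < 0:
        axes[:, 2] *= -1

    return centered @ axes, axes
```

For a helical axis that bends within a single plane, the oriented points have essentially zero extent along z, so the whole bend is visible in the x-y view. The same rotation matrix would then be applied to the atom coordinates of the helix.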

The analysis and visualization of nucleic acid double helix structures is offered as a web-based tool for more than 800 structures from the PDB and the NDB. The main strength of this service is that the user can determine helical and bending parameters with uniform methods for all structures in the PDB and the NDB.

The analysis of helix structure and bending applies only to the nucleic acid bases. A further aspect of nucleic acid structure is the conformation of the sugar phosphate backbone. For the analysis of backbone structure, a table of selected backbone torsion angles is presented, and plots of the minor and major groove width can be requested.

 

2. Structures available for analysis

The analysis of nucleic acid double helices is limited to regular A- and B-helices with at least 6 base pairs. The duplex must be composed of two separate molecules or chains. Helices shorter than 6 base pairs, Z-DNA, tetraplex structures and irregular helices containing bulges, loops or several mismatches are excluded. The algorithm for scanning the PDB and NDB for appropriate duplex structures is relatively simple and may fail in a few cases.

Only the nucleic acid double helix part of a structure is analyzed. All other parts, such as proteins, ligands, and even overhanging ends, are not taken into account and are not shown in the images and 3-dimensional models. Triple helix structures are separated into a Watson-Crick double helix, which is analyzed, and a third strand, which is discarded. For structures with more than one model (NMR structures), only the first model is analyzed.
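As an illustration of the first-model rule, the sketch below collects the coordinate records of the first model from the lines of a PDB-format file. It relies only on the standard PDB record names (ATOM, HETATM, ENDMDL); the function name is hypothetical, and this is not the code used by this service.

```python
def first_model_atom_lines(pdb_lines):
    """Return the ATOM/HETATM records of the first model only.

    Reading stops at the first ENDMDL record. Files without MODEL and
    ENDMDL records (single-model structures) are returned in full.
    """
    atoms = []
    for line in pdb_lines:
        record = line[:6].strip()  # record name occupies columns 1-6
        if record == "ENDMDL":
            break
        if record in ("ATOM", "HETATM"):
            atoms.append(line.rstrip("\n"))
    return atoms
```

The function accepts any iterable of lines, so an open file handle can be passed directly.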

 

3. Selection of structures

Both PDB and NDB codes can be used for selecting structures. For structures contained in both databases, the analysis is always done on the coordinate file from the NDB, regardless of whether the structure was selected by its PDB or NDB code. The coordinates are the same in both databases; only the data in the coordinate file headers differ. We prefer the NDB files because the NDB has updated the headers in accordance with PDB file format specification 2.1.

Structures can be selected from lists of structures, or by searching for a given PDB or NDB code. There are separate lists for duplexes from pure nucleic acid structures and from protein complexes. The list of structures from protein complexes comes with a simple interface for searching by type of protein and for sorting the structures found.

 

4. Analysis tools

The following tools are offered for the analysis and visualization of regular nucleic acid double helices:




Last modified: July 28, 2000
Author: Peter Slickers and Jürgen Sühnel (jsuehnel@imb-jena.de)