(-) Introduction

The batch calculation option enables the analysis of sets of PDB entries.
As a starting point, the calculation of mean amino acid distances is currently available.

We plan to add other calculation options, based on user requests.

(-) Request Status Report

It is not necessary to keep the Internet browser open until a calculation is finished.

On the batch calculation start page there is an option to view the status of any request.

If permanent cookies are enabled in your browser (at least for the JenaLib server), you can get a history report of all recent batch calculation requests that were started from your browser. Just click on the "go" button without providing a request ID.

If cookies are not enabled, you can still get a status report for a specific request by entering the request ID. To make sure that you can access a request after closing your browser, write down the request ID after starting the request. This also allows you to view the results in a different browser or on a different computer. The results of a request are usually available for several days after it is finished.

(-) PDB Entry Selection

PDB entries are selected for a batch calculation request by providing a list of PDB codes.
One can either write or paste the list into the text input field or select a file containing the list.

If only the PDB code is provided, all available chains are used for the calculation.
If a chain identifier is appended to the PDB code, only this chain is used for the calculation.

A chain identifier can be appended directly to the PDB code ("1lflA"), or separated from it by an underscore ("1lfl_A") or by a space character ("1lfl A").
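
The notations described here can be normalized with a small parser. The following Python sketch is only an illustration and not the server's implementation. It assumes that a PDB code is a digit followed by three alphanumeric characters, that a chain identifier is a single alphanumeric character, and that PDB codes may be lowercased; the name `parse_entry` is hypothetical.

```python
import re

# Assumption: PDB code = one digit + three alphanumeric characters,
# optionally followed by "_" or " " and a one-character chain identifier.
ENTRY_PATTERN = re.compile(r"^([0-9][A-Za-z0-9]{3})(?:[_ ]?([A-Za-z0-9]))?$")


def parse_entry(text):
    """Split one list entry into (pdb_code, chain_id); chain_id is None if absent."""
    match = ENTRY_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognizable PDB entry: {text!r}")
    code, chain = match.groups()
    # The chain identifier is kept as typed because chain IDs are case-sensitive.
    return code.lower(), chain
```

With this sketch, "1lflA", "1lfl_A" and "1lfl A" all yield the same code and chain, while "1lfl" alone yields no chain, which corresponds to using all available chains.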

There are 3 different list formats accepted:

(-) Mean Amino Acid Distance

The mean distance between the alpha-carbon atoms of amino acids within a protein chain (not between different chains) is determined for a set of PDB entries.

The distance calculation is performed by using the ncont program of the crystallographic program suite CCP4.
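
To make the quantity being measured concrete, the following Python sketch computes alpha-carbon distances within one chain directly from PDB coordinate records. It is an independent illustration, not a description of how ncont works. The function names are hypothetical, and alternate locations and multiple models are ignored for brevity. The column positions follow the fixed-column ATOM/HETATM record layout of the PDB file format.

```python
import math
from itertools import combinations


def read_ca_atoms(pdb_lines, chain_id):
    """Collect (residue_name, residue_number, (x, y, z)) for each CA atom of one chain."""
    atoms = []
    for line in pdb_lines:
        # HETATM is included so that modified residues such as MSE are not lost.
        if not line.startswith(("ATOM", "HETATM")):
            continue
        # " CA " is the alpha carbon; "CA  " would be a calcium ion.
        if line[12:16] != " CA " or line[21] != chain_id:
            continue
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        atoms.append((line[17:20].strip(), int(line[22:26]), xyz))
    return atoms


def ca_distances(atoms):
    """Return (name_1, name_2, distance) for every unordered CA pair in the chain."""
    return [
        (a[0], b[0], math.dist(a[2], b[2]))
        for a, b in combinations(atoms, 2)
    ]
```

Restricting `read_ca_atoms` to a single chain mirrors the rule that distances are measured within a chain and never between different chains.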

The individual distances provided by ncont are used for averaging over each individual chain and for averaging over all chains of all entries. Averaging (mean distance, median distance) is done for all amino acid pairs or amino acid group pairs that were selected in the submission form.
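
The two levels of averaging can be sketched in Python as follows. This is a minimal illustration under the assumption that the average over all entries pools the individual distances of all chains (rather than averaging the per-chain means); the function name and the input tuple layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median


def summarize(distances):
    """Compute (mean, median) per chain and over all chains.

    distances: iterable of (pdb_code, chain_id, pair_label, distance).
    Returns two dicts: one keyed by (pdb_code, chain_id, pair_label),
    one keyed by pair_label only.
    """
    per_chain = defaultdict(list)
    overall = defaultdict(list)
    for code, chain, pair, value in distances:
        per_chain[(code, chain, pair)].append(value)
        # Assumption: the overall statistics pool all individual distances.
        overall[pair].append(value)

    def stats(values):
        return mean(values), median(values)

    return (
        {key: stats(values) for key, values in per_chain.items()},
        {key: stats(values) for key, values in overall.items()},
    )
```

The `pair_label` would be either an individual amino acid pair or a group pair, depending on the selection made in the submission form.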

The amino acid selection can be made in 3 different ways:

  1. By an amino acid pair
    One can select either a specific amino acid or "any" amino acid for each half of the pair.
    If "any" is selected, all available standard and non-standard amino acids are included.
    example: Aspartic acid (Asp,D)
    Glutamine (Gln,Q)

    Averaging is done for each individual amino acid pair.

  2. By a predefined amino acid group pair
    One can select predefined sets of amino acids for each half of the pair, according to the amino acid classification by Russell, Betts & Barnes.
    example: Aromatic (F,H,W,Y)
    Charged (D,E,H,K,R)

    Averaging is done for the whole group pair and not for the individual amino acid pairs.

  3. By a user-defined amino acid group pair
    One can define a customized group for each half of the pair by supplying a comma-separated list of amino acids. This way, specific non-standard amino acids (e.g. MSE) can also be included.
    The amino acids can be specified either by three-letter or one-letter codes.
    example: Asp,Glu

    Averaging is done for the whole group pair and not for the individual amino acid pairs.
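
How a user-defined group might be interpreted can be sketched in Python. This is an assumption-laden illustration, not the server's code: the names `parse_group` and `in_group_pair` are hypothetical, one-letter codes are resolved only for the 20 standard amino acids, and a residue pair is assumed to match a group pair in either order.

```python
ONE_TO_THREE = {
    "A": "ALA", "R": "ARG", "N": "ASN", "D": "ASP", "C": "CYS",
    "Q": "GLN", "E": "GLU", "G": "GLY", "H": "HIS", "I": "ILE",
    "L": "LEU", "K": "LYS", "M": "MET", "F": "PHE", "P": "PRO",
    "S": "SER", "T": "THR", "W": "TRP", "Y": "TYR", "V": "VAL",
}


def parse_group(text):
    """Turn a comma-separated list (e.g. Asp,Glu or D,E) into a set of three-letter codes."""
    group = set()
    for token in text.split(","):
        token = token.strip().upper()
        if not token:
            continue
        if len(token) == 1:
            if token not in ONE_TO_THREE:
                raise ValueError(f"unknown one-letter code: {token!r}")
            token = ONE_TO_THREE[token]
        elif len(token) != 3:
            raise ValueError(f"expected a one- or three-letter code: {token!r}")
        # Three-letter codes pass through unchanged, so MSE and similar are allowed.
        group.add(token)
    return group


def in_group_pair(res_a, res_b, group_1, group_2):
    """True if the residue pair matches the group pair (assumed: in either order)."""
    return (res_a in group_1 and res_b in group_2) or (
        res_a in group_2 and res_b in group_1
    )
```

For a group pair, every matching residue pair contributes to one common average, whereas for a plain amino acid pair each pair is averaged on its own.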

The selection of PDB entries for the distance calculation is described in the section PDB Entry Selection above.

Be aware that the mean distance calculation may take several hours or even much longer.

The running time depends on the number of individual distances that must be determined. This number, in turn, depends strongly on the following parameters (besides the current load on the JenaLib server):

                                                 Tyr-Leu                  Positive-Negative        Any-Any
Code    Entry Count  Chain Count  Residue Count  Time     Distance Count  Time     Distance Count  Time     Distance Count
1bl0    1            1            116            <1 sec   72              <1 sec   330             7 sec    6670
1deh_A  1            1            374            <1 sec   174             <1 sec   1800            18 sec   69751
1deh    1            2            748            <1 sec   348             <1 sec   3600            30 sec   139502
-       100          141          24739          25 sec   13419           38 sec   79503           9 min    2519076
-       2000         3951         837877         11 min   820226          25 min   4455477         7 hours  137350398
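
The Any-Any distance counts of the three single-entry rows in the table above are consistent with counting every unordered pair of residues within a chain exactly once, that is n(n-1)/2 distances for a chain of n residues. This is an observation drawn from the table, not a documented property of the service, but it indicates why the running time grows roughly quadratically with chain length. A short Python check (the function name is hypothetical):

```python
def any_any_pair_count(chain_lengths):
    """Number of unordered residue pairs within each chain, summed over all chains."""
    return sum(n * (n - 1) // 2 for n in chain_lengths)


# Residue counts per chain, taken from the single-entry rows of the table.
print(any_any_pair_count([116]))       # 1bl0   -> 6670
print(any_any_pair_count([374]))       # 1deh_A -> 69751
print(any_any_pair_count([374, 374]))  # 1deh   -> 139502 (two chains, 748 residues)
```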

The results are available in 2 different formats:

The HTML result page contains the request status information, a table of averages over all entries, and a table of chain-specific averages.
The individual distances are not included because the page would become very large, even for a few PDB entries.

In CSV format (tab-separated values, suitable for importing the results into spreadsheet programs like Microsoft Excel) there are three independent data files available:

  1. averaging over all entries
  2. chain-specific averaging
  3. individual distances
Because the data files can become very large, we recommend downloading them as a compressed ZIP archive. You can freely choose which of the data files to include in the ZIP archive.
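
For processing the downloaded data outside a spreadsheet program, the tab-separated files can be read directly from the ZIP archive. The following Python sketch uses only the standard library. The function name is hypothetical, UTF-8 encoding is an assumption, and the file names inside the archive are not specified here, so they should be looked up with `ZipFile.namelist()` rather than guessed.

```python
import csv
import io
import zipfile


def read_tsv_from_zip(zip_source, member_name):
    """Read one tab-separated data file from a ZIP archive into a list of rows.

    zip_source may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        with archive.open(member_name) as raw:
            # Assumption: the data files are UTF-8 (or plain ASCII) text.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.reader(text, delimiter="\t"))
```

Because the file with the individual distances can be very large, reading it row by row instead of building a full list may be preferable for that file.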

