The generic force field of the SURPASS model describes the most fundamental properties of globular proteins. The only sequence-dependent parameters come from the secondary structure. The force field is derived from regularities observed in real protein structures. The statistics are based on a redundant set of 4600 protein chains, representing all known protein families, with a resolution of 1.6 Å or better and a sequence identity not greater than 60%. The analysis of these statistical data, described below, defines the SURPASS force field, which consists of knowledge-based statistical potentials.

[Figure 1. Schematic illustration of the terms included in the SURPASS force field.]

The lack of atomic detail in the strongly simplified and pre-averaged SURPASS chain may lead to an incorrect local geometry of the structure. To avoid this, the structural regularities of the atomistic models must be transferred onto the corresponding sets of united atoms. All generic terms (R12, R13, R14 and R15) are prepared in six variants (HH, EE, CC, HE, HC, EC), depending on the secondary structure assigned to the pair of residues located at key positions. All short-range interactions are implemented in the force field as potentials of mean force (PMF), with a one-dimensional kernel density estimator (KDE) used to estimate the density of the empirical distribution.
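As an illustration of how a KDE-based potential of this kind could be tabulated, the following minimal sketch estimates a density from a list of distances and converts it to an energy by plain Boltzmann inversion, E(r) = -kT ln p(r). The exact functional form used in SURPASS (including any reference state) is not given here, so the inversion formula, the fixed Gaussian bandwidth, the function names and the sample distances are all assumptions made for this example.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples`
    as a sum of Gaussian kernels with a fixed bandwidth (assumed)."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

def pmf_from_samples(samples, grid, bandwidth, kT=1.0, floor=1e-12):
    """Tabulate E(r) = -kT * ln(p(r)) on `grid`, shifted so that the
    minimum energy is zero. No reference state is used (assumption)."""
    density = gaussian_kde(samples, bandwidth)
    # `floor` avoids log(0) where the estimated density vanishes.
    energies = [-kT * math.log(max(density(r), floor)) for r in grid]
    e_min = min(energies)
    return [e - e_min for e in energies]

# Hypothetical distances (in Å) for one secondary-structure variant;
# these numbers are invented for the example, not taken from the data set.
distances = [5.4, 5.5, 5.6, 5.5, 5.3, 5.7, 5.5, 6.1, 5.4, 5.6]
grid = [4.5 + 0.1 * i for i in range(21)]
energy = pmf_from_samples(distances, grid, bandwidth=0.15)
```

In this sketch one such table would be built per term and per secondary-structure variant (for example R13 in the HH variant), and the energy at run time would be looked up or interpolated from the grid.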

[Table 1. Secondary structure dependent short range interactions. | term | statistic plots (6 variants) | energy plot (all-in-one) - table 4 rows x 8 columns]

[equation and description]

In the SURPASS model, only the hydrogen bonds between residues that are distant in the sequence, especially in extended structure fragments, are modeled more directly. The formation of a model hydrogen bond therefore depends on a few simple geometric conditions:
- the length of the model hydrogen bond is in the range of 3.8 Å to 6.0 Å, with a most probable length of 4.65 Å;
- each pseudo residue in a β-strand can form at most two hydrogen bonds; if there are more candidates, the best two are chosen according to the angular criteria that follow;
- a hydrogen bond should be perpendicular to the main chain of both interacting β-strands; the permitted angle range is 70° to 115°;
- the twist of the β-sheet, measured as the planar angle between the main chains of two adjacent β-strands, must not exceed 55°;
- for a pseudo residue that forms two hydrogen bonds (with two different β-strands), the planar angle between these bonds must be greater than 125°, with 180° being the optimal orientation.
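The geometric conditions above can be sketched as a pair of simple predicates. This is only an illustration under stated assumptions: the local main-chain direction at pseudo residue i is taken as the vector from pseudo residue i-1 to i+1, the twist is folded into the range 0° to 90° so that parallel and antiparallel strands are treated alike, and the coordinates are invented. The function names are hypothetical, and the selection of the best two candidates is not covered.

```python
import math

def sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def angle_deg(u, v):
    """Angle between vectors u and v in degrees."""
    cos_t = sum(x * y for x, y in zip(u, v)) / (norm(u) * norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))

def chain_direction(chain, i):
    """Local main-chain direction at residue i.
    Assumption: vector from residue i-1 to i+1, so 1 <= i <= len(chain)-2."""
    return sub(chain[i + 1], chain[i - 1])

def hbond_allowed(chain_a, i, chain_b, j):
    """Check bond length, perpendicularity to both strands, and sheet twist."""
    bond = sub(chain_b[j], chain_a[i])
    if not 3.8 <= norm(bond) <= 6.0:
        return False
    dir_a = chain_direction(chain_a, i)
    dir_b = chain_direction(chain_b, j)
    for direction in (dir_a, dir_b):
        if not 70.0 <= angle_deg(bond, direction) <= 115.0:
            return False
    twist = angle_deg(dir_a, dir_b)
    twist = min(twist, 180.0 - twist)  # assumption: ignore strand polarity
    return twist <= 55.0

def two_bonds_compatible(center, partner_1, partner_2):
    """Two bonds formed by one pseudo residue must open by more than 125 degrees."""
    return angle_deg(sub(partner_1, center), sub(partner_2, center)) > 125.0

# Invented coordinates (Å): three straight strands in one plane, 4.65 Å apart,
# plus one strand that is too far away and one that is rotated by 90 degrees.
strand_a = [(0.0, 0.0, 0.0), (3.3, 0.0, 0.0), (6.6, 0.0, 0.0)]
strand_b = [(0.0, 4.65, 0.0), (3.3, 4.65, 0.0), (6.6, 4.65, 0.0)]
strand_c = [(0.0, -4.65, 0.0), (3.3, -4.65, 0.0), (6.6, -4.65, 0.0)]
far_strand = [(0.0, 7.0, 0.0), (3.3, 7.0, 0.0), (6.6, 7.0, 0.0)]
crossed = [(3.3, 4.65, -3.3), (3.3, 4.65, 0.0), (3.3, 4.65, 3.3)]

ok_pair = hbond_allowed(strand_a, 1, strand_b, 1)
ok_triple = two_bonds_compatible(strand_a[1], strand_b[1], strand_c[1])
```

With these coordinates the bond between the middle residues of `strand_a` and `strand_b` passes all checks, whereas `far_strand` fails the length condition and `crossed` fails the twist condition.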

[Figure 2. Statistical analysis of the geometry of the model hydrogen bond: A – length of hydrogen pseudobonds extracted from the RDF of the distance between the i-th and j-th pseudo residues in two β-strands; B – angle between two β-strands connected by a hydrogen bond; C – twist of the β-sheet measured as a planar angle between the main chains of two adjacent β-strands; D – angle between two hydrogen bonds connecting three β-strands.]

- pseudo atom H (helix-like) for helical (HHHH) or almost helical (HHHC, CHHH) fragments
- pseudo atom S (β-strand-like) representing the centers of mass of EEEE, EEEC or CEEE fragments
- pseudo atom C (coil-like) for all remaining combinations of the secondary structure states (H, E and C)
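The assignment rules listed above amount to a small lookup on four-letter secondary-structure fragments. The sketch below shows one way to express it; the function names and the sliding-window traversal of a full secondary-structure string are assumptions made for this example.

```python
HELIX_FRAGMENTS = {"HHHH", "HHHC", "CHHH"}
STRAND_FRAGMENTS = {"EEEE", "EEEC", "CEEE"}

def pseudo_atom_type(fragment):
    """Map a four-residue secondary-structure fragment (letters H, E, C)
    to a pseudo atom type: 'H', 'S' or 'C'."""
    if len(fragment) != 4 or set(fragment) - set("HEC"):
        raise ValueError(f"expected four letters from H/E/C, got {fragment!r}")
    if fragment in HELIX_FRAGMENTS:
        return "H"
    if fragment in STRAND_FRAGMENTS:
        return "S"
    return "C"  # every remaining combination is treated as coil-like

def pseudo_atom_types(secondary_structure):
    """Assign a type to every overlapping four-residue window.
    Assumption: one pseudo atom per window, windows shifted by one residue."""
    return [
        pseudo_atom_type(secondary_structure[i:i + 4])
        for i in range(len(secondary_structure) - 3)
    ]

types = pseudo_atom_types("CHHHHCEEEEC")
# types == ['H', 'H', 'H', 'C', 'C', 'S', 'S', 'S']
```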