This section of the manual is divided into two categories:
At times, a protein structure may be determined with a wrong fold, e.g. due to poor resolution of the structure (X-ray data) or a low-resolution density map (ECM data). In such a case GapRepairer can be used to create a structure with the correct nontrivial topology. The protocol for such a protein is described below, using as an example a bacterial tRNA methyltransferase (PDB ID: 1oy5). After proper reconstruction, the backbone of this protein is knotted.
Workflow:
First, the offending segments should be removed, either manually in a text editor or with a graphical program such as PyMOL. The amino acids to be removed are shown in red in Fig. 1 below.
Then the modified structure should be uploaded to GapRepairer using the Process my own structure option, along with the unmodified .fasta file. To ensure that the original structure (as its own best homologue) is not used as a template, it should be listed in the Structures to exclude option (under Advanced Options in the repair form). The protein backbone after reconstruction is shown in Fig. 1B.
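The first step of this workflow (removing the offending segment) can also be scripted instead of done by hand. The following is a minimal Python sketch, not part of GapRepairer itself: it relies only on the fixed column layout of PDB ATOM records (chain ID in column 22, residue number in columns 23-26). The function names strip_residues and ca_line, the toy input, and the residue range 3-4 are hypothetical and chosen purely for illustration; the actual range for 1oy5 has to be read from Fig. 1.

```python
def ca_line(serial, resname, chain, resseq, x, y, z):
    """Format one C-alpha ATOM record using the fixed PDB column layout."""
    return (f"ATOM  {serial:5d}  CA  {resname:3s} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def strip_residues(pdb_text, chain_id, first_res, last_res):
    """Drop coordinate records of residues first_res..last_res in one chain.

    Assumes plain integer residue numbers (no special handling of
    insertion codes); all other lines are passed through unchanged.
    """
    kept = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM", "ANISOU")) and len(line) >= 26:
            try:
                res_num = int(line[22:26])
            except ValueError:
                # Unparsable residue number: keep the line untouched.
                kept.append(line)
                continue
            if line[21] == chain_id and first_res <= res_num <= last_res:
                continue  # this atom belongs to the segment being removed
        kept.append(line)
    return "\n".join(kept) + "\n"


# Toy input: six alanine C-alpha atoms in chain A (hypothetical data).
demo = "\n".join(
    ca_line(i, "ALA", "A", i, float(i), 0.0, 0.0) for i in range(1, 7)
)
trimmed = strip_residues(demo, "A", 3, 4)
remaining = [int(line[22:26]) for line in trimmed.splitlines()]
print(remaining)  # prints [1, 2, 5, 6]
```

For a real structure, the text would be read from the downloaded .pdb file and the trimmed result written to a new file, which is then uploaded together with the unmodified .fasta file.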
As another possible application, GapRepairer can be used to untie a protein backbone. Careless reconstruction of a gap (e.g. using a straight line) can lead to an artificial entanglement. Such artificial crossings are more common in the case of knots than slipknots (77 and 55 cases respectively found by KnotProt). This inequality may be due to the number of additional crossings that must be introduced to create an artificial entanglement - one such error is enough to create a knotted protein. For such a protein GapRepairer can be used to recreate the missing fragment and return to the trivial topology (if that is the prevalent one amongst the homologues of the protein in question), using the workflow described for Artificially disentangled proteins.
Examples of untied proteins (reduced from a 3₁ knot/slipknot to an unknot):
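To make the "straight line" filling mentioned above concrete, here is a minimal Python sketch of the idea: the missing residues are replaced by points evenly spaced on the segment joining the two flanking C-alpha atoms. This only illustrates the concept and is not the code used by KnotProt or GapRepairer; the function name straight_line_fill and the coordinates are hypothetical.

```python
def straight_line_fill(start, end, n_missing):
    """Return n_missing points evenly spaced on the segment start -> end.

    start and end are (x, y, z) coordinates of the residues flanking the
    gap; the end points themselves are not included in the result.
    """
    points = []
    for i in range(1, n_missing + 1):
        t = i / (n_missing + 1)  # fraction of the way from start to end
        points.append(tuple(s + t * (e - s) for s, e in zip(start, end)))
    return points


# Hypothetical gap of three residues between two flanking C-alpha atoms.
filled = straight_line_fill((0.0, 0.0, 0.0), (4.0, 0.0, 8.0), 3)
print(filled)
```

Because such a segment ignores the rest of the chain, it can pass through a loop formed elsewhere in the backbone, which is how a single careless connection can introduce an artificial crossing.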
Untying more complicated proteins:
While artificial trefoils (that is, the 3₁ knots/slipknots) are by far the most common of such errors, more complicated topologies can also be created - and resolved using GapRepairer. Such an entanglement may be created when a protein contains one (or more) twisted loops and the unfortunate connection that fills the gap happens to go right through such a loop.
Case study: Correcting a structure based on an incorrect lasso topology
The structure with PDB ID 3j70 is a computational model of the HIV envelope glycoprotein. Two of its loops are crossed (marked in Fig. 4, left panel, below), and since each belongs to a separate lasso, together they form a Hopf link. While these are highly mobile loops, such an arrangement is not present in any of the experimental structures of this protein available in the PDB. We therefore judged the reconstruction of this particular region to be incorrect and decided to rebuild it using GapRepairer through the following workflow:
Thanks to the possibility of reconstructing multiple missing fragments at the same time, GapRepairer can also be used to change the chirality of a protein. By correcting just two wrongly interpreted crossings, a -3₁ slipknot can be changed into a +3₁ one (as shown in Fig. 5). The protein with PDB ID 4zg6 is annotated as a -3₁ slipknot in KnotProt (due to the straight-line gap filling), while all of its homologues have the opposite chirality. As can be seen in Fig. 5 below, the repaired protein has a +3₁ topology.
Topology differentiation is especially important for proteins where high sequence similarity does not preclude differing topology. The most notable example is the ATCase/OTCase protein family (Pfam family PF00185) - a family containing two closely related enzymes: a 3₁-knotted aspartate carbamoyltransferase and an unknotted ornithine carbamoyltransferase. These two enzymes cannot be easily differentiated based on their sequence, yet they have different folds (as can be seen in the JSmol applet below). A consensus structure that averages over different topologies can be quite unexpected (it is impossible to say for sure which topology it would display), so restricting the templates to those whose topology agrees with the closest homologue drastically reduces the uncertainty of the final result.
When uploading a .pdb file to GapRepairer, the user can choose to take the other chains present into consideration. As currently only one chain can be modelled at a time, this option allows the user to repair a multichain structure without resorting to editing .pdb files by hand. To repair a two-chain structure with both chains incomplete, the proper workflow would be to:
There is currently no upper limit to the length or number of missing fragments, as long as at least the first and last residue are present and, to ensure a sensible reconstruction, close enough homologues exist.
This functionality can also be used for structures with significant structural errors. One such case that can be resolved using GapRepairer is the structure with PDB ID 2xkl. Based on comparison with its closest structural and sequential homologues (e.g. structures with PDB IDs 2wew and 2xkl), one of its beta strands has been assigned incorrectly: it was assumed to appear at the N terminus, while it should fall in the middle of the sequence (deep blue beta strand in the middle of Fig. 8, left panel). Thanks to the close structural relation with its homologues, it is possible to fully reconstruct the presumed correct form. The suggested workflow is as follows:
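Before uploading, it can be useful to check how many fragments are missing and how long they are. The following Python sketch lists internal gaps in one chain by looking for jumps in the residue numbers of its ATOM records. It is an illustration under stated assumptions, not GapRepairer's own gap detection: residue numbering is assumed to be sequential and without insertion codes, and missing residues at the termini cannot be found this way (they require comparison with the .fasta sequence). The names find_gaps and ca_line and the toy data are hypothetical.

```python
def ca_line(serial, resname, chain, resseq, x, y, z):
    """Format one C-alpha ATOM record using the fixed PDB column layout."""
    return (f"ATOM  {serial:5d}  CA  {resname:3s} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def find_gaps(pdb_text, chain_id):
    """Return (first_missing, last_missing) pairs for internal gaps.

    A gap is reported wherever two consecutive observed residue numbers
    of the chain differ by more than one.
    """
    seen = set()
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and len(line) >= 26 and line[21] == chain_id:
            seen.add(int(line[22:26]))  # residue number, columns 23-26
    numbers = sorted(seen)
    gaps = []
    for prev, curr in zip(numbers, numbers[1:]):
        if curr - prev > 1:
            gaps.append((prev + 1, curr - 1))
    return gaps


# Toy chain A with residues 4-6 and 9-11 missing (hypothetical data).
observed = [1, 2, 3, 7, 8, 12]
demo = "\n".join(
    ca_line(n, "GLY", "A", res, 0.0, 0.0, float(res))
    for n, res in enumerate(observed, start=1)
)
gaps = find_gaps(demo, "A")
print(gaps)  # prints [(4, 6), (9, 11)]
```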
GapRepairer was used to check whether reconstructing the missing amino acids in gapped proteins would make it possible to confirm the position of a lasso (that is, the disulfide bridge). PDB codes of the repaired proteins whose chains and complex lasso type were successfully modeled are listed in the collapsible panel below, and will shortly be available in our database.
List of the repaired lasso proteins
For the in silico part of the analysis a missing link between two domains of the 1ual protein was repaired using GapRepairer.
The proteins 1cmx and 4i6n were reconstructed using GapRepairer.