Prediction of Chain Flexibility in Proteins

P. A. Karplus and G. E. Schulz
Today, many genes have been sequenced whose protein products remain unknown. These proteins can be identified, localized, and purified by using antibodies raised against oligopeptides that correspond to segments of the hypothetical protein sequence [1, 2]. In special cases, such oligopeptides may even give rise to synthetic vaccines [3, 4]. Currently, the selection of oligopeptide stretches from a protein sequence is based on schemes designed to predict segments of high antigenicity [5], hydrophilicity [6, 7], or reverse-turn potential [8, 9]. However, it has recently been demonstrated that segmental flexibility is more indicative of an antigenic determinant than the selection criteria mentioned above [10], and that it is also better suited for selecting cross-reacting peptides [11]. Accordingly, we have analyzed 31 refined protein structures to develop a method for predicting flexible segments from a given amino acid sequence.

The database used for the prediction of chain flexibility consisted of 31 proteins (given in Fig. 2) of known three-dimensional structure, as deposited in the Protein Data Bank, Brookhaven, USA. The selected protein structures had been refined with individual atomic temperature factors (i.e. B-values); they had more than 30 residues, their resolution was 0.3 nm or better, and they were at least 50% different in sequence from all other included proteins.
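The selection criteria for the structures can be illustrated with a minimal Python sketch. It is not the procedure actually used for the 31 proteins. The text does not state how sequence difference was measured or in which order candidates were compared, so the position-by-position identity in `crude_identity` and the greedy first-come order in `select_entries` are assumptions. The `Entry` record and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical record for one candidate structure."""
    name: str
    sequence: str            # one-letter amino acid sequence
    resolution_nm: float     # nominal resolution in nm
    has_individual_b: bool   # refined with individual atomic B-values?


def crude_identity(a: str, b: str) -> float:
    """Fraction of identical residues, compared position by position.

    Assumption: a stand-in for a real alignment-based identity; the
    shorter sequence length is used as the denominator.
    """
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / n


def select_entries(candidates, min_residues=31, max_resolution_nm=0.3,
                   max_identity=0.5):
    """Keep entries that satisfy the four criteria stated in the text."""
    kept = []
    for entry in candidates:
        if not entry.has_individual_b:
            continue  # no individual atomic temperature factors
        if len(entry.sequence) < min_residues:
            continue  # must have more than 30 residues
        if entry.resolution_nm > max_resolution_nm:
            continue  # resolution must be 0.3 nm or better
        # Must be at least 50% different from every entry already kept
        # (assumed greedy order: earlier candidates take precedence).
        if any(crude_identity(entry.sequence, k.sequence) > max_identity
               for k in kept):
            continue
        kept.append(entry)
    return kept
```

Because the selection is greedy, the result depends on the order of the candidates. An order-independent alternative would cluster the sequences first and then pick one representative per cluster.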