A predicted structure of the cytochrome c oxidase from Burkholderia pseudomallei
Mohd. Firdaus Mohd. Raih
Ahmad Tarmidi Sailan
Zulkeflie Zamrod
Mohd. Noor Embi
Rahmah Mohamed*
* Corresponding author
Financial support: This work was funded by the Intensification of Research in Priority Areas (IRPA) grants IRPA 01-02-02-001 and IRPA-TOPDOWN 09-02-02-T001 provided by the Ministry of Science, Technology and the Environment, Malaysia.
Burkholderia pseudomallei is the causative agent of melioidosis, a serious disease of humans and animals that occurs primarily in South East Asia, Northern Australia and other tropical areas (Dance, 2002). This pathogenic bacterium survives in diverse environmental conditions and secretes various extracellular products that have been implicated in the pathogenesis of the disease. In this paper, we summarize the identification of a cytochrome c oxidase gene in B. pseudomallei and the prediction of its tertiary structure from the predicted protein sequence using available protein structure prediction (comparative modeling) methods. The amino acid residues postulated to play the major roles in oxygen reduction and in the proton and electron transfer pathways were used as reference points to gauge the functional plausibility of the model. The sequence we have identified and its corresponding predicted structure represent the first functional annotation for this family of proteins in B. pseudomallei.
Genomic library preparation. A Burkholderia pseudomallei genomic DNA library was prepared in the pSV-SPORT1 vector (Gibco, BRL) and transformed into Escherichia coli strain DH5α. After screening with a heterologous oligonucleotide probe, a recombinant plasmid containing a 2.0 kb EcoRI insert was isolated. Plasmid preparations were carried out by standard procedures (Birnboim and Doly, 1979) and then subjected to automated DNA sequencing.
DNA sequencing. Automated DNA sequencing was carried out using the Taq DyeDeoxy™ Terminator Cycle Sequencing Kit with AmpliTaq® DNA Polymerase FS enzyme (Perkin Elmer, USA), and the products were electrophoresed on an ABI PRISM 377 automated DNA sequencer. The insert was sequenced by a primer walking strategy, initially with the universal primers USP6 and UT7 (Gibco, BRL) and then with four synthetic primer pairs. The complete ORF2 sequence has been deposited in the GenBank database (http://www.ncbi.nlm.nih.gov/Genbank/) under nucleotide accession number AF087002 (protein accession number AAF13732).
Gene prediction. Open reading frames were identified with the aid of DNAsis (Hitachi Software Engineering America Ltd.) and GeneMark ver. 2.0 (Lukashin and Borodovsky, 1998). Sequence database searches and comparisons were done with the BLAST family of programs (blastx, blastp) against the non-redundant protein, SWISS-PROT and PDB databases available from the BLAST website of the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/blast/) (Altschul et al. 1997). Multiple sequence alignments were done using ClustalW 1.8 (Thompson et al. 1994). PROSIS was used to calculate the amino acid composition encoded by the ORFs and some predicted properties of the individual proteins.
Protein structure prediction. The predicted protein sequence of B. pseudomallei Cox subunit I (AF087002) was submitted to several transmembrane (TM) prediction programs accessed via their web interfaces. The programs used were DAS (Cserzo et al. 1997; http://www.sbc.su.se/~miklos/DAS/), TOPPRED (Claros and von Heijne, 1994; http://bioweb.pasteur.fr/seqanal/interfaces/toppred.html), TMHMM (Krogh et al. 2001; http://www.cbs.dtu.dk/services/TMHMM-2.0/), Split (Juretic et al. 1999), MEMSAT (Jones et al. 1994; http://www.psipred.net) and SOSUI (Hirokawa et al. 1998; http://sosui.proteome.bio.tuat.ac.jp/cgi-bin/sosui.cgi?/sosui_submit.html). A survey evaluating TM region prediction programs (Moller et al. 2001) was used as a reference to gauge the accuracy and reliability of the TM prediction results. The 512-residue amino acid sequence was then submitted for automated prediction of secondary structure and protein fold. The programs were chosen based on the results of CAFASP 2 (Fischer et al. 2001; http://www.cs.bgu.ac.il/~dfischer/CAFASP2/) for fully automated protein structure prediction. The following programs were used via their web browser interfaces: GenThreader (Jones, 1999; http://www.psipred.net), PHD (Rost et al. 1994; http://cubic.bioc.columbia.edu/predictprotein/), bioinbgu (Fischer, 2000; http://www.cs.bgu.ac.il/~bioinbgu/) and 3D-PSSM (Kelley et al. 2000; http://www.sbg.bio.ic.ac.uk/~3dpssm/). These servers were accessed via a web directory of online protein structure prediction resources, the SPORes: Structure Prediction with Online Resources website (http://cgat.ukm.my/spores/). The structural topology was built from the secondary structure prediction data, while template selection was based on the fold prediction data. The target sequence was then aligned to the template sequence using the Homology module of InsightII (MSI, ver. 98). Manual editing for optimal alignment was done where deemed necessary.
The alignment was translated into tertiary structure using Modeller Release 6 (Sali and Blundell, 1993). The initial strain of the predicted structure was relieved by carrying out energy minimizations using Discover (MSI ver. 98) utilizing the CVFF force field. Short contacts were removed by manually rotating the side chains. Model evaluation was done using Procheck (Laskowski et al. 1993), Errat (Colovos and Yeates, 1993), What If (Vriend, 1990) and Verify3D (Luthy et al. 1992). Refinements of the structure and geometry optimizations, when necessary, were carried out using the Insight interface (MSI, ver. 98).
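The open reading frame search in the gene prediction step can be illustrated with a minimal Python sketch. It is not the algorithm used by DNAsis or GeneMark (GeneMark relies on statistical models of coding sequence); it only scans the three forward reading frames for an ATG followed by an in-frame stop codon. The function name `find_orfs`, the `min_length` parameter and the toy sequence are assumptions introduced here for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_length=300):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    start is the 0-based index of the ATG, end is the index just past
    the stop codon, and only ORFs of at least min_length bp are kept.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Step through the sequence one codon at a time in this frame.
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start >= min_length:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ATG in this frame
    return orfs


# Toy sequence (hypothetical, not from AF087002): one 18 bp ORF in frame 2.
toy = "CC" + "ATG" + "GCT" * 4 + "TAA" + "GG"
print(find_orfs(toy, min_length=18))
```

A real analysis would also scan the reverse complement and consider alternative start codons, which this sketch omits for brevity.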
Complete sequencing of the 2.0 kb DNA insert from the recombinant clone showed that it contained a 1536 bp ORF, designated Bp cox1, potentially encoding a cytochrome c oxidase subunit I termed Cox. A thorough search of available sequence databases showed that the putative B. pseudomallei Cox (AF087002) sequence is homologous to other known cytochrome c oxidases with varying levels of sequence identity and appears to have structural features similar to the largest subunit of the heme/copper-requiring cytochrome c and quinol oxidases (Figure 1 and Figure 2). BLAST queries with the predicted B. pseudomallei Cox protein sequence yielded eight other sequences with E values lower than 10⁻⁵⁰. All of these sequences were annotated as cytochrome c oxidases (6 sequences) or hypothetical cytochrome c oxidases (2 sequences) (Figure 1). Alignment of the predicted protein sequence against the PDB database also showed significant homology with the sequence of a solved crystal structure of the cytochrome c oxidase from Thermus thermophilus (PDB identification 1EHK; Soulimane et al. 2000). Multiple sequence alignments of the B. pseudomallei sequence with those from Paracoccus denitrificans, T. thermophilus and bovine heart mitochondria showed several regions of high conservation even though some of the sequences are phylogenetically distant (Figure 1 and Figure 2). Several of these highly conserved regions contain residues identified as crucial in other proposed models for electron transfer in cytochrome c oxidases. The Thermus thermophilus sequence exhibited 36% identity to the B. pseudomallei AF087002 sequence. Figure 1 further illustrates the observed sequence conservation by showing the alignment of the amino acid sequence of the B. pseudomallei cytochrome c oxidase with the other sequences chosen from the BLAST searches (non-redundant protein databases, SWISSPROT, PDB). The B. pseudomallei subunit is shorter at both termini than the subunits from these cytochrome c oxidases. The degree of conservation nevertheless remains clear, as shown by the number of residues that are identical across the aligned sequences. For clarity, the sequence alignment in Figure 2 shows only the segments that include the six histidines present in every representative of these enzymes. The six histidine residues and other conserved amino acids are placed in a similar pattern along the putative membrane-spanning hydrophobic segments.
Multiple methods of transmembrane prediction were used so that a consensus of predicted transmembrane helices could be obtained from differing approaches (data not shown). The TM region search for B. pseudomallei Cox subunit I revealed 12 possible TM helices (Figure 2). The objective of this step was to identify and map the transmembrane helices in order to confirm our sequence database based hypothesis that the putative sequence is a cytochrome c oxidase subunit I. Cytochrome c oxidases are known to have these transmembrane regions, and their identification served as a confirmatory step in gauging the validity of the predicted structure. The result of the transmembrane prediction step is consistent with subunit I of most other haem copper oxidases and supports the correctness of the protein fold generated by the Modeller program.
The fold prediction methods used proposed the crystal structure of a ba3-cytochrome c oxidase from Thermus thermophilus (PDB identification 1ehk) as a suitable template. The crystal structure of 1ehk (Soulimane et al. 2000) was solved at a relatively low resolution of 2.4 Å but was nevertheless selected as the template structure. The target sequence showed a sequence identity of 36% with the template sequence (Figure 3).
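As an illustration of how a figure such as the 36% target-template identity can be obtained from a pairwise alignment, the following Python sketch counts identical residues over the aligned (ungapped) columns. The denominator convention is an assumption: the text does not state whether identity was computed over aligned columns, the shorter sequence or the full alignment length, and these choices give different values. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy aligned fragments (hypothetical, not taken from Figure 3).
print(percent_identity("MKT-LV", "MRTALV"))  # 4 identical of 5 compared columns
```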
The 1ehk structure has an unusual property for proteins in the oxidase superfamily in that its subunit I contains a 13th TM helix instead of the usual 12 TM segments. The predicted structure was found to have an RMS fit of 3.1 Å to the template structure. The Ramachandran plot from the Procheck validation revealed five residues in the disallowed regions of the plot, while 80.7% of the residues were in the most favoured regions and the remaining residues were in the additional and generously allowed regions (Figure 3a). A check of the 1ehk template structure revealed two residues in the disallowed regions. Validation by the Errat and Verify3D programs showed that the predicted structure had a generally acceptable three-dimensional profile (Figure 3a; Figure 3b and Table 2). The evaluation methods used generally agreed on the correct threading of the backbone. Comparisons of the initial Errat and Verify3D results with those obtained after refinement and subsequent geometry optimisations showed marked improvements (results not shown). Regions of bad geometry were located mainly in regions where the target and template sequences were unaligned and in structurally variable regions (Figure 1 and Figure 2). Despite the phylogenetically distant relationship between the target and template sequences, the sequence-structure alignment yielded sufficient information on structurally conserved regions to enable a functionally plausible model to be generated. The predicted protein structure for B. pseudomallei Cox consists of 12 discernible TM helices (Figure 4). An initial assessment of the functional plausibility of the predicted protein fold was made by comparing the primary protein structure with the predicted tertiary structure.
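The RMS fit between model and template quoted above is a root-mean-square deviation over equivalent atoms. A minimal Python sketch of that calculation follows, assuming the two coordinate sets have already been superposed and list equivalent atoms (for example Cα atoms) in the same order; the optimal superposition step itself is not shown, and the atom selection used for the 3.1 Å value is not stated in the text. The function name and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two pre-superposed coordinate sets.

    Each argument is a sequence of (x, y, z) tuples in Angstrom, with
    equivalent atoms at the same index.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the pair of equivalent atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates: every atom displaced by 1 Angstrom along z.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
template = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(model, template))
```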
The functionally crucial residues, such as the haem and Cu ligands, the electron transfer pathway residues and the proton pathway residues, were found to lie close together in 3D space even though some of them are distant from each other in the primary structure. Furthermore, the structural placement of these residues was found to be similar to that in the template crystal structure. The overall structure shows a clear hydrophobic core with pores A, B and C, as described by Iwata et al. (1995), visible when viewed from the periplasmic side. The structure-based sequence alignment of subunit I (Figure 2) with other cytochrome c oxidase sequences, including the P. denitrificans and bovine heart oxidases, shows that functionally vital residues, such as the heme and Cu ligands and the residues proposed for electron transfer from CuA to the hemes, are conserved. In addition to these residues, a highly conserved motif (VLYTFYPP, located between Val84 and Pro91) can be discerned from the alignment, especially with T. thermophilus (Figure 3 and Figure 4). His236 in B. pseudomallei cox1 was postulated to be one ligand for CuB, comparable to His326 and His291 in P. denitrificans and bovine heart mitochondrion, respectively. This residue might form part of an electron transfer pathway from CuA directly to CuB. The current understanding of the oxygen reduction mechanism at the binuclear centre requires the input of at least one of the electrons via CuB (Hill, 1994; Michel et al. 1998). The electron transfer from CuA via haem a/b to heme a3 at the binuclear centre is well established (Hill, 1994), and the corresponding residues (Arg401, Arg400 and Phe338) are conserved in the B. pseudomallei cox1 (Table 1 and Figure 5). We propose the pathway via Tyr89, Trp183 and His236 as an additional electron transfer pathway that could be used for electrons provided from CuB to the catalytic oxygen intermediates.
Two possible proton transfer pathways, the K-pathway and the D-pathway, have been suggested based on the crystal structure of the P. denitrificans enzyme and in agreement with the results of site-directed mutagenesis (Garcia-Horsman et al. 1995). The shorter K-pathway leads to the binuclear centre via the highly conserved residues SU I-Thr351 and SU I-Tyr280, located in TMH VI and VIII, and the hydroxyl group of the heme a3 hydroxyethylfarnesyl chain (Michel et al. 1998). Two residues were postulated to have similar functions in Bp CoxI, namely SU I-Thr262 and Tyr190 (Table 1). However, no residues in Bp CoxI were found to correspond to those of the longer D-pathway described for Paracoccus sp.
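The residue correspondences discussed above (for example a Bp CoxI residue and its counterpart in the P. denitrificans enzyme) depend on translating residue numbers through the sequence alignment. The following Python sketch shows one way to do this, assuming both aligned sequences start at residue 1 unless stated otherwise. The function name, its parameters and the toy alignment are hypothetical and are not taken from Figure 2 or Table 1.

```python
def map_residue(aligned_query, aligned_ref, query_pos,
                gap="-", query_start=1, ref_start=1):
    """Map a query residue number to the reference through an alignment.

    Returns (query_residue, ref_residue, ref_pos); ref_residue and
    ref_pos are None when the query residue is aligned to a gap.
    """
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    q_num = query_start - 1
    r_num = ref_start - 1
    for q, r in zip(aligned_query, aligned_ref):
        # Advance each residue counter only on non-gap characters.
        if q != gap:
            q_num += 1
        if r != gap:
            r_num += 1
        if q != gap and q_num == query_pos:
            if r == gap:
                return q, None, None
            return q, r, r_num
    raise ValueError("query position not found in alignment")


# Toy alignment: the reference carries a two-residue N-terminal extension.
query = "--MHKY"
ref = "AGMH-Y"
print(map_residue(query, ref, 2))  # conserved His: query 2 maps to reference 4
print(map_residue(query, ref, 3))  # query Lys falls opposite a gap
```

Checking that the mapped residue type is conserved (for example His to His) is what allows a functional role to be proposed for the query residue by analogy with the reference enzyme.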
Computational facilities and resources used were based at the Bioinformatics Laboratory of the National Biotechnology and Bioinformatics Network (NBBnet), National Biotechnology Directorate (BIOTEK), Ministry of Science, Technology and the Environment, Malaysia.
ALTSCHUL, S.F.; MADDEN, T.L.; SCHÄFFER, A.A.; ZHANG, J.; ZHANG, Z.; MILLER, W. and LIPMAN, D.J. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research, September 1997, vol. 25, no. 17, p. 3389-3402.
BIRNBOIM, H.C. and DOLY, J. A rapid alkaline extraction procedure for screening recombinant plasmid DNA. Nucleic Acids Research, November 1979, vol. 7, no. 6, p. 1513-1523.
CLAROS, M.G. and VON HEIJNE, G. TopPred II: An improved software for membrane protein structure predictions. CABIOS, December 1994, vol. 10, no. 6, p. 685-686.
COLOVOS, C. and YEATES, T.O. Verification of protein structures: patterns of nonbonded atomic interactions. Protein Science, September 1993, vol. 2, p. 1511-1519.
CSERZO, M.; WALLIN, E.; SIMON, I.; VON HEIJNE, G. and ELOFSSON, A. Prediction of transmembrane alpha-helices in prokaryotic membrane proteins: the dense alignment surface method. Protein Engineering, June 1997, vol. 10, no. 6, p. 673-676.
DANCE, D.A.B. Melioidosis. Current Opinion in Infectious Diseases, April 2002, vol. 15, no. 2, p. 127-132.
FERGUSON-MILLER, S. and BABCOCK, G.T. Heme/copper terminal oxidases. Chemical Reviews, November 1996, vol. 96, no. 7, p. 2889-2907.
FISCHER, D.; ELOFSSON, A.; RYCHLEWSKI, L.; PAZOS, F.; VALENCIA, A.; ROST, B.; ORTIZ, A.R. and DUNBRACK, R.L. Jr. CAFASP2: The second critical assessment of fully automated structure prediction methods. Proteins, 2001, vol. 45, Suppl. 5, p. 171-183.
FISCHER, D. Hybrid fold recognition: Combining sequence derived properties with evolutionary information. In: Pacific Symposium on Biocomputing, (January 4-9 2000, Hawaii), World Scientific, p. 119-130.
GARCIA-HORSMAN, J.A.; PUUSTINEN, A.; GENNIS, R.B. and WIKSTRÖM, M. Proton transfer in cytochrome bo3 ubiquinol oxidase of Escherichia coli: second-site mutations in subunit I that restore proton pumping in the mutant Asp-135→Asn. Biochemistry, April 1995, vol. 34, no. 13, p. 4428-4433.
HILL, B.C. Modelling the sequence of electron transfer reactions in the single turnover of reduced, mammalian cytochrome c oxidase with oxygen. Journal of Biological Chemistry, January 1994, vol. 269, no. 4, p. 2419-2425.
HIROKAWA, T.; BOON-CHIENG, S. and MITAKU, S. SOSUI: classification and secondary structure prediction system for membrane proteins. Bioinformatics, 1998, vol. 14, no. 4, p. 378-379.
IWATA, S.; OSTERMEIER, C.; LUDWIG, B. and MICHEL, H. Structure at 2.8 Å resolution of cytochrome c oxidase from Paracoccus denitrificans. Nature, August 1995, vol. 376, no. 6542, p. 660-669.
JONES, D.T. GenTHREADER: an efficient and reliable protein fold recognition method for genomic sequences. Journal of Molecular Biology, April 1999, vol. 287, no. 4, p. 797-815.
JONES, D.T.; TAYLOR, W.R. and THORNTON, J.M. A model recognition approach to the prediction of all-helical membrane protein structure and topology. Biochemistry, March 1994, vol. 33, no. 10, p. 3038-3049.
JURETIC, D.; JERONCIC, A. and ZUCIC, D. Sequence analysis of membrane proteins with the web server SPLIT. Croatica Chemica Acta, 1999, vol. 72, no. 4, p. 975-997.
KELLEY, L.A.; MACCALLUM, R.M. and STERNBERG, M.J.E. Enhanced genome annotation using structural profiles in the program 3D-PSSM. Journal of Molecular Biology, June 2000, vol. 299, no. 2, p. 501-522.
KROGH, A.; LARSSON, B.; VON HEIJNE, G. and SONNHAMMER, E.L.L. Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes. Journal of Molecular Biology, January 2001, vol. 305, no. 3, p. 567-580.
LASKOWSKI, R.A.; MACARTHUR, M.W.; MOSS, D.S. and THORNTON, J.M. PROCHECK: a program to check the stereochemical quality of protein structures. Journal of Applied Crystallography, 1993, vol. 26, p. 283-291.
LUKASHIN, A. and BORODOVSKY, M. GeneMark.hmm: new solutions for gene finding. Nucleic Acids Research, February 1998, vol. 26, no. 4, p. 1107-1115.
LUTHY, R.; BOWIE, J.U. and EISENBERG, D. Assessment of protein models with three-dimensional profiles. Nature, March 1992, vol. 356, p. 83-85.
MICHEL, H.; BEHR, J.; HARRENGA, A. and KANNT, A. Cytochrome c oxidase: structure and spectroscopy. Annual Reviews Biophysics and Biomolecular Structure, 1998, vol. 27, p. 329-356.
MOLLER, S.; CRONING, M.D. and APWEILER, R. Evaluation of methods for the prediction of membrane spanning regions. Bioinformatics, July 2001, vol. 17, no. 7, p. 646-653.
ROST, B.; SANDER, C. and SCHNEIDER, R. PHD-an automatic server for protein secondary structure prediction. CABIOS, February 1994, vol. 10, no. 1, p. 53-60.
SALI, A. and BLUNDELL, T.L. Comparative protein modelling by satisfaction of spatial restraints. Journal of Molecular Biology, December 1993, vol. 234, p. 779-815.
SOULIMANE, T.; BUSE, G.; BOURENKOV, G.P.; BARTUNIK, H.D.; HUBER, R. and THAN, M.E. Structure and mechanism of the aberrant ba3-cytochrome c oxidase from Thermus thermophilus. European Molecular Biology Organization Journal, April 2000, vol. 19, no. 8, p. 1766-1776.
THOMPSON, J.D.; HIGGINS, D.G. and GIBSON, T.J. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, positions-specific gap penalties and weight matrix choice. Nucleic Acids Research, November 1994, vol. 22, p. 4673-4680.
VRIEND, G. What if: a molecular modelling and drug design program. Journal of Molecular Graphics, March 1990, vol. 8, p. 52-56.
WITT, H. and LUDWIG, B. Isolation, analysis, and deletion of the gene coding for subunit IV of cytochrome c oxidase in Paracoccus denitrificans. Journal of Biological Chemistry, February 1997, vol. 272, no. 9, p. 5514-5517.
Note: Electronic Journal of Biotechnology is not responsible if on-line references cited on manuscripts are not available any more after the date of publication.