Cookbook For Eukaryotic Protein Expression: Yeast, Insect, and Plant Expression Systems

By Chris Smith | November 9, 1998

Date: November 9, 1998
In the recent past, efforts to elucidate the relationship between protein structure and biological function have intensified. Of particular interest is an understanding of the elements of sequence and structure that mediate specific functions. Often the protein of interest is in low abundance in its natural source and can be difficult to purify or unstable, subject to proteolytic cleavage or to unfolding and nonspecific refolding during exhaustive purification. In other investigations, the examination of the involvement of specific residues in protein structure-function is hampered by the limitations of genetically modified proteins produced in bacterial systems. Consequently, successful, detailed studies have, in many cases, required the (over)production of biologically functional proteins. Modern advances in molecular biology and biochemistry can fulfill this need through the use of nonnative eukaryotic systems for the synthesis of recombinant proteins, that is, for the production of a "cloned" protein that exhibits all the three-dimensional, posttranslational, and functional features of the native protein.

The pESC family of yeast expression vectors from Stratagene.
While prokaryotic expression vectors (reviewed in the September 1, 1997, issue of The Scientist) provide a convenient system to synthesize eukaryotic proteins, when made in this fashion, proteins lack many of the immunogenic properties, 3D conformation, and other features exhibited by authentic eukaryotic proteins. Eukaryotic expression systems, including mammalian, amphibian, plant, insect, and yeast, overcome many of these limitations. The importance of eukaryotic expression systems is exemplified in the June 1998 issue of the journal Protein Expression and Purification, where over 50 percent of the research papers on protein expression concerned recombinant protein expression in eukaryotic cells. Mammalian cell expression systems, reviewed in the February 2, 1998, issue of The Scientist, can be effective expression tools, but they can be plagued by difficulties in purifying recombinant proteins, limitations on the size of the recombinant protein expressed, and mechanism(s) of protein expression induction. Many of these obstacles have been bridged using expression systems from insect and yeast cells.

The use of nonmammalian hosts has several generic advantages. The genetics and biochemistry of some hosts (yeast, for example) are well known. These expression systems carry out glycosylation that ranges from near-native to hyperglycosylation and use well-defined secretory pathways for extracellular export of the recombinant gene product. These hosts are generally regarded as safe since they are not known pathogens of mammalian species, so when used to make recombinant proteins, even cytotoxic or oncogenic products, they do not pose a threat to human handlers. Finally, these hosts can be grown to high yields using low-cost media.

The use of yeast and insect expression systems involves a cornucopia of very distinct biological, biochemical, and genetic elements and a host of unique research techniques to utilize these elements. Collectively, the science and technology of insect and yeast expression systems comprise a breadbasket of research opportunities. This process in many ways best represents how seemingly simple research tools, when used in novel ways, can provide the means to unravel more complex biological systems. Hence, this synopsis serves as a cookbook for heterologous protein expression in a variety of nonmammalian biological organisms. In essence, these systems constitute a marriage of yeast, insect, and viral biology with basic molecular biology and biochemistry.

Yeast hosts that can be used for expression studies include Saccharomyces cerevisiae, Schizosaccharomyces pombe, Pichia pastoris, Hansenula polymorpha, Kluyveromyces lactis, and Yarrowia lipolytica. The first three are the most widely used. The success of yeast as a vehicle for high-level expression of heterologous protein can be attributed to the large body of knowledge that has been collected on this organism. The budding yeast, S. cerevisiae, is perhaps the most extensively studied nonmammalian eukaryote; its genetics, biochemistry, and life cycle are well known. The fruits of this research are a multitude of strains and an extensive knowledge base. Technical advances, especially lithium acetate and electroporation-mediated transformation of intact yeast cells and the creation of 2µ yeast episomal plasmids, have provided the complete tool set required for successful yeast expression cloning. Advanced technologies and a thorough understanding of yeast biology have hastened the acceptance of this system for expression studies.

Yeasts are particularly attractive as expression hosts for a number of reasons. They can be rapidly grown on minimal (inexpensive) media. Recombinants can be easily selected by complementation, using any one of a number of selectable (complementation) markers. Expressed proteins can be specifically engineered for cytoplasmic localization or for extracellular export. And finally yeasts are exceedingly well suited for large-scale fermentation to produce large quantities of heterologous protein. Classical studies in yeast genetics have generated a wide array of potential cloning vectors and, in the process, defined which plasmid and host genomic sequences are important in expression technology.

Wild-type yeasts are prototrophic, that is, they are nutritionally self-sufficient, capable of growing on minimal media. Classical genetic studies have created auxotrophic strains--those that require specific nutritional supplements to grow in minimal media. The nutritional requirements of the auxotrophic strains are the basis for selection of successfully transformed strains. By including a gene in the plasmid expression cassette that complements one or more defective genes in the host auxotroph, one can easily select recombinants on minimal media. Hence strains requiring leucine (leu2-) will grow on minimal media if they harbor a plasmid expressing the LEU2 gene. The most commonly used selectable markers found in yeast are for leucine (LEU2), uracil (URA3), histidine (HIS3), and tryptophan (TRP1) deficiencies.
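The selection logic described above can be sketched in a few lines of Python. This is only an illustration: the marker genes and nutrients come from the text, but the data model and the function names (required_supplements, grows_on) are assumptions made for this example.

```python
# Marker gene -> nutrient that a strain defective in that gene must be fed.
# The pairs come from the text; the data layout is an illustrative assumption.
MARKER_TO_SUPPLEMENT = {
    "LEU2": "leucine",
    "URA3": "uracil",
    "HIS3": "histidine",
    "TRP1": "tryptophan",
}


def required_supplements(host_defects, plasmid_markers=()):
    """Return the nutrients a strain still needs on minimal medium.

    host_defects: marker genes that are defective in the host auxotroph.
    plasmid_markers: functional marker genes carried on the plasmid.
    """
    uncomplemented = set(host_defects) - set(plasmid_markers)
    return sorted(MARKER_TO_SUPPLEMENT[gene] for gene in uncomplemented)


def grows_on(host_defects, plasmid_markers, medium_supplements):
    """True if the medium supplies every nutrient the strain cannot make."""
    needed = required_supplements(host_defects, plasmid_markers)
    return set(needed) <= set(medium_supplements)


# A leu2- ura3- host plated on minimal medium supplemented with uracil only.
host = {"LEU2", "URA3"}
print(grows_on(host, [], ["uracil"]))        # untransformed host
print(grows_on(host, ["LEU2"], ["uracil"]))  # host carrying a LEU2 plasmid
```

Only the cells that took up the LEU2 plasmid satisfy the growth test on medium lacking leucine, which is the basis of complementation selection.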

While the number and variety of S. cerevisiae and S. pombe strains possessing nutritional selectable markers make these yeasts attractive hosts, these two hosts are not without their limitations. Hyperglycosylation; weak, poorly regulated promoters; and biomass fermentation drawbacks have tempered their universal use. Many of these and other problems are circumvented using their cousin, P. pastoris. K. lactis and Y. lipolytica have been extensively utilized in the industrial-scale production of metabolites and native proteins (for example, ß-galactosidase). Vector-host genetic incompatibilities and a relatively undefined biology have limited their use as heterologous expression hosts.

The methylotrophic yeasts H. polymorpha and, to a greater extent, P. pastoris--unique in that they will grow using methanol as the sole carbon source--are becoming favorite alternative expression hosts for many researchers. P. pastoris has produced some of the highest heterologous protein yields to date (12 g/L fermentation culture), 10- to 100-fold higher than in S. cerevisiae. In P. pastoris, growth in methanol is mediated by alcohol oxidase, an enzyme whose de novo synthesis is tightly regulated by the alcohol oxidase promoter (AOX1). The enzyme has a very low specific activity. To compensate for this, it is overproduced, accounting for more than 30 percent of total soluble protein in methanol-induced cells. Thus, by engineering a heterologous protein gene downstream of the genomic AOX1 promoter, one can induce its overproduction. This is the basis for the P. pastoris expression system. H. polymorpha produces the methanol oxidase (MOX) protein under control of the MOX1 promoter. A complete P. pastoris expression system is available from Invitrogen (Carlsbad, Calif.).

Most yeast vectors for protein expression contain one or more of these basic elements: the S. cerevisiae 2µ plasmid origin of replication, a ColE1 element, an antibiotic resistance "marker" gene (to aid development and screening of plasmid constructs in E. coli), a heterologous (constitutive or inducible) promoter, a termination signal, a signal sequence (encoding secretion leader peptides), and occasionally fusion protein genes (to facilitate purification).

Constitutive gene expression by the yeast plasmid cassette is commonly mediated (in S. cerevisiae and S. pombe) by the promoters of genes encoding the glycolytic enzymes glyceraldehyde-3-phosphate dehydrogenase (TDH3), triose phosphate isomerase (TPI1), or phosphoglycerate kinase (PGK1). Protein expression can also be regulated (induced) using the alcohol dehydrogenase isozyme II (ADH2) gene promoter (glucose-repressed), glucocorticoid responsive elements (GREs, induced with deoxycorticosterone), GAL1 and GAL10 promoters (which control galactose utilization pathway enzymes and are glucose-repressed and galactose-induced), the metallothionein promoter from the CUP1 gene (induced by copper sulfate), and the PHO5 promoter (induced by phosphate limitation). Most native yeast gene termination signals, when included in the plasmid expression cassette, will provide proper termination of RNA transcripts. The most commonly used are terminator signals for the MF-alpha-1, TPI1, CYC1, and PGK1 genes.
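For quick reference, the promoter choices listed above can be collected into a small lookup table. The following Python sketch simply transcribes the regulation described in this paragraph; the dictionary layout and the helper names (promoters_by_mode, promoters_responding_to) are illustrative assumptions and do not correspond to any supplier's software.

```python
# Promoter -> (mode, inducing condition, repressing condition), transcribed
# from the paragraph above. The table layout is an illustrative assumption.
PROMOTERS = {
    "TDH3": ("constitutive", None, None),
    "TPI1": ("constitutive", None, None),
    "PGK1": ("constitutive", None, None),
    "ADH2": ("regulated", None, "glucose"),
    "GRE": ("regulated", "deoxycorticosterone", None),
    "GAL1": ("regulated", "galactose", "glucose"),
    "GAL10": ("regulated", "galactose", "glucose"),
    "CUP1": ("regulated", "copper sulfate", None),
    "PHO5": ("regulated", "phosphate limitation", None),
}


def promoters_by_mode(mode):
    """List promoters that are 'constitutive' or 'regulated'."""
    return sorted(name for name, (m, _, _) in PROMOTERS.items() if m == mode)


def promoters_responding_to(condition):
    """List promoters that are induced or repressed by the given condition."""
    return sorted(
        name
        for name, (_, induced_by, repressed_by) in PROMOTERS.items()
        if condition in (induced_by, repressed_by)
    )


print(promoters_by_mode("constitutive"))
print(promoters_responding_to("glucose"))
print(promoters_responding_to("copper sulfate"))
```

A table like this makes it easy to check, for instance, which promoters will be affected if the growth medium contains glucose.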

A bioresearcher can usually create a yeast expression system from "scratch," using technical information and protocols described in resources such as Current Protocols in Biology Vols 1 and 2, 1996, John Wiley & Sons, and assembling the numerous ingredients from a variety of biosuppliers. Or better yet, purchase the "Betty Crocker" version, a complete packaged system, from CLONTECH Laboratories, Inc. (Palo Alto, Calif.), Invitrogen (Carlsbad, Calif.), or Stratagene (La Jolla, Calif.).

The mainstay yeast expression product available from CLONTECH Laboratories is the YEXpress™ Yeast Expression System, which features the pYEX 4T family of vectors for S. cerevisiae hosts. The heterologous protein gene can be engineered in any of three reading frames, expression is Cu++ inducible, and the cytoplasmically located protein product has a glutathione-S-transferase (GST) fusion peptide to simplify purification by affinity chromatography. Similarly, a fusionless heterologous protein can be generated from the pYEX-BX vector (parent of the pYEX 4T family). Completing this expression vector suite is pYEX-S1, which yields constitutively (PGK promoter) expressed and extracellularly exported (mediated by an N-terminal K. lactis secretory fusion peptide) recombinant protein.
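Vector families offered in three reading frames exist so that the inserted gene can be read in frame with a fusion tag such as GST. The arithmetic behind that choice can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the vector names and offsets are hypothetical and are not taken from any supplier's vector maps, and the function name choose_frame_vector is invented for this example.

```python
def choose_frame_vector(bases_before_orf, vector_offsets):
    """Pick the vector that keeps the insert in frame with an upstream tag.

    bases_before_orf: number of insert bases lying between the cloning
        junction and the first base of the insert's first codon.
    vector_offsets: maps a vector name to the number of bases between the
        last complete tag codon and the cloning junction (0, 1, or 2).

    The insert is in frame when the total number of bases between the tag's
    last codon and the insert's first codon is a multiple of three.
    """
    for name, offset in vector_offsets.items():
        if (offset + bases_before_orf) % 3 == 0:
            return name
    return None


# Hypothetical three-frame vector family (names and offsets are assumptions).
family = {"vector-A": 0, "vector-B": 1, "vector-C": 2}

# An insert with 4 bases of linker ahead of its first codon.
print(choose_frame_vector(4, family))
```

Because the three vectors differ by one base each, exactly one of them restores the reading frame for any given insert.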

Invitrogen distributes an expression vector for S. cerevisiae and a complete system for P. pastoris expression cloning. The principal vector for the budding yeast hosts is pYES2. This vector harbors the tightly regulated GAL1 promoter for galactose-induced expression and the URA3 gene for complementation selection in the S. cerevisiae INVSc1 strain (ura3-52). The Easy Select Pichia Expression Kit includes vectors (pPICZ series), P. pastoris strains, reagents for transformation, sequencing primers, media, and a comprehensive manual. Researchers can clone their protein gene in any reading frame using one of three different vectors, select recombinants by Zeocin resistance, induce protein expression with methanol (sole carbon source), and identify expression using antibody to a C-terminal c-myc peptide tag. These vectors also harbor an S. cerevisiae alpha-factor secretion signal and a polyHIS-encoding element; thus, expressed protein is easily recovered from culture supernatant and purified using metal-chelate chromatography (for example, ProBond resin packaged in Invitrogen's Xpress Purification System).

The principal yeast expression products available from Stratagene are the ESP™ Yeast Protein Expression and Purification Systems for S. pombe and pESC vectors for S. cerevisiae hosts. The pESC vector system is a new product (introduced September 15, 1998) that allows for the coexpression of two different proteins (under control of the GAL1 and GAL10 promoters) from the same construct. This can be an invaluable tool in the study of protein-protein interactions or in cases where functional activity is dependent on a heteromeric protein complex formation. Each of the pESC vectors harbors one of four different yeast-complementation selectable markers (HIS3, TRP1, LEU2, or URA3) and two different epitopes (FLAG® and c-myc) that can be engineered as C- or N-terminal tags. These specific tags, for which there are readily available and well-characterized antibodies, provide another unique mechanism with which to study individual protein products and/or complexes. The ESP™ system includes a series of vectors (pESP) differing in MCS (restriction enzyme selection and arrangement), but all harboring the nmt1 promoter (protein expression induced in the absence of thiamine), and a GST peptide tag to facilitate single step purification of recovered protein (culture extracts) on GST-affinity chromatography resin. In addition, ESP LIC (Ligation Independent Cloning) cloning kits provide a mechanism to clone PCR products directly into pESP vectors.

Eukaryotic expression systems employing insect cell hosts are based upon one of two vector types: plasmid or plasmid-virion hybrids. Although the latter is the more commonly used, plasmid-based systems offer methodological advantages. The typical insect host is the common fruit fly, Drosophila melanogaster, encountered in practically any classical genetics laboratory. Other insect hosts include mosquito (Aedes albopictus), fall army worm (Spodoptera frugiperda), cabbage looper (Trichoplusia ni), salt marsh caterpillar (Estigmene acrea) and silkworm (Bombyx mori). In almost all cases, heterologous protein overexpression occurs in suspension cell cultures. The exception, and one of the advantages of plasmid-virion systems, is that the recombinant virus may also be injected into larval host hemocoel or literally fed to the mature host.

Plasmid-based vector systems provide a mechanism for both transient and long-term expression of recombinant protein. This expression system is exemplified by the Drosophila Expression System (DES) available from Invitrogen. The transfection of competent D. melanogaster cells with engineered plasmid will mediate the transient (2-7 days) expression of heterologous protein. Establishing transformed cells that will express protein for longer time periods requires that the host cells be cotransfected with a "selection" vector, which results in the stable integration of the expression cassette into the host genome. This system offers two advantages over plasmid-virion systems: methodological simplicity, saving the researcher time, effort, and materials; and a choice of expression regimes, constitutive or inducible. Constitutive expression is mediated using the Ac5 Drosophila promoter, whereas a metallothionein promoter guides copper-inducible expression. The DES vectors are designed with multiple cloning sites for insertion of the heterologous protein gene in any of three reading frames. A choice of vectors also provides for the expression of a variety of C-terminal fusion tags: V5 epitope for identification of expressed protein with V5 epitope antibody, polyhistidine peptide for simplified purification with metal chelate affinity resin, and the BiP secretion leader peptide. The DES system also includes media for maintenance of the host cell line and expression of protein, as well as reagents to facilitate transfection.

Novagen's S-protein-FITC staining of Sf9 cells expressing S-Tag™ fusion proteins using BacVector™ recombinant baculovirus 48 hours after infection.
Inset Photo - Uninfected Sf9 cells.

Novagen's pIE vectors are based on the baculovirus immediate early promoter ie-1. These plasmids can be used with G418 selection to generate stable cell lines from Sf9 or Sf21 cell lines.

The plasmid-virion system is based upon the large, double-stranded DNA virus, baculovirus. The Autographa californica (alfalfa looper) nuclear polyhedrosis virus (AcNPV) virion is the most common source of the "expression cassette" for this system. A lesser-used source, at least in terms of commercial products, is the Bombyx mori (silkworm) NPV virion (BmNPV). There are several unique features and advantages of the baculovirus-insect expression system. Chief among these is the large native size of the viral genome. In the expression cassettes, much of the native genome (unnecessary elements) is removed, which provides for the potential insertion of a large heterologous protein gene or several genes (under their own promoters [multipromoter cassette]). This enables the expression of large proteins and/or the various protein components of large hetero-oligomeric complexes. The virion has a broad host range, so researchers can use any of a number of established insect cell lines for overproduction of recombinant protein or inject larval host hemocoel for in situ studies.

The baculovirus expression cassette contains all the genetic information needed for propagation of progeny virus, so no helper virus is needed in the transfection process. The biology of the virus provides a simple means, using plaque morphology, to identify transformed host cells. The virus does not appear to be transmissible to vertebrate species; therefore, this virus-based system is safe for human handlers. Because in many virus vectors the heterologous protein genes are under the control of the late-stage baculovirus p10 and polyhedrin promoters, recombinant protein is, in most cases, the sole product produced. Hence, cells infected with virus carrying the expression cassette can produce relatively high amounts of heterologous protein. Most of this protein is easily extracted from the cytoplasm (no inclusion bodies characteristic of prokaryotic systems) or harvested from extracellular culture filtrate (when the expression cassette includes a secretory leader fusion peptide engineered to the recombinant protein). However, the cell machinery may be starting to shut down late in infection, which can affect, in particular, proteins requiring processing. Hence, some companies have introduced viral vectors with hybrid early/late promoters that permit the still-functioning cell to process glycosylated or secreted proteins.

The process of creating and expressing heterologous protein with the plasmid-virion system is rather straightforward, in theory, but does require a bit of technical finesse and close attention to detail. The process begins with the engineering of the heterologous protein gene into a "transfer plasmid." This plasmid contains all the elements for autonomous replication in Escherichia coli, a bacterial selection marker (usually an ampicillin resistance gene), and elements of the baculovirus genome. The heterologous protein gene is inserted in a specific orientation and location into the plasmid so it is flanked by elements of the baculovirus genome. Successfully engineered plasmids are then cotransfected with viral expression vector (essentially wild-type baculovirus DNA with p10 and/or polyhedrin genes removed) into permissive host cells. Cell-mediated double recombination between viral sequences flanking the heterologous protein gene and the corresponding sequences of the viral expression vector results in the incorporation of the heterologous protein gene into the viral genome. Hence, recombinant progeny viruses will produce heterologous protein late in their life cycle.
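The double-recombination step can be pictured as a sequence swap between two shared flanking regions. The Python sketch below is a toy model only: real flanks are long stretches of baculovirus sequence, the recombination happens inside the insect cell, and the short sequences and the function name double_recombination are invented for illustration.

```python
def double_recombination(viral_genome, transfer_plasmid, left_flank, right_flank):
    """Model crossover at two shared flanks as a sequence replacement.

    Whatever lies between the flanks in the viral genome is replaced by
    whatever lies between the same flanks in the transfer plasmid.
    """

    def region_between_flanks(seq):
        # Locate the span that starts after the left flank and ends
        # just before the next occurrence of the right flank.
        start = seq.find(left_flank)
        if start == -1:
            return None
        start += len(left_flank)
        end = seq.find(right_flank, start)
        if end == -1:
            return None
        return start, end

    viral_span = region_between_flanks(viral_genome)
    plasmid_span = region_between_flanks(transfer_plasmid)
    if viral_span is None or plasmid_span is None:
        raise ValueError("both flanks must be present in both molecules")

    insert = transfer_plasmid[plasmid_span[0]:plasmid_span[1]]
    return viral_genome[:viral_span[0]] + insert + viral_genome[viral_span[1]:]


# Toy sequences (assumptions): short flanks and a 9-base stand-in "gene".
LEFT, RIGHT = "GATTACA", "CATTAG"
virus = "AAA" + LEFT + "TT" + RIGHT + "CCC"           # deleted polyhedrin locus
plasmid = "GGG" + LEFT + "ATGGCTTAA" + RIGHT + "GGG"  # gene between the flanks

recombinant = double_recombination(virus, plasmid, LEFT, RIGHT)
print(recombinant)
```

The recombinant genome keeps the viral backbone on both sides of the flanks and carries the heterologous gene between them, which is why progeny virus expresses the protein from the viral promoter.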

The variety of commercially available insect-baculovirus expression systems, all of them proven in the research arena, makes it a very difficult task not to select one (as opposed to developing a system from "scratch") for a given protein expression need. Some of the major "chefs" in this regard include CLONTECH, Invitrogen, Life Technologies, Novagen, Pharmingen, Quantum Biotechnologies, and Stratagene.

CLONTECH offers a variety of baculovirus-insect expression vectors that are packaged as part of its BacPAK Baculovirus Expression System. The pBacPAK1, 2, 3 series of transfer vectors offer cloning in all three reading frames under the AcNPV polyhedrin promoter. Coexpression of two proteins from the same expression cassette under the polyhedrin and p10 promoters is also possible with the pAcUW31 vector. In addition to the transfer vectors, the system also includes pBacPAK6 Viral DNA for generation of target gene carrying recombinant virus.

The baculovirus expression system marketed by Invitrogen, MaxBac, provides transfer vectors for the generation of single-gene expression cassettes. Three different versions of the single-gene vectors are available to enable insertion of the heterologous protein gene into the correct reading frame. In addition, a unique secretory peptide (honeybee melittin) gene is available in the pMelBac vector. Host cells for the MaxBac system include S. frugiperda Sf9 and Sf21 cell lines and cabbage looper (T. ni) cell lines.

Life Technologies' principal insect-baculovirus product is the BAC-TO-BAC Baculovirus Expression System. This system is unique in that the generation of recombinant baculovirus relies on site-specific transposition (between transfer and expression vector) in E. coli, as opposed to homologous recombination in insect host cells. Transposition between bacmid (a form of baculovirus genome that replicates in E. coli), bMON14272, and transfer vector in E. coli DH10Bac produces a recombinant bacmid DNA. The transfection of insect host cells with bacmid DNA produces recombinant baculovirus, from which heterologous protein is expressed. The basis of this system is the pFASTBAC transfer vector, which contains the AcNPV polyhedrin promoter, with (pFASTBAC HT) or without a polyhistidine encoding gene, and the bacmid-containing E. coli host, DH10Bac. A dual-promoter transfer vector, pFASTBAC Dual, is also available for coexpression of two heterologous proteins, one under the control of the polyhedrin promoter and the other under the p10 promoter.

Novagen's BacVector System is one of the most comprehensive and versatile systems available, providing over 30 different transfer vectors (pBac) and 3 different baculovirus expression vectors (BacVector). Many baculovirus expression vectors have a deleted polyhedrin gene, but Novagen has gone one step further. The BacVector-2000 lacks polyhedrin and several additional non-essential genes. The BacVector-3000 is similar to the BacVector-2000, but also lacks protease and chitinase genes, which reduces degradation of expressed proteins and decreases cell lysis. Novagen's transfer vectors include positive screening with gus, genes for N- and C-terminal peptide tags (cellulose binding domain [CBD], polyhistidine [HIS6], S-Tag™) to facilitate identification and purification, and secretory leader peptide (gp64 secretory leader) to direct extracellular export of the expressed protein product. There is also a choice of early, early/late, or very late (polyhedrin, p10, or gp64) promoters in the transfer vectors. A unique multipromoter transfer vector, pBAC4x-1, provides for the engineering of four target genes under the control of separate promoters (2 polyhedrin and 2 p10), enabling expression of four different proteins simultaneously. For virus surface display, Novagen's pBACsurf-1 incorporates a gp64 secretory signal peptide and anchoring sequences in fusions. The cloning of PCR products directly into transfer vectors is also possible with ligation-independent cloning-competent pBAC2, 7, and 8 vectors.

Pharmingen and Quantum Biotechnologies are other providers of baculovirus-insect expression systems. Pharmingen offers single- and dual-expression transfer vectors, with or without the glutathione S-transferase peptide tag. In addition, a ligation-independent cloning vector is also available for creating recombinant constructs from PCR products. Quantum Biotechnologies' primary product is the BacTen system, which provides both single- and dual-expression transfer vectors under the control of the p10 promoter. The pTenACE vector also possesses the Drosophila acetylcholine esterase signal peptide, to mediate extra-cellular export of the heterologous gene product through a secretory pathway.

The "flour" in this analogy refers to plant cells. Although they have not yet come into widespread use (relative to their insect or yeast cousins), Agrobacterium tumefaciens Ti plasmid-derived vectors have been used to express nonnative proteins in plant cells. Initial plant expression systems focused on the production of antibodies (U. Conrad and R. Fiedler, Plant Molecular Biology, 26:1023-30, 1994) and have since broadened to include other mammalian proteins, e.g., Leu-enkephalin and human serum albumin. Although the functional status of some of these proteins remained in question for some time, a recent report on the expression of "biologically active" human interleukins in tobacco cell suspension culture (N. S. Magnuson et al., Protein Expression and Purification, 13:45-58, 1998) indicates that plant systems are capable of producing functional, nonnative protein. Eukaryotic expression vectors for plant systems are not currently available from commercial suppliers, but some products related to the task may be purchased from a few biosuppliers, for example, A. tumefaciens LBA4404 electrocompetent cells from Life Technologies, Inc. In addition, the American Type Culture Collection (ATCC, Manassas, Va.), a biological culture (vector and organism) repository, maintains an excellent collection of plant cell hosts and a few potential plant expression vectors.

Additional eukaryotic expression information, including general theory, detailed protocols, technical tips, and vector and host descriptions, may be found in any of the resources listed in Table 3 ("Eukaryotic Expression Information Resources"). In addition, Invitrogen and Novagen provide excellent online documentation, textual and graphic, through a combination of HTML pages and PDF files (Adobe Acrobat readable). These corporate resources are invaluable, not only for specific data on their products but also generic information on expression technology, biological hosts, and a plethora of citations to relevant literature.

Baculovirus Expression Vectors

Author Chris Smith can be reached at
