We are considering expressing this protein in a bacterial strain. We should therefore further characterize the protein (and the nucleotide sequence encoding it) and see whether anything points to a specific bacterial strain.
  1. Are there rare codons in the nucleotide sequence?
  2. Does the protein contain S-S bonds?
  3. Are there any localization signals we should consider removing?
Let's run the following analyses:
Presence of rare codons using Rare Codon Caltor
Possible existence of S-S bonds using CysPred, which is part of PredictProtein.
Putative localization signals using SignalP and PSORT II
Putative phosphorylation and glycosylation signals using NetPhos, YinOYang, NetNGlyc and NetOGlyc
Check results of Rare Codon Caltor
Check results of CysPred (scroll down to the CYSPRED prediction marked in red)
(To see full links, you'll have to run your own analysis)
Check results of SignalP
Check results of PSORT II 
(To see full links, you'll have to run your own analysis)
Check results of Netphos
Check results of YinOYang
Check results of NetNGlyc
Check results of NetOGlyc

Examine all the results (~10-15 minutes), then continue.

Summary of Current Search

  1. Several rare codons exist within the nucleotide sequence, including three pairs of consecutive rare codons.
  2. Cysteines 257 and 261 are predicted to be part of S-S bonds, whereas cysteine 187 is predicted to have a free SH.
  3. SignalP (with the sequence defined as "eukaryotes") predicts a signal peptide cleavage site between residues 34 and 35. (Both the neural network (NN) and the hidden Markov model (HMM) methods give this prediction, which lends it more credibility.)
  4. PSORT II (with the sequence defined as "yeast/animal") predicts a cleavage site between residues 34 and 35 and a transmembrane domain at positions 175-191, and recognizes the C-terminus as a cytoplasmic tail. The overall prediction for this protein's localization is:

      39.1 %: cytoplasmic
      17.4 %: mitochondrial
      17.4 %: nuclear
      13.0 %: endoplasmic reticulum
       4.3 %: Golgi
       4.3 %: vesicles of secretory system
       4.3 %: peroxisomal
    Note that the literature shows that NS4B is located on the ER membrane, whereas the computational prediction gives this localization a rather low score (13.0 %)!
  5. Phosphorylation may occur on serines 88, 159, 164 (most probable); threonine 165; tyrosines 6, 223.
  6. O-glycosylation is predicted on no residue according to the NetOGlyc 3.0 server, but on serines 113, 227, 258 and threonines 165, 234 (most probable), 259 according to YinOYang 1.2.
  7. No N-glycosylation is predicted (according to NetNGlyc 1.0).
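
The rare-codon finding in item 1 can be illustrated with a minimal Python sketch. It assumes a small set of codons commonly cited as rare in E. coli (AGG, AGA, CGA, CTA, ATA, CCC); the Rare Codon Caltor may use a different list or threshold, so this is not a reimplementation of that tool. The function name and the toy sequence are hypothetical, and the toy sequence is not NS4B.

```python
# Assumed rare-codon set: codons commonly cited as rare in E. coli.
# The Rare Codon Caltor may use a different list or threshold.
RARE_CODONS = {"AGG", "AGA", "CGA", "CTA", "ATA", "CCC"}


def find_rare_codons(cds):
    """Scan an in-frame coding sequence for rare codons.

    Returns (rare, pairs):
      rare  - 1-based codon positions that hold a rare codon
      pairs - (p, p + 1) tuples where two rare codons are consecutive
    """
    seq = cds.upper().replace("U", "T")  # accept RNA or DNA input
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    rare = [i + 1 for i, codon in enumerate(codons) if codon in RARE_CODONS]
    rare_set = set(rare)
    pairs = [(p, p + 1) for p in rare if p + 1 in rare_set]
    return rare, pairs


# Toy sequence (hypothetical, not NS4B): ATG AGG AGA CTG ATA TAA
toy = "ATGAGGAGACTGATATAA"
rare, pairs = find_rare_codons(toy)
print("rare codon positions:", rare)
print("consecutive rare pairs:", pairs)
```

Consecutive rare codons are reported separately because the summary singles them out: the analysis found three such pairs in the real sequence.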


    Does this suggest anything as to the design of the expression/purification experiments?
    Read ALL answers and choose between "right" and "wrong".
    There may be more than one "right" answer.
    1. This sequence should be expressed in Codon Plus (RIL) bacterial strain.

    2. This sequence should be expressed in Rosetta-gami bacterial strain.

    3. If we truncate the C-terminus of the protein, it will have a better chance of being expressed as a soluble protein in bacteria.

    4. This sequence should be expressed in Glycoria bacterial strain, which allows excessive glycosylation.

    5. This sequence will be phosphorylated on serines 159 and 164 upon expression in any bacterial strain.

    6. Since this protein contains hydrophobic segments and is predicted to have a trans-membrane domain, we should express it in bacteria engineered to have more membranes (C41 or C43).

    7. I repeat my previous suggestion: this project is too complicated. Let's abandon it. (Better late than never!!)

    Have you examined all possibilities?

    Based on the computational analysis, we've decided to use the construct our collaborator sent us: pDEST15-NS4B, in which NS4B is fused to GST (so we expect a product of ~53 kDa). We've expressed it in two bacterial strains: Rosetta pLysS and Origami B pLysS. As a control we used another construct, pDEST15-GFP (a product of ~55 kDa is expected). In addition, we've tried to express this protein using Roche's RTS in vitro translation system, which is based on E. coli extracts. Here we used the control vector supplied by the manufacturer, pIVEX-His-GFP (a product of ~30 kDa is expected).
    If all three expression systems fail to express NS4B, we'll consider either truncating and re-cloning the protein or using an alternative expression system.
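
The expected band sizes quoted above can be sanity-checked with a rough mass estimate. The sketch below sums standard average residue masses and adds one water per chain. The function names are hypothetical, and the GST mass of roughly 26 kDa is an assumption (the text gives only the fusion total). Linker residues contributed by the vector are ignored, so the result is only approximate.

```python
# Standard average residue masses in Da (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.015  # Da, added once per chain for the free N- and C-termini

GST_KDA = 26.0  # assumed approximate mass of the GST tag, not from the text


def protein_mass_kda(seq):
    """Approximate average mass of a protein sequence in kDa.

    Raises KeyError on any character that is not one of the 20
    standard one-letter amino acid codes.
    """
    total = sum(RESIDUE_MASS[aa] for aa in seq.upper()) + WATER
    return total / 1000.0


def fusion_mass_kda(target_seq, tag_kda=GST_KDA):
    """Rough mass of a tag-target fusion; linker residues are ignored."""
    return tag_kda + protein_mass_kda(target_seq)


# Toy check on the dipeptide Gly-Gly (about 132 Da)
print(round(protein_mass_kda("GG") * 1000, 2))
```

Calling fusion_mass_kda with the actual NS4B protein sequence would give a figure to compare against the ~53 kDa expected on the gel.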

    First, let's examine the expression profile of the Origami strain.