Interfacing Between Chimera and Python Scripts


Our goals for the afternoon are to:
  • Model the sectors discovered by Dahirel et al. onto the HIV capsid structures
  • Reproduce portions of the figures and statistics in the modeling section of the Dahirel et al. paper
  • Use Python scripts to make our lives easier

Using Chimera as a library AND as a graphical interface

Open up a terminal window and try out the following commands:

chimera --help
 
chimera --nogui myscript.py
 
chimera --nogui --script "myscript.py arg1 arg2 ..."

The --nogui flag runs Chimera without its graphical interface, so nothing has to be drawn while your script works and the script will generally run faster. You can always remove this flag if you want visual confirmation of whatever you are trying to accomplish.
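The arg1 arg2 ... values given on the command line are meant to be picked up inside the script. Here is a minimal sketch of how that could look, assuming Chimera hands the script its arguments through sys.argv like an ordinary Python program. The helper name parse_args and the choice of arguments (a PDB file and an optional chain ID) are hypothetical and only for illustration:

```python
def parse_args(argv):
    """Return (pdb_path, chain_id) from an argv-style list.

    argv[0] is the script name; the chain ID is optional and defaults to 'A'.
    """
    if len(argv) < 2:
        raise ValueError("usage: myscript.py structure.pdb [chain]")
    pdb_path = argv[1]
    chain_id = argv[2] if len(argv) > 2 else 'A'
    return pdb_path, chain_id

# Example with a hand-built argument list; a real script would
# import sys and call parse_args(sys.argv) instead.
pdb_path, chain_id = parse_args(['myscript.py', '3GV2.pdb', 'B'])
print(pdb_path + ' chain ' + chain_id)
```

Keeping the parsing in its own function means it can be tried out in a plain Python shell before the script is run through Chimera.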

Places to go for help

As an alternative to writing scripts and running them from the terminal, you can also access a Python shell from within Chimera. Open up the IDLE window from within Chimera (Tools → General Controls → IDLE) and play around with it for a bit. Once you're done, download the 3GV2 structure to a directory on your computer as 3GV2.pdb, start Chimera from that directory, and enter the following code in the IDLE window:

template = chimera.openModels.open('3GV2.pdb')

Let's start off with something pretty simple, such as pulling the sequence of Chain A out of the model.

for r in template[0].residues:
    if r.id.chainId == 'A':
        print r.type, r.id

If we want to do something a bit more complex with the output, we're probably better off using a python script:

# 8_2_1.py
import chimera

template = chimera.openModels.open('3GV2.pdb')
for r in template[0].residues:
    if r.id.chainId == 'A':
        print r.type, r.id


Again, we run this code by using a chimera command from the terminal:
chimera --nogui 8_2_1.py

It's great that we can pull information out of Chimera, but the formatting is awful, and we would probably prefer the output in FASTA format. Here's a quick script that accomplishes that task:

# 8_2_2.py
import chimera

# Three-letter to one-letter amino acid codes (ASX and GLX are the ambiguous codes B and Z)
THREE_TO_ONE = {'ALA':'A','ARG':'R','ASN':'N','ASP':'D','ASX':'B','CYS':'C','GLU':'E','GLN':'Q','GLX':'Z','GLY':'G','HIS':'H','ILE':'I','LEU':'L','LYS':'K','MET':'M','PHE':'F','PRO':'P','SER':'S','THR':'T','TRP':'W','TYR':'Y','VAL':'V'}

def threelettertoone(ugly):
    # Convert a list of three-letter residue names into a one-letter string.
    # A name missing from the table (for example a ligand) raises a KeyError.
    out = ''
    for r in ugly:
        out += THREE_TO_ONE[r]
    return out

# Collect the three-letter residue names of chain A
uglyresi = []
template = chimera.openModels.open('3GV2.pdb')
for r in template[0].residues:
    if r.id.chainId == 'A':
        uglyresi.append(r.type)

# Write a single FASTA record: a header line followed by the sequence
tmp2 = threelettertoone(uglyresi)
fh = open('chainA.fasta', 'w')
fh.write(">Chain A\n")
fh.write(tmp2 + "\n")
fh.close()
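
FASTA files conventionally wrap the sequence at a fixed width rather than writing it as one long line. Here is a small sketch of that formatting step in plain Python, runnable without Chimera. The function name format_fasta, the default width of 60 columns, and the toy sequence are illustrative assumptions and not part of the script above:

```python
def format_fasta(header, sequence, width=60):
    """Return a FASTA record: a '>' header line, then the sequence wrapped at `width`."""
    lines = ['>' + header]
    for start in range(0, len(sequence), width):
        lines.append(sequence[start:start + width])
    return '\n'.join(lines) + '\n'

# Toy 10-letter sequence, wrapped at 4 columns to make the line breaks visible.
record = format_fasta('Chain A', 'ACDEFGHIKL', width=4)
print(record)
```

The returned string could be handed to fh.write() in place of the two separate write calls.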
 

We can now use this output file for an alignment against the Clade B HIV Consensus sequence (as described in Table S1) to find where the p24 protein lies in Gag:
Alignment of p24 Chain A against Consensus HIV Sequence
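
Once the alignment shows where p24 starts within Gag, converting between the two numbering schemes is a matter of adding or subtracting a constant. The sketch below illustrates the idea with an exact substring search on toy sequences. The function name find_offset and both sequences are made up for illustration, and a real alignment against the consensus sequence has to tolerate mismatches, which this shortcut does not:

```python
def find_offset(full_seq, fragment):
    """Return the number of residues in full_seq that precede fragment.

    Adding this offset to a 1-based position in fragment gives the
    1-based position in full_seq. Raises ValueError without an exact match.
    """
    index = full_seq.find(fragment)
    if index < 0:
        raise ValueError("fragment not found; an exact match is required")
    return index

# Toy sequences, not real Gag or p24 data.
full_seq = 'AAAACDEFGHAAA'
fragment = 'CDEFGH'
offset = find_offset(full_seq, fragment)

# Position 1 of the fragment corresponds to position 1 + offset of the full sequence.
print('offset = %d' % offset)
```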

For our last modeling exercise, we'll finally attempt to recreate the Dahirel et al. figures. Before we do that, you should look through the Chimera documentation on the selection module: Selections


Afternoon Exercise:
1) Use the selection module to select all residues in the 3GV2 structure, and run the script without the --nogui flag to see if it worked properly.
2) Write a Python script that iterates through the list of Sector 3 positions (gagsect3 below, given in Gag numbering), finds the corresponding residues in 3GV2, and selects only those residues. Save the selection from within the Chimera GUI using Select -> Name Selection.

gagsect1 =[1,2,3,4,5,6,8,9,11,12,14,16,19,20,21,24,27,29,32,33,35,36,38,39,41,45,48,50,51,52,57,60,63,73,77,79,83,86,87,88,94,87,99,100,108,118,120,122,123,128,129,131,133,134,135,136,138,139,140,141,143,144,145,148,149,150,151,152,153,154,155,156,158,160,251,276,279,433]
 
gagsect3 = [53,140,163,167,169,170,171,172,174,175,179,180,181,182,185,186,187,189,191,198,199,212,221,225,229,233,240,243,245,249,257,260,263,265,269,284,288,291,295,305,306,310,316,317,323,326,338,344,345,346,347,363,364,365,366,367]

3) Reproduce Figure 2A from the Dahirel et al. paper with a monomer of 3GV2 and with 3H47. Given what you know about these two structures, which do you think is the more informative figure?
4) Modify your code from (2) to work for Sector 1, and color those residues cyan. Compare this to Figure 2C.

Solutions:
1. [[code format="python"]]
import chimera
from chimera import selection

# Collect every residue of the first model
template = chimera.openModels.open('3GV2.pdb')
reslist = []
for r in template[0].residues:
    reslist.append(r)

# Put the residues into a selection object
sector = selection.ItemizedSelection()
sector.add(reslist)

# Merge it into the current selection shown in the GUI
op = selection.EXTEND
selection.mergeCurrent(op, sector)
[[code]]
 
2. [[code format="python"]]
import chimera
from chimera import selection

# Subtracting this offset from a Gag position gives the p24 position
offset = 132
gagsect1 =[1,2,3,4,5,6,8,9,11,12,14,16,19,20,21,24,27,29,32,33,35,36,38,39,41,45,48,50,51,52,57,60,63,73,77,79,83,86,87,88,94,87,99,100,108,118,120,122,123,128,129,131,133,134,135,136,138,139,140,141,143,144,145,148,149,150,151,152,153,154,155,156,158,160,251,276,279,433]
 
gagsect3 = [53,140,163,167,169,170,171,172,174,175,179,180,181,182,185,186,187,189,191,198,199,212,221,225,229,233,240,243,245,249,257,260,263,265,269,284,288,291,295,305,306,310,316,317,323,326,338,344,345,346,347,363,364,365,366,367]
 
def gagtop24(positions):
    # Keep the Gag positions that fall within p24 (133-364) and renumber them from 1
    out = []
    for item in positions:
        if 132 < item <= 364:
            out.append(item - offset)
    return out
 
p24sect1 = gagtop24(gagsect1)
p24sect3 = gagtop24(gagsect3)
#print p24sect1
#print p24sect3
 
op = selection.EXTEND
sector = selection.ItemizedSelection()
 
template = chimera.openModels.open('3GV2.pdb')
reslist = []
# This selects Sector 1 (exercise 4). Change "p24sect1" to "p24sect3" to select
# the residues of Sector 3 instead (exercise 2).
wanted = set(p24sect1)
for r in template[0].residues:
    # r.id.position is the residue number as an integer
    if r.id.position in wanted:
        reslist.append(r)
 
sector.add(reslist)
selection.mergeCurrent(op, sector)
[[code]]

3. 3H47 is probably more informative, as it's a 1.8 Å structure rather than a 7 Å structure.

4. It looks the same!
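
For the coloring step in exercise 4, one possible approach is to turn the residue numbers into a Chimera atom specification and pass it to the color command. The helper below (residue_spec, a hypothetical name) only builds the specification string, so it runs without Chimera. The commented-out last line shows how the string might be handed to chimera.runCommand, and the residue numbers are placeholders rather than real sector positions:

```python
def residue_spec(positions, chain_id):
    """Build a Chimera-style atom spec such as ':5.A,9.A' from residue numbers."""
    unique = sorted(set(positions))  # drop repeated numbers and sort them
    return ':' + ','.join('%d.%s' % (pos, chain_id) for pos in unique)

# Placeholder residue numbers with one deliberate repeat.
spec = residue_spec([9, 5, 9, 31], 'A')
print('color cyan ' + spec)

# Inside Chimera the command could then be run like this (not executed here):
# chimera.runCommand('color cyan ' + spec)
```

Removing repeats first is a small safeguard, since a hand-typed position list can easily contain the same number twice.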