RAMOS (RApid MOtif Search) / Signature Search / Docking

A three-dimensional signature search for a motif inside a larger cryo-EM density map can be done using the following SPIDER procedures. The search can find multiple occurrences of the motif, together with their locations and orientations inside the searched volume. Either a rotationally invariant mask or an asymmetric mask can be used for calculating the locally normalized correlation function. The motif can be obtained from various sources, including a crystal structure (if known) or a single-particle reconstruction (if available). Alternatively, if a copy of the searched molecule is recognizable inside the larger volume, it can be cut out and used as the motif. The searched volume and the motif should have the same scaling. The procedures use Alan Roseman's Fourier formulation to calculate the locally normalized correlation function.
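
As a rough illustration of what "locally normalized" means, the sketch below scores a motif against every position of a target, normalizing the target under the mask at each position. It is a one-dimensional, real-space toy and not the Fourier formulation or the SPIDER procedures themselves; the function name local_norm_cc and the sample values are assumptions made for this example.

```python
import math

def local_norm_cc(target, motif, mask):
    """Locally normalized correlation of a 1D motif against a 1D target.

    Illustrative only: a direct real-space sum, evaluated at every offset.
    """
    idx = [i for i, m in enumerate(mask) if m]   # positions under the mask
    p = len(idx)
    m_mean = sum(motif[i] for i in idx) / p
    m_std = math.sqrt(sum((motif[i] - m_mean) ** 2 for i in idx) / p)
    scores = []
    for x in range(len(target) - len(motif) + 1):
        window = [target[x + i] for i in idx]
        t_mean = sum(window) / p                 # local mean under the mask
        t_std = math.sqrt(sum((t - t_mean) ** 2 for t in window) / p)
        if m_std == 0 or t_std == 0:
            scores.append(0.0)                   # flat region: no signal
            continue
        num = sum((motif[i] - m_mean) * (target[x + i] - t_mean) for i in idx)
        scores.append(num / (p * m_std * t_std))
    return scores

# Made-up data: the motif [1, 3, 1] appears scaled by 2 at offset 4.
scores = local_norm_cc([5, 1, 4, 0, 2, 6, 2, 0], [1, 3, 1], [1, 1, 1])
best = scores.index(max(scores))
print(best)
```

Because the target is renormalized under the mask at every offset, the scaled copy of the motif scores as a perfect match regardless of its local intensity.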


The SPIDER procedures for the signature search are designed to run on a Linux cluster using the PubSub parallel processing scheduler. sigs_pub.spi is the main procedure; it calls sigs_settings.spi to set parameters and sigsloop.spi, which does the actual computation.

STEP-BY-STEP INSTRUCTIONS FOR RUNNING THE PROGRAM:

  1. Copy sigs_pub.spi, sigs_settings.spi and sigsloop.spi to the directory where you want to run the program.
  2. Make sure that SPIDER can be invoked with the spider command.
  3. Convert the PDB motif into a SPIDER volume by using CP FROM PDB. If the motif is already in SPIDER format, skip this step. Filter the motif to the same resolution as the large volume if desired.
  4. Window the motif in the x, y and z directions so that it is just contained inside the volume. This windowed motif is then the input volume whose occurrences are searched for inside the large volume. The windowed motif volume need not be a cube unless a rotationally invariant mask is to be used.
  5. Choose a threshold value to create a binary mask of the motif. This mask should resemble the structure of the motif. Use TH M, experimenting with different threshold levels until satisfied. The resulting threshold value should be given as an input in sigs_settings.spi.
  6. A coarse scan over all three rotational degrees of freedom should be attempted first.
    A reasonable set would be:

    PHI = 0 - 350 at 10-degree intervals
    THETA = 0 - 180 at 10-degree intervals
    PSI = 0 - 350 at 10-degree intervals

    These values need to be set in the procedure sigs_settings.spi. Usually, a coarse scan already finds the signatures of the motif at the correct x,y,z locations inside the large volume.
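
To get a feel for the cost of the coarse scan, the sketch below enumerates the suggested angular grid and counts the orientations. It is only an illustration; the function name angle_grid and its (start, end, step) argument layout are assumptions for this example and do not correspond to parameters in sigs_settings.spi.

```python
from itertools import product

def angle_grid(phi=(0, 350, 10), theta=(0, 180, 10), psi=(0, 350, 10)):
    """Return all (phi, theta, psi) triples in degrees.

    Each argument is (start, end, step) with the end value included.
    """
    axes = [range(start, end + 1, step) for start, end, step in (phi, theta, psi)]
    return list(product(*axes))

coarse = angle_grid()
print(len(coarse))  # 36 * 19 * 36 orientations
```

Halving the interval to 5 degrees multiplies the number of orientations roughly eightfold, which is why a coarse scan is attempted before any finer one.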

  7. Create a directory "input" using "mkdir ./input".
  8. Copy the target volume and the prepared (windowed) motif into this "input" directory.

  9. Set all the input parameters in sigs_settings.spi .
  10. Run sigs_pub.spi using the following command:
    Command :    ./spider spi/dat @sigs_pub 0 &

  11. Process the output document files in the "./output" directory using the following Unix commands and Perl programs:

    UNIX commands:
        (i) cut -c9-100 DOC_OUT_*.dat > DOC_CUT_COMBINED

    (Combines the output files. DOC_OUT_* are the output files generated by the signature search procedures. The files are combined this way to handle more than a million keys (lines in the document file), which cannot be handled by SPIDER document files.)

        (ii) sort -nr -k 7 DOC_CUT_COMBINED > DOC_SORTED

    (If any of the cross-correlation coefficient heights in the merged output file, DOC_CUT_COMBINED, is written in scientific notation then, on Linux, use the following command instead of the one above:)

        (ii) sort -gr -k 7 DOC_CUT_COMBINED > DOC_SORTED

    (Sorts the file according to the cross-correlation coefficient heights.)
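
The reason for the second form of the sort command is that a plain numeric sort does not interpret exponents, so a value such as 9.5E-01 would be ordered as 9.5. The sketch below shows the same descending sort on the seventh field in Python, where float() accepts both notations. The function name sort_by_peak_height and the sample records are made up for this illustration; only the position of the sort field is taken from the commands above.

```python
def sort_by_peak_height(lines, column=7):
    """Sort whitespace-separated records by a numeric column, highest first."""
    records = [line.split() for line in lines if line.strip()]
    # float() parses plain and scientific notation alike.
    return sorted(records, key=lambda fields: float(fields[column - 1]), reverse=True)

# Hypothetical records: six placeholder fields followed by the peak height.
sample = [
    "10 20 30 0 90 180 0.25",
    "11 21 31 10 90 180 9.5E-01",
    "12 22 32 20 90 180 3.0e-02",
]
ranked = sort_by_peak_height(sample)
for fields in ranked:
    print(" ".join(fields))
```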

    PERL programs:
        (i) uniq.perl < DOC_SORTED > DOC_UNIQUE

    (Suppresses the peaks inside an area equal to the size of the motif around the highest peak. The procedure is then repeated for the next-highest remaining peak. The output file will then contain the unique peaks corresponding to the probable presence of the signature of the motif at various locations and in different orientations.)

        (ii) make_docfile.perl < DOC_UNIQUE > doc_unique.dat

    (Converts the output file from the uniq.perl program to a SPIDER document file by adding keys, the number of registers, and similar information.)
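
The peak suppression described above can be sketched as a greedy pass over the sorted peaks. This is not the code of uniq.perl; the function name unique_peaks, the (x, y, z, cc) record layout, and the reading of "an area equal to the size of the motif" as a motif-sized box centered on each kept peak are all assumptions made for this illustration.

```python
def unique_peaks(sorted_peaks, motif_size):
    """Keep a peak only if it lies outside the motif-sized box
    centered on every stronger peak that was already kept.

    sorted_peaks: sequence of (x, y, z, cc), strongest first.
    motif_size:   (nx, ny, nz) of the windowed motif, in voxels.
    """
    kept = []
    for x, y, z, cc in sorted_peaks:
        suppressed = any(
            abs(x - kx) <= motif_size[0] / 2
            and abs(y - ky) <= motif_size[1] / 2
            and abs(z - kz) <= motif_size[2] / 2
            for kx, ky, kz, _ in kept
        )
        if not suppressed:
            kept.append((x, y, z, cc))
    return kept

# Made-up peaks, already sorted by correlation height.
peaks = [(50, 50, 50, 0.9), (52, 51, 50, 0.8), (90, 50, 50, 0.7), (55, 50, 50, 0.6)]
result = unique_peaks(peaks, (20, 20, 20))
print(result)
```

In this toy input the second and fourth peaks fall inside the box around the strongest peak and are dropped, leaving one peak per candidate location.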

  12. A finer scan (whose range depends on the angular interval used in the coarse scan) around the Eulerian angles determined in the coarse scanning step may be done to determine accurately the position and orientation of the signature of the motif inside the larger volume (repeat the steps for setting the parameters, running sigs_pub.spi, and processing the output).
  13. The procedure orient_motif.spi can be used to create copies of the motif oriented as its signatures are found inside the larger volume. The EXPLORER software can be used to visualize these copies of the motif superimposed on the large volume.
  14. One may use the procedure window.spi to generate an average of the aligned signatures of the motif (cut out of the large volume and aligned) as determined by the program.
  15. To aid visual identification of the locations inside the large volume where the program finds signatures of the motif, the procedures circle.spi and number.spi may be used.
  16. If the motif was created from a PDB file, one can rotate and position the PDB coordinates (according to the Euler angles and motif location found by the signature search procedures) using the SPIDER procedure pdbfit.spi and the Python program pdbfit.py. The rotated PDB volumes can then be visualized using the software "O" or EXPLORER.

NOTES: For a globular motif, one may use a rotationally invariant mask. The computation will then be much faster than when using an asymmetric mask. For non-globular motifs, an asymmetric mask should be used.

REFERENCES:

  1. B.K. Rath et al. (2003) Journal of Structural Biology 144, 95-103.
  2. Alan Roseman (2003) Ultramicroscopy 94, 225-236.


Source file: spider/docs/techs/sigsearch/sigsearch.html     Updated: 10/29/03     Bimal Rath