Approaches to 2D particle alignment can be subdivided into
several categories. The main division is created by the availability
of a reference image, and the secondary division by the degree of variability
within the data set, i.e., in how many orientations the particle
is observed to lie in a micrograph.
Types of alignment problems:
- One or a small known number of reference images
are known or can be easily approximated, and
particle orientations, i.e. the way the particle sits on a surface,
are well defined (with possible small variations).
This case will be referred to as
Reference-based alignment.
- An approximation of a reference image is known and there is only one
particle orientation (with possible small variations).
This case will be referred to as
Refined Alignment with a reference.
- Reference images are not known, but the data set
can in principle be divided into a known number of homogeneous
classes. This case will be referred to as
Multireference classification alignment.
- Reference images are not known, but the data set
can in principle be divided into a known number of homogeneous
classes. The particles can be centered. This case will be referred to as
Rotationally invariant K-means alignment.
- Reference images are not known, and there are no clear
groupings in the data set. This case will be referred to as
Reference-free alignment.
Reference-based alignment
We assume that a limited number of reference images are known or that good
approximations of them are available. We expect all the particles to be noisy
versions of the reference, with possible small variations. In this case
the alignment problem becomes a pattern matching problem. We
have to place every particle in an orientation in which it will best match
the reference image. In the case of multiple reference images, in
addition, we have to decide which reference is the most similar one. We
must also try the mirror orientation since the particle may be flipped.
We use the cross-correlation coefficient to measure the similarity between
a particle and a reference.
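The matching step described above can be illustrated with a minimal sketch. It is not the SPIDER implementation: the function names (`ccc`, `rot90`, `mirror`, `best_match`) are hypothetical, images are small square lists of lists, shifts are ignored, and the fine angular search is replaced by quarter turns, which are exact on a pixel grid.

```python
from math import sqrt

def ccc(a, b):
    """Cross-correlation coefficient between two equal-size images."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den if den else 0.0

def rot90(img):
    """Rotate a square image by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def mirror(img):
    """Mirror an image about its vertical axis."""
    return [row[::-1] for row in img]

def best_match(particle, references):
    """Return (ccc, reference index, quarter turns, mirrored) of the best match.

    The particle is optionally mirrored first and then rotated clockwise
    by the given number of quarter turns (an assumption of this sketch).
    """
    best = None
    for mirrored in (False, True):
        img = mirror(particle) if mirrored else particle
        for turns in range(4):
            for idx, ref in enumerate(references):
                score = ccc(img, ref)
                if best is None or score > best[0]:
                    best = (score, idx, turns, mirrored)
            img = rot90(img)
    return best
```

Because every particle is scored against every reference independently, the loop over particles can be run in any order or in parallel.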
The ref-mult-ali.spi procedure implements
reference-based alignment with multiple references. In this procedure, alignment
is done using 'AP SHC', in which the
search for rotation is integrated with the search for translation, resulting
in highly accurate but somewhat slow alignment determination.
The operation 'AP REF' could be used for less accurate but
faster alignment determination.
Advantages of reference-based alignment:
- It is very fast and robust. Since all the reference images are
known, every particle can be matched independently to all of them
and the correct assignment can be based on a well-defined similarity measure
(the correlation coefficient).
- The best alignment is found in one pass through the reference
images.
- Results are easily verifiable. Since the reference images are known,
it can be easily verified by visual inspection whether the aligned particles
are in the proper orientation and how well they match the reference images.
Disadvantages of reference-based alignment:
- It relies strongly on the assumption that the particles
resemble the reference image. If this assumption is not true, the
average of the aligned particles will (for noisy data) look like the
reference, and it is difficult to decide whether this similarity is
real or is caused by enhanced noise.
- If exact reference images are not known, it is difficult and
time consuming to come up with a good approximation of the reference.
Refined Alignment with a reference
We assume that a set of particles from one motif is
available. Particles are not identical, but they share the same motif (e.g.
they all lie on the same side on a surface). A reference
image may be available or can be calculated from the sample images.
The refi-ref-ali.spi
procedure begins with
calculation of the global average to approximate the reference, then
aligns all the images using the
'AP SHC' operation, and calculates a new
average to obtain an improved reference. These steps are iterated a
prescribed number of times.
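The average, align, re-average loop can be sketched as follows. This is an illustration under simplifying assumptions, not the contents of refi-ref-ali.spi: each "image" is reduced to a one-dimensional ring profile (a list of samples around a circle), so that in-plane rotation becomes a cyclic shift, and the helper names (`shift`, `best_shift`, `average`, `refine`) are hypothetical.

```python
def shift(sig, k):
    """Cyclically shift a ring profile by k samples."""
    k %= len(sig)
    return sig[-k:] + sig[:-k] if k else list(sig)

def best_shift(sig, ref):
    """Cyclic shift of sig that maximizes its correlation with ref."""
    return max(
        range(len(sig)),
        key=lambda k: sum(a * b for a, b in zip(shift(sig, k), ref)),
    )

def average(sigs):
    """Sample-by-sample average of a list of profiles."""
    return [sum(col) / len(sigs) for col in zip(*sigs)]

def refine(sigs, iterations=5):
    """Alternate between averaging and aligning to the current average."""
    aligned = [list(s) for s in sigs]
    for _ in range(iterations):
        ref = average(aligned)  # current approximation of the reference
        aligned = [shift(s, best_shift(s, ref)) for s in aligned]
    return aligned, average(aligned)
```

The first reference here is the global average of the unaligned data, so the outcome depends on how the input happens to be oriented, which is the dependence on the first approximation discussed below.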
Advantages of refined Alignment with a reference:
- This procedure is simple, fast, and robust. In case of a
near-homogeneous data set one can obtain high-quality alignment.
Disadvantages of refined Alignment with a reference:
- The result depends on the first approximation of the reference image.
By changing the way the first reference image is created one can obtain
different results and it is difficult to determine which one is
correct/better.
- If the first reference image is not a good approximation of the
"true" average, or if the data set contains more than one orientation,
the results may not be stable.
Multireference classification alignment
We assume that a very large data set is available. It comprises
particles in a few distinct orientations. The data set is sufficiently
large that at least some of the similar views occur in similar in-plane
orientations, and so can be averaged. Thus,
if we can approximately center the particles, the
subsequent classification step should reveal some of the classes. These
classes are used as reference images in the next multireference
alignment step, classification is repeated, and new classes are formed.
This procedure is iterated until stable classes are obtained.
Such a multireference classification alignment is sometimes called
alignment through classification. This name reflects the idea
that alignment is done separately within groups produced by the
classification step.
The ref-mul-class-ali.spi
procedure implements multireference alignment using the 'AP SH'
operation to do the alignment. This operation employs an exhaustive search
to find rotation and translation simultaneously. In principle it should
be more accurate than using 'AP REF',
but it is much slower (particularly for a large number of classes). This
program uses the additional procedure
centr.spi.
Since multireference alignment is a general idea rather than a
detailed algorithm,
ref-mul-class-ali.spi
constitutes a particular implementation. It
should be considered a blueprint upon which one can build one's own
procedure optimized for the particular data set.
- It is assumed that all the windowed particles are normalized in
the same way.
- The following free parameters have to be decided:
- - Radius for alignment and mask -- should correspond to the
particle radius;
- - Whether classification is done using all pixels within
mask in the computation of Euclidean distance, or
factors from Principal Component Analysis (PCA);
- - If PCA is to be used, the number of factors has to be set;
- - The number of groups into which the data set will be divided -- this
determines the number of class averages that will be obtained;
- - The number of times the procedure should be repeated.
- The steps implemented in
ref-mul-class-ali.spi:
1. All the particles are centered using
centr.spi.
In this procedure each particle
is centered using its own rotational average as a reference, the
particle is shifted, its new rotational average is formed and used as a
reference, and so on, until no further shift is possible.
2. The particles are classified using k-means clustering.
Depending on the flag set either the raw particles are classified or a
preset number of factors from PCA are used for classification.
3. Class averages are calculated.
4. Class averages are centered using the
'CG PH' operation (phase
approximation of the center of gravity).
5. Class averages are rotationally aligned using the
'AP RA' operation
(reference-free rotational alignment).
6. All the particles are aligned using class averages as
reference.
Each particle is placed in the orientation of its most similar reference
image. The alignment includes rotational alignment, shift alignment,
and a check of mirrored orientation. Rotational alignment is done using the
AP MD operation and is separated from the shift alignment. Shift is
corrected using the most similar image (as determined by AP MD) as a
reference.
7. Alignment parameters are combined with the alignment
parameters obtained in the previous step and a new, aligned image series
is formed.
8. Steps 2-7 are repeated a prescribed number of times.
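The step that combines alignment parameters from successive iterations amounts to composing two rigid-body transformations. The sketch below shows the arithmetic under an assumed convention: each parameter set is (angle in degrees, x shift, y shift), and the image is rotated counterclockwise about the origin first and shifted second. SPIDER's own sign and ordering conventions may differ, and the function names are hypothetical.

```python
from math import cos, sin, radians

def apply(params, point):
    """Rotate a point by the angle about the origin, then shift it."""
    angle, sx, sy = params
    c, s = cos(radians(angle)), sin(radians(angle))
    x, y = point
    return (c * x - s * y + sx, s * x + c * y + sy)

def combine(first, second):
    """Single (angle, sx, sy) equivalent to applying `first`, then `second`."""
    a1, sx1, sy1 = first
    a2, sx2, sy2 = second
    c, s = cos(radians(a2)), sin(radians(a2))
    # The earlier shift is rotated by the later angle before the later
    # shift is added; the angles simply add.
    return ((a1 + a2) % 360.0,
            c * sx1 - s * sy1 + sx2,
            s * sx1 + c * sy1 + sy2)
```

Combining the parameters and applying them once to the original image avoids the interpolation errors that accumulate when an already rotated image is rotated again at every iteration.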
Advantages of multireference classification alignment:
- It is quite powerful. It is possible to obtain stable groups for data
with very low signal-to-noise ratio (SNR). It works for data sets containing
a mixture of entirely different views (an often-encountered problem, in which side
views are, say, rectangular, and top views are circular).
- The approach is a general idea rather than a "black-box" program;
thus, it can be easily modified to the requirements of a particular data
set.
- There are many parameters that can be adjusted to better control the results.
- Results are easily verifiable. Since the class averages are formed
it can be easily verified whether the aligned particles are in the proper
orientation and how well they match the averages.
Disadvantages of multireference classification alignment:
- A very large data set is needed. The program depends on the initial
orientation of particles, i.e., at least some of the similar views occur
in similar in-plane orientations, so that meaningful averages can be formed.
Statistically, this can only happen in an adequately large data set.
Moreover, these averages should have a sufficiently high SNR to jumpstart the
alignment, so they should each contain a sufficient number of particles.
- The result is somewhat unpredictable. It is impossible in practice
to verify whether rare views were revealed as classes or remained
misaligned and/or misclassified.
- Since the approach is a general idea rather than a well-defined
procedure, the result will differ depending on the particular implementation.
Thus, results obtained by different users/groups are difficult to
compare.
- Even if the general framework is decided upon, the large number of
crucial free parameters leaves the user with hard choices to make. The
results will depend on the values chosen and will differ from one trial
to another. The two most difficult choices are the number of clusters and
number of factors for PCA. Too few clusters will conceal rare views, while
too many will result in large numbers of very similar averages, or else the
procedure will fail due to a too-low SNR.
- The procedure is very slow.
Rotationally invariant K-means Alignment
We assume that the particles were centered and we can divide
the data set into a specified number of orientation classes. In this case,
operation 'AP CA'
will perform classification and alignment. For each particle the
rotation angle as well as the group assignment will be found.
The procedure: rotkm-ali.spi
demonstrates how to use 'AP CA'
and how to calculate group averages.
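The idea of classifying and rotationally aligning at the same time can be sketched as a k-means loop whose distance is minimized over rotations. This is an illustration only and makes no claim about how 'AP CA' works internally: particles are reduced to ring profiles so that rotation becomes a cyclic shift, the initial class centers are supplied by the caller, and all function names are hypothetical.

```python
def shift(sig, k):
    """Cyclically shift a ring profile by k samples."""
    k %= len(sig)
    return sig[-k:] + sig[:-k] if k else list(sig)

def best_fit(sig, center):
    """Return (squared distance, shift) for the shift of sig closest to center."""
    return min(
        (sum((a - b) ** 2 for a, b in zip(shift(sig, k), center)), k)
        for k in range(len(sig))
    )

def rot_invariant_kmeans(sigs, centers, iterations=10):
    """Assign each profile a group and a rotation; update group averages."""
    centers = [list(c) for c in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        labels, shifts = [], []
        for sig in sigs:
            fits = [best_fit(sig, c) for c in centers]
            g = min(range(len(centers)), key=lambda i: fits[i][0])
            labels.append(g)
            shifts.append(fits[g][1])
            groups[g].append(shift(sig, fits[g][1]))
        for i, members in enumerate(groups):
            if members:  # keep the old center if a group is empty
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return labels, shifts, centers
```

The returned centers play the role of the group averages, and the per-particle shifts play the role of the rotation angles.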
Reference-free alignment
The rationale of the reference-free alignment is explained in the
Introduction to Reference-Free Alignment.
The procedure will seek such orientations of all the particles in the data
set that all the possible pairs of images from this
set are in the 'best' relative orientation as determined by the
maximum of the CCF.
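One simple way to pursue this pairwise criterion without a reference is to align each image, in turn, to the sum of all the others. The sketch below illustrates that idea on ring profiles (rotation as a cyclic shift). It is a simplified stand-in, not the algorithm used by the SPIDER operations, and the function names are hypothetical.

```python
def shift(sig, k):
    """Cyclically shift a ring profile by k samples."""
    k %= len(sig)
    return sig[-k:] + sig[:-k] if k else list(sig)

def dot(a, b):
    """Unnormalized correlation of two profiles at zero lag."""
    return sum(x * y for x, y in zip(a, b))

def pairwise_score(sigs):
    """Sum of correlations over all pairs: the quantity being increased."""
    n = len(sigs)
    return sum(dot(sigs[i], sigs[j]) for i in range(n) for j in range(i + 1, n))

def reference_free_align(sigs, sweeps=5):
    """Align each profile to the sum of the others until nothing moves."""
    aligned = [list(s) for s in sigs]
    n = len(aligned[0])
    for _ in range(sweeps):
        changed = False
        for i, sig in enumerate(aligned):
            others = [sum(col) for col in
                      zip(*(s for j, s in enumerate(aligned) if j != i))]
            k = max(range(n), key=lambda k: dot(shift(sig, k), others))
            if k:
                aligned[i] = shift(sig, k)
                changed = True
        if not changed:
            break
    return aligned
```

Each update can only keep or raise the pairwise score, since it changes only the terms that involve the image being moved. The absolute orientation of the result is arbitrary, and a local optimum is possible.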
The reference-free alignment procedures were designed for very noisy data, for
particles in many different orientations, and in general for cases in which
a reference image is unknown or in which its usage could result in a bias and
incorrect results. There are three basic operations in SPIDER that implement
this strategy:
'AP SA' is a shift alignment,
'AP RA' is a rotational alignment, and
'AP SR' is a combined shift and
rotational alignment.
In addition, 'AP CA'
performs both classification and
rotational alignment for pre-centered data. Unlike the procedures described
in the previous sections, none of these operations
checks mirrored orientations; thus, any mirror-related views will appear
as two different orientations. All the alignment operations can be either
used separately or as a part of longer, more elaborate alignment
schemes.
The procedure:
ref-free-apra-ali.spi uses
'AP RA' to
rotationally align an image series and applies parameters stored by
the operation in a document file to rotate all the particles.
Subsequently, aligned particles are subjected to PCA and
classified using
hierarchical classification.
The procedure:
ref-free-apsara-ali.spi
alternates between
'AP SA' and
'AP RA' to align an image series both
translationally and rotationally.
The procedure:
ref-free-apsr-ali.spi uses
operation 'AP SR' to
align an image series and applies parameters stored
in a document file to rotate and shift all the particles.
Subsequently, aligned particles are subjected to PCA and
classified using Hierarchical Classification.
Another approach to alignment uses self-correlation functions.
See
'PO',
'CC P',
'AC S',
'AC NS',
'AC MSS',
'EP TM', and
'CC MS' for information on useful operations.
Advantages of reference-free alignment:
- The operation 'AP SR'
is very fast and robust.
- The method has very few free parameters -- essentially only the
radius of the particle. The results do not depend significantly on
these parameters, and there are no assumptions made about the reference,
number of groups, and so on.
Disadvantages of reference-free alignment:
- It is difficult to assess how well the particles were aligned. In
most practical cases the program gives a nearly optimal solution, but in
some cases (particularly for mixtures of entirely different shapes, but also
for very low SNR or very small data sets) it may fail. In these
situations one should either use a combination of
'AP SA' and
'AP RA'
(with more free parameters, and thus easier to control),
or multireference alignment.
Source: align.html
Last update: 21 Mar 2012