SPIDER: WV P (Window averaging - over Patches)

WV P - Window averaging - over Patches

(3/2/89)

PURPOSE: Correlation averaging of crystalline lattices. [This operation assumes that the first step of correlation averaging has been completed: use of 'CC' (compute cross-correlation) and 'PK C' (peak search with center-of-gravity option) to obtain a ranked list of peak coordinates indicating the locations of the repeats of the unit cell.] The operation partitions a large raw file into regular patches. Areas are windowed out from the raw file according to the X,Y coordinates in the peak search document file, and are summed separately according to their patch location. An average is created for each patch. The resulting stack of "patch averages" is stored in 3-D format in a single output file. The individual averages may be retrieved by the operation 'PS Z' (Pick Slice in Z direction). Reference: J. Frank, W. Chiu, and L. Degn, Ultramicroscopy 26 (1988) 345-360.

SEE ALSO

WV	[Window averaging]
WV S	[Window averaging - Sequential document search]

USAGE

.OPERATION: WV P

.SCANNED DATA FILE: RAW001
[Enter the name of the large data file to be used as input]

.AVERAGE FILE: AVE001
[Enter the name of the file where the stack of averages is to be kept. The file will be opened with a 3-D data format, where NX, NY are the dimensions of the data window, to be specified in the next query, and NZ is the number of patches, to be specified below]

.WINDOW DIMS NX,NY: 64,64
[Enter the dimensions of the data window to be selected at each peak location (= dimensions of the average created for each patch)]

.PATCH DIMENSIONS: 200,200
[Enter the size of the patch. In order not to lose data close to the edges of the patch, the following relationship should be observed in both dimensions: PATCH SIZE = INCREMENT + WINDOW SIZE]
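The relationship above can be checked against the example values used on this page (window 64, increment 136, patch 200). The following is only an illustrative sketch; the function name patch_size is hypothetical and is not part of SPIDER.

```python
def patch_size(increment, window):
    """Patch size recommended by the rule PATCH SIZE = INCREMENT + WINDOW SIZE."""
    return increment + window


# Example values from the prompts on this page:
# INCREMENTS X,Y = 136,136 and WINDOW DIMS NX,NY = 64,64
print(patch_size(136, 64))  # 200, matching PATCH DIMENSIONS 200,200
```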

.NUMBER OF PATCHES, GENERATE PATCHES?(0/1): 100,1
[Enter the total number, NPATCH, of patches to be used, and select the way the patches are specified: GENERATE PATCHES=0: patch coordinates are supplied for NPATCH patches; GENERATE PATCHES=1: patch coordinates are generated by the program based on a regular grid.]

Option '0': In this case, NPATCH pairs of numbers must be supplied:

.INTEGER CORNER COOS OF PATCHES: 10,20 310,20 540,50
540,490 ...
[In each pair, the first number is interpreted as the x coordinate, the second as the y coordinate. The pairs of numbers may be entered in free format, on consecutive lines if necessary. The choice of patch positions can be totally arbitrary.]

Option '1': In this case, NPATCH pairs of coordinates are generated by the program according to the INCREMENT specification:

.STARTING COOS: 10,50
[give the x,y coordinates of the first patch]

.INCREMENTS X,Y: 136,136
[give the patch position increments in x and y direction. Beginning with the STARTING COOS, a regular rectangular grid is created which defines the patches. Patches are counted from left to right, top to bottom, in the way a book is read. This means that the average of the top left patch will be the first slice in the AVERAGE FILE, the average of the bottom right patch will be stored as the last slice.]

The NUMBER OF PATCHES USED will be printed out, followed by a list of patch coordinates to be used. Note that for the GENERATE PATCHES=1 option, the NUMBER OF PATCHES USED may be different from the NUMBER OF PATCHES specified earlier, because it is calculated from the STARTING COOS and INCREMENT coordinates.
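As an illustration of the GENERATE PATCHES=1 option, the sketch below builds a regular grid of patch corners in reading order. It is not the SPIDER implementation: the function name, the 512 x 512 field size, and the rule that a patch is kept only if it fits entirely inside the field are all assumptions made for this example. The actual rule by which the program derives the NUMBER OF PATCHES USED is not documented here.

```python
def generate_patch_corners(start, increment, field, patch, max_patches):
    """Return patch corner coordinates, left to right, then top to bottom.

    Assumption (not from the SPIDER source): a patch is kept only if it
    lies entirely inside the field, and at most max_patches are returned.
    """
    x0, y0 = start
    dx, dy = increment
    nx, ny = field
    px, py = patch
    corners = []
    y = y0
    while y + py <= ny:          # rows, top to bottom
        x = x0
        while x + px <= nx:      # columns, left to right
            corners.append((x, y))
            x += dx
        y += dy
    return corners[:max_patches]


# STARTING COOS 10,50 and INCREMENTS 136,136 from the example prompts;
# the 512 x 512 field size is hypothetical.
corners = generate_patch_corners((10, 50), (136, 136), (512, 512), (200, 200), 100)
print(len(corners), corners)
```

With these assumed values only 6 patches fit, even though 100 were requested, which shows how the number of patches used can differ from the number specified.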

.DOCUMENT FILE: DOC001
[Enter the name of the document file containing the peak coordinates. The document file must contain the peak values in the following sequence: KEY, X, Y, HEIGHT. This is the layout written by operation 'PK C']

The following message appears:

** NUMBER OF PEAKS IN DOCUMENT FILE <NPEAK>

where <NPEAK> is the number of peaks retrieved from the document file. [This number may be different from the total number of peaks stored by 'PK C' in the file, because the peak information is read back into core using an in-core unsave routine which is currently limited to 1000 keys.]

.PEAK VALUES (FROM, TO): 0., 1.4
[this allows the absolute range of peak heights to be limited. Peaks with values outside this range will be excluded.]

.RANGE OF RANKS IN PERCENT (FROM,TO; 100=BEST): 60,100
[enter the rank range, in terms of percentage of total number of peaks, of peaks to be used. A specification of 0,0 will cause all peaks to be accepted within the PEAK VALUES window. It will also cause rank sorting by decreasing size to be skipped, which will not be noticed unless the document file was prepared by a non-standard method (i.e., not by 'PK C').]

The following message appears:

** <NPK> PEAKS BETWEEN <THL> AND <THH> READ IN

where <NPK> is the number of peaks in the absolute value range (<THL>,<THH>) specified above.
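The effect of the PEAK VALUES and RANGE OF RANKS prompts can be pictured with the following sketch. It is a simplified illustration, not the SPIDER code: the function name, the integer rounding of the percentage limits, and the application of both filters in a single step (the program also performs a coordinate check in between) are assumptions.

```python
def select_peaks(peaks, value_range, rank_range):
    """Filter peaks given as (key, x, y, height) tuples.

    value_range: (from, to) absolute peak heights to keep.
    rank_range:  (from, to) in percent, 100 = best; (0, 0) accepts all
                 peaks in the value range and skips the rank sorting.
    """
    lo, hi = value_range
    kept = [p for p in peaks if lo <= p[3] <= hi]
    r_from, r_to = rank_range
    if (r_from, r_to) == (0, 0):
        return kept                                # original order, no sorting
    kept.sort(key=lambda p: p[3], reverse=True)    # best (highest) peak first
    n = len(kept)
    first = n * (100 - r_to) // 100                # assumed rounding: truncate
    last = n * (100 - r_from) // 100
    return kept[first:last]


# Ten hypothetical peaks with heights 0.1 ... 1.0
peaks = [(i, 0, 0, i / 10) for i in range(1, 11)]
best = select_peaks(peaks, (0.0, 1.4), (60, 100))
print([p[0] for p in best])  # keys of the top 40 percent
```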

.X-OFFSET, Y-OFFSET: 100,500
[this allows the use of a raw data file that is a portion of the original file for which the peak list has been generated. If both are identical, specify 0,0]

.SCALE FACTOR: 2.
[if the CCF is based on a reduced version of the original raw data file, the geometric scaling factor has to be used to re-scale coordinates of the peak positions to the actual coordinates of the unit cell repeats in the raw data file. If both CCF and raw data file are on the same scale, use 1.]
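One plausible reading of the X-OFFSET, Y-OFFSET and SCALE FACTOR prompts is sketched below: the peak position is first re-scaled to the original raw data frame, and the offset of the sub-field is then subtracted. The order of these two steps and the sign of the offset are assumptions made for illustration; they are not confirmed by this page, and the function name is hypothetical.

```python
def to_raw_coords(peak_xy, scale, offset):
    """Map a peak position from the CCF frame to the raw data file frame.

    Assumed convention (not confirmed by the documentation):
    raw = peak * scale - offset.
    """
    x, y = peak_xy
    ox, oy = offset
    return (x * scale - ox, y * scale - oy)


# SCALE FACTOR 2. and X-OFFSET, Y-OFFSET 100,500 from the example prompts;
# the peak position (100, 300) is hypothetical.
print(to_raw_coords((100, 300), 2.0, (100, 500)))  # (100.0, 100.0)
```

With a scale factor of 1. and offsets 0,0 the coordinates are returned unchanged, which corresponds to the case where the CCF and the raw data file are identical in scale and extent.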

Next, the following messages appear:

** NUMBER OF PEAKS AFTER COO CHECK = <NPKCK>

where <NPKCK> is the number of peaks left after the outside margin of the raw data field has been taken into account;

** NUMBER OF PEAKS WITHIN RANK RANGE = <NPKR>

where <NPKR> is the number of peaks left after the additional rank limitation has been applied.

.NO. OF PEAKS TO USE, DUMP? (0=NO; 1=YES; 2=FULL): 800,0
[of the <NPKR> peaks left, the highest NPEAK peaks may be selected. If no selection is requested, specify a number that is large enough; e.g. the original number of peaks. The DUMP option is only used for special debugging purposes. Do not use it, because the amount of print output may be staggering!]

At the end of the program run, the following information is given:

** DIMENSION OF CCF WINDOW USED: <NXC>, <NYC>

where <NXC>, <NYC> are the X and Y dimensions of the raw data field after edge exclusion;

** NUMBER OF WINDOWS IN EACH PATCH
20 19 20 21 ...

This number is not necessarily constant because the crystal grid and the patchwork grid normally "beat" against each other. Another reason would be the existence of poor areas within the field, which would cause a drop in the number of good peaks found. The numbers of windows are printed out for completeness, because they account for changes in statistics from one patch to the next.
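The overall patch averaging, including the per-patch window counts reported above, can be sketched as follows. This is a minimal illustration in plain Python, not the WINAVE2 routine: the function name, the centering of each window on its peak, and the rule that a window is added to every patch that contains it entirely are assumptions.

```python
def patch_averages(image, peaks, corners, patch, window):
    """Average windows cut from image, separately for each patch.

    image:   2-D list indexed as image[row][col]
    peaks:   list of (x, y) peak positions in image coordinates
    corners: list of (x, y) top-left patch corners
    Returns (averages, counts): one window-sized average and one window
    count per patch, in the order of corners.
    """
    px, py = patch
    wx, wy = window
    averages, counts = [], []
    for cx, cy in corners:
        total = [[0.0] * wx for _ in range(wy)]
        n = 0
        for x, y in peaks:
            x0, y0 = x - wx // 2, y - wy // 2   # assumed: window centered on peak
            # assumed: use the window only if it lies entirely inside the patch
            if cx <= x0 and x0 + wx <= cx + px and cy <= y0 and y0 + wy <= cy + py:
                for j in range(wy):
                    for i in range(wx):
                        total[j][i] += image[y0 + j][x0 + i]
                n += 1
        averages.append([[v / n for v in row] for row in total] if n else total)
        counts.append(n)
    return averages, counts


# Tiny hypothetical example: 8 x 8 image, 2 x 2 windows, two 4 x 4 patches
image = [[r * 8 + c for c in range(8)] for r in range(8)]
averages, counts = patch_averages(
    image, [(1, 1), (2, 2), (6, 6)], [(0, 0), (4, 4)], (4, 4), (2, 2))
print(counts)
```

Here the two patches receive different numbers of windows, which is the situation the NUMBER OF WINDOWS IN EACH PATCH listing makes visible.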

SUBROUTINES: WINAVE2, NAREA

CALLER: UTIL2