User's Tutorial: Exploratory Data Analysis of fMRI Time Courses

Now that the map can load a multi-slice volumetric dataset, and the data vectors can be extracted, we are ready to add a data analysis module to the map and make the appropriate connections. We will be using the Bezdek fuzzy c-means (FCM) algorithm to find groups of data vectors that have a similar temporal activation response. FCM is a model-free algorithm that requires minimal information about the experimental paradigm.

A common data vector pre-processing step is normalization. Figure 36 shows the map with a normalization module connected to the output of the ROI Data Vecs Quad module. The output slot data_vecs from the ROI Data Vecs Quad module is connected to the input slot data_vecs(double-matrix)(req:1) in the Vector Normalization module, which is found in the IBD Analysis / Filters package.

The data vectors are stored in a double-matrix, with each time-course data vector occupying one row. The width of the matrix equals the number of time instances used in ROI Data Vecs Quad, and the height of the matrix is the total number of ROI points (x,y) over all the z slices, as calculated by Binary Cube ROI. See the User's Guide for documentation on the module Vector Normalization in the IBD Analysis / Filters package.

Figure 36: Data vectors for analysis, from ROI Data Vecs Quad pre-processed by Vector Normalization.

The Vector Normalization module has a proponent screen where users can choose various vector normalization methods. The default is subtract mean, as shown in Figure 37.

Figure 37: Users can select different normalization methods in the proponent window for the Vector Normalization module.
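
As a rough illustration of the default subtract mean method, the sketch below removes each row's mean from a small matrix laid out as described above (one time course per row, one column per time instance). It is a plain-Python stand-in and not the Vector Normalization module's code; the function name subtract_mean and the sample values are assumptions made for this example.

```python
def subtract_mean(data_vecs):
    """Return a new matrix in which each row has its own mean removed.

    data_vecs is a list of rows; each row is one time-course data vector.
    """
    normalized = []
    for row in data_vecs:
        mean = sum(row) / len(row)
        normalized.append([value - mean for value in row])
    return normalized


# Two hypothetical ROI points, each sampled at four time instances.
data_vecs = [
    [10.0, 12.0, 14.0, 12.0],
    [100.0, 100.0, 104.0, 96.0],
]
norm_vecs = subtract_mean(data_vecs)
```

After this step both rows have zero mean, so time courses with different baseline intensities can be compared by the shape of their response alone.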

Once the data vectors are normalized, they can be clustered by the fuzzy c-means algorithm. Figure 38 shows the Scopira map with the fuzzy c-means algorithm connected to the Vector Normalization module. The Fuzzy C_Means module, located in the IBD Analysis / Pattern Recognition package, is fully documented in the User's Guide.

Figure 38: Fuzzy C_Means module's input slot, features, connected to norm_vecs from the module Vector Normalization.
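
For readers unfamiliar with the algorithm, the following is a minimal sketch of textbook fuzzy c-means, assuming Euclidean distance, a fuzziness exponent m of 2, and random initial memberships. It is not the Scopira implementation; the function name fuzzy_c_means and the parameters num_clusters, max_iterations, m, tol, and seed are hypothetical names chosen for this example.

```python
import math
import random


def fuzzy_c_means(features, num_clusters, max_iterations=100, m=2.0,
                  tol=1e-6, seed=0):
    """Cluster the rows of features; return (centroids, memberships)."""
    rng = random.Random(seed)
    n, dim = len(features), len(features[0])

    # Random initial memberships; each row is scaled to sum to 1.
    memberships = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(num_clusters)]
        total = sum(row)
        memberships.append([v / total for v in row])

    centroids = []
    for _ in range(max_iterations):
        # Centroid update: mean of all vectors weighted by membership**m.
        centroids = []
        for j in range(num_clusters):
            weights = [memberships[i][j] ** m for i in range(n)]
            wsum = sum(weights)
            centroids.append([
                sum(weights[i] * features[i][d] for i in range(n)) / wsum
                for d in range(dim)
            ])

        # Membership update: inversely related to distance from each centroid.
        max_change = 0.0
        for i in range(n):
            dists = [math.dist(features[i], c) for c in centroids]
            if any(d == 0.0 for d in dists):
                # The vector coincides with one or more centroids.
                new_row = [1.0 if d == 0.0 else 0.0 for d in dists]
            else:
                new_row = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(new_row)
            new_row = [v / total for v in new_row]
            change = max(abs(a - b) for a, b in zip(new_row, memberships[i]))
            max_change = max(max_change, change)
            memberships[i] = new_row

        if max_change < tol:  # memberships have stopped changing
            break

    return centroids, memberships


# Four hypothetical mean-subtracted time courses with two response shapes.
features = [
    [-0.5, -0.5, 0.5, 0.5],
    [-0.5, -0.5, 0.6, 0.4],
    [0.5, 0.5, -0.5, -0.5],
    [0.4, 0.6, -0.5, -0.5],
]
centroids, memberships = fuzzy_c_means(features, num_clusters=2)
labels = [row.index(max(row)) for row in memberships]
```

Each row of memberships gives the degree to which one data vector belongs to each cluster; taking the largest entry per row, as in labels, yields a hard assignment. Every iteration visits every vector and every cluster, which is consistent with the run time growing with both the number of clusters and the size of the data vector matrix.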

The features(num-matrix-type) slot is a required input slot that accepts a type-independent matrix, the num-matrix type. The input connection may therefore be a double-matrix, an int-matrix, or a float-matrix, and the abstract num-matrix type returns double or int values through virtual methods. Module writers are advised not to use this type of input for algorithms because of the performance cost. Generic modules that are not computationally intensive, such as a data printer, are the appropriate place to accept this abstract data type. The Fuzzy C_Means module is scheduled for code review and rewrite, and its input slot will be changed to a double-matrix for computational efficiency. All the other input slots are property slots connected to proponents; they control algorithm parameters such as the number of iterations and the number of clusters to find in the dataset (that is, in the feature matrix). Figure 39 shows the proponent window for the Fuzzy C_Means module.

Figure 39: Fuzzy C_Means module's proponent window is used to set run-time parameters.

The default number of clusters is 2. For this dataset it should be changed to at least 4; for demonstration purposes it has been set to 7. Clicking Apply commits the changes to the proponent, and the new field values are used the next time the module runs. Clicking Apply and Run will not run the module at this point, because a newly added module has no data pending in its input slot.

Closing the Fuzzy C_Means proponent window and double-clicking the Load Image Quad module starts map execution, and the data flow eventually propagates to the Fuzzy C_Means module, as illustrated in Figure 40. The current implementation of fuzzy clustering is computationally expensive: the more clusters to find and the larger the data vector matrix, the longer it takes to execute. As the module runs, the current iteration number is displayed in the output window, as shown in Figure 41, and the output is also echoed to the terminal console window from which Scopira was started.

Figure 40: Starting the map by double-clicking Load Image Quad will eventually propagate data to Fuzzy C_Means.

Figure 41: As the map is running, module output will be displayed on the console window.

Changing the number of clusters to 5 and clicking Apply and Run starts the engine, as shown in Figure 42. However, the Fuzzy C_Means module does not run, because the feature vectors in the input slot were not specified as sticky and were therefore consumed and flushed from the input slot the last time the module ran.

Figure 42: Updated Properties window for Fuzzy C_Means after the number of clusters was reduced to 5 and Apply and Run was clicked.

The Output window will not add a new output line for the Fuzzy C_Means module, because the module never ran: its required input values were missing. Right-clicking the Fuzzy C_Means module and selecting Input Slots from the pop-up menu displays the Input Slot Properties, as shown in Figure 43.

Figure 43: Input Slot Properties for the Fuzzy C_Means module, where the required input slot features is now made sticky.

Also note that in Figure 43 the property slot num-clusters has a pending value of 5 and a current value of 7. This means the module last used 7 for the num-clusters slot, and the next time it runs (when all the required input slots have data pending) it will use 5 as the number of clusters. The Pending Value column shows that the required input slot features has no data for the next run, so the module will not execute.
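
The interplay of sticky slots with pending and current values can be pictured with a small toy model. This is only a sketch of the behaviour described above, not Scopira's engine: the InputSlot and Module classes, their methods, and the choice to treat the property slot as sticky are all assumptions made for illustration.

```python
class InputSlot:
    """Toy input slot with a pending value and a last-used (current) value."""

    def __init__(self, name, required=True, sticky=False):
        self.name = name
        self.required = required
        self.sticky = sticky
        self.pending = None   # value waiting for the next run
        self.current = None   # value used by the most recent run

    def post(self, value):
        self.pending = value

    def consume(self):
        if self.pending is None:
            return
        self.current = self.pending
        if not self.sticky:
            self.pending = None   # non-sticky data is flushed after use


class Module:
    """Toy module that runs only when every required slot has pending data."""

    def __init__(self, name, slots):
        self.name = name
        self.slots = slots

    def run(self):
        if any(s.required and s.pending is None for s in self.slots):
            return False          # a required input is missing; do not run
        for slot in self.slots:
            slot.consume()
        return True


def demo(sticky):
    features = InputSlot("features", required=True, sticky=sticky)
    # Assumption: property slots keep their value between runs.
    num_clusters = InputSlot("num-clusters", required=False, sticky=True)
    fcm = Module("Fuzzy C_Means", [features, num_clusters])

    features.post([[0.0, 1.0], [1.0, 0.0]])
    num_clusters.post(7)
    first = fcm.run()             # data arrives from upstream; module runs

    num_clusters.post(5)          # user changes the property, Apply and Run
    second = fcm.run()
    return first, second, num_clusters.current


without_sticky = demo(sticky=False)
with_sticky = demo(sticky=True)
```

In this model, without_sticky ends with the second run refused and num-clusters still showing a current value of 7 with 5 pending, which mirrors the state described for Figure 43. With the features slot made sticky, the second run goes ahead and uses 5.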

The input slots for the following modules should now be made sticky: Scale Quad so the dataset does not have to be loaded every time it is to be analyzed with different parameters; Binary Thresh Cube in case the user wants to adjust the threshold value and create a different ROI for the brain data; Vector Normalization so the user can use another normalization method without having to regenerate an entire data vector matrix.

Double-clicking Load Image Quad restarts the map from the beginning, but data will now be retained in the key modules, where the user can interact with them and try different parameters. Changing the number of iterations from 33 to 22 will now work as expected when Apply and Run is clicked, as shown in Figure 44.

Figure 44: Once the input slot is made sticky, the module runs properly when the parameter Maximum Iterations is changed to 22.

Copyright © 2003 National Research Council