
1. Pardo M, Sberveglieri G, Wold B. Yet another Feature Selection Study for Microarrays. Meeting: BITS 2004. Year: 2004. Topic: Microarray algorithms and data analysis. Abstract: Two histopathologically different kinds of rhabdomyosarcoma (RMS), alveolar and embryonal RMS, are associated with distinct clinical characteristics and different cytogenetic properties. Affymetrix microarrays (U133A/B) were used to characterize the 74 tumoral tissues of both kinds. For consistency with previous work, 8801 genes were considered in our analysis, and the train/test division was fixed at 56 training and 18 test samples. Feature Selection (FS) is useful both for enhancing classification performance and, more importantly, for discovering biologically relevant genes. FS is therefore a hot topic in the application of machine learning to the analysis of microarray data.
2. Fu LM, Medico E. FMC, a Fuzzy Map Clustering algorithm for microarray data analysis. Meeting: BITS 2004. Year: 2004. Topic: Microarray algorithms and data analysis. Abstract: As microarray technology emerges as a widely used tool to investigate gene expression and function, laboratories around the world have produced, and are still producing, a huge amount of data, which demands advanced and specialized computational tools for processing. Clustering methods have been successfully applied to such data to reorganize them and extract biological information from them. However, classical clustering methods [1] such as k-means and hierarchical clustering have some intrinsic limits, such as the linear, pairwise nature of the similarity metrics (which fail to highlight nonlinear substructures of the data) and the univocal assignment of each gene to one cluster (which may fail to highlight cluster-to-cluster relationships) [2]. Here we introduce a novel method for clustering microarray data, named Fuzzy Map Clustering (FMC), which may partly overcome these limits. The clustering process of FMC starts by identifying an initial set of clusters: the "density" around each data point (object) is calculated as the average proximity of its K nearest other objects (K neighbours), and the objects that have the highest density among all their K neighbours are chosen. K can be either a fixed number chosen by the user or the number of neighbours within a distance threshold. Then, each object in the dataset is assigned a fuzzy membership to all the defined clusters (a vector containing a percentage of membership to each cluster). Membership is assigned so that similar objects have similar fuzzy membership vectors. Membership assignment is optimized by measuring how well the fuzzy membership vector of one object can be approximated by the vectors of its neighbours.
Finally, a process based on merging adjacent clusters and reassigning fuzzy memberships is repeated until the number of clusters is reduced to a value fixed by the operator. Our computational experiments have shown that FMC can correctly reveal the true cluster structure of the dataset if such a structure exists, even if the clusters contained in the dataset have arbitrary shape. The basic idea underlying FMC may also point to a new way of developing novel clustering methods with a sound mathematical foundation.
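The initial seed-selection step described in the FMC abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`knn_indices`, `density`, `find_seeds`), the Euclidean metric, and the proximity definition 1/(1 + distance) are all assumptions made here for illustration, since the abstract does not specify them. The fuzzy membership assignment and cluster merging steps are not covered.

```python
import math

def knn_indices(points, i, k):
    """Indices of the k nearest other points to points[i] (Euclidean distance assumed)."""
    dists = sorted((math.dist(points[i], p), j)
                   for j, p in enumerate(points) if j != i)
    return [j for _, j in dists[:k]]

def density(points, i, neighbours):
    """Average proximity of points[i] to its neighbours.

    Proximity is taken as 1 / (1 + distance); this choice is an assumption.
    """
    prox = [1.0 / (1.0 + math.dist(points[i], points[j])) for j in neighbours]
    return sum(prox) / len(prox)

def find_seeds(points, k):
    """Return indices of points whose density is highest among their k neighbours."""
    nbrs = [knn_indices(points, i, k) for i in range(len(points))]
    dens = [density(points, i, nbrs[i]) for i in range(len(points))]
    # Ties are kept (>=), so two equally dense neighbours would both become seeds.
    return [i for i in range(len(points))
            if all(dens[i] >= dens[j] for j in nbrs[i])]

# Two cross-shaped groups; the centre of each group is the densest point.
points = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1),
          (10, 10), (11, 10), (9, 10), (10, 11), (10, 9)]
print(find_seeds(points, k=2))  # -> [0, 5]
```

Using a fixed K corresponds to the first option mentioned in the abstract; the distance-threshold variant would replace `knn_indices` with a radius query.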
3. Masulli F, Rovetta S. Ensembling and Clustering Approach to Gene Selection. Meeting: BITS 2004. Year: 2004. Topic: Microarray algorithms and data analysis. Abstract: In pattern recognition, the problem of input variable selection has traditionally focused on technological issues, e.g., performance enhancement, lowering computational requirements, and reduction of data acquisition costs. In the last few years, however, it has found many applications in basic science as a model selection and discovery technique, as shown by a rich literature on the subject that attests to the interest in the topic, especially in the field of bioinformatics. A clear example arises from DNA microarray technology, which provides high volumes of data for each single experiment, yielding measurements for hundreds of genes simultaneously. In this paper, we propose a flexible method for analyzing the relevance of input variables in high-dimensional problems with respect to a given dichotomic classification problem. Both linear and nonlinear cases are considered. In the linear case, the application of derivative-based saliency yields a commonly adopted ranking criterion. In the nonlinear case, the approach is extended by introducing a resampling technique and by clustering the obtained results for stability of the estimate. The method we propose (see Tab. 1) is termed Random Voronoi Ensemble, since it is based on random Voronoi partitions that are replicated by resampling; the method therefore uses an ensemble of random Voronoi partitions. Within each Voronoi region, a linear classification is performed using Support Vector Machines (SVM) with a linear kernel, while the outcomes of the ensemble are integrated using the Graded Possibilistic Clustering technique to ensure an appropriate level of outlier insensitivity.
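The overall structure of the Random Voronoi Ensemble idea can be sketched as follows. This is a rough illustration under loud assumptions, not the authors' method: the linear SVM is replaced by a class-mean difference vector (a simple stand-in linear classifier whose weights play the role of the derivative-based saliency), and Graded Possibilistic Clustering is replaced by plain averaging of normalized saliencies. All function names and parameters (`voronoi_regions`, `linear_saliency`, `rank_features`, `n_sites`, `n_rounds`) are hypothetical.

```python
import random

def voronoi_regions(X, n_sites, rng):
    """Random Voronoi partition: assign each sample to the nearest of n_sites random samples."""
    sites = rng.sample(range(len(X)), n_sites)
    regions = {s: [] for s in sites}
    for i, x in enumerate(X):
        nearest = min(sites,
                      key=lambda s: sum((a - b) ** 2 for a, b in zip(x, X[s])))
        regions[nearest].append(i)
    return list(regions.values())

def linear_saliency(X, y, idx):
    """|w_j| for w = mean(class 1) - mean(class 0) within one region.

    Stand-in for the weights of a linear SVM (assumption); returns None
    when the region contains only one class.
    """
    pos = [X[i] for i in idx if y[i] == 1]
    neg = [X[i] for i in idx if y[i] == 0]
    if not pos or not neg:
        return None
    return [abs(sum(p[j] for p in pos) / len(pos) -
                sum(n[j] for n in neg) / len(neg))
            for j in range(len(X[0]))]

def rank_features(X, y, n_sites=2, n_rounds=20, seed=0):
    """Rank features by saliency accumulated over an ensemble of random partitions."""
    rng = random.Random(seed)
    total = [0.0] * len(X[0])
    for _ in range(n_rounds):                      # resampling of partitions
        for idx in voronoi_regions(X, n_sites, rng):
            s = linear_saliency(X, y, idx)
            if s is None:
                continue                           # skip single-class regions
            norm = sum(s) or 1.0
            # Simple averaging replaces Graded Possibilistic Clustering (assumption).
            total = [t + v / norm for t, v in zip(total, s)]
    return sorted(range(len(total)), key=lambda j: -total[j])
```

Calling `rank_features(X, y)` with `X` as a list of feature vectors and `y` as 0/1 labels returns feature indices ordered from most to least salient. In a faithful implementation, the averaging step would be the place where a robust clustering of the per-region saliency vectors is applied.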