clusterclara            package:goCluster            R Documentation

_C_l_u_s_t_e_r_s _a _d_a_t_a_s_e_t _w_i_t_h _t_h_e _c_l_a_r_a _f_u_n_c_t_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     This function is used in the goCluster framework to cluster a
     dataset with the clara function from the cluster package.

_U_s_a_g_e:

     clusterclara(dataset, clusters, distance = "euclidean", repeats = 1, fixed = TRUE)

_A_r_g_u_m_e_n_t_s:

 dataset: The dataset to be clustered. This has to be a matrix. 

clusters: This specifies the number of clusters that the dataset should
          be partitioned into. 

distance: The distance metric used by clara. The default is
          "euclidean".

   fixed: This option determines whether the analysis should start with
          a fixed random seed or with a truly random seed. A fixed
          seed leads to a stable result but does not represent the
          inherent variability of the clustering approach.

 repeats: If clara clusters without a fixed seed, it may be useful to
          repeat the clustering in order to get an impression of the
          variability of the clustering result. This option specifies
          the number of repeats.

_D_e_t_a_i_l_s:

     Clara clustering will partition the dataset of the parent object
     into the number of clusters specified by the user.  Clara is very
     similar to PAM (partitioning around medoids) but has been adapted
     to large datasets: it searches for medoids on random subsamples
     of the data rather than on the full dataset. It has the same
     advantages as PAM concerning the judgement of quality for the
     resulting clusters and is significantly faster, but it lacks a
     deterministic outcome. You may request a stable result by using a
     fixed seed, but this can convey an incorrect impression of the
     stability of the result. Alternatively, clusterclara can repeat
     the clustering, though that will partially defeat the gain in
     speed over PAM.
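
     The trade-off between a fixed seed and repeated runs can be
     sketched by calling clara from the cluster package directly. This
     is an illustration only, not the code clusterclara uses: the toy
     matrix x and the helper run_clara are made up for the example,
     and rngR = TRUE is assumed so that clara draws its subsamples
     with R's random number generator and therefore responds to
     set.seed.

```r
library(cluster)

## Toy data (hypothetical): two groups of 100 points in two dimensions
set.seed(1)
x <- rbind(matrix(rnorm(200, mean = 0), ncol = 2),
           matrix(rnorm(200, mean = 4), ncol = 2))

## Hypothetical helper: run clara once, optionally with a fixed seed.
## rngR = TRUE lets R's random number generator pick the subsamples.
run_clara <- function(x, k, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  clara(x, k, metric = "euclidean", rngR = TRUE)$clustering
}

## Fixed seed: two runs can be compared for identity
fixed_a <- run_clara(x, 2, seed = 42)
fixed_b <- run_clara(x, 2, seed = 42)
identical(fixed_a, fixed_b)

## No fixed seed: repeat the clustering and compare the cluster sizes
repeats <- lapply(1:3, function(i) run_clara(x, 2))
sapply(repeats, table)
```

     Comparing the cluster sizes across the repeated runs gives a
     rough impression of how stable the partition is.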

_V_a_l_u_e:

     A "tree" (list of lists) of clusters. The first level holds as
     many list elements as the number of times the clustering has been
     repeated. Each of these elements holds a number of lists equal to
     the number of clusters requested. Each node on this second level
     holds the unique ids of the genes in the cluster.
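
     The shape of the returned tree can be pictured with a small
     sketch. The helper cluster_tree below is hypothetical and only
     mimics the structure described above; how goCluster builds the
     tree internally, and whether the second-level nodes are lists or
     character vectors, is an assumption here.

```r
library(cluster)

## Toy data (hypothetical): 20 "genes" with 3 measurements each
set.seed(1)
x <- matrix(rnorm(60), ncol = 3,
            dimnames = list(paste0("gene", 1:20), NULL))

## Hypothetical helper: one first-level element per repeat, each
## holding one node per cluster with the ids assigned to that cluster.
cluster_tree <- function(x, k, repeats = 1) {
  lapply(seq_len(repeats), function(i) {
    cl <- clara(x, k, rngR = TRUE)$clustering
    unname(split(rownames(x), cl))
  })
}

tree <- cluster_tree(x, k = 3, repeats = 2)
length(tree)       # number of repeats
length(tree[[1]])  # number of clusters in the first repeat
tree[[1]][[1]]     # ids in the first cluster of the first repeat
```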

_A_u_t_h_o_r(_s):

     Gunnar Wrobel, <URL: http://www.gunnarwrobel.de>.

_S_e_e _A_l_s_o:

     'clusterAlgorithmClara-class', 'clara'

_E_x_a_m_p_l_e_s:

     require(cluster)

     ## Get the benomyl setup
     data(benomylsetup)

     ## Extract a fraction of the dataset
     benomyldata <- benomylsetup$data$dataset[1:200,]
     benomylids  <- benomylsetup$data$uniqueid[1:200]

     ## Cluster the dataset
     clusterclara(exprs(benomyldata), 4)

