Biological Clustergram Server

Exploratory Clustering

Brief explanation of this page

Paste the data you want clustered here, and select the options you want below.
You should probably keep the number of genes below 500 or so.
The 'Sample' button to the right of the input area fills in a sample data file which you can experiment with.



Option Test Always Never
Log Transform
Gene Centering
Gene Normalization
Array Centering
Array Normalization
Centered metric (Always = Centered, Never = Uncentered)
Distance metric (Always = Euclidean, Never = Pearson)

Do centering with mean or median?
Center Genes/Arrays with Mean
Center Genes/Arrays with Median
Cluster Genes or Arrays?
Cluster Genes Only
Cluster Arrays Only
Cluster both Genes and Arrays

  

Brief Explanation of this web page

Clustering can sometimes seem complicated: there are many ways to cluster, and how you should do it often depends on what you're trying to look at. This page was designed to help you look quickly through the results of changing different parameters. For a description of these parameters I refer you to the documentation for Mike Eisen's Cluster and TreeView programs: http://rana.lbl.gov/manuals/ClusterTreeView.pdf. Another page of interest, which shows how much the results can vary with different clustering metrics, is http://ep.ebi.ac.uk/Docs/dist_clust.
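To make the metric options on the form a little more concrete, here is a minimal Python sketch of the three choices: centered Pearson, uncentered Pearson and Euclidean. The function names are my own for illustration, and this is not the code XCluster runs; the manual linked above is the place to look for the exact definitions.

```python
import math

def pearson(x, y, centered=True):
    """Pearson correlation of two equal-length profiles.

    centered=True subtracts each profile's mean first (ordinary Pearson);
    centered=False uses the raw values (uncentered Pearson).
    A constant profile (centered) or an all-zero profile (uncentered)
    makes the denominator 0; this sketch does not handle that case.
    """
    if centered:
        mean_x = sum(x) / len(x)
        mean_y = sum(y) / len(y)
        x = [a - mean_x for a in x]
        y = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(x, y))
    denominator = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return numerator / denominator

def euclidean(x, y):
    """Straight-line distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Two profiles with the same shape but a constant offset.
low = [1.0, 2.0, 3.0]
high = [11.0, 12.0, 13.0]
print(pearson(low, high, centered=True))
print(pearson(low, high, centered=False))
print(euclidean(low, high))
```

The two example profiles have the same shape, so the centered correlation is essentially 1, while the uncentered correlation and the Euclidean distance both register the offset between them. That is the kind of difference the 'Test' setting lets you see side by side.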

Thanks to Gavin Sherlock for allowing me to use XCluster as the clustering engine for this page. This also means that your data needs to be in a format that is readable by XCluster. A description of the file format can be found at http://genome-www.stanford.edu/~sherlock/cluster.html#formats. Some simple file format adjustment is taken care of automatically to make it slightly more flexible, so you don't actually need a 'GWEIGHT' column and blank lines get filtered out.
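The kind of adjustment described above can be sketched roughly as follows. This is a guess at the idea and not the code this page actually runs. It assumes a tab-delimited file whose first two columns are an identifier and a name, with data columns after that, and an optional row starting with 'EWEIGHT'; the function name `tidy_input` and the default weight of 1 are hypothetical. Check the format description linked above before relying on any of it.

```python
def tidy_input(text):
    """Drop blank lines and add a GWEIGHT column if the header lacks one.

    Assumed layout (not confirmed by this page): tab-delimited,
    column 1 = ID, column 2 = NAME, remaining columns = data.
    """
    # Blank or whitespace-only lines are filtered out.
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return ""
    header = lines[0].split("\t")
    if "GWEIGHT" in header:
        return "\n".join(lines) + "\n"

    tidied = []
    for index, line in enumerate(lines):
        cells = line.split("\t")
        if index == 0:
            filler = "GWEIGHT"      # new column header
        elif cells[0] == "EWEIGHT":
            filler = ""             # the weight row has no gene weight
        else:
            filler = "1"            # hypothetical default weight
        cells.insert(2, filler)
        tidied.append("\t".join(cells))
    return "\n".join(tidied) + "\n"

raw = "ID\tNAME\tA\tB\n\ng1\tgene one\t1.0\t2.0\n"
print(tidy_input(raw))
```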

Output is generated by slcview.

I do not keep track of who uses this page nor what they cluster. All files generated by clustering your data are (usually) deleted shortly after they are generated. If you still have concerns about the privacy of your data I suggest you not use this page.

If you select 'Test' for an option, your data will be clustered both ways. For example, if you select 'Test' for log transform, your data will be clustered twice: once with a log transform applied before clustering, and once without. 'Always' and 'Never' affect every clustering that is done. For example, selecting 'Never' for Gene Centering means that, whatever other options are selected, none of the resulting clusterings will be gene-centered.
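The way 'Test' multiplies the work can be sketched in a few lines of Python. The option names in the dictionary are hypothetical stand-ins for the form rows, and the `expand` function is only an illustration of the idea, not the server's code.

```python
from itertools import product

# Hypothetical names mirroring the form rows; each value is the chosen button.
settings = {
    "log_transform": "Test",
    "gene_centering": "Never",
    "gene_normalization": "Test",
    "array_centering": "Always",
    "array_normalization": "Never",
}

def expand(settings):
    """Return one dict of on/off flags per clustering run."""
    choices = {"Always": (True,), "Never": (False,), "Test": (True, False)}
    names = list(settings)
    value_sets = [choices[settings[name]] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_sets)]

runs = expand(settings)
print(len(runs))  # two options set to 'Test' give 2 * 2 = 4 runs
for run in runs:
    print(run)
```

Each option set to 'Test' doubles the number of runs, so five such options give 32 runs and six give 64.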

The order of operations for preprocessing data is the same as in Mike Eisen's Cluster program - log transform, center genes, normalize genes, center arrays, then normalize arrays.
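As a rough illustration of that order, here is a small Python sketch using plain lists, with genes as rows and arrays as columns. It assumes a base-2 log transform and that normalizing means scaling a row or column so its squares sum to 1; those details, the function names and the lack of missing-value handling are my simplifications, so check the Cluster manual for the exact behaviour.

```python
import math
from statistics import mean, median

def center(values, use_median=False):
    """Subtract the mean (or median) from every value."""
    middle = median(values) if use_median else mean(values)
    return [v - middle for v in values]

def normalize(values):
    """Scale so the squares sum to 1 (assumed definition)."""
    length = math.sqrt(sum(v * v for v in values))
    return [v / length for v in values] if length else list(values)

def transpose(matrix):
    return [list(column) for column in zip(*matrix)]

def preprocess(matrix, log=False, center_genes=False, norm_genes=False,
               center_arrays=False, norm_arrays=False, use_median=False):
    """Apply the selected steps in the fixed order described above."""
    rows = [list(row) for row in matrix]
    if log:                      # 1. log transform (base 2 assumed)
        rows = [[math.log2(v) for v in row] for row in rows]
    if center_genes:             # 2. center genes (rows)
        rows = [center(row, use_median) for row in rows]
    if norm_genes:               # 3. normalize genes (rows)
        rows = [normalize(row) for row in rows]
    if center_arrays:            # 4. center arrays (columns)
        rows = transpose([center(col, use_median) for col in transpose(rows)])
    if norm_arrays:              # 5. normalize arrays (columns)
        rows = transpose([normalize(col) for col in transpose(rows)])
    return rows

data = [[1.0, 2.0, 4.0],
        [8.0, 8.0, 8.0]]
print(preprocess(data, log=True, center_genes=True))
```

The order matters because each step works on the output of the one before it: centering genes after a log transform, for instance, is not the same as centering the raw values.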

Please try not to select 'Test' for every available option, in the interest of server load. For the same reason, please do not put in massive data sets. You may use your own definition of massive, but please be patient: the server will sometimes have to cluster 32 or 64 data sets and create clustergrams for all of them before anything shows up on your screen.
Several hundred genes seems to be ok, but above 500 or so the web server seems to hit some limitation and will not generate the pictures. I'm not sure why, but if you see neither images nor error messages after you hit submit and the page loads, it's likely your data set was too large.

This page was designed so that you could take the options you like and use their corresponding options in Mike Eisen's Cluster program and get relatively similar output. If you use Gavin Sherlock's XCluster you should definitely get the same output, but you will probably have to preprocess your file first to do the gene/array centering/normalization, etc. (XCluster, which does the clustering for this web page, uses Average Linkage Hierarchical Clustering.)
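For readers who have not met average linkage before, here is a deliberately naive Python sketch of the textbook version, where the distance between two clusters is the mean of all pairwise distances between their members. Whether XCluster computes it exactly this way is not something this page states, so treat it as an illustration of the idea only. The function names are my own.

```python
import math

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def average_linkage(points, dist=euclidean):
    """Return the merges as (cluster_a, cluster_b, distance) tuples.

    Original points are clusters 0..n-1; each merge creates a new
    cluster with the next unused number.
    """
    clusters = {i: [i] for i in range(len(points))}
    merges = []
    next_id = len(points)
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for position, a in enumerate(ids):
            for b in ids[position + 1:]:
                # Mean of all member-to-member distances.
                pairs = [dist(points[i], points[j])
                         for i in clusters[a] for j in clusters[b]]
                d = sum(pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, d))
        next_id += 1
    return merges

profiles = [[0.0], [1.0], [10.0]]
print(average_linkage(profiles))
```

In the example the two nearby profiles are joined first, and the third is then joined to that pair at the average of its distances to both members.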

I hope this is helpful/informative for you.



This page is still under development. Feedback and comments: email me

Back to main slcview page.

Powered by and many thanks to SourceForge.net.