Paste the data you want to cluster here, then select the options you want below.
It is best to keep the number of genes below about 500.
If you hit the 'Sample' button to the right of the input area, it will load
a sample data file that you can experiment with.
Clustering can sometimes seem complicated: there are many ways to cluster, and the right choice often depends on what you're trying to look at. This page was designed to help you look quickly through the results of changing different parameters. For a description of these parameters, see the documentation for Mike Eisen's Cluster and TreeView programs: http://rana.lbl.gov/manuals/ClusterTreeView.pdf. Another page of interest, which shows how much results can vary with different clustering metrics, is http://ep.ebi.ac.uk/Docs/dist_clust.
Thanks to Gavin Sherlock for allowing me to use XCluster as the clustering
engine for this page. This also means that your data needs to be in a format
that is readable by XCluster. A description of the file format can be found at
http://genome-www.stanford.edu/~sherlock/cluster.html#formats.
Some simple format adjustments are made automatically to make the input
slightly more flexible: you don't actually need a 'GWEIGHT' column, and blank
lines are filtered out.
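
As a rough illustration of that kind of cleanup, here is a minimal Python sketch. It is not the code this page runs. It assumes tab-delimited input with the gene ID in the first column and NAME in the second, and it assumes a default gene weight of 1; the function name normalize_input is made up for this example.

```python
def normalize_input(text):
    """Drop blank lines and add a GWEIGHT column when the header lacks one.

    Assumptions (not taken from the XCluster documentation): the input is
    tab-delimited, column 0 is the gene ID, column 1 is NAME, and a
    missing gene weight defaults to 1.
    """
    # Keep only lines that contain something other than whitespace.
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return ""
    rows = [ln.split("\t") for ln in lines]
    header = rows[0]
    if "GWEIGHT" not in header:
        header.insert(2, "GWEIGHT")
        for row in rows[1:]:
            # An EWEIGHT row, if present, carries no gene weight.
            row.insert(2, "" if row[0] == "EWEIGHT" else "1")
    return "\n".join("\t".join(row) for row in rows) + "\n"
```

Input that already has a GWEIGHT column passes through with only its blank lines removed.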
I do not keep track of who uses this page nor what they cluster. All files generated by clustering your data are (usually) deleted shortly after they are generated. If you still have concerns about the privacy of your data I suggest you not use this page.
If you select 'Test' for an option, your data will be clustered both with and without that option. For example, if you select 'Test' for log transform, the data is clustered two ways: once after a log transform and once without it. 'Always' and 'Never' apply to every clustering run. For example, selecting 'Never' for Gene Centering means that, whatever other options are selected, none of the runs will use gene-centered data.
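
One way to picture this: each option set to 'Test' doubles the number of clustering runs, so five options on 'Test' give 32 runs. The Python sketch below enumerates the runs for a given set of choices. The option names and the function expand_settings are hypothetical and are not how this page is implemented.

```python
from itertools import product

# Map each menu choice to the on/off values it contributes to a run.
CHOICES = {"Always": (True,), "Never": (False,), "Test": (True, False)}

def expand_settings(settings):
    """Return one dict of on/off flags per clustering run.

    `settings` maps an option name (hypothetical names here) to
    'Always', 'Never' or 'Test'.
    """
    names = list(settings)
    combos = product(*(CHOICES[settings[name]] for name in names))
    return [dict(zip(names, combo)) for combo in combos]

runs = expand_settings({
    "log_transform": "Test",
    "center_genes": "Never",
    "normalize_genes": "Always",
})
for run in runs:
    print(run)
```

With one option on 'Test', this prints two runs that differ only in log_transform; center_genes is off in both.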
The order of operations for preprocessing data is the same as in Mike Eisen's Cluster program - log transform, center genes, normalize genes, center arrays, then normalize arrays.
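
The ordering can be sketched in Python as below. Only the order of the steps comes from this page. The details are my assumptions: a base-2 log, centering by subtracting the mean, normalizing so the sum of squares is 1, and a matrix with no missing values. Rows are genes and columns are arrays.

```python
import math

def _center(vec):
    # Subtract the mean (assumed; median centering is another common choice).
    mean = sum(vec) / len(vec)
    return [v - mean for v in vec]

def _normalize(vec):
    # Scale so the sum of squares is 1 (assumed); leave all-zero vectors alone.
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else list(vec)

def _by_rows(matrix, fn):
    return [fn(row) for row in matrix]

def _by_cols(matrix, fn):
    cols = [fn(list(col)) for col in zip(*matrix)]
    return [list(row) for row in zip(*cols)]

def preprocess(matrix, log=False, center_genes=False, normalize_genes=False,
               center_arrays=False, normalize_arrays=False):
    """Apply the enabled steps in the fixed order described above."""
    data = [list(row) for row in matrix]
    if log:
        data = [[math.log2(v) for v in row] for row in data]
    if center_genes:
        data = _by_rows(data, _center)
    if normalize_genes:
        data = _by_rows(data, _normalize)
    if center_arrays:
        data = _by_cols(data, _center)
    if normalize_arrays:
        data = _by_cols(data, _normalize)
    return data
```

Because the order is fixed, enabling both log transform and gene centering always centers the log-transformed values, never the raw ones.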
In the interest of server load, please try not to select 'Test' for every
available option, and please do not submit massive data sets. You may use
your own definition of massive, but please be patient: you will sometimes have
to wait for 32 or 64 data sets to be clustered and clustergrams to be created
for all of them before anything shows up on your screen.
Several hundred genes is probably OK, but above 500 or so the web server
seems to hit a limit and will not generate the pictures. I'm not sure why, but
if you see neither images nor error messages after you hit submit and the page
loads, your data set was likely too large.
This page was designed so that you could take the options you like and use their corresponding options in Mike Eisen's Cluster program and get relatively similar output. If you use Gavin Sherlock's XCluster you should definitely get the same output, but you will probably have to preprocess your file first to do the gene/array centering/normalization, etc. (XCluster, which does the clustering for this web page, uses Average Linkage Hierarchical Clustering.)
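
For readers unfamiliar with the term, here is a small Python sketch of average linkage hierarchical clustering in its textbook form, where the distance between two clusters is the mean of all pairwise distances between their members. It is a naive illustration and not XCluster's code: XCluster may compute cluster distances differently, and the distance used here (1 minus the Pearson correlation) is just an assumed choice of metric.

```python
import math
from itertools import combinations

def pearson_distance(x, y):
    """1 - Pearson correlation; constant vectors are treated as uncorrelated."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 1.0
    return 1.0 - sxy / math.sqrt(sxx * syy)

def average_linkage(rows, dist=pearson_distance):
    """Return the merges as (members_a, members_b, distance), in merge order."""
    # Pairwise distances between the original rows, computed once.
    d = {(i, j): dist(rows[i], rows[j])
         for i, j in combinations(range(len(rows)), 2)}

    def link(a, b):
        # Average linkage: mean distance over all cross-cluster pairs.
        total = sum(d[min(i, j), max(i, j)] for i in a for j in b)
        return total / (len(a) * len(b))

    clusters = [(i,) for i in range(len(rows))]
    merges = []
    while len(clusters) > 1:
        # Join the two closest clusters.
        a, b = min(combinations(clusters, 2), key=lambda pair: link(*pair))
        merges.append((a, b, link(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

genes = [[1, 2, 3], [2, 4, 6.1], [3, 2, 1]]
for a, b, distance in average_linkage(genes):
    print(a, b, round(distance, 3))
```

The first two rows rise together and are joined first; the third row runs the opposite way and joins last, at a distance close to 2.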
I hope this is helpful/informative for you.