Oftentimes, the samples I deal with are full of noise or confounding factors that I am not interested in. For example, human specimens are inevitably noisy because the race, age, sex, occupation, or life history of the subject may have influenced the results.
As a novice in genomic data analysis, one of my goals is to benchmark how well a clustering method works. I ran across this k-means practice exercise at R-exercises the other day and felt it might be a nice start, because k-means is easy to perform and conceptually simple enough for me to follow what is happening behind the clustering machinery.
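
To make the "conceptually simple" part concrete, here is a minimal sketch of the standard k-means loop (Lloyd's algorithm) in plain Python. It is not the R-exercises code. The function name `kmeans`, the toy points, and the seed are all assumptions made up for illustration, and real genomic data would need a proper library implementation.

```python
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Minimal Lloyd's algorithm on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Initialise the centers by picking k distinct data points at random.
    centers = rng.sample(points, k)
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center
        # (squared Euclidean distance).
        new_labels = [
            min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the loop has converged
        labels = new_labels
        # Update step: move each center to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # leave a center in place if it lost all its points
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
labels, centers = kmeans(points, k=2)
print(labels)
```

On this toy input the first three points should end up sharing one label and the last three the other. Which group gets label 0 depends on the random initialisation.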