Dear all,

Could anyone please give me an example of a k-means|| clustering implementation with distributedWekaHadoop? I have already read Mark Hall's blog post [1]. It looks like the ordinary k-means algorithm, but it is not clear to me how to configure, run, and evaluate the clusterer with Hadoop.

Compared to the traditional k-means algorithm [2] and another k-means enhancement [3], is k-means|| accurate when clustering large datasets? Does k-means|| guarantee a better or optimal result, or does it only offer better performance, meaning faster computation than other k-means algorithms?

I'm sorry for the newbie question. I would really appreciate any help you can provide. Thank you.

[1] http://markahall.blogspot.co.id/2014...or-hadoop.html
[2] http://theory.stanford.edu/~sergei/p...db12-kmpar.pdf
[3] http://www.eecs.tufts.edu/~dsculley/...fastkmeans.pdf