dataset question (selection issue)



marta_figuei
06-03-2008, 06:22 PM
Hi
We are completely new to Weka and are currently using it for an academic project. We have a very large amount of data from a data warehouse, and we would like to know whether we should use all of it in the preprocessing phase or select only a small subset. Many thanks in advance.

Mark
06-04-2008, 12:22 AM
Hi,

I think that, in general, it makes sense to pre-process as much data as you possibly can. Whilst it is probably not feasible to use all of it to construct models, at least you will have a large, clean data set to draw samples from.

Cheers,
Mark.
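
To illustrate the "draw samples from a large, clean data set" idea, here is a minimal sketch of reservoir sampling in plain Python. It is not Weka code and not necessarily what Mark had in mind; it only assumes the cleaned data has been exported to a CSV file with a header row. The function names and the file layout are hypothetical. Weka has its own resampling filters, which may be the more natural choice if you stay inside the tool.

```python
import csv
import random


def reservoir_sample(rows, k, seed=None):
    """Pick k rows uniformly at random from an iterable of unknown length.

    Only k rows are held in memory at a time, so the full data set
    never has to be loaded (Algorithm R).
    """
    rng = random.Random(seed)
    sample = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            sample.append(row)
        else:
            # Keep row i with probability k / (i + 1).
            j = rng.randint(0, i)  # both ends inclusive
            if j < k:
                sample[j] = row
    return sample


def sample_csv_rows(path, k, seed=None):
    """Hypothetical helper: sample k data rows from a cleaned CSV export."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)  # assumes the first line is a header
        return header, reservoir_sample(reader, k, seed)
```

Fixing the seed makes the sample repeatable, which helps when you want to compare models built on the same subset. If the data set has fewer than k rows, all of them are returned.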