Use of multiple cores



greg.soulsby
05-16-2009, 04:57 AM
While running some jobs on this machine I note that only one of its 4 cores is running. Is there a way to make use of a multi core machine? I know in the Weka book there is a conversation about distributing experiments over multiple machines but what about multiple cores?

Mark
05-17-2009, 02:08 AM
You can make use of multiple cores in a number of ways with Weka. In the Explorer each panel runs schemes in its own thread, so you can have a clusterer and a classifier being trained in parallel, for example. The Knowledge Flow is also multi-threaded. Each independent flow runs in its own thread, so you can start more than one in parallel. Furthermore, in 3.6 and the developer branch of Weka the "Classifier" component in the Knowledge Flow is multi-threaded. This means it is possible to train models for multiple folds of a cross-validation in parallel (there is an "execution slots" parameter in the customizer dialog for Classifiers in the Knowledge Flow).
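
To illustrate the "execution slots" idea, here is a minimal sketch in plain Java. It is not Weka's implementation: `FoldSlots`, `trainFold` and `runFolds` are hypothetical names, and `trainFold` only burns CPU as a stand-in for building a model on one fold. The point is that a fixed pool of `slots` threads works through the folds, so at most `slots` folds are being trained at any moment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FoldSlots {

    // Hypothetical stand-in for training and evaluating one fold (not a Weka call).
    static double trainFold(int fold) {
        double acc = 0.0;
        for (int i = 1; i <= 1_000_000; i++) {
            acc += Math.sqrt(i + fold); // deterministic CPU-bound work
        }
        return acc;
    }

    // Runs numFolds tasks on a pool of `slots` threads; results come back in fold order.
    static List<Double> runFolds(int numFolds, int slots) {
        ExecutorService pool = Executors.newFixedThreadPool(slots);
        try {
            List<Future<Double>> pending = new ArrayList<>();
            for (int f = 0; f < numFolds; f++) {
                final int fold = f;
                Callable<Double> task = () -> trainFold(fold);
                pending.add(pool.submit(task));
            }
            List<Double> results = new ArrayList<>();
            for (Future<Double> p : pending) {
                results.add(p.get()); // blocks until that fold has finished
            }
            return results;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Assumption for this sketch: use as many slots as there are cores.
        int slots = Runtime.getRuntime().availableProcessors();
        List<Double> results = runFolds(10, slots);
        System.out.println("Finished " + results.size() + " folds using " + slots + " slots");
    }
}
```

Setting the slot count above the number of cores gains nothing for CPU-bound training, and every fold in flight holds its own model and data copy in the same heap.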

As you noted, experiments can be distributed to multiple machines by the Experimenter. It is possible to start multiple RemoteEngines on the same machine (thus taking advantage of multiple cores) by specifying a different port for each (-p option).

Cheers,
Mark.

greg.soulsby
05-17-2009, 06:06 AM
Many thanks. Weka makes me very happy.

So if we are utilising the cores to maximum do we then run into memory constraints?

If we have maxheap=1300m when we fire up Weka and we have Execution Slots = 2 for a classifier within a KnowledgeFlow, what is the protocol? Do they share the 1300m? Or do they get maxheap=1300m each? That doesn't sound likely: if there is only 2000m of physical memory, would they not start swapping each other in and out?

Under the multiple machines scenario, where we have multiple RemoteEngines on the same box, are there different memory issues there?

Mark
05-17-2009, 03:37 PM
If you start one virtual machine, then all of its threads share the same heap, so both execution slots draw on the single 1300m allocation. Multiple virtual machines (on the same or different hosts) each get their own memory allocation.
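
A small plain-Java sketch makes the first case visible. It assumes nothing about Weka; `SharedHeap` and `maxHeapSeenByThreads` are hypothetical names. Each worker thread asks the JVM for its maximum heap size, and every thread reports the same figure because there is only one heap per virtual machine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SharedHeap {

    // Asks each of `threads` workers for the JVM's maximum heap size in bytes.
    static List<Long> maxHeapSeenByThreads(int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> pending = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                Callable<Long> task = () -> Runtime.getRuntime().maxMemory();
                pending.add(pool.submit(task));
            }
            List<Long> seen = new ArrayList<>();
            for (Future<Long> p : pending) {
                seen.add(p.get());
            }
            return seen;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        for (long bytes : maxHeapSeenByThreads(2)) {
            System.out.println("max heap seen by a worker: " + (bytes / (1024 * 1024)) + " MB");
        }
    }
}
```

Running it with a different `-Xmx` value changes the number printed, but both workers always print the same one. Two separate JVMs started with their own `-Xmx` would each report their own limit.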

Cheers,
Mark.

virginie
09-04-2014, 11:07 AM
Hi,

I have a similar question to greg's: I'd like to know whether it is possible to distribute the workload of a single process over several cores. I do not want to train two classifiers at the same time, for example; I'd like to reduce the processing time of my feature selection (ClassifierSubsetEval + BestFirst with Weka 3.6.11). Is there anything similar to the multi-threaded "Classifier" component in the Knowledge Flow, but for feature selection, that I could use?

2nd question: Does the multi-threaded "Classifier" component in the Knowledge Flow work only for the cross-validation option? I would rather not use cross-validation as I have a large dataset; I want to use my predefined train and test sets instead.

Thank you.

Mark
09-05-2014, 12:37 AM
The development version of Weka (3.7) has more support for multi-threading. Several meta classifiers (Bagging, Random forests, Rotation forests etc.) can take advantage of multiple CPU cores when building trees. There is also support for multiple cores in the multiLayerPerceptrons package. As for attribute selection, CfsSubsetEval can use multiple threads when computing its correlation matrix and GreedyStepwise can evaluate subsets in parallel. If you combine ClassifierSubsetEval with a multi-threaded base classifier and GreedyStepwise as the search, then you will be in parallel nirvana :-)
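
For anyone wondering what "evaluate subsets in parallel" means for a greedy forward search, here is a rough plain-Java sketch. It is not Weka's GreedyStepwise code: `ParallelForwardSearch`, `merit`, `bestAddition` and `select` are hypothetical names, and `merit` is a toy scoring function standing in for the expensive step of evaluating a classifier on a candidate subset. At each step every "current subset plus one attribute" candidate is scored on a thread pool, and the best improving attribute is added.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelForwardSearch {

    // Toy merit (an assumption of this sketch): low-index attributes help most,
    // and every attribute added costs a fixed penalty.
    static double merit(Set<Integer> subset) {
        double m = 0.0;
        for (int a : subset) {
            m += 1.0 / (1 + a);
        }
        return m - 0.3 * subset.size();
    }

    // One forward step: score every single-attribute extension in parallel.
    // Returns the attribute that improves merit most, or -1 if none improves it.
    static int bestAddition(Set<Integer> current, int numAttrs, ExecutorService pool)
            throws Exception {
        Map<Integer, Future<Double>> pending = new LinkedHashMap<>();
        for (int a = 0; a < numAttrs; a++) {
            if (current.contains(a)) {
                continue;
            }
            Set<Integer> candidate = new TreeSet<>(current);
            candidate.add(a);
            Callable<Double> task = () -> merit(candidate);
            pending.put(a, pool.submit(task));
        }
        int best = -1;
        double bestMerit = merit(current);
        for (Map.Entry<Integer, Future<Double>> e : pending.entrySet()) {
            double m = e.getValue().get();
            if (m > bestMerit) {
                bestMerit = m;
                best = e.getKey();
            }
        }
        return best;
    }

    // Greedy forward selection: keep adding the best attribute until nothing improves.
    static Set<Integer> select(int numAttrs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Set<Integer> current = new TreeSet<>();
            int next;
            while ((next = bestAddition(current, numAttrs, pool)) >= 0) {
                current.add(next);
            }
            return current;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("Selected attributes: " + select(5, 2));
    }
}
```

The candidates within one step are independent of each other, which is why this kind of search parallelises well; the steps themselves still run one after another. In Weka 3.7 the equivalent behaviour is switched on through options on the search and classifier classes rather than by writing code like this, so check the option listings for the version you have installed.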

Cheers,
Mark.