Use Weka model with Pentaho



PiKo
03-02-2009, 10:27 AM
Hi,

I'm looking for a way to use models that I created with Weka in the Pentaho platform.

With the Weka KnowledgeFlow I generated a linear regression and saved the model. So now I have a *.model file, but I don't know how to use this file in Pentaho.
My objective is to create a graph of my linear regression.

Can someone help me?

Thank you very much.

PiKo

Mark
03-02-2009, 03:51 PM
Hi PiKo,

Weka models can be used for scoring new data from the Pentaho platform by creating a PDI (Kettle) transformation using the WekaScoring Kettle plugin. This plugin can load serialized Weka models (and some PMML models) and apply them to score incoming data rows. You can then set up an xaction to execute this Kettle transformation on the BI server. The WekaScoring plugin is available from:

http://wiki.pentaho.com/display/EAI/List+of+Available+Pentaho+Data+Integration+Plug-Ins
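
To make "scoring" concrete for the linear regression case, here is a minimal plain-Java sketch of the idea: each incoming row gets a prediction appended as an extra column. This is not the WekaScoring plugin's code and does not use the Weka API; the class name, the intercept and the coefficients are all made up for illustration.

```java
import java.util.Arrays;

class LinearScoringSketch {

    // Prediction of a linear model: intercept + sum(coefficients[i] * row[i]).
    static double score(double intercept, double[] coefficients, double[] row) {
        if (coefficients.length != row.length) {
            throw new IllegalArgumentException("row has " + row.length
                    + " fields, model expects " + coefficients.length);
        }
        double prediction = intercept;
        for (int i = 0; i < row.length; i++) {
            prediction += coefficients[i] * row[i];
        }
        return prediction;
    }

    // Returns a copy of every row with the prediction appended as the last column.
    static double[][] scoreRows(double intercept, double[] coefficients, double[][] rows) {
        double[][] scored = new double[rows.length][];
        for (int r = 0; r < rows.length; r++) {
            scored[r] = Arrays.copyOf(rows[r], rows[r].length + 1);
            scored[r][rows[r].length] = score(intercept, coefficients, rows[r]);
        }
        return scored;
    }

    public static void main(String[] args) {
        // Hypothetical model and data, for illustration only.
        double intercept = 1.0;
        double[] coefficients = {2.0, 0.5};
        double[][] rows = {{1.0, 2.0}, {3.0, 4.0}};

        for (double[] row : scoreRows(intercept, coefficients, rows)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

The scored rows (inputs plus predicted value) are the kind of output you would then feed into a chart.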

Cheers,
Mark.

PiKo
03-03-2009, 12:13 PM
Hi Mark,

Thank you very much for your answer.

The WekaScoring Kettle plugin is working well with my linear regression model. I get the predicted values at the end, thanks for your help.

But I have tried with an RBF Network model and it's not working. The "field mapping" and "Model" tabs are not filled in, so I can't get my predicted values.

Is there something I'm missing, or something special I need to do?

Thanks

PiKo

Mark
03-03-2009, 05:37 PM
Hi PiKo,

RBFNetwork seems to be working OK for me (I've just tested using it in the WekaScoring plugin to score the iris data). Can you tell me which version of Weka you are using (and which version of Kettle)? Also, is it possible for you to start Spoon from a command prompt window so that you can see any error messages generated?
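
If Spoon does not show anything useful, one way to get at an error message is to try reading the model file outside of Kettle. The sketch below assumes the *.model file is a plain Java serialization stream (that is how Weka writes binary models) and simply lists the class of every object it can read. The class name ModelFileCheck is made up, and the file you pass in is whatever you saved from Weka.

```java
import java.io.BufferedInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.util.ArrayList;
import java.util.List;

class ModelFileCheck {

    // Reads every object in a Java-serialized file and returns the class names.
    // Throws if an object cannot be deserialized with the classes on the classpath.
    static List<String> listObjects(String path) throws IOException, ClassNotFoundException {
        List<String> names = new ArrayList<>();
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(new FileInputStream(path)))) {
            while (true) {
                try {
                    Object o = in.readObject();
                    names.add(o == null ? "null" : o.getClass().getName());
                } catch (EOFException endOfFile) {
                    break; // no more objects in the stream
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.out.println("usage: java ModelFileCheck <model file>");
            return;
        }
        try {
            for (String name : listObjects(args[0])) {
                System.out.println("OK: " + name);
            }
        } catch (Exception e) {
            // This is the message worth posting back to the forum.
            System.out.println("Could not load model: " + e);
        }
    }
}
```

Run it with the same weka.jar that Kettle is using on the classpath, for example `java -cp "weka.jar;." ModelFileCheck mymodel.model` on Windows (mymodel.model is a placeholder). If the jar does not match the Weka version that wrote the model, the exception printed here should say so.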

Cheers,
Mark.

PiKo
03-04-2009, 06:17 AM
Hi Mark,

I'm using Spoon 3.1.0 and Weka 3.6.0 on Windows Vista.
I've run Spoon.bat from a command prompt, but it seems that the process runs in the background, so no error messages are shown.

Like you, I tried to score the iris data, but it did not work for me.
Maybe I am doing something wrong when creating the model. You can see my KnowledgeFlow below; I saved my model using the pop-up menu on RBF Network.
http://img50.imageshack.us/img50/4326/wekakfrbfnet.jpg

Here is the text viewer result:
http://img3.imageshack.us/img3/2405/wekatextviewresult.jpg

Then in Spoon, I take the "Weka pre-build model scoring" step and use the "Load/import model" field to load my model. But nothing happens in the other tabs.

Is there something wrong? Or is there an example or a tutorial that uses RBF Network?

Thanks for your help.

Mark
03-04-2009, 07:21 PM
Hi PiKo,

I'm pretty sure that I've figured out what is going wrong. When you installed the WekaScoring plugin in Kettle you probably downloaded the weka.jar file that was linked to alongside the WekaScoring download on the list of available plugins web page. I should have deleted this weka.jar file from that page ages ago - it is no longer needed since we are now up to release 3.6.0. If this is indeed the case for you, then delete that weka.jar file from your plugins directory in Kettle and replace it with the weka.jar that came with your copy of Weka 3.6.0. I've tested this under WinXP and have no problems with loading RBFNetwork models into WekaScoring.
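
Copying the file by hand is all that is needed, but if you want to script the swap, a minimal sketch could look like the following. Both directories are passed in as arguments because they depend on your installation; the class name ReplaceWekaJar is made up, and the only assumption is that the jar is called weka.jar in both places.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

class ReplaceWekaJar {

    // Overwrites <pluginDir>/weka.jar with <wekaInstallDir>/weka.jar
    // and returns the path of the replaced file.
    static Path replaceJar(Path wekaInstallDir, Path pluginDir) throws IOException {
        Path source = wekaInstallDir.resolve("weka.jar");
        Path target = pluginDir.resolve("weka.jar");
        if (!Files.isRegularFile(source)) {
            throw new IOException("No weka.jar found in " + wekaInstallDir);
        }
        // REPLACE_EXISTING removes the old jar before copying the new one in.
        return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("usage: java ReplaceWekaJar <Weka install dir> <Kettle plugin dir>");
            return;
        }
        Path replaced = replaceJar(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Replaced " + replaced);
    }
}
```

Close Spoon before running it and restart Spoon afterwards so the new jar is picked up.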

One other thing you may not have realized: on the flow shown in your attachment you are using 10-fold cross validation. If you save out the RBFModel from the RBFNetwork component you will be saving the model that was created on the last (i.e. 10th) training fold of the data. What you probably want is a model trained on all your data. This can be achieved by replacing the CrossValidationFoldMaker component with a TrainingSetMaker.
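
To see why this matters, here is a simplified plain-Java sketch of which rows a fold's model gets to train on. It is not Weka's actual splitting code (Weka can also randomize and stratify the data first); the class and method names are made up, and the folds are just taken as consecutive blocks of rows.

```java
import java.util.ArrayList;
import java.util.List;

class FoldDemo {

    // Row indices used for TRAINING in the given fold (0-based) of a k-fold split.
    // The fold's own block of rows is held out for testing.
    static List<Integer> trainingIndices(int numRows, int numFolds, int fold) {
        int foldSize = numRows / numFolds;
        int extra = numRows % numFolds; // the first 'extra' folds get one more row
        int start = fold * foldSize + Math.min(fold, extra);
        int size = foldSize + (fold < extra ? 1 : 0);

        List<Integer> train = new ArrayList<>();
        for (int i = 0; i < numRows; i++) {
            if (i < start || i >= start + size) {
                train.add(i);
            }
        }
        return train;
    }

    public static void main(String[] args) {
        int rows = 150; // the iris data has 150 instances
        List<Integer> lastFold = trainingIndices(rows, 10, 9);
        System.out.println("Rows seen by the last-fold model: " + lastFold.size());
        System.out.println("Rows seen by a model trained on everything: " + rows);
    }
}
```

With 10 folds the model saved from the last fold has never seen a tenth of the data, which is why a model trained on the full training set is usually the one you want to deploy.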

Cheers,
Mark.

PiKo
03-05-2009, 07:01 AM
Hi Mark,

The problem was indeed the weka.jar file. I used the file from the Weka-3.6.0 directory and it's working fine now. Thank you very much for your help!

And thank you for your remark about saving the model, I was not aware of that.

Have a nice day.