Hitachi Vantara Pentaho Community Forums

Type: Posts; User: Mark; Keyword(s):

  1. If it was possible to disable this then the...

    If it was possible to disable this then the Explorer would generate an exception instead. This is because it has detected that the structure of your test data differs from that of the training data...
  2. Replies: 1 | Views: 1,390

    I'm afraid you'll need to upgrade to 3.9.1....

    I'm afraid you'll need to upgrade to 3.9.1. Unfortunately, SourceForge changed how they serve up direct download links - sometimes they generate a redirect, and the package manager can't handle this....
  3. Replies: 1 | Views: 966

    For new instances without class labels you need...

    For new instances without class labels you need to create an ARFF file with exactly the same structure (number of attributes, attribute names, order of attributes, and declaration of nominal values)...
  4. Replies: 2 | Views: 1,437

    In the editor dialog for the MultilayerPerceptron...

    In the editor dialog for the MultilayerPerceptron there is a button labeled "More" in the "About" field at the top - you can click this to get more information about each of the options. The...
  5. Replies: 1 | Views: 799

    I'm not too sure what you are asking here - do...

    I'm not too sure what you are asking here - do you want to see the individual predictions from the tree/regression on test data points? If so, there are options to output predictions, in various...
  6. Replies: 0 | Views: 738

    2017 REXER Data Science Survey

    Hi everyone,

    REXER are running their data science survey again. Please participate and indicate that without Weka you'd be living under a bridge :-)
    ...
  7. AddNoise only operates on nominal attributes....

    AddNoise only operates on nominal attributes. Values are altered randomly using a uniform distribution.

    Cheers,
    Mark.
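
A rough illustration of the behaviour described above, not Weka's AddNoise source: the sketch replaces a given percentage of the values in one nominal column with a different label drawn uniformly at random. The names `NominalNoiseSketch` and `addNoise` are hypothetical, and always picking a label other than the current one is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class NominalNoiseSketch {

    // Returns a copy of `values` in which `percent` % of the entries have been
    // replaced by a different label, drawn uniformly from the remaining labels.
    // (Assumption: the replacement always differs from the original value.)
    static List<String> addNoise(List<String> values, List<String> labels,
                                 double percent, long seed) {
        if (labels.size() < 2) {
            throw new IllegalArgumentException("need at least two labels");
        }
        Random rng = new Random(seed);
        List<String> noisy = new ArrayList<>(values);

        // Choose which positions to alter by shuffling the index list.
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < values.size(); i++) {
            positions.add(i);
        }
        Collections.shuffle(positions, rng);
        int toChange = (int) Math.round(values.size() * percent / 100.0);

        for (int k = 0; k < toChange; k++) {
            int pos = positions.get(k);
            // Uniform draw over the labels other than the current one.
            List<String> others = new ArrayList<>(labels);
            others.remove(noisy.get(pos));
            noisy.set(pos, others.get(rng.nextInt(others.size())));
        }
        return noisy;
    }

    public static void main(String[] args) {
        List<String> labels = List.of("sunny", "overcast", "rainy");
        List<String> column = List.of("sunny", "sunny", "rainy", "overcast", "rainy");
        System.out.println(addNoise(column, labels, 40.0, 1L));
    }
}
```

Numeric attributes are left out on purpose, since the reply states that the filter only touches nominal ones.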
  8. That would be the "exponent" option for the...

    That would be the "exponent" option for the PolyKernel.

    Cheers,
    Mark.
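
For context, a minimal sketch of what the exponent does in a polynomial kernel: K(x, y) = <x, y>^p, or (<x, y> + 1)^p when lower-order terms are switched on. This is a standalone illustration, not the PolyKernel source; the class and method names are hypothetical.

```java
final class PolyKernelSketch {

    // Plain dot product <x, y> of two equal-length vectors.
    static double dot(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("vectors must have equal length");
        }
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += x[i] * y[i];
        }
        return sum;
    }

    // K(x, y) = <x, y>^p, or (<x, y> + 1)^p when lower-order terms are used.
    static double kernel(double[] x, double[] y, double exponent,
                         boolean lowerOrderTerms) {
        double d = dot(x, y);
        if (lowerOrderTerms) {
            d += 1.0;
        }
        return Math.pow(d, exponent);
    }

    public static void main(String[] args) {
        double[] a = {1.0, 2.0};
        double[] b = {3.0, 4.0};
        System.out.println(kernel(a, b, 1.0, false)); // exponent 1: linear
        System.out.println(kernel(a, b, 2.0, false)); // exponent 2: quadratic
    }
}
```

With an exponent of 1 the kernel reduces to the ordinary dot product, which is why that setting gives a linear model.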
  9. What evaluation mode are you using in the...

    What evaluation mode are you using in the Explorer? Predictions and evaluation results for percentage split and x-val involve using the models trained on the training folds. However, the model output...
  10. Replies: 2 | Views: 925

    I'm not sure exactly what your application is. If...

    I'm not sure exactly what your application is. If the variables are all numeric then you will need to use a regression scheme such as linear regression or M5P (regression/model trees). J48 can only...
  11. Turn on the "printClassifiers" option - this will...

    Turn on the "printClassifiers" option - this will output the textual representation for each of the trees in the forest. They are not output by default because OOM errors can occur when forming a...
  12. Replies: 1 | Views: 768

    If you are using a version of Weka >= 3.7.2 then...

    If you are using a version of Weka >= 3.7.2 then these classifiers will be in a separate plugin "package" that can be installed via Weka's package manager (GUIChooser-->Tools). The package manager...
  13. I think you misunderstand the meaning of the -K...

    I think you misunderstand the meaning of the -K parameter in RandomForest. It controls how many attributes (out of the complete set) are investigated each time a node is split in the RandomTree. E.g...
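
To make the role of -K concrete, here is a small sketch of per-split attribute sampling: at every node a fresh random subset of K attribute indices is drawn, and only those are candidates for the split. It is an illustration rather than the RandomTree code. The class and method names are hypothetical, and clamping K to the number of predictors is an assumption; the fallback for K < 1, int(log2(#predictors) + 1), is the default documented for the option.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class RandomSubspaceSketch {

    // Number of attributes examined per split. Values of k below 1 fall back
    // to int(log2(numPredictors) + 1).
    static int effectiveK(int k, int numPredictors) {
        if (numPredictors < 1) {
            throw new IllegalArgumentException("need at least one predictor");
        }
        if (k < 1) {
            // floor(log2(n)) computed with integer arithmetic.
            int floorLog2 = 31 - Integer.numberOfLeadingZeros(numPredictors);
            return Math.min(floorLog2 + 1, numPredictors);
        }
        return Math.min(k, numPredictors);
    }

    // Draws the attribute indices that one node is allowed to consider.
    static List<Integer> candidateAttributes(int k, int numPredictors, Random rng) {
        List<Integer> all = new ArrayList<>();
        for (int i = 0; i < numPredictors; i++) {
            all.add(i);
        }
        Collections.shuffle(all, rng);
        return new ArrayList<>(all.subList(0, effectiveK(k, numPredictors)));
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        // Three successive splits each see a different subset of 10 attributes.
        for (int split = 0; split < 3; split++) {
            System.out.println(candidateAttributes(3, 10, rng));
        }
    }
}
```

The full attribute set stays available to the tree as a whole; only the candidates at each individual split are restricted.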
  14. What makes you say it doesn't honor the -K...

    What makes you say it doesn't honor the -K parameter? Do you have a test case that shows the problem?

    Cheers,
    Mark.
  15. Replies: 1 | Views: 1,594

    It looks like your data does not contain a date...

    It looks like your data does not contain a date field, so you can't set a time stamp field to use. The forecasting system can generate an artificial timestamp for you, so remove...
  16. Replies: 1 | Views: 950

    Ah yes :-) RandomForest has been refactored to...

    Ah yes :-) RandomForest has been refactored to extend Bagging (instead of using Bagging internally). You now set the number of trees to build via setNumIterations().

    Cheers,
    Mark.
  17. At present, there is a bug in the save dialog in...

    At present, there is a bug in the save dialog in the Knowledge Flow. It should only show .kf, as this is the only format now supported for saving. In Weka >= 3.8.0 the Knowledge Flow is a completely...
  18. Replies: 1 | Views: 915

    Functionality offered by the Explorer and...

    Functionality offered by the Explorer and Knowledge Flow is fairly similar. One major difference is that the Knowledge Flow offers support for incremental learning schemes (i.e. those that only...
  19. Replies: 3 | Views: 695

    This is a third-party unofficial package for...

    This is a third-party unofficial package for Weka. You could try installing it in older versions of Weka (<= 3.8.0); otherwise contact the authors directly for guidance.

    Cheers,
    Mark.
  20. R does not have an implementation of logistic...

    R does not have an implementation of logistic model trees that I'm aware of. Does your production data include a lot of nominal attributes with many values? Logistic regression/logit boost will...
  21. The relative metrics are normalized by the same...

    The relative metrics are normalized by the same base metric computed from using the mean/mode of the class values in the training data as the predicted value. I'm not sure exactly what you are doing,...
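
A worked sketch of that normalization for a numeric class: the error of the predictions is divided by the error obtained when the training-set mean is used as the prediction for every test instance. The class and method names are hypothetical, and expressing the result as a percentage is an assumption about presentation.

```java
final class RelativeErrorSketch {

    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // 100 * sum|p_i - a_i| / sum|trainMean - a_i|
    static double relativeAbsoluteError(double[] actual, double[] predicted,
                                        double trainMean) {
        double err = 0.0;
        double base = 0.0;
        for (int i = 0; i < actual.length; i++) {
            err += Math.abs(predicted[i] - actual[i]);
            base += Math.abs(trainMean - actual[i]);
        }
        return 100.0 * err / base;
    }

    // 100 * sqrt( sum(p_i - a_i)^2 / sum(trainMean - a_i)^2 )
    static double rootRelativeSquaredError(double[] actual, double[] predicted,
                                           double trainMean) {
        double err = 0.0;
        double base = 0.0;
        for (int i = 0; i < actual.length; i++) {
            err += (predicted[i] - actual[i]) * (predicted[i] - actual[i]);
            base += (trainMean - actual[i]) * (trainMean - actual[i]);
        }
        return 100.0 * Math.sqrt(err / base);
    }

    public static void main(String[] args) {
        double[] trainClass = {1.0, 2.0, 3.0};
        double[] actual = {1.0, 2.0, 3.0};
        double[] predicted = {1.0, 2.0, 4.0};
        double baseline = mean(trainClass); // computed from training data only
        System.out.println(relativeAbsoluteError(actual, predicted, baseline));
        System.out.println(rootRelativeSquaredError(actual, predicted, baseline));
    }
}
```

A value near 100 means the model does no better than always predicting the training mean.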
  22. Replies: 1 | Views: 2,538

    Unfortunately, the direct download URLs that Weka...

    Unfortunately, the direct download URLs that Weka has been using (for core Weka and package downloads) for the last decade or so have started to generate redirects. So SourceForge have changed how...
  23. Replies: 0 | Views: 1,119

    New WEKA releases: 3.6.15, 3.8.1 and 3.9.1

    Hi everyone!

    New versions of Weka are available for download from the Weka homepage:

    * Weka 3.8.1 - stable version. It is available as ZIP, with Win32 installer, Win32 installer incl. JRE...
  24. You are correct. In most cases (where a learning...

    You are correct. In most cases (where a learning algorithm will be applied) all preprocessing should only be done using training data, and then the results applied to the test data. Otherwise...
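
One way to picture "fit on training data, then apply to test data" is standardization: the mean and standard deviation come from the training values alone and are then reused unchanged on the test values. The sketch below is a generic illustration with hypothetical names, not a Weka filter; using the sample standard deviation is an assumption.

```java
final class TrainOnlyScalingSketch {

    private final double mean;
    private final double std;

    private TrainOnlyScalingSketch(double mean, double std) {
        this.mean = mean;
        this.std = std;
    }

    // Estimates the statistics from the training values only.
    static TrainOnlyScalingSketch fit(double[] train) {
        if (train.length == 0) {
            throw new IllegalArgumentException("training data is empty");
        }
        double sum = 0.0;
        for (double v : train) {
            sum += v;
        }
        double mean = sum / train.length;
        double squares = 0.0;
        for (double v : train) {
            squares += (v - mean) * (v - mean);
        }
        double std = train.length > 1 ? Math.sqrt(squares / (train.length - 1)) : 0.0;
        // Fall back to 1 so that constant columns do not divide by zero.
        return new TrainOnlyScalingSketch(mean, std > 0.0 ? std : 1.0);
    }

    // Applies the stored training statistics to any data, including test data.
    double[] apply(double[] data) {
        double[] out = new double[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (data[i] - mean) / std;
        }
        return out;
    }

    public static void main(String[] args) {
        TrainOnlyScalingSketch scaler = fit(new double[] {1.0, 2.0, 3.0});
        double[] scaledTest = scaler.apply(new double[] {4.0, 2.0});
        System.out.println(java.util.Arrays.toString(scaledTest));
    }
}
```

The test values never influence `mean` or `std`, which is the point of the advice.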
  25. Replies: 1 | Views: 868

    Methods mentioned in the optimizing...

    Methods mentioned in the optimizing hyperparameters wiki page are all Weka classifiers themselves and, as such, can be configured and used in the same way as any other classifier. So you can use the...
  26. I think you have covered the main difference...

    I think you have covered the main difference between the two implementations already. All of scikit-learn's methods work on numeric numpy arrays, so categorical variables have to be encoded as...
  27. There isn't anything specific in Weka to do this...

    There isn't anything specific in Weka to do this, I'm afraid. You could probably write some scripting code in Weka (using Groovy or Jython) that uses Weka's classes to produce a stratified learning...
  28. Replies: 1 | Views: 824

    You can read up on the metrics output by Weka....

    You can read up on the metrics output by Weka. Here is a link to information retrieval metrics on Wikipedia. You can google the others:

    https://en.wikipedia.org/wiki/Information_retrieval
    ...
  29. Thread: Weka

    by Mark
    Replies: 1 | Views: 696

    It's actually just a warning, not an error. It's...

    It's actually just a warning, not an error. It's just saying that there are no JDBC drivers in the classpath, which is not a problem if you are not using a database.

    Cheers,
    Mark.
  30. Replies: 1 | Views: 1,087

    How are you parallelising Weka tasks? Are you...

    How are you parallelising Weka tasks? Are you using the distributedWeka packages for Hadoop/Spark? I've used distributedWeka for Hadoop and Spark on a Torque cluster (not using any Torque-specific...
  31. Replies: 1 | Views: 761

    This has been answered over on the Weka mailing...

    This has been answered over on the Weka mailing list:

    https://list.waikato.ac.nz/pipermail/wekalist/2016-October/067894.html

    Cheers,
    Mark.
  32. This is probably a bug. Please create a JIRA...

    This is probably a bug. Please create a JIRA ticket (with steps to reproduce) at:

    http://jira.pentaho.com/browse/PDI/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel

    Cheers,...
  33. Replies: 4 | Views: 1,016

    Yes, you are correct. There was a bug in the filter...

    Yes, you are correct. There was a bug in the filter being used to transform data. I've fixed this and made a new release of the timeseriesForecasting package. Refresh your repository cache and install...
  34. The checkboxes in the Explorer are used simply...

    The checkboxes in the Explorer are used simply for removing attributes via the "Remove" button (which is a shortcut for using the Remove filter) at the bottom of the panel. The attributes converted...
  35. That link contains fairly comprehensive...

    That link contains fairly comprehensive documentation for the time series forecasting environment. The number of instances used to produce a forecast at prediction time depends on the minimum and...
  36. Replies: 4 | Views: 1,016

    Java assertions are turned off by default at...

    Java assertions are turned off by default at runtime (and turned on by Maven's test environment). Your instances structure does not match - why is the Lag_POPULARITY-12 attribute named Lag_POPULARITY...
  37. This has been answered over in the Weka mailing...

    This has been answered over in the Weka mailing list:

    https://list.waikato.ac.nz/pipermail/wekalist/2016-October/067648.html

    Cheers,
    Mark.
  38. Replies: 1 | Views: 1,064

    Try increasing the number of trees learned from...

    Try increasing the number of trees learned from the default of 100. Try 500 or 1000. You can also fiddle with the -K option - this controls the number of attributes randomly considered for splitting...
  39. Replies: 1 | Views: 1,323

    If you mean that J48 does not build a tree on...

    If you mean that J48 does not build a tree on your data (i.e. it just contains the root node predicting the majority class value from the training data) then this means that the algorithm could not...
  40. Replies: 1 | Views: 701

    I'm not too sure I understand. It is not possible...

    I'm not too sure I understand. It is not possible to generate a forecast from an untrained WekaForecaster that is using HoltWinters as the underlying scheme - the forecast() method in HoltWinters...
  41. WekaForecaster uses standard propositional...

    WekaForecaster uses standard propositional machine learning regression methods under the hood to make forecasts. In order to remove the time dependency between data points it transforms it into...
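
The snippet above is cut off before it names the transformation, so the following is only a sketch under an assumption: that the series is turned into rows of lagged values, as attribute names like Lag_POPULARITY-12 elsewhere on this page suggest. Each row then carries its own history, and an ordinary regression scheme can treat the rows independently. Names are hypothetical, and dropping the first `maxLag` rows (rather than keeping them with missing values) is a simplification.

```java
import java.util.Arrays;

final class LagTransformSketch {

    // Each output row is [y(t-1), y(t-2), ..., y(t-maxLag), y(t)],
    // i.e. the lagged inputs followed by the value to predict.
    static double[][] toLagged(double[] series, int maxLag) {
        if (maxLag < 1 || maxLag >= series.length) {
            throw new IllegalArgumentException("maxLag must be in [1, length - 1]");
        }
        int rows = series.length - maxLag;
        double[][] out = new double[rows][maxLag + 1];
        for (int r = 0; r < rows; r++) {
            int t = r + maxLag;
            for (int lag = 1; lag <= maxLag; lag++) {
                out[r][lag - 1] = series[t - lag];
            }
            out[r][maxLag] = series[t];
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] rows = toLagged(new double[] {1, 2, 3, 4, 5}, 2);
        for (double[] row : rows) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```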
  42. Replies: 1 | Views: 800

    If you are forecasting programmatically, and don't...

    If you are forecasting programmatically, and don't require the evaluation routines provided by the forecasting environment, then you can use HoltWinters directly rather than via WekaForecaster. Note...
  43. Replies: 1 | Views: 1,064

    Powers of time help with modelling long term...

    Powers of time help with modelling long term trends according to Richard Darlington from Cornell. Read through his pages on using standard multiple linear regression for time series (unfortunately...
  44. Yes, you are correct - the outcome variable needs...

    Yes, you are correct - the outcome variable needs to be declared in the test data. However, all values can be set to missing (i.e. ?). Weka's Add filter is a convenient way to add such a column, as...
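
As an illustration of what such test data looks like, the sketch below writes a small ARFF file in which the class attribute is declared with its nominal values but every instance has `?` in the class column. It builds the text by hand rather than using the Add filter mentioned above. The names are hypothetical, and it assumes attribute names and labels that need no quoting.

```java
final class UnlabeledArffSketch {

    // Builds ARFF text with numeric inputs and a nominal class whose value is
    // missing ("?") for every row. Names are assumed not to need quoting.
    static String build(String relation, String[] numericAttributes,
                        String classAttribute, String[] classLabels,
                        double[][] rows) {
        StringBuilder sb = new StringBuilder();
        sb.append("@relation ").append(relation).append("\n\n");
        for (String name : numericAttributes) {
            sb.append("@attribute ").append(name).append(" numeric\n");
        }
        // The class must be declared exactly as in the training data.
        sb.append("@attribute ").append(classAttribute)
          .append(" {").append(String.join(",", classLabels)).append("}\n\n");
        sb.append("@data\n");
        for (double[] row : rows) {
            if (row.length != numericAttributes.length) {
                throw new IllegalArgumentException("row width does not match header");
            }
            for (double v : row) {
                sb.append(v).append(',');
            }
            sb.append("?\n"); // class value unknown
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(build("weather_unlabeled",
                new String[] {"temperature", "humidity"},
                "play", new String[] {"yes", "no"},
                new double[][] {{85.0, 85.0}, {64.0, 65.0}}));
    }
}
```

The header has to match the training file (attribute names, order, and nominal value declarations); only the class values differ.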
  45. ROC is a summary statistic that measures the...

    ROC is a summary statistic that measures the ranking performance of a classifier. Accuracy is just one point on the ROC curve - the point that corresponds to a threshold of 0.5 on the probability...
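
A small sketch of the distinction: the area under the ROC curve depends only on how the predicted probabilities rank positives against negatives, while accuracy depends on where the threshold sits. The names are hypothetical; the AUC here is computed as the fraction of positive/negative pairs ranked correctly, with ties counting half.

```java
final class RocVersusAccuracySketch {

    // AUC as the probability that a randomly chosen positive instance gets a
    // higher score than a randomly chosen negative one (ties count as 0.5).
    static double auc(int[] labels, double[] scores) {
        double wins = 0.0;
        long pairs = 0;
        for (int i = 0; i < labels.length; i++) {
            if (labels[i] != 1) {
                continue;
            }
            for (int j = 0; j < labels.length; j++) {
                if (labels[j] != 0) {
                    continue;
                }
                pairs++;
                if (scores[i] > scores[j]) {
                    wins += 1.0;
                } else if (scores[i] == scores[j]) {
                    wins += 0.5;
                }
            }
        }
        if (pairs == 0) {
            throw new IllegalArgumentException("need both classes");
        }
        return wins / pairs;
    }

    // Accuracy when predicting the positive class for scores >= threshold.
    static double accuracy(int[] labels, double[] scores, double threshold) {
        int correct = 0;
        for (int i = 0; i < labels.length; i++) {
            int predicted = scores[i] >= threshold ? 1 : 0;
            if (predicted == labels[i]) {
                correct++;
            }
        }
        return (double) correct / labels.length;
    }

    public static void main(String[] args) {
        int[] labels = {1, 1, 0, 0};
        double[] scores = {0.45, 0.40, 0.30, 0.10};
        System.out.println("AUC: " + auc(labels, scores));
        System.out.println("Accuracy at 0.5: " + accuracy(labels, scores, 0.5));
        System.out.println("Accuracy at 0.35: " + accuracy(labels, scores, 0.35));
    }
}
```

In this toy data the ranking is perfect, yet every instance falls below 0.5, so accuracy at the default threshold is much lower than the ranking quality suggests.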
  46. Ah yes, bummer :-) In later versions of Weka the...

    Ah yes, bummer :-) In later versions of Weka the buildClassifier() method in Bagging has

    m_inBag = null;

    at the end of the method, thus saving memory (and size on disk).

    Can you upgrade to a...
  47. Replies: 1 | Views: 2,502

    By default (unless the kernel option is altered),...

    By default (unless the kernel option is altered), SMO learns a linear model. In the case of classification with multiple input variables, this is actually a separating hyperplane - that is, it...
  48. Replies: 3 | Views: 1,096

    Yes, that is correct. There is no support for...

    Yes, that is correct.

    There is no support for executing attribute selection from the Explorer on Weka Server yet. The cross-validation evaluation for attribute selection could certainly be...
  49. Replies: 3 | Views: 1,096

    Yes, via the new Knowledge Flow engine...

    Yes, via the new Knowledge Flow engine implementation in Weka 3.8.0. Attribute Selection can now be run in the Knowledge Flow (with output like in the Explorer). So you can execute a flow on Weka...
  50. Replies: 1 | Views: 1,041

    There are some Weka plugin steps for PDI. Take a...

    There are some Weka plugin steps for PDI. Take a look at the Plugins for PDI section on the Wiki:

    http://wiki.pentaho.com/display/DATAMINING/Pentaho+Data+Mining+Community+Documentation

    Cheers,...
Results 1 to 50 of 1745

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.