Hitachi Vantara Pentaho Community Forums

Thread: Request to download "Pentaho Data Integration for Hadoop 4.1 GA" for Linux

  1. #1
    Join Date
    Dec 2010

    Default Request to download "Pentaho Data Integration for Hadoop 4.1 GA" for Linux


    I have requested to download "Pentaho Data Integration for Hadoop 4.1 GA" for Linux many times. Why have I still not received the download link? Could you check it for me? Thanks!

    For now I have downloaded PDI for Windows; however, my Hadoop is installed on Linux. It throws the following exception because it fails to find a class file. It seems that some jar files need to be copied to the Hadoop server. Could you tell me which jars should be copied? Thanks!

    Caused by: java.lang.NoClassDefFoundError: org/pentaho/di/core/exception/KettleException
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(
    at org.apache.hadoop.conf.Configuration.getClassByName(
    at org.apache.hadoop.conf.Configuration.getClass(
    at org.apache.hadoop.conf.Configuration.getClass(
    at org.apache.hadoop.mapred.JobConf.getMapperClass(
    at org.apache.hadoop.mapred.MapRunner.configure(
    ... 10 more

  2. #2


    It doesn't matter that your client runs on Windows; it normally communicates with the Pentaho Distribution for Hadoop (PHD) on the Linux side. I have the same setup and it works.

    Did you install the PHD on your Hadoop nodes? There is some 300+ MB of Pentaho software to be installed on top of each node in your cluster. The PHD can be found in the PDI 4.1 server download, in a zip file under pentaho/server/ named
    Last edited by Jasper; 01-04-2011 at 12:59 PM.

  3. #3
    Join Date
    Dec 2010


    Hi, Jasper,

    Thanks! I have found the zip file pentaho\server\
    How do I install it on the Hadoop nodes? I cannot find the relevant installation documentation. The only one I found is this one, but it does not seem to be for phd-ee-4.1.

  4. #4


    Well, that installation guide is outdated now. You can just unzip the PHD into the Hadoop home install directory. The PHD adds some new files to the $HADOOP_HOME/lib directory, and unzipping takes care of that.

    After unzipping you have to install 2 licenses (PDI Ent. + Hadoop Ent.) and off you go.

    You have to repeat this for every node in your cluster.


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.