Thread: PDI 4.3 configuration with Apache Hadoop 1.0.0 on Redhat 2.6.18-164.E15

  1. #1

    PDI 4.3 configuration with Apache Hadoop 1.0.0 on Redhat 2.6.18-164.E15

    Hi All,

    I am Shivanandan Gupta (Shiva). I recently started using PDI 4.3 for a project where Hadoop is both a source and a target. I want to connect to a Hadoop cluster and copy one file from my local Windows system to Hadoop running on Red Hat Linux. This is just the first step: I want to talk to Hadoop from a PDI 4.3 job using the Big Data components.

    Please let me know whether I need to make any prerequisite configuration changes in PDI 4.3 to talk to Apache Hadoop 1.0.0.

    The versions of Hadoop, Linux, and PDI I am using are given below:

    Red Hat Enterprise Linux, kernel 2.6.18-164.E15
    Apache Hadoop 1.0.0
    PDI 4.3

    Please let me know if you need any more details.

    Thanks in advance.
    Thanks,
    Shivanandan Gupta (Shiva)

  2. #2
    Join Date: Jul 2012
    Posts: 164

    Hi Shiva,

    First, try copying the file from the command line (CLI), and then try it with the Pentaho Big Data plugin.
    One more important thing: since you are running Hadoop 1.0.x, replace the Hadoop jar files bundled with the Pentaho Data Integration Big Data plugin with the jar files from your Hadoop installation.
    Configure hdfs-site.xml, core-site.xml, and mapred-site.xml as your setup requires.
    If you hit any errors, post the error message and we will reply.
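
One way to check the configuration files mentioned above is to read the NameNode address out of core-site.xml and compare it with the address used in the PDI job. The following is a minimal Python sketch, assuming the Hadoop 1.x property name `fs.default.name`. The function name `read_fs_default_name` and the sample XML (including its host and port) are hypothetical and only for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def read_fs_default_name(core_site_xml: str) -> Optional[str]:
    """Return the value of fs.default.name from core-site.xml text, or None."""
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):
        # Each <property> holds a <name> and a <value> child element.
        if prop.findtext("name", default="").strip() == "fs.default.name":
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


# Hypothetical core-site.xml content; the host and port are placeholders.
SAMPLE_CORE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode.example.com:9000</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    print(read_fs_default_name(SAMPLE_CORE_SITE))
```

If the host or port found here differs from the one in the hdfs:// URL given to PDI, that mismatch is worth ruling out before changing any jar files.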


  3. #3

    Hi Kumar,

    Thanks a lot for your quick reply. I have replaced the PDI hadoop-core-0.20.2.jar file with the hadoop-core-0.20.2.jar file from the Apache Hadoop 1.0.0 cluster, but the problem remains: PDI, which I have installed on a Windows server, still cannot talk to Hadoop.

    The error message I got is given below:

    2012/08/06 14:18:04 - Hadoop Copy Files - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : Can not copy file/folder [file:///E:/MakeMyTrip/kettle and hadoop/weblogs_rebuild.txt/weblogs_rebuild.txt] to [hdfs://HadoopUsr:hadoop@10.16.3.32:9002/usr/local/hadoop/tmp/gutenberg/test/]. Exception : [
    2012/08/06 14:18:04 - Hadoop Copy Files - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) :
    2012/08/06 14:18:04 - Hadoop Copy Files - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : Unable to get VFS File object for filename 'hdfs://HadoopUsr:hadoop@10.16.3.32:9002/usr/local/hadoop/tmp/gutenberg/test/' : Could not resolve file "hdfs://HadoopUsr:***@10.16.3.32:9002/usr/local/hadoop/tmp/gutenberg/test".
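
To narrow down a "Could not resolve file" error like the one above, it can help to split the hdfs:// URL into its parts and confirm that the host and port are reachable from the PDI machine. The following is a minimal Python sketch using only the standard library. The helper names `describe_hdfs_url` and `namenode_reachable` are hypothetical, and a successful TCP connection only shows that something is listening on that port, not that PDI's Hadoop client is compatible with it.

```python
import socket
from urllib.parse import urlsplit


def describe_hdfs_url(url: str) -> dict:
    """Split an hdfs:// URL into the parts that must match the cluster setup."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
    }


def namenode_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # URL copied from the error message above.
    url = "hdfs://HadoopUsr:hadoop@10.16.3.32:9002/usr/local/hadoop/tmp/gutenberg/test/"
    info = describe_hdfs_url(url)
    print(info)
    # To test connectivity from the PDI machine, one could then call:
    # namenode_reachable(info["host"], info["port"])
```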



    See the attached screenshot of the error.

    hadoop-pdi-connection-error.jpg


    Thank You!!!!

    Thanks,
    Shivanandan Gupta (Shiva)

  4. #4
    Join Date: Jul 2012
    Posts: 164

    I think some of the existing jar files are still causing the error. Alternatively, copy the commons-configuration jar file from Hadoop to PDI.

    Also try copying commons-configuration-1.7.jar from $HADOOP_HOME/lib (I think) to $PDI_HOME/libext/bigdata.

    Restart PDI or your computer, then check again.

    By default, PDI 4.3 ships with the hadoop-0.20.2 jar. You need to remove the older jar and replace it with the Hadoop jar file for your version.
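
Since a leftover older jar is one suspected cause, a quick scan of the PDI library folder can show whether more than one Hadoop core jar is present. The following is a minimal Python sketch. The function name `find_hadoop_core_jars` and the glob pattern are assumptions for illustration, and the directory to scan (for example the libext/bigdata folder mentioned above) should be adjusted to the actual PDI installation.

```python
from pathlib import Path
from typing import List


def find_hadoop_core_jars(lib_dir: str) -> List[str]:
    """List Hadoop core jar file names found directly in lib_dir, sorted."""
    # The pattern covers both naming styles, e.g. hadoop-core-1.0.0.jar
    # and hadoop-0.20.2-core.jar (example names, not taken from a real scan).
    return sorted(p.name for p in Path(lib_dir).glob("hadoop*core*.jar"))


if __name__ == "__main__":
    import sys

    # Pass the PDI library directory as the first argument.
    folder = sys.argv[1] if len(sys.argv) > 1 else "."
    jars = find_hadoop_core_jars(folder)
    print(jars)
    if len(jars) > 1:
        print("More than one Hadoop core jar found; an old one may still be loaded.")
```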
    Last edited by yvkumar; 08-06-2012 at 05:30 AM.

  5. #5

    Thanks once again, Kumar. I tried this option as well, but no luck so far. I am getting the same error message.



    Thanks,
    Shivanandan Gupta (Shiva)

  6. #6
    Join Date: Nov 1999
    Posts: 9,535

    Follow the installation procedure on the big data wiki pages.
    When you still have a problem, please post in the Big Data forum.

    By the way, if you're using Hadoop 1.0 I'm guessing you shouldn't use the 0.20.0 core library.
    Matt Casters, Chief Data Integration
    Pentaho, Open Source Business Intelligence
    http://www.pentaho.org -- mcasters@pentaho.org

    Author of the book Pentaho Kettle Solutions by Wiley. Also available as e-Book and on the Kindle reading applications (iPhone, iPad, Android, Kindle devices, ...)

    Join us on IRC server Freenode.net, channel ##pentaho

  7. #7
    Join Date: Jul 2012
    Posts: 164

    Try removing the old hadoop-0.20.2 jar file from PDI and putting in its place the hadoop-1.0.0 jar file from your Hadoop installation on Red Hat (or whichever OS you installed it on).




  8. #8

    Hi Matt,

    Thanks for your reply. I have used the Hadoop 1.0 core library, but it is still not working. Going forward, I will post such threads in the Big Data forum.


    Thanks,
    Shivanandan Gupta (Shiva)
