Hitachi Vantara Pentaho Community Forums

Thread: Kettle - Hadoop connectivity - unable to connect to HDFS server

  1. #1
    Join Date
    Sep 2014
    Posts
    4

    Default Kettle - Hadoop connectivity - unable to connect to HDFS server

    Hi,

    I have just installed Kettle and I'm attempting to connect from Spoon to a remote Hadoop cluster. I referred to a similar thread.

    Below are the versions of the tools involved:

    Code:
    hadoop version
    Hadoop 2.3.0-cdh5.0.1
    Subversion git://github.sf.cloudera.com/CDH/cdh.git -r 8e266e052e423af592871e2dfe09d54c03f6a0e8
    Compiled by jenkins on 2014-05-06T19:01Z
    Compiled with protoc 2.5.0
    From source with checksum 6ce1c599ee996a0a6505d2579e62ffa
    This command was run using /usr/lib/hadoop/hadoop-common-2.3.0-cdh5.0.1.jar

    Kettle/Spoon: 5.1.0

    Code:
     cat /etc/hadoop/conf.cloudera.hdfs/core-site.xml
    <?xml version="1.0" encoding="UTF-8"?>
    
    <!--Autogenerated by Cloudera Manager-->
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://cldx-1368-1226:8020</value>
      </property>
      <property>
        <name>fs.trash.interval</name>
        <value>1</value>
      </property>
      <property>
        <name>io.compression.codecs</name>
        <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.DeflateCodec,org.apache.hadoop.io.compress.SnappyCodec,org.apache.hadoop.io.compress.Lz4Codec</value>
      </property>
      <property>
        <name>hadoop.security.authentication</name>
        <value>simple</value>
      </property>
      <property>
        <name>hadoop.rpc.protection</name>
        <value>authentication</value>
      </property>
      <property>
        <name>hadoop.security.auth_to_local</name>
        <value>DEFAULT</value>
      </property>
    </configuration>
    As mentioned in the documentation, I added the following entry to all the relevant plugin.properties files:

    active.hadoop.configuration=cdh50
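
    For reference, a quick reachability check against the NameNode host/port from core-site.xml, run from the machine where Spoon is installed (assuming a Hadoop client is available there), would look like this:

    Code:
    # Basic TCP check of the NameNode RPC port from the Spoon machine
    telnet cldx-1368-1226 8020

    # With a Hadoop client installed, list the HDFS root through the same URI
    hadoop fs -ls hdfs://cldx-1368-1226:8020/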

    When I attempt to edit a 'Hadoop Copy Files' job, I get an 'unable to connect to HDFS' error, as shown in the attached image:

    [Attachment: Spoon_unable_to_connect_HDFS.jpg]

    Can anyone give me any input as to where I am going wrong?

    Regards,
    Omkar

  2. #2
    Join Date
    Sep 2012
    Posts
    71

    Default

    There are some other *-site.xml files in the cdh50 folder (such as yarn-site.xml) that likely contain hostname references to the Cloudera QuickStart VM (e.g., cdh5.clouderamanager.test). If you update those hostnames/ports to point at your cluster, does it work?
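
    For example, the fs.defaultFS in the shim's core-site.xml should point at your NameNode rather than the QuickStart VM. A minimal sketch (assuming the shim sits at the default plugins/pentaho-big-data-plugin/hadoop-configurations/cdh50 path inside your PDI install, and using the cldx-1368-1226:8020 address from your core-site.xml above):

    Code:
    <!-- plugins/pentaho-big-data-plugin/hadoop-configurations/cdh50/core-site.xml -->
    <configuration>
      <property>
        <!-- Must match the NameNode your cluster actually advertises -->
        <name>fs.defaultFS</name>
        <value>hdfs://cldx-1368-1226:8020</value>
      </property>
    </configuration>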

    Also, are you using YARN on your CDH 5.0.1 cluster, or MapReduce 1? If the latter, you have to change config.properties in the cdh50 folder to use configuration "mr1" instead of the default "mr2".
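
    A quick way to see what the shim is currently configured for (the path below is an assumption based on a default PDI 5.x install layout):

    Code:
    # Assumed default location of the cdh50 shim inside the PDI install
    cd data-integration/plugins/pentaho-big-data-plugin/hadoop-configurations/cdh50
    # Inspect the shim's config.properties for the MR1/MR2 selection
    grep -i mr config.properties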

    Lastly, if you are trying to connect to a secure cluster, you'll need PDI Enterprise Edition and will need to follow the setup instructions in the InfoCenter (http://help.pentaho.com/Documentatio...P0/0W0/030/040).

  3. #3
    Join Date
    Jan 2015
    Posts
    2

    Default

    I am having a similar issue with CDH5 and Hadoop 2.5. Did you figure this out?
