Hitachi Vantara Pentaho Community Forums

Thread: cannot connect PDI EE 6.0.1 to Hadoop 2.2 sandbox

  1. #1

    Question: cannot connect PDI EE 6.0.1 to Hadoop 2.2 sandbox

    Dear community members,


    I am stuck connecting PDI 6 EE to the HDFS of a Hortonworks sandbox 2.2.4.2.

    I am using the simple job from the PDI cookbook that loads a CSV file into HDFS (the job I am referencing can be downloaded from the link below):

    https://www.dropbox.com/s/9gko8cayzt...-hdfs.kjb?dl=0


    My HDP sandbox is running in VMware at 192.168.56.130:8020, and from my laptop I can reach the Hortonworks sandbox in a web browser at http://192.168.56.130:8000/filebrowser/view/user/hue


    As per the documentation, I have copied mapred-site.xml, hive-site.xml, hdfs-site.xml, hbase-site.xml, and core-site.xml from my sandbox into my PDI shim folder: C:\pentaho\design-tools\data-integration\plugins\pentaho-big-data-plugin\hadoop-configurations\hdp22
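
    One thing worth spelling out: the fs.defaultFS value in the copied core-site.xml needs to be an address the laptop can actually resolve. On the sandbox it is often set to a hostname like sandbox.hortonworks.com rather than the IP, so the relevant entry should look roughly like this (a sketch based on my setup; your address may differ):

        <property>
          <name>fs.defaultFS</name>
          <value>hdfs://192.168.56.130:8020</value>
        </property>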


    In Spoon I have set the Hadoop distribution to Hortonworks HDP 2.2.x and restarted PDI.
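
    As far as I understand, that setting just writes the shim name into plugin.properties under the big-data plugin folder, so it can be double-checked there (property name per the Pentaho docs):

        # C:\pentaho\design-tools\data-integration\plugins\pentaho-big-data-plugin\plugin.properties
        active.hadoop.configuration=hdp22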


    I believe the shim for HDP 2.2 is already shipped with Pentaho 6.0.1 EE under C:\pentaho\design-tools\data-integration\plugins\pentaho-big-data-plugin\hadoop-configurations\hdp22

    My source CSV file is at c:\weblogs_rebuild.txt

    Here is a screenshot of my job setup: https://www.dropbox.com/s/pio2l27y9j...0hdfs.JPG?dl=0


    However, from the same laptop I can connect to my VMware sandbox's Hue interface in a browser using the URL given above.


    When I run the job, it fails with the log below:


    2016/03/10 13:31:46 - Copy Files - Processing row source File/folder source : [c:\weblogs_rebuild.txt] ... destination file/folder : [hdfs://192.168.56.130:8020/user/sandbox/baseball/]... wildcard : [^.*\.txt]
    2016/03/10 13:31:46 - Copy Files - ERROR (version 6.0.1.0-386, build 1 from 2015-12-03 11.37.25 by buildguy) : Couldn't created parent folder file:///C:/pentaho/design-tools/data-integration/hdfs:/192.168.56.130:8020/user/sandbox/baseball
    2016/03/10 13:31:46 - Copy Files - ERROR (version 6.0.1.0-386, build 1 from 2015-12-03 11.37.25 by buildguy) : org.apache.commons.vfs2.FileSystemException: Could not create folder "file:///C:/pentaho/design-tools/data-integration/hdfs:".
    2016/03/10 13:31:46 - Copy Files - at org.apache.commons.vfs2.provider.AbstractFileObject.createFolder(AbstractFileObject.java:436)
    2016/03/10 13:31:46 - Copy Files - at org.apache.commons.vfs2.provider.AbstractFileObject.createFolder(AbstractFileObject.java:419)
    2016/03/10 13:31:46 - Copy Files - at org.apache.commons.vfs2.provider.AbstractFileObject.createFolder(AbstractFileObject.java:419)
    2016/03/10 13:31:46 - Copy Files - at org.apache.commons.vfs2.provider.AbstractFileObject.createFolder(AbstractFileObject.java:419)
    2016/03/10 13:31:46 - Copy Files - at org.apache.commons.vfs2.provider.AbstractFileObject.createFolder(AbstractFileObject.java:419)
    2016/03/10 13:31:46 - Copy Files - at org.pentaho.di.job.entries.copyfiles.JobEntryCopyFiles.CreateDestinationFolder(JobEntryCopyFiles.java:681)
    2016/03/10 13:31:46 - Copy Files - at org.pentaho.di.job.entries.copyfiles.JobEntryCopyFiles.ProcessFileFolder(JobEntryCopyFiles.java:428)
    2016/03/10 13:31:46 - Copy Files - at org.pentaho.di.job.entries.copyfiles.JobEntryCopyFiles.execute(JobEntryCopyFiles.java:375)
    2016/03/10 13:31:46 - Copy Files - at org.pentaho.di.job.Job.execute(Job.java:730)
    2016/03/10 13:31:46 - Copy Files - at org.pentaho.di.job.Job.execute(Job.java:873)
    2016/03/10 13:31:46 - Copy Files - at org.pentaho.di.job.Job.execute(Job.java:546)
    2016/03/10 13:31:46 - Copy Files - at org.pentaho.di.job.Job.run(Job.java:435)
    2016/03/10 13:31:46 - Copy Files - Caused by: org.apache.commons.vfs2.FileSystemException: Could not create directory "C:\pentaho\design-tools\data-integration\hdfs:".
    2016/03/10 13:31:46 - Copy Files - at org.apache.commons.vfs2.provider.local.LocalFile.doCreateFolder(LocalFile.java:158)
    2016/03/10 13:31:46 - Copy Files - at org.apache.commons.vfs2.provider.AbstractFileObject.createFolder(AbstractFileObject.java:425)
    2016/03/10 13:31:46 - Copy Files - ... 11 more
    2016/03/10 13:31:46 - Copy Files - ERROR (version 6.0.1.0-386, build 1 from 2015-12-03 11.37.25 by buildguy) : Destination folder does not exist!
    2016/03/10 13:31:46 - load_hdfs - Finished job entry [Copy Files] (result=[false])
    2016/03/10 13:31:46 - load_hdfs - Job execution finished
    2016/03/10 13:31:46 - Spoon - Job has ended.
    2016/03/10 13:31:47 - Spoon - Spoon
    2016/03/10 13:31:55 - cfgbuilder - Warning: The configuration parameter [org] is not supported by the default configuration builder for scheme: sftp
    2016/03/10 13:54:04 - Spoon - Save as...
    2016/03/10 13:54:04 - Spoon - Save file as...
    2016/03/10 13:54:35 - Spoon - Save as...
    2016/03/10 13:54:35 - Spoon - Save file as...
    2016/03/10 13:57:02 - Spoon - Spoon
    2016/03/10 13:58:57 - cfgbuilder - Warning: The configuration parameter [org] is not supported by the default configuration builder for scheme: sftp
    2016/03/10 14:00:07 - Spoon - Spoon


    Is there something critical I am missing to make this work? Any help or advice on this is much appreciated.

    Thanks
    Cheran

  2. #2

    Thumbs up

    Issue resolved. For some reason, PDI on my local machine (Windows 7) would not talk to my VM running the CentOS Hortonworks sandbox; judging by the log, the hdfs:// destination was being resolved as a local Windows path, which suggests the shim/HDFS driver was never picked up there.

    When I copied a fresh PDI installation, the config files, and the Kettle job onto my Hortonworks sandbox and executed the same job there, it worked like a charm with no changes.

    I can see my CSV file pushed to HDFS just the way I wanted.
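
    For anyone hitting the same wall: a quick way to separate network problems from shim problems is to talk to the NameNode with the plain Hadoop client, outside PDI entirely. A minimal sketch (assuming hadoop-client 2.x jars on the classpath; the class name is mine):

        import java.net.URI;
        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FileStatus;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;

        public class HdfsPing {
            public static void main(String[] args) throws Exception {
                // Connect straight to the sandbox NameNode, bypassing PDI/VFS.
                Configuration conf = new Configuration();
                FileSystem fs = FileSystem.get(URI.create("hdfs://192.168.56.130:8020"), conf);
                // If this lists /user, the network path and HDFS itself are fine,
                // and the problem is on the PDI/shim side.
                for (FileStatus s : fs.listStatus(new Path("/user"))) {
                    System.out.println(s.getPath());
                }
                fs.close();
            }
        }

    If this listing works from the laptop but the job still fails there, the issue is the shim/VFS setup in PDI rather than connectivity, which matches what I saw in my log (the hdfs:// URL being treated as a local path).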

    Thanks
    Cheran





