Hitachi Vantara Pentaho Community Forums

Thread: Apache Hadoop version 0.20.X for Pentaho on Windows

  1. #1
    Join Date
    Dec 2012
    Posts
    1

    Has anyone configured Apache Hadoop on Windows?

    I believe Pentaho (version 4.8 in my case) comes configured with Apache Hadoop, but I am trying to understand how it works because I have not been able to get it working so far.

    I am trying to follow the Hadoop Tutorial.

    I believe there used to be PHD (which is no longer in use). I also understand that if you are configuring a different distribution, e.g. Cloudera or MapR, then some configuration needs changing. I am trying to find out which port my Hadoop installation uses. When I use the job entry to copy Hadoop files and try to browse my destination directory, it doesn't work at all. Is the port 9000? Where do I find it?

    I have seen that we have a core-site.xml that is empty apart from one tag, i.e. configuration. Other files that I cannot see are hdfs-site.xml and mapred.xml. Should these files be available? Do I still need to install Hadoop following the PHD User Guide?

    Thank You.

    Regards,

    Farayi

  2. #2
    Join Date
    Mar 2008
    Posts
    9

    Farayi,
    I am facing a similar problem. Please let me know how you resolved it, or whether you found any alternative solutions.

    Appreciate your feedback.

    Regards,
    Raj

  3. #3
    Join Date
    Jun 2013
    Posts
    44

    Sorry, I have no clue about the topic you have raised here. I have been looking over the web and scanning through multiple sites to get a better idea of how to get this right, but I have not found anything helpful yet. Is there anyone who knows this well? Please help.

  4. #4
    Join Date
    Sep 2012
    Posts
    71

    Pentaho 4.8 doesn't come with Hadoop; it comes with client-side support for various vendors' Hadoop distributions, such as Cloudera (versions 3u4 and 4), MapR, and Apache Hadoop 0.20. You'll need a Hadoop distribution from one of these vendors installed somewhere, and then you configure your Pentaho Data Integration steps, job entries, etc. to "point at" that distribution. If you're looking to install Apache Hadoop on Windows, there's a blog post describing how to do this with Cygwin:

    http://blog.sqltrainer.com/2012/01/i...ng-apache.html

    Also, Hortonworks has a beta platform for Windows:

    http://hortonworks.com/blog/hadoop-in-windows/

    The default port(s) for your Hadoop platform should be identified in the documentation for that platform.
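
    As a rough illustration of where the port usually lives: in Apache Hadoop 0.20 the NameNode address is normally set through the fs.default.name property in the cluster's core-site.xml. The sketch below reads that property and splits out the host and port. The value hdfs://localhost:9000 is only the one commonly seen in single-node tutorials and is an assumption here, not something confirmed for this setup; the names SAMPLE_CORE_SITE and find_namenode are hypothetical helpers made up for the example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Assumed sample content of a cluster's core-site.xml.
# hdfs://localhost:9000 is a common tutorial value, not a confirmed setting.
SAMPLE_CORE_SITE = """<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
"""


def find_namenode(core_site_xml):
    """Return (host, port) taken from fs.default.name, or None if it is not set."""
    root = ET.fromstring(core_site_xml)
    for prop in root.findall("property"):
        if prop.findtext("name") == "fs.default.name":
            uri = urlparse(prop.findtext("value", default="").strip())
            return uri.hostname, uri.port
    return None


if __name__ == "__main__":
    # A populated file yields the host and port to enter in the job entry.
    print(find_namenode(SAMPLE_CORE_SITE))
    # A file with only an empty <configuration> tag yields None.
    print(find_namenode("<configuration/>"))
```

    If the function returns None, as it would for a core-site.xml containing only an empty configuration tag, the file does not define a NameNode address at all, and the host and port have to come from the Hadoop installation you are connecting to.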

    Hope this helps,
    Matt

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.