Hitachi Vantara Pentaho Community Forums

Thread: Pentaho and Hive

  1. #1
    Join Date
    Jul 2013
    Posts
    3

    Default Pentaho and Hive

    Hi guys, I am very new to Pentaho and I am thinking about using Pentaho in combination with Hive as an alternative to R.
    I want to do the following: from Pentaho, submit a Hive query to a Hadoop cluster, where it is executed. After that, I want to analyse the results of the query by computing the correlation between two columns of the result set, and I want to visualize them. I already managed to submit a Hive query from the Pentaho Report Designer, but I was not able to view the results in Pentaho, let alone visualize them.
    So my first question is: is it possible to use Pentaho for this use case? And my second question: which of the Pentaho products should I download? It seems that the Report Designer is not able to analyse the results from Hive. So is it Pentaho Data Integration? Or Pentaho Big Data?
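    To make the goal concrete, here is a rough sketch of what I am trying to do, written as plain Java against the Hive JDBC driver. This is only a sketch: the host, port, table name (my_table) and column names (colA, colB) are placeholders, and I would like to achieve the equivalent inside a Pentaho tool rather than in hand-written code.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class HiveCorrelation {
        public static void main(String[] args) throws Exception {
            // HiveServer2 JDBC driver (assumes hive-jdbc and its
            // dependencies are on the classpath).
            Class.forName("org.apache.hive.jdbc.HiveDriver");

            // Host, port, database, table and column names are placeholders.
            Connection conn = DriverManager.getConnection(
                    "jdbc:hive2://localhost:10000/default", "hive", "");
            Statement stmt = conn.createStatement();
            ResultSet rs = stmt.executeQuery("SELECT colA, colB FROM my_table");

            // Accumulate the sums needed for the Pearson correlation coefficient.
            long n = 0;
            double sumX = 0, sumY = 0, sumXX = 0, sumYY = 0, sumXY = 0;
            while (rs.next()) {
                double x = rs.getDouble(1);
                double y = rs.getDouble(2);
                n++;
                sumX += x;
                sumY += y;
                sumXX += x * x;
                sumYY += y * y;
                sumXY += x * y;
            }
            conn.close();

            // r = (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2) * (n*Syy - Sy^2))
            double r = (n * sumXY - sumX * sumY)
                    / Math.sqrt((n * sumXX - sumX * sumX) * (n * sumYY - sumY * sumY));
            System.out.println("Pearson correlation: " + r);
        }
    }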

    Help would be really appreciated.

  2. #2
    Join Date
    Jul 2013
    Posts
    3

    Default

    Are you serious, guys? Is nobody able to answer this question?

  3. #3
    Join Date
    Sep 2012
    Posts
    71

    Default

    Hive, Hive 2, and Impala support for Pentaho client tools will be available in the imminent 4.8.2 suite release. If you are using the 4.8.x products, it is possible to upgrade the Big Data plugin in each of your Pentaho tools, but the procedure is a bit long and would have to be applied to each Pentaho client tool (Report Designer, Data Integration, etc.). The basic procedure is here:

    http://forums.pentaho.com/showthread...ive2-connector

    That refers to building the plugin from source, but the artifacts are available on our Continuous Integration server (http://ci.pentaho.com/view/Big%20Dat...ta-plugin-1.3/) and the latest releases are in our repository:

    ZIP: http://repository.pentaho.org/artifa...in-1.3.3.1.zip
    Hive JDBC shim JAR: http://repository.pentaho.org/artifa...shim-1.3.3.jar

    The ZIP file contains the pentaho-big-data-plugin folder, which you would delete from your products and replace with the one from the ZIP. The JAR file goes into each product wherever the JDBC drivers live (usually under libext/JDBC) and replaces the JAR of the same name (but earlier version). Depending on which Hadoop distribution you have (Apache, Cloudera, Hortonworks, MapR, etc.), you will have to configure the plugin to use that distribution (see the procedure at the link above). If you use Apache Hadoop 1.x, you will need to create your own configuration; I outline how to do that on my blog here:

    http://funpdi.blogspot.com/2013/03/p...nd-hadoop.html
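
    Once the new JAR is in place, a quick way to check (outside the Pentaho tools) that the upgraded driver is being picked up is a small JDBC smoke test like the one below. This is only a sketch: the host and port are placeholders, and it assumes a HiveServer2 instance (for the older HiveServer1 the driver class is org.apache.hadoop.hive.jdbc.HiveDriver and the URL scheme is jdbc:hive://).

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class HiveSmokeTest {
        public static void main(String[] args) throws Exception {
            // Fails with ClassNotFoundException if the shim JAR is not on the classpath.
            Class.forName("org.apache.hive.jdbc.HiveDriver");

            // Replace host/port with your HiveServer2 instance.
            Connection conn = DriverManager.getConnection(
                    "jdbc:hive2://your-hive-host:10000/default", "hive", "");
            Statement stmt = conn.createStatement();
            ResultSet rs = stmt.executeQuery("SHOW TABLES");
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
            conn.close();
        }
    }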
