Hitachi Vantara Pentaho Community Forums

Thread: CDH Hive2 connector.


  1. #1
    Join Date
    Jun 2013
    Posts
    4

    Default CDH Hive2 connector.

    Hi,
    I am using Kettle stable 4.4 and I am trying to add a Hive Server 2 DB connection, but I am running into problems. Here is what I have tried so far.

    When using the stable release 4.4 I didn't get any "Hadoop Hive 2" connection type, so I tried using the Generic connection and explicitly specified the driver class 'org.apache.hive.jdbc.HiveDriver' and the connection string 'jdbc:hive2://<host>:10000/default', but I get the ClassNotFoundException pasted at the end.
    Then I compiled the big-data-plugin code and replaced the JARs; this gives me the option to create a Hive Server 2 connection, but I still get the same ClassNotFoundException.
    I also tried copying the hive-jdbc and other Hive JARs from my Hadoop cluster for version compatibility, but still the same ClassNotFoundException.
    I also made sure that 'org.apache.hive.jdbc.HiveDriver' is present in the JDBC JAR.

    It looks like the current Kettle 4.4 stable is not able to load the Hive2 driver. Is there any way to work around this?
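
    For reference, the Generic connection settings above amount to a plain JDBC connection like the following sketch (host, port and credentials are placeholders); it is a quick way to check, outside of Kettle, whether the driver class and its dependencies can actually be loaded from a given set of JARs:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class Hive2ConnectTest {
        public static void main(String[] args) throws Exception {
            // Throws ClassNotFoundException if the Hive2 JDBC driver JAR
            // (and its dependencies) are not on the classpath.
            Class.forName("org.apache.hive.jdbc.HiveDriver");

            // Same connection string as in the Generic connection above.
            String url = "jdbc:hive2://<host>:10000/default";
            try (Connection conn = DriverManager.getConnection(url, "", "");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("show tables")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            }
        }
    }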


    Exception:
    ========
    Error connecting to database [Hive] : org.pentaho.di.core.exception.KettleDatabaseException:
    Error occured while trying to connect to the database

    Exception while loading class
    org.apache.hive.jdbc.HiveDriver

    org.pentaho.di.core.exception.KettleDatabaseException:
    Error occured while trying to connect to the database

    Exception while loading class
    org.apache.hive.jdbc.HiveDriver
    ....
    Caused by: java.lang.ClassNotFoundException: org.apache.hive.jdbc.HiveDriver
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)

    Kind Regards
    Ehsan

  2. #2
    Join Date
    Sep 2012
    Posts
    71

    Default

    When you say you compiled the Big Data plugin and replaced the JARs, which JARs did you replace? Usually to deploy a new Big Data plugin (when building from source) you should do the following:

    1) Check out the Big Data plugin source and run "ant clean-all dist". This builds the Big Data plugin ZIP, including a few Hadoop configurations (aka "shims") such as "cdh42".

    In the pentaho-big-data-plugin/dist folder you will find the pentaho-big-data-plugin-<version>.zip file; this contains the "pentaho-big-data-plugin" folder with the whole plugin and shims.

    2) From your Kettle 4.4-stable directory, delete (after backing up) the data-integration/plugins/pentaho-big-data-plugin folder, and unzip the aforementioned ZIP file into data-integration/plugins. This effectively replaces your Big Data plugin with the new one.

    3) Verify that the data-integration/plugins/pentaho-big-data-plugin/hadoop-configurations/cdh42 folder exists; this is the CDH 4.2 shim (which is backwards-compatible to CDH 4.0 and, I believe, forwards-compatible to CDH 4.3).

    4) Set the active shim to CDH 4.2 by editing data-integration/plugins/pentaho-big-data-plugin/plugin.properties and setting the "active.hadoop.configuration" property to cdh42.

    The next step is the one most often missed, as it is usually taken care of by the release build process:

    5) In your Kettle 4.4-stable directory at data-integration/libext/JDBC, replace the pentaho-hadoop-hive-jdbc-shim-<version>.jar with the one from your Big Data project at pentaho-big-data-plugin/shims/hive-jdbc/dist/pentaho-hadoop-hive-jdbc-shim-<version>.jar (see the sketch at the end of this post for a quick way to check which JAR is actually being picked up).

    6) Start Spoon; you should be able to select a Hadoop Hive 2 connection and connect successfully.

    If you're still having problems after following this procedure, please let me know and I will help where I can.
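
    As a side note (just a sketch, not part of the official procedure): if you want to verify which JAR a class is actually being loaded from after step 5, a tiny diagnostic like this works:

    public class WhichJar {
        public static void main(String[] args) throws Exception {
            // Class to look up; defaults to the Hive2 driver from this thread.
            String className = args.length > 0 ? args[0] : "org.apache.hive.jdbc.HiveDriver";
            Class<?> clazz = Class.forName(className);
            java.security.CodeSource src = clazz.getProtectionDomain().getCodeSource();
            System.out.println(className + " loaded from: "
                    + (src != null ? src.getLocation() : "<unknown/bootstrap>"));
        }
    }

    Run it with the JARs you care about on the classpath, for example java -cp "data-integration/libext/JDBC/*:." WhichJar org.apache.hive.jdbc.HiveDriver (use ; as the separator on Windows), and check that the reported location is the JAR you expect.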

    Cheers,
    Matt

  3. #3
    Join Date
    Jun 2013
    Posts
    4

    Default

    Thanks Matt.
    I was missing point 5; I figured it out after a few hours of hitting my head on the terminal :-)

    Cheers,
    Ehsan

  4. #4
    Join Date
    Dec 2012
    Posts
    7

    Default

    Thanks to Matt

    Quote Originally Posted by mattb_pdi View Post
    When you say you compiled the Big Data plugin and replaced the JARs, which JARs did you replace? Usually to deploy a new Big Data plugin (when building from source) you should do the following: [...]
    Last edited by jet47; 07-19-2013 at 09:10 PM.

  5. #5
    Join Date
    Mar 2017
    Posts
    1

    Default

    Hi Matt,
    I am using a single-node Hadoop 2.7.2 cluster locally. I am not able to connect to Hive 2.0.0 with Pentaho 7.0.

    Error when trying to connect:
    Error connecting to database [hive] : org.pentaho.di.core.exception.KettleDatabaseException:
    Error occurred while trying to connect to the database


    Error connecting to database: (using class org.apache.hive.jdbc.HiveDriver)
    org/apache/http/client/CookieStore




    org.pentaho.di.core.exception.KettleDatabaseException:
    Error occurred while trying to connect to the database


    Error connecting to database: (using class org.apache.hive.jdbc.HiveDriver)
    org/apache/http/client/CookieStore




    at org.pentaho.di.core.database.Database.normalConnect(Database.java:472)
    at org.pentaho.di.core.database.Database.connect(Database.java:370)
    at org.pentaho.di.core.database.Database.connect(Database.java:341)
    at org.pentaho.di.core.database.Database.connect(Database.java:331)
    at org.pentaho.di.core.database.DatabaseFactory.getConnectionTestReport(DatabaseFactory.java:80)
    at org.pentaho.di.core.database.DatabaseMeta.testConnection(DatabaseMeta.java:2795)
    at org.pentaho.ui.database.event.DataHandler.testDatabaseConnection(DataHandler.java:598)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.pentaho.ui.xul.impl.AbstractXulDomContainer.invoke(AbstractXulDomContainer.java:313)
    at org.pentaho.ui.xul.impl.AbstractXulComponent.invoke(AbstractXulComponent.java:157)
    at org.pentaho.ui.xul.impl.AbstractXulComponent.invoke(AbstractXulComponent.java:141)
    at org.pentaho.ui.xul.swt.tags.SwtButton.access$500(SwtButton.java:43)
    at org.pentaho.ui.xul.swt.tags.SwtButton$4.widgetSelected(SwtButton.java:137)
    at org.eclipse.swt.widgets.TypedListener.handleEvent(Unknown Source)
    at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
    at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)
    at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)
    at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
    at org.eclipse.jface.window.Window.runEventLoop(Window.java:820)
    at org.eclipse.jface.window.Window.open(Window.java:796)
    at org.pentaho.di.ui.xul.KettleDialog.show(KettleDialog.java:80)
    at org.pentaho.di.ui.xul.KettleDialog.show(KettleDialog.java:47)
    at org.pentaho.di.ui.core.database.dialog.XulDatabaseDialog.open(XulDatabaseDialog.java:116)
    at org.pentaho.di.ui.core.database.dialog.DatabaseDialog.open(DatabaseDialog.java:60)
    at org.pentaho.di.ui.spoon.delegates.SpoonDBDelegate.newConnection(SpoonDBDelegate.java:475)
    at org.pentaho.di.ui.spoon.delegates.SpoonDBDelegate.newConnection(SpoonDBDelegate.java:462)
    at org.pentaho.di.ui.spoon.Spoon.newConnection(Spoon.java:8811)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.pentaho.ui.xul.impl.AbstractXulDomContainer.invoke(AbstractXulDomContainer.java:313)
    at org.pentaho.ui.xul.impl.AbstractXulComponent.invoke(AbstractXulComponent.java:157)
    at org.pentaho.ui.xul.impl.AbstractXulComponent.invoke(AbstractXulComponent.java:141)
    at org.pentaho.ui.xul.jface.tags.JfaceMenuitem.access$100(JfaceMenuitem.java:43)
    at org.pentaho.ui.xul.jface.tags.JfaceMenuitem$1.run(JfaceMenuitem.java:106)
    at org.eclipse.jface.action.Action.runWithEvent(Action.java:498)
    at org.eclipse.jface.action.ActionContributionItem.handleWidgetSelection(ActionContributionItem.java:545)
    at org.eclipse.jface.action.ActionContributionItem.access$2(ActionContributionItem.java:490)
    at org.eclipse.jface.action.ActionContributionItem$5.handleEvent(ActionContributionItem.java:402)
    at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
    at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)
    at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)
    at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
    at org.pentaho.di.ui.spoon.Spoon.readAndDispatch(Spoon.java:1359)
    at org.pentaho.di.ui.spoon.Spoon.waitForDispose(Spoon.java:7990)
    at org.pentaho.di.ui.spoon.Spoon.start(Spoon.java:9290)
    at org.pentaho.di.ui.spoon.Spoon.main(Spoon.java:685)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.pentaho.commons.launcher.Launcher.main(Launcher.java:92)
    Caused by: org.pentaho.di.core.exception.KettleDatabaseException:
    Error connecting to database: (using class org.apache.hive.jdbc.HiveDriver)
    org/apache/http/client/CookieStore


    at org.pentaho.di.core.database.Database.connectUsingClass(Database.java:587)
    at org.pentaho.di.core.database.Database.normalConnect(Database.java:456)
    ... 55 more
    Caused by: java.lang.NoClassDefFoundError: org/apache/http/client/CookieStore
    at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107)
    at java.sql.DriverManager.getConnection(DriverManager.java:664)
    at java.sql.DriverManager.getConnection(DriverManager.java:270)
    at org.pentaho.di.core.database.Database.connectUsingClass(Database.java:571)
    ... 56 more
    Caused by: java.lang.ClassNotFoundException: org.apache.http.client.CookieStore
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 60 more


    Hostname: localhost
    Port: 10000
    Database name: default


    Please help; what am I missing?

  6. #6
    Join Date
    Jul 2017
    Posts
    1

    Default

    Hi there!

    I know this was a few months ago, but if you are still experiencing this problem (or anyone else reading this is), you are most likely missing a dependency that contains the org.apache.http.client.CookieStore class. A quick Google search suggests it can be found in the httpclient JAR, version 4.0 or higher (https://hc.apache.org/httpcomponents...okieStore.html). I usually find the JARs I need at https://mvnrepository.com; when I Googled "org/apache/http/client/CookieStore dependencies", the first link that came up was for httpclient 4.1.1 (https://mvnrepository.com/artifact/o...tpclient/4.1.1). Maybe this will help you!
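
    If it helps, a small classpath probe like the sketch below (just an illustration) makes the missing dependency easy to confirm: run it with the same JARs the connection uses and it reports which of the two classes from the stack trace can actually be loaded.

    public class ClasspathProbe {
        public static void main(String[] args) {
            // The driver class and the dependency reported missing above.
            String[] required = {
                "org.apache.hive.jdbc.HiveDriver",
                "org.apache.http.client.CookieStore"   // provided by httpclient 4.x
            };
            for (String name : required) {
                try {
                    Class.forName(name);
                    System.out.println("OK      " + name);
                } catch (ClassNotFoundException e) {
                    System.out.println("MISSING " + name);
                }
            }
        }
    }

    If CookieStore shows up as MISSING, dropping the httpclient JAR next to the Hive JDBC driver (and restarting Spoon) should clear the NoClassDefFoundError.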

  7. #7
    Join Date
    Jun 2013
    Posts
    44

    Default

    Being a programmer, I have also been stuck in similar trouble, usually because of improper concepts and logic, or most probably misplaced ones. In such situations I also get help from multiple sources like the web, books, or notes. However, it's good to see that you have got out of the trouble you were in.

  8. #8
    Join Date
    Nov 2013
    Posts
    1

    Default

    Hi,

    I have done all the steps and managed to connect to Impala, but when I try to query some data, the query fails. Here is the log from a simple job that queries Impala for some data:

    2013/11/22 01:55:29 - Spoon - Asking for repository
    2013/11/22 01:55:30 - RepositoriesMeta - Reading repositories XML file: /Users/mkorolyov/.kettle/repositories.xml
    2013/11/22 01:55:30 - Version checker - OK
    2013/11/22 01:55:31 - Spoon - Connected to metastore : crm, added to delegating metastore
    2013/11/22 01:55:36 - Spoon - Starting job...
    2013/11/22 01:55:37 - impala export test - Start of job execution
    2013/11/22 01:55:37 - impala export test - Starting entry [SQL]
    2013/11/22 01:55:38 - SQL - ERROR (version 5.0.1-stable, build 1 from 2013-11-15_16-08-58 by buildguy) : An error occurred executing this job entry :
    2013/11/22 01:55:38 - SQL - An error occurred executing SQL:
    2013/11/22 01:55:38 - SQL - select count(*) from test.events10
    2013/11/22 01:55:38 - SQL -
    2013/11/22 01:55:38 - SQL - Error determining value metadata from SQL resultset metadata
    2013/11/22 01:55:38 - SQL - Method not supported
    2013/11/22 01:55:39 - impala export test - Finished job entry [SQL] (result=[false])
    2013/11/22 01:55:39 - impala export test - Job execution finished
    2013/11/22 01:55:39 - Spoon - Job has ended.
    2013/11/22 01:55:47 - Spoon - Transformation opened.
    2013/11/22 01:55:47 - Spoon - Launching transformation [impala_trans_test]...
    2013/11/22 01:55:47 - Spoon - Started the transformation execution.
    2013/11/22 01:55:49 - Spoon - The transformation has finished!!

    I have managed to connect to the Impala cluster and query some data through JDBC with a simple SQL tool (SQuirreL), and I have also queried Impala from the console with the impala-shell tool; everything was fine, so the cluster is alive and working.
    Could anyone be so kind as to point out what is wrong with my current PDI 5.0.1 setup?

    I am using CDH 4.4, if it matters.
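
    For anyone debugging the same thing: the log suggests the failure happens when PDI reads the result set metadata ("Error determining value metadata from SQL resultset metadata / Method not supported"), not when the query runs. A standalone JDBC check like the sketch below (host and port are placeholders; Impala's HiveServer2-compatible endpoint is commonly on port 21050) can show whether the driver itself rejects those metadata calls:

    import java.sql.*;

    public class ImpalaMetadataTest {
        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            String url = "jdbc:hive2://<impala-host>:21050/default";  // adjust to your cluster
            try (Connection conn = DriverManager.getConnection(url, "", "");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("select count(*) from test.events10")) {
                ResultSetMetaData md = rs.getMetaData();
                for (int i = 1; i <= md.getColumnCount(); i++) {
                    System.out.println("column " + i + ": " + md.getColumnName(i)
                            + " type=" + md.getColumnTypeName(i));
                    try {
                        // Calls of this kind are roughly what PDI makes when it
                        // builds value metadata from the result set.
                        System.out.println("  precision=" + md.getPrecision(i)
                                + " scale=" + md.getScale(i)
                                + " signed=" + md.isSigned(i));
                    } catch (SQLException e) {
                        // Older Hive/Impala JDBC drivers throw here, which is
                        // likely what PDI reports as "Method not supported".
                        System.out.println("  metadata call failed: " + e.getMessage());
                    }
                }
            }
        }
    }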

    Thanks
    Last edited by mkorolyov; 11-22-2013 at 09:58 AM.

  9. #9
    Join Date
    Nov 2012
    Posts
    1

    Default

    Same problem here -- couldn't manage to get Kettle working with CDH 4.4 Impala/Hive

  10. #10
    Join Date
    Nov 2013
    Posts
    18

    Default pentaho visualization

    How do I get data from Impala and visualize it in Pentaho visualizations? Can somebody give input on this? Thanks in advance.

    I want to visualize data which exists in Hadoop components (Hive, Impala).


    Quote Originally Posted by remeniuk View Post
    Same problem here -- couldn't manage to get Kettle working with CDH 4.4 Impala/Hive
