CDH Hive2 connector



ehsan.haq
06-25-2013, 04:32 AM
Hi,
I am using Kettle stable 4.4 and I am trying to add a Hive Server 2 DB connection, but I am running into problems. Here is what I have tried so far.

With the stable release 4.4 I didn't get any "Hadoop Hive 2" connection type, so I tried using the Generic connection and explicitly specifying the driver class as 'org.apache.hive.jdbc.HiveDriver' and the connection string as 'jdbc:hive2://<host>:10000/default', but I get the ClassNotFoundException pasted at the end.
Then I compiled the big-data-plugin code and replaced the JARs; this gives me the option of a Hive Server 2 connection, but I still get the same ClassNotFoundException.
I also tried copying the hive-jdbc and other Hive JARs from my Hadoop cluster for version compatibility, but still the same ClassNotFoundException.
I also made sure that 'org.apache.hive.jdbc.HiveDriver' is in the JDBC JAR.

It looks like the current Kettle 4.4 stable is not able to load the Hive 2 driver. Is there any way to work around this?
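For what it's worth, the classloading step that fails can be reproduced outside of Kettle with a few lines of Java. This is only a sketch (not Kettle's actual code): it tries to load the driver class from the current classpath, using the driver and URL values from the Generic connection attempt (`<host>` stands for your HiveServer2 machine):

```java
public class DriverCheck {

    // True if the named driver class can be loaded from the current
    // classpath -- the same lookup that fails with ClassNotFoundException
    // in the Kettle log below.
    static boolean driverAvailable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The settings from the Generic connection attempt above.
        String driver = "org.apache.hive.jdbc.HiveDriver";
        String url = "jdbc:hive2://<host>:10000/default";
        System.out.println(url + " via " + driver + ": " + driverAvailable(driver));
    }
}
```

If this prints false when run on the same classpath Kettle uses, no connection settings will help until the driver JAR (and its dependencies) is actually on that classpath.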


Exception:
========
Error connecting to database [Hive] : org.pentaho.di.core.exception.KettleDatabaseException:
Error occured while trying to connect to the database

Exception while loading class
org.apache.hive.jdbc.HiveDriver

org.pentaho.di.core.exception.KettleDatabaseException:
Error occured while trying to connect to the database

Exception while loading class
org.apache.hive.jdbc.HiveDriver
....
Caused by: java.lang.ClassNotFoundException: org.apache.hive.jdbc.HiveDriver
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)

Kind Regards
Ehsan

mattb_pdi
06-25-2013, 09:03 AM
When you say you compiled the Big Data plugin and replaced the JARs, which JARs did you replace? Usually to deploy a new Big Data plugin (when building from source) you should do the following:

1) Checkout the Big Data plugin source and run "ant clean-all dist". This builds the Big Data plugin ZIP including a few Hadoop configurations (aka "shims") such as "cdh42".

In the pentaho-big-data-plugin/dist folder you will find the pentaho-big-data-plugin-<version>.zip file; it contains the "pentaho-big-data-plugin" folder with the whole plugin and shims.

2) From your Kettle 4.4-stable directory, delete (after backing up) the data-integration/plugins/pentaho-big-data-plugin folder, and unzip the aforementioned ZIP file into data-integration/plugins. This effectively replaces your Big Data plugin with the new one.

3) Verify that the data-integration/plugins/pentaho-big-data-plugin/hadoop-configurations/cdh42 folder exists; this is the CDH 4.2 shim (which is backwards-compatible to CDH 4.0 and, I believe, forwards-compatible to CDH 4.3).

4) Set the active shim to CDH 4.2 by editing data-integration/plugins/pentaho-big-data-plugin/plugin.properties and setting the "active.hadoop.configuration" property to cdh42.

The following step is the one most often missed, as it is usually taken care of by the release build process:

5) In your Kettle 4.4-stable directory at data-integration/libext/JDBC, replace the pentaho-hadoop-hive-jdbc-shim-<version>.jar with the one from your Big Data project at pentaho-big-data-plugin/shims/hive-jdbc/dist/pentaho-hadoop-hive-jdbc-shim-<version>.jar

6) Start Spoon; you should be able to select a Hadoop Hive 2 connection and connect successfully.
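For step 4, assuming an otherwise default plugin.properties, the finished edit is a single line:

```
# data-integration/plugins/pentaho-big-data-plugin/plugin.properties
active.hadoop.configuration=cdh42
```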

If you're still having problems after following this procedure, please let me know and I will help where I can.

Cheers,
Matt

ehsan.haq
06-26-2013, 05:16 AM
Thanks Matt.
I was missing step 5; I figured it out after a few hours of hitting my head on the terminal :-)

Cheers,
Ehsan

ellisionthomas
06-26-2013, 06:48 AM
As a programmer, I have also run into similar trouble, usually from improper concepts or misplaced logic. In such situations I get help from multiple sources, like the web, books, or notes. Good to see that you got out of the trouble you were in.

jet47
07-19-2013, 06:05 AM
Thanks to Matt



mkorolyov
11-21-2013, 07:06 PM
Hi,

I have done all the steps and managed to connect to Impala, but when I try to query some data, the query fails. Here are the logs from a simple job that queries Impala:

2013/11/22 01:55:29 - Spoon - Asking for repository
2013/11/22 01:55:30 - RepositoriesMeta - Reading repositories XML file: /Users/mkorolyov/.kettle/repositories.xml
2013/11/22 01:55:30 - Version checker - OK
2013/11/22 01:55:31 - Spoon - Connected to metastore : crm, added to delegating metastore
2013/11/22 01:55:36 - Spoon - Starting job...
2013/11/22 01:55:37 - impala export test - Start of job execution
2013/11/22 01:55:37 - impala export test - Starting entry [SQL]
2013/11/22 01:55:38 - SQL - ERROR (version 5.0.1-stable, build 1 from 2013-11-15_16-08-58 by buildguy) : An error occurred executing this job entry :
2013/11/22 01:55:38 - SQL - An error occurred executing SQL:
2013/11/22 01:55:38 - SQL - select count(*) from test.events10
2013/11/22 01:55:38 - SQL -
2013/11/22 01:55:38 - SQL - Error determining value metadata from SQL resultset metadata
2013/11/22 01:55:38 - SQL - Method not supported
2013/11/22 01:55:39 - impala export test - Finished job entry [SQL] (result=[false])
2013/11/22 01:55:39 - impala export test - Job execution finished
2013/11/22 01:55:39 - Spoon - Job has ended.
2013/11/22 01:55:47 - Spoon - Transformation opened.
2013/11/22 01:55:47 - Spoon - Launching transformation [impala_trans_test]...
2013/11/22 01:55:47 - Spoon - Started the transformation execution.
2013/11/22 01:55:49 - Spoon - The transformation has finished!!

I have managed to connect to the Impala cluster and query some data through JDBC with a simple SQL tool, SQuirreL (http://squirrel-sql.sourceforge.net), and I have also queried Impala from the console with the impala-shell tool; everything was fine, so the cluster is alive and working.
Could anyone be so kind as to point out what is wrong with my current PDI 5.0.1 setup?

I am using CDH 4.4, if it matters.
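The "Method not supported" line is characteristic of the Hive/Impala JDBC driver: several java.sql.ResultSetMetaData methods are unimplemented and simply throw SQLException("Method not supported"), which Kettle then surfaces while determining value metadata. A small self-contained sketch (the dynamic proxy below is only a stand-in for the driver; none of this is Kettle's or the driver's actual code) reproduces the failure mode and shows a defensive fallback:

```java
import java.lang.reflect.Proxy;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;

public class MetadataFallback {

    // Reads a column's precision, falling back to a default when the driver
    // throws SQLException("Method not supported"), as the Hive/Impala JDBC
    // driver does for several ResultSetMetaData methods.
    static int precisionOrDefault(ResultSetMetaData md, int col, int dflt) {
        try {
            return md.getPrecision(col);
        } catch (SQLException e) {
            return dflt; // driver does not implement this metadata call
        }
    }

    public static void main(String[] args) {
        // Stand-in for the driver's metadata object: every call throws
        // SQLException("Method not supported"), mimicking the log above.
        ResultSetMetaData md = (ResultSetMetaData) Proxy.newProxyInstance(
                MetadataFallback.class.getClassLoader(),
                new Class<?>[] { ResultSetMetaData.class },
                (proxy, method, a) -> { throw new SQLException("Method not supported"); });
        System.out.println(precisionOrDefault(md, 1, -1)); // prints -1
    }
}
```

Since Kettle does not apply such a fallback here, the practical fix is usually on the driver side: use a shim/driver combination that matches your CDH release, so the metadata calls PDI makes are ones the driver supports.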

Thanks

remeniuk
11-24-2013, 05:26 AM
Same problem here -- couldn't manage to get Kettle working with CDH 4.4 Impala/Hive

pm2013
11-26-2013, 07:00 AM
How can I get data from Impala and visualize it in Pentaho? Can somebody give input on this? Thanks in advance.

I want to visualize data that exists in a Hadoop component (Hive, Impala).




lalit.kumar
03-28-2017, 01:25 AM
Hi Matt,
I am using a single-node Hadoop 2.7.2 cluster locally. I am not able to connect Hive 2.0.0 with Pentaho 7.0.

Here is the error when trying to connect:
Error connecting to database [hive] :org.pentaho.di.core.exception.KettleDatabaseException:
Error occurred while trying to connect to the database


Error connecting to database: (using class org.apache.hive.jdbc.HiveDriver)
org/apache/http/client/CookieStore




org.pentaho.di.core.exception.KettleDatabaseException:
Error occurred while trying to connect to the database


Error connecting to database: (using class org.apache.hive.jdbc.HiveDriver)
org/apache/http/client/CookieStore




at org.pentaho.di.core.database.Database.normalConnect(Database.java:472)
at org.pentaho.di.core.database.Database.connect(Database.java:370)
at org.pentaho.di.core.database.Database.connect(Database.java:341)
at org.pentaho.di.core.database.Database.connect(Database.java:331)
at org.pentaho.di.core.database.DatabaseFactory.getConnectionTestReport(DatabaseFactory.java:80)
at org.pentaho.di.core.database.DatabaseMeta.testConnection(DatabaseMeta.java:2795)
at org.pentaho.ui.database.event.DataHandler.testDatabaseConnection(DataHandler.java:598)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.pentaho.ui.xul.impl.AbstractXulDomContainer.invoke(AbstractXulDomContainer.java:313)
at org.pentaho.ui.xul.impl.AbstractXulComponent.invoke(AbstractXulComponent.java:157)
at org.pentaho.ui.xul.impl.AbstractXulComponent.invoke(AbstractXulComponent.java:141)
at org.pentaho.ui.xul.swt.tags.SwtButton.access$500(SwtButton.java:43)
at org.pentaho.ui.xul.swt.tags.SwtButton$4.widgetSelected(SwtButton.java:137)
at org.eclipse.swt.widgets.TypedListener.handleEvent(Unknown Source)
at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)
at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
at org.eclipse.jface.window.Window.runEventLoop(Window.java:820)
at org.eclipse.jface.window.Window.open(Window.java:796)
at org.pentaho.di.ui.xul.KettleDialog.show(KettleDialog.java:80)
at org.pentaho.di.ui.xul.KettleDialog.show(KettleDialog.java:47)
at org.pentaho.di.ui.core.database.dialog.XulDatabaseDialog.open(XulDatabaseDialog.java:116)
at org.pentaho.di.ui.core.database.dialog.DatabaseDialog.open(DatabaseDialog.java:60)
at org.pentaho.di.ui.spoon.delegates.SpoonDBDelegate.newConnection(SpoonDBDelegate.java:475)
at org.pentaho.di.ui.spoon.delegates.SpoonDBDelegate.newConnection(SpoonDBDelegate.java:462)
at org.pentaho.di.ui.spoon.Spoon.newConnection(Spoon.java:8811)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.pentaho.ui.xul.impl.AbstractXulDomContainer.invoke(AbstractXulDomContainer.java:313)
at org.pentaho.ui.xul.impl.AbstractXulComponent.invoke(AbstractXulComponent.java:157)
at org.pentaho.ui.xul.impl.AbstractXulComponent.invoke(AbstractXulComponent.java:141)
at org.pentaho.ui.xul.jface.tags.JfaceMenuitem.access$100(JfaceMenuitem.java:43)
at org.pentaho.ui.xul.jface.tags.JfaceMenuitem$1.run(JfaceMenuitem.java:106)
at org.eclipse.jface.action.Action.runWithEvent(Action.java:498)
at org.eclipse.jface.action.ActionContributionItem.handleWidgetSelection(ActionContributionItem.java:545)
at org.eclipse.jface.action.ActionContributionItem.access$2(ActionContributionItem.java:490)
at org.eclipse.jface.action.ActionContributionItem$5.handleEvent(ActionContributionItem.java:402)
at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)
at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
at org.pentaho.di.ui.spoon.Spoon.readAndDispatch(Spoon.java:1359)
at org.pentaho.di.ui.spoon.Spoon.waitForDispose(Spoon.java:7990)
at org.pentaho.di.ui.spoon.Spoon.start(Spoon.java:9290)
at org.pentaho.di.ui.spoon.Spoon.main(Spoon.java:685)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.pentaho.commons.launcher.Launcher.main(Launcher.java:92)
Caused by: org.pentaho.di.core.exception.KettleDatabaseException:
Error connecting to database: (using class org.apache.hive.jdbc.HiveDriver)
org/apache/http/client/CookieStore


at org.pentaho.di.core.database.Database.connectUsingClass(Database.java:587)
at org.pentaho.di.core.database.Database.normalConnect(Database.java:456)
... 55 more
Caused by: java.lang.NoClassDefFoundError: org/apache/http/client/CookieStore
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:270)
at org.pentaho.di.core.database.Database.connectUsingClass(Database.java:571)
... 56 more
Caused by: java.lang.ClassNotFoundException: org.apache.http.client.CookieStore
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 60 more


Hostname :localhost
Port :10000
Database name :default


Please help; what am I missing?

elpakks
10-24-2017, 11:54 AM
Hi there!

I know this was a few months ago, but if you are still experiencing this problem (or anyone else reading this is), you are most likely missing a dependency that contains the org.apache.http.client.CookieStore class. A quick Google search suggests it lives in the httpclient JAR, version 4.0 or higher (https://hc.apache.org/httpcomponents-client-ga/httpclient/apidocs/org/apache/http/client/CookieStore.html). I usually find the JARs I need at https://mvnrepository.com; when I Googled "org/apache/http/client/CookieStore dependencies" the first link that came up was for httpclient 4.1.1 (https://mvnrepository.com/artifact/org.apache.httpcomponents/httpclient/4.1.1). Maybe this will help you!
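To check whether any JAR already on the Kettle lib path bundles that class, a small scan like the following can help. This is just a sketch; the directory in main is illustrative, so point it at your own data-integration/lib or data-integration/libext/JDBC folder:

```java
import java.io.File;
import java.util.zip.ZipFile;

public class JarScan {

    // Returns the name of the first JAR in `dir` that contains the given
    // class entry, or null if none does (or the directory does not exist).
    static String findClassInJars(File dir, String classEntry) throws Exception {
        File[] jars = dir.listFiles((d, name) -> name.endsWith(".jar"));
        if (jars == null) return null;
        for (File jar : jars) {
            try (ZipFile zf = new ZipFile(jar)) {
                if (zf.getEntry(classEntry) != null) return jar.getName();
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        // Illustrative path: point this at your PDI install's lib folder.
        String hit = findClassInJars(new File("data-integration/lib"),
                "org/apache/http/client/CookieStore.class");
        System.out.println(hit != null ? "bundled in " + hit
                : "not on the lib path - add httpclient 4.x");
    }
}
```

If nothing is found, dropping the httpclient 4.x JAR from the Maven repository link above into the lib folder should resolve the NoClassDefFoundError.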