Cassandra input/output error



jcc-ch
05-07-2012, 11:18 AM
Hello everybody,

I'm working on a POC using Cassandra. I'm using version 4.3 (preview) of the Kettle client for Windows and Apache Cassandra 1.1.0.

I am trying to copy information extracted from a CSV file into a Cassandra column family. The problem is that I always receive the same error message. It looks like there is a problem in the CassandraColumnMetaData.java class, as if Kettle has trouble reading the Cassandra metadata of the column family that I specify. I'm wondering if somebody has already had the same problem.

The keyspace exists and I selected "Create column family" in the Cassandra Output step. (I have also tried without that option, creating the column family by hand first.)

Additionally, "Show schema" always returns: "an error ocurred while getting the schema information: null"

Any suggestions?

2012/05/07 16:43:55 - Spoon - Logging goes to file:///D:/DOCUME~1/t115506/LOCALS~1/Temp/spoon_12299ca2-9853-11e1-8b4b-a1f1caa91afa.log
2012/05/07 16:43:55 - DBCache - Loading database cache from file: [D:\Documents and Settings\t115506\.kettle\db.cache]
2012/05/07 16:43:55 - DBCache - We read 0 cached rows from the database cache!
2012/05/07 16:43:56 - Spoon - Asking for repository
2012/05/07 16:43:56 - RepositoriesMeta - No repositories file found in the local directory: C:\Program Files\Cassandra\pdi-ce-big-data-4.3.0-preview\repositories.xml
2012/05/07 16:43:59 - Spoon - Trying to open the last file used.
2012/05/07 16:43:59 - Transformation metadata - The shared object fie [null] is empty!
2012/05/07 16:43:59 - Transformation metadata - We have 1 connections...
2012/05/07 16:43:59 - Transformation metadata - Looking at connection #0
2012/05/07 16:43:59 - Transformation metadata - Reading 3 steps...
2012/05/07 16:43:59 - Transformation metadata - Looking at step #0
2012/05/07 16:43:59 - Transformation metadata - Looking at step #1
2012/05/07 16:43:59 - Transformation metadata - Looking at step #2
2012/05/07 16:43:59 - Transformation metadata - We have 2 hops...
2012/05/07 16:43:59 - Transformation metadata - Looking at hop #0
2012/05/07 16:43:59 - Transformation metadata - Looking at hop #1
2012/05/07 16:43:59 - Transformation metadata - nr of steps read : 3
2012/05/07 16:43:59 - Transformation metadata - nr of hops read : 2
2012/05/07 16:43:59 - Transformation metadata - The shared object fie [null] is empty!
2012/05/07 16:43:59 - Transformation metadata - We have 0 connections...
2012/05/07 16:43:59 - Transformation metadata - Reading 2 steps...
2012/05/07 16:43:59 - Transformation metadata - Looking at step #0
2012/05/07 16:43:59 - Transformation metadata - Looking at step #1
2012/05/07 16:43:59 - Transformation metadata - We have 1 hops...
2012/05/07 16:43:59 - Transformation metadata - Looking at hop #0
2012/05/07 16:43:59 - Transformation metadata - nr of steps read : 2
2012/05/07 16:43:59 - Transformation metadata - nr of hops read : 1
2012/05/07 16:43:59 - Transformation metadata - The shared object fie [null] is empty!
2012/05/07 16:43:59 - Transformation metadata - We have 0 connections...
2012/05/07 16:43:59 - Transformation metadata - Reading 2 steps...
2012/05/07 16:43:59 - Transformation metadata - Looking at step #0
2012/05/07 16:43:59 - Transformation metadata - Looking at step #1
2012/05/07 16:43:59 - Transformation metadata - We have 1 hops...
2012/05/07 16:43:59 - Transformation metadata - Looking at hop #0
2012/05/07 16:43:59 - Transformation metadata - nr of steps read : 2
2012/05/07 16:43:59 - Transformation metadata - nr of hops read : 1
2012/05/07 16:46:34 - Transformation metadata - The shared object fie [null] is empty!
2012/05/07 16:46:34 - Transformation metadata - We have 1 connections...
2012/05/07 16:46:34 - Transformation metadata - Looking at connection #0
2012/05/07 16:46:34 - Transformation metadata - Reading 3 steps...
2012/05/07 16:46:34 - Transformation metadata - Looking at step #0
2012/05/07 16:46:34 - Transformation metadata - Looking at step #1
2012/05/07 16:46:34 - Transformation metadata - Looking at step #2
2012/05/07 16:46:34 - Transformation metadata - We have 2 hops...
2012/05/07 16:46:34 - Transformation metadata - Looking at hop #0
2012/05/07 16:46:34 - Transformation metadata - Looking at hop #1
2012/05/07 16:46:34 - Transformation metadata - nr of steps read : 3
2012/05/07 16:46:34 - Transformation metadata - nr of hops read : 2
2012/05/07 16:46:34 - Spoon - Transformation opened.
2012/05/07 16:46:34 - Spoon - Launching transformation [GetttingStartedTransformationWithDB]...
2012/05/07 16:46:34 - Spoon - Started the transformation execution.
2012/05/07 16:46:35 - Transformation metadata - Natural sort of steps executed in 0 ms (3 time previous steps calculated)
2012/05/07 16:46:36 - Spoon - The transformation has finished!!
2012/05/07 16:48:01 - Spoon - Save to file or repository...
2012/05/07 16:48:01 - Spoon - File written to [P:\My Documents\Pentaho\HRDataTest\HRTest.ktr]
2012/05/07 16:48:01 - DBCache - We wrote 0 cached rows to the database cache!
2012/05/07 16:48:01 - Transformation metadata - The shared object fie [null] is empty!
2012/05/07 16:48:01 - Transformation metadata - We have 0 connections...
2012/05/07 16:48:01 - Transformation metadata - Reading 2 steps...
2012/05/07 16:48:01 - Transformation metadata - Looking at step #0
2012/05/07 16:48:01 - Transformation metadata - Looking at step #1
2012/05/07 16:48:01 - Transformation metadata - We have 1 hops...
2012/05/07 16:48:01 - Transformation metadata - Looking at hop #0
2012/05/07 16:48:01 - Transformation metadata - nr of steps read : 2
2012/05/07 16:48:01 - Transformation metadata - nr of hops read : 1
2012/05/07 16:48:01 - Spoon - Transformation opened.
2012/05/07 16:48:01 - Spoon - Launching transformation [HRTest]...
2012/05/07 16:48:01 - Spoon - Started the transformation execution.
2012/05/07 16:48:01 - HRTest - Dispatching started for transformation [HRTest]
2012/05/07 16:48:01 - HRTest - Nr of arguments detected:0
2012/05/07 16:48:01 - HRTest - This is not a replay transformation
2012/05/07 16:48:01 - Transformation metadata - Natural sort of steps executed in 0 ms (2 time previous steps calculated)
2012/05/07 16:48:01 - HRTest - I found 2 different steps to launch.
2012/05/07 16:48:01 - HRTest - Allocating rowsets...
2012/05/07 16:48:01 - HRTest - Allocating rowsets for step 0 --> CSV file input
2012/05/07 16:48:01 - HRTest - prevcopies = 1, nextcopies=1
2012/05/07 16:48:01 - HRTest - Transformation allocated new rowset [CSV file input.0 - Cassandra Output.0]
2012/05/07 16:48:01 - HRTest - Allocated 1 rowsets for step 0 --> CSV file input
2012/05/07 16:48:01 - HRTest - Allocating rowsets for step 1 --> Cassandra Output
2012/05/07 16:48:01 - HRTest - Allocated 1 rowsets for step 1 --> Cassandra Output
2012/05/07 16:48:01 - HRTest - Allocating Steps & StepData...
2012/05/07 16:48:01 - HRTest - Transformation is about to allocate step [CSV file input] of type [CsvInput]
2012/05/07 16:48:01 - HRTest - Step has nrcopies=1
2012/05/07 16:48:01 - CSV file input.0 - distribution activated
2012/05/07 16:48:01 - CSV file input.0 - Starting allocation of buffers & new threads...
2012/05/07 16:48:01 - CSV file input.0 - Step info: nrinput=0 nroutput=1
2012/05/07 16:48:01 - CSV file input.0 - output rel. is 1:1
2012/05/07 16:48:01 - CSV file input.0 - Found output rowset [CSV file input.0 - Cassandra Output.0]
2012/05/07 16:48:01 - CSV file input.0 - Finished dispatching
2012/05/07 16:48:01 - HRTest - Transformation has allocated a new step: [CSV file input].0
2012/05/07 16:48:01 - HRTest - Transformation is about to allocate step [Cassandra Output] of type [CassandraOutput]
2012/05/07 16:48:01 - HRTest - Step has nrcopies=1
2012/05/07 16:48:01 - Cassandra Output.0 - distribution activated
2012/05/07 16:48:01 - Cassandra Output.0 - Starting allocation of buffers & new threads...
2012/05/07 16:48:01 - Cassandra Output.0 - Step info: nrinput=1 nroutput=0
2012/05/07 16:48:01 - Cassandra Output.0 - Got previous step from [Cassandra Output] #0 --> CSV file input
2012/05/07 16:48:01 - Cassandra Output.0 - input rel is 1:1
2012/05/07 16:48:01 - Cassandra Output.0 - Found input rowset [CSV file input.0 - Cassandra Output.0]
2012/05/07 16:48:01 - Cassandra Output.0 - Finished dispatching
2012/05/07 16:48:01 - HRTest - Transformation has allocated a new step: [Cassandra Output].0
2012/05/07 16:48:01 - HRTest - This transformation can be replayed with replay date: 2012/05/07 16:48:01
2012/05/07 16:48:01 - HRTest - Initialising 2 steps...
2012/05/07 16:48:01 - CSV file input.0 - Running on slave server #0/1.
2012/05/07 16:48:01 - Cassandra Output.0 - Running on slave server #0/1.
2012/05/07 16:48:01 - HRTest - Step [CSV file input.0] initialized flawlessly.
2012/05/07 16:48:01 - HRTest - Step [Cassandra Output.0] initialized flawlessly.
2012/05/07 16:48:01 - HRTest - Transformation has allocated 2 threads and 1 rowsets.
2012/05/07 16:48:01 - Cassandra Output.0 - Starting to run...
2012/05/07 16:48:01 - CSV file input.0 - Starting to run...
2012/05/07 16:48:01 - CSV file input.0 - Header row skipped in file 'D:\Documents and Settings\t115506\Desktop\G8DX_2.csv'
2012/05/07 16:48:01 - CSV file input.0 - Signaling 'output done' to 1 output rowsets.
2012/05/07 16:48:01 - CSV file input.0 - Finished processing (I=9, O=0, R=0, W=8, U=0, E=0)
2012/05/07 16:48:01 - Cassandra Output.0 - Connecting to Cassandra node at 'localhost:9160' using keyspace 'tkeyspace'...
2012/05/07 16:48:01 - Cassandra Output.0 - Getting meta data for column family 'users'
2012/05/07 16:48:01 - Cassandra Output.0 - Closing connection...
2012/05/07 16:48:01 - Cassandra Output.0 - ERROR (version 4.3.0, build 16295 from 2012-01-27 15.53.26 by tomcat) : Unexpected error
2012/05/07 16:48:01 - Cassandra Output.0 - ERROR (version 4.3.0, build 16295 from 2012-01-27 15.53.26 by tomcat) : org.pentaho.di.core.exception.KettleException:
2012/05/07 16:48:01 - Cassandra Output.0 - ERROR (version 4.3.0, build 16295 from 2012-01-27 15.53.26 by tomcat) : null
2012/05/07 16:48:01 - Cassandra Output.0 - ERROR (version 4.3.0, build 16295 from 2012-01-27 15.53.26 by tomcat) : at java.lang.Thread.run (null:-1)
2012/05/07 16:48:01 - Cassandra Output.0 - ERROR (version 4.3.0, build 16295 from 2012-01-27 15.53.26 by tomcat) : at org.pentaho.di.trans.step.RunThread.run (RunThread.java:50)
2012/05/07 16:48:01 - Cassandra Output.0 - ERROR (version 4.3.0, build 16295 from 2012-01-27 15.53.26 by tomcat) : at org.pentaho.di.trans.steps.cassandraoutput.CassandraOutput.processRow (CassandraOutput.java:176)
2012/05/07 16:48:01 - Cassandra Output.0 - ERROR (version 4.3.0, build 16295 from 2012-01-27 15.53.26 by tomcat) : at org.pentaho.cassandra.CassandraColumnMetaData.<init> (CassandraColumnMetaData.java:118)
2012/05/07 16:48:01 - Cassandra Output.0 - ERROR (version 4.3.0, build 16295 from 2012-01-27 15.53.26 by tomcat) : at org.pentaho.cassandra.CassandraColumnMetaData.refresh (CassandraColumnMetaData.java:175)
2012/05/07 16:48:01 - Cassandra Output.0 - ERROR (version 4.3.0, build 16295 from 2012-01-27 15.53.26 by tomcat) :
2012/05/07 16:48:01 - Cassandra Output.0 - ERROR (version 4.3.0, build 16295 from 2012-01-27 15.53.26 by tomcat) : at org.pentaho.di.trans.steps.cassandraoutput.CassandraOutput.processRow(CassandraOutput.java:185)
2012/05/07 16:48:01 - Cassandra Output.0 - ERROR (version 4.3.0, build 16295 from 2012-01-27 15.53.26 by tomcat) : at org.pentaho.di.trans.step.RunThread.run(RunThread.java:50)
2012/05/07 16:48:01 - Cassandra Output.0 - ERROR (version 4.3.0, build 16295 from 2012-01-27 15.53.26 by tomcat) : at java.lang.Thread.run(Unknown Source)
2012/05/07 16:48:01 - Cassandra Output.0 - ERROR (version 4.3.0, build 16295 from 2012-01-27 15.53.26 by tomcat) : Caused by: java.lang.NullPointerException
2012/05/07 16:48:01 - Cassandra Output.0 - ERROR (version 4.3.0, build 16295 from 2012-01-27 15.53.26 by tomcat) : at org.pentaho.cassandra.CassandraColumnMetaData.refresh(CassandraColumnMetaData.java:175)
2012/05/07 16:48:01 - Cassandra Output.0 - ERROR (version 4.3.0, build 16295 from 2012-01-27 15.53.26 by tomcat) : at org.pentaho.cassandra.CassandraColumnMetaData.<init>(CassandraColumnMetaData.java:118)
2012/05/07 16:48:01 - Cassandra Output.0 - ERROR (version 4.3.0, build 16295 from 2012-01-27 15.53.26 by tomcat) : at org.pentaho.di.trans.steps.cassandraoutput.CassandraOutput.processRow(CassandraOutput.java:176)
2012/05/07 16:48:01 - Cassandra Output.0 - ERROR (version 4.3.0, build 16295 from 2012-01-27 15.53.26 by tomcat) : ... 2 more
2012/05/07 16:48:01 - Cassandra Output.0 - Closing connection...
2012/05/07 16:48:01 - Cassandra Output.0 - Finished processing (I=0, O=0, R=1, W=0, U=0, E=1)
2012/05/07 16:48:01 - Spoon - The transformation has finished!!
2012/05/07 16:48:01 - HRTest - ERROR (version 4.3.0, build 16295 from 2012-01-27 15.53.26 by tomcat) : Errors detected!
2012/05/07 16:48:01 - HRTest - ERROR (version 4.3.0, build 16295 from 2012-01-27 15.53.26 by tomcat) : Errors detected!
2012/05/07 16:48:01 - HRTest - HRTest
2012/05/07 16:48:01 - HRTest - HRTest
2012/05/07 16:48:01 - HRTest - Looking at step: CSV file input
2012/05/07 16:48:01 - Cassandra Output.0 - Closing connection...
2012/05/07 16:48:02 - HRTest - Looking at step: Cassandra Output
2012/05/07 16:48:02 - Cassandra Output.0 - Closing connection...

Mark
05-08-2012, 04:58 AM
Hi,

There have been some small changes in the Cassandra Thrift API between 1.0.8 and 1.1.0 - the errors you are seeing are almost certainly caused by this. Some recent changes to the CassandraColumnMetaData class should have rectified this (while remaining backwards compatible with Cassandra < 1.1.0). You could try a snapshot build of the big data plugin from our CI server:

http://ci.pentaho.com/job/pentaho-big-data-plugin

Cheers,
Mark.
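
For readers wondering how an API change on the server side can surface as a NullPointerException during a metadata refresh, here is a minimal Java sketch of the general failure mode. It is not the actual CassandraColumnMetaData code and does not use the Cassandra Thrift API: the class MetadataCompatSketch, the attribute name legacy_attribute, and the fallback value BytesType are all hypothetical, chosen only for illustration. The sketch assumes a client that expects an attribute which a newer server no longer sends, and contrasts a fragile read with a tolerant one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class MetadataCompatSketch {

    // Hypothetical stand-in for a column family description returned by a server.
    // "legacy_attribute" is an invented name; a newer server simply omits it here.
    public static Map<String, String> describeColumnFamily(boolean newerServer) {
        Map<String, String> definition = new HashMap<>();
        definition.put("name", "users");
        if (!newerServer) {
            definition.put("legacy_attribute", "UTF8Type");
        }
        return definition;
    }

    // Fragile read: assumes the attribute is always present, so a missing
    // entry (null) is dereferenced and a NullPointerException is thrown.
    public static int fragileRefresh(Map<String, String> definition) {
        return definition.get("legacy_attribute").length();
    }

    // Tolerant read: falls back to a default (an assumption of this sketch)
    // when the attribute is absent, so both server versions are handled.
    public static String tolerantRefresh(Map<String, String> definition) {
        return Optional.ofNullable(definition.get("legacy_attribute")).orElse("BytesType");
    }

    public static void main(String[] args) {
        Map<String, String> fromNewerServer = describeColumnFamily(true);
        System.out.println("tolerant: " + tolerantRefresh(fromNewerServer));
        try {
            fragileRefresh(fromNewerServer);
        } catch (NullPointerException e) {
            System.out.println("fragile read failed with NullPointerException");
        }
    }
}
```

A client written like fragileRefresh works against the older server and fails only after the server is upgraded, which matches the pattern in this thread: same transformation, different Cassandra version, different outcome.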

jcc-ch
05-09-2012, 08:23 AM
You are right. It was a compatibility problem. I'm now using version 1.0.9 and it works.