CQL 3 - Cassandra



satjo
12-04-2012, 02:02 AM
When can we expect to have the integration of CQL 3 ( or beta) in Cassandra Input and Output steps?

Mark
12-05-2012, 05:12 AM
I can't give a definite time frame on that - we'd need to do a decent amount of testing. But it appears to be the case that turning on support for CQL 3 via the Java libraries is as simple as:

Cassandra.Client.set_cql_version("3.0.0");

So we should be able to provide a check box for enabling it in the steps.

Cheers,
Mark.

satjo
12-09-2012, 12:28 PM
Thanks, Mark!

I have built the code with this change. I modified the CassandraConnection class to set the version (m_client.set_cql_version("3.0.0")).
However, I get the following exception when I try to fetch the metadata in the Cassandra input step by clicking the 'Show schema' button.


org.apache.thrift.transport.TTransportException: Cannot write to null outputStream
at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:142)
at org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.java:156)
at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:65)
at org.apache.cassandra.thrift.Cassandra$Client.send_set_cql_version(Cassandra.java:1494)
at org.apache.cassandra.thrift.Cassandra$Client.set_cql_version(Cassandra.java:1486)
at org.pentaho.cassandra.CassandraConnection.<init>(CassandraConnection.java:99)
at org.pentaho.di.trans.steps.cassandrainput.CassandraInputData.getCassandraConnection(CassandraInputData.java:101)
at org.pentaho.di.trans.steps.cassandrainput.CassandraInputDialog$10.widgetSelected(CassandraInputDialog.java:456)
at org.eclipse.swt.widgets.TypedListener.handleEvent(Unknown Source)
at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
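
For readers hitting the same trace: "Cannot write to null outputStream" is what Thrift's TIOStreamTransport reports when it has no output stream yet, so one plausible cause is that set_cql_version was called before the transport had been opened. That is an assumption, since the modified constructor is not shown. The sketch below uses only the Java standard library and hypothetical stand-in classes (SketchTransport, SketchClient, SketchConnection), not the real Thrift or Cassandra API, to show why the order of the two calls matters.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the connection setup; not the plugin's class.
class SketchConnection {
    // openFirst = true opens the transport before the first client call.
    static SketchClient connect(boolean openFirst) throws IOException {
        SketchTransport transport = new SketchTransport();
        if (openFirst) {
            transport.open();
        }
        SketchClient client = new SketchClient(transport);
        client.setCqlVersion("3.0.0"); // writes to the transport
        return client;
    }

    public static void main(String[] args) {
        for (boolean openFirst : new boolean[] {false, true}) {
            try {
                SketchClient client = connect(openFirst);
                System.out.println("openFirst=" + openFirst
                        + " -> CQL version " + client.cqlVersion());
            } catch (IOException e) {
                System.out.println("openFirst=" + openFirst
                        + " -> failed: " + e.getMessage());
            }
        }
    }
}

// Hypothetical transport: the output stream only exists after open().
class SketchTransport {
    private OutputStream out; // null until open() is called

    void open() {
        out = new ByteArrayOutputStream();
    }

    void write(byte[] data) throws IOException {
        if (out == null) {
            throw new IOException("Cannot write to null outputStream");
        }
        out.write(data);
    }
}

// Hypothetical client: every call is sent through the transport.
class SketchClient {
    private final SketchTransport transport;
    private String cqlVersion = "2.0.0"; // assumed default, for illustration only

    SketchClient(SketchTransport transport) {
        this.transport = transport;
    }

    void setCqlVersion(String version) throws IOException {
        transport.write(version.getBytes(StandardCharsets.UTF_8));
        cqlVersion = version;
    }

    String cqlVersion() {
        return cqlVersion;
    }
}
```

If this is what is happening, moving the set_cql_version call to after the point where the transport is opened would be the first thing to try.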

BTW, I had to modify the SSTableWriter.java class because the API for SSTableSimpleUnsortedWriter changed in Cassandra 1.1.6. Otherwise I get the following error during the build.

[javac] C:\cassandra\Kettle-plugin-Nov21\big-data-plugin-build-ant\big-data-plugin-master\src\org\pentaho\di\trans\steps\cassandrasstableoutput\SSTableWriter.java:131: cannot find symbol
[javac] symbol : constructor SSTableSimpleUnsortedWriter(java.io.File,java.lang.String,java.lang.String,org.apache.cassandra.db.marshal.AsciiType,<nulltype>,int)
[javac] location: class org.apache.cassandra.io.sstable.SSTableSimpleUnsortedWriter
[javac] writer = new SSTableSimpleUnsortedWriter(directory, keyspace,
[javac] ^
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
[javac] 1 error

Mark
12-10-2012, 03:49 AM
Hi Satjo,

Until I get the chance to look at this myself I can't say for sure what could be going wrong. I did try CQL 3 briefly when the Cassandra 1.1 beta came out, in order to experiment with how it handled composite types. The set_cql_version call worked for me at the time, and I found that it supported composite keys and values but not composite column names (which is why I added the Thrift mode). These were the docs that I looked at:

http://www.datastax.com/docs/1.1/dml/using_cql
http://www.datastax.com/dev/blog/whats-new-in-cql-3-0

It sounds like CQL 3 is still not final (and won't be until Cassandra 1.2).

Cheers,
Mark.

satjo
01-03-2013, 11:54 AM
We finally got the 1.2 release (https://blogs.apache.org/foundation/entry/the_apache_software_foundation_announces38) and are looking forward to seeing the integration.

vweijo
01-17-2013, 11:42 AM
Hi Mark,

I'm also interested in using CQL 3 features with Cassandra input step. Modifying the CassandraConnection to include 'm_client.set_cql_version("3.0.0")' shouldn't be too much trouble, but the Thrift version specified in big data plugin requirements is 1.0.8, which doesn't include that method. Do you think there will be problems if I replace Thrift 1.0.8 with, let's say, Thrift 1.2.0?

Mark
01-21-2013, 11:26 PM
Hi,

You could give it a go, but a recompile of the Cassandra input step might be required. Also, methods in CfDef have a nasty habit of disappearing between Cassandra releases - this happened between 0.8 and 1.0; and between 1.0.8 and 1.1.0. It might have happened again :-) The CassandraInput step queries CfDef for information on the column family to display when the "Show schema" button is pressed.

Cheers,
Mark.

vweijo
01-28-2013, 11:45 AM
I finally had time to try creating a custom CassandraInput step, and it kind of worked with cassandra-*-1.1.8 libraries. Uncommenting the 'set_cql_version' from source didn't seem to break anything and access to data works, but for some reason it doesn't seem to get a proper schema for a table with compound keys. For example, the 'Show schema' dialog describes column family data correctly but column metadata is empty, because '... = colDefs.getColumn_metadataIterator();' returns an empty iterator. Not sure if this is a feature or a problem with how the API is used (it described non-compound tables correctly, though).

vweijo
01-29-2013, 11:46 AM
It seems CQL 3 is supported over Thrift only in a very limited fashion, i.e., you have to create tables 'WITH COMPACT STORAGE' and even then you won't get proper column metadata. Instead of Thrift, CQL 3 is supposed to be used via a native protocol (http://www.datastax.com/dev/blog/binary-protocol), which would then require a major rewrite of the Cassandra input step.

Mark
01-30-2013, 03:55 AM
Interesting. Still no support for streaming - streaming, in the Cassandra universe, has been "coming soon" since Noah was a boy :-)

I think it's probably too soon for us to go bananas and rush to support CQL 3 with this new binary protocol. It's been a while since I read the initial stuff on CQL 3, but I'm pretty sure I read that it would require folks to recreate their existing databases in the CQL 3 format. Our experience so far has shown that there are plenty of people out there doing some semi-weird stuff via the non-CQL Thrift transport. One example was encoding arbitrary information in composite column names. I vaguely recall that CQL 3 specifically uses composite column names to encode some sort of index information.

I guess the best thing to do would be to build in options for CQL 3 and the new protocol into the Cassandra steps, while still retaining the existing CQL 2/Thrift and pure Thrift modes.

Cheers,
Mark.
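
One way to picture the options described above, as a rough sketch only: an enum that names each access mode and records which transport and CQL version it implies. All names here (AccessMode, fromSetting, the setting strings) are hypothetical and not taken from the plugin. Falling back to CQL 2 over Thrift when no setting is present is an assumption, meant to keep existing transformations behaving as before.

```java
import java.util.Locale;

class AccessModeDemo {
    public static void main(String[] args) {
        // A null setting stands for a transformation saved before the option existed.
        for (String setting : new String[] {null, "thrift", "cql3_native"}) {
            AccessMode mode = AccessMode.fromSetting(setting);
            System.out.println(setting + " -> " + mode
                    + " (transport=" + mode.transport
                    + ", cql=" + mode.cqlVersion + ")");
        }
    }
}

// Hypothetical access modes for a Cassandra step. None of these names
// come from the big data plugin; they only illustrate the idea.
enum AccessMode {
    THRIFT("thrift", null),          // pure Thrift, no CQL
    CQL2_THRIFT("thrift", "2.0.0"),  // existing CQL 2 over Thrift
    CQL3_THRIFT("thrift", "3.0.0"),  // CQL 3 enabled via set_cql_version
    CQL3_NATIVE("native", "3.0.0");  // CQL 3 over the new binary protocol

    final String transport;
    final String cqlVersion; // null when the mode does not use CQL

    AccessMode(String transport, String cqlVersion) {
        this.transport = transport;
        this.cqlVersion = cqlVersion;
    }

    boolean usesCql() {
        return cqlVersion != null;
    }

    // Assumption: a missing or blank setting falls back to CQL 2 over Thrift.
    static AccessMode fromSetting(String value) {
        if (value == null || value.trim().isEmpty()) {
            return CQL2_THRIFT;
        }
        return valueOf(value.trim().toUpperCase(Locale.ROOT));
    }
}
```

Keeping the transport and the CQL version as separate fields means the step code can branch on either one without parsing the mode name.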

satjo
03-04-2013, 11:49 AM
I agree that it is better to have options for CQL 3 and the new protocol while retaining the CQL 2 mode.

satjo