HBase read with date range in start and stop key



rajpal
05-14-2012, 07:27 PM
I'm using the PDI (4.3 preview) Big Data input/output steps and have an HBase table whose key starts with a date in yyyyMMdd format. When reading from HBase, is it possible to read only the rows in the date range 20120501 to 20120515? I tried putting these values in the start and stop key fields, but it didn't work as expected. I also tried a regex in the start key, which didn't work either. Spoon seems to use only the start key; the stop key entry makes no difference. The output always runs from the start key up to the current date.

Any suggestions?

Thanks,
Raj

Mark
05-15-2012, 04:49 AM
Have you provided a formatting string, either in the format cell of the row corresponding to the key in the fields table or in the start and stop key value fields? In the latter case, you specify it by suffixing the start/stop key value with "@" followed by the formatting string, e.g. @yyyyMMdd in your case.

Cheers,
Mark.
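
To make the suffix convention concrete, here is a minimal Python sketch of how a value such as 20120501@yyyyMMdd could be split into a value and a format string and then parsed as a date. This is not the PDI implementation: the helper names split_key_spec and to_epoch_millis, the small Java-to-Python pattern mapping, and the choice of UTC are all assumptions made for illustration.

```python
from datetime import datetime, timezone

# Hypothetical mapping from a few Java SimpleDateFormat tokens to strptime
# directives; only the tokens needed for yyyyMMdd-style patterns are covered.
JAVA_TO_STRPTIME = [("yyyy", "%Y"), ("MM", "%m"), ("dd", "%d")]


def split_key_spec(spec):
    """Split 'value@format' into (value, format); format is None if absent."""
    value, sep, fmt = spec.partition("@")
    return value, (fmt if sep else None)


def to_epoch_millis(spec):
    """Parse a 'value@format' key spec into epoch milliseconds (UTC assumed)."""
    value, fmt = split_key_spec(spec)
    if fmt is None:
        raise ValueError("no format string given after '@'")
    for java_token, directive in JAVA_TO_STRPTIME:
        fmt = fmt.replace(java_token, directive)
    parsed = datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp() * 1000)


start = to_epoch_millis("20120501@yyyyMMdd")
stop = to_epoch_millis("20120515@yyyyMMdd")
print(start, stop)
```

The point of the sketch is only that the part after "@" tells the parser how to read the part before it; the time zone the real step applies may differ.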

rajpal
05-15-2012, 12:52 PM
Thanks for the suggestion, Mark. I tried putting the formatting option in the start/stop key, but no luck. I'm not sure the format I applied is correct for the key.

My key/id is of type string, not date.
The id/key has the format yyyyMMdd-[some-alphanumeric-string], e.g. 20110201-gdgg23-fhdfh-44343-rkekh11

I tried a couple of format options in the start/stop key: 20110201@yyyyMMdd and 20110201@yyyyMMdd.*
Neither of these worked. What would be the correct format for this key?

Mark
05-15-2012, 06:39 PM
OK, the formatting string is only useful if your key is actually a date (i.e. internally stored in HBase as a signed long). Unfortunately, the HBaseInput step doesn't support row key filters yet - feel free to lodge a new feature JIRA for this.

Cheers,
Mark.
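
For string keys like the ones in this thread, the relevant background is that HBase orders rows by the raw bytes of the key, and a scan's start row is inclusive while its stop row is exclusive. The sketch below is plain Python that imitates that ordering on an in-memory list; it does not talk to HBase or PDI, and the sample keys and the scan_range helper are made up for illustration. Whether the 4.3 preview step passes a plain string stop key through unchanged is not something this thread confirms.

```python
import bisect
import struct


def scan_range(sorted_keys, start_key, stop_key):
    """Return keys k with start_key <= k < stop_key, compared as raw bytes."""
    lo = bisect.bisect_left(sorted_keys, start_key)
    hi = bisect.bisect_left(sorted_keys, stop_key)
    return sorted_keys[lo:hi]


# Made-up row keys in the thread's yyyyMMdd-<suffix> layout, sorted bytewise.
keys = sorted([
    b"20120430-aaa111",
    b"20120501-gdgg23",
    b"20120510-fhdfh4",
    b"20120515-rkekh1",
    b"20120516-zzz999",
])

# A stop key of 20120515 sorts before every 20120515-... key, so that day
# is excluded; using the next day as the exclusive bound covers it.
too_short = scan_range(keys, b"20120501", b"20120515")
full = scan_range(keys, b"20120501", b"20120516")
print([k.decode() for k in too_short])
print([k.decode() for k in full])

# A date stored as a signed 8-byte big-endian long (here 2012-05-01 00:00 UTC
# in epoch milliseconds) has entirely different bytes from the string key.
date_as_long = struct.pack(">q", 1335830400000)
print(date_as_long != b"20120501")
```

The last lines illustrate why a date format suffix does not help here: a key parsed as a date and encoded as a long does not share a byte prefix with a key stored as the text 20120501-....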