Hello,

I am configuring PDI 5.2 to read an Avro file from HDFS.

PDI is able to browse the file residing on the cluster. However, any attempt to preview the file, or to transfer it to any other location, results in the same error message.

The system configuration is:
PDI (running on Windows 8.1, 64-bit, 16 GB RAM, Core i5)
Kettle/Spoon GA 5.2.0.0

Hadoop Cluster is working fine.

I am done with the shim settings, having followed both of these links:

http://wiki.pentaho.com/display/BAD/...for+YARN+Shims

http://funpdi.blogspot.in/2013/03/pe...nd-hadoop.html
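For the record, this is how I selected the shim in the Big Data plugin's plugin.properties (path relative to my data-integration folder; the shim name below is just an example value and should match the actual cluster distribution):

```properties
# plugins/pentaho-big-data-plugin/plugin.properties
# Select the shim matching the cluster distribution (example value)
active.hadoop.configuration=hdp21
```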


The error message is:

org.pentaho.di.core.exception.KettleFileException:


Exception reading line: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-647747408-172.31.42.71-1418062088466:blk_1073741892_1068 file=/dragonflytest/FiveNOAA_00000.avro
Could not obtain block: BP-647747408-172.31.42.71-1418062088466:blk_1073741892_1068 file=/dragonflytest/FiveNOAA_00000.avro

at org.pentaho.di.trans.steps.textfileinput.TextFileInput.getLine(TextFileInput.java:157)
at org.pentaho.di.trans.steps.textfileinput.TextFileInput.getLine(TextFileInput.java:95)
at org.pentaho.di.ui.trans.steps.hadoopfileinput.HadoopFileInputDialog.getCSV(HadoopFileInputDialog.java:2563)
at org.pentaho.di.ui.trans.steps.hadoopfileinput.HadoopFileInputDialog.get(HadoopFileInputDialog.java:2507)
at org.pentaho.di.ui.trans.steps.hadoopfileinput.HadoopFileInputDialog.access$300(HadoopFileInputDialog.java:117)
at org.pentaho.di.ui.trans.steps.hadoopfileinput.HadoopFileInputDialog$5.handleEvent(HadoopFileInputDialog.java:483)
at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)
at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
at org.pentaho.di.ui.trans.steps.hadoopfileinput.HadoopFileInputDialog.open(HadoopFileInputDialog.java:680)
at org.pentaho.di.ui.spoon.delegates.SpoonStepsDelegate.editStep(SpoonStepsDelegate.java:124)
at org.pentaho.di.ui.spoon.Spoon.editStep(Spoon.java:8720)
at org.pentaho.di.ui.spoon.trans.TransGraph.editStep(TransGraph.java:3027)
at org.pentaho.di.ui.spoon.trans.TransGraph.mouseDoubleClick(TransGraph.java:744)
at org.eclipse.swt.widgets.TypedListener.handleEvent(Unknown Source)
at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)
at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
at org.pentaho.di.ui.spoon.Spoon.readAndDispatch(Spoon.java:1310)
at org.pentaho.di.ui.spoon.Spoon.waitForDispose(Spoon.java:7931)
at org.pentaho.di.ui.spoon.Spoon.start(Spoon.java:9202)
at org.pentaho.di.ui.spoon.Spoon.main(Spoon.java:648)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.pentaho.commons.launcher.Launcher.main(Launcher.java:92)
Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-647747408-172.31.42.71-1418062088466:blk_1073741892_1068 file=/dragonflytest/FiveNOAA_00000.avro
at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:878)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:559)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:789)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:836)
at java.io.DataInputStream.read(Unknown Source)
at java.io.BufferedInputStream.fill(Unknown Source)
at java.io.BufferedInputStream.read(Unknown Source)
at org.apache.commons.vfs.util.MonitorInputStream.read(Unknown Source)
at org.pentaho.di.core.compress.CompressionInputStream.read(CompressionInputStream.java:36)
at java.io.InputStream.read(Unknown Source)
at sun.nio.cs.StreamDecoder.readBytes(Unknown Source)
at sun.nio.cs.StreamDecoder.implRead(Unknown Source)
at sun.nio.cs.StreamDecoder.read(Unknown Source)
at sun.nio.cs.StreamDecoder.read0(Unknown Source)
at sun.nio.cs.StreamDecoder.read(Unknown Source)
at java.io.InputStreamReader.read(Unknown Source)
at org.pentaho.di.trans.steps.textfileinput.TextFileInput.getLine(TextFileInput.java:106)
... 28 more
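Since browsing works but reading fails, it looks like Spoon can reach the NameNode but not the DataNode holding the block. To rule out a missing or corrupt block versus a connectivity issue, this is what I was planning to run from a cluster node (assuming the hdfs CLI is available there):

```shell
# Show block health, replica count, and DataNode locations for the file
hdfs fsck /dragonflytest/FiveNOAA_00000.avro -files -blocks -locations

# Confirm the file itself is readable from inside the cluster
hdfs dfs -cat /dragonflytest/FiveNOAA_00000.avro | head -c 100
```

If fsck reports the block as healthy, then presumably the DataNode address (172.31.42.71 is a private address) is simply not reachable from my Windows machine.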


Please suggest what is still missing in order to read/preview/write data to the Hadoop cluster.

Thanks

Rohit