Hi All,
I am facing a very strange issue while running jobs on very large data sets with a heavy amount of calculation.

1) Memory not released between runs
I have been doing some tests on Pentaho today to work out an optimal way to handle large amounts of data. While testing I noticed that memory is not being completely released between runs. This looks like a memory leak in Pentaho itself.

One way to work around such an issue is to run Pentaho out of process: each iteration ("batch") starts a new instance of Pentaho, so all memory is returned to the operating system when that process exits.

When I shut down Pentaho, the memory is released; but if I run repeated jobs or transformations within the same instance of Pentaho, it does not release all the memory.
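The out-of-process workaround can be sketched as a small driver script. This is only a sketch under assumptions: the `kitchen.sh` path and the `.kjb` job file names are hypothetical, and `kitchen.sh` is Pentaho's standard command-line job runner.

```shell
#!/bin/sh
# Run each batch in its own Pentaho (Kettle) process so the JVM exits,
# and all memory is released, between batches.
run_batches() {
    # KITCHEN defaults to a typical install path (an assumption; adjust to yours)
    kitchen=${KITCHEN:-/opt/pentaho/data-integration/kitchen.sh}
    for job in "$@"; do
        # Fresh Pentaho instance per job file; stop on the first failure
        "$kitchen" -file="$job" -level=Basic || return 1
    done
}

# Example usage (job paths are hypothetical):
#   run_batches /etl/load_customers.kjb /etl/load_orders.kjb
```

Because every batch gets a new JVM, any memory Pentaho fails to release internally is reclaimed by the OS at process exit.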
2) Hangs (never completes issue)
The bug results in Pentaho (Kettle) hanging and never completing. CPU utilisation goes through the roof when this occurs, and the transformation pipeline stops (no data moving through it). Memory use stays acceptable, so it is not a memory or heap issue. It typically happens after the ETL has been running for about 20 minutes.

Increasing the heap size does not help either.
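For reference, this is the kind of heap adjustment that was tried. `PENTAHO_DI_JAVA_OPTIONS` is the environment variable Pentaho Data Integration's launch scripts read for JVM options; the specific sizes and job path here are assumptions for illustration only.

```shell
# Raise the JVM heap for Kettle jobs (values are examples, not recommendations)
export PENTAHO_DI_JAVA_OPTIONS="-Xms1g -Xmx4g"

# Then launch the job as usual (path is hypothetical)
/opt/pentaho/data-integration/kitchen.sh -file=/etl/nightly.kjb
```

Since memory use is acceptable when the hang occurs, a larger heap addressing the symptom is not expected, which matches what was observed.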

This bug looks very much like a "race condition": threads appear to be stuck waiting on a resource that is locked.
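One way to check the race-condition theory is to take a thread dump of the hung Kettle JVM with `jstack` (part of the JDK) and look for threads stuck in the BLOCKED state. A minimal sketch, assuming you have the Pentaho process PID (the PID and file names below are hypothetical):

```shell
# Count threads in the BLOCKED state in a jstack thread dump;
# a large, stable count across repeated dumps suggests lock contention or deadlock.
count_blocked() {
    grep -c 'java.lang.Thread.State: BLOCKED' "$1"
}

# Capture a dump from the hung JVM and inspect it (PID 12345 is hypothetical):
#   jstack 12345 > kettle_dump.txt
#   count_blocked kettle_dump.txt
```

Taking two or three dumps a few seconds apart and comparing which threads stay BLOCKED on the same monitor usually pinpoints the contended resource.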

If anybody has a suggestion or workaround, please let me know how to overcome this issue.