Hitachi Vantara Pentaho Community Forums
Thread: Stops after Millions Records

  1. #1
    Join Date
    Sep 2008
    Posts
    7

    Stops after Millions Records

    Hi,

    I am using Kettle to copy records from a CSV (comma-separated) file into a PostgreSQL database.

    The CSV file has around 3 million records. I have created a script to update the DB. It works for around 50,000 records, but with 3 million records it hangs after reading 1 million. The script includes "Sort" and "Unique" steps.

    Please advise how I can make it work.

    Thanks,
    Swap

  2. #2
    Join Date
    May 2006
    Posts
    4,882

    Are any errors showing up? If so, please post them.

    If not...

    Remove the update step and replace it with a text file output. That way you know whether it's a database problem or a PDI problem. If it works without the update step, it's database related.

    If it's database related, as I expect, have a chat with your local Postgres DBA and check what's running in the database at the moment it hangs. My first guess would be some kind of deadlock.

    Regards,
    Sven

  3. #3
    DEinspanjer Guest

    If you are using the Sort step then you could be running into memory issues as well.
    What version of Kettle are you running? Can you post the transformation so we can see it?
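
To illustrate why sorting and de-duplicating millions of rows can put pressure on memory, here is a minimal Python sketch of a chunked (external) sort followed by a unique pass. It is not how Kettle's Sort or Unique steps are implemented; the function name `sorted_unique_lines` and the `chunk_size` parameter are hypothetical, and the input is assumed to be an iterable of strings without embedded newlines.

```python
import heapq
import itertools
import tempfile


def sorted_unique_lines(lines, chunk_size=100_000):
    """Yield the distinct lines in sorted order.

    Only about `chunk_size` lines are sorted in memory at a time;
    each sorted chunk is spilled to a temporary file and the chunks
    are merged lazily afterwards.
    """
    chunk_files = []
    it = iter(lines)
    try:
        while True:
            # Sort one bounded chunk in memory.
            chunk = sorted(itertools.islice(it, chunk_size))
            if not chunk:
                break
            tmp = tempfile.TemporaryFile(mode="w+", encoding="utf-8")
            tmp.writelines(line + "\n" for line in chunk)
            tmp.seek(0)
            chunk_files.append(tmp)

        # Merge the sorted chunks without loading them fully.
        streams = [(line.rstrip("\n") for line in f) for f in chunk_files]
        previous = None
        for line in heapq.merge(*streams):
            # Duplicates are adjacent after sorting, so one comparison suffices.
            if line != previous:
                yield line
                previous = line
    finally:
        for f in chunk_files:
            f.close()
```

A plain `sorted(set(lines))` gives the same result but needs every row in memory at once, which is the situation that can fail on a multi-million-row input with a small heap.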

  4. #4
    Join Date
    Sep 2008
    Posts
    7

    I'm using version 2.5.2.

    Please find the .ktr file attached.

  5. #5
    Join Date
    Sep 2008
    Posts
    7

    It's not a database-related issue. Even when I remove the DB update step, I get the same issue. There is no error as such. The input step reads only 1 million records (instead of 3 million) and then reports "Finished".

  6. #6
    DEinspanjer Guest

    Does the hppolexc input step say that it is finished with 1M rows or 3M?
    How much memory are you allocating to the JVM? How close is it to that memory cap when it stops?

  7. #7
    Join Date
    Sep 2008
    Posts
    7

    The input step says that it finished with 1M rows (it should finish after 3M).
    How do you set the JVM memory allocation in Kettle?
    I have no idea about it. Please help me.

  8. #8
    DEinspanjer Guest

    If the step says that it finished, that means no errors happened and the number of rows processed should equal the number of rows in the file. I'd say you need to double check your file to make sure you understand what is supposed to be happening. Did you try a "wc -l" line count on it?
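
A line count can differ from the number of CSV records when quoted fields contain line breaks, so it may help to compare both. The following is a minimal Python sketch under that assumption; the function name `count_lines_and_records`, the default encoding, and the file name in the usage note are hypothetical. Note that `wc -l` counts newline characters, so it reports one less than this function's first value when the last line has no trailing newline.

```python
import csv


def count_lines_and_records(path, encoding="utf-8"):
    """Return (physical_lines, csv_records) for the file at `path`."""
    # Physical lines, counted on raw bytes so the encoding does not matter.
    with open(path, "rb") as f:
        physical_lines = sum(1 for _ in f)

    # Logical CSV records: a quoted field may span several physical lines.
    with open(path, newline="", encoding=encoding) as f:
        csv_records = sum(1 for _ in csv.reader(f))

    return physical_lines, csv_records
```

Calling `count_lines_and_records("input.csv")` and comparing the two numbers with the row count reported by the input step shows whether the file and the reader agree on how many records exist. If the file has a header row, both numbers include it.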

    As for JVM memory settings, you can either search the web in general or you can search this forum. There are many many posts about setting the JVM memory settings.

  9. #9
    Join Date
    Sep 2008
    Posts
    7

    I have checked, and the file has 3M records. It finishes after 1M, and sometimes after 500,000 records.
