Hitachi Vantara Pentaho Community Forums

Thread: Transform/Job Speed

  1. #1

    Transform/Job Speed

    To All:

    I am in the process of upgrading from an old version (3.2) to 4.2 and have noticed that many of the transformations/jobs take quite a bit longer to run. In a side-by-side comparison on the same box, one process took 34 seconds on 3.2 and 1.2 minutes on 4.2.

    While the two run different versions of Java (the old one was on 1.5 and 4.2 is on 1.6), I got a large improvement by doing a quick reorg on the repository DB. Once I did this, the 4.2 run time came down to 42 seconds. That is still noticeably slower than 3.2, but it is something I can deal with in most cases.

    Does anyone have any other tricks to get 4.2 to run a bit quicker via the pan.sh command line?

    Thanks

  2. #2

    I have had the same problem since I started using PDI 4.2.

  3. #3

    It is something in how pan.sh works. My shell scripting knowledge stinks, but the help screen should not be taking 20+ seconds, and the only change I made was switching #!/bin/sh to bash so that the script would work at all.

    Same results with pan.sh and kitchen.sh. The box is not slow, nor is it heavily loaded.

    bash-3.00# time kitchen.sh
    Options:
    -rep = Repository name
    -user = Repository username
    -pass = Repository password
    -job = The name of the job to launch
    -dir = The directory (dont forget the leading /)
    -file = The filename (Job XML) to launch
    -level = The logging level (Basic, Detailed, Debug, Rowlevel, Error, Nothing)
    -logfile = The logging file to write to
    -listdir = List the directories in the repository
    -listjobs = List the jobs in the specified directory
    -listrep = List the available repositories
    -norep = Do not log into the repository
    -version = show the version, revision and build date
    -param = Set a named parameter <NAME>=<VALUE>. For example -param:FOO=bar
    -listparam = List information concerning the defined parameters in the specified job.
    -export = Exports all linked resources of the specified job. The argument is the name of a ZIP file.
    -maxloglines = The maximum number of log lines that are kept internally by Kettle. Set to 0 to keep all rows (default)
    -maxlogtimeout = The maximum age (in minutes) of a log line while being kept internally by Kettle. Set to 0 to keep all rows indefinitely (default)


    real 0m32.305s
    user 0m23.011s
    sys 0m2.342s
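
    A single `time` run can be skewed by a cold file cache, so it may help to repeat the measurement a few times per install. The following is a minimal sketch, not part of PDI: `time_runs` is a hypothetical helper, the install paths in the comments are assumptions, and it relies on bash's built-in SECONDS counter (whole seconds only).

```shell
#!/usr/bin/env bash
# time_runs COUNT COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND COUNT times and prints the wall-clock
# seconds of each run. The command's own output and exit status are ignored,
# because only the elapsed time is of interest here.
time_runs() {
    local runs=$1
    shift
    local i=1 start
    while [ "$i" -le "$runs" ]; do
        start=$SECONDS                      # bash counter, 1-second resolution
        "$@" > /dev/null 2>&1
        echo "run $i: $((SECONDS - start))s"
        i=$((i + 1))
    done
}

# Example usage (the install paths are assumptions):
#   time_runs 3 /opt/kettle_4.2/kitchen.sh
#   time_runs 3 /opt/kettle_3.2/kitchen.sh
```

    If the first run is much slower than the later ones, disk caching is part of the picture; if every run is equally slow, the startup work itself is the likely cost.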

  4. #4

    OK, root cause found, and in my case this works. A simple dtrace showed me what was going on:


    dtrace -n 'syscall::open*:entry { printf("%s %s",execname,copyinstr(arg0)); }' | grep kettle_4.2 > /tmp/files4

    versus

    dtrace -n 'syscall::open*:entry { printf("%s %s",execname,copyinstr(arg0)); }' | grep kettle_3.2 > /tmp/files3

    showed right off the bat what was going on: all of the plugins and libext files were being loaded.
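
    To see where the extra file opens come from, the two capture files can be summarised per top-level directory. This is only a sketch under assumptions: `count_by_dir` is a hypothetical helper, it treats any whitespace-separated field containing the install directory name as a path (so paths with spaces are not handled), and it needs an awk that supports -v (nawk or gawk).

```shell
#!/usr/bin/env bash
# count_by_dir FILE MARKER
# Hypothetical helper: prints "<count> <subdir>" for each first-level
# directory below MARKER (for example kettle_4.2), counting each distinct
# path once. Files directly under MARKER are reported as ".".
count_by_dir() {
    awk -v marker="$2" '
        {
            for (i = 1; i <= NF; i++) {
                pos = index($i, marker "/")
                if (pos == 0) continue          # field is not under MARKER
                if (seen[$i]++) continue        # path already counted
                rest = substr($i, pos + length(marker) + 1)
                n = split(rest, parts, "/")
                top = (n > 1) ? parts[1] : "."
                count[top]++
            }
        }
        END { for (d in count) print count[d], d }
    ' "$1" | sort -k1,1nr -k2,2
}

# Example usage with the capture files from the dtrace runs:
#   count_by_dir /tmp/files4 kettle_4.2
#   count_by_dir /tmp/files3 kettle_3.2
```

    Comparing the two summaries side by side shows which directories account for the additional opens in the newer install.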

    Since I use a Windows-based system to design the items and a Unix-based box to run the jobs/transformations, I removed the agile plugin and some of the libext files I will never use. Run time decreased by 50%.

    Word to the wise: never delete anything without backups and without knowing what it will impact! I feel pretty sure I am good after removing the files (I think).
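
    One way to follow the backup advice is to move unwanted items aside instead of deleting them, so they can be put back if a job turns out to need them. A minimal sketch, assuming nothing about the PDI layout: `park_item` is a hypothetical helper and the paths in the usage comment are assumptions.

```shell
#!/usr/bin/env bash
# park_item PATH BACKUP_DIR
# Hypothetical helper: moves PATH into BACKUP_DIR rather than deleting it.
# Refuses to run if PATH is missing or if an item with the same name is
# already parked, so an earlier backup is never overwritten.
park_item() {
    local item=$1 backup=$2 name
    name=$(basename "$item")
    if [ ! -e "$item" ]; then
        echo "not found: $item" >&2
        return 1
    fi
    if [ -e "$backup/$name" ]; then
        echo "already parked: $backup/$name" >&2
        return 1
    fi
    mkdir -p "$backup" || return 1
    mv "$item" "$backup/" && echo "parked $item in $backup"
}

# Example usage (paths are assumptions):
#   park_item /opt/kettle_4.2/plugins/some-plugin /opt/kettle_4.2_parked
# To restore, move the item back to its original location with mv.
```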

  5. #5

    I still have the same problem after deleting a few libext files or plugins.

  6. #6

    The repository was a bit slower compared to 3.2.5, but I'm sure we fixed this in 4.2.1.

    http://jira.pentaho.com/browse/PDI-6816

  7. #7

    Aha... I have 4.2.0 Stable, which does not include the bug fix. I may upgrade, but most of my issues seem to be fixed by the changes described above.
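
    Checking whether an install already has the 4.2.1 fix comes down to comparing dotted version numbers. The sketch below shows one way to do that comparison; `version_at_least` is a hypothetical helper, and it assumes purely numeric components. The installed version string itself has to be read by hand from the output of the -version option, since its exact format is not shown in this thread.

```shell
#!/usr/bin/env bash
# version_at_least HAVE WANT
# Hypothetical helper: returns 0 if dotted version HAVE is >= WANT,
# 1 otherwise. Missing components count as 0, so 4.2 is treated as 4.2.0.
# Assumes every component is a plain number.
version_at_least() {
    local have=$1 want=$2 h w
    while [ -n "$have" ] || [ -n "$want" ]; do
        h=${have%%.*}
        w=${want%%.*}
        [ "${h:-0}" -gt "${w:-0}" ] && return 0
        [ "${h:-0}" -lt "${w:-0}" ] && return 1
        # Drop the component that was just compared.
        case $have in *.*) have=${have#*.} ;; *) have= ;; esac
        case $want in *.*) want=${want#*.} ;; *) want= ;; esac
    done
    return 0
}

# Example usage:
#   if version_at_least 4.2.0 4.2.1; then echo "fix included"; else echo "upgrade needed"; fi
```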

    Thanks Matt


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.