Hitachi Vantara Pentaho Community Forums

Thread: SSH command to csv

  1. #1
    Join Date
    May 2014
    Posts
    13

    SSH command to csv

    I am trying to load the output of an SSH command into Excel or a Postgres database. My SSH command is:

    find /usr/local/data/logs/glassfish3_logs -regex '.+\.log_2015-04-13.+' -printf "%f,%TY-%Tm-%Td %TI:%TM,%k\n"
    This command prints the file name, modified date, and file size of each matching file. Before processing, the output is 200KB-800KB. The SSH step returns stdout (response) and stderr. I used a Select values step to keep only response and then a Split field to rows step to split it into rows, but the resulting file is huge (270MB-2.5GB). How can I avoid this?
    I could not upload the file, but here is a link to the Kettle job and transformation:

    https://drive.google.com/file/d/0B-Q...ew?usp=sharing
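
To sanity-check the raw response outside Kettle, a minimal Python sketch like the following could be used. It assumes the three comma-separated fields produced by the -printf format above and no commas in file names; the function name parse_find_output and the dictionary keys are hypothetical, not part of the original job.

```python
import csv
import io


def parse_find_output(response: str):
    """Turn the multi-line stdout of the find command into one dict per file."""
    rows = []
    for row in csv.reader(io.StringIO(response)):
        if not row:
            continue  # skip blank lines in the response
        name, modified, size_kb = row
        rows.append({
            "file_name": name,        # %f: file name without leading directories
            # %TI is the 12-hour clock hour, so AM/PM is not present in this
            # field; it is kept as text rather than parsed into a timestamp.
            "modified": modified,
            "size_kb": int(size_kb),  # %k: disk space used, in 1 KB blocks
        })
    return rows
```

Each returned dict corresponds to one line of the response, which is the row layout the split step is meant to produce.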

  2. #2
    Join Date
    Jun 2012
    Posts
    5,534

    One thing you can do to keep the Text file output (TFO) file size to a minimum: enable the "Fast data dump" option on the Content tab of the step settings.
    So long, and thanks for all the fish.

  3. #3
    Join Date
    May 2014
    Posts
    13

    I tried that, and now the file size is 8GB. The line separator in my command is \n, which is also what I am using as the delimiter. My format would be:

    stdout;stderror
    log1.csv,2014-1-2
    log2.csv,2014-1-3;N
    pdi-ce-5.0.1-stable
    java-1.7.0_75
    windows 8.

  4. #4
    Join Date
    Jun 2012
    Posts
    5,534

    You actually increased file size?
    Makes me stumble.
    What's in your ssh_output.txt - does it look healthy?
    Can you show a couple of lines?

  5. #5
    Join Date
    May 2014
    Posts
    13

    Here is the output file (the final one is 300 MB even for a small number of files in the find command):
    https://drive.google.com/file/d/0B-Q...ew?usp=sharing

    I have tried your solution from the previous post, but it is not working.
    response,error_response
    server.log_2015-04-13T18-38-41,2015-04-13 06:38,1956
    server.log_2015-04-13T15-34-23,2015-04-13 03:34,1956
    server.log_2015-04-13T13-08-52,2015-04-13 01:08,1956
    server.log_2015-04-13T15-24-20,2015-04-13 03:24,1956
    server.log_2015-04-13T19-05-43,2015-04-13 07:05,1956
    server.log_2015-04-13T23-02-45,2015-04-13 11:02,1956
    server.log_2015-04-13T21-11-02,2015-04-13 09:11,1956
    Last edited by kitex; 04-14-2015 at 03:07 AM. Reason: what i have tried so far.

  6. #6
    Join Date
    Jun 2012
    Posts
    5,534

    When splitting a field to rows, the original field is kept in the stream, so every output row still carries a full copy of the multi-line response.
    Just remove the original field (response) after the split and things should lighten up.
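
The size blow-up can be illustrated with a minimal Python sketch. This is not Kettle code: split_field_to_rows and output_size are hypothetical helpers that only mimic a split where each output row either keeps or drops the original field, using one of the sample lines from post #5 repeated 1000 times as an assumed response.

```python
def split_field_to_rows(response: str, keep_original: bool):
    """Mimic splitting a multi-line field into rows, one row per line."""
    lines = response.splitlines()
    if keep_original:
        # Each output row still carries the whole original field.
        return [(response, line) for line in lines]
    return [(line,) for line in lines]


def output_size(rows):
    """Total number of characters across all fields of all rows."""
    return sum(len(field) for row in rows for field in row)


line = "server.log_2015-04-13T18-38-41,2015-04-13 06:38,1956"
response = "\n".join([line] * 1000)  # assumed 1000-line response

with_original = output_size(split_field_to_rows(response, keep_original=True))
without_original = output_size(split_field_to_rows(response, keep_original=False))
print(with_original, without_original)
```

With the original field kept, the total grows with the number of lines times the size of the whole response, which is consistent with a response of a few hundred KB turning into hundreds of MB.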

  7. #7
    Join Date
    May 2014
    Posts
    13

    It works

    Yes, now things work. Thanks a lot.
