
Thread: Launching .bat file from transformation

  1. #1


    Hello

    I am trying to unzip a .gz file using 7-Zip on my Windows XP machine. Here are the details:

    working directory and file: C:\temp\datafile05112011.txt.gz
    7-zip program location: C:\Program Files\7-Zip\7z.exe

    The important point is that the filename is held in a variable, such as fname=datafile05112011.txt.gz, where the date changes every day.

    I am using the Shell script step and having a hard time getting this to work. I am probably missing something obvious.

    Has anyone resolved a similar issue? Please let me know.

    Thanks and regards, charu
    Last edited by charusheel; 05-19-2011 at 03:30 PM.

  2. #2
    Join Date
    Mar 2006
    Posts
    170


    Hi Charu,

    Why don't you use a Job to do the unzipping? There is a job entry that unzips files, and it can take variables for both the name and the path.

    In a transformation, create a variable that holds the dynamic name and path of the zipped file. Then, in a Job, call that transformation and place the Unzip file entry after it, configured to use the variable.

    I do this all the time.

    Good luck!

    Kent

  3. #3


    Sorry, I misspoke. I am trying to do this in the job itself, and the issue is passing the argument value (the filename) along with the 7-Zip command-line parameter 'e'.

    If you have done the same thing, would it be possible for you to share your Shell script step from the job?

    The "UnZip" step won't extract a .gz file.


    Here is the error log if that helps:
    2011/05/19 14:21:46 - Spoon - Starting job...

    - UnZipFiles.bat - Found 0 previous result rows
    - UnZipFiles.bat - Running on platform : Windows XP
    - UnZipFiles.bat - Executing command : cmd.exe /C "C:\mydir\UnZipFiles.bat "e DATAFILE_05112011.txt.gz""
    - UnZipFiles.bat - (stdout)
    - UnZipFiles.bat - (stdout) 7-Zip 4.65 Copyright (c) 1999-2009 Igor Pavlov 2009-02-03
    - UnZipFiles.bat - (stdout)
    - UnZipFiles.bat - (stdout) Usage: 7z <command> [<switches>...] <archive_name> [<file_names>...]
    - UnZipFiles.bat - (stdout) [<@listfiles...>]
    - UnZipFiles.bat - (stdout)
    - UnZipFiles.bat - (stdout) <Commands>
    - UnZipFiles.bat - (stdout) a: Add files to archive
    - UnZipFiles.bat - (stdout) b: Benchmark
    - UnZipFiles.bat - (stdout) d: Delete files from archive
    - UnZipFiles.bat - (stdout) e: Extract files from archive (without using directory names)
    - UnZipFiles.bat - (stdout) l: List contents of archive
    - UnZipFiles.bat - (stdout) t: Test integrity of archive
    - UnZipFiles.bat - (stdout) u: Update files to archive
    - UnZipFiles.bat - (stdout) x: eXtract files with full paths
    - UnZipFiles.bat - (stdout) <Switches>
    - UnZipFiles.bat - (stdout) -ai[r[-|0]]{@listfile|!wildcard}: Include archives
    - UnZipFiles.bat - (stdout) -ax[r[-|0]]{@listfile|!wildcard}: eXclude archives
    - UnZipFiles.bat - (stdout) -bd: Disable percentage indicator
    - UnZipFiles.bat - (stdout) -i[r[-|0]]{@listfile|!wildcard}: Include filenames
    - UnZipFiles.bat - (stdout) -m{Parameters}: set compression Method
    - UnZipFiles.bat - (stdout) -o{Directory}: set Output directory
    - UnZipFiles.bat - (stdout) -p{Password}: set Password
    - UnZipFiles.bat - (stdout) -r[-|0]: Recurse subdirectories
    - UnZipFiles.bat - (stdout) -scs{UTF-8 | WIN | DOS}: set charset for list files
    - UnZipFiles.bat - (stdout) -sfx[{name}]: Create SFX archive
    - UnZipFiles.bat - (stdout) -si[{name}]: read data from stdin
    - UnZipFiles.bat - (stdout) -slt: show technical information for l (List) command
    - UnZipFiles.bat - (stdout) -so: write data to stdout
    - UnZipFiles.bat - (stdout) -ssc[-]: set sensitive case mode
    - UnZipFiles.bat - (stdout) -ssw: compress shared files
    - UnZipFiles.bat - (stdout) -t{Type}: Set type of archive
    - UnZipFiles.bat - (stdout) -v{Size}[b|k|m|g]: Create volumes
    - UnZipFiles.bat - (stdout) -u[-][p#][q#][r#][x#][y#][z#][!newArchiveName]: Update options
    - UnZipFiles.bat - (stdout) -w[{path}]: assign Work directory. Empty path means a temporary directory
    - UnZipFiles.bat - (stdout) -x[r[-|0]]]{@listfile|!wildcard}: eXclude filenames
    - UnZipFiles.bat - (stdout) -y: assume Yes on all queries
    - UnZipFiles.bat - Command cmd.exe /C "C:\mydir\UnZipFiles.bat "e DATAFILE_05112011.txt.gz"" has finished
    - TEST-UncompressGZipfile - Finished job entry [UnZipFiles.bat] (result=[true])
    - TEST-UncompressGZipfile - Finished job entry [Simple evaluation] (result=[true])
    - TEST-UncompressGZipfile - Finished job entry [Get NOPS file with SFTP 2] (result=[true])
    - TEST-UncompressGZipfile - Job execution finished
    - Spoon - Job has ended.

    where
    UnZipFiles.bat is
    @echo off
    "C:\Program Files\7-Zip\7z.exe" %

    Last edited by charusheel; 05-19-2011 at 05:53 PM.
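
    Note on the log above: the whole string "e DATAFILE_05112011.txt.gz" appears to reach the batch file inside one pair of quotes, so 7-Zip may be receiving it as a single argument instead of the command e followed by an archive name. That would explain the usage text. A minimal sketch of a batch file that avoids this is shown below. It assumes the job entry passes only the archive path as its one argument (for example C:\temp\${fname}, an illustrative value not confirmed in the thread) and that the command e is hard-coded in the script.

    ```bat
    @echo off
    rem Sketch only: expects the archive path as the single argument (%1).
    rem %~dp1 expands to the drive and folder of %1, %~nx1 to its file name.

    rem Move to the folder that holds the archive so 7-Zip extracts there.
    cd /d "%~dp1"

    rem e = extract, -y = assume Yes on all queries (both listed in the usage text above).
    "C:\Program Files\7-Zip\7z.exe" e "%~nx1" -y
    ```

    Called as UnZipFiles.bat "C:\temp\datafile05112011.txt.gz", this should leave datafile05112011.txt in C:\temp, provided the archive exists at that path.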

  4. #4
    Join Date
    Apr 2008
    Posts
    2,597


    Why not read the datafile.txt.gz directly?

    file:gz://${WorkingDirectory}/${fname} in the Text File Input..

  5. #5


    I am not sure about the performance hit of first reading the data as a text file and then moving it to the database.

    Basically, I am importing the data into a MySQL staging area, and the MySQL bulk loader will not import data from a .gz file.

  6. #6
    Join Date
    Mar 2006
    Posts
    170


    Hi Charu,

    I think what Maria is saying is: don't bother unzipping the file. PDI can read the data in a zipped file just as if it were unzipped, which is what I think you are trying to do.

    You can also use PDI's Bulk Loader to insert the data from the zipped file into your MySQL database. I'm not sure whether that is what you are doing or whether you are actually using the MySQL bulk load utility.

    Thanks

    Kent

  7. #7
    Join Date
    Nov 1999
    Posts
    9,689


    To give an alternative answer to the original question, you can also execute bat files or shell scripts in general with the ... "Text File Output" step.
    One of the options in the step is to pass the output, normally written to a text file, to a script of choice.
    In this case I guess you could pass the file-names to the script as input on stdin.
    Matt Casters, Chief Data Integration
    Pentaho, Open Source Business Intelligence
    http://www.pentaho.org -- mcasters@pentaho.org


  8. #8


    Matt,
    Thanks. I got your suggestion working after banging my head a little. Now I can use SFTP to transfer a zip file, extract it locally, and process it.


    Kent,
    I tried specifying the .gz (compressed data file) in the bulk loading step, and it complained: 'Data too long for column 1'. If I keep all the settings for the bulk load step the same and change the filename to the extracted one, everything works fine. Are there any settings, such as 'Ignore first line', that cannot be used when pointing to .gz files?
    I couldn't find any setting to specify the compression mechanism either. If you have used this approach in the past, please share some more details. I am using Kettle 4.1.


    Thanks and regards, Charu
    Last edited by charusheel; 05-23-2011 at 11:35 AM.

  9. #9
    Join Date
    Mar 2006
    Posts
    170


    Hi Charu,

    I think you skipped a step in the process...

    You would need to have a CSV Input or a Text File Input step "read" in the .gz file and once in "memory" direct that through a "hop" to the Bulk Loading step.

    Not sure if a Bulk Loading step can consume a compressed file ... might be able to, not sure.

    So you need a step that "inputs" your data that is in a compressed format and then "streams" it to the Bulk Loading step via a hop.

    Hope that helps!

    Kent

  10. #10


    Thanks for the clarification. I will try that approach and compare its performance against loading the extracted file directly into MySQL.
