
Launching .bat file from transformation



charusheel
05-19-2011, 03:27 PM
Hello

I am trying to unzip a .gz file using 7-Zip on my Windows XP machine. Here are the details:

working directory and file: C:\temp\datafile05112011.txt.gz
7-zip program location: C:\Program Files\7-Zip\7z.exe

The important detail is that the filename is held in a variable, e.g. fname=datafile05112011.txt.gz, and the date part changes every day.

I am using the Shell script step and am having a hard time getting it to work. It is probably just not clicking for me.


Has anyone resolved a similar issue? Please let me know.

Thanks and regards, charu

kandrews
05-19-2011, 03:59 PM
Hi Charu,

Why don't you use a Job to do the unzipping? There is a step (job component) that unzips files. It can also use variables for the whole name and path.

In a transformation, create a variable that holds the dynamic name and path of the zipped file. Then, in a Job, call that transformation and put the unzip-a-file step (in the job) after it, configured to use the variable.

I do this all the time.

Good luck!

Kent

charusheel
05-19-2011, 05:46 PM
Sorry, I misstated that. I am trying to do this in the job itself, and the issue is passing the argument value (the filename) together with the 7-Zip command 'e'.

If you have done the same thing, would it be possible for you to share the Shell script step from your job?

The "UnZip" step won't extract a .gz file.


Here is the error log if that helps:
2011/05/19 14:21:46 - Spoon - Starting job...

- UnZipFiles.bat - Found 0 previous result rows
- UnZipFiles.bat - Running on platform : Windows XP
- UnZipFiles.bat - Executing command : cmd.exe /C "C:\mydir\UnZipFiles.bat "e DATAFILE_05112011.txt.gz""
- UnZipFiles.bat - (stdout)
- UnZipFiles.bat - (stdout) 7-Zip 4.65 Copyright (c) 1999-2009 Igor Pavlov 2009-02-03
- UnZipFiles.bat - (stdout)
- UnZipFiles.bat - (stdout) Usage: 7z <command> [<switches>...] <archive_name> [<file_names>...]
- UnZipFiles.bat - (stdout) [<@listfiles...>]
- UnZipFiles.bat - (stdout)
- UnZipFiles.bat - (stdout) <Commands>
- UnZipFiles.bat - (stdout) a: Add files to archive
- UnZipFiles.bat - (stdout) b: Benchmark
- UnZipFiles.bat - (stdout) d: Delete files from archive
- UnZipFiles.bat - (stdout) e: Extract files from archive (without using directory names)
- UnZipFiles.bat - (stdout) l: List contents of archive
- UnZipFiles.bat - (stdout) t: Test integrity of archive
- UnZipFiles.bat - (stdout) u: Update files to archive
- UnZipFiles.bat - (stdout) x: eXtract files with full paths
- UnZipFiles.bat - (stdout) <Switches>
- UnZipFiles.bat - (stdout) -ai[r[-|0]]{@listfile|!wildcard}: Include archives
- UnZipFiles.bat - (stdout) -ax[r[-|0]]{@listfile|!wildcard}: eXclude archives
- UnZipFiles.bat - (stdout) -bd: Disable percentage indicator
- UnZipFiles.bat - (stdout) -i[r[-|0]]{@listfile|!wildcard}: Include filenames
- UnZipFiles.bat - (stdout) -m{Parameters}: set compression Method
- UnZipFiles.bat - (stdout) -o{Directory}: set Output directory
- UnZipFiles.bat - (stdout) -p{Password}: set Password
- UnZipFiles.bat - (stdout) -r[-|0]: Recurse subdirectories
- UnZipFiles.bat - (stdout) -scs{UTF-8 | WIN | DOS}: set charset for list files
- UnZipFiles.bat - (stdout) -sfx[{name}]: Create SFX archive
- UnZipFiles.bat - (stdout) -si[{name}]: read data from stdin
- UnZipFiles.bat - (stdout) -slt: show technical information for l (List) command
- UnZipFiles.bat - (stdout) -so: write data to stdout
- UnZipFiles.bat - (stdout) -ssc[-]: set sensitive case mode
- UnZipFiles.bat - (stdout) -ssw: compress shared files
- UnZipFiles.bat - (stdout) -t{Type}: Set type of archive
- UnZipFiles.bat - (stdout) -v{Size}: Create volumes
- UnZipFiles.bat - (stdout) -u[-][p#][q#][r#][x#][y#][z#][!newArchiveName]: Update options
- UnZipFiles.bat - (stdout) -w[{path}]: assign Work directory. Empty path means a temporary directory
- UnZipFiles.bat - (stdout) -x[r[-|0]]]{@listfile|!wildcard}: eXclude filenames
- UnZipFiles.bat - (stdout) -y: assume Yes on all queries
- UnZipFiles.bat - Command cmd.exe /C "C:\mydir\UnZipFiles.bat "e DATAFILE_05112011.txt.gz"" has finished
- TEST-UncompressGZipfile - Finished job entry [UnZipFiles.bat] (result=[true])
- TEST-UncompressGZipfile - Finished job entry [Simple evaluation] (result=[true])
- TEST-UncompressGZipfile - Finished job entry [Get NOPS file with SFTP 2] (result=[true])
- TEST-UncompressGZipfile - Job execution finished
- Spoon - Job has ended.

where UnZipFiles.bat is
@echo off
"C:\Program Files\7-Zip\7z.exe" %

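Editor's note: the log above shows 7-Zip printing only its usage text, which is what it does when it receives no arguments. That would be consistent with the lone % in the batch file expanding to nothing (in a batch file, %1 is the first argument and %* is all of them), though this is a reading of the log, not a confirmed diagnosis. The job also passes "e DATAFILE_05112011.txt.gz" as one quoted string, whereas 7z expects the command and the archive name as separate arguments. A minimal Python sketch of that argument layout follows; the function names are hypothetical, and the 7z.exe path and the e, -o and -y options are taken from the post and the usage text in the log.

```python
import subprocess

# Location of 7z.exe as given in the original post.
SEVEN_ZIP = r"C:\Program Files\7-Zip\7z.exe"


def build_extract_command(archive, out_dir, exe=SEVEN_ZIP):
    """Return the argument list for '7z e <archive> -o<out_dir> -y'.

    Each list element becomes exactly one argument, so the command 'e'
    and the archive name are never fused into a single quoted string.
    """
    return [exe, "e", str(archive), f"-o{out_dir}", "-y"]


def extract(archive, out_dir):
    """Run 7-Zip on one archive (hypothetical helper; requires 7-Zip)."""
    return subprocess.run(build_extract_command(archive, out_dir), check=True)
```

Passing a list to subprocess.run avoids the nested-quote problem visible in the cmd.exe /C line of the log, because no shell has to re-split the string.
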
gutlez
05-19-2011, 06:29 PM
Why not read the datafile.txt.gz directly?

file:gz://${WorkingDirectory}/${fname} in the Text File Input..
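
Editor's note: the idea here is to read the compressed file as text without ever writing an extracted copy to disk. The sketch below is only a Python analogy of that idea using the standard gzip module, not what the Text File Input step does internally; read_gz_lines is a hypothetical name and UTF-8 is an assumed encoding.

```python
import gzip


def read_gz_lines(path, encoding="utf-8"):
    """Yield the lines of a gzip-compressed text file, one at a time.

    The file is decompressed on the fly; no extracted copy is written.
    """
    with gzip.open(path, "rt", encoding=encoding) as fh:
        for line in fh:
            yield line.rstrip("\n")
```
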

charusheel
05-20-2011, 11:07 AM
I am not sure about the performance hit of first reading it in as a text file and then moving it to the database.

Basically, I am importing the data into a MySQL staging area, and the MySQL bulk loader will not import data from a .gz file.

kandrews
05-20-2011, 12:28 PM
Hi Charu,

I think what Maria is saying is: don't bother unzipping the file. PDI can read the data in a zipped file just as if it were unzipped, which is what I think you are trying to do.

Also, you can use PDI's Bulk Loader to insert the data from the zipped file into your MySQL db. I'm not sure whether that is what you are doing or whether you are actually using the MySQL bulk load utility.

Thanks

Kent

MattCasters
05-20-2011, 02:41 PM
To give an alternative answer to the original question, you can also execute bat files or shell scripts in general with the ... "Text File Output" step.
One of the options in the step is to pass the output, normally written to a text file, to a script of choice.
In this case I guess you could pass the file-names to the script as input on stdin.
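
Editor's note: the mechanism described here is "write the rows to a script's standard input instead of to a file". A rough Python illustration of that mechanism follows, assuming one file name per line; send_names_on_stdin and ECHO_CHILD are hypothetical names, and the child process is a stand-in for whatever .bat or shell script would really consume the names.

```python
import subprocess
import sys


def send_names_on_stdin(command, names):
    """Start `command`, write one file name per line to its stdin,
    and return whatever the command printed on stdout."""
    payload = "\n".join(names) + "\n"
    result = subprocess.run(
        command, input=payload, capture_output=True, text=True, check=True
    )
    return result.stdout


# Stand-in for the real script: it simply echoes each name it receives.
ECHO_CHILD = [
    sys.executable,
    "-c",
    "import sys\nfor line in sys.stdin:\n    print('got', line.strip())",
]
```
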

charusheel
05-23-2011, 11:11 AM
Matt,
Thanks. I got your suggestion working after banging my head a little ;-) Now I can use SFTP to transfer a zip file, extract it locally, and process it.


Kent,
I tried specifying the .gz (compressed data file) in the bulk loading step, and it complained with 'Data too long for column 1'. If I keep all the settings of the bulk load step the same and change the filename to the extracted one, everything works fine. Are there any settings, such as 'Ignore first line', that cannot be used when pointing to .gz files?
I couldn't find any setting to specify the compression mechanism either. If you have used this approach in the past, please share some more details. I am using Kettle 4.1.


Thanks and regards, Charu

kandrews
05-23-2011, 12:21 PM
Hi Charu,

I think you skipped a step in the process...

You would need a CSV Input or Text File Input step to "read" the .gz file and, once the data is in "memory", direct it through a "hop" to the Bulk Loading step.

Not sure if a Bulk Loading step can consume a compressed file ... might be able to, not sure.

So you need a step that "inputs" your data that is in a compressed format and then "streams" it to the Bulk Loading step via a hop.
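
Editor's note: the pattern described is "one component decompresses and parses the rows, the next one receives them in batches". The Python sketch below only illustrates that pattern; it is not how PDI hops are implemented. gz_row_batches is a hypothetical name, and the comma delimiter, UTF-8 encoding and batch size are assumptions.

```python
import csv
import gzip
from itertools import islice


def gz_row_batches(path, batch_size=1000, delimiter=","):
    """Yield lists of parsed rows from a gzip-compressed delimited file.

    Rows are decompressed and parsed lazily, so a downstream loader can
    consume one batch at a time without an extracted file on disk.
    """
    with gzip.open(path, "rt", newline="", encoding="utf-8") as fh:
        reader = csv.reader(fh, delimiter=delimiter)
        while True:
            batch = list(islice(reader, batch_size))
            if not batch:
                return
            yield batch
```

Each yielded batch could then be handed to whatever insert routine the database driver offers.
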

Hope that helps!

Kent

charusheel
05-23-2011, 12:38 PM
Thanks for the clarification. I will try that approach and compare its performance with loading the extracted file directly into MySQL.

prasada.konatham
06-11-2012, 07:08 AM
Hi All,

My question is this: I have created 6-7 jobs in Pentaho Kettle, and now I have to run them on a schedule.

I have tried the Windows scheduler and Start Run at the job level, but neither satisfies my requirement.

My requirement is that I pass values from the UI (dashboard) and, based on those values, generate a .BAT file in data-integration