Fixed Format text file output
Bruce
10-16-2007, 03:00 PM
One of the requirements of our work is the ability to create fixed
format text files (like the ones we can read in through text file
input). Creating this type of file does not appear to be a strength
of the current text file output: we are not able to list start
positions of fields, whether a value is truncated to the specified
length depends on the field type, etc.
Am I missing the proper way to do this in Kettle?
Thanks,
Bruce Linn
Decision Intelligence, Inc
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "kettle-developers" group.
To post to this group, send email to kettle-developers (AT) googlegroups (DOT) com
To unsubscribe from this group, send email to kettle-developers-unsubscribe (AT) googlegroups (DOT) com
For more options, visit this group at http://groups.google.com/group/kettle-developers?hl=en
-~----------~----~----~----~------~----~------~--~---
Matt Casters
10-16-2007, 03:20 PM
Hi Bruce,
Actually, the default output of the "Text File Output" step is fixed width if
you hit the "Get fields" button. Check the "right pad fields" and you should
be set.
I'm not sure if it makes sense to be able to specify a position for a field
when in fact that position is not really yours to choose anyway.
As for trimming fields: we never do it, except when you have checked
the "right pad fields" option.
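For readers following along, here is a minimal Python sketch of what fixed-width output with right padding amounts to for string fields. It is not Kettle's code: the helper name fit_to_width is made up for illustration, and the two sample values and the length of 20 are taken from the attached transformation. Cutting over-long values is assumed here, based on the remark above that trimming only happens when "right pad fields" is checked.

```python
def fit_to_width(value: str, width: int) -> str:
    """Return `value` cut off or right-padded with spaces to exactly `width` characters."""
    # Slice first (drops anything beyond `width`), then pad short values with spaces.
    return value[:width].ljust(width)


# Sample values and field length as used in the attached transformation.
rows = [
    "The quick brown fox jumped over the lazy dog",
    "A shorter text.",
]
fixed = [fit_to_width(row, 20) for row in rows]

for line in fixed:
    # repr() makes the trailing padding visible.
    print(repr(line))
```

Every output line has the same length, so a reader of the file can locate the field by position alone.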
Sample in attachment.
HTH,
Matt
--
Matt
____________________________________________
Matt Casters
Chief Data Integration - Kettle founder
Pentaho, Open Source Business Intelligence
http://www.pentaho.org -- mcasters (AT) pentaho (DOT) org
Tel. +32 (0) 486 97 29 37
<?xml version="1.0" encoding="UTF-8"?>
<transformation>
<info>
<name>Fixed width output</name>
<description/>
<extended_description/>
<trans_version/>
<filename>/home/matt/test-stuff/Fixed width output.ktr</filename>
<directory>/</directory>
<log>
<read/>
<write/>
<input/>
<output/>
<update/>
<rejected/>
<connection/>
<table/>
<use_batchid>Y</use_batchid>
<use_logfield>N</use_logfield>
</log>
<maxdate>
<connection/>
<table/>
<field/>
<offset>0.0</offset>
<maxdiff>0.0</maxdiff>
</maxdate>
<size_rowset>10000</size_rowset>
<sleep_time_empty>50</sleep_time_empty>
<sleep_time_full>50</sleep_time_full>
<unique_connections>N</unique_connections>
<feedback_shown>Y</feedback_shown>
<feedback_size>50000</feedback_size>
<using_thread_priorities>N</using_thread_priorities>
<shared_objects_file/>
<dependencies>
</dependencies>
<partitionschemas>
</partitionschemas>
<slaveservers>
<slaveserver><name>localhost:8084</name><hostname>localhost</hostname><port>8084</port><username>cluster</username><password>Encrypted 2be98afc86aa7f2e4cb1aa265cd86aac8</password><proxy_hostname/><proxy_port/><non_proxy_hosts/><master>N</master></slaveserver>
<slaveserver><name>localhost:8083</name><hostname>localhost</hostname><port>8083</port><username>cluster</username><password>Encrypted 2be98afc86aa7f2e4cb1aa265cd86aac8</password><proxy_hostname/><proxy_port/><non_proxy_hosts/><master>N</master></slaveserver>
<slaveserver><name>localhost:8082</name><hostname>localhost</hostname><port>8082</port><username>cluster</username><password>Encrypted 2be98afc86aa7f2e4cb1aa265cd86aac8</password><proxy_hostname/><proxy_port/><non_proxy_hosts/><master>N</master></slaveserver>
<slaveserver><name>localhost:8081</name><hostname>localhost</hostname><port>8081</port><username>cluster</username><password>Encrypted 2be98afc86aa7f2e4cb1aa265cd86aac8</password><proxy_hostname/><proxy_port/><non_proxy_hosts/><master>N</master></slaveserver>
<slaveserver><name>localhost:8080:Master</name><hostname>localhost</hostname><port>8080</port><username>cluster</username><password>Encrypted 2be98afc86aa7f2e4cb1aa265cd86aac8</password><proxy_hostname/><proxy_port/><non_proxy_hosts/><master>Y</master></slaveserver>
<slaveserver><name>localhost:8080</name><hostname>localhost</hostname><port>8080</port><username>cluster</username><password>Encrypted 2be98afc86aa7f2e4cb1aa265cd86aac8</password><proxy_hostname/><proxy_port/><non_proxy_hosts/><master>N</master></slaveserver>
</slaveservers>
<clusterschemas>
<clusterschema>
<name>local schema</name>
<base_port>40000</base_port>
<sockets_buffer_size>2000</sockets_buffer_size>
<sockets_flush_interval>5000</sockets_flush_interval>
<sockets_compressed>N</sockets_compressed>
<slaveservers>
<name>localhost:8080:Master</name>
<name>localhost:8081</name>
<name>localhost:8082</name>
</slaveservers>
</clusterschema>
</clusterschemas>
<modified_user>-</modified_user>
<modified_date>2007/10/16 21:10:20.247</modified_date>
</info>
<notepads>
</notepads>
<connection>
<name>PGSQL Localhost test</name>
<server>localhost</server>
<type>POSTGRESQL</type>
<access>Native</access>
<database>test</database>
<port>5432</port>
<username>matt</username>
<password>Encrypted 2be98afc86aa7f2e4cb79ce10df90acde</password>
<servername/>
<data_tablespace/>
<index_tablespace/>
<attributes>
<attribute><code>CLUSTER_DBNAME_0</code><attribute>db1</attribute></attribute>
<attribute><code>CLUSTER_DBNAME_1</code><attribute>db2</attribute></attribute>
<attribute><code>CLUSTER_DBNAME_2</code><attribute>db3</attribute></attribute>
<attribute><code>CLUSTER_DBNAME_3</code><attribute>db4</attribute></attribute>
<attribute><code>CLUSTER_DBNAME_4</code><attribute>db5</attribute></attribute>
<attribute><code>CLUSTER_HOSTNAME_0</code><attribute>192.168.1.10</attribute></attribute>
<attribute><code>CLUSTER_HOSTNAME_1</code><attribute>192.168.1.10</attribute></attribute>
<attribute><code>CLUSTER_HOSTNAME_2</code><attribute>192.168.1.10</attribute></attribute>
<attribute><code>CLUSTER_HOSTNAME_3</code><attribute>192.168.1.10</attribute></attribute>
<attribute><code>CLUSTER_HOSTNAME_4</code><attribute>192.168.1.10</attribute></attribute>
<attribute><code>CLUSTER_PARTITION_0</code><attribute>PartDB1</attribute></attribute>
<attribute><code>CLUSTER_PARTITION_1</code><attribute>PartDB2</attribute></attribute>
<attribute><code>CLUSTER_PARTITION_2</code><attribute>PartDB3</attribute></attribute>
<attribute><code>CLUSTER_PARTITION_3</code><attribute>PartDB4</attribute></attribute>
<attribute><code>CLUSTER_PARTITION_4</code><attribute>PartDB5</attribute></attribute>
<attribute><code>CLUSTER_PASSWORD_0</code><attribute>Encrypted </attribute></attribute>
<attribute><code>CLUSTER_PASSWORD_1</code><attribute>Encrypted </attribute></attribute>
<attribute><code>CLUSTER_PASSWORD_2</code><attribute>Encrypted </attribute></attribute>
<attribute><code>CLUSTER_PASSWORD_3</code><attribute>Encrypted </attribute></attribute>
<attribute><code>CLUSTER_PASSWORD_4</code><attribute>Encrypted </attribute></attribute>
<attribute><code>CLUSTER_PORT_0</code><attribute>3306</attribute></attribute>
<attribute><code>CLUSTER_PORT_1</code><attribute>3306</attribute></attribute>
<attribute><code>CLUSTER_PORT_2</code><attribute>3306</attribute></attribute>
<attribute><code>CLUSTER_PORT_3</code><attribute>3306</attribute></attribute>
<attribute><code>CLUSTER_PORT_4</code><attribute>3306</attribute></attribute>
<attribute><code>CUSTOM_DRIVER_CLASS</code><attribute>com.ibm.u2.jdbc.UniJDBCDriver</attribute></attribute>
<attribute><code>CUSTOM_URL</code><attribute>jdbc:universe://localhost/database</attribute></attribute>
<attribute><code>EXTRA_OPTION_MYSQL.defaultFetchSize</code><attribute>500</attribute></attribute>
<attribute><code>EXTRA_OPTION_MYSQL.rewriteBatchedStatements</code><attribute>false</attribute></attribute>
<attribute><code>EXTRA_OPTION_MYSQL.useCursorFetch</code><attribute>true</attribute></attribute>
<attribute><code>EXTRA_OPTION_MYSQL.zeroDateTimeBehavior</code><attribute>convertToNull</attribute></attribute>
<attribute><code>EXTRA_OPTION_SYBASE.SQLINITSTRING</code><attribute>SET CHAINED OFF</attribute></attribute>
<attribute><code>FORCE_IDENTIFIERS_TO_LOWERCASE</code><attribute>N</attribute></attribute>
<attribute><code>FORCE_IDENTIFIERS_TO_UPPERCASE</code><attribute>N</attribute></attribute>
<attribute><code>INITIAL_POOL_SIZE</code><attribute>5</attribute></attribute>
<attribute><code>IS_CLUSTERED</code><attribute>N</attribute></attribute>
<attribute><code>MAXIMUM_POOL_SIZE</code><attribute>10</attribute></attribute>
<attribute><code>MSSQL_DOUBLE_DECIMAL_SEPARATOR</code><attribute>N</attribute></attribute>
<attribute><code>POOLING_defaultCatalog</code><attribute>catalog</attribute></attribute>
<attribute><code>POOLING_removeAbandoned</code><attribute>true</attribute></attribute>
<attribute><code>POOLING_testOnReturn</code><attribute>false</attribute></attribute>
<attribute><code>PORT_NUMBER</code><attribute>5432</attribute></attribute>
<attribute><code>QUOTE_ALL_FIELDS</code><attribute>N</attribute></attribute>
<attribute><code>STREAM_RESULTS</code><attribute>Y</attribute></attribute>
<attribute><code>USE_POOLING</code><attribute>N</attribute></attribute>
</attributes>
</connection>
<connection>
<name>MySQL localhost test</name>
<server>localhost</server>
<type>MYSQL</type>
<access>Native</access>
<database>test</database>
<port>3306</port>
<username>matt</username>
<password>Encrypted 2be98afc86aa7f2e4cb79ce10df90acde</password>
<servername/>
<data_tablespace/>
<index_tablespace/>
<attributes>
<attribute><code>CLUSTER_DBNAME_0</code><attribute>db1</attribute></attribute>
<attribute><code>CLUSTER_DBNAME_1</code><attribute>db2</attribute></attribute>
<attribute><code>CLUSTER_DBNAME_2</code><attribute>db3</attribute></attribute>
<attribute><code>CLUSTER_DBNAME_3</code><attribute>db4</attribute></attribute>
<attribute><code>CLUSTER_DBNAME_4</code><attribute>db5</attribute></attribute>
<attribute><code>CLUSTER_HOSTNAME_0</code><attribute>192.168.1.10</attribute></attribute>
<attribute><code>CLUSTER_HOSTNAME_1</code><attribute>192.168.1.10</attribute></attribute>
<attribute><code>CLUSTER_HOSTNAME_2</code><attribute>192.168.1.10</attribute></attribute>
<attribute><code>CLUSTER_HOSTNAME_3</code><attribute>192.168.1.10</attribute></attribute>
<attribute><code>CLUSTER_HOSTNAME_4</code><attribute>192.168.1.10</attribute></attribute>
<attribute><code>CLUSTER_PARTITION_0</code><attribute>PartDB1</attribute></attribute>
<attribute><code>CLUSTER_PARTITION_1</code><attribute>PartDB2</attribute></attribute>
<attribute><code>CLUSTER_PARTITION_2</code><attribute>PartDB3</attribute></attribute>
<attribute><code>CLUSTER_PARTITION_3</code><attribute>PartDB4</attribute></attribute>
<attribute><code>CLUSTER_PARTITION_4</code><attribute>PartDB5</attribute></attribute>
<attribute><code>CLUSTER_PASSWORD_0</code><attribute>Encrypted </attribute></attribute>
<attribute><code>CLUSTER_PASSWORD_1</code><attribute>Encrypted </attribute></attribute>
<attribute><code>CLUSTER_PASSWORD_2</code><attribute>Encrypted </attribute></attribute>
<attribute><code>CLUSTER_PASSWORD_3</code><attribute>Encrypted </attribute></attribute>
<attribute><code>CLUSTER_PASSWORD_4</code><attribute>Encrypted </attribute></attribute>
<attribute><code>CLUSTER_PORT_0</code><attribute>3306</attribute></attribute>
<attribute><code>CLUSTER_PORT_1</code><attribute>3306</attribute></attribute>
<attribute><code>CLUSTER_PORT_2</code><attribute>3306</attribute></attribute>
<attribute><code>CLUSTER_PORT_3</code><attribute>3306</attribute></attribute>
<attribute><code>CLUSTER_PORT_4</code><attribute>3306</attribute></attribute>
<attribute><code>CUSTOM_DRIVER_CLASS</code><attribute>com.ibm.u2.jdbc.UniJDBCDriver</attribute></attribute>
<attribute><code>CUSTOM_URL</code><attribute>jdbc:universe://localhost/database</attribute></attribute>
<attribute><code>EXTRA_OPTION_MYSQL.defaultFetchSize</code><attribute>500</attribute></attribute>
<attribute><code>EXTRA_OPTION_MYSQL.rewriteBatchedStatements</code><attribute>false</attribute></attribute>
<attribute><code>EXTRA_OPTION_MYSQL.useCursorFetch</code><attribute>true</attribute></attribute>
<attribute><code>EXTRA_OPTION_MYSQL.zeroDateTimeBehavior</code><attribute>convertToNull</attribute></attribute>
<attribute><code>EXTRA_OPTION_SYBASE.SQLINITSTRING</code><attribute>SET CHAINED OFF</attribute></attribute>
<attribute><code>INITIAL_POOL_SIZE</code><attribute>5</attribute></attribute>
<attribute><code>IS_CLUSTERED</code><attribute>N</attribute></attribute>
<attribute><code>MAXIMUM_POOL_SIZE</code><attribute>10</attribute></attribute>
<attribute><code>MSSQL_DOUBLE_DECIMAL_SEPARATOR</code><attribute>N</attribute></attribute>
<attribute><code>POOLING_defaultCatalog</code><attribute>catalog</attribute></attribute>
<attribute><code>POOLING_removeAbandoned</code><attribute>true</attribute></attribute>
<attribute><code>POOLING_testOnReturn</code><attribute>false</attribute></attribute>
<attribute><code>PORT_NUMBER</code><attribute>3306</attribute></attribute>
<attribute><code>QUOTE_ALL_FIELDS</code><attribute>N</attribute></attribute>
<attribute><code>STREAM_RESULTS</code><attribute>Y</attribute></attribute>
<attribute><code>USE_POOLING</code><attribute>N</attribute></attribute>
</attributes>
</connection>
<order>
<hop> <from>Generate Rows</from><to>Dummy (do nothing)</to><enabled>Y</enabled> </hop>
<hop> <from>Generate Rows 2</from><to>Dummy (do nothing)</to><enabled>Y</enabled> </hop>
<hop> <from>Dummy (do nothing)</from><to>Text file output</to><enabled>Y</enabled> </hop>
</order>
<step>
<name>Generate Rows</name>
<type>RowGenerator</type>
<description/>
<distribute>Y</distribute>
<copies>1</copies>
<partitioning>
<method>none</method>
<schema_name/>
</partitioning>
<fields>
<field>
<name>A</name>
<type>String</type>
<format/>
<currency/>
<decimal/>
<group/>
<nullif>The quick brown fox jumped over the lazy dog</nullif>
<length>20</length>
<precision>-1</precision>
</field>
</fields>
<limit>10</limit>
<cluster_schema/>
<remotesteps> <input> </input> <output> </output> </remotesteps> <GUI>
<xloc>217</xloc>
<yloc>141</yloc>
<draw>Y</draw>
</GUI>
</step>
<step>
<name>Generate Rows 2</name>
<type>RowGenerator</type>
<description/>
<distribute>Y</distribute>
<copies>1</copies>
<partitioning>
<method>none</method>
<schema_name/>
</partitioning>
<fields>
<field>
<name>A</name>
<type>String</type>
<format/>
<currency/>
<decimal/>
<group/>
<nullif>A shorter text.</nullif>
<length>20</length>
<precision>-1</precision>
</field>
</fields>
<limit>10</limit>
<cluster_schema/>
<remotesteps> <input> </input> <output> </output> </remotesteps> <GUI>
<xloc>220</xloc>
<yloc>246</yloc>
<draw>Y</draw>
</GUI>
</step>
<step>
<name>Dummy (do nothing)</name>
<type>Dummy</type>
<description/>
<distribute>Y</distribute>
<copies>1</copies>
<partitioning>
<method>none</method>
<schema_name/>
</partitioning>
<cluster_schema/>
<remotesteps> <input> </input> <output> </output> </remotesteps> <GUI>
<xloc>422</xloc>
<yloc>205</yloc>
<draw>Y</draw>
</GUI>
</step>
<step>
<name>Text file output</name>
<type>TextFileOutput</type>
<description/>
<distribute>Y</distribute>
<copies>1</copies>
<partitioning>
<method>none</method>
<schema_name/>
</partitioning>
<separator/>
<enclosure/>
<enclosure_forced>N</enclosure_forced>
<header>N</header>
<footer>N</footer>
<format>Unix</format>
<compression>None</compression>
<encoding/>
<endedLine/>
<file>
<name>${java.io.tmpdir}/fixed</name>
<is_command>N</is_command>
<extention>txt</extention>
<append>N</append>
<split>N</split>
<haspartno>N</haspartno>
<add_date>N</add_date>
<add_time>N</add_time>
<pad>Y</pad>
<fast_dump>N</fast_dump>
<splitevery>0</splitevery>
</file>
<fields>
<field>
<name>A</name>
<type>String</type>
<format/>
<currency/>
<decimal/>
<group/>
<nullif/>
<length>20</length>
<precision>-1</precision>
</field>
</fields>
<cluster_schema/>
<remotesteps> <input> </input> <output> </output> </remotesteps> <GUI>
<xloc>615</xloc>
<yloc>205</yloc>
<draw>Y</draw>
</GUI>
</step>
<step_error_handling>
</step_error_handling>
<slave-step-copy-partition-distribution>
</slave-step-copy-partition-distribution>
<slave_transformation>N</slave_transformation>
</transformation>
Bruce Linn
10-17-2007, 04:20 PM
The method you provided is essentially what I have been testing with. My
biggest issue in creating a fixed-format file is how numbers are handled
in the write node. Specifically, it appears that for numbers the specified
"length" has no real effect (unlike for strings). If a number is shorter
than the specified length, we can certainly handle this through a format
mask to make it match the length. However, when the number is longer than
the length, text file output writes the entire number to the file, which
makes the "fixed format" invalid.
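The effect can be reproduced outside Kettle. The Python sketch below is only an analogy: it uses Python's zero-padding format specifier in place of a Kettle format mask, and the helper name write_number and the width of 5 are invented for the illustration. Like a mask, the specifier sets a minimum width, so short numbers are padded but long ones pass through at full length.

```python
def write_number(value: int, width: int) -> str:
    """Format `value` zero-padded to at least `width` digits (a minimum, not a maximum)."""
    return format(value, f"0{width}d")


short = write_number(42, 5)        # padded up to the field width
too_long = write_number(1234567, 5)  # wider than the field, written in full

print(repr(short), len(short))
print(repr(too_long), len(too_long))
```

The second value occupies seven characters in a five-character field, so every field after it on that line is shifted by two positions.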
For our purposes, it would be better if the value were truncated
(maintaining the fixed format) and generated an error/warning letting us
know that a value had been truncated. I realize that we could format and
convert the value to a string through JavaScript or some other node, but in
our application we are attempting to simplify the ETL functionality as much
as possible. In this case, we would prefer a smarter file write as opposed
to multiple steps of formatting/conversion.
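As a rough sketch of the behaviour described above (truncate to keep the fixed format, and report that it happened), the following Python function shows the idea. It is not a proposal for the actual Kettle implementation: the function name fixed_field, the warnings list, and the choice to keep the left-most characters of an over-long value are all assumptions made for the illustration.

```python
def fixed_field(name, value, width, warnings):
    """Render `value` as exactly `width` characters, recording a warning on truncation."""
    text = str(value)
    if len(text) > width:
        # Assumption: keep the left-most characters and report what was cut.
        warnings.append(f"{name}: {text!r} truncated to {width} characters")
        text = text[:width]
    # Right-pad short values with spaces so the field always fills its width.
    return text.ljust(width)


warnings = []
record = (
    fixed_field("A", "A shorter text.", 20, warnings)
    + fixed_field("B", 1234567, 5, warnings)
)

print(repr(record))
for message in warnings:
    print("WARNING", message)
```

The record keeps its total length of 25 characters even though the number did not fit, and the caller gets one warning for the truncated field. For numbers, silently dropping digits changes the value, which is why the warning (or an error) matters here.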
I've attached an example where field B is a number and is not being
truncated (or padded since I didn't format).
If this concept makes sense to you, is this something we could work on
together?
Bruce
On 10/16/07, Matt Casters <mcasters (AT) pentaho (DOT) org> wrote:
> <?xml version="1.0" encoding="UTF-8"?>
> <transformation>
> <info>
> <name>Fixed width output</name>
> <description/>
> <extended_description/>
> <trans_version/>
> <filename>/home/matt/test-stuff/Fixed width output.ktr</filename>
> <directory>/</directory>
> <log>
> <read/>
> <write/>
> <input/>
> <output/>
> <update/>
> <rejected/>
> <connection/>
> <table/>
> <use_batchid>Y</use_batchid>
> <use_logfield>N</use_logfield>
> </log>
> <maxdate>
> <connection/>
> <table/>
> <field/>
> <offset>0.0</offset>
> <maxdiff>0.0</maxdiff>
> </maxdate>
> <size_rowset>10000</size_rowset>
> <sleep_time_empty>50</sleep_time_empty>
> <sleep_time_full>50</sleep_time_full>
> <unique_connections>N</unique_connections>
> <feedback_shown>Y</feedback_shown>
> <feedback_size>50000</feedback_size>
> <using_thread_priorities>N</using_thread_priorities>
> <shared_objects_file/>
> <dependencies>
> </dependencies>
> <partitionschemas>
> </partitionschemas>
> <slaveservers>
>
> <slaveserver><name>localhost:8084</name><hostname>localhost</hostname><port>8084</port><username>cluster</username><password>Encrypted
> 2be98afc86aa7f2e4cb1aa265cd86aac8</password><proxy_hostname/><proxy_port/><non_proxy_hosts/><master>N</master></slaveserver>
>
> <slaveserver><name>localhost:8083</name><hostname>localhost</hostname><port>8083</port><username>cluster</username><password>Encrypted
> 2be98afc86aa7f2e4cb1aa265cd86aac8</password><proxy_hostname/><proxy_port/><non_proxy_hosts/><master>N</master></slaveserver>
>
> <slaveserver><name>localhost:8082</name><hostname>localhost</hostname><port>8082</port><username>cluster</username><password>Encrypted
> 2be98afc86aa7f2e4cb1aa265cd86aac8</password><proxy_hostname/><proxy_port/><non_proxy_hosts/><master>N</master></slaveserver>
>
> <slaveserver><name>localhost:8081</name><hostname>localhost</hostname><port>8081</port><username>cluster</username><password>Encrypted
> 2be98afc86aa7f2e4cb1aa265cd86aac8</password><proxy_hostname/><proxy_port/><non_proxy_hosts/><master>N</master></slaveserver>
>
> <slaveserver><name>localhost:8080:Master</name><hostname>localhost</hostname><port>8080</port><username>cluster</username><password>Encrypted
> 2be98afc86aa7f2e4cb1aa265cd86aac8</password><proxy_hostname/><proxy_port/><non_proxy_hosts/><master>Y</master></slaveserver>
>
> <slaveserver><name>localhost:8080</name><hostname>localhost</hostname><port>8080</port><username>cluster</username><password>Encrypted
> 2be98afc86aa7f2e4cb1aa265cd86aac8</password><proxy_hostname/><proxy_port/><non_proxy_hosts/><master>N</master></slaveserver>
> </slaveservers>
> <clusterschemas>
> <clusterschema>
> <name>local schema</name>
> <base_port>40000</base_port>
> <sockets_buffer_size>2000</sockets_buffer_size>
> <sockets_flush_interval>5000</sockets_flush_interval>
> <sockets_compressed>N</sockets_compressed>
> <slaveservers>
> <name>localhost:8080:Master</name>
> <name>localhost:8081</name>
> <name>localhost:8082</name>
> </slaveservers>
> </clusterschema>
> </clusterschemas>
> <modified_user>-</modified_user>
> <modified_date>2007/10/16 21:10:20.247</modified_date>
> </info>
> <notepads>
> </notepads>
> <connection>
> <name>PGSQL Localhost test</name>
> <server>localhost</server>
> <type>POSTGRESQL</type>
> <access>Native</access>
> <database>test</database>
> <port>5432</port>
> <username>matt</username>
> <password>Encrypted 2be98afc86aa7f2e4cb79ce10df90acde</password>
> <servername/>
> <data_tablespace/>
> <index_tablespace/>
> <attributes>
>
> <attribute><code>CLUSTER_DBNAME_0</code><attribute>db1</attribute></attribute>
>
> <attribute><code>CLUSTER_DBNAME_1</code><attribute>db2</attribute></attribute>
>
> <attribute><code>CLUSTER_DBNAME_2</code><attribute>db3</attribute></attribute>
>
> <attribute><code>CLUSTER_DBNAME_3</code><attribute>db4</attribute></attribute>
>
> <attribute><code>CLUSTER_DBNAME_4</code><attribute>db5</attribute></attribute>
> <attribute><code>CLUSTER_HOSTNAME_0</code><attribute>192.168.1.10
> </attribute></attribute>
> <attribute><code>CLUSTER_HOSTNAME_1</code><attribute>192.168.1.10
> </attribute></attribute>
> <attribute><code>CLUSTER_HOSTNAME_2</code><attribute>192.168.1.10
> </attribute></attribute>
> <attribute><code>CLUSTER_HOSTNAME_3</code><attribute>192.168.1.10
> </attribute></attribute>
> <attribute><code>CLUSTER_HOSTNAME_4</code><attribute>192.168.1.10
> </attribute></attribute>
>
> <attribute><code>CLUSTER_PARTITION_0</code><attribute>PartDB1</attribute></attribute>
>
> <attribute><code>CLUSTER_PARTITION_1</code><attribute>PartDB2</attribute></attribute>
>
> <attribute><code>CLUSTER_PARTITION_2</code><attribute>PartDB3</attribute></attribute>
>
> <attribute><code>CLUSTER_PARTITION_3</code><attribute>PartDB4</attribute></attribute>
>
> <attribute><code>CLUSTER_PARTITION_4</code><attribute>PartDB5</attribute></attribute>
> <attribute><code>CLUSTER_PASSWORD_0</code><attribute>Encrypted
> </attribute></attribute>
> <attribute><code>CLUSTER_PASSWORD_1</code><attribute>Encrypted
> </attribute></attribute>
> <attribute><code>CLUSTER_PASSWORD_2</code><attribute>Encrypted
> </attribute></attribute>
> <attribute><code>CLUSTER_PASSWORD_3</code><attribute>Encrypted
> </attribute></attribute>
> <attribute><code>CLUSTER_PASSWORD_4</code><attribute>Encrypted
> </attribute></attribute>
>
> <attribute><code>CLUSTER_PORT_0</code><attribute>3306</attribute></attribute>
>
> <attribute><code>CLUSTER_PORT_1</code><attribute>3306</attribute></attribute>
>
> <attribute><code>CLUSTER_PORT_2</code><attribute>3306</attribute></attribute>
>
> <attribute><code>CLUSTER_PORT_3</code><attribute>3306</attribute></attribute>
>
> <attribute><code>CLUSTER_PORT_4</code><attribute>3306</attribute></attribute>
> <attribute><code>CUSTOM_DRIVER_CLASS</code><attribute>
> com.ibm.u2.jdbc.UniJDBCDriver</attribute></attribute>
>
> <attribute><code>CUSTOM_URL</code><attribute>jdbc:universe://localhost/database</attribute></attribute>
>
> <attribute><code>EXTRA_OPTION_MYSQL.defaultFetchSize</code><attribute>500</attribute></attribute>
>
> <attribute><code>EXTRA_OPTION_MYSQL.rewriteBatchedStatements</code><attribute>false</attribute></attribute>
>
> <attribute><code>EXTRA_OPTION_MYSQL.useCursorFetch</code><attribute>true</attribute></attribute>
>
> <attribute><code>EXTRA_OPTION_MYSQL.zeroDateTimeBehavior</code><attribute>convertToNull</attribute></attribute>
> <attribute><code>EXTRA_OPTION_SYBASE.SQLINITSTRING</code><attribute>SET
> CHAINED OFF</attribute></attribute>
>
> <attribute><code>FORCE_IDENTIFIERS_TO_LOWERCASE</code><attribute>N</attribute></attribute>
>
> <attribute><code>FORCE_IDENTIFIERS_TO_UPPERCASE</code><attribute>N</attribute></attribute>
>
> <attribute><code>INITIAL_POOL_SIZE</code><attribute>5</attribute></attribute>
>
> <attribute><code>IS_CLUSTERED</code><attribute>N</attribute></attribute>
>
> <attribute><code>MAXIMUM_POOL_SIZE</code><attribute>10</attribute></attribute>
>
> <attribute><code>MSSQL_DOUBLE_DECIMAL_SEPARATOR</code><attribute>N</attribute></attribute>
>
> <attribute><code>POOLING_defaultCatalog</code><attribute>catalog</attribute></attribute>
>
> <attribute><code>POOLING_removeAbandoned</code><attribute>true</attribute></attribute>
>
> <attribute><code>POOLING_testOnReturn</code><attribute>false</attribute></attribute>
>
> <attribute><code>PORT_NUMBER</code><attribute>5432</attribute></attribute>
>
> <attribute><code>QUOTE_ALL_FIELDS</code><attribute>N</attribute></attribute>
>
> <attribute><code>STREAM_RESULTS</code><attribute>Y</attribute></attribute>
>
> <attribute><code>USE_POOLING</code><attribute>N</attribute></attribute>
> </attributes>
> </connection>
> <connection>
> <name>MySQL localhost test</name>
> <server>localhost</server>
> <type>MYSQL</type>
> <access>Native</access>
> <database>test</database>
> <port>3306</port>
> <username>matt</username>
> <password>Encrypted 2be98afc86aa7f2e4cb79ce10df90acde</password>
> <servername/>
> <data_tablespace/>
> <index_tablespace/>
> <attributes>
>
> <attribute><code>CLUSTER_DBNAME_0</code><attribute>db1</attribute></attribute>
>
> <attribute><code>CLUSTER_DBNAME_1</code><attribute>db2</attribute></attribute>
>
> <attribute><code>CLUSTER_DBNAME_2</code><attribute>db3</attribute></attribute>
>
> <attribute><code>CLUSTER_DBNAME_3</code><attribute>db4</attribute></attribute>
>
> <attribute><code>CLUSTER_DBNAME_4</code><attribute>db5</attribute></attribute>
> <attribute><code>CLUSTER_HOSTNAME_0</code><attribute>192.168.1.10
> </attribute></attribute>
> <attribute><code>CLUSTER_HOSTNAME_1</code><attribute>192.168.1.10
> </attribute></attribute>
> <attribute><code>CLUSTER_HOSTNAME_2</code><attribute>192.168.1.10
> </attribute></attribute>
> <attribute><code>CLUSTER_HOSTNAME_3</code><attribute>192.168.1.10
> </attribute></attribute>
> <attribute><code>CLUSTER_HOSTNAME_4</code><attribute>192.168.1.10
> </attribute></attribute>
>
> <attribute><code>CLUSTER_PARTITION_0</code><attribute>PartDB1</attribute></attribute>
>
> <attribute><code>CLUSTER_PARTITION_1</code><attribute>PartDB2</attribute></attribute>
>
> <attribute><code>CLUSTER_PARTITION_2</code><attribute>PartDB3</attribute></attribute>
>
> <attribute><code>CLUSTER_PARTITION_3</code><attribute>PartDB4</attribute></attribute>
>
> <attribute><code>CLUSTER_PARTITION_4</code><attribute>PartDB5</attribute></attribute>
> <attribute><code>CLUSTER_PASSWORD_0</code><attribute>Encrypted
> </attribute></attribute>
> <attribute><code>CLUSTER_PASSWORD_1</code><attribute>Encrypted
> </attribute></attribute>
> <attribute><code>CLUSTER_PASSWORD_2</code><attribute>Encrypted
> </attribute></attribute>
> <attribute><code>CLUSTER_PASSWORD_3</code><attribute>Encrypted
> </attribute></attribute>
> <attribute><code>CLUSTER_PASSWORD_4</code><attribute>Encrypted
> </attribute></attribute>
>
> <attribute><code>CLUSTER_PORT_0</code><attribute>3306</attribute></attribute>
>
> <attribute><code>CLUSTER_PORT_1</code><attribute>3306</attribute></attribute>
>
> <attribute><code>CLUSTER_PORT_2</code><attribute>3306</attribute></attribute>
>
> <attribute><code>CLUSTER_PORT_3</code><attribute>3306</attribute></attribute>
>
> <attribute><code>CLUSTER_PORT_4</code><attribute>3306</attribute></attribute>
> <attribute><code>CUSTOM_DRIVER_CLASS</code><attribute>
> com.ibm.u2.jdbc.UniJDBCDriver</attribute></attribute>
>
> <attribute><code>CUSTOM_URL</code><attribute>jdbc:universe://localhost/database</attribute></attribute>
>
> <attribute><code>EXTRA_OPTION_MYSQL.defaultFetchSize</code><attribute>500</attribute></attribute>
>
> <attribute><code>EXTRA_OPTION_MYSQL.rewriteBatchedStatements</code><attribute>false</attribute></attribute>
>
> <attribute><code>EXTRA_OPTION_MYSQL.useCursorFetch</code><attribute>true</attribute></attribute>
>
> <attribute><code>EXTRA_OPTION_MYSQL.zeroDateTimeBehavior</code><attribute>convertToNull</attribute></attribute>
> <attribute><code>EXTRA_OPTION_SYBASE.SQLINITSTRING</code><attribute>SET
> CHAINED OFF</attribute></attribute>
>
> <attribute><code>INITIAL_POOL_SIZE</code><attribute>5</attribute></attribute>
>
> <attribute><code>IS_CLUSTERED</code><attribute>N</attribute></attribute>
>
> <attribute><code>MAXIMUM_POOL_SIZE</code><attribute>10</attribute></attribute>
>
> <attribute><code>MSSQL_DOUBLE_DECIMAL_SEPARATOR</code><attribute>N</attribute></attribute>
>
> <attribute><code>POOLING_defaultCatalog</code><attribute>catalog</attribute></attribute>
>
> <attribute><code>POOLING_removeAbandoned</code><attribute>true</attribute></attribute>
>
> <attribute><code>POOLING_testOnReturn</code><attribute>false</attribute></attribute>
>
> <attribute><code>PORT_NUMBER</code><attribute>3306</attribute></attribute>
>
> <attribute><code>QUOTE_ALL_FIELDS</code><attribute>N</attribute></attribute>
>
> <attribute><code>STREAM_RESULTS</code><attribute>Y</attribute></attribute>
>
> <attribute><code>USE_POOLING</code><attribute>N</attribute></attribute>
> </attributes>
> </connection>
> <order>
> <hop> <from>Generate Rows</from><to>Dummy (do nothing)</to><enabled>Y</enabled> </hop>
> <hop> <from>Generate Rows 2</from><to>Dummy (do nothing)</to><enabled>Y</enabled> </hop>
> <hop> <from>Dummy (do nothing)</from><to>Text file output</to><enabled>Y</enabled> </hop>
> </order>
> <step>
> <name>Generate Rows</name>
> <type>RowGenerator</type>
> <description/>
> <distribute>Y</distribute>
> <copies>1</copies>
> <partitioning>
> <method>none</method>
> <schema_name/>
> </partitioning>
> <fields>
> <field>
> <name>A</name>
> <type>String</type>
> <format/>
> <currency/>
> <decimal/>
> <group/>
> <nullif>The quick brown fox jumped over the lazy dog</nullif>
> <length>20</length>
> <precision>-1</precision>
> </field>
> </fields>
> <limit>10</limit>
> <cluster_schema/>
> <remotesteps> <input> </input> <output> </output>
> </remotesteps> <GUI>
> <xloc>217</xloc>
> <yloc>141</yloc>
> <draw>Y</draw>
> </GUI>
> </step>
>
> <step>
> <name>Generate Rows 2</name>
> <type>RowGenerator</type>
> <description/>
> <distribute>Y</distribute>
> <copies>1</copies>
> <partitioning>
> <method>none</method>
> <schema_name/>
> </partitioning>
> <fields>
> <field>
> <name>A</name>
> <type>String</type>
> <format/>
> <currency/>
> <decimal/>
> <group/>
> <nullif>A shorter text.</nullif>
> <length>20</length>
> <precision>-1</precision>
> </field>
> </fields>
> <limit>10</limit>
> <cluster_schema/>
> <remotesteps> <input> </input> <output> </output>
> </remotesteps> <GUI>
> <xloc>220</xloc>
> <yloc>246</yloc>
> <draw>Y</draw>
> </GUI>
> </step>
>
> <step>
> <name>Dummy (do nothing)</name>
> <type>Dummy</type>
> <description/>
> <distribute>Y</distribute>
> <copies>1</copies>
> <partitioning>
> <method>none</method>
> <schema_name/>
> </partitioning>
> <cluster_schema/>
> <remotesteps> <input> </input> <output> </output>
> </remotesteps> <GUI>
> <xloc>422</xloc>
> <yloc>205</yloc>
> <draw>Y</draw>
> </GUI>
> </step>
>
> <step>
> <name>Text file output</name>
> <type>TextFileOutput</type>
> <description/>
> <distribute>Y</distribute>
> <copies>1</copies>
> <partitioning>
> <method>none</method>
> <schema_name/>
> </partitioning>
> <separator/>
> <enclosure/>
> <enclosure_forced>N</enclosure_forced>
> <header>N</header>
> <footer>N</footer>
> <format>Unix</format>
> <compression>None</compression>
> <encoding/>
> <endedLine/>
> <file>
> <name>${java.io.tmpdir}/fixed</name>
> <is_command>N</is_command>
> <extention>txt</extention>
> <append>N</append>
> <split>N</split>
> <haspartno>N</haspartno>
> <add_date>N</add_date>
> <add_time>N</add_time>
> <pad>Y</pad>
> <fast_dump>N</fast_dump>
> <splitevery>0</splitevery>
> </file>
> <fields>
> <field>
> <name>A</name>
> <type>String</type>
> <format/>
> <currency/>
> <decimal/>
> <group/>
> <nullif/>
> <length>20</length>
> <precision>-1</precision>
> </field>
> </fields>
> <cluster_schema/>
> <remotesteps> <input> </input> <output> </output>
> </remotesteps> <GUI>
> <xloc>615</xloc>
> <yloc>205</yloc>
> <draw>Y</draw>
> </GUI>
> </step>
>
> <step_error_handling>
> </step_error_handling>
> <slave-step-copy-partition-distribution>
> </slave-step-copy-partition-distribution>
> <slave_transformation>N</slave_transformation>
> </transformation>
>
>
Matt Casters
10-17-2007, 04:30 PM
Hi Bruce & all,
What I would be very interested in is knowing what exactly we should do in
case the number is indeed larger than the specified length.
Typically this happens in cases where you put the wrong metadata on the number
field.
I guess you could opt to either:
- print hashes or something ######
- throw an error
- only print the least significant digits
- only print the most significant digits
- leave it as it is.
Personally I would actually opt to just stop the processing and kill the
output process with an error. Truncating the number strings seems like the
worst possible solution, don't you think? The log might or might not be
looked at so you end up with wrong data and a warning. That doesn't sound
right.
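To make the options above concrete, here is a minimal Python sketch of the five overflow policies. It is not Kettle code: `Overflow` and `format_number` are hypothetical names invented for illustration, the sketch only handles integers, and it assumes numbers are right-aligned and padded with spaces.

```python
from enum import Enum


class Overflow(Enum):
    """Hypothetical policies for a number wider than its field."""
    HASHES = "hashes"  # print ###### in place of the value
    ERROR = "error"    # stop processing with an error
    LEAST = "least"    # keep only the least significant digits
    MOST = "most"      # keep only the most significant digits
    AS_IS = "as_is"    # write the full value, breaking the layout


def format_number(value: int, length: int,
                  policy: Overflow = Overflow.ERROR) -> str:
    """Render value in a field of `length` characters (illustrative only)."""
    text = str(value)
    if len(text) <= length:
        # The value fits: right-align it and pad with spaces on the left.
        return text.rjust(length)
    if policy is Overflow.HASHES:
        return "#" * length
    if policy is Overflow.ERROR:
        raise ValueError(f"{text!r} does not fit in {length} characters")
    if policy is Overflow.LEAST:
        return text[-length:]
    if policy is Overflow.MOST:
        return text[:length]
    # AS_IS: the row becomes longer than the declared layout.
    return text
```

With a field length of 5, the value 1234567 silently becomes 34567 or 12345 under the two truncating policies, which is exactly the "wrong data" case; the error policy refuses to write anything instead.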
> If this concept makes sense to you, is this something we could work on
> together?
Absolutely, this is open source, we value everyone's opinion.
All the best,
Matt
On Wednesday 17 October 2007 22:15:48 Bruce Linn wrote:
> The method you provided is essentially what I have been testing with. My
> biggest issue in creating a fixed formatted file is how numbers are handled
> in the write node. Specifically, it appears that for numbers the specified
> "length" has no real impact (unlike how it is applied for strings). If a
> number is smaller than a specified length, we can certainly handle this
> through a format mask to make it match the length. However, in the case
> where the number is larger than the length, text file output will write the
> entire number to the file thereby making the "fixed format" invalid.
>
> For our purposes, it would be better if the value were truncated
> (maintaining the fixed format) and generated an error/warning letting us
> know that a value had been truncated. I realize that we could format and
> convert the value to a string through java script or some other node, but
> in our application we are attempting to simplify the ETL functionality as
> much as possible. In this case, we would prefer a smarter file write as
> opposed to multiple steps of formatting/conversion.
>
> I've attached an example where field B is a number and is not being
> truncated (or padded since I didn't format).
>
> If this concept makes sense to you, is this something we could work on
> together?
>
> Bruce
--
Matt
____________________________________________
Matt Casters
Chief Data Integration - Kettle founder
Pentaho, Open Source Business Intelligence
http://www.pentaho.org -- mcasters (AT) pentaho (DOT) org
Tel. +32 (0) 486 97 29 37
Bruce Linn
10-23-2007, 03:50 PM
I agree...stopping the process is the best of the alternatives. Better to
force user interaction to correct than to make assumptions for them.
On 10/17/07, Matt Casters <mcasters (AT) pentaho (DOT) org> wrote:
>
> Hi Bruce & all,
>
> What I would be very interested in is knowing what exactly we should do in
> case the number is indeed larger than the specified length.
> Typically this happens in cases where you put the wrong metadata on the
> number field.
>
> I guess you could opt to either:
> - print hashes or something ######
> - throw an error
> - only print the least significant digits
> - only print the most significant digits
> - leave it as it is.
>
> Personally I would actually opt to just stop the processing and kill the
> output process with an error. Truncating the number strings seems like the
> worst possible solution, don't you think? The log might or might not be
> looked at so you end up with wrong data and a warning. That doesn't sound
> right.
>
> > If this concept makes sense to you, is this something we could work on
> > together?
>
> Absolutely, this is open source, we value everyone's opinion.
>
> All the best,
>
> Matt
>
> On Wednesday 17 October 2007 22:15:48 Bruce Linn wrote:
> > The method you provided is essentially what I have been testing with. My
> > biggest issue in creating a fixed formatted file is how numbers are handled
> > in the write node. Specifically, it appears that for numbers the specified
> > "length" has no real impact (unlike how it is applied for strings). If a
> > number is smaller than a specified length, we can certainly handle this
> > through a format mask to make it match the length. However, in the case
> > where the number is larger than the length, text file output will write the
> > entire number to the file thereby making the "fixed format" invalid.
> >
> > For our purposes, it would be better if the value were truncated
> > (maintaining the fixed format) and generated an error/warning letting us
> > know that a value had been truncated. I realize that we could format and
> > convert the value to a string through java script or some other node, but
> > in our application we are attempting to simplify the ETL functionality as
> > much as possible. In this case, we would prefer a smarter file write as
> > opposed to multiple steps of formatting/conversion.
> >
> > I've attached an example where field B is a number and is not being
> > truncated (or padded since I didn't format).
> >
> > If this concept makes sense to you, is this something we could work on
> > together?
> >
> > Bruce
> >
> > On 10/16/07, Matt Casters <mcasters (AT) pentaho (DOT) org> wrote:
> > > Hi Bruce,
> > >
> > > Actually, the default output of the "Text File Output" step is fixed
> > > width if you hit the "Get fields" button. Check the "right pad fields"
> > > and you should be set.
> > > I'm not sure if it makes sense to be able to specify a position for a
> > > field when in fact that position is not really yours to choose anyway.
> > >
> > > As for trimming fields.. Actually we don't ever do it except in the case
> > > where you checked the "right pad fields" option.
> > >
> > > Sample in attachment.
> > >
> > > HTH,
> > >
> > > Matt
> > >
> > > On Tuesday 16 October 2007 20:55:23 Bruce wrote:
> > > > One of the requirements of our work is the ability to created fixed
> > > > format text files (like the ones we can read in through text file
> > > > input :) Creating this type of file does not appear to be a strength
> > > > of the current text file output as we are not able to list start
> > > > positions of fields, truncation is not always applied based on length
> > > > depending on the field type, etc.
> > > >
> > > > Am I missing the proper way to do this in Kettle?
> > > >
> > > > Thanks,
> > > >
> > > > Bruce Linn
> > > > Decision Intelligence, Inc
> > >
> > > --
> > > Matt
> > > ____________________________________________
> > > Matt Casters
> > > Chief Data Integration - Kettle founder
> > > Pentaho, Open Source Business Intelligence
> > > http://www.pentaho.org -- mcasters (AT) pentaho (DOT) org
> > > Tel. +32 (0) 486 97 29 37
> > >
> > >
> > >
> > > <?xml version="1.0" encoding="UTF-8"?>
> > > <transformation>
> > > <info>
> > > <name>Fixed width output</name>
> > > <description/>
> > > <extended_description/>
> > > <trans_version/>
> > > <filename>/home/matt/test-stuff/Fixed width output.ktr
> > > </filename>
> > > <directory>/</directory>
> > > <log>
> > > <read/>
> > > <write/>
> > > <input/>
> > > <output/>
> > > <update/>
> > > <rejected/>
> > > <connection/>
> > > <table/>
> > > <use_batchid>Y</use_batchid>
> > > <use_logfield>N</use_logfield>
> > > </log>
> > > <maxdate>
> > > <connection/>
> > > <table/>
> > > <field/>
> > > <offset>0.0</offset>
> > > <maxdiff>0.0</maxdiff>
> > > </maxdate>
> > > <size_rowset>10000</size_rowset>
> > > <sleep_time_empty>50</sleep_time_empty>
> > > <sleep_time_full>50</sleep_time_full>
> > > <unique_connections>N</unique_connections>
> > > <feedback_shown>Y</feedback_shown>
> > > <feedback_size>50000</feedback_size>
> > > <using_thread_priorities>N</using_thread_priorities>
> > > <shared_objects_file/>
> > > <dependencies>
> > > </dependencies>
> > > <partitionschemas>
> > > </partitionschemas>
> > > <slaveservers>
> > > <slaveserver><name>localhost:8084</name><hostname>localhost</hostname><port>8084</port><username>cluster</username><password>Encrypted 2be98afc86aa7f2e4cb1aa265cd86aac8</password><proxy_hostname/><proxy_port/><non_proxy_hosts/><master>N</master></slaveserver>
> > > <slaveserver><name>localhost:8083</name><hostname>localhost</hostname><port>8083</port><username>cluster</username><password>Encrypted 2be98afc86aa7f2e4cb1aa265cd86aac8</password><proxy_hostname/><proxy_port/><non_proxy_hosts/><master>N</master></slaveserver>
> > > <slaveserver><name>localhost:8082</name><hostname>localhost</hostname><port>8082</port><username>cluster</username><password>Encrypted 2be98afc86aa7f2e4cb1aa265cd86aac8</password><proxy_hostname/><proxy_port/><non_proxy_hosts/><master>N</master></slaveserver>
> > > <slaveserver><name>localhost:8081</name><hostname>localhost</hostname><port>8081</port><username>cluster</username><password>Encrypted 2be98afc86aa7f2e4cb1aa265cd86aac8</password><proxy_hostname/><proxy_port/><non_proxy_hosts/><master>N</master></slaveserver>
> > > <slaveserver><name>localhost:8080:Master</name><hostname>localhost</hostname><port>8080</port><username>cluster</username><password>Encrypted 2be98afc86aa7f2e4cb1aa265cd86aac8</password><proxy_hostname/><proxy_port/><non_proxy_hosts/><master>Y</master></slaveserver>
> > > <slaveserver><name>localhost:8080</name><hostname>localhost</hostname><port>8080</port><username>cluster</username><password>Encrypted 2be98afc86aa7f2e4cb1aa265cd86aac8</password><proxy_hostname/><proxy_port/><non_proxy_hosts/><master>N</master></slaveserver>
> > > </slaveservers>
> > > <clusterschemas>
> > > <clusterschema>
> > > <name>local schema</name>
> > > <base_port>40000</base_port>
> > > <sockets_buffer_size>2000</sockets_buffer_size>
> > > <sockets_flush_interval>5000</sockets_flush_interval>
> > > <sockets_compressed>N</sockets_compressed>
> > > <slaveservers>
> > > <name>localhost:8080:Master</name>
> > > <name>localhost:8081</name>
> > > <name>localhost:8082</name>
> > > </slaveservers>
> > > </clusterschema>
> > > </clusterschemas>
> > > <modified_user>-</modified_user>
> > > <modified_date>2007/10/16 21:10:20.247</modified_date>
> > > </info>
> > > <notepads>
> > > </notepads>
> > > <connection>
> > > <name>PGSQL Localhost test</name>
> > > <server>localhost</server>
> > > <type>POSTGRESQL</type>
> > > <access>Native</access>
> > > <database>test</database>
> > > <port>5432</port>
> > > <username>matt</username>
> > > <password>Encrypted 2be98afc86aa7f2e4cb79ce10df90acde</password>
> > > <servername/>
> > > <data_tablespace/>
> > > <index_tablespace/>
> > > <attributes>
> > > <attribute><code>CLUSTER_DBNAME_0</code><attribute>db1</attribute></attribute>
> > > <attribute><code>CLUSTER_DBNAME_1</code><attribute>db2</attribute></attribute>
> > > <attribute><code>CLUSTER_DBNAME_2</code><attribute>db3</attribute></attribute>
> > > <attribute><code>CLUSTER_DBNAME_3</code><attribute>db4</attribute></attribute>
> > > <attribute><code>CLUSTER_DBNAME_4</code><attribute>db5</attribute></attribute>
> > > <attribute><code>CLUSTER_HOSTNAME_0</code><attribute>192.168.1.10</attribute></attribute>
> > > <attribute><code>CLUSTER_HOSTNAME_1</code><attribute>192.168.1.10</attribute></attribute>
> > > <attribute><code>CLUSTER_HOSTNAME_2</code><attribute>192.168.1.10</attribute></attribute>
> > > <attribute><code>CLUSTER_HOSTNAME_3</code><attribute>192.168.1.10</attribute></attribute>
> > > <attribute><code>CLUSTER_HOSTNAME_4</code><attribute>192.168.1.10</attribute></attribute>
> > > <attribute><code>CLUSTER_PARTITION_0</code><attribute>PartDB1</attribute></attribute>
> > > <attribute><code>CLUSTER_PARTITION_1</code><attribute>PartDB2</attribute></attribute>
> > > <attribute><code>CLUSTER_PARTITION_2</code><attribute>PartDB3</attribute></attribute>
> > > <attribute><code>CLUSTER_PARTITION_3</code><attribute>PartDB4</attribute></attribute>
> > > <attribute><code>CLUSTER_PARTITION_4</code><attribute>PartDB5</attribute></attribute>
> > > <attribute><code>CLUSTER_PASSWORD_0</code><attribute>Encrypted </attribute></attribute>
> > > <attribute><code>CLUSTER_PASSWORD_1</code><attribute>Encrypted </attribute></attribute>
> > > <attribute><code>CLUSTER_PASSWORD_2</code><attribute>Encrypted </attribute></attribute>
> > > <attribute><code>CLUSTER_PASSWORD_3</code><attribute>Encrypted </attribute></attribute>
> > > <attribute><code>CLUSTER_PASSWORD_4</code><attribute>Encrypted </attribute></attribute>
> > > <attribute><code>CLUSTER_PORT_0</code><attribute>3306</attribute></attribute>
> > > <attribute><code>CLUSTER_PORT_1</code><attribute>3306</attribute></attribute>
> > > <attribute><code>CLUSTER_PORT_2</code><attribute>3306</attribute></attribute>
> > > <attribute><code>CLUSTER_PORT_3</code><attribute>3306</attribute></attribute>
> > > <attribute><code>CLUSTER_PORT_4</code><attribute>3306</attribute></attribute>
> > > <attribute><code>CUSTOM_DRIVER_CLASS</code><attribute>com.ibm.u2.jdbc.UniJDBCDriver</attribute></attribute>
> > > <attribute><code>CUSTOM_URL</code><attribute>jdbc:universe://localhost/database</attribute></attribute>
> > > <attribute><code>EXTRA_OPTION_MYSQL.defaultFetchSize</code><attribute>500</attribute></attribute>
> > > <attribute><code>EXTRA_OPTION_MYSQL.rewriteBatchedStatements</code><attribute>false</attribute></attribute>
> > > <attribute><code>EXTRA_OPTION_MYSQL.useCursorFetch</code><attribute>true</attribute></attribute>
> > > <attribute><code>EXTRA_OPTION_MYSQL.zeroDateTimeBehavior</code><attribute>convertToNull</attribute></attribute>
> > > <attribute><code>EXTRA_OPTION_SYBASE.SQLINITSTRING</code><attribute>SET CHAINED OFF</attribute></attribute>
> > > <attribute><code>FORCE_IDENTIFIERS_TO_LOWERCASE</code><attribute>N</attribute></attribute>
> > > <attribute><code>FORCE_IDENTIFIERS_TO_UPPERCASE</code><attribute>N</attribute></attribute>
> > > <attribute><code>INITIAL_POOL_SIZE</code><attribute>5</attribute></attribute>
> > > <attribute><code>IS_CLUSTERED</code><attribute>N</attribute></attribute>
> > > <attribute><code>MAXIMUM_POOL_SIZE</code><attribute>10</attribute></attribute>
> > > <attribute><code>MSSQL_DOUBLE_DECIMAL_SEPARATOR</code><attribute>N</attribute></attribute>
> > > <attribute><code>POOLING_defaultCatalog</code><attribute>catalog</attribute></attribute>
> > > <attribute><code>POOLING_removeAbandoned</code><attribute>true</attribute></attribute>
> > > <attribute><code>POOLING_testOnReturn</code><attribute>false</attribute></attribute>
> > > <attribute><code>PORT_NUMBER</code><attribute>5432</attribute></attribute>
> > > <attribute><code>QUOTE_ALL_FIELDS</code><attribute>N</attribute></attribute>
> > > <attribute><code>STREAM_RESULTS</code><attribute>Y</attribute></attribute>
> > > <attribute><code>USE_POOLING</code><attribute>N</attribute></attribute>
> > > </attributes>
> > > </connection>
> > > <connection>
> > > <name>MySQL localhost test</name>
> > > <server>localhost</server>
> > > <type>MYSQL</type>
> > > <access>Native</access>
> > > <database>test</database>
> > > <port>3306</port>
> > > <username>matt</username>
> > > <password>Encrypted 2be98afc86aa7f2e4cb79ce10df90acde</password>
> > > <servername/>
> > > <data_tablespace/>
> > > <index_tablespace/>
> > > <attributes>
> > > <attribute><code>CLUSTER_DBNAME_0</code><attribute>db1</attribute></attribute>
> > > <attribute><code>CLUSTER_DBNAME_1</code><attribute>db2</attribute></attribute>
> > > <attribute><code>CLUSTER_DBNAME_2</code><attribute>db3</attribute></attribute>
> > > <attribute><code>CLUSTER_DBNAME_3</code><attribute>db4</attribute></attribute>
> > > <attribute><code>CLUSTER_DBNAME_4</code><attribute>db5</attribute></attribute>
> > > <attribute><code>CLUSTER_HOSTNAME_0</code><attribute>192.168.1.10</attribute></attribute>
> > > <attribute><code>CLUSTER_HOSTNAME_1</code><attribute>192.168.1.10</attribute></attribute>
> > > <attribute><code>CLUSTER_HOSTNAME_2</code><attribute>192.168.1.10</attribute></attribute>
> > > <attribute><code>CLUSTER_HOSTNAME_3</code><attribute>192.168.1.10</attribute></attribute>
> > > <attribute><code>CLUSTER_HOSTNAME_4</code><attribute>192.168.1.10</attribute></attribute>
> > > <attribute><code>CLUSTER_PARTITION_0</code><attribute>PartDB1</attribute></attribute>
> > > <attribute><code>CLUSTER_PARTITION_1</code><attribute>PartDB2</attribute></attribute>
> > > <attribute><code>CLUSTER_PARTITION_2</code><attribute>PartDB3</attribute></attribute>
> > > <attribute><code>CLUSTER_PARTITION_3</code><attribute>PartDB4</attribute></attribute>
> > > <attribute><code>CLUSTER_PARTITION_4</code><attribute>PartDB5</attribute></attribute>
> > > <attribute><code>CLUSTER_PASSWORD_0</code><attribute>Encrypted </attribute></attribute>
> > > <attribute><code>CLUSTER_PASSWORD_1</code><attribute>Encrypted </attribute></attribute>
> > > <attribute><code>CLUSTER_PASSWORD_2</code><attribute>Encrypted </attribute></attribute>
> > > <attribute><code>CLUSTER_PASSWORD_3</code><attribute>Encrypted </attribute></attribute>
> > > <attribute><code>CLUSTER_PASSWORD_4</code><attribute>Encrypted </attribute></attribute>
> > > <attribute><code>CLUSTER_PORT_0</code><attribute>3306</attribute></attribute>
> > > <attribute><code>CLUSTER_PORT_1</code><attribute>3306</attribute></attribute>
> > > <attribute><code>CLUSTER_PORT_2</code><attribute>3306</attribute></attribute>
> > > <attribute><code>CLUSTER_PORT_3</code><attribute>3306</attribute></attribute>
> > > <attribute><code>CLUSTER_PORT_4</code><attribute>3306</attribute></attribute>
> > > <attribute><code>CUSTOM_DRIVER_CLASS</code><attribute>com.ibm.u2.jdbc.UniJDBCDriver</attribute></attribute>
> > > <attribute><code>CUSTOM_URL</code><attribute>jdbc:universe://localhost/database</attribute></attribute>
> > > <attribute><code>EXTRA_OPTION_MYSQL.defaultFetchSize</code><attribute>500</attribute></attribute>
> > > <attribute><code>EXTRA_OPTION_MYSQL.rewriteBatchedStatements</code><attribute>false</attribute></attribute>
> > > <attribute><code>EXTRA_OPTION_MYSQL.useCursorFetch</code><attribute>true</attribute></attribute>
> > > <attribute><code>EXTRA_OPTION_MYSQL.zeroDateTimeBehavior</code><attribute>convertToNull</attribute></attribute>
> > > <attribute><code>EXTRA_OPTION_SYBASE.SQLINITSTRING</code><attribute>SET CHAINED OFF</attribute></attribute>
> > > <attribute><code>INITIAL_POOL_SIZE</code><attribute>5</attribute></attribute>
> > > <attribute><code>IS_CLUSTERED</code><attribute>N</attribute></attribute>
> > > <attribute><code>MAXIMUM_POOL_SIZE</code><attribute>10</attribute></attribute>
> > > <attribute><code>MSSQL_DOUBLE_DECIMAL_SEPARATOR</code><attribute>N</attribute></attribute>
> > > <attribute><code>POOLING_defaultCatalog</code><attribute>catalog</attribute></attribute>
> > > <attribute><code>POOLING_removeAbandoned</code><attribute>true</attribute></attribute>
> > > <attribute><code>POOLING_testOnReturn</code><attribute>false</attribute></attribute>
> > > <attribute><code>PORT_NUMBER</code><attribute>3306</attribute></attribute>
> > > <attribute><code>QUOTE_ALL_FIELDS</code><attribute>N</attribute></attribute>
> > > <attribute><code>STREAM_RESULTS</code><attribute>Y</attribute></attribute>
> > > <attribute><code>USE_POOLING</code><attribute>N</attribute></attribute>
> > > </attributes>
> > > </connection>
> > > <order>
> > > <hop> <from>Generate Rows</from><to>Dummy (do nothing)</to><enabled>Y</enabled> </hop>
> > > <hop> <from>Generate Rows 2</from><to>Dummy (do nothing)</to><enabled>Y</enabled> </hop>
> > > <hop> <from>Dummy (do nothing)</from><to>Text file output</to><enabled>Y</enabled> </hop>
> > > </order>
> > > <step>
> > > <name>Generate Rows</name>
> > > <type>RowGenerator</type>
> > > <description/>
> > > <distribute>Y</distribute>
> > > <copies>1</copies>
> > > <partitioning>
> > > <method>none</method>
> > > <schema_name/>
> > > </partitioning>
> > > <fields>
> > > <field>
> > > <name>A</name>
> > > <type>String</type>
> > > <format/>
> > > <currency/>
> > > <decimal/>
> > > <group/>
> > > <nullif>The quick brown fox jumped over the lazy dog</nullif>
> > > <length>20</length>
> > > <precision>-1</precision>
> > > </field>
> > > </fields>
> > > <limit>10</limit>
> > > <cluster_schema/>
> > > <remotesteps> <input> </input> <output> </output> </remotesteps>
> > > <GUI>
> > > <xloc>217</xloc>
> > > <yloc>141</yloc>
> > > <draw>Y</draw>
> > > </GUI>
> > > </step>
> > >
> > > <step>
> > > <name>Generate Rows 2</name>
> > > <type>RowGenerator</type>
> > > <description/>
> > > <distribute>Y</distribute>
> > > <copies>1</copies>
> > > <partitioning>
> > > <method>none</method>
> > > <schema_name/>
> > > </partitioning>
> > > <fields>
> > > <field>
> > > <name>A</name>
> > > <type>String</type>
> > > <format/>
> > > <currency/>
> > > <decimal/>
> > > <group/>
> > > <nullif>A shorter text.</nullif>
> > > <length>20</length>
> > > <precision>-1</precision>
> > > </field>
> > > </fields>
> > > <limit>10</limit>
> > > <cluster_schema/>
> > > <remotesteps> <input> </input> <output> </output> </remotesteps>
> > > <GUI>
> > > <xloc>220</xloc>
> > > <yloc>246</yloc>
> > > <draw>Y</draw>
> > > </GUI>
> > > </step>
> > >
> > > <step>
> > > <name>Dummy (do nothing)</name>
> > > <type>Dummy</type>
> > > <description/>
> > > <distribute>Y</distribute>
> > > <copies>1</copies>
> > > <partitioning>
> > > <method>none</method>
> > > <schema_name/>
> > > </partitioning>
> > > <cluster_schema/>
> > > <remotesteps> <input> </input> <output> </output> </remotesteps>
> > > <GUI>
> > > <xloc>422</xloc>
> > > <yloc>205</yloc>
> > > <draw>Y</draw>
> > > </GUI>
> > > </step>
> > >
> > > <step>
> > > <name>Text file output</name>
> > > <type>TextFileOutput</type>
> > > <description/>
> > > <distribute>Y</distribute>
> > > <copies>1</copies>
> > > <partitioning>
> > > <method>none</method>
> > > <schema_name/>
> > > </partitioning>
> > > <separator/>
> > > <enclosure/>
> > > <enclosure_forced>N</enclosure_forced>
> > > <header>N</header>
> > > <footer>N</footer>
> > > <format>Unix</format>
> > > <compression>None</compression>
> > > <encoding/>
> > > <endedLine/>
> > > <file>
> > > <name>${java.io.tmpdir}/fixed</name>
> > > <is_command>N</is_command>
> > > <extention>txt</extention>
> > > <append>N</append>
> > > <split>N</split>
> > > <haspartno>N</haspartno>
> > > <add_date>N</add_date>
> > > <add_time>N</add_time>
> > > <pad>Y</pad>
> > > <fast_dump>N</fast_dump>
> > > <splitevery>0</splitevery>
> > > </file>
> > > <fields>
> > > <field>
> > > <name>A</name>
> > > <type>String</type>
> > > <format/>
> > > <currency/>
> > > <decimal/>
> > > <group/>
> > > <nullif/>
> > > <length>20</length>
> > > <precision>-1</precision>
> > > </field>
> > > </fields>
> > > <cluster_schema/>
> > > <remotesteps> <input> </input> <output> </output> </remotesteps>
> > > <GUI>
> > > <xloc>615</xloc>
> > > <yloc>205</yloc>
> > > <draw>Y</draw>
> > > </GUI>
> > > </step>
> > >
> > > <step_error_handling>
> > > </step_error_handling>
> > > <slave-step-copy-partition-distribution>
> > > </slave-step-copy-partition-distribution>
> > > <slave_transformation>N</slave_transformation>
> > > </transformation>
> >
> >
>
>
> --
> Matt
> ____________________________________________
> Matt Casters
> Chief Data Integration - Kettle founder
> Pentaho, Open Source Business Intelligence
> http://www.pentaho.org -- mcasters (AT) pentaho (DOT) org
> Tel. +32 (0) 486 97 29 37
>
> >
>