Hitachi Vantara Pentaho Community Forums
Thread: Filter rows

  1. #1
    Join Date
    Jun 2007
    Posts
    128

    Default Filter rows

    I am trying to use the Filter rows step to retrieve only the records in which a specified field starts with, say, 11. However, in the result the field is split into parts, so each record contains more fields than the input.
    What is the problem here?
    I want to retrieve the entire record if the first field in the record starts with 11.


    Thanks
    Sreelatha

  2. #2
    Join Date
    May 2006
    Posts
    4,882

    Default

    If that's happening, it shouldn't be. Do you have a small example transformation showing your problem (preferably one that does not use databases as input/output)?

    Regards,
    Sven

  3. #3
    Join Date
    Mar 2007
    Posts
    216

    Default Seems to be a regular expression

    Hi,

    Please tell us which PDI / Kettle version you are using.
    You can use v2.5.1 with Regex Evaluation.
    See the attached samples.

    a+, =)
    -=Clement=-

    Pentaho Data Integration Spoon v2.5.1
    Windows XP Pro SP2

  4. #4
    Join Date
    Jun 2007
    Posts
    128

    Default

    Hi,

    I am using CSV files for input and output, with Kettle-2.5.0.


    My input is:
    anumber,bnumber
    116790,332345
    124545,6878978
    1164565,984955
    117854,445454
    11948394,334345
    13454545,321243
    1145456,787898

    Result is:
    anumber,bnumber
    116,790,332,345
    1,164,565,984,955
    117,854,445,454
    11,948,394,334,345
    1,145,456,787,898

    Transformation is:
    <?xml version="1.0" encoding="UTF-8"?>
    <transformation>
    <info>
    <name>filter</name>
    <description/>
    <extended_description/>
    <trans_version/>
    <trans_status>0</trans_status>
    <directory>/</directory>
    <log>
    <read/>
    <write/>
    <input/>
    <output/>
    <update/>
    <rejected/>
    <connection/>
    <table/>
    <use_batchid>Y</use_batchid>
    <use_logfield>N</use_logfield>
    </log>
    <maxdate>
    <connection/>
    <table/>
    <field/>
    <offset>0.0</offset>
    <maxdiff>0.0</maxdiff>
    </maxdate>
    <size_rowset>1000</size_rowset>
    <sleep_time_empty>1</sleep_time_empty>
    <sleep_time_full>1</sleep_time_full>
    <unique_connections>N</unique_connections>
    <feedback_shown>Y</feedback_shown>
    <feedback_size>5000</feedback_size>
    <using_thread_priorities>N</using_thread_priorities>
    <shared_objects_file/>
    <dependencies>
    </dependencies>
    <partitionschemas>
    <partitionschema>
    <name>part1</name>
    </partitionschema>
    </partitionschemas>
    <slaveservers>
    </slaveservers>
    <clusterschemas>
    </clusterschemas>
    <modified_user>admin</modified_user>
    <modified_date>2007/09/13 16:55:30.174</modified_date>
    </info>
    <notepads>
    </notepads>
    <connection>
    <name>con1</name>
    <server>localhost</server>
    <type>MYSQL</type>
    <access>Native</access>
    <database>sampledata</database>
    <port>3306</port>
    <username>pentaho_user</username>
    <password>Encrypted </password>
    <servername/>
    <data_tablespace/>
    <index_tablespace/>
    <attributes>
    <attribute><code>EXTRA_OPTION_MYSQL.defaultFetchSize</code><attribute>500</attribute></attribute>
    <attribute><code>EXTRA_OPTION_MYSQL.useCursorFetch</code><attribute>true</attribute></attribute>
    <attribute><code>IS_CLUSTERED</code><attribute>N</attribute></attribute>
    <attribute><code>MAXIMUM_POOL_SIZE</code><attribute>10</attribute></attribute>
    <attribute><code>PORT_NUMBER</code><attribute>3306</attribute></attribute>
    <attribute><code>STREAM_RESULTS</code><attribute>Y</attribute></attribute>
    <attribute><code>USE_POOLING</code><attribute>N</attribute></attribute>
    </attributes>
    </connection>
    <order>
    <hop> <from>Text file input</from><to>Filter rows</to><enabled>Y</enabled> </hop> <hop> <from>Filter rows</from><to>Text file output</to><enabled>Y</enabled> </hop> </order>
    <step>
    <name>Text file input</name>
    <type>TextFileInput</type>
    <description/>
    <distribute>Y</distribute>
    <copies>1</copies>
    <partitioning>
    <method>none</method>
    <field_name/>
    <schema_name/>
    </partitioning>
    <accept_filenames>N</accept_filenames>
    <accept_field/>
    <accept_stepname/>
    <separator>,</separator>
    <enclosure/>
    <enclosure_breaks>N</enclosure_breaks>
    <escapechar/>
    <header>Y</header>
    <nr_headerlines>1</nr_headerlines>
    <footer>N</footer>
    <nr_footerlines>1</nr_footerlines>
    <line_wrapped>N</line_wrapped>
    <nr_wraps>1</nr_wraps>
    <layout_paged>N</layout_paged>
    <nr_lines_per_page>80</nr_lines_per_page>
    <nr_lines_doc_header>0</nr_lines_doc_header>
    <noempty>Y</noempty>
    <include>N</include>
    <include_field/>
    <rownum>N</rownum>
    <rownumByFile>N</rownumByFile>
    <rownum_field/>
    <format>Unix</format>
    <encoding/>
    <file>
    <name>/home/srilatha/Kettle-2.5.0/input.csv</name>
    <filemask/>
    <file_required/>
    <type>CSV</type>
    <compression>None</compression>
    </file>
    <filters>
    </filters>
    <fields>
    <field>
    <name>anumber</name>
    <type>Integer</type>
    <format/>
    <currency>$</currency>
    <decimal>.</decimal>
    <group>,</group>
    <nullif>-</nullif>
    <ifnull/>
    <position>-1</position>
    <length>8</length>
    <precision>0</precision>
    <trim_type>none</trim_type>
    <repeat>N</repeat>
    </field>
    <field>
    <name>bnumber</name>
    <type>Integer</type>
    <format/>
    <currency>$</currency>
    <decimal>.</decimal>
    <group>,</group>
    <nullif>-</nullif>
    <ifnull/>
    <position>-1</position>
    <length>7</length>
    <precision>0</precision>
    <trim_type>none</trim_type>
    <repeat>N</repeat>
    </field>
    </fields>
    <limit>0</limit>
    <error_ignored>N</error_ignored>
    <error_line_skipped>N</error_line_skipped>
    <error_count_field/>
    <error_fields_field/>
    <error_text_field/>
    <bad_line_files_destination_directory/>
    <bad_line_files_extension>warning</bad_line_files_extension>
    <error_line_files_destination_directory/>
    <error_line_files_extension>error</error_line_files_extension>
    <line_number_files_destination_directory/>
    <line_number_files_extension>line</line_number_files_extension>
    <date_format_lenient>Y</date_format_lenient>
    <date_format_locale>en_us</date_format_locale>
    <cluster_schema/>
    <GUI>
    <xloc>89</xloc>
    <yloc>68</yloc>
    <draw>Y</draw>
    </GUI>
    </step>

    <step>
    <name>Filter rows</name>
    <type>FilterRows</type>
    <description/>
    <distribute>Y</distribute>
    <copies>1</copies>
    <partitioning>
    <method>none</method>
    <field_name/>
    <schema_name/>
    </partitioning>
    <send_true_to/>
    <send_false_to/>
    <compare>
    <condition>
    <negated>N</negated>
    <leftvalue>anumber</leftvalue>
    <function>STARTS WITH</function>
    <rightvalue/>
    <value><name>constant</name><type>Integer</type><text> 11</text><length>-1</length><precision>0</precision><isnull>N</isnull></value> </condition>
    </compare>
    <cluster_schema/>
    <GUI>
    <xloc>296</xloc>
    <yloc>102</yloc>
    <draw>Y</draw>
    </GUI>
    </step>

    <step>
    <name>Text file output</name>
    <type>TextFileOutput</type>
    <description/>
    <distribute>Y</distribute>
    <copies>1</copies>
    <partitioning>
    <method>none</method>
    <field_name/>
    <schema_name/>
    </partitioning>
    <separator>,</separator>
    <enclosure/>
    <enclosure_forced>N</enclosure_forced>
    <header>Y</header>
    <footer>N</footer>
    <format>Unix</format>
    <compression>None</compression>
    <encoding/>
    <endedLine/>
    <file>
    <name>result</name>
    <is_command>N</is_command>
    <extention>csv</extention>
    <append>N</append>
    <split>N</split>
    <haspartno>N</haspartno>
    <add_date>N</add_date>
    <add_time>N</add_time>
    <pad>N</pad>
    <fast_dump>N</fast_dump>
    <splitevery>0</splitevery>
    </file>
    <fields>
    <field>
    <name>anumber</name>
    <type>Integer</type>
    <format/>
    <currency/>
    <decimal/>
    <group/>
    <nullif/>
    <length>8</length>
    <precision>0</precision>
    </field>
    <field>
    <name>bnumber</name>
    <type>Integer</type>
    <format/>
    <currency/>
    <decimal/>
    <group/>
    <nullif/>
    <length>7</length>
    <precision>0</precision>
    </field>
    </fields>
    <cluster_schema/>
    <GUI>
    <xloc>433</xloc>
    <yloc>85</yloc>
    <draw>Y</draw>
    </GUI>
    </step>

    <step_error_handling>
    </step_error_handling>
    </transformation>


    Thanks
    Sreelatha
    Last edited by Sreelatha; 09-13-2007 at 07:41 AM.

  5. #5
    Join Date
    May 2006
    Posts
    4,882

    Default

    I'll try... but at first sight it's just your locale formatting that is bugging you... set the format explicitly.

    Also edit your last post and remove the "encrypted" passwords.

    Regards,
    Sven

  6. #6
    Join Date
    Jun 2007
    Posts
    128

    Default

    Hi,

    I tried it with Kettle-2.5.1 and it is working fine.


    Thanks
    Sreelatha

  7. #7
    Join Date
    May 2006
    Posts
    4,882

    Default

    So which version did you have a problem with? Or did you just put in a format?

    Regards,
    Sven

  8. #8
    Join Date
    Jun 2007
    Posts
    128

    Default

    Hi,

    I had the problem with Kettle-2.5.0.
    I am trying various combinations with Kettle-2.5.1 and I will post the results.

    Thanks
    Sreelatha

  9. #9
    Join Date
    Jun 2007
    Posts
    128

    Default

    Hi,

    I don't understand what the problem might be here, but the same problem also occurs in Kettle-2.5.1. The first time, it worked fine. When I run it again, it gives me the data with split attributes.

    Thanks
    Sreelatha

  10. #10
    Join Date
    May 2006
    Posts
    4,882

    Default

    Lol, it's not split... it's just your Locale kicking in. It should do the same thing every time; however, in 2.5.x the preview is probably different from the real output (the preview didn't format, the output did). On the Text file output step, go to the Fields tab and put e.g. a 0 as the format.
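
    As a rough sketch of that suggestion, this is how the anumber field of the Text file output step in the transformation posted earlier might look once a format of 0 is entered on the Fields tab. It assumes the value typed there is stored in the <format> element, which is empty (<format/>) in the posted file; every other element is copied unchanged from that post.

    ```xml
    <!-- Sketch only: one field of the "Text file output" step from the posted transformation. -->
    <!-- Assumption: the format entered on the Fields tab is saved in <format>. -->
    <field>
    <name>anumber</name>
    <type>Integer</type>
    <!-- was <format/>; an explicit 0 asks for plain digits, with no locale grouping separators -->
    <format>0</format>
    <currency/>
    <decimal/>
    <group/>
    <nullif/>
    <length>8</length>
    <precision>0</precision>
    </field>
    ```

    The bnumber field would be changed in the same way, so that a value such as 116790 is written as 116790 rather than 116,790.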

    Regards,
    Sven

  11. #11
    Join Date
    Jun 2007
    Posts
    128

    Default

    Hi,

    Thanks for the help. I got it working now by specifying the format as 0.

    Thanks
    Sreelatha
