Hitachi Vantara Pentaho Community Forums
Thread: Excel Input Validation for duplicate rows

  1. #1
    Join Date
    May 2013
    Posts
    8

    Excel Input Validation for duplicate rows

    Hello everyone,

    I have an Excel sheet as input.
    Before loading the data into tables, I have to check that there are no duplicates in the Excel sheet.

    So my question is: how do I check whether there are any duplicate rows in the Excel sheet?

    Thanks in advance.

    Regards,
    Deepak

  2. #2
    Join Date
    Jun 2012
    Posts
    5,534

    You can identify duplicates in your input stream using a "Unique Rows" step, provided the stream is sorted on the key fields first.
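
    To make the "sort first, then compare neighbours" idea concrete, here is a minimal Python sketch of the logic. It is not Pentaho code and not necessarily how the "Unique Rows" step is implemented internally; the field names emp and dep and the sample rows are assumptions for illustration only.

```python
from operator import itemgetter

def split_unique_and_duplicates(rows, key_fields):
    """Sort rows on key_fields, then compare each row with the one before it."""
    key = itemgetter(*key_fields)
    unique, duplicates = [], []
    previous_key = object()  # sentinel that never equals a real key
    for row in sorted(rows, key=key):
        current_key = key(row)
        if current_key == previous_key:
            duplicates.append(row)  # repeat of the row just before it
        else:
            unique.append(row)      # first row carrying this key
        previous_key = current_key
    return unique, duplicates

# Hypothetical sample data, not taken from the original spreadsheet.
rows = [
    {"emp": "B", "dep": "DEP1"},
    {"emp": "A", "dep": "DEP1"},
    {"emp": "A", "dep": "DEP2"},
    {"emp": "A", "dep": "DEP1"},
]
unique, duplicates = split_unique_and_duplicates(rows, ["emp", "dep"])
```

    Without the sort, the two identical rows would not be adjacent and the neighbour comparison would miss them, which is why the sorting matters.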
    So long, and thanks for all the fish.

  3. #3
    Join Date
    May 2013
    Posts
    8

    Thank you, Marabu.

    If I use the "Unique Rows" step, then at least one of the duplicated rows will still be loaded into my DB, but I have to write an error message to the Excel output showing both rows.

  4. #4
    Join Date
    Jun 2012
    Posts
    5,534

    The first step towards a solution was to identify duplicates in your input stream.
    If you don't want any row with a non-unique keyset to make it into your table, why don't you just filter the stream?
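
    One way to read "filter the stream" is: count how often each keyset occurs, then let only rows with a count of 1 through. The Python sketch below illustrates that reading under stated assumptions; it is not a Pentaho transformation, and the function name, field names and rows are hypothetical.

```python
from collections import Counter

def split_by_key_uniqueness(rows, key_fields):
    """Return (accepted, rejected): rows whose keyset is unique vs. repeated."""
    def key_of(row):
        return tuple(row[field] for field in key_fields)

    counts = Counter(key_of(row) for row in rows)
    accepted = [row for row in rows if counts[key_of(row)] == 1]
    rejected = [row for row in rows if counts[key_of(row)] > 1]
    return accepted, rejected

# Hypothetical sample data for illustration.
rows = [
    {"emp": "B", "dep": "DEP1"},
    {"emp": "A", "dep": "DEP1"},
    {"emp": "A", "dep": "DEP2"},
    {"emp": "A", "dep": "DEP1"},
]
accepted, rejected = split_by_key_uniqueness(rows, ["emp", "dep"])
```

    Here every copy of a repeated keyset ends up in rejected, so none of them reaches the table, and both copies are available for an error report.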

  5. #5
    Join Date
    May 2013
    Posts
    8

    Originally posted by marabu:
    The first step towards a solution was to identify duplicates in your input stream.
    If you don't want any row with a non-unique keyset to make it into your table, why don't you just filter the stream?

    Can you please help me with the filtering, with an example?

    For example, I have the following records (attached image: Img1.jpg).

    Now I have employee A three times, but only two of those records are duplicates,
    so I want to show an error for A in DEP1.
    Last edited by phx.deepak; 05-29-2013 at 06:11 AM.

  6. #6
    Join Date
    Jun 2012
    Posts
    5,534

    Filtering is only necessary if you don't want the duplicates to show up in your Excel output.
    If you want to attach some notification, you don't need a Filter Rows step.
    So make up your mind.
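
    The "attach a notification" variant can be sketched in Python as follows: every row stays in the output, and rows whose keyset occurs more than once get an error text. This only illustrates the logic and is not the content of the attached transformation; the error field, the message text and the sample rows (modelled on the "A three times, twice in DEP1" description) are assumptions.

```python
from collections import Counter

def annotate_duplicates(rows, key_fields, message="Duplicate row"):
    """Keep every row and add an 'error' field that is set only for duplicates."""
    def key_of(row):
        return tuple(row[field] for field in key_fields)

    counts = Counter(key_of(row) for row in rows)
    return [
        {**row, "error": message if counts[key_of(row)] > 1 else ""}
        for row in rows
    ]

# Hypothetical rows: employee A appears three times, twice in DEP1.
rows = [
    {"emp": "A", "dep": "DEP1"},
    {"emp": "A", "dep": "DEP2"},
    {"emp": "A", "dep": "DEP1"},
]
annotated = annotate_duplicates(rows, ["emp", "dep"])
```

    Both A/DEP1 rows carry the message while A/DEP2 passes with an empty error field, so no filtering is needed for this output.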
    Attached Files

  7. #7
    Join Date
    Jan 2013
    Posts
    1

    Originally posted by marabu:
    Filtering is only necessary if you don't want the duplicates to show up in your Excel output.
    If you want to attach some notification, you don't need a Filter Rows step.
    So make up your mind.

    Today I ran into exactly the same issue, and this really works. You saved my day. Thanks a lot.
