Hitachi Vantara Pentaho Community Forums
Thread: Is it inefficient to frequently prune fields using select steps?

  1. #1
    DEinspanjer Guest

    Is it inefficient to frequently prune fields using select steps?

    I'm just wondering if someone has tested this before or maybe just innately knows it.

    My file input step generates fifty or so fields. I have a long series of steps that treat fields or groups of fields. (e.g. concatenate two fields to get a date-time timestamp then convert that to UTC and local time.)

    Would it be very inefficient to add select steps that remove the processed fields immediately after the steps that process them? It would make the field list much cleaner to work with further down the chain, but it would also be tedious to revert if it turned out to be a big performance hit.

  2. #2
    Join Date: May 2006
    Posts: 4,882

    Try it and let us know

    I wouldn't start removing data every other step, but if you remove a lot of values halfway through, it may save you some seconds ;-) ... It probably also depends on the kind of data you remove and on how many steps you have in your transformation.

    Regards,
    Sven

  3. #3
    Join Date: Nov 1999
    Posts: 9,729

    The cheapest option is always to leave the row layout unchanged as much as possible.
    For example, if you have 100 fields, adding one or two costs you next to nothing.
    However, re-ordering them and keeping only 20 of them is expensive performance-wise.
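
    To make the difference concrete, here is a rough sketch. It assumes a row behaves like a plain Object[] with a few spare slots at the end. The class and method names (RowLayoutSketch, appendField, selectFields) are made up for illustration and are not taken from the Kettle source, so treat it as a model of the cost, not as the actual implementation.

```java
import java.util.Arrays;

// Illustrative model only: a row is a plain Object[] with spare capacity
// at the end. Hypothetical names; not the Kettle API.
final class RowLayoutSketch {

    // Appending a field writes into a spare slot. A copy happens only
    // when the array has no spare slots left.
    static Object[] appendField(Object[] row, int usedSize, Object value) {
        Object[] target = row;
        if (usedSize >= row.length) {
            // Out of room: grow once and leave space for later additions.
            target = Arrays.copyOf(row, usedSize + 10);
        }
        target[usedSize] = value;
        return target;
    }

    // Selecting or re-ordering fields always allocates a new array and
    // copies every kept field, for every single row.
    static Object[] selectFields(Object[] row, int[] keepIndexes) {
        Object[] out = new Object[keepIndexes.length];
        for (int i = 0; i < keepIndexes.length; i++) {
            out[i] = row[keepIndexes[i]];
        }
        return out;
    }

    public static void main(String[] args) {
        Object[] row = new Object[110]; // 100 fields plus 10 spare slots
        for (int i = 0; i < 100; i++) {
            row[i] = "f" + i;
        }

        // Adding a field reuses the same array.
        Object[] appended = appendField(row, 100, "new");
        System.out.println("same array after append: " + (appended == row));

        // Keeping 20 fields in a different order builds a fresh array.
        int[] keep = new int[20];
        for (int i = 0; i < 20; i++) {
            keep[i] = 99 - i * 5;
        }
        Object[] selected = selectFields(row, keep);
        System.out.println("same array after select: " + (selected == row));
    }
}
```

    In this model the append touches one slot per row, while the select step pays for an allocation plus one copy per kept field on every row that passes through it.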

    YMMV.

    Matt

  4. #4
    DEinspanjer Guest

    Thanks, that's exactly the sort of guidance I was hoping for.


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.