Hitachi Vantara Pentaho Community Forums
Thread: Repeat a value till the next identified increment in unstructured text

  1. #1
    Join Date
    Jun 2007
    Posts
    233

    Repeat a value till the next identified increment in unstructured text

    Hi Everyone

    It's been a very long time since I last used PDI, and I can see that a lot of work has been done since the days of version 2.x. Kudos to all the developers and Matt for keeping this project on a good path.

    I am slowly getting back into this and things are starting to come back to me. I need to ask for a little help in processing a file I would have probably laughed at years ago, but alas I find myself a babe in the woods again.

    In short, the file is unstructured text that has been extracted from a PDF. The text file reads like a printout, with each page of the PDF marked in the TXT by a --------Page xx--------- line at the top. I can easily identify the page marker with a regex, and am in fact using the Regex Evaluation step to do just that. The page number (xx) is caught by a capture group and placed into a new string field, pageNum. What I would like to do is repeat this value on every following row until the next page marker changes it. I remember doing something like this in the past, but I confess the solution eludes me.
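
    To make the marker detection concrete outside of PDI, here is a minimal Python sketch of the capture-group idea. The pattern is an assumption: it accepts any run of dashes around "Page" and a number, since the exact dash counts in the real file may vary. The names PAGE_MARKER and extract_page_num are made up for illustration and are not part of PDI.

```python
import re

# Assumed marker format: a run of dashes, the word "Page", a number, a run of dashes.
# The dash counts are not fixed here because the real file may differ.
PAGE_MARKER = re.compile(r"^\s*-+\s*Page\s+(\d+)\s*-+\s*$")


def extract_page_num(line):
    """Return the captured page number as a string, or None for ordinary lines."""
    match = PAGE_MARKER.match(line)
    return match.group(1) if match else None
```

    Marker lines yield the page number as a string and every other line yields None, which mirrors a pageNum field that is only populated on the marker rows.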

    Would anyone be so kind as to point me in the right direction?

    Cheers
    An extremely grateful Frog
    Everything should be made as simple as possible, but not simpler - Albert Einstein

  2. #2
    Join Date
    Jun 2012
    Posts
    5,534

    This might put you on the track again: Primitive alternating sequence

    PS: You can skip the primitive alternating sequence (PAS) and add a second Group By step instead.
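
    For readers without the attachment, the underlying "fill down" idea can be sketched in plain Python. This is not the attached transformation and not a PDI API, only an illustration of carrying the last seen value forward. It assumes each row is a dict with a pageNum key that is None or empty on non-marker rows; fill_down is a hypothetical helper name.

```python
def fill_down(rows, field="pageNum"):
    """Yield a copy of each row with an empty `field` replaced by the last seen value.

    Rows that come before the first populated value keep None.
    """
    last_seen = None
    for row in rows:
        value = row.get(field)
        # Treat both None and the empty string as "no value on this row".
        if value is not None and value != "":
            last_seen = value
        yield {**row, field: last_seen}


rows = [
    {"line": "--------Page 1---------", "pageNum": "1"},
    {"line": "first line of text", "pageNum": None},
    {"line": "--------Page 2---------", "pageNum": "2"},
    {"line": "more text", "pageNum": ""},
]
filled = list(fill_down(rows))
```

    The single running variable is the whole trick: it is overwritten only when a row carries its own value, so every row in between inherits the most recent page number.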
    Attached Files
    Last edited by marabu; 01-15-2014 at 02:23 AM.
    So long, and thanks for all the fish.

  3. #3
    Join Date
    Jun 2007
    Posts
    233

    Thank you for the helpful pointers

    Thanks, Marabu, most helpful. You have me back on track and I really appreciate your help. I played with PDI all day yesterday getting ready for a data migration, and it's coming back to me. I had totally forgotten so many of the steps and their possible uses. You have put my mind back into the right way of thinking.

    Cheers

    A very Happy Frog
    Everything should be made as simple as possible, but not simpler - Albert Einstein

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.