Hitachi Vantara Pentaho Community Forums

Thread: Plugin with multiple output row meta

  1. #1
    Join Date
    Oct 2010
    Posts
    5

    Plugin with multiple output row meta

    Hello listers,

    I have a plugin that reads data from a file whose record content is described by a COBOL structure.

    It is usually relatively easy to map a COBOL structure to a RowMetaInterface and so far my plugin, like most plugins I have seen, has had a single output format (a single instance of RowMetaInterface returned via getFields).

    Now COBOL supports an equivalent of the C union, where the same byte set is described by 2 different field sets.

    In order to support this use case, I would need to support 2 output formats (or 2 instances of RowMetaInterface). The choice between one or the other alternative would happen at runtime, based on some logic (usually checking a previous field value).
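The runtime choice described above can be pictured with a minimal plain-Java sketch that does not use the Kettle API. The record layout (a one-byte type code followed by a 4-byte area that is either one big-endian integer or four characters) and the names `UnionRecord`, `decode`, `AMOUNT` and `CODE` are hypothetical; they only illustrate one byte range with two alternative field sets.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical record: byte 0 is a type code, bytes 1..4 are a "union" area.
// Type 'N' reads the area as one big-endian int; any other type reads it as text.
class UnionRecord {

    static Map<String, Object> decode(byte[] record) {
        Map<String, Object> fields = new LinkedHashMap<>();
        char type = (char) (record[0] & 0xFF);
        fields.put("TYPE", String.valueOf(type));
        if (type == 'N') {
            // Alternative 1: numeric field set
            fields.put("AMOUNT", ByteBuffer.wrap(record, 1, 4).getInt());
        } else {
            // Alternative 2: text field set (US-ASCII only to keep the sketch simple)
            fields.put("CODE", new String(record, 1, 4, StandardCharsets.US_ASCII));
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(decode(new byte[] {'N', 0, 0, 1, 0}));
        System.out.println(decode(new byte[] {'T', 'A', 'B', 'C', 'D'}));
    }
}
```

The two calls return maps with different keys, which is the situation a single RowMetaInterface does not describe.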

    This would also mean that a step using my plugin should be able to support multiple outgoing hops, so that different steps could each process one of the output formats.

    Is this possible? Is there an example of a plugin that does something similar?

    Thanks very much.

  2. #2
    Join Date
    Nov 2010
    Posts
    9


    Wouldn't it be easier to split your input data into two separate streams and then treat them separately, each one having its own COBOL RowMetaInterface?
    You can read your input using the most common input format (e.g. a string of 80 chars). I do not know very much about COBOL, but I guess your input file is ASCII.
    The splitting can be done with a Switch/Case step (in the Flow category). Then you can use Replace in String and Split Fields (for each of the streams) to get the desired formats.
    Good luck!
    Carl.

  3. #3
    Join Date
    Oct 2010
    Posts
    5


    Thanks Carl,

    Switch/Case is close to what I need to do. I will look into its code.

    I can't use it directly though, because mainframe file content is not ASCII. Characters are encoded in EBCDIC and numerics are encoded in various mainframe-specific formats.
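As an illustration of one such numeric format, here is a minimal sketch that decodes a COBOL COMP-3 (packed decimal) field in plain Java. COMP-3 is picked only as a common example; the post does not say which formats the plugin handles, and the class name `PackedDecimal` is hypothetical. Only the usual sign nibbles (0xC positive, 0xD negative, 0xF unsigned) are handled, and the decimal scale, which comes from the COBOL picture clause, is ignored. Character fields would need an EBCDIC charset, for instance `Charset.forName("IBM037")`, assuming that charset is available in the Java runtime.

```java
// Decodes a COBOL COMP-3 (packed decimal) field into a long.
// Each byte carries two 4-bit digits, except the last byte,
// whose low nibble is the sign.
class PackedDecimal {

    static long decode(byte[] packed) {
        long value = 0;
        for (int i = 0; i < packed.length; i++) {
            int high = (packed[i] >> 4) & 0x0F;
            int low = packed[i] & 0x0F;
            value = value * 10 + high;
            if (i < packed.length - 1) {
                value = value * 10 + low;   // ordinary digit
            } else if (low == 0x0D) {
                value = -value;             // 0xD marks a negative number
            }
            // 0xC (positive) and 0xF (unsigned) leave the value unchanged
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(decode(new byte[] {0x12, 0x34, 0x5C})); // 12345
        System.out.println(decode(new byte[] {0x12, 0x3D}));       // -123
    }
}
```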

    Also important to note:

    - records are not terminated with the usual \n
    - records can be of variable size (the size is inferred from a separate COBOL structure, similar to a C struct)
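The two points above mean a reader cannot split on line breaks and has to compute each record's length itself. A minimal sketch, assuming a hypothetical rule where the first byte of each record is a type code that fixes the record length (the real rule would come from the COBOL structure); `RecordSplitter`, `lengthFor` and the lengths 3 and 5 are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RecordSplitter {

    // Hypothetical rule: the type code in the first byte determines the total length.
    static int lengthFor(byte typeCode) {
        switch (typeCode) {
            case 1: return 3;
            case 2: return 5;
            default:
                throw new IllegalArgumentException("Unknown record type: " + typeCode);
        }
    }

    // Cuts a buffer into records without relying on any terminator byte.
    static List<byte[]> split(byte[] data) {
        List<byte[]> records = new ArrayList<>();
        int offset = 0;
        while (offset < data.length) {
            int length = lengthFor(data[offset]);
            if (offset + length > data.length) {
                throw new IllegalArgumentException("Truncated record at offset " + offset);
            }
            records.add(Arrays.copyOfRange(data, offset, offset + length));
            offset += length;
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = {1, 10, 11, 2, 20, 21, 22, 23, 1, 30, 31};
        for (byte[] record : split(data)) {
            System.out.println(Arrays.toString(record));
        }
    }
}
```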

    Thanks for pointing out the Switch/Case plugin.

    fady

  4. #4
    Join Date
    Oct 2010
    Posts
    5

    Case/Switch does not support multiple RowMetaInterface

    After looking at the Case/Switch step more closely, it seems that it assumes all downstream steps will receive the same field set. So it does not solve my problem.

    To put the issue in perspective: assuming my step processes records of different types (each type with a different field set), is there any way I can fan out records, based on their type, to different target steps?
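The fan-out being asked about can be modelled in plain Java as follows. This is only a sketch of the idea, not Kettle code: `TypeRouter`, `register` and `route` are hypothetical names, each list stands in for one outgoing hop, and each array of field names stands in for the row layout that target would expect.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TypeRouter {

    // Field names per record type: one row layout per target.
    private final Map<String, String[]> layouts = new LinkedHashMap<>();
    // Rows per record type: one output per target.
    private final Map<String, List<Object[]>> outputs = new LinkedHashMap<>();

    void register(String type, String... fieldNames) {
        layouts.put(type, fieldNames);
        outputs.put(type, new ArrayList<>());
    }

    // Sends a row to the output registered for its type, checking the field count.
    void route(String type, Object... values) {
        String[] layout = layouts.get(type);
        if (layout == null) {
            throw new IllegalArgumentException("No target registered for type " + type);
        }
        if (layout.length != values.length) {
            throw new IllegalArgumentException("Row does not match layout of type " + type);
        }
        outputs.get(type).add(values);
    }

    List<Object[]> rowsFor(String type) {
        return outputs.get(type);
    }

    public static void main(String[] args) {
        TypeRouter router = new TypeRouter();
        router.register("HDR", "FILE_ID", "DATE");
        router.register("DTL", "ACCOUNT", "AMOUNT", "CURRENCY");
        router.route("HDR", "F001", "2010-10-01");
        router.route("DTL", "A42", 1250L, "EUR");
        System.out.println(router.rowsFor("HDR").size() + " header row(s), "
                + router.rowsFor("DTL").size() + " detail row(s)");
    }
}
```

Each target only ever sees rows with its own field set, so no record has to carry every possible field.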

    I know I could systematically output all possible fields for each record but I deal with long data descriptions and would rather avoid this if possible.

    I think the matter has to do with BaseStepMeta#getFields, which produces a single RowMetaInterface object. If this analysis is correct, then a step cannot produce more than one field set.

    Any ideas?

    Thanks
