Hitachi Vantara Pentaho Community Forums
Results 1 to 9 of 9

Thread: Convert 1 to many data relationship to single row

  1. #1

    Question Convert 1 to many data relationship to single row

    Hi,
    I've been wrestling with this for ages, but have not found the right way to do this in Kettle.

    I have a customer record and a phone record, which has a one to many relationship with the customer. They are linked on the key companyID (integer value), well in the contactdata table, the relevant field is called entityID because the table holds phone records for companies and people.

    Each phone record has a text entry like: Main 1:, Fax 1:, Fax 2: etc. There are a known text value which the user cannot alter.

    I would like to convert these to a single line: companyID, mainPhone1, mainPhone2, faxPhone1, faxPhone2, companyWebSite, companyEMail and write these values to a new record.

    I've attached my current attempt, but the problem right now is that the last "Select Values" gives an error like:

    Couldnt find field {0} in row!

    I've checked the values in the select and alter tab for each of the preceding hops and all are fine. Viewing the Input/Output fields also show the correct values.

    The data structures are as follows:

    Code:
    CREATE TABLE  `contactdata` (
      `contactID` int(10) unsigned NOT NULL auto_increment,
      `entityID` int(10) unsigned NOT NULL default '0',
      `PhoneType` int(11) default NULL,
      `description` varchar(16) character set latin1 collate latin1_general_ci NOT NULL default '',
      `number` varchar(64) character set latin1 collate latin1_general_ci NOT NULL default '',
      PRIMARY KEY  (`contactID`),
      KEY `entityID` USING BTREE (`entityID`,`PhoneType`)
    ) ENGINE=MyISAM AUTO_INCREMENT=1026139 DEFAULT CHARSET=utf8
    
    
    
    CREATE TABLE  `companycontact` (
      `phoneID` int(11) NOT NULL auto_increment,
      `companyID` int(11) NOT NULL,
      `mainPhone1` varchar(26) default NULL,
      `mainPhone2` varchar(26) default NULL,
      `mainPhone3` varchar(26) default NULL,
      `faxPhone1` varchar(26) default NULL,
      `faxPhone2` varchar(26) default NULL,
      `companyEMail` varchar(45) default NULL,
      `companyWebSite` varchar(76) default NULL,
      PRIMARY KEY  (`phoneID`),
      KEY `website` (`companyWebSite`),
      KEY `email` (`companyEMail`),
      KEY `companyID` (`companyID`),
      KEY `mainPhone1` USING BTREE (`mainPhone1`)
    ) ENGINE=MyISAM AUTO_INCREMENT=7707 DEFAULT CHARSET=latin1
    Sorry for the probably dumb question, but I cannot figure out what I'm doing wrong.

    Thanks in advance,
    Peter
    Attached Files Attached Files

  2. #2

    Red face

    I have got a version working (attached). If anyone has a better solution, or a more elegant way of resolving this problem, I would be very interested.
    Regards,
    Peter
    Attached Files Attached Files

  3. #3
    Join Date
    May 2006
    Posts
    4,882

    Default

    In your "select values" step different kind of rows will arrive, and this blows up your transaction, each step can process records of only 1 type.

    So you probably need multiple insert/update steps per flow and probably using "unique connection" switched on in order to avoid deadlocks.

    Also no idea why you have multiple table input steps.

    Regards,
    Sven

  4. #4

    Default

    I would still be interested in knowing why the transform above does not work, though! Thanks!

  5. #5
    Join Date
    May 2006
    Posts
    4,882

    Default

    Because you try to process multiple kind of rows in 1 step... Most steps in 2.5 contain an optimization that they take the first row they ever process and determine the fields from that one.... if then afterwards you throw in another kind of row (using different fields) the step gets confused and aborts

    1 step = 1 type of input rows.

    Regards,
    Sven

  6. #6

    Default

    Hi Sven,
    Thanks for reply. The Table Input #1 selects a DISTINCT on description which gives a "master" list of categories, e.g. Main 1, Fax 1, etc. These are passed to Table Input #2 to select data for that particular case, e.g. Main 1 phones only.

    I did in the end use multiple Inserts to solve the problem, as attached.

    What I am still not clear on is, when a "row" is being processed, say from a database, does it process each row through the whole transform or batch up the data in segments and process them in parallel?

    In my design, I have assumed the preceding step has been processed PRIOR to the current one, rather than in parallel.
    Thanks again,
    Peter

  7. #7

    Default

    OK. I was under the impression that "Select Values" would join the input rows together regardless. Thanks that clears things up.

  8. #8

    Default

    Quote Originally Posted by sboden View Post
    Because you try to process multiple kind of rows in 1 step... Most steps in 2.5 contain an optimization that they take the first row they ever process and determine the fields from that one.... if then afterwards you throw in another kind of row (using different fields) the step gets confused and aborts

    1 step = 1 type of input rows.

    Regards,
    Sven
    Sven,
    Sorry not too clear when you say "type", clearly this is not a low level data type, since multiple data types can co-exist in a row.

    Do you mean type = source? E.g. from an excel field cannot be "merged" with another row from say a Text Input by a "Select Values" step?

    I have a couple of transforms that seem to allow row elements to be merged using a "Select Values".

  9. #9
    Join Date
    May 2006
    Posts
    4,882

    Default

    More "rowtype" ... every row flowing over 1 hop has to be of the same type... same number of fields, same names of fields, same type of all the fields and same order of fields.

    You can get information from more than 1 source, but they all have to send rows of the same "type" as explained in the previous sentence.

    Most steps will only process one kind of type (except for some steps which use the yellow hops e.g. to lookup information).

    Regards,
    Sven

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.