RE: Questions about web service step



Jens Bleuel
04-25-2007, 10:00 AM
> Here you have a many-to-many relationship. Your input is people - maybe
> first name and last name, your output is shop records. [...]

Yes, and I think it is even more complex:

Shop records are a good example, because every shop record can have children,
e.g. conditions.
So we are not dealing with a flat structure, neither in the output nor in the
input. We had those discussions about the XML input / output steps. Let's call
it a multi-level structure.
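
To make the multi-level idea concrete, here is a minimal Python sketch (not
Kettle code). The field names shop, city and conditions are invented for
illustration. It shows what happens when a shop record with child conditions
is forced into flat rows: the parent fields are repeated once per child.

```python
# Hypothetical shop records; every field name here is made up for illustration.
shops = [
    {"shop": "A", "city": "Mainz", "conditions": [
        {"type": "discount", "value": 5},
        {"type": "delivery", "value": 2},
    ]},
    {"shop": "B", "city": "Bonn", "conditions": []},
]

def flatten(records):
    """Turn each parent/child pair into one flat row (parent fields repeated)."""
    rows = []
    for rec in records:
        parent = {k: v for k, v in rec.items() if k != "conditions"}
        children = rec["conditions"] or [{}]  # keep parents that have no children
        for child in children:
            rows.append({**parent, **child})
    return rows

flat_rows = flatten(shops)
for row in flat_rows:
    print(row)
```

The flat form loses the information that the two rows for shop A belong to one
record, which is why a flat row stream alone may not be enough here.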

For an example of this, look in the Kettle directory at
\samples\transformations\XML Add - creating multi level XML files.ktr (oops,
I just found this to be in error, so see the output attached to this mail and
look at what the transformation "would" produce at this time ;-)

Now imagine a web service that wants to receive such a structure ;-)

One approach would be to have many input and output steps (as in the
enclosed picture) and to define the relations between them within the
WebServices step. (The underlying system would also need some information
from the inputs in the outputs, as discussed a few mails ago.)
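
As a rough idea of what "defining the relations" between several flat streams
could mean, here is a small Python sketch. It is only an assumption about one
possible design, not how the WebServices step works: the stream names, the
shop_id key and the nest() helper are all hypothetical.

```python
from collections import defaultdict

# Two flat input streams, as two separate input steps might deliver them.
# Stream and field names are hypothetical.
shop_rows = [{"shop_id": 1, "name": "A"}, {"shop_id": 2, "name": "B"}]
condition_rows = [
    {"shop_id": 1, "type": "discount"},
    {"shop_id": 1, "type": "delivery"},
]

def nest(parents, children, key, child_field):
    """Attach each child row to its parent via the shared key field."""
    by_key = defaultdict(list)
    for child in children:
        by_key[child[key]].append({k: v for k, v in child.items() if k != key})
    return [{**p, child_field: by_key[p[key]]} for p in parents]

nested = nest(shop_rows, condition_rows, key="shop_id", child_field="conditions")
print(nested)
```

The relation is just "which key links which stream to which parent", so it
could be declared once in the step and reused for both request and response.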

Maybe the way databases store XML documents could help us figure out an
easier/better approach to this and how to integrate it into Kettle.

Anyway, if we find a solution for this, we could integrate it into the XML
input / output steps and use them from the WebServices step (see picture
WebServiceMultiLevelRFC.jpg).

Does anyone have any information on how other ETL tools handle this?

Cheers,
Jens

> -----Original Message-----
> From: kettle-developers (AT) googlegroups (DOT) com
> [mailto:kettle-developers (AT) googlegroups (DOT) com] On Behalf Of Tim Pigden
> Sent: Sunday, 22 April 2007 22:31
> To: kettle-developers (AT) googlegroups (DOT) com
> Subject: RE: Questions about web service step
>
>
> Unless I'm missing some fundamental philosophical point here,
> is this view of a step actually sensible? Why should you
> expect the inputs for any particular operation to be matched
> directly by the outputs? For example, suppose you want to
> know about the shops which a group of people go to. Here you
> have a many-to-many relationship. Your input is people -
> maybe first name and last name, your output is shop records.
> The number of rows is different, the number of columns is different.
> Copying input rows to output rows is merely a characteristic
> of a class of steps - we just happen to have implemented
> almost exclusively those steps - but even this isn't really
> true. What about variables?
>
> Essentially what we are talking about is parameters to
> operations. These parameters could themselves be multi-row
> (as in the above example). They can certainly be multi-column.
>
> Wouldn't it be more logical to say that any data can or
> should participate in the data graph and that
> parameters/variables should be treated in the same way as
> other data streams? Then your web service might be seen as a
> form of input - just like a file input, which takes
> parameters. This might be a better way of treating things
> like lists of file names - you have an optional parameter
> input stream to the file reader. Then you could use a
> directory listing (multi-column to include dates and access
> rights) as a data stream just like any other that you could
> use filters or javascript to manipulate before passing to the
> csv reader. Or you could do the same for a list of names that
> become sequential parameters to a sql query.
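
The "parameters as a data stream" idea above can be sketched in a few lines of
Python. This is only an illustration under assumptions: the in-memory FILES
table stands in for a real directory, and the function names are invented,
not Kettle steps.

```python
# In-memory stand-in for a directory; names, dates and contents are invented.
FILES = {
    "a.csv": {"modified": "2007-04-20", "content": "x,1\ny,2"},
    "b.csv": {"modified": "2007-04-22", "content": "z,3"},
    "notes.txt": {"modified": "2007-04-22", "content": "ignore me"},
}

def directory_listing():
    """Emit one row per file, like any other input step would."""
    for name, meta in FILES.items():
        yield {"filename": name, "modified": meta["modified"]}

def only_csv(rows):
    """An ordinary filter applied to the parameter stream."""
    return (r for r in rows if r["filename"].endswith(".csv"))

def csv_reader(param_rows):
    """A reader whose file names arrive as an input stream, not as settings."""
    for param in param_rows:
        for line in FILES[param["filename"]]["content"].splitlines():
            yield line.split(",")

records = list(csv_reader(only_csv(directory_listing())))
print(records)
```

The reader does not care where its parameter rows come from, so the same
pattern would apply to a list of names feeding a SQL query.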
>
> Tim
>
>
> -----Original Message-----
> From: kettle-developers (AT) googlegroups (DOT) com
> [mailto:kettle-developers (AT) googlegroups (DOT) com] On Behalf Of Sven Boden
> Sent: 22 April 2007 20:48
> To: kettle-developers
> Subject: Re: Questions about web service step
>
>
>
> I'd pick b), but I see your point on multiple input rows.
>
> What I had in mind, as an example: you have 10 input rows I1 ...
> I10, each resulting in 12 output rows O1 ... O12.
> For each output row O, also include the fields of the input
> row that caused the output. So for each input row you
> would get 12 output rows, but each of the 12 would include
> the fields of that 1 input row.
> But this of course only works if you send a single input row per call.
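
A minimal Python sketch of this "copy the input fields onto every output row"
behaviour, assuming one service call per input row. fake_service is a made-up
stand-in for a real web service and returns 3 rows instead of 12 to keep the
example short.

```python
def fake_service(row):
    """Stand-in for a web service call; returns several output rows per input."""
    return [{"shop": f"{row['last_name']}-shop-{i}"} for i in range(1, 4)]

def call_per_row(input_rows, service):
    """Call the service once per input row and add that row's fields to each result."""
    out = []
    for row in input_rows:
        for result in service(row):
            out.append({**row, **result})
    return out

people = [
    {"first_name": "Ann", "last_name": "Lee"},
    {"first_name": "Bob", "last_name": "Ray"},
]
output = call_per_row(people, fake_service)
for row in output:
    print(row)
```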
>
> So maybe c) with a switch on/off or so.
>
> Best regards,
> Sven
>
>
> > b - Add a checkbox to choose to keep input data in the output data (but
> > keeping input data is difficult if the number of rows sent in each call
> > is higher than 1)
> > c - Only copy input rows to output if the call size is 1
> >
> > I think c might be the best solution but it can be difficult to
> > understand.
> > Maybe b is the better solution with some automatic check/uncheck
> > enable/disable function based on the operation we work on.
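
Options b and c can be combined in a small Python sketch: a keep_input switch
that only takes effect when the call size is 1. Everything here is
hypothetical (the function names, the call_size parameter and the fake
service); it only illustrates why larger call sizes make the copy ambiguous.

```python
def fake_batch_service(batch):
    """Stand-in service: the number of results is unrelated to the batch size."""
    return [{"result": f"r{i}"} for i in range(len(batch) + 1)]

def call_in_batches(input_rows, service, call_size, keep_input=True):
    """Send call_size rows per call; copy input fields only when that is unambiguous."""
    out = []
    for start in range(0, len(input_rows), call_size):
        batch = input_rows[start:start + call_size]
        results = service(batch)
        # With more than one row per call we cannot tell which input row
        # produced which output row, so input fields are copied only for size 1.
        if keep_input and call_size == 1:
            results = [{**batch[0], **r} for r in results]
        out.extend(results)
    return out

rows = [{"id": 1}, {"id": 2}]
print(call_in_batches(rows, fake_batch_service, call_size=1))
print(call_in_batches(rows, fake_batch_service, call_size=2))
```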