PDA

View Full Version : "Use unique connections" how make it works ?



michelvvl
12-14-2007, 10:06 AM
Hi,

I am using Kettle - Spoon v3.0.1 build:534.

In the same transformation, I need to do all database operation steps (Insert, Call Proc, SQL ...) in the same connection ( or session ).

I try to use "Use unique connections" option in the misc tab without success.

I have a 10g database and I am using Native JDBC method access (I tried OCI method access aswell).

Here is an example :
http://forums.pentaho.org/attachment.php?attachmentid=1820&stc=1&d=1197638889

- In my database connection (in the SQL tab), I execute a log_sessionID right after connecting.

- "Use unique connections" option is activated in the Misc Tab.

- I ran this transformation then checked the log in my DB and I get that :eek::

240 SESSIONID=4509143 - PDI - connection
241 SESSIONID=4509144 - PDI - connection
242 SESSIONID=4509145 - PDI - connection
243 SESSIONID=4509145 - PDI - Log SessionID 1
244 SESSIONID=4509146 - PDI - connection
245 SESSIONID=4509146 - PDI - Log SessionID 2
246 SESSIONID=4509143 - PDI - Call AddDataSource
247 SESSIONID=4509144 - PDI - Call AddDataSource 2

But I need SessionIDs are all the same.

I tried the pooling but it seems that we cannot have a pooling less than 5.

Any Idea ? :confused:


Thank you for your help.:D

Michel.

BeLienig
12-14-2007, 12:47 PM
I also use "unique connections" in transformations and it's working for me, though I use PDI 2.5.2 for that.

The pooling size should be corrected since 2.5.2 and 3.0 RC1 (see PDI-226 (http://jira.pentaho.org/browse/PDI-226)for details). Are you sure it's really bugged in 3.0.1?

Even if this feature isn't meant for what you are trying to do, it should work. Define a pool with only one connection (min, max) and set the parameter "maxWait" to higher value (like e.g. 50000).

You may also want to take a look at this thread:
http://forums.pentaho.org/showthread.php?t=56853


Regards,
Ben

MattCasters
12-14-2007, 01:09 PM
I have no idea what the GP is referring to but the "Unique connections" option opens a single database connection for each defined database connection in use in the transformation.
That is different from the normal behavior of opening one database connection per copy of a step.

What is also different is that the "unique connections" option only fires a commit to the database when the last step that holds the database connection is done with it. (or a rollback)

And that is the problem I think in your transformation. The SQL step is always throwing a commit behind each statement and I bet it's messing with that scheme.

I've said it before and I'll say it again now: I should never have written that step.

Matt

michelvvl
12-14-2007, 01:54 PM
Even if this feature isn't meant for what you are trying to do, it should work. Define a pool with only one connection (min, max) and set the parameter "maxWait" to higher value (like e.g. 50000).

You may also want to take a look at this thread:
http://forums.pentaho.org/showthread.php?t=56853


Thanks Ben for your answer, I'll try with the "maxWait" parameter.

Yes I know that the polling it's a dirty way to do that but it can be a solution ... if the "unique connections" doesn't work.

Michel.

michelvvl
12-14-2007, 02:13 PM
I have no idea what the GP is referring to but the "Unique connections" option opens a single database connection for each defined database connection in use in the transformation.
That is different from the normal behavior of opening one database connection per copy of a step.

What is also different is that the "unique connections" option only fires a commit to the database when the last step that holds the database connection is done with it. (or a rollback)

And that is the problem I think in your transformation. The SQL step is always throwing a commit behind each statement and I bet it's messing with that scheme.

I've said it before and I'll say it again now: I should never have written that step.

Matt

Thanks Matt for your quick response.:)

However a commit in a Oracle session doesn't close the session or doesn't change the SessionID.

As we can see on my Oracle Log, there are 4 different SessionID for the 4 db step.
So It seems that PDI opened a connection for each database steps.

Which step do you think I should have never written in this very simple transformation ? ;)

Is somebody got a transformation which perform several db step in a same connection ?
It may help me to figure out what's wrong.

Thanks,:D
Michel.

MattCasters
12-14-2007, 02:48 PM
You can use whatever you want, I'm saying **I** should never have written the "Execute SQL" step.
It's been a nightmare of one abuse of the step after another. :-)

Maybe you just created 4 database connections or something, how should I know?

This person seems to think it works: http://forums.pentaho.org/showthread.php?t=58720

Matt

michelvvl
12-14-2007, 03:30 PM
You can use whatever you want, I'm saying **I** should never have written the "Execute SQL" step.
It's been a nightmare of one abuse of the step after another. :-)

Maybe you just created 4 database connections or something, how should I know?

This person seems to think it works: http://forums.pentaho.org/showthread.php?t=58720

Matt

I am using "Execute SQL" steps just to make this example simple for that issue. (Not using it in my real transformation) However even without these steps, it still doesn't work :(.

LoL ... Matt, I will check if I didn't create 4 different database connections to the same database server and if I didn't use a different connection for each step :p.

Well ... I think I am too lazy to do a such thing actually ... lol.:D

I keep trying ... I appreciate your help.

Thanks,
Michel.

michelvvl
12-18-2007, 12:23 PM
Hi Matt,

I have a good news and a bad news about the "unique connection" option. :D

It seems to work on PDI v2.5.2 but not in v3.0.1.

Here is my test case with "Unique connections" activated:

http://forums.pentaho.org/attachment.php?attachmentid=1843&stc=1&d=1197991142

It calls two time this folowing stored proc compiled on my oracle db:

CREATE OR REPLACE PROCEDURE
get_cursessionid( p_sessionid_out OUT INTEGER )
IS
BEGIN
SELECT userenv( 'SESSIONID' ) sessionid
INTO p_sessionid_out
FROM DUAL;
END get_cursessionid;

This transformation has been saved in a xml format and launched on the both version. (Without any change)

On PDI v2.5.2, I have this logfile:
start;p_sessionid1;p_sessionid2
1;4512316;4512316

On PDI v3.0.1, I have :
start;p_sessionid1;p_sessionid2
1;4512319;4512318


As we can see PDI v2.5.2 keep the same sessionid but not PDI v3.0.1.

Regression or not?
Someone else can confirm ?

Please find in attched the transformation (.ktr) if you want to test it by yourself.

PS: Thanks Ben to make me think to try with PDI v2.5.2.

Cheers, ;)

michelvvl
12-19-2007, 02:19 PM
Little up and performance pb on PDI v2.5.2 :
(Reminder : As said below, "Use unique connection" seems not to work on PDI 3.0.1)

Without "Use unique connections", it takes around 15s.

With "Use unique connections" activated, it takes around 350s.

That means 20 times longer!

Do you have any idea for improving that ?

Thanks,

Cheers,
Michel.

sboden
12-19-2007, 02:42 PM
Well ... there are things to optimize but you can't make it go back from 350 to 15... the reason is simple, when you use "unique connection" you only have 1 connection which is a very limiting factor for the queries/updates done by Kettle.

Best is usually to redo your logic so that you can use multiple copies... how... that depends.

Regards,
Sven

MattCasters
12-19-2007, 07:51 PM
Wiser words have rarely been spoken on this forum.

michelvvl
12-20-2007, 09:00 AM
Yes I was thinking to redo my logic to use normal tables for staging data instead of using temporary tables.
Like this I will be able to use multiple connection but It will be harder to do multiple Loading at the same time in my case.

However I wanted to raise the "Use unique connection" BUG for PDI v3.0.1 which seems NOT to work.

How to do ?

sboden
12-20-2007, 09:08 AM
http://wiki.pentaho.org/display/EAI/Frequently+Asked+Questions

Read the bits under bug reports and feature requests

Regards,
Sven

michelvvl
12-20-2007, 09:10 AM
Well ... there are things to optimize but you can't make it go back from 350 to 15...

Could you give me some tips to make it faster even if I know it won't go back to 15s ?

Thanks,

Michel.

sboden
12-20-2007, 09:42 AM
Without seeing your transformation, possibly upping the commit size.

Regards,
Sven

jbleuel
01-08-2008, 11:01 AM
Just a link to Jira "Use unique connections" regression, http://jira.pentaho.org/browse/PDI-593

Summary: 3.0.1 (and I suppose 3.0.0, too) failed in using a unique connection, Sven fixed this in 3.0.2 and 3.1.0.

Cheers,
Jens

Dhiogo
01-25-2008, 12:01 PM
Hello,

I have a problem also with "user unique connection" (Kettle 3.1.0 build 561), I don't know if it was fixed. Kettle can't initialize some steps when I enable the "use unique connection" checkbox. It's appearing a strange error about access denied. My database setting is OK (user, password, URL, databasename...). This error disappering when I disenable the checkbox of "use unique connection. The error is shown below:
A database error occurred, stopping everything:
2008/01/25 13:42:12 - Lookup formation.0 - ERROR (version 3.1.0, build 561 from 2008/01/07 00:04:36) : Error occured while trying to connect to the database
2008/01/25 13:42:12 - Lookup formation.0 - ERROR (version 3.1.0, build 561 from 2008/01/07 00:04:36) :
2008/01/25 13:42:12 - Lookup formation.0 - ERROR (version 3.1.0, build 561 from 2008/01/07 00:04:36) : Error connecting to database: (using class com.intersys.jdbc.CacheDriver)
2008/01/25 13:42:12 - Lookup formation.0 - ERROR (version 3.1.0, build 561 from 2008/01/07 00:04:36) : [Cache JDBC] Communication link failure: Access Denied

I'm using Intersystems Caché Database v. 2007 with JDBC native (the driver is OK in my classpath).

Can anybody help me?

Thanks
Dhiogo

Dhiogo
01-26-2008, 11:00 AM
Hello,

I have a problem also with "user unique connection" (Kettle 3.1.0 build 561), I don't know if it was fixed. Kettle can't initialize some steps when I enable the "use unique connection" checkbox. It's appearing a strange error about access denied. My database setting is OK (user, password, URL, databasename...). This error disappering when I disenable the checkbox of "use unique connection. The error is shown below:
A database error occurred, stopping everything:
2008/01/25 13:42:12 - Lookup formation.0 - ERROR (version 3.1.0, build 561 from 2008/01/07 00:04:36) : Error occured while trying to connect to the database
2008/01/25 13:42:12 - Lookup formation.0 - ERROR (version 3.1.0, build 561 from 2008/01/07 00:04:36) :
2008/01/25 13:42:12 - Lookup formation.0 - ERROR (version 3.1.0, build 561 from 2008/01/07 00:04:36) : Error connecting to database: (using class com.intersys.jdbc.CacheDriver)
2008/01/25 13:42:12 - Lookup formation.0 - ERROR (version 3.1.0, build 561 from 2008/01/07 00:04:36) : [Cache JDBC] Communication link failure: Access Denied

I'm using Intersystems Caché Database v. 2007 with JDBC native (the driver is OK in my classpath).

Can anybody help me?

Thanks
Dhiogo

Dhiogo
03-06-2008, 02:24 PM
Hi,

This problem still is happening with me. Will it be solve only in the next version?
The "use unique connection" option continues unstable. Beyond the Intersystems Caché Database, I'm using the Kettle's repository to store my definitions of the connections.
Somebody give me a answer, please.

Thanks
Dhiogo

sboden
03-06-2008, 03:39 PM
Well make a jira for your problem http://jira.pentaho.org/browse/PDI

The unique connectons where fixed in 3.0.2, but your specific problem seems to be something else at first sight.

Regards,
Sven