Hitachi Vantara Pentaho Community Forums

Thread: Building a HA cluster for BI suite

  1. #1
    Join Date
    Jul 2010
    Posts
    1

    Hi all,
    We are evaluating how to deploy the Pentaho BI server infrastructure, looking in particular at load balancing and HA clustering. Historically we have worked with either LVS or heartbeat, but have never approached a JBoss/J2EE-driven application.
    What scenarios and alternatives are available with Pentaho to build a load-balanced HA cluster? The reference scenario is a load balancer directing traffic to a battery of servers hosting the BI suite, so that the death of a node does not affect the functionality of the application. Databases are balanced/clustered independently.
    (For now let's exclude scalability for the PDI part)

    Thanks for your help,
    Mauro

  2. #2

    I'm also interested in setting up the same type of architecture described above. My experience has been that you can point multiple BI servers to the same solution repository. The trick is that you need to deploy all reporting content to each node, and then refresh the solution repository from one node. As long as the directories on each node match exactly, the content metadata in the hibernate database should work on each node.

    At this point, you should come up with another method of deploying content. With one node, pushing reports via client GUI tools isn't so bad, but for several nodes you should have deployment scripts that copy files from your version control area to each node. I've set up a sandbox area to test Pentaho BI server clustering, and it seems to work. I had two nodes, each pointed to the same database server containing my hibernate and quartz schemas, as well as my data warehouse. You can even add users via one node and have the other nodes recognize them.
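    A deployment script of the kind described above could look roughly like the following. This is only a minimal sketch: it assumes each node's pentaho-solutions directory is reachable as a local or mounted path, and the function names `deploy_solutions` and `relative_files` are made up for illustration, not part of any Pentaho tooling.

```python
import shutil
from pathlib import Path


def deploy_solutions(source_dir, node_dirs):
    """Copy the solution tree from source_dir into every node directory.

    Assumption: every entry in node_dirs is a path this machine can write to
    (for example a mounted share per node). Returns the updated directories.
    """
    source = Path(source_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"source directory not found: {source}")

    updated = []
    for node_dir in node_dirs:
        target = Path(node_dir)
        # dirs_exist_ok (Python 3.8+) overwrites files in an existing tree.
        shutil.copytree(source, target, dirs_exist_ok=True)
        updated.append(target)
    return updated


def relative_files(root):
    """Return the sorted relative paths of all files under root."""
    root = Path(root)
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )
```

    Note that `shutil.copytree` does not delete files that were removed from the source, so comparing `relative_files(...)` for the checkout and each node is one way to check that the directories really match before refreshing the solution repository.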

    I would love to hear of any other experiences people have with setting up such an architecture.

  3. #3
    Join Date
    Apr 2007
    Posts
    2,010

    It works fine, exactly as you describe. Just don't forget to reconfigure Quartz to support clustering; otherwise all your jobs will fire as many times as you have nodes.

  4. #4

    Thanks for pointing that out. Just to clarify, is this simply a matter of making the following change in pentaho-solutions/system/quartz/quartz.properties?

    org.quartz.jobStore.isClustered = true
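    For what it's worth, Quartz clustering with a JDBC job store also expects every node to have a unique scheduler instance id, which is commonly done with `org.quartz.scheduler.instanceId = AUTO`. A small check like the one below could be run against each node's quartz.properties. It is a sketch only: the helper names are hypothetical, the parser handles just simple `key = value` lines, and whether a given Pentaho release needs further changes is not something this thread confirms.

```python
def parse_properties(text):
    """Parse simple 'key = value' lines, skipping blanks and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def clustering_problems(props):
    """Return a list of likely clustering misconfigurations (empty if none)."""
    problems = []
    if props.get("org.quartz.jobStore.isClustered", "false").lower() != "true":
        problems.append("org.quartz.jobStore.isClustered is not true")
    # AUTO is the usual way to get a unique id per node; a hand-assigned
    # unique id per node would also work but is not detected by this check.
    if props.get("org.quartz.scheduler.instanceId") != "AUTO":
        problems.append("org.quartz.scheduler.instanceId is not AUTO")
    return problems
```

    An empty list from `clustering_problems(parse_properties(text))` only means these two settings look right on that node; it does not prove the jobs fire once.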

  5. #5

    This is exactly what I'm looking for, thanks for the tips.

  6. #6
    Join Date
    Jan 2007
    Posts
    25

    Hi, I have a working JBoss 5.1 cluster (2 nodes, Pentaho 3.8), but with a slightly different approach. I have pentaho_solutions as a shared folder on my network, and the two nodes work with the same folder. The nodes share the hibernate and quartz databases too. Everything works except in one case: when I create or delete a file in the shared pentaho solutions folder from one node and refresh the solution repository from that node, it works there, but node 2 can't see the changes, even if node 2's cache is refreshed. Is that approach possible? Any hints to solve this? Thanks, and sorry about my English...

  7. #7

    I like the approach of the shared folder for pentaho-solutions. Each node should read from the hibernate database for its directory, but I'm guessing there's some additional caching on each node preventing this. Is the problem fixed when you restart JBoss on the node that isn't updating?

    We're having a related caching issue with our cluster in regards to ad hoc queries. The logs indicate ad hoc resources aren't available when they clearly are, so I believe one server is caching resources that the other node doesn't have access to. We're NOT using sticky sessions, so requests for all resources are split across the cluster. I'm currently researching the ICacheManager to understand this better.

    Can anyone elaborate on BI Server caching in regards to clustering?

  8. #8
    Join Date
    Jan 2007
    Posts
    25

    When I restart the node that is not updating, I can see the changes on that node. I've tested with and without sticky sessions and the behaviour is the same.
    Additionally, I have problems with JPivot/Mondrian serialization when not using sticky sessions.
    Any help would be great...

  9. #9
    Join Date
    Apr 2007
    Posts
    2,010

    You do want to use sticky sessions; I don't think it is engineered to work otherwise. And you need to "refresh" the solution repository on both nodes, even though the repo DB is shared. This is because each node caches in memory too.

  10. #10
    Join Date
    Jan 2007
    Posts
    25

    Thanks for the answer, codek. I understand I have to use sticky sessions now, but I still have a question: I started exploring this topic based on pentaho_linear_scalability_1.4.pdf, which shows a JBoss/JGroups-based architecture. So, if I can't have session replication, why use JGroups? Can I have my cluster with separate JBoss instances and sticky sessions only?

    Regarding "And you need to "refresh" the solution repository on both nodes, even though the repo db is shared. this is because each node caches in memory too": I have tried refreshing from PUC and from the administration console, but the second node can't see the changes; they only show up if I restart that node. Is there another way, maybe programmatic, to refresh that cache?

    Thanks for your help

  11. #11

    codek, thanks for the reply. I guess we'll have to switch over to sticky sessions.

    dduenas, I believe you can programmatically refresh the solution repository by taking advantage of the ResetRepository service action. You could hit each node as follows, posting the user credentials:

    http://your_bi_server/pentaho/ResetRepository
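    Since each node caches in memory, the refresh presumably has to be sent to every node directly rather than through the load balancer. A rough sketch of that loop is below. The node URLs are placeholders, and the form field names `userid` and `password` are an assumption about how the credentials are posted, so check them against your own server before relying on this.

```python
import urllib.parse
import urllib.request


def build_reset_requests(node_base_urls, userid, password):
    """Build one POST request to /pentaho/ResetRepository per node.

    Assumption: credentials are accepted as form fields named 'userid'
    and 'password'; adjust if your server authenticates differently.
    """
    body = urllib.parse.urlencode(
        {"userid": userid, "password": password}
    ).encode("ascii")
    requests = []
    for base in node_base_urls:
        url = base.rstrip("/") + "/pentaho/ResetRepository"
        requests.append(urllib.request.Request(url, data=body, method="POST"))
    return requests


def reset_all(node_base_urls, userid, password, timeout=30):
    """Send the reset to every node; map each URL to a status or an error."""
    results = {}
    for req in build_reset_requests(node_base_urls, userid, password):
        try:
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                results[req.full_url] = resp.status
        except OSError as exc:  # URLError and HTTPError are OSError subclasses
            results[req.full_url] = str(exc)
    return results
```

    Something like `reset_all(["http://node1:8080", "http://node2:8080"], "admin_user", "secret")` could then run right after a deployment, with the hostnames and account replaced by real ones.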
