Database partitioning

MattCasters
12-07-2006, 04:53 AM
We’ve been experimenting lately with database partitioning in version 2.3.2-dev (http://www.javaforge.com/proj/doc.do?doc_id=3701); make sure to update your kettle.jar to the latest snapshot. In our context, database partitioning means that we divide our data over a cluster of several databases.
A typical way of doing that is to divide the customer_id by the number of hosts in the cluster and take the remainder. If the remainder is 0, you store the data on the first host in the cluster; if it is 1, on the second; if it is 2, on the third; and so on.
This is the sort of thing we’ve been implementing in Kettle over the last couple of weeks. The reasoning is simple: if one database is not up to the task, split the load over 2, 5 or 10 databases on any number of hosts.
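
To make the remainder rule concrete, here is a minimal Python sketch. It is only an illustration of the idea, not how Kettle implements it; the function name partition_for and the host names db1, db2 and db3 are made up for the example.

```python
def partition_for(customer_id: int, host_count: int) -> int:
    """Return the zero-based index of the host that stores this customer_id."""
    if host_count <= 0:
        raise ValueError("host_count must be positive")
    # Remainder of customer_id divided by the number of hosts:
    # 0 means the first host, 1 the second, 2 the third, and so on.
    return customer_id % host_count


# Hypothetical cluster of three database hosts (names are assumptions).
hosts = ["db1", "db2", "db3"]

for customer_id in (9, 10, 11):
    target = hosts[partition_for(customer_id, len(hosts))]
    print(customer_id, "->", target)
```

With three hosts, customer_id 9 lands on the first host, 10 on the second and 11 on the third. Note that with this rule, changing the number of hosts changes the remainder for most ids, so existing rows would have to be redistributed.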