Intermediate SQL Color Coded SQL, UNIX and Database Essays

24 Apr 2012

IOUG Collaborate 2012: The art of database sharding

Thanks to everyone who attended my presentation about database sharding at IOUG Collaborate. That was a lot of fun!

Sharding is an awesome technique and I’ll blog separately on it, but for now see my presentation and white paper.

Cheers

Comments (4) Trackbacks (1)
  1. Hi Maxym,

    Thank you for sharing the presentation documents. It was very interesting information for me.

    I have a question, if you don’t mind. Could you briefly explain why Oracle RAC technology is not good enough for application scaling?

    Thank you in advance.

    Andriy

  2. Hello Andriy,

    I’m glad that you liked my sharding presentations … :-)

    To your question: first of all, I have nothing against RAC or clustering technology per se. I think it is awesome in some respects (such as uninterrupted app availability); I just do not believe RAC is a good scaling tool.

    Here is why:
    In a nutshell, on any system, apps are competing for 4 basic resources: CPU cycles, memory, disk IOPS and network bandwidth. The goal of scaling is to make sure apps have enough of each to work properly.

    While RAC can help with CPU and (with some difficulty) memory and network, it cannot really help much with scaling I/O, because the disk resource in RAC is shared. Unfortunately, in high-volume OLTP environments, disk tends to be the most precious resource, for example when you are doing LOTS of small random reads or inserts. That’s problem #1.

    Problem #2 is that RAC systems tend to increase significantly in complexity as you add more nodes to the cluster, because each node affects every other node directly or indirectly. That’s the reason you do not often see RAC clusters with a huge number of nodes, say, more than 100. On the other hand, the “one node” complexity of a sharded system stays constant, because “sharded nodes” do not interact with each other. Because of that, you can have lots of them working side by side without affecting each other’s performance (or reliability, for that matter). In short, that means unlimited scalability (with some caveats, of course).

    And finally, there is the matter of cost. One can argue that RAC is, in general, more costly, as it requires specialized (and often high-end) components, such as a SAN for shared storage and a high-speed “inter-instance” network, and more experienced engineers to support it (not to mention that you have to pay for the Oracle RAC license).

    Sharded systems, on the other hand, can be dirt-cheap, off-the-shelf machines with very generic components. They also tend to be very simple, which means that their maintenance can, to a large extent, be automated.

    Regards,
    Maxym Kharchenko

  3. Hi Maxym,

    Thanks for posting this. Without exaggeration, this is the best presentation about sharding (and some other Big Data concepts) that I have seen so far. Thumbs up!

  4. Thanks Nikolay,

    I’m glad you found it useful.

    Cheers,
    Maxym Kharchenko
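
The point in reply 2, that “sharded nodes” do not interact with each other, can be illustrated with a small sketch. This is not taken from the presentation or the white paper; it is a minimal, hypothetical example of hash-based routing in Python. The shard names, the `shard_for` function and the `ShardedStore` class are all made up for illustration, and the in-memory dictionaries merely stand in for independent database hosts with their own disks.

```python
import hashlib

# Hypothetical shard names; in a real system each would be a separate
# database host with its own CPU, memory, disk and network.
SHARDS = ["shard0", "shard1", "shard2", "shard3"]


def shard_for(key, shards=SHARDS):
    """Map a key to exactly one shard using a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


class ShardedStore:
    """Toy key-value store in which every key lives on a single shard."""

    def __init__(self, shards=SHARDS):
        self.shards = list(shards)
        # One private dict per shard stands in for that shard's local storage.
        self.data = {name: {} for name in self.shards}

    def put(self, key, value):
        # Only the owning shard is touched; the other shards see no traffic.
        self.data[shard_for(key, self.shards)][key] = value

    def get(self, key):
        # Reads are routed the same way, so no cross-shard coordination occurs.
        return self.data[shard_for(key, self.shards)].get(key)


if __name__ == "__main__":
    store = ShardedStore()
    store.put("user:42", "Andriy")
    print(shard_for("user:42"), store.get("user:42"))
```

Each request touches one shard only, which is why per-node complexity stays flat as shards are added. One of the caveats is visible even in this sketch: with simple modulo routing, changing the number of shards remaps most keys, so real deployments usually need a more careful placement scheme.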

