Sitecore 9 xDB Sharding

Have you ever wondered what is going on with those new Shard databases in Sitecore 9? This is the new xDB! The new Shard Map Manager determines where data is stored based on the contact ID. A contact ID is a GUID that is unique for each contact. It looks something like this:

B9814105-1F45-E611-82E6-34E6D7117DCB

The xDB scales out by splitting these contacts across the various shards based on their contact identifier. A 16-byte hash of the GUID is computed, and each shard is assigned a range of hash values so that contacts are distributed across the shards.
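
To make the idea of a fixed-size key concrete, here is a minimal sketch that turns a contact GUID into a 16-byte value. The hash function Sitecore actually uses is not covered in this post, so BLAKE2b with a 16-byte digest is only a stand-in, and the name `shard_key` is hypothetical.

```python
import hashlib
import uuid


def shard_key(contact_id: str) -> bytes:
    """Return a 16-byte key for a contact GUID.

    Assumption: BLAKE2b is a stand-in chosen because it can emit exactly
    16 bytes. It is not claimed to be the hash that Sitecore uses.
    """
    guid = uuid.UUID(contact_id)  # parses the textual GUID, case-insensitively
    return hashlib.blake2b(guid.bytes, digest_size=16).digest()


key = shard_key("B9814105-1F45-E611-82E6-34E6D7117DCB")
print(len(key), key.hex())
```

The point is only that the same contact ID always produces the same key, so the same contact always lands on the same shard.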

Default Sharding

To simplify, we’ll use 4-digit numbers in our demonstration instead of the hex hashes of the GUIDs. Let’s assume a default xDB installation with 2 shards:

[Figure: xDB-shards (diagram of the default 2-shard layout)]

In this scenario, a Contact with a hash ID of 3333 would be stored in Shard 1, and a Contact with hash ID of 6666 would be stored in Shard 2.
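
The lookup in this scenario can be sketched as a range search. This assumes the 4-digit hash space 0000-9999 is split evenly, so Shard 1 holds 0000-4999 and Shard 2 holds 5000-9999, which matches the two contacts in the example. `SHARD_RANGES` and `find_shard` are hypothetical names, not Sitecore APIs.

```python
from bisect import bisect_right

# Assumed layout: each entry is (lowest hash in the range, shard name),
# sorted by the lowest hash. The even split is an assumption.
SHARD_RANGES = [(0, "Shard 1"), (5000, "Shard 2")]


def find_shard(hash_id: int, ranges=SHARD_RANGES) -> str:
    """Return the shard whose range contains the 4-digit hash ID."""
    if not 0 <= hash_id <= 9999:
        raise ValueError(f"hash ID out of range: {hash_id}")
    lows = [low for low, _ in ranges]
    # bisect_right finds the first range starting above hash_id,
    # so the owning range is the one just before it.
    index = bisect_right(lows, hash_id) - 1
    return ranges[index][1]


print(find_shard(3333))  # Shard 1
print(find_shard(6666))  # Shard 2
```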

Adding more shards

If we want to handle more data, we could scale out the xDB to 4 shards:

[Figure: xDB-shards-4 (diagram of the 4-shard layout)]

In this scenario, the Contact with hash ID 3333 would now be stored in Shard 2, and the Contact with hash ID 6666 would now be stored in Shard 3.
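
To see why some contacts change shards, here is a small sketch comparing the 2-shard and 4-shard layouts. It assumes the 0000-9999 hash space is divided evenly in both cases; `even_ranges` and `find_shard` are hypothetical helpers for the demonstration only.

```python
def even_ranges(shard_count, space=10000):
    """Split the hash space evenly; returns (lowest hash, shard name) pairs."""
    size = space // shard_count
    return [(i * size, f"Shard {i + 1}") for i in range(shard_count)]


def find_shard(hash_id, ranges):
    """Return the name of the last range that starts at or below hash_id."""
    owner = ranges[0][1]
    for low, name in ranges:
        if hash_id >= low:
            owner = name
    return owner


two = even_ranges(2)   # Shard 1: 0000-4999, Shard 2: 5000-9999
four = even_ranges(4)  # 2500 hash values per shard

for hash_id in (1111, 3333, 6666, 8888):
    before = find_shard(hash_id, two)
    after = find_shard(hash_id, four)
    status = "stays" if before == after else "moves"
    print(f"{hash_id}: {before} -> {after} ({status})")
```

Under these assumptions, only the contact with hash ID 1111 keeps its shard. Any contact whose owner changes has existing data sitting in the old shard, which is why adding shards to a populated xDB also means moving data.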

The Shard Map Manager

The application layer gets all the information it needs from the Shard Map Manager. The map manager tells the application which shard to load, and where that shard is.

This means that you could split your shards across different database servers to give you better horizontal scaling. Shards 1 and 2 might be on Database Server A and Shards 3 and 4 may be on a separate Database Server B.
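
As a rough illustration of that lookup, here is a toy map manager that resolves a hash ID to a shard and the server hosting it. This is a sketch of the concept only, not the actual Shard Map Manager API. The class and method names are hypothetical, and the hash ranges assume the even 4-shard split from the earlier example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShardLocation:
    server: str    # which database server hosts the shard
    database: str  # which shard database on that server


class ToyShardMapManager:
    """Hypothetical stand-in: maps hash ranges to shard locations."""

    def __init__(self):
        self._mappings = []  # (low inclusive, high exclusive, location)

    def add_range(self, low, high, location):
        self._mappings.append((low, high, location))

    def locate(self, hash_id):
        for low, high, location in self._mappings:
            if low <= hash_id < high:
                return location
        raise KeyError(f"no shard mapped for hash ID {hash_id}")


manager = ToyShardMapManager()
manager.add_range(0, 2500, ShardLocation("Database Server A", "Shard 1"))
manager.add_range(2500, 5000, ShardLocation("Database Server A", "Shard 2"))
manager.add_range(5000, 7500, ShardLocation("Database Server B", "Shard 3"))
manager.add_range(7500, 10000, ShardLocation("Database Server B", "Shard 4"))

print(manager.locate(3333))
print(manager.locate(6666))
```

The application only ever asks the map where a contact lives, so moving a shard to another server is a change to the map rather than to the application code.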

All of this gives you great control over how you can scale your xDB data store using built-in SQL Server technology!

7 responses to “Sitecore 9 xDB Sharding”

  1. Chris Perks

    Hi! Is the Shard Map Manager applicable for non-Azure installations, too?

    1. Jason St-Cyr

      Yes, Sitecore 9.0 uses sharding on-premises as well.

  2. Dylan Young

    How do you configure additional shard databases? I’m curious because I’m sure one day I’ll be asked to do this.

    1. Jason St-Cyr

      Right now this requires updating the PowerShell installation scripts to ensure that you install more shard databases and that the permissions are set at the end of the installation.

      You can see some of what runs during installation in the xconnect-xp0.json install config file (or the appropriate config file for a scaled installation).

  3. Alex Smagin

    Is there a rebalancing mechanism in case of “add more”? You most likely won’t be able to plan properly in advance and will need more shards as your data grows.

    1. Jason St-Cyr

      At the moment, the ability to re-allocate sharding is not an out-of-the-box feature. This currently needs to be scripted to migrate the data and remap it to the new correct databases. I would love to see a “grow shards” option in the future as a tool!

  4. […] the post from Jason St-Cyr to get some in depth knowledge on the sharding mechanism Sitecore uses: click here. Please copy your production shard0db, shard1db and refdatadb to the SQL server that you want to […]
