Sitecore 9 xDB Sharding

Have you ever wondered what is going on with those new Shard databases in Sitecore 9? This is the new xDB! The new Shard Map Manager stores data based on the contact ID. A contact ID is a GUID that is unique to each contact. It looks something like this:

B9814105-1F45-E611-82E6-34E6D7117DCB

The xDB scales out by splitting these contacts across the various shards based on their contact identifier. A 16-byte hash of the GUID is computed, and each shard is assigned a range of hash values, so that contacts are distributed across the shards.

Default Sharding

To simplify, we’ll use 4-digit numbers in our demonstration instead of the hexadecimal hashes of the GUIDs. Let’s assume a default xDB installation with 2 shards:

[Figure: xDB-shards, the default two-shard layout]

In this scenario, a Contact with a hash ID of 3333 would be stored in Shard 1, and a Contact with hash ID of 6666 would be stored in Shard 2.

Adding more shards

If we want to handle more data, we could scale out the xDB to 4 shards:

[Figure: xDB-shards-4, the four-shard layout]

In this scenario, the Contact with hash ID 3333 would now be stored in Shard 2, and the Contact with hash ID 6666 would now be stored in Shard 3.
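The two scenarios above can be sketched in a few lines of Python. This is only an illustration of range-based sharding using the simplified 4-digit hash IDs from this post. It assumes the hash space 0000 to 9999 is split into equal-width ranges, which is consistent with the examples but is not taken from Sitecore itself. The names `HASH_SPACE`, `shard_boundaries` and `shard_for` are made up for the sketch.

```python
from bisect import bisect_right

# Assumption: simplified 4-digit hash IDs (0000..9999), as in the demonstration.
HASH_SPACE = 10_000


def shard_boundaries(shard_count, hash_space=HASH_SPACE):
    """Return the lower bound of each shard's range for an equal-width split."""
    return [i * hash_space // shard_count for i in range(shard_count)]


def shard_for(hash_id, shard_count, hash_space=HASH_SPACE):
    """Return the 1-based number of the shard whose range contains hash_id."""
    if not 0 <= hash_id < hash_space:
        raise ValueError(f"hash_id {hash_id} is outside 0..{hash_space - 1}")
    # bisect_right counts how many lower bounds are <= hash_id,
    # which is exactly the 1-based shard number.
    return bisect_right(shard_boundaries(shard_count, hash_space), hash_id)


if __name__ == "__main__":
    for shards in (2, 4):
        for hash_id in (3333, 6666):
            print(f"{shards} shards: hash {hash_id} -> Shard {shard_for(hash_id, shards)}")
```

Notice that the same hash ID lands in a different shard once the number of ranges changes, which is why existing data has to be moved when shards are added.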

The Shard Map Manager

The application layer gets all the information it needs from the Shard Map Manager. The map manager tells the application which shard to load, and where that shard is.

This means that you could split your shards across different database servers to give you better horizontal scaling. Shards 1 and 2 might be on Database Server A and Shards 3 and 4 may be on a separate Database Server B.
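As a rough sketch of that idea, a shard map can be thought of as a lookup table from hash ranges to a database and the server that hosts it. The code below is not the Sitecore or SQL Server client API. The server names, database names, ranges and the `locate` helper are all hypothetical and reuse the simplified 4-digit hash IDs from earlier in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShardLocation:
    low: int        # inclusive lower bound of the hash range
    high: int       # exclusive upper bound of the hash range
    server: str     # hypothetical server name
    database: str   # hypothetical shard database name


# Assumption: four equal-width ranges, two shards per database server.
SHARD_MAP = [
    ShardLocation(0, 2500, "DatabaseServerA", "Shard1"),
    ShardLocation(2500, 5000, "DatabaseServerA", "Shard2"),
    ShardLocation(5000, 7500, "DatabaseServerB", "Shard3"),
    ShardLocation(7500, 10000, "DatabaseServerB", "Shard4"),
]


def locate(hash_id, shard_map=SHARD_MAP):
    """Return the map entry whose range contains hash_id."""
    for entry in shard_map:
        if entry.low <= hash_id < entry.high:
            return entry
    raise LookupError(f"no shard is mapped for hash {hash_id}")


if __name__ == "__main__":
    for hash_id in (3333, 6666):
        entry = locate(hash_id)
        print(f"hash {hash_id} -> {entry.database} on {entry.server}")
```

The application only ever asks the map where a contact lives, so moving a shard to another server means updating the map entry rather than changing application code.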

All of this gives you great control over how you can scale your xDB data store using built-in SQL Server technology!

6 thoughts on “Sitecore 9 xDB Sharding”

    1. Right now this requires updating the PowerShell installation scripts so that they install more shard databases and set the permissions at the end of the installation.

      You can see some of what runs during installation in the xconnect-xp0.json install configuration file (or the corresponding configuration file for a scaled deployment).

  1. Is there a rebalancing mechanism in case of “add more”? You most likely won’t be able to plan properly in advance and will need more shards as your data grows.

    1. At the moment, the ability to re-allocate sharding is not an out-of-the-box feature. This currently needs to be scripted to migrate the data and remap it to the new correct databases. I would love to see a “grow shards” option in the future as a tool!
