So, by now, if you’re one of the 70,000 or so people testing Windows Home Server, you’ll be regularly backing up your home computers each night, and by and large, you’ll have seen that it’s a pretty seamless experience – once your home server has checked out what computers you have on your network, it just works.
That’s the point – it just works – WHS successfully hides a lot of pretty amazing technology from the user, to make using the home server as simple as possible. So, I thought it would be interesting to take a look at some of that technology and see if I can do any kind of decent job of explaining it to the everyday user. I’m calling these posts “Under the Hood”, and this may be the first and last if I can’t figure out the technology myself.
First up is Windows Home Server backup, or rather, one particular element of backup which provides a great benefit to the user.
Let’s say you have two desktops and a laptop on your home network – the desktops each have a 250GB drive, and the laptop an 80GB drive. And let’s say they’re all half full. That’s 290GB of storage to back up every night and hang on, I only have 750GB of storage in my home server – it’s going to fill up in like three days!
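To put numbers on that worry, here is a quick back-of-the-envelope sketch. It assumes the naive case where every night stores a complete fresh copy of everything, which is exactly what WHS avoids. The variable names are mine, purely for illustration.

```python
# Naive estimate: every night stores a full copy of all used disk space.
drive_sizes_gb = [250, 250, 80]   # two desktops and a laptop, as in the example
fill_ratio = 0.5                  # every drive is "half full"
server_capacity_gb = 750

nightly_full_backup_gb = sum(size * fill_ratio for size in drive_sizes_gb)
nights_until_full = server_capacity_gb / nightly_full_backup_gb

print(f"Full backup per night: {nightly_full_backup_gb:.0f} GB")       # 290 GB
print(f"Nights until the server is full: {nights_until_full:.1f}")     # 2.6
```

So without some cleverness, the server would not even get through a third night.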
How does WHS fit in so many daily backups of all my computers in a limited amount of space?
The answer’s actually pretty simple, thanks to a piece of smart thinking.
Each night, Windows Home Server checks the data on each of your machines and only backs up data that it hasn’t backed up before – i.e. if a file on your desktop hasn’t been changed since it was originally backed up, it doesn’t need to be backed up again, so that’s one way of saving space.
Better still, if you have the same data on multiple machines, WHS only stores one copy of that data, but registers that it belongs on each machine. So when it comes to restoring the files, WHS knows which machines to restore that data to.
Let me give you an example – Windows system files. The laptop I’m writing this post on is a Windows XP Pro machine. The Windows System folder (holding all of the operating system files) is 2.05GB. I definitely want this folder backed up in case something goes wrong with my laptop. But those same system files also exist on the XP Media Center machine I use upstairs in the den – they’re exactly the same operating system files. (XP Media Center is very much based on XP Pro. It just has a prettier dress and a few new dance moves.) So WHS backs up the files once, but knows that they’re needed on both machines if I choose to restore either of them. Pretty smart, and it saves both storage space and backup time.
That’s why your first WHS backup often takes a while – the first backup of your first machine post-installation is literally backing up everything. Subsequently, data on your other machines is compared to the data backed up from the first machine, and if it’s the same, there’s no need to back it up – it’s already safely stored.
Hope you’re still with me – it’s time to get under the hood.
So, how does this all work? Windows Home Server uses a version of a technology known as Single Instance Storage. The general idea behind Single Instance Storage (or SIS, as it’s known) is the one I’ve tried to bring to life above – keeping one copy of data that multiple computers share. SIS is quite common in backup solutions and other server products – email systems, file servers, that sort of thing.
In the example above, I mentioned that WHS each night compares data it has already backed up (the Windows System files on my laptop) with data on other machines (the same files on my Media Center machine) and then takes a call on whether to back those files up too or not.
Windows Home Server doesn’t actually compare or even store the whole files themselves. It works with fragments of those files – called clusters. Your Windows Home Server comes with a built-in, custom-designed database which has two jobs:
1. Store fragments of your data (clusters)
2. Store metadata (data about data) that describes how to reassemble those fragments of data into an entire file system if required.
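To make those two jobs a bit more concrete, here is a minimal Python sketch of the idea. It is not how the WHS database is actually built. The class name, the method names and the 4096-byte cluster size are all assumptions of mine for illustration.

```python
CLUSTER_SIZE = 4096  # assumed fragment size in bytes, not a documented WHS value


class BackupDatabase:
    """Toy model of a database that stores fragments plus reassembly metadata."""

    def __init__(self):
        self.clusters = []   # job 1: the stored fragments of data
        self.metadata = {}   # job 2: (machine, path) -> cluster positions, in order

    def store_file(self, machine, path, data):
        positions = []
        for offset in range(0, len(data), CLUSTER_SIZE):
            self.clusters.append(data[offset:offset + CLUSTER_SIZE])
            positions.append(len(self.clusters) - 1)
        self.metadata[(machine, path)] = positions

    def restore_file(self, machine, path):
        # Reassemble the file by walking its metadata in order.
        return b"".join(self.clusters[i] for i in self.metadata[(machine, path)])


db = BackupDatabase()
original = b"A" * 5000 + b"B" * 3000
db.store_file("laptop", "notes.txt", original)
restored = db.restore_file("laptop", "notes.txt")
print(restored == original)  # True
```

The file itself is never stored as a file. Only its fragments and the recipe for putting them back together are kept.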
So for the first backup of the first machine, WHS examines every cluster on that machine, and generates a hash (a checksum generated from performing a calculation on the data itself) which is stored in the database. If another cluster is examined and is found to be the same (by comparing the hash values), then this is noted in the database, but the cluster is not stored again.
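Here is a small sketch of that hash comparison, using my laptop and Media Center machines as the example. SHA-256 and the 4096-byte cluster size are my own assumptions, as I don’t know which hash function or cluster size WHS really uses. The function names are hypothetical too.

```python
import hashlib

CLUSTER_SIZE = 4096  # assumed cluster size in bytes


def split_into_clusters(data):
    return [data[i:i + CLUSTER_SIZE] for i in range(0, len(data), CLUSTER_SIZE)]


def back_up(machine, data, store, references):
    """Store only unseen clusters; return how many new clusters were stored."""
    new_clusters = 0
    for cluster in split_into_clusters(data):
        digest = hashlib.sha256(cluster).hexdigest()  # assumed hash function
        if digest not in store:
            store[digest] = cluster       # first time we have seen this data
            new_clusters += 1
        # Either way, note that this machine needs this cluster.
        references.setdefault(machine, []).append(digest)
    return new_clusters


store, references = {}, {}

# Eight distinct clusters standing in for the shared Windows system files.
system_files = b"".join(bytes([i]) * CLUSTER_SIZE for i in range(8))
laptop_disk = system_files + b"L" * CLUSTER_SIZE
media_center_disk = system_files + b"M" * CLUSTER_SIZE

laptop_new = back_up("laptop", laptop_disk, store, references)
media_center_new = back_up("media-center", media_center_disk, store, references)
print(laptop_new, media_center_new)  # 9 1
```

The second machine only adds the one cluster the laptop didn’t already have, yet the database still records all nine clusters it needs for a restore.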
The following day, the same process occurs, but only new or changed clusters are copied to the database.
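The same trick works across nights. The sketch below is my own illustration, again assuming SHA-256 and 4096-byte clusters. It simply rehashes the whole disk each night, which may well not be how WHS spots changes. It does show why day two costs so little space while both days stay restorable.

```python
import hashlib

CLUSTER_SIZE = 4096  # assumed cluster size in bytes


def nightly_backup(disk, store):
    """Return (snapshot, new_count) for one night's backup of one disk."""
    snapshot, new_count = [], 0
    for offset in range(0, len(disk), CLUSTER_SIZE):
        cluster = disk[offset:offset + CLUSTER_SIZE]
        digest = hashlib.sha256(cluster).hexdigest()  # assumed hash function
        if digest not in store:
            store[digest] = cluster
            new_count += 1
        snapshot.append(digest)  # ordered recipe for rebuilding this night's disk
    return snapshot, new_count


store = {}
day1_disk = b"".join(bytes([i]) * CLUSTER_SIZE for i in range(10))
# On day two, exactly one cluster (the fourth) has changed.
day2_disk = (day1_disk[:CLUSTER_SIZE * 3]
             + b"\xff" * CLUSTER_SIZE
             + day1_disk[CLUSTER_SIZE * 4:])

day1_snapshot, day1_new = nightly_backup(day1_disk, store)
day2_snapshot, day2_new = nightly_backup(day2_disk, store)
print(day1_new, day2_new)  # 10 1
```

Each snapshot is just a list of hashes, so keeping many nights of history costs very little beyond the clusters that actually changed.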
Working at this deeply granular level, single instance storage lets Windows Home Server store only the data it really needs, whilst still guaranteeing that you can back up individual files, folders or indeed your entire computer should you wish.
Head hurting? Yep, mine too. Thanks to MVP Doug Knox and Charlie Kindel for providing a lot of the insight above. Their heads aren’t hurting as much as mine.


















