It’s no secret that Apple’s Time Machine backup solution is clunky at best. For those who are unaware, it backs up a Mac to an HFS+ filesystem using directory hard links. The biggest problem with this is that HFS+ was an old filesystem when Time Machine was first designed and it’s gotten even older since. These days Apple uses APFS as their filesystem on computers and iOS devices but the Time Machine backup still uses the fragile HFS+ filesystem for its backups. If you select a backup target that’s not HFS+, such as a network share, it creates a sparse disk image there and creates an HFS+ filesystem in that image. I question whether there was a better way. There is but I wouldn’t expect Apple to bother actually implementing it because, well, they don’t actually seem to care about quality anymore.
The Better Way
I’m going to outline a conceptual way to do a better backup system than Time Machine but not dig into the gory details. Some of these concepts are based on zfs but not heavily so. The idea is to eliminate the weakest link in the whole system (HFS+) while still allowing for the current capabilities that Time Machine does deliver. Additionally provide some greater stability and resilience to interruption in the process.
Object Based Storage
If you’ve used S3 or other cloud-based storage systems, you are familiar with object-based storage. The basics are that instead of storing data at paths, the data is stored in an object referenced by an object id. Objects are created, read, renamed, and destroyed. In this example I will only use those operations and not use modification of objects. The advantage of object-based storage is that it can live on any kind of filesystem. The object IDs tend to be a long hex string and can be stored on a traditional filesystem which means in this case you can connect an APFS disk or a random network share and the backup will function the same with either.
Typically object IDs are a long hex string. One simple way to store objects in a traditional filesystem is to break up this hex string into a nested hierarchy of directories. So the object id
0123456789abcdef would be stored at the location
/01/23/45/6789abcdef and this works regardless of what the actual filesystem’s implementation is (excluding filename length limitations of certain ancient file systems) leaving the only real concern whether the filesystem has any file size limit. I will assume that the filesystem does implement some sort of barrier operation where every operation before the barrier is completed before operations after the barrier are started. This allows for assurance of certain atomicity of operations.
Turning a Backup into Objects
Before starting a backup, a new backup id is created. This is a simple number which is incremented for each backup taken. The id is stored in the metadata referencing each object and effectively provides a representation of when the object was first created. This will be useful later when it comes to removing backups.
It is not hard to represent a file system as a set of objects. First a single file’s contents is a single object. If there are concerns about a maximum size of objects, then a single file can be broken into a set of constant size objects (with the last object being variable size) plus one object which lists those objects. A directory is another object. The contents of this object is a list of directory items. Each directory item contains data about the item such as backup id, size, permissions, dates, etc and the object id that represents the contents of that directory item. Each directory item is either a file or directory. For purposes of Time Machine, the directory item could be just a slightly modified HFS+ B-tree element.
Handling things such as the resource fork (which isn’t used much anymore) is simply done by storing the resource and data forks in separate objects and the directory item has a reference to 2 objects for contents to represent each fork.
From this any file’s contents is easily found by starting with the directory object which represents the root of the drive. So finding
/Users/macuser/important.txt is found by looking at the root object and finding the entry for
Users. The entry indicates it is a directory and the object id for it. Read that object and find the entry for
macuser. Same deal as last time and find
important.txt. This time the directory entry indicates it is a file and its object id. Open that object and you have the content of the important file.
When an incremental backup is performed, the current Time Machine already determines what has changed since last time and only backs up those pieces. So if it determines that the only file that changed is
/Users/macuser/important.txt, then it creates a new object for the new important text file, stores that, stores the new
/Users/macuser in a new object and similarly with
/. This is essentially a copy-on-write system where new content is stored in new objects and old objects are never modified. This ensures the contents of old backups are intact.
Each backup should have some important metadata. Most notably is the object id of the top of the backup. Additionally it should contain the size, backup id, date, etc. Inexplicably to me, Apple’s current Time Machine requires mdsync (spotlight) to re-index the backup. This can be implemented by reading the previous backup’s database, apply the changes to re-index (the objects new to this backup), and storing this database in a new object. Then this object is just one piece of metadata for this backup. Additionally the snapshot should store the kill list
Temp Object Storage
The current Time Machine has a concept of a backup which is in progress. If this is interrupted, then on the next backup the in progress is examined and effectively resumed. In case an object was backed up for the in-progress backup and is no longer needed, it could be difficult to find this object (I’m not sure if this is the case). So an incremental backup could store all of its new objects in a temp location until the backup is complete.
Once the backup is complete, a barrier is enacted to ensure all the objects are written, the backup metadata is changed to update this state, another barrier is enacted, and the backup objects are moved to their permanent locations. If this process is interrupted, a subsequent run can detect that the in-progress backup is in the state where the objects are being moved, and simply resume the operation.
The Kill List is an idea that I took from ZFS. Since it is a copy-on-write filesystem, it needs a way to know what data to erase when a snapshot is removed. The same is true of backups as I have described. Above I mentioned objects are replaced with new objects when they are modified. During this process it is not difficult to maintain a list of objects which are not needed in the new backup. For the case where
/Users/macuser/important.txt is replaced, the previous object for
/ are not needed in the new backup. So these objects are added to this backup’s kill list.
When a backup is being removed because it is either old or space requirements, then the subsequent backup’s kill list provides some of the information on which objects need to be deleted. So in the case where backups 5, 6, and 8 exist, and backup 6 needs to be deleted, then objects which only belong in backup 6 should be deleted. The kill list in backup 8 lists all the objects which are not needed in backup 8 nor any subsequent backup. We can find the objects new in backup 6 by traversing the tree of objects in the backup and only going down paths where the object’s backup id is 6. The union of these two sets are the objects only in backup 6 and should be deleted with it. Those objects should be removed from backup 8’s kill list (because they are being removed) and backup 6’s kill list should be added to 8’s.
Deleting the first backup is much easier. Simply delete all objects in the subsequent backup’s kill list and empty that list.
Again, barriers should be used to indicate that a backup is being removed, perhaps with the list of objects to remove, as well as updates to the subsequent backup’s kill list. This way an interrupted process can still be resumed.
While this isn’t an exhaustive design, one familiar with the art can see how it can be superior to the current Time Machine. The ideas here are not terribly novel nor were they conceived after Apple’s engineers designed Time Machine. It’s a shame that the Time Machine we have is so poor in comparison to what it could’ve been.
Why Did I Write This?
I opened this post describing problems with the current Time Machine implementation. I have personally had to repair HFS+ disk images many times due to the fact it was never designed to have a significant level of reliability. I had kept up with the design of the ZFS file system and began thinking about how ideas from it could be adapted to Time Machine. I realized that the idea can actually work and be much more reliable (and likely faster) than the current implementation. So I decided to write the ideas down in case others find them useful.