Backup Software Cache Management
Normally, backup software reads 90+% of your machine’s hard drive and pushes that data over the network to someplace else.
However, in the process of doing this, it is going to steamroll your operating system’s filesystem cache.
There are two improvements that could be made here, by my reckoning, and I haven’t seen either one implemented.
1) Have the backup software back up the files that are in the operating system’s cache FIRST. Not only does this mean that important, often-changed files get priority, it means that backups will be slightly faster, as this data will come from memory, rather than from the disk.
This would probably require some way of getting at a list of what files the kernel/caching-daemon/whatever has in memory, which I’m not sure exists at the moment.
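For what it’s worth, Linux and most other Unixes do expose something close to this: the mincore(2) system call reports which pages of a memory-mapped region are currently resident in memory. Below is a rough sketch of how a backup tool might use it to order files, assuming Linux, CPython, and a libc reachable through ctypes. The helper names cached_fraction and order_for_backup are made up for illustration, and newer kernels may report every page as resident for files you neither own nor can write to.

```python
import ctypes
import mmap
import os


def cached_fraction(path):
    """Return the fraction (0.0 to 1.0) of a file's pages resident in memory."""
    size = os.path.getsize(path)
    if size == 0:
        return 0.0
    # Assumption: the C library is already loaded into the process and
    # exports mincore(), which is the case on Linux.
    libc = ctypes.CDLL(None, use_errno=True)
    n_pages = (size + mmap.PAGESIZE - 1) // mmap.PAGESIZE
    with open(path, "rb") as f:
        # ACCESS_COPY gives a private mapping with a writable buffer, which
        # lets ctypes take its address. Nothing is ever written to it.
        mm = mmap.mmap(f.fileno(), size, access=mmap.ACCESS_COPY)
    buf = None
    try:
        buf = (ctypes.c_char * size).from_buffer(mm)
        vec = (ctypes.c_ubyte * n_pages)()  # one status byte per page
        rc = libc.mincore(ctypes.c_void_p(ctypes.addressof(buf)),
                          ctypes.c_size_t(size), vec)
        if rc != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
        # The low bit of each byte is set when that page is resident.
        resident = sum(b & 1 for b in vec)
    finally:
        del buf  # release the buffer export so the mapping can be closed
        mm.close()
    return resident / n_pages


def order_for_backup(paths):
    """Sort paths so the most-cached files are backed up first."""
    return sorted(paths, key=cached_fraction, reverse=True)
```

Mapping a file does not read it in, so the check itself should not disturb the cache much. Walking the whole filesystem to ask this question for every file is another matter, and would need to be weighed against the time saved.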
2) Have the backup software access the disk in a way that keeps the kernel (or file caching daemon) from caching the files it reads. That way the backup doesn’t steamroll the carefully built-up filesystem cache of the system’s most often and/or most recently used files, and rarely-accessed files don’t suddenly get pushed into the cache just because the backup software touched them. This should make the system faster in general, since “better” files stay cached rather than rarely-used junk.
This could be as simple as adding an open flag to the kernel (e.g. in fcntl.h on Linux) that says “don’t cache this, please”, and then passing it when the backup application calls open.
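Something in this spirit can already be approximated on Linux without a new flag: posix_fadvise(2) with POSIX_FADV_DONTNEED lets a program tell the kernel it no longer needs a range of a file, and O_DIRECT bypasses the cache entirely at the cost of strict alignment rules. Here is a minimal sketch of the posix_fadvise approach, assuming Linux and Python 3.3 or later. The function name read_without_caching is hypothetical, and the advice is only a hint that the kernel is free to ignore.

```python
import os


def read_without_caching(path, chunk_size=1 << 20):
    """Yield a file's contents in chunks, hinting the kernel to drop each
    chunk from the page cache once it has been handed to the caller."""
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = 0
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            yield chunk
            # Advise the kernel that this byte range will not be needed again.
            os.posix_fadvise(fd, offset, len(chunk), os.POSIX_FADV_DONTNEED)
            offset += len(chunk)
    finally:
        os.close(fd)
```

One catch: the hint applies to the file's pages no matter who brought them in, so a file that was already hot before the backup started may get evicted too. A more careful tool would probably check residency first and leave already-cached files alone.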
… Just some musings I had while talking to Russ on our way driving to California.