Dear Contributors!
Since posting a PDF here is hardly possible, I recommend searching for
"A Fast File System for UNIX".
The current file name is most probably "ffs.ps". The older file name
"05fastfs.ps" from ten years ago no longer exists.
GhostView will display it, and you can also convert it to PDF with this
tool if you prefer Adobe Acrobat Reader etc.
I'll refer to this paper in discussing defragging.
Back in the old DOS/Win16 days, there may have been an advantage to
contiguous files.
As on the PDP-11, binaries got slurped into RAM in one chunk; OS/2's and
Windows' DLLs already posed a problem because it was no longer a single
binary that got loaded.
See page "3" for why defragging the "old" 7thED file system was a non-issue
under *NIX back then: it involved a dump, rebuild, and restore.
An idea published in 1976 is also mentioned there, which suggested
regularly reorganising the disk to restore locality; this could be viewed
as defragging.
With the VAX came a concept of virtual memory that was new to *NIX: demand
paging. Prior to this, only swapping of segments, mostly of 64KB size, was
common.
Since then, binaries are read only as far as needed to set up the process;
the "call" to main(argc,argv) (note that an OS *returns* to a process to
facilitate multitasking) involves a page fault.
With some luck, that page is in the buffer cache, but the first call to a
function will almost certainly result in another page fault, where the
chance of finding the page in the buffer cache is greatly diminished and
the disk block will surely have rotated away.
Page "7" of the FFS paper mentions a rotationally optimal layout. In DOS
days, there were tuning programs to change the interleave factor; these
became obsolete when CPUs got fast enough and DMA disk access (the paper
calls this an I/O channel) became common, and an interleave factor of "1"
became standard.
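To illustrate what those tuning programs changed, here is a minimal sketch of how an interleave factor maps logical sectors onto the physical slots of one track. The function name and the sector counts are my own illustrative assumptions, not taken from the FFS paper or from any particular tool.

```python
def interleaved_layout(sectors_per_track: int, interleave: int) -> list[int]:
    """Return layout[slot] = logical sector stored in that physical slot.

    Consecutive logical sectors are placed `interleave` slots apart, so a
    slow controller gets time to digest one sector before the next one
    passes under the head.
    """
    if sectors_per_track <= 0 or interleave <= 0:
        raise ValueError("sectors_per_track and interleave must be positive")
    layout = [None] * sectors_per_track
    slot = 0
    for logical in range(sectors_per_track):
        # If the target slot is already occupied, take the next free one.
        while layout[slot] is not None:
            slot = (slot + 1) % sectors_per_track
        layout[slot] = logical
        slot = (slot + interleave) % sectors_per_track
    return layout


if __name__ == "__main__":
    print(interleaved_layout(8, 1))  # interleave 1: sectors in plain order
    print(interleaved_layout(8, 2))  # interleave 2: every other slot
```

With an interleave of 1 the logical and physical order coincide, which is the layout that became standard once the hardware could keep up.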
Also, booting becomes less sequential if you extend this term beyond the
loading and starting of the kernel to hardware detection and especially to
loading and starting background processes and initialising the GUI and its
processes.
Linux is still sequential up to GUI start, which is parallelised
everywhere, but some of the BSDs try to go parallel after hardware
detection, albeit with some provision for interdependencies.
OS/2, Windows, and MacOS_X switch early to parallelised GUI mode; MacOS<=9
never showed a text screen, and I don't know whether MacOS_X ever shows
one.
Then quite a bazillion processes contend for the disk arm. You may move
some *NIX subtrees to different SCSI disks to limit this, albeit not by
much; IDE disks have only recently become capable of detaching after a
command to enable parallelism.
Partitions on the same disk may aggravate the problem because they force a
long seek when the elevator algorithm has to switch partitions.
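The cost of such a partition switch can be sketched with a toy model of the elevator algorithm. This is a simplified LOOK-style sweep, assuming the head only moves in cylinder units; the cylinder numbers in the example are made up for illustration.

```python
def look_seek_distance(start: int, requests: list[int]) -> int:
    """Total head travel (in cylinders) for one upward sweep from `start`,
    followed by one downward sweep for the requests below `start`."""
    upward = sorted(r for r in requests if r >= start)
    downward = sorted((r for r in requests if r < start), reverse=True)
    travelled = 0
    position = start
    for target in upward + downward:
        travelled += abs(target - position)
        position = target
    return travelled


if __name__ == "__main__":
    # All requests within one partition near cylinder 100.
    print(look_seek_distance(100, [100, 120, 140, 160]))
    # Same number of requests, but half of them in a distant partition.
    print(look_seek_distance(100, [100, 120, 9000, 9020]))
```

In this model the first case costs 60 cylinders of travel and the second 8920, almost all of it spent on the single jump between the two partitions.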
DLLs especially, due to their shared nature, are never in the vicinity of
the binary calling them; shared libraries under *NIX are not as numerous
and pervasive.
Thus defragmenting becomes practically irrelevant, at least for
executables.
Buffer underrun protection is now common in any CD/DVD burner due to
their high speed, but the source of buffer underruns is more often a
process madly accessing the disk and/or the GUI than a fragmented disk,
which is usually still faster than any high-speed CD/DVD drive.
So defragmenting becomes irrelevant for normal files as well.
Traditional defraggers run in batch mode, which may be tolerable on a
workstation after business hours, but intolerable on an Internet server
which is accessed 24/7.
Batch defraggers which don't require unmounting the disk and thus can run
in the background also have the problem that their analysis is likely to
be obsolete by the time it finishes, so the defrag is suboptimal.
This is especially true for mail and/or news servers where bazillions of
mostly small files are created and deleted in quick succession.
One option would be an incremental defragger which moves any file closed
after writing to the first contiguous free space after it and fills the
resulting gap with files from below this boundary.
Over time, file shuffling decreases as static files tend to land at the
beginning of the disk and the dynamic ones behind them.
A batch defrag sorted in ascending order of modification date may shorten
this process significantly.
However, this scheme also gets overwhelmed on mail and/or news servers.
As mentioned on page "3" of the FFS paper, defragging was too costly back
then, so they decided to implement a controlled fragmentation scheme,
described mostly on page "8", with cylinder groups and heuristics to place
files there, large files being deliberately split up.
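The deliberate splitting of large files can be sketched roughly as follows. This is a strongly simplified toy model, not the paper's actual allocator: the chunk size, the rule "take the group with the most free blocks", and all names are my own assumptions for illustration.

```python
def place_file(file_blocks: int, free_per_group: list[int],
               chunk_blocks: int = 4) -> list[tuple[int, int]]:
    """Return [(cylinder_group, blocks_allocated), ...] for one file.

    Assumed heuristic: allocate at most `chunk_blocks` in a row, then pick
    the group with the most free blocks again (ties go to the lowest index).
    """
    if file_blocks < 0 or chunk_blocks <= 0:
        raise ValueError("file_blocks must be >= 0 and chunk_blocks > 0")
    free = list(free_per_group)  # work on a copy, leave the caller's list alone
    if file_blocks > sum(free):
        raise ValueError("not enough free blocks on the disk")
    placement = []
    remaining = file_blocks
    while remaining > 0:
        group = max(range(len(free)), key=lambda g: free[g])
        take = min(chunk_blocks, remaining, free[group])
        placement.append((group, take))
        free[group] -= take
        remaining -= take
    return placement


if __name__ == "__main__":
    print(place_file(3, [8, 8, 8]))   # small file stays in one group
    print(place_file(10, [8, 8, 8]))  # large file is spread over the groups
```

A small file lands in a single cylinder group, while a large one is spread out so that no single group fills up and loses its free space for neighbouring files.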
OS/2's HPFS is definitely modelled after BFFS; Microsoft tries to hide that
this also holds for NTFS.
I verified this on both NTFS 4 and 5.1 by loading a bazillion files,
including large ones, onto the NTFS drive and firing up a defragger with a
fine block display.
A checkerboard pattern will show up, revealing BFFS-like strategies.
Defragging this spoils the scheme and only makes regular defrag runs
necessary.
Thus even under NTFS, defragging becomes a non-issue; this may be different
for FAT.
Note NTFS is still difficult to read with a dead Windows, and practically
impossible to repair.
Bad idea for production systems.
The successor to NTFS has yet to be published, so no information about it
is available; the only sure thing is that your precious data will again be
practically lost with a dead Windows.
So it is reasonable to keep your precious data on FAT, or better on a Samba
server.
They will be accessible to Windows malware anyway; that is the design
fault of this OS.
Even Vista will not help; the "security" measures are reported to be such a
nuisance that users will switch them off.
And malware will find its way in even with full "security" enabled.
However, XP runs the built-in defragger during idle time and places the
files recorded in %windir%\Prefetch\ in the middle of free space and leaves
enough gaps for new files.
Boot time is marginally affected by this.
To get rid of this, you must disable the Windows equivalent of the cron
daemon, which may be undesirable.
You can disable the use of %windir%\Prefetch\ with X-Setup; then these
files aren't moved, but the defragmentation will still take place.
Thus it is a better idea to leave these settings as they are; file
shuffling settles comparatively fast.
Thus defragging becomes an old DOS/Win16 legacy which is still demanded by
the users.
This demand is artificially kept up by the defrag software providers, who
want to secure their income; even new companies jump on the bandwagon.
Back in DOS times, Heise's c't magazine closed its conclusion with the
acid comment that defragging is mostly for messy people who like to watch
their disks being tidied up, but only those, not their room or house.
Debian Sarge comes with an ext2fs defragger which is unusable with ext3fs
and requires unmounting the disk, and is thus practically useless.
The contact mail address was dead, so no discussion was possible.
However, ext2fs already follows the ideas of BFFS, so defrag should be a
non-issue there, too.
ReiserFS has somewhat dropped out of focus since the fate of Hans Reiser is
quite uncertain, given the trial for the murder of his wife.
Also, tests by Heise's iX magazine revealed that balancing its trees will
create an intolerable load on mail and/or news servers.
Rumour had it that a defragger was being considered.
Note also that the CHS scheme is broken internally by some disk vendors:
Heise's c't magazine once found an IBM drive going over one surface from
rim to spindle and then the next surface from spindle to rim, creating an
HCS scheme.
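The difference between the two orderings can be sketched as a mapping from a logical track number to a (head, cylinder) pair. The function names, the convention that cylinder 0 is at the rim, and the tiny geometry in the example are my own assumptions; the actual firmware of that drive is of course unknown to me.

```python
def chs_track(logical_track: int, cylinders: int, heads: int) -> tuple[int, int]:
    """Classic order: all heads of one cylinder first, then the next cylinder."""
    if not 0 <= logical_track < cylinders * heads:
        raise ValueError("logical track out of range")
    cylinder, head = divmod(logical_track, heads)
    return head, cylinder


def hcs_track(logical_track: int, cylinders: int, heads: int) -> tuple[int, int]:
    """Serpentine order: one whole surface rim to spindle, then the next
    surface spindle to rim, and so on (cylinder 0 is assumed to be the rim)."""
    if not 0 <= logical_track < cylinders * heads:
        raise ValueError("logical track out of range")
    head, offset = divmod(logical_track, cylinders)
    # Even surfaces run outward-in, odd surfaces run back inward-out.
    cylinder = offset if head % 2 == 0 else cylinders - 1 - offset
    return head, cylinder


if __name__ == "__main__":
    for track in range(8):
        print(track, chs_track(track, 4, 2), hcs_track(track, 4, 2))
```

In the serpentine order consecutive logical tracks never need more than a single-cylinder step or a head switch, but a "cylinder" as the OS imagines it no longer corresponds to anything physical.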
Also, disks now have only a few platters, or just one, to cope with low
height profiles, even beyond laptops; disks are now 3.5" and 2.5" with
heights below a third of the full-height form factor. Full-height 5.25"
disks (CD/DVD drives are half height) with ten platters, as Maxtor once
built, are unlikely to reappear.
Sector zoning also breaks the CHS scheme internally, but BFFS' cylinder
groups are still beneficial in all these cases; they will spread disk
access time and speed evenly anyway.
Conclusion: Defraggers are obsolete now, only an issue for some software
providers, and probably for harddisk vendors.
Kind regards
Norbert Grün (gnor.gpl@googlemail.com)