CHAPTER 1

HOW A DISK WORKS


You can skip this chapter if you already know all about disks. However, you need to understand disks well to really understand fragmentation. The rest of this book depends heavily upon the terms and concepts presented in this chapter. Rather than skipping this chapter altogether, I recommend that the experienced System Manager skim it or read it quickly, if only to ensure that we agree on all the essential terminology.

There are many different kinds of disks. We can omit some of them from discussion. Read-Only disks, for example, such as CD-ROM, cannot become fragmented after manufacture and, if they are already fragmented when created, there is nothing you can do about it. They are unchangeable. RAM disks and semiconductor disks are really memory, not disks, though they appear as disks to the computer system. While it is theoretically possible for such disks to become fragmented, little additional overhead is caused by their fragmentation, since there are no moving parts to cause delays finding data in "distant" parts of the disk. Floppy disks are usually too small to suffer from fragmentation and they are not ordinarily used for on-line storage. They are typically used once to store a small amount of data and then stored in a box on a shelf somewhere. Such usage does not suffer from the effects of fragmentation.

The disks with which we will concern ourselves are hard (not floppy) disks used for storing frequently-accessed information in a computer. In an OpenVMS system, the disk is an essential, central part of the computer itself. Without a hard disk, there could be no Virtual Memory System, which is the VAX operating system itself.

Throughout this book, when the word disk is used, the above is the type of disk being referred to.

Now let's take a look at the basic parts of a disk and the terminology that describes them:

The Physical Parts Of A Disk

A disk looks like this, conceptually:

A Disk

Figure 1-1 A Disk

Magnetic Surface

Any disk has, as a critical part, a surface on which to record data. That surface is usually magnetic, meaning that it is capable of storing a small amount of magnetism. Perhaps the most remarkable aspect of disk technology is the number of distinct amounts of magnetism that can be stored on a single surface of a disk. At the time of this writing, commercially available disks selling for a few hundred dollars are capable of storing hundreds of millions of separate, distinguishable bits of information on each square inch of surface. These numbers have been multiplying so fast for so long that I dare not even speculate on the future of disk storage capacities.

The surface of a disk is shaped like a platter, similar to a phonograph record.

Disk Surface

Figure 1-2 Disk Surface

Bits

Each single magnetic entity is used by the computer as a binary symbol. Binary means "having two and only two possible states" such as on or off, true or false, and so on. Each such entity is called a bit, which is short for binary digit. Binary digits are represented in written communication as zeros and ones. The use of numbers to represent bits, however, is only a convenience and does not mean that bits have anything to do with mathematics. They do not. Bits are in the realm of logic, which is a matter of reasoning and not arithmetic calculation.

Bit

Figure 1-3 Bit

Byte

When eight bits are considered together, they are referred to as a byte. A single eight-bit byte is the amount of computer storage typically used to store a single letter of the alphabet or other symbol of human communication. The word animal could thus be stored in six bytes of computer storage.

The byte is so common to computers that disks are referred to in terms of how many millions of bytes can be stored on them. Disks that can hold hundreds or even thousands of millions of bytes are commonplace today.

Byte

Figure 1-4 Byte

Block, Sector

The surface of a disk is divided into sections. This sectioning is not a physical marking on the surface, but rather it is just an idea that the disk is so divided. These sections are called sectors or blocks. The term sector is more common to personal computers and VAX or Alpha AXP hardware, while block is common OpenVMS terminology. In OpenVMS, a block is a collection of 512 bytes. OpenVMS disks are formatted into blocks of 512 bytes each. With a disk so formatted, it is common to talk about blocks of disk capacity. A 456 megabyte disk, for example, could also be said to have a capacity of about 890,000 blocks (890,000 x 512 = about 456 MB).

Block

Block

Figure 1-5 Block

Cluster

With larger disk capacities, it is inefficient for the computer system to deal with millions of individual blocks one at a time. The operating system's map of the disk's blocks is too big to be useful unless single bits in the map can represent more than one disk block. Accordingly, disk blocks are grouped into clusters, which are groups of blocks read and written as a unit. In other words, a cluster is the minimum allocation quantity for a disk. The cluster size, in terms of number of blocks per cluster, can be varied by reinitializing the disk.

Cluster

Figure 1-6 Cluster

Tracks

The blocks and clusters of storage space are arranged in groups referred to as tracks. A single track is one strip of disk space beginning at one point on the surface and continuing around in a circle ending at the same point. The tracks are concentric rings, not a spiral like the grooves on a phonograph record. Each surface has many tracks.

Tracks

Figure 1-7 Tracks

Platters

A disk may consist of one or more platters, each of which may be recorded on both sides. The platter spins like a phonograph record on a turntable.

Platters

Figure 1-8 Platters

Cylinder

The tracks at the same radius on each platter, taken together, are referred to as a cylinder. If you visualized these tracks without any other part of the disk, they would form the shape of a hollow cylinder.

Cylinder

Figure 1-9 Cylinder

Head

To detect magnetic information on the recording surface, the disk has one or more heads. A head is a tiny magnetic device capable of reading or writing magnetic bits on the disk surface. The platter spins near the head(s), so that a single track of recorded information is continuously passing under the head, available for reading or writing. The head never touches the surface. Rather, it floats on a cushion of air so thin that a human hair or even a particle of cigarette smoke cannot pass between the head and the surface. As foreign particles that small would cause the disk to fail, such disks are sealed in air-tight containers.

While some disks have had one head hovering over each track, it is far more common to have movable heads capable of moving from track to track as needed. The movement of a head from one track to another is called a seek. This term will be important when we are talking about disk speeds. The time it takes for a head to seek is one of the most critical factors in determining the speed of a disk.

Head

Figure 1-10 Head

Arms

Disk heads are mounted on arms that hold the heads close to the platter surface at precisely the right point to read or write data. There may be one arm for each head, but on multiple-platter disks a single arm may support two heads - one for the platter above the arm and one for the platter below. Some disks mount all the heads on a group of arms that move in unison. Imagine your spread fingers moving between the fanned pages of a book and you will get the idea of multiple disk arms moving together in and out of disk platters.

Arms

Figure 1-11 Arms

Spindle

A disk platter is attached to a spindle around which it rotates like a wheel on the axle of a car. The spindle is at the exact center of the platter. The arm moves the head from the outer edge of the platter toward the spindle at the center and back out again. Though most disks have only one spindle, some complex disks are made up of two or more single-spindle disks treated as one large disk. These are called multi-spindle disks. However, no platter ever has more than one spindle.

Spindle

Figure 1-12 Spindle

Electronics

Electronic circuitry is required to sense and record the magnetism on the surface of the platters and to move the heads. This circuitry is commonly referred to as the electronics of the disk. The electronics communicate data between the physical disk and the computer.

Electronics

Figure 1-13 Electronics

Drive

The combination of one or more spindles, arms, heads, platters and electronics into a single physical device for storing and retrieving data is known as a disk drive. In this book, the term drive will be used often to refer to the disk drive. Viewed outside its cabinet, a disk drive is a sealed metal object that looks something like this:

Disk Drive

Figure 1-14 Disk Drive

Cable

The electronics in the disk drive are connected to circuitry in the computer by means of cables, which are no more than wires with a certain type of connector on each end. Often, the individual wires are color-coded for clarity.

Cable

Figure 1-15 Cable

Controller

While the cables attach directly to the electronics of the disk drive on one end, they do not really attach directly to the computer on the other end. On the computer end, the cables attach to a controller, which is sometimes referred to as an interface. The controller, which is attached to the computer, decodes instructions from the computer and issues instructions to the disk drive to do what the computer has instructed. The controller also receives data and status information from the disk drive, which it passes on to the computer in a form the computer can understand. A single controller may service more than one disk drive.

Intelligent Disk Controller Functions

Disk controllers range in complexity from a very simple controller that merely relays instructions and data, to an intelligent controller that uses its information about the status of the disk to help the computer process data faster. Two examples of intelligent disk controller functions are seek ordering and data caching.

Seek Ordering

By keeping track of the exact position of the heads at all times, the controller can determine which one of multiple requests from the computer can be serviced in the shortest time. Then, instead of servicing the computer's requests in the order received, the controller can service first the requests for data nearest the heads and then the requests for data farther away. This is called seek ordering, which simply means putting the seeks in a better order.

For an over-simplified example, let's say the head is hovering over track 7. The computer is waiting for data from track 2 and from track 5. To go from track 7 to track 2 to service the first request and then back to track 5 for the second would require more time than it would to just stop off at track 5 on the way to track 2. So, the intelligent controller reorders the two requests and services the track 5 request first, then the track 2 request. The result is faster access to disk data on average.

Seek Ordering

Figure 1-16 Seek Ordering

Data Caching

Dramatic performance improvements can be gained by placing a significant amount of memory inside the disk controller. This local memory is called a cache and is used to store data recently retrieved from the disk by the computer. Then, if the computer should happen to request exactly the same data again, the controller can service the request from the local cache at memory speed (microseconds) instead of at disk speed (milliseconds). Of course, these dramatic gains are only available when the same disk block is retrieved more than once and when that block has been saved by the controller in its local cache. The amount of performance gain from such a system is wildly variable, being largely application-dependent.

Data Caching

Figure 1-17 Data Caching

Disk-Related Software

Driver

Just as there are many different types of disks, there are also many different types of disk controllers. Different computer instructions are needed to deal with each different type of controller. The set of instructions used to manipulate a controller is a software component known as a driver. The driver resides at the lowest levels of a computer's operating system, where it can interact directly with the hardware. The driver translates the instruction codes of the disk controller into standardized instructions recognizable and usable by the more generalized parts of the computer system, and vice versa. Conversely, the driver enables an application program to issue a generic "get data" instruction, for example, to the disk without having to concern itself with the peculiarities of that particular disk (the number of tracks, platters, and so on). A single disk driver may service more than one disk controller.

users
applications software
system services (software)
drivers
hardware

Example 1-1 Drivers in the OpenVMS Hierarchy

Input/Output (I/O) Request Queue

A driver has associated with it a queue for holding input/output (I/O) requests. This queue is merely a data structure enabling the computer to store an I/O request while it carries on with its work without having to wait for the I/O processing to complete. Entries can be added to the queue or deleted from it and, under certain conditions, the entries in the queue can be rearranged.

QIO

The OpenVMS operating system contains a mechanism for queuing (inserting) an I/O request to the queue of a driver. This mechanism is called the $QIO system service. The dollar sign indicates that this abbreviation is Digital's. QIO stands for "Queue Input Output," where queue is used as a verb. An application program or a higher level of the operating system uses the $QIO system service to cause I/O to occur.

Application

An application is a computer program which controls the computer system to perform some useful work for the user.

Disk File Structures

OpenVMS deals with disk blocks from two different points of view.

Physical Block

The actual arrangement of information on the surface of a disk platter is referred to as a physical block. The Physical Block Number (PBN) is an address used for identifying a particular block on the surface of the disk.

Physical Block Number

Figure 1-18 Physical Block Numbers

Logical Block

When the blocks on a disk are considered from a programming point of view, they are viewed as logical blocks. The address of a logical block on a disk is its Logical Block Number (LBN). LBN 0 (zero) is the first LBN on a disk. Logical blocks correspond one-for-one to physical blocks, but the logical block number might not correspond directly to the same physical block numbers.

Logical Block Numbers

Figure 1-19 Logical Block Numbers

OpenVMS drivers, controllers and some disk electronics are capable of detecting a physical block that is unreliable for storage of data and replacing it with a spare block from the same disk. When this occurs, the logical block number does not change, even though the physical block number is now different. In this book, when discussing blocks or block numbers, we will be referring to logical blocks unless otherwise specified.

Volume

In OpenVMS, the word volume refers to a structured (formatted) disk. When considering a disk as a logical (conceptual) unit of storage, rather than a physical unit of storage, it is referred to as a volume. OpenVMS has the capability of treating one or more physical disk drives as one disk. This capability is implemented by the use of software and does not involve any additional electronics or cabling. When two or more disks are so combined, the combination is referred to as a volume set, which is described more fully later.

Storage Bitmap

For each disk, OpenVMS maintains a map indicating which clusters of logical blocks are in use and which are free. Each cluster consists of one or more logical blocks. Each bit in the map represents one cluster. Therefore, a cluster is the minimum amount of disk space that can be allocated to anything. The map is called the Storage Bitmap. When OpenVMS allocates space on the disk for a file, it can scan the storage bitmap to find available clusters.

Storage Bitmap Relationship To Clusters

Figure 1-20 Storage Bitmap Relationship To Clusters

Extents

When OpenVMS allocates space on a disk for a file, it is not always possible to allocate all the needed space contiguously on the disk. Sometimes it is necessary to allocate part of a file in one place and the remainder in another. Files have been known to be allocated in dozens and even hundreds of pieces scattered around the disk. Each piece of a file so allocated is called an extent. The concept of an extent is inherent in a study of file fragmentation, as the allocation of multiple extents for a single file is file fragmentation. A contiguous file has only one extent.

Extents

Figure 1-21 Extents

Extent Cache

The extent cache is a portion of the system's memory that is set aside solely for the use of the OpenVMS file allocation mechanism. The extent cache stores the addresses of deallocated clusters, making it fast for OpenVMS to find free disk space by reusing these same clusters. This saves the overhead of scanning the Storage Bitmap of a disk to find free space. The extent cache, however, cannot store the logical block numbers of all deleted clusters. Ordinarily, there is room for only 64 LBNs to be stored, though this number can be changed by the System Manager.

Extent Cache

Figure 1-22 Extent Cache

File

Information stored on a disk is ordinarily stored in a file. In fact, for any OpenVMS disk using the ODS-2 structure, no information can be retrieved from a disk unless it is contained in a file. A file is "a collection of related records treated as a unit and usually referenced by a . . . name." While a record is a unit of data within a file, an extent (see above) is a unit of the file as a container for data.

File With One Extent

Figure 1-23 File With One Extent

Index File

The Digital Dictionary defines index file as "The file on a . . . volume that contains the access information for all files on the volume and enables the operating system to identify and access the volume." The index file is a catalog of all the files on a particular disk. In fact, the header (identifying information) of a file resides within the index file. All the information needed to access a file is contained here.

Index File Pointing To Extents

Figure 1-24 Index File Pointing To Extents

File Header

"A block in the index file that describes a file on a . . . disk. Every file residing on the disk has at least one header, which provides the location of the file's extents." While the header actually contains much more information than this, this is all we need to know about it for the purposes of this book.

Retrieval Pointers

Within the file header, the information critical to our discussion of file fragmentation is the section headed Retrieval Pointers. These pointers indicate where the file's data is located on the disk. Each pointer consists of the LBN of the first data block and a count of how many successive contiguous blocks contain data for that file. For example, this file has a single retrieval pointer:


Map area
     Retrieval pointers
          Count:     12          LBN:     27990

Example 1-2 File With One Retrieval Pointer

The first data block of this file is at Logical Block Number 27990. The entire file is contained in that block plus the 11 blocks following it in LBN sequence. This 12-block file is contiguous (not fragmented). It has only one extent.

In the following example, the file is broken into four fragments. The first consists of 6 blocks starting at LBN 5; the second is 3 blocks at LBN 297; the third, 3 blocks at LBN 200460; and the fourth, 4104 blocks at LBN 200760. This file is fragmented. It has four extents.


Map area
     Retrieval pointers
        Count:       6            LBN:     5
        Count:       3            LBN:     297
        Count:       3            LBN:     200460
        Count:    4104            LBN:     200760

Example 1-3 File With Four Retrieval Pointers

Directory

A directory is "a file that briefly catalogs a set of files stored on a disk. . ." From the user's point of view, a directory is a catalog of the names of files that are grouped in a particular way.

Directory

Figure 1-25 Directory

Disk Performance Terminology

Seek Time

As mentioned earlier, the movement of a disk head from one track to another is called a seek. The time it takes for a head to seek is the most critical factor in determining the speed of a disk. This is known as the disk's seek time. It consists of three parts: the time to start the head in motion and get it up to speed, the time to move the head from one track to another, and the time it takes to stop the head. Additional overhead is required to identify the correct track and, if necessary, reposition the head.

Seek Time

Figure 1-26 Seek Time

Rotational Latency

As a disk platter spins around the spindle, the blocks in a single track of recorded data are brought near a disk head. The head can only read or write a block when that block is immediately under the head. Accordingly, the time to access a block of data on the disk varies. It is much quicker to access a block that is currently or about to be under the head than it is to access a block that has recently passed under the head and is moving away. The block that has just passed under the head has to wait nearly a full rotation of the disk for another access opportunity. This delay is known as rotational latency.

A common disk rotational speed is 3600 RPM, meaning that the disk spins completely around 3600 times each minute. At this speed, each revolution of the disk takes 16.67 milliseconds. Naturally, rotational latency on this disk varies from zero to 16.67 milliseconds, and the average rotational latency is half the maximum, or 8.33 milliseconds.

Rotational Latency

Figure 1-27 Rotational Latency

Average Access Time

To compute the time it takes to access any block on a disk, it is necessary to combine the seek time and the rotational latency. Clearly, if the head is over the proper track and the desired block is immediately under the head, the access time is approximately zero. If the head is over the proper track, the access time is the same as the rotational latency. If the head has to move from one track to another, however, you must add the rotational latency to the seek time to compute the average access time. The average access time for modern-day disks falls roughly in the range of 8 to 25 milliseconds.

Seek Time + Rotational Latency

Figure 1-28 Seek Time + Rotational Latency

Disk Sets

Several methods of combining disks into sets are in current use. One reason for combining disks into sets is to improve performance by reducing the average access time when figured for the set as a whole.

Volume Sets

A volume set is a group of two or more disks combined so as to be treated by OpenVMS as a single disk equal in capacity to the total capacities of all the disks in the set. In a volume set, a file can span two or more disks in the set. The primary benefit of a volume set over separate disks is size - the volume set can accommodate a file larger than any individual disk in the set. The volume set also reduces average access time by spreading the disk accesses over two (or more) disks instead of one. Theoretically, half the disk accesses will occur on one disk in a two-volume set, while the other half occur on the other disk. If the theory were borne out in practice, this would allow twice as many disk operations in the same time, but actual results are not greatly improved.

Volume Set

Figure 1-29 Volume Set

Shadow Sets

A shadow set is a group of two (or more) identical disks combined so as to be treated by OpenVMS as a single disk equal in capacity to only one of the disks in the group. Each time data is written to the shadow set, the same data is written to all the disks in the set. That way, the data can be retrieved even if all but one of the disks fails. The primary benefit of a shadow set over separate disks is safety. The likelihood of losing the data on two or more disks at the same time is much more remote than that of losing the data on just one disk.

A shadow set also alters performance by increasing the time needed to write data (as it has to be written to all disks in the set) and by reducing the time it takes to read data. The two or more disks in the set are unlikely to have exactly the same block under their heads at exactly the same time, so OpenVMS will direct the read to the one which can retrieve the data faster, either because the data is closer to its head or because there is less work outstanding for that drive to do. Therefore, on average, the read time for the set is faster than that for either disk.

Shadow Set

Figure 1-30 Shadow Set

Stripesets

Two or more disks can also be combined into a stripeset. A stripeset is similar to a volume set in that the full capacity of all the disks in the set is available for data storage, and the whole set is treated by OpenVMS as a single large disk. Also, the virtual disk (the disk which is treated by OpenVMS as a single large disk) is divided into "chunks" rather than clusters.

The main difference is that, while each file extent must reside entirely on a single disk in a volume set, in a stripeset file extents are deliberately spread across multiple disks. One chunk (which could contain one or more extents) resides on one disk, the next chunk in sequence resides on the next disk, the next chunk on the next disk, and so on, starting over at the first disk when the last disk in the set is reached.

The primary benefit of a stripeset is application performance. Data read from a single file will come from several disks simultaneously, greatly reducing the application wait time. For example, if each disk has a 24 millisecond (ms) access time, the average time to collect a chunk from two disks simultaneously is the same - 24 ms. But that is 24 ms for two chunks instead of one.

Stripeset

Figure 1-31 Stripeset

Disk Architecture Evolution

Perhaps the most interesting question that comes up with respect to fragmentation is, "Why is the OpenVMS operating system designed so badly as to suffer from fragmentation?"

The answer to this question is that age-old saw, "It's a feature, not a bug!"

It's true. Originally, the VAX/VMS operating system, as it was called then, was deliberately designed to allow fragmentation and to deal with it efficiently. This design was actually the solution to an older and more serious problem.

The Original Problem

You see, when computers first appeared on the scene, they had no disks at all. Obviously, disk fragmentation was not a problem. No one had even had the idea yet, so the operating systems of that age had no mechanism to deal with fragmentation at all. If the computer even had an operating system.

This ancient era extended into the late 1960's, before the VAX and before even the PDP-11, on which the VAX was based. In the Digital world, these were the days of the PDP-8.

Then, disks happened. The first disks were small - unbelievably small by today's standards. They were measured in thousands of bytes rather than millions, never mind the billion-byte disks now available. Nevertheless, this was a fabulous advance in the days when memory sold for a dollar a byte. (At this writing, memory sells for $40 a megabyte retail - 25,000 times less!)

Later, the early PDP-11 operating system, called RT-11 (Real Time-11), was capable of storing data on disks in files, and the files were organized in a formal file structure. This file structure, however, required that all files be contiguous. That is, no file could be split into two or more pieces. It was not a question of performance; the system simply had no capability to create or access a file split into pieces. A file had a single location on a disk and that was that.

This requirement for contiguous files meant that a newly-created file had to fit within a single gap on the disk or not at all. It was not possible to allocate a part of the file to one gap and the rest to another, as can be done with today's OpenVMS. This was true even when there were no individual gaps large enough to accommodate the file, in spite of the fact that the total free space on the disk far exceeded the size of the new file. There had to be a large enough contiguous free space, period.

Naturally, this problem (of not enough contiguous free space to create a file) occurred every time a disk filled up, and small disks fill up very fast. With frequent file deletions, it was not unusual to have a disk reach the point where no more files could be created even though the disk was little over half full.

The SQUEEZE Solution

The solution to this problem was the SQUEEZE command. SQUEEZE compacted the disk, rearranging the files so they were all together near the beginning (LBN 0) of the disk, leaving all the free space in one large, contiguous area at the end. After a SQUEEZE, any file could be created, provided it would fit in the total free space remaining.

SQUEEZE

Figure 1-32 SQUEEZE

Nothing else could be done on that disk while the SQUEEZE was in progress, but this was not a big drawback, as RT-11 was a single-user system anyway. The only person inconvenienced was the one who wanted that new file created, so the benefit always outweighed the inconvenience.

The Fragmentation Solution

Then a wondrous new operating system came along, one that allowed multiple simultaneous users of the same PDP-11 computer - RSX-11. Now the inconvenience of a SQUEEZE would certainly outweigh the benefit, as all the users would have to stop working, not just the one who wanted to create the new file. Clearly, SQUEEZE was no longer a viable solution.

The designers of RSX-11 cleverly created a file structure that included the revolutionary capability to locate parts of a file in different places on the disk. Each file had a header that gave the location and size of each piece of the file, so the file could be in pieces scattered around the disk. Now a file could be created anytime there was sufficient free space anywhere on the disk; the space did not have to be contiguous.

Nor was there any drawback to this mechanism whatsoever. It really was "a feature, not a bug." Performance losses due to fragmentation, even when taken to extremes, caused very little difficulty for anyone. You must realize that at this time, in the early 1970's, disks were very small. Let's take a look at a real example:

The RK05 disk, which was in common use at the time, held 2½ megabytes (5,000 blocks). Suppose the disk was totally and utterly fragmented, that is, no two consecutive data blocks were contiguous. Every single disk access likely required moving the head and waiting for rotational latency. Even so, the whole disk could be read in 250 seconds (50 ms times 5,000 blocks). That's a little over four minutes - worst case. The same action on today's 700MB disk, even with a 16 ms access time, takes over 6 hours. During the same period of time, disk capacities have increased to over 500 times that of the RK05, and CPU speeds have increased to over 400 times that of the original PDP-11. Even though disk speeds have increased by a factor of three, they have not kept pace with the rest of computer technology. This makes the speed of the disk a major bottleneck in the computer system. This point is critical to an understanding of the fragmentation problem.

Clearly, today's larger disks brought with them more than higher capacities and speeds. They brought a susceptibility to a new kind of computer disease - fragmentation.

Here is a table showing the time required to access every block on a disk:


                           AVERAGE ACCESS TIME IN MILLISECONDS
  		70.00	50.00	40.00	33.00	25.00	16.00	10.00	8.00
               ---------------------------------------------------------------- 
	0.032 | 4.48	3.20	2.56	2.11	1.60	1.02	0.64	0.51
S	0.5   | 70.00	50.00	40.00	33.00	25.00	16.00	10.00	8.00
I	1     | 2.33	100.00	80.00	66.00	50.00	32.00	20.00	16.00
Z	2     | 4.67	3.33	2.67	2.20	100.00	64.00	40.00	32.00
E	5     | 11.67	8.33	6.67	5.50	4.17	2.67	100.00	80.00
	10    | 23.33	16.67	13.33	11.00	8.33	5.33	3.33	2.67
I	20    | 46.67	33.33	26.67	22.00	16.67	10.67	6.67	5.33
N	40    | 93.33	66.67	53.33	44.00	33.33	21.33	13.33	10.67
	100   | 3.89	2.78	2.22	110.00	83.33	53.33	33.33	26.67
M	206   | 8.01	5.72	4.58	3.78	2.86	109.87	68.67	54.93
B	456   | 17.73	12.67	10.13	8.36	6.33	4.05	2.53	2.03
	700   | 27.22	19.44	15.56	12.83	9.72	6.22	3.89	3.11
	1200  | 46.67	33.33	26.67	22.00	16.67	10.67	6.67	5.33

                Legend:	  seconds minutes hours
Table 1-1 Time Required To Access Every Block On A Disk

So the ability to deal with fragmented files, which was carried over from the RSX-11 operating system to the OpenVMS operating system, was a solution to an earlier problem that failed to anticipate the enormous capacities to which disks would grow. There is no end to this growth in sight. Deliberate fragmentation is no longer only a feature; it is now a problem, too.

Summary

In this chapter, the inner workings of a disk have been explained, with care taken to define the terms needed to truly understand fragmentation. To be sure, there is a lot more to know about disks, but our concern here is not to learn disk design, construction or maintenance. Our purpose is to understand enough about disks so that fragmentation and its cure make sense. To this end, the first chapter has also devoted time to the basic concepts of files and file structure. With these concepts and terms well understood, we are ready to tackle the real problem - the fragmentation disease.

[PREVIOUS PAGE][NEXT PAGE][RETURN TO TOP][TABLE OF CONTENTS]