Disk Array Storage Terminology

To make the subsequent chapters in this book easier to read, here are some essential disk array storage terms. To keep the chapters compact, detailed technical explanations are not provided.

SCSI:
Short for Small Computer System Interface. It was initially developed in 1979 as an interface technology for minicomputers, but as computer technology advanced it was fully ported to regular PCs.

ATA (AT Attachment):
Also known as IDE, this interface was designed to connect drives with integrated controllers directly to the bus of the AT computer introduced in 1984. The “AT” in ATA comes from the AT computer, which was the first to use the 16-bit ISA bus.

Serial ATA (SATA):
It employs serial data transfer, transmitting only one bit of data per clock cycle. ATA hard drives traditionally used parallel transfer, which is susceptible to signal interference and can affect system stability during high-speed data transfer. SATA resolves this issue by using serial transfer over a cable with only four signal wires.

NAS (Network Attached Storage):
It connects storage devices to a group of computers using a standard network topology such as Ethernet. NAS is a component-level storage method aimed at addressing the growing need for increased storage capacity in workgroups and department-level organizations.

DAS (Direct Attached Storage):
It refers to connecting storage devices directly to a computer through SCSI or Fibre Channel interfaces. DAS products include storage devices and integrated simple servers that can perform all functions related to file access and management.

SAN (Storage Area Network):
It connects storage devices to a group of computers through Fibre Channel. SAN provides multi-host connectivity but does not use standard network topologies. SAN focuses on addressing specific storage-related issues in enterprise-level environments and is primarily used in high-capacity storage environments.

Array:
It refers to a disk system composed of multiple disks that work in parallel. A RAID controller combines multiple disks into an array using its SCSI channel. Note that disks designated as hot spares cannot be added to an array.

Array Spanning:
It involves combining the storage space of two, three, or four disk arrays to create a logical drive with a continuous storage space. RAID controllers can span multiple arrays, but each array must have the same number of disks and the same RAID level. For example, RAID 1, RAID 3, and RAID 5 can be spanned to form RAID 10, RAID 30, and RAID 50, respectively.
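The spanning rules above can be sketched in a few lines of Python. This is only an illustration, not any controller's actual logic: the `DiskArray` class and the `span` function are hypothetical names, all disks are assumed to be the same size, and the usable-capacity formulas are the usual ones (half the disks for RAID 1, all but one disk for RAID 3 and RAID 5).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiskArray:
    """Hypothetical description of one array (not a real controller API)."""
    raid_level: int    # 1, 3, or 5 in this sketch
    disk_count: int
    disk_size_gb: int  # assumption: every disk in the array has this size

    def usable_gb(self) -> int:
        if self.raid_level == 1:
            # Mirroring: half of the raw capacity holds copies.
            return self.disk_count * self.disk_size_gb // 2
        if self.raid_level in (3, 5):
            # One disk's worth of capacity holds parity.
            return (self.disk_count - 1) * self.disk_size_gb
        raise ValueError(f"RAID {self.raid_level} is not handled in this sketch")


def span(arrays):
    """Return (spanned RAID level, usable capacity in GB) for 2 to 4 arrays."""
    if not 2 <= len(arrays) <= 4:
        raise ValueError("spanning combines two, three, or four arrays")
    first = arrays[0]
    for other in arrays[1:]:
        if other.raid_level != first.raid_level:
            raise ValueError("all arrays must use the same RAID level")
        if other.disk_count != first.disk_count:
            raise ValueError("all arrays must have the same number of disks")
    # RAID 1 -> RAID 10, RAID 3 -> RAID 30, RAID 5 -> RAID 50
    return first.raid_level * 10, sum(a.usable_gb() for a in arrays)


level, capacity = span([DiskArray(5, 4, 100), DiskArray(5, 4, 100)])
print(f"RAID {level}, {capacity} GB usable")
```

Two four-disk RAID 5 arrays of 100 GB disks each contribute 300 GB, so the spanned RAID 50 logical drive in this example offers 600 GB.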

Cache Policy:
It refers to the caching strategy of a RAID controller, which can be either Cached I/O or Direct I/O. With Cached I/O, the controller applies its read and write policies and usually caches data on reads. With Direct I/O, new data is read directly from the disk; only when a data unit is accessed repeatedly does the controller apply a moderate read policy and cache that data. In fully random read scenarios, no data is cached.
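The Direct I/O behavior can be pictured with a small Python sketch. It is an assumption-laden toy model, not firmware logic: the `DirectIOReader` class is hypothetical, a dictionary stands in for the disk, and "repeatedly accessed" is taken to mean "read at least `repeat_threshold` times", a value chosen here only for illustration.

```python
class DirectIOReader:
    """Toy model of Direct I/O: cache a block only once it is read repeatedly."""

    def __init__(self, disk, repeat_threshold=2):
        self.disk = disk                  # dict: block number -> bytes (stand-in for a disk)
        self.repeat_threshold = repeat_threshold  # assumed value, not from any specification
        self.read_counts = {}             # how often each block was read from disk
        self.cache = {}                   # blocks promoted to the cache
        self.disk_reads = 0               # number of reads that reached the disk

    def read(self, block):
        if block in self.cache:
            return self.cache[block]      # served from the cache, no disk access
        self.disk_reads += 1
        data = self.disk[block]
        count = self.read_counts.get(block, 0) + 1
        self.read_counts[block] = count
        if count >= self.repeat_threshold:
            self.cache[block] = data      # repeated access: start caching this block
        return data


reader = DirectIOReader({0: b"alpha", 1: b"beta"})
for _ in range(3):
    reader.read(0)   # the third read of block 0 is a cache hit
reader.read(1)       # a block read only once is never cached
print(reader.disk_reads, sorted(reader.cache))
```

With purely random reads, no block reaches the threshold, so nothing is cached, which matches the last sentence of the definition.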

Capacity Expansion:
When the virtual capacity option is enabled in the RAID controller’s quick configuration utility, the controller establishes virtual disk space, so that physical disks added later can expand into that virtual space through reconstruction. Reconstruction can only be performed on a single logical drive within a single array, and online expansion cannot be used in a spanned array.

Channel:
It is an electrical path used to transfer data and control information between a disk and a disk controller.

Format:
It is the process of writing zeros on all data areas of a physical disk (hard drive). Formatting is a purely physical operation that also involves consistency checking of the disk medium and marking unreadable and bad sectors. Since most hard drives are already formatted at the factory, formatting is only necessary when disk errors occur.

Hot Spare:
Hot sparing means that when a currently active disk fails, an idle, powered-on spare disk immediately takes its place. Hot spare disks do not store any user data, and up to eight disks can be designated as hot spares. A hot spare disk can be dedicated to a single redundant array or belong to a hot spare pool shared by all arrays. When a disk failure occurs, the controller firmware automatically replaces the failed disk with a hot spare disk and reconstructs the data from the failed disk onto it. Data can only be rebuilt for a redundant logical drive (that is, any RAID level other than RAID 0), and the hot spare disk must have sufficient capacity. The system administrator can then replace the failed disk and designate the replacement disk as the new hot spare.
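The conditions for using a hot spare can be summarized in a short Python sketch. The `Spare` class and `pick_spare` function are hypothetical, and the preference order (a dedicated spare before a pooled one, then the smallest disk that is large enough) is an assumption made for illustration; real controller firmware may choose differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Spare:
    """Hypothetical hot spare record (not a real controller data structure)."""
    name: str
    size_gb: int
    dedicated_to: Optional[str] = None  # array name, or None for the shared pool


def pick_spare(spares, array_name, raid_level, failed_size_gb):
    """Return a spare suitable for rebuilding the failed disk, or None."""
    if raid_level == 0:
        return None  # RAID 0 has no redundancy, so there is nothing to rebuild from
    candidates = [
        s for s in spares
        if s.size_gb >= failed_size_gb                 # must have sufficient capacity
        and s.dedicated_to in (array_name, None)       # dedicated to this array, or pooled
    ]
    if not candidates:
        return None
    # Assumed preference: dedicated spares first, then the smallest adequate disk.
    return min(candidates, key=lambda s: (s.dedicated_to is None, s.size_gb))


spares = [
    Spare("spare-1", 500),
    Spare("spare-2", 300, dedicated_to="A"),
    Spare("spare-3", 200, dedicated_to="B"),
]
chosen = pick_spare(spares, "A", raid_level=5, failed_size_gb=250)
print(chosen.name if chosen else "no suitable hot spare")
```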

Hot Swap Disk Module:
Hot swap mode allows system administrators to replace a failed disk drive without shutting down the server or interrupting network services. Since all power and cable connections are integrated on the server’s backplane, hot swapping is straightforward: the failed disk is removed from its drive cage slot, and the replacement hot swap disk is inserted into the same slot. Hot swap technology only works in configurations of RAID 1, 3, 5, 10, 30, and 50.

I2O (Intelligent Input/Output):
I2O is an industry-standard architecture for input/output subsystems that is independent of the network operating system and does not require support from external devices. I2O uses driver programs that can be divided into Operating System Services Modules (OSMs) and Hardware Device Modules (HDMs).

Initialization:
It is the process of writing zeros on the data area of a logical drive and generating corresponding parity bits to bring the logical drive into a ready state. Initialization deletes previous data and generates parity, so a logical drive undergoes consistency checking during this process. An array that has not been initialized is not usable because it hasn’t generated parity yet and will result in consistency check errors.

IOP (I/O Processor):
The I/O Processor is the command center of a RAID controller, responsible for command processing, data transfer on PCI and SCSI buses, RAID processing, disk drive reconstruction, cache management, and error recovery.

Logical Drive:
It refers to a virtual drive in an array that can occupy more than one physical disk. Logical drives divide the disks in an array or a spanned array into continuous storage spaces distributed across all the disks in the array. A RAID controller can set up to 8 logical drives of different capacities, with at least one logical drive required per array. Input/output operations can only be performed when a logical drive is online.

Logical Volume:
It is a virtual disk formed by logical drives, also known as disk partitions.

Mirroring:
It is a type of redundancy where data on one disk is mirrored on another disk. RAID 1 and RAID 10 use mirroring.

Parity:
In data storage and transmission, parity involves adding an extra bit to a byte to check for errors. More generally, it generates redundant data from two or more sets of original data, and that redundant data can be used to rebuild any one of the original sets. However, parity data is not an exact copy of the original data.

In RAID, this method can be applied to all disk drives in an array. Parity can be stored on an additional disk (dedicated parity) or distributed across all disks in the system (distributed parity). If a disk fails, the data on the failed disk can be rebuilt using the data from the other disks and the parity data.
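A common way to compute such parity is a bytewise XOR across the data blocks, and the rebuild step can be shown in a few lines of Python. This is a minimal sketch under that assumption: the helper name `xor_blocks` and the block contents are made up for illustration, and real controllers operate on whole stripes rather than three-byte blocks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, one per data disk (illustrative contents).
data = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xff\x00\x0f"]

# The parity block is redundant data derived from all three blocks.
parity = xor_blocks(data)

# Simulate losing the second disk, then rebuild it from the survivors plus parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])
```

The parity block is not a copy of any data block, yet XOR-ing it with the surviving blocks yields the missing one, because each surviving block cancels itself out of the parity.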


Post time: Jul-12-2023