File Structures
File structure or organisation refers to the relationship of the key of the record to the physical location of that record in the computer file.
Disk Storage
Databases must be stored physically as files of records, which are typically stored on some computer storage medium. The DBMS software can then retrieve, update and process this data as needed.
Several aspects of storage media must be taken into account
- Speed with which data can be accessed
- Cost per unit of data
- Reliability
- Data loss on power failure or system crash
- Physical failure of the storage device
So, we can differentiate storage into
Volatile Storage
Loses its contents when power is switched off.
Non-volatile Storage
Contents persist even when power is switched off.
Categories of Computer Storage Media
Computer storage media form a storage hierarchy that includes two main categories.
- Primary Storage This category includes storage media that can be operated on directly by the computer's Central Processing Unit (CPU), such as the computer's main memory and smaller but faster cache memories. Primary storage usually provides fast access to data but is of limited storage capacity.
- Secondary Storage This category includes magnetic disks, optical disks and tapes. These devices usually have a larger capacity, cost less and provide slower access to data than primary storage devices do. Data in secondary storage cannot be processed directly by the CPU.
Characteristics of Secondary Storage Devices
- Random access versus sequential access
- Read-write, write-once, read-only
- Character versus block data access
Storage of Databases
Databases need data to be stored permanently, or persistently, over long periods of time. The cost of storage per unit of data is an order of magnitude less for disk secondary storage than for primary storage.
Magnetic Hard Disk Mechanism
Magnetic disks are used for storing large amounts of data. The most basic unit of data on the disk is a single bit of information. All disks are made of magnetic material shaped as a thin circular disk and protected by a plastic or acrylic cover. A disk is single-sided if it stores information on only one of its surfaces and double-sided if both surfaces are used.
Key Points
- To increase storage capacity, disks are assembled into a disk pack, which may include many disks.
- Information is stored on a disk surface in concentric circles; each circle is called a track.
- The disk surfaces are called platter surfaces.
Read-write Head
- Positioned very close to the platter surface (almost touching it).
- Reads or writes magnetically encoded information.
Each Track is Divided into Sectors
- A sector is the smallest unit of data that can be read or written.
- Sector size is typically 512 bytes.
- Typical sectors per track: 500 (on inner tracks) to 1000 (on outer tracks).
Performance Measures on Disks
Access Time The time it takes from when a read or write request is issued to when data transfer begins. It consists of
- Seek time Time it takes to reposition the arm over the correct track. Average seek time is 1/2 of the worst case seek time.
- Rotational latency Time it takes for the sector to be accessed to appear under the head. Average latency is 1/2 of the worst case latency.
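The two components above can be combined into a rough estimate of average access time. The following is a minimal sketch, not a vendor formula: the function name `average_access_time_ms` and the sample figures (16 ms worst-case seek, 7200 rpm) are illustrative assumptions, and the worst-case rotational latency is assumed to be one full rotation.

```python
def average_access_time_ms(worst_seek_ms: float, rpm: float) -> float:
    """Estimate average access time as average seek + average rotational latency."""
    # Average seek time is taken as half of the worst-case seek time.
    avg_seek_ms = worst_seek_ms / 2
    # Assumption: the worst-case latency is one full rotation of the platter.
    full_rotation_ms = 60_000 / rpm
    # Average latency is taken as half of the worst-case latency.
    avg_latency_ms = full_rotation_ms / 2
    return avg_seek_ms + avg_latency_ms


# Hypothetical drive: 16 ms worst-case seek, 7200 rpm spindle speed.
print(round(average_access_time_ms(worst_seek_ms=16.0, rpm=7200), 2))  # 12.17
```

With these assumed figures the seek contributes 8 ms and the rotation about 4.17 ms, which shows why both components matter for random access.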
Data Transfer Rate The rate at which data can be retrieved from or stored to the disk.
Mean Time To Failure (MTTF) The average time the disk is expected to run continuously without any failure.
Optimization of Disk-Block Access
Block A contiguous sequence of sectors from a single track. Data is transferred between disk and main memory in blocks. Sizes range from 512 bytes to several kilobytes. The choice of block size involves a trade-off
- Smaller blocks mean more transfers from disk.
- Larger blocks mean more space wasted due to partially filled blocks.
Disk-arm Scheduling These algorithms order pending accesses to tracks so that disk arm movement is minimized.
Elevator Algorithm In this algorithm, the disk arm moves in one direction (from outer to inner tracks or vice versa), processing the next request in that direction until there are no more requests in that direction, then reverses direction and repeats.
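The elevator ordering described above can be sketched for a fixed set of pending requests. This is a simplified illustration: the function name `elevator_order` and the sample track numbers are assumptions, and requests that arrive while the arm is sweeping are not modelled.

```python
def elevator_order(pending, head, direction=1):
    """Return the order in which pending track requests would be served.

    pending   -- iterable of requested track numbers
    head      -- track where the disk arm currently sits
    direction -- +1 to start toward higher track numbers, -1 toward lower
    """
    # Requests lying in the current direction of travel (including the head's track).
    ahead = sorted(t for t in pending if (t - head) * direction >= 0)
    # Requests on the other side, served after the arm reverses.
    behind = sorted(t for t in pending if (t - head) * direction < 0)
    if direction > 0:
        # Sweep upward in increasing order, then come back down.
        return ahead + behind[::-1]
    # Sweep downward in decreasing order, then go back up.
    return ahead[::-1] + behind


# Hypothetical queue of track requests with the arm at track 50, moving upward.
print(elevator_order([82, 17, 43, 140, 24, 16, 190], head=50))
# [82, 140, 190, 43, 24, 17, 16]
```

Each request is served in a single sweep out and a single sweep back, so the arm never zigzags between distant tracks.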
RAID (Redundant Arrays of Independent Disks)
The choice of disk structure is very important in databases. Important factors, besides price, are
(i) Capacity (ii) Speed (iii) Reliability
RAID is a disk organisation technique that manages a large number of disks, providing a view of a single disk of
- High capacity and high speed by using multiple disks in parallel and
- High reliability by storing data redundantly so that data can be recovered even if a disk fails.
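The reliability point can be illustrated with mirroring, one simple form of redundancy in which every block is written to two disks. The sketch below is a toy model only: the class `MirroredPair`, its methods and the in-memory dictionaries standing in for disks are all hypothetical, and real RAID systems use a variety of other layouts.

```python
class MirroredPair:
    """Toy model of two disks that hold identical copies of every block."""

    def __init__(self):
        # Each "disk" is a dict mapping block number -> block contents.
        self.disks = [{}, {}]
        self.failed = [False, False]

    def write(self, block_no, data):
        # Store the block redundantly on every disk that is still working.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                disk[block_no] = data

    def fail(self, i):
        # Simulate the loss of disk i together with everything stored on it.
        self.failed[i] = True
        self.disks[i].clear()

    def read(self, block_no):
        # Serve the read from the first surviving disk.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                return disk[block_no]
        raise IOError("both disks have failed")


pair = MirroredPair()
pair.write(7, b"payroll record")
pair.fail(0)                 # one disk is lost
print(pair.read(7))          # the data is still recoverable from the mirror
```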
Buffer
It is a portion of main memory that is available to store copies of disk blocks. When several blocks need to be transferred from disk to main memory and all the block addresses are known, several buffers can be reserved in main memory to speed up the transfer: while one buffer is being read or written, the CPU can process data in the other buffer.
Buffer Manager
It is a subsystem which is responsible for allocating buffer space in main memory.
Programs call on the buffer manager when they need a block from disk.
(i) If the block is already in the buffer, the buffer manager returns the address of the block in main memory.
(ii) If the block is not in the buffer, the buffer manager
- Allocates space in the buffer for the block, replacing (throwing out) some other block if required to make space for the new block.
- Writes the replaced block back to disk only if it was modified since the most recent time that it was written to or fetched from the disk.
- Reads the block from the disk into the buffer and returns the address of the block in main memory to the requester.
Buffer Replacement Policies Most operating systems replace the block that is Least Recently Used (LRU strategy).
Most Recently Used (MRU) Strategy The system must pin the block currently being processed. After the final tuple of that block has been processed, the block is unpinned and it becomes the most recently used block.
Pinned Block Memory block that is not allowed to be written back to disk.
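The buffer manager behaviour described in this section can be sketched as a small in-memory model. This is an illustrative sketch under stated assumptions, not how any particular DBMS implements it: the class `BufferManager` and its method names are hypothetical, a Python dict stands in for the disk, block contents are returned instead of a memory address, and only the LRU policy is shown (not MRU).

```python
from collections import OrderedDict


class BufferManager:
    """Toy buffer pool with LRU replacement, dirty write-back and pinning."""

    def __init__(self, capacity, disk):
        self.capacity = capacity      # number of blocks the buffer can hold
        self.disk = disk              # stand-in for the disk: block number -> contents
        self.frames = OrderedDict()   # buffered blocks, least recently used first
        self.dirty = set()            # blocks modified since they were fetched
        self.pinned = set()           # blocks that must not be replaced

    def get_block(self, block_no):
        if block_no in self.frames:
            # Block already buffered: mark it as most recently used.
            self.frames.move_to_end(block_no)
            return self.frames[block_no]
        if len(self.frames) >= self.capacity:
            self._evict()
        # Read the block from "disk" into the buffer.
        self.frames[block_no] = self.disk[block_no]
        return self.frames[block_no]

    def write_block(self, block_no, contents):
        self.get_block(block_no)      # make sure the block is buffered
        self.frames[block_no] = contents
        self.dirty.add(block_no)

    def pin(self, block_no):
        self.pinned.add(block_no)

    def unpin(self, block_no):
        self.pinned.discard(block_no)

    def _evict(self):
        # Pick the least recently used block that is not pinned.
        for victim in self.frames:
            if victim not in self.pinned:
                break
        else:
            raise RuntimeError("all buffered blocks are pinned")
        # Write the victim back only if it was modified.
        if victim in self.dirty:
            self.disk[victim] = self.frames[victim]
            self.dirty.discard(victim)
        del self.frames[victim]


disk = {1: "a", 2: "b", 3: "c"}
bm = BufferManager(capacity=2, disk=disk)
bm.get_block(1)
bm.write_block(2, "B")   # block 2 is now dirty
bm.get_block(1)          # block 1 becomes most recently used
bm.get_block(3)          # evicts block 2 and writes it back first
print(disk[2])           # B
```

An `OrderedDict` is used here because it keeps blocks in access order, so the least recently used block is always the first key.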