Saturday, November 24, 2007

Distributed Database Management Systems
Database Systems: Design, Implementation, and Management, Sixth Edition, Rob and Coronel
In this chapter, you will learn:
What a distributed database management system (DDBMS) is and what its components are
How database implementation is affected by different levels of data and process distribution
How transactions are managed in a distributed database environment
How database design is affected by the distributed database environment
The Evolution of Distributed Database Management Systems
Distributed database management system (DDBMS)
Governs storage and processing of logically related data over interconnected computer systems in which both data and processing functions are distributed among several sites
The Evolution of Distributed Database Management Systems (continued)
Centralized database required that corporate data be stored in a single central site
Dynamic business environment and centralized database’s shortcomings spawned a demand for applications based on data access from different sources at multiple locations
Centralized Database Management System
DDBMS Advantages
Data are located near “greatest demand” site
Faster data access
Faster data processing
Growth facilitation
Improved communications
Reduced operating costs
User-friendly interface
Less danger of a single-point failure
Processor independence
DDBMS Disadvantages
Complexity of management and control
Security
Lack of standards
Increased storage requirements
Greater difficulty in managing the data environment
Increased training cost
Distributed Processing Environment
Distributed Database Environment

Characteristics of Distributed Database Management Systems
Application interface
Validation
Transformation
Query optimization
Mapping
I/O interface
Formatting
Security
Backup and recovery
DB administration
Concurrency control
Transaction management
Characteristics of Distributed Database Management Systems (continued)
Must perform all the functions of a centralized DBMS
Must handle all necessary functions imposed by the distribution of data and processing
Must perform these additional functions transparently to the end user
A Fully Distributed Database Management System
DDBMS Components
Must include (at least) the following components:
Computer workstations
Network hardware and software
Communications media
Transaction processor (TP), also called application processor or transaction manager
Software component found in each computer that requests data
Data processor (DP), also called data manager
Software component residing on each computer that stores and retrieves data located at the site
May be a centralized DBMS


Distributed Database System Components



Database Systems: Levels of Data and Process Distribution



Single-Site Processing, Single-Site Data (SPSD)


All processing is done on single CPU or host computer (mainframe, midrange, or PC)
All data are stored on host computer’s local disk
Processing cannot be done on end user’s side of the system
Typical of most mainframe and midrange computer DBMSs
DBMS is located on the host computer, which is accessed by dumb terminals connected to it
Also typical of the first generation of single-user microcomputer databases


Single-Site Processing, Single-Site Data (Centralized)



Multiple-Site Processing, Single-Site Data (MPSD)


Multiple processes run on different computers sharing a single data repository
MPSD scenario requires a network file server running conventional applications that are accessed through a LAN
Many multi-user accounting applications, running under a personal computer network, fit such a description


Multiple-Site Processing, Single-Site Data



Multiple-Site Processing, Multiple-Site Data (MPMD)


Fully distributed database management system with support for multiple data processors and transaction processors at multiple sites
Classified as either homogeneous or heterogeneous
Homogeneous DDBMSs
Integrate only one type of centralized DBMS over a network


Multiple-Site Processing, Multiple-Site Data (MPMD) (continued)


Heterogeneous DDBMSs
Integrate different types of centralized DBMSs over a network
Fully heterogeneous DDBMS
Support different DBMSs that may even support different data models (relational, hierarchical, or network) running under different computer systems, such as mainframes and microcomputers


Heterogeneous Distributed Database Scenario



Distributed Database Transparency Features


Allow end user to feel like database’s only user
Features include:
Distribution transparency
Transaction transparency
Failure transparency
Performance transparency
Heterogeneity transparency


Distribution Transparency


Allows management of a physically dispersed database as though it were a centralized database
Three levels of distribution transparency are recognized:
Fragmentation transparency
Location transparency
Local mapping transparency


A Summary of Transparency Features



Fragment Locations



Transaction Transparency


Ensures database transactions will maintain distributed database’s integrity and consistency


Distributed Requests and Distributed Transactions


Distributed transaction
Can update or request data from several different remote sites on a network
Remote request
Lets a single SQL statement access data to be processed by a single remote database processor
Remote transaction
Accesses data at a single remote site


Distributed Requests and Distributed Transactions (continued)


Distributed transaction
Allows a transaction to reference several different (local or remote) DP sites
Distributed request
Lets a single SQL statement reference data located at several different local or remote DP sites
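
One way to read the four definitions above is as a classification by how many SQL statements a unit of work contains and how many DP sites each statement touches. The sketch below is only an illustration of that reading: the function name classify and the site labels are hypothetical, and it assumes every site named is remote from the requesting TP.

```python
def classify(statements):
    """Classify a unit of work by the DP sites each SQL statement touches.

    `statements` holds one set of site names per SQL statement.
    Assumption: every site named here is remote from the requesting TP.
    """
    if not statements or any(not sites for sites in statements):
        raise ValueError("each statement must reference at least one site")
    # A single statement that spans several sites is a distributed request.
    if any(len(sites) > 1 for sites in statements):
        return "distributed request"
    # Every statement touches one site; count the distinct sites overall.
    sites_touched = set().union(*statements)
    if len(sites_touched) > 1:
        return "distributed transaction"
    # Only one site is involved: one statement or several?
    return "remote request" if len(statements) == 1 else "remote transaction"


print(classify([{"B"}]))            # one statement, one site
print(classify([{"B"}, {"C"}]))     # two statements, two sites
print(classify([{"B", "C"}]))       # one statement, two sites
```

The ordering of the checks is a design choice: a statement that spans sites is reported as a distributed request even when it sits inside a larger transaction.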


A Remote Request



A Remote Transaction



A Distributed Transaction



A Distributed Request



Another Distributed Request



Distributed Concurrency Control


Multisite, multiple-process operations are much more likely to create data inconsistencies and deadlocked transactions than are single-site systems


The Effect of a Premature COMMIT



Two-Phase Commit Protocol


Distributed databases make it possible for a transaction to access data at several sites
Final COMMIT must not be issued until all sites have committed their parts of the transaction
Two-phase commit protocol requires each individual DP’s transaction log entry be written before the database fragment is actually updated
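
The protocol can be sketched in a few lines. This is a simplified, in-memory illustration and not a production implementation: the DataProcessor class, its can_commit flag, and the two_phase_commit function are hypothetical names, and network failures, timeouts, and recovery from the log are left out.

```python
from dataclasses import dataclass, field


@dataclass
class DataProcessor:
    """Hypothetical DP site holding one database fragment."""
    name: str
    can_commit: bool = True  # simulates whether this site is able to commit
    log: list = field(default_factory=list)
    fragment: dict = field(default_factory=dict)
    pending: dict = field(default_factory=dict)

    def prepare(self, updates):
        # Phase 1: write the log entry before the fragment is touched.
        if not self.can_commit:
            self.log.append(("VOTE_ABORT", dict(updates)))
            return False
        self.log.append(("PREPARED", dict(updates)))
        self.pending = dict(updates)
        return True

    def commit(self):
        # Phase 2: only now is the fragment actually updated.
        self.fragment.update(self.pending)
        self.log.append(("COMMITTED", dict(self.pending)))
        self.pending = {}

    def abort(self):
        self.log.append(("ABORTED", dict(self.pending)))
        self.pending = {}


def two_phase_commit(work):
    """Coordinator: `work` is a list of (DataProcessor, updates) pairs."""
    votes = [dp.prepare(updates) for dp, updates in work]
    if all(votes):
        for dp, _ in work:
            dp.commit()
        return True
    for dp, _ in work:
        dp.abort()
    return False


site_a, site_b = DataProcessor("A"), DataProcessor("B", can_commit=False)
print(two_phase_commit([(site_a, {"x": 1}), (site_b, {"y": 2})]))
print(site_a.fragment, site_b.fragment)
```

Because site B votes to abort, neither fragment is changed, which is the situation a premature COMMIT at site A would have broken.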


Performance Transparency and Query Optimization


Objective of query optimization routine is to minimize total cost associated with the execution of a request
Costs associated with a request are a function of the:
Access time (I/O) cost
Communication cost
CPU time cost
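
As a rough illustration, the total cost of a candidate execution plan can be treated as the sum of those three components, with the optimizer picking the cheapest plan. The Plan class, the plan names, and the cost figures below are hypothetical, and a plain unweighted sum is an assumption; a real optimizer may weight or estimate the components differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """Hypothetical execution plan with estimated costs in arbitrary units."""
    name: str
    io_cost: float
    comm_cost: float
    cpu_cost: float


def total_cost(plan):
    # Assumption: the three cost components are simply added together.
    return plan.io_cost + plan.comm_cost + plan.cpu_cost


def cheapest(plans):
    return min(plans, key=total_cost)


plans = [
    Plan("ship whole table to requesting site", io_cost=40, comm_cost=120, cpu_cost=10),
    Plan("filter at remote site, ship result", io_cost=45, comm_cost=15, cpu_cost=12),
]
best = cheapest(plans)
print(best.name, total_cost(best))
```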


Performance Transparency and Query Optimization (continued)


Must provide distribution transparency as well as replica transparency
Replica transparency:
DDBMS’s ability to hide the existence of multiple copies of data from the user
Query optimization techniques:
Manual or automatic
Static or dynamic
Statistically based or rule-based algorithms


Distributed Database Design


Data fragmentation:
How to partition the database into fragments
Data replication:
Which fragments to replicate
Data allocation:
Where to locate those fragments and replicas


Data Fragmentation


Breaks single object into two or more segments or fragments
Each fragment can be stored at any site over a computer network
Information about data fragmentation is stored in the distributed data catalog (DDC), from which it is accessed by the TP to process user requests


Data Fragmentation Strategies


Horizontal fragmentation:
Division of a relation into subsets (fragments) of tuples (rows)
Vertical fragmentation:
Division of a relation into attribute (column) subsets
Mixed fragmentation:
Combination of horizontal and vertical strategies
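
The three strategies can be illustrated on a small in-memory table. The rows, column names, and function names below are made up for the example and are not the textbook's sample data; each vertical fragment keeps the key column so the original rows can be rebuilt by joining on it.

```python
# Hypothetical rows; column names and values are illustrative only.
CUSTOMER = [
    {"CUS_NUM": 10, "CUS_NAME": "Alpha", "CUS_STATE": "TN", "CUS_LIMIT": 3500},
    {"CUS_NUM": 11, "CUS_NAME": "Beta", "CUS_STATE": "FL", "CUS_LIMIT": 6000},
    {"CUS_NUM": 12, "CUS_NAME": "Gamma", "CUS_STATE": "TN", "CUS_LIMIT": 4000},
    {"CUS_NUM": 13, "CUS_NAME": "Delta", "CUS_STATE": "GA", "CUS_LIMIT": 1200},
]


def horizontal_fragments(rows, column):
    """Split rows into subsets, one fragment per distinct value of `column`."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[column], []).append(row)
    return fragments


def vertical_fragment(rows, key, columns):
    """Keep only `columns`, plus the key needed to rejoin the fragments."""
    return [{key: row[key], **{c: row[c] for c in columns}} for row in rows]


def mixed_fragments(rows, column, key, columns):
    """Horizontal split first, then a vertical cut of each subset."""
    return {
        value: vertical_fragment(fragment, key, columns)
        for value, fragment in horizontal_fragments(rows, column).items()
    }


by_state = horizontal_fragments(CUSTOMER, "CUS_STATE")
print(sorted(by_state))
print(vertical_fragment(CUSTOMER, "CUS_NUM", ["CUS_NAME"])[0])
print(mixed_fragments(CUSTOMER, "CUS_STATE", "CUS_NUM", ["CUS_LIMIT"])["GA"])
```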


A Sample CUSTOMER Table



Horizontal Fragmentation of the CUSTOMER Table by State



Table Fragments in Three Locations



Vertically Fragmented Table Contents



Mixed Fragmentation of the CUSTOMER Table



Data Replication


Storage of data copies at multiple sites served by a computer network
Fragment copies can be stored at several sites to serve specific information requirements
Can enhance data availability and response time
Can help to reduce communication and total query costs


Table Contents After the Mixed Fragmentation Process



Data Replication



Replication Scenarios


Fully replicated database:
Stores multiple copies of each database fragment at multiple sites
Can be impractical due to amount of overhead
Partially replicated database:
Stores multiple copies of some database fragments at multiple sites
Most DDBMSs are able to handle the partially replicated database well
Unreplicated database:
Stores each database fragment at a single site
No duplicate database fragments
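
Given a record of which sites hold each fragment, the three scenarios can be told apart by counting copies. The sketch below assumes a hypothetical allocation mapping from fragment name to the set of sites storing it; the fragment and site names are invented for the example.

```python
def replication_scenario(allocation):
    """Classify an allocation mapping: fragment name -> set of sites holding it."""
    if not allocation or any(not sites for sites in allocation.values()):
        raise ValueError("every fragment must be stored at one site or more")
    copies = [len(sites) for sites in allocation.values()]
    if all(n == 1 for n in copies):
        return "unreplicated"
    if all(n > 1 for n in copies):
        return "fully replicated"
    return "partially replicated"


print(replication_scenario({"A1": {"S1"}, "A2": {"S2"}}))
print(replication_scenario({"A1": {"S1", "S2"}, "A2": {"S2"}}))
print(replication_scenario({"A1": {"S1", "S2"}, "A2": {"S2", "S3"}}))
```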


Data Allocation


Deciding where to locate data
Allocation strategies:
Centralized data allocation
Entire database is stored at one site
Partitioned data allocation
Database is divided into several disjoint parts (fragments) that are stored at several sites
Replicated data allocation
Copies of one or more database fragments are stored at several sites
Data distribution over a computer network is achieved through data partition, data replication, or a combination of both


Client/Server vs. DDBMS


Way in which computers interact to form a system
Features a user of resources, or a client, and a provider of resources, or a server
Can be used to implement a DBMS in which the client is the TP and the server is the DP


Client/Server Advantages


Less expensive than alternative minicomputer or mainframe solutions
Allows end users to use the microcomputer’s GUI, thereby improving functionality and simplicity
More people with PC skills than with mainframe skills in the job market
PC is well established in the workplace
Numerous data analysis and query tools exist to facilitate interaction with DBMSs available in the PC market
Considerable cost advantage to offloading applications development from the mainframe to powerful PCs


Client/Server Disadvantages


Creates a more complex environment, in which different platforms (LANs, operating systems, and so on) are often difficult to manage
An increase in the number of users and processing sites often paves the way for security problems
Spreading data access to a much wider circle of users increases the demand for people with broad knowledge of computers and software
The added training burden increases the cost of maintaining the environment


C. J. Date’s Twelve Commandments for Distributed Databases


Local site independence
Central site independence
Failure independence
Location transparency
Fragmentation transparency
Replication transparency
Distributed query processing
Distributed transaction processing
Hardware independence
Operating system independence
Network independence
Database independence


Remote DB Request Admin



Summary


Distributed database stores logically related data in two or more physically independent sites connected via a computer network
Database is divided into fragments
Distributed databases require distributed processing
Main components of a DDBMS are the transaction processor and the data processor


Summary (continued)


Current database systems can be classified by extent to which they support processing and data distribution
DDBMS characteristics are best described as a set of transparencies
A transaction is formed by one or more database requests
A database can be replicated over several different sites on a computer network
Client/server architecture refers to the way in which two computers interact over a computer network to form a system
 
Thanks
