CIS502 Database Management Systems


<< back to courseware demo page
- This online lecture is for demonstration purposes

 

Lecture One

Introduction to Databases
 
 

About This Lecture
 

We will cover various topics including: 

- Concepts of DBMS 

- Database properties

- Characteristics of the Database Approach versus traditional file Approach
 
 
 
 
 
 

 

Lecture Menu

About this Lecture

Learning Objectives

Introduction

Characteristics of Database Approach

Benefits of Database Approach

Summary

Review Questions

Practice Test

Required Readings

Learning Objectives

After completing this lecture, you will be proficient in key aspects of the competency expectation including: 

  • Explaining what a database is and why database systems are desirable. 
  • Articulating the difference between relational database systems and others. 
Introduction

Databases and database technology are having a major impact on the growing use of computers. 

A database system is basically just a computerized record-keeping system.   The database itself can be regarded as a kind of electronic filing cabinet; i.e., it is a repository or container for a collection of computerized data files. 

Users of the system can perform a variety of operations on such files—for example: 

• Adding new, empty files to the database;

• Inserting data into existing files;

• Retrieving data from existing files;

• Changing data in existing files;

• Deleting data from existing files;

• Removing existing files from the database.

It is fair to say that databases play a critical role in almost all areas where computers are used, including business, engineering, medicine, law, education, and library science, to name a few. 


Figure 1: Example of Database


Figure 2: Retrieval example


Figure 3: Insert/ Change/ Delete examples
 

A database has the following implicit properties

• A database is a logically coherent collection of data with some inherent meaning. A random assortment of data cannot correctly be referred to as a database. 

• A database has some source from which data are derived, some degree of interaction with events in the real world, and an audience that is actively interested in the contents of the database. 

• A database can be of any size and of varying complexity. For example, the list of names and addresses referred to earlier may consist of only a few hundred records, each with a simple structure. 

• On the other hand, the card catalog of a large library may contain half a million cards stored under different categories—by primary author’s last name, by subject, by book title—with each category organized in alphabetic order. 

• A database may be generated and maintained manually or it may be computerized. 
The library card catalog is an example of a database that may be created and maintained manually. 

• A computerized database may be created and maintained either by a group of application programs written specifically for that task or by a database management system. 

• A database management system (DBMS) is a collection of programs that enables users to create and maintain a database. 

• The DBMS is hence a general-purpose software system that facilitates the processes of defining, constructing, and manipulating databases for various applications. 

• Defining a database involves specifying the data types, structures, and constraints for the data to be stored in the database. 

• Constructing the database is the process of storing the data itself on some storage medium that is controlled by the DBMS. 

• Manipulating a database includes such functions as querying the database to retrieve specific data, updating the database to reflect changes in the miniworld, and generating reports from the data. 

• It is not necessary to use general-purpose DBMS software to implement a computerized database. 

• We could write our own set of programs to create and maintain the database, in effect creating our own special-purpose DBMS software. 

• In either case—whether we use a general-purpose DBMS or not—we usually have to employ a considerable amount of software to manipulate the database. 

• We will call the database and DBMS software together a database system. 
 


Figure 4: A simplified database system environment.
 

Example of database:

• A UNIVERSITY database for maintaining information concerning students, courses, and grades in a university environment. 

• Figure 5 shows the database structure and a few sample data for such a database. 
 


Figure 5: An example of the students database
 

• The database is organized as five files, each of which stores data records of the same type.

• To define this database, we must specify the structure of the records of each file by specifying the different types of data elements to be stored in each record. 

• We must also specify a data type for each data element within a record. 

• For example, we can specify that Name of STUDENT is a string of alphabetic characters, StudentNumber of STUDENT is an integer, and Grade of GRADE_REPORT is a single character from the set {A, B, C, D, F, I}. 

• We may also use a coding scheme to represent a data item. For example, we represent the Class of a STUDENT as 1 for freshman, 2 for sophomore, 3 for junior, 4 for senior, and 5 for graduate student. 

• To construct the UNIVERSITY database, we store data to represent each student, course, section, grade report, and prerequisite as a record in the appropriate file. 

• Notice that records in the various files may be related. For example, the record for "Smith" in the STUDENT file is related to two records in the GRADE_REPORT file that specify Smith’s grades in two sections. 

• Similarly, each record in the PREREQUISITE file relates two course records: one representing the course and the other representing the prerequisite. 

• Most medium-size and large databases include many types of records and have many relationships among the records. 

• Database manipulation involves querying and updating. Examples of queries are "retrieve the transcript—a list of all courses and grades—of Smith"; "list the names of students who took the section of the Database course offered in fall 1999 and their grades in that section"; and "what are the prerequisites of the Database course?" 

• Examples of updates are "change the class of Smith to Sophomore"; "create a new section for the Database course for this semester"; and "enter a grade of A for Smith in the Database section of last semester." 
 

Note: These informal queries and updates must be specified precisely in the database system language before they can be processed. 
 
 
 
 
 
 

Characteristics of the Database Approach

• A number of characteristics distinguish the database approach from the traditional approach of programming with files. 

• In traditional file processing, each user defines and implements the files needed for a specific application as part of programming the application. 

• The redundancy in defining and storing data results in wasted storage space and in redundant efforts to maintain common data up-to-date. 

• In the database approach, a single repository of data is maintained that is defined once and then is accessed by various users. 

• The main characteristics of the database approach versus the file-processing approach are the following. 
 

Self-Describing Nature of a Database System:

• A fundamental characteristic of the database approach is that the database system contains not only the database itself but also a complete definition or description of the database structure and constraints. 

• This definition is stored in the system catalog, which contains information such as the structure of each file, the type and storage format of each data item, and various constraints on the data. 

• The information stored in the catalog is called meta-data, and it describes the structure of the primary database (Figure 1).

• The catalog is used by the DBMS software and also by database users who need information about the database structure. 

• A general purpose DBMS software package is not written for a specific database application, and hence it must refer to the catalog to know the structure of the files in a specific database, such as the type and format of data it will access. 

• The DBMS software must work equally well with any number of database applications—for example, a university database, a banking database, or a company database—as long as the database definition is stored in the catalog. 

• In traditional file processing, data definition is typically part of the application programs themselves. Hence, these programs are constrained to work with only one specific database, whose structure is declared in the application programs. 

• For example, a PASCAL program may have record structures declared in it; a C++ program may have "struct" or "class" declarations; and a COBOL program has Data Division statements to define its files. 

• Whereas file-processing software can access only specific databases, DBMS software can access diverse databases by extracting the database definitions from the catalog and then using these definitions. 

• The DBMS stores in the catalog the definitions of all the files shown. Whenever a request is made to access, say, the Name of a STUDENT record, the DBMS software refers to the catalog to determine the structure of the STUDENT file and the position and size of the Name data item within a STUDENT record. 

• By contrast, in a typical file-processing application, the file structure and, in the extreme case, the exact location of Name within a STUDENT record are already coded within each program that accesses this data item. 
 
 
 

Benefits of the Database Approach

• The data can be shared

• Redundancy can be reduced

• Inconsistency can be avoided (to some extent)

• Transaction support can be provided

• Integrity can be maintained

• Security can be enforced

• Conflicting requirements can be balanced

• Standards can be enforced
 

Support of Multiple Views of the Data:

• A database typically has many users, each of whom may require a different perspective or view of the database. 

• A view may be a subset of the database or it may contain virtual data that is derived from the database files but is not explicitly stored. 

• Some users may not need to be aware of whether the data they refer to is stored or derived. 

• A multiuser DBMS whose users have a variety of applications must provide facilities for defining multiple views. 

• For example, one user of the database of Figure 2 may be interested only in the transcript of each student; the view for this user is shown in Figure 4(a). 

• A second user, who is interested only in checking that students have taken all the prerequisites of each course they register for, may require the view shown in Figure 4(b).


Figure 6
 

Sharing of Data and Multiuser Transaction Processing:

• A multiuser DBMS, as its name implies, must allow multiple users to access the database at the same time. 

• This is essential if data for multiple applications is to be integrated and maintained in a single database. 

• The DBMS must include concurrency control software to ensure that several users trying to update the same data do so in a controlled manner so that the result of the updates is correct. 

• For example, when several reservation clerks try to assign a seat on an airline flight, the DBMS should ensure that each seat can be accessed by only one clerk at a time for assignment to a passenger. 

• These types of applications are generally called on-line transaction processing (OLTP) applications. A fundamental role of multiuser DBMS software is to ensure that concurrent transactions operate correctly. 

 

Summary

In this lecture we defined a database as a collection of related data, where data means recorded facts. 

• A typical database represents some aspect of the real world and is used for specific purposes by one or more groups of users. 

• A DBMS is a generalized software package for implementing and maintaining a computerized database. 

• The database and software together form a database system. 

• We identified several characteristics that distinguish the database approach from traditional file-processing applications: 

•Existence of a catalog. 
•Program-data independence and program-operation independence. 

•Data abstraction. 

•Support of multiple user views. 

•Sharing of data among multiple transactions. 
Review Questions 

1. Define the following terms: Database, DBMS, database Manipulation, and DBA.
 

2. What are the advantages of using a database system?
 

3. What are the disadvantages of using a database system?
 
 

ANSWERS:

3. Some disadvantages are as follows:
 

• Security might be compromised (without good controls);

• Integrity might be compromised (without good controls); 

• Additional hardware might be required;

• Performance overhead might be significant.

Practice Test 

1. Show the effects of the following SQL update operations on the wine cellar database of Figure 1:
a. SELECT WINE, PRODUCER FROM   CELLAR WHERE BIN# = 72 ;

b. SELECT WINE, PRODUCER FROM   CELLAR WHERE YEAR > 1996 ;

c. SELECT BIN#, WINE, YEAR FROM   CELLAR WHERE READY < 1999 ;

d. SELECT WINE, BIN#, YEAR FROM   CELLAR WHERE PRODUCER = 'Robt. Mondavi' AND    BOTTLES > 6 ;

e. INSERT INTO   CELLAR ( BIN#, WINE, PRODUCER, YEAR, BOTTLES, READY ) VALUES ( 80, 'Syrah', 'Meridian', 1994, 12, 1999 ) ;

f. DELETE FROM   CELLAR WHERE READY > 2000 ;

g.  UPDATE CELLAR SET   BOTTLES = 5 WHERE BIN# = 50 ;

h. UPDATE CELLAR SET   BOTTLES = BOTTLES + 2 WHERE BIN# = 50 ;

2. Give some additional views that may be needed by other user groups for the database shown in Figure 5. 
 
 
 

ANSWERS:

e. A row for bin number 80 is added to the CELLAR table. 

f. The rows for bin numbers 45,48,64, and 72 are deleted from the CELLAR table. 

g. The row for bin number 50 has the number of bottles set to 5. 

h. Same as g.
 
 

2. (a) A view that groups all the students who took each section and gives each student's grade. This may be useful for printing the grade report for each section for the university administration's use.

2. (b) A view that gives the number of courses taken and the GPA (grade point average) for each student. This may be used to determine honors students.

 

Required Readings

Textbook:

                   Chapter 1, “Databases and Database Users”

Elmasri & Navathe
Fundamentals of Database Systems (3rd Ed.)
Addison Wesley 2000

ISBN: 0-8053-1755-4