Template Presentation Support 0
×

285

Views

 1 : Relations, Functions, and Matrices Mathematical Structures for Computer Science Chapter 4 Copyright © 2006 W.H. Freeman & Co. MSCS Slides Relations, Functions and Matrices
 2 : Section 4.3 Relations and Databases 1 Database A database is a storehouse of associated information about some enterprise. The user of a database can certainly retrieve some specific fact stored in the database. A well designed database is more than simply a list of facts. The user can perform queries on the database to retrieve information not contained in any single fact. The whole becomes more than the sum of its parts. To design a useful and efficient computerized database, it is necessary to model or represent the enterprise with which the database is concerned. A conceptual model attempts to capture the important features and workings of the enterprise. Considerable interaction with those who are familiar with the enterprise may be required to obtain all the information necessary to formulate the model.
 3 : Section 4.3 Relations and Databases 2 Entity-Relationship Model One high-level representation of an enterprise is the entity-relationship model. In this model, important objects, or entities, in the enterprise are identified, together with their relevant attributes or properties. The information of the relationships between these various entities is represented graphically by an entity-relationship diagram, or E-R diagram. In an E-R diagram, rectangles denote entity sets, ellipses denote attributes, and diamonds denote relationships.
 4 : Section 4.3 Relations and Databases 3 Example: E-R Diagram The Pet Lovers of America Club (PLAC) wants to set up a database. PLAC has bought mailing lists from commercial sources, and it is interested in those people who own pets and in some basic information about those pets, such as the name, type of pet (dog, cat, etc.), and breed. The figure below shows an E-R diagram for the PLAC enterprise.
 5 : Section 4.3 Relations and Databases 4 Example: E-R Diagram This diagram says that persons and pets are the entities. Persons have the attributes of Name, Address, City, and State. Pets have the attributes of Pet-name, Pet-type, and Breed. The diagram also shows that persons own pets. Thinking of the entities as sets, the Person set and the Pet set, the relationship “owns” is a binary relation from Person to Pet The ownership relation is captured by (person, pet) ordered pairs. The “1”and “N” on the connecting lines indicate that this binary relation is one-to-many. In this particular enterprise, one person can own many pets, but no pet has multiple owners. (Pets with multiple owners would result in a many-to-many relation.) Also, in this example, some persons may own no pets, and some pets may have no owners. The fact that no pet has multiple owners is one of the “business rules” of the enterprise. Such business rules are important to identify when designing a database.
 6 : Section 4.3 Relations and Databases 5 Relational Model Another representation of an enterprise called relational model can be developed from the E-R model. Both the entity sets and the relationships of the E-R model become relations (in the mathematical sense) in the relational model. The relations are described by tables, and a relational database consists of collections of such tables. An entity set table is named for the entity set. Each row in the table contains the values of the n attributes for a specific instance of that entity set. Thus, the relational table may be thought of as a set of n-tuples (rows), and an individual row is called a tuple. True to the idea of a set, no duplicate tuples exist, and no ordering of the tuples is assumed. The ordering of the attributes is unimportant, except that consistency must be maintained; that is, each column in the table contains values for a specific attribute in all of the tuples.
 7 : Section 4.3 Relations and Databases 6 Relational Model The number of attributes (columns) is called the degree of the relation. The number of tuples (rows) is called the cardinality of the relation; it is the cardinality (in the set-theoretic sense) of the set of rows. More formally, a database relation is a subset of D1 ? D2 ? ... ? Dn, where Di is the domain from which attribute Ai takes its values. This means that the database use of the word relation is consistent with our definition of an n-ary relation on multiple sets.
 8 : Section 4.3 Relations and Databases 7 Relational Model Because there are no duplicate tuples in a relation, giving the value of all n attributes of a tuple clearly distinguishes the tuple from all others. However, there may be a minimal subset of the attributes that can be used to uniquely identify each tuple. This subset is called the primary key of the relation; if the subset consists of more than one attribute, then it is a composite primary key. In the table describing the relation, the primary key is underlined in the row of attribute names.
 9 : Section 4.3 Relations and Databases 8 Example: Relational Model The Person relation in the PLAC database might contain the following data: The four attributes for each tuple are Name, Address, City, and State. The Pet relation could be:
 10 : Section 4.3 Relations and Databases 9 Example: Relational Model Another business rule of the PLAC enterprise is that all people have unique names; therefore, Name is sufficient to identify each tuple and was chosen as the primary key in the Person relation. In the Person relation as shown in this example, State could not serve as a primary key because there are two tuples with State value “IL.” However, just because Name has unique values in this instance does not preclude the possibility of duplicate names. It is the business rule that determines that names will be unique. There is no business rule that says that addresses or cities are unique, so neither of these attributes can serve as the primary key, even though there happen to be no duplicates in the Person relation shown. The primary key in a relation involving people is often an identifying number, such as a Social Security number, which is a convenient unique attribute.
 11 : Section 4.3 Relations and Databases 10 Example: Relational Model Because Pet-name is the primary key in the Pet relation of this example, we can surmise the even more surprising business rule that in the PLAC enterprise all pets have unique names. A more realistic scenario would call for creating a unique attribute for each pet, sort of a pet Social Security number, to be used as the primary key. This key would have no counterpart in the real enterprise, so the database user would never need to see it; such a key is called a blind key.
 12 : Section 4.3 Relations and Databases 11 Relational Model No attribute of the primary key should ever have a null (empty) value. This entity integrity constraint merely confirms that each tuple must have a primary key value to distinguish that tuple and that all attribute values of the primary key are needed to identify a tuple uniquely. An attribute in one relation (called the “child” relation) may have the same domain as the primary key attribute in another relation (called the “parent” relation). Such an attribute is called a foreign key (of the child relation) into the parent relation. A relation for a relationship between entities uses foreign keys to establish connections between those entities. There will be one foreign key in the relationship relation for each entity participating in the relationship.
 13 : Section 4.3 Relations and Databases 12 Example: Relational Model The PLAC enterprise has identified the following instance of the Owns relationship: The Name attribute of Owns is a foreign key into the Person relation where Name is a primary key. Pet-name of Owns is a foreign key into the Pet relation, where Pet-name is a primary key. The first tuple establishes the Owns relationship between Bob Smith and Spot; that is, it indicates that Bob Smith owns Spot. Persons who do not own pets are not represented in Owns; nor are pets with no owners. The primary key of Owns is Pet-name. Since no pet has multiple owners, if any pet could have multiple owners, the composite primary key Name/Pet-name would have to be used.
 14 : Section 4.3 Relations and Databases 13 Operations on Relations Two unary operations that can be performed on relations are restrict and project, which can be thought of in terms of subsets. The restrict operation creates a new relation made up of those tuples of the original relation that satisfy a certain property. The restrict operation creates a subset of the rows that satisfy certain properties. The project operation creates a new relation made up of certain attributes from the original relation, eliminating any duplicate tuples. The project operation creates a subset of the columns that represent certain attributes.
 15 : Section 4.3 Relations and Databases 14 Operations on Relations Another binary operation, join, can be performed on two relations with a common attribute (column). This operation initially forms the Cartesian product of all n-tuples (rows) in the first relation with all k-tuples (rows) in the second relation. It views the result as a set of (n + k)-tuples and then restricts to the subset of those where the common attribute has the same value, writing the result as a set of (n + k ? 1)-tuples (the common attribute is written only once). Join is therefore not really a separate operation but is defined as the result of doing a Cartesian product followed by a restrict.
 16 : Section 4.3 Relations and Databases 15 Example: Unary Operations The operation Restrict Pet-owner where Pet-type “Dog” giving Dog-owner results in the relation Dog-owner: The operation Project Pet-owner over (Name, Pet-type) giving Preference results in the relation Preference:
 17 : Section 4.3 Relations and Databases 16 Example: Binary Operation The operation Join Person and Pet-owner over Name giving Listing results in the Listing relation:
 18 : Section 4.3 Relations and Databases 17 Example: Operations Restrict Pet-owner where Pet-type “Cat” giving Results1: Restrict Person where State = “IL” giving Results2:
 19 : Section 4.3 Relations and Databases 18 Example: Operations Join Results2 and Results1 over Name giving Results3: Project Results3 over Pet-name giving Final_Results:
 20 : Section 4.3 Relations and Databases 19 Relational Algebra Relational algebra is a theoretical relational database language in which the restrict, project, and join operations can be combined. The relational algebra equivalent of the sequence of operations we did to find the names of cats whose owners live in Illinois (as shown in the previous two slides) would be the statement: project (join(restrict Pet-owner where Pet-type “Cat”) and (restrict Person where State = “IL”) over Name) over Pet-name giving Final_Results
 21 : Section 4.3 Relations and Databases 20 Structured Query Language (SQL) SQL is an international standard relational database language. The query on the preceding slide would appear as the following SQL statement (lines are numbered only for discussion purposes): SELECT Pet-name FROM Pet-owner, Person WHERE Pet-owner.Name = Person.Name AND Pet-type = “Cat” AND State = “IL”; SQL’s SELECT statement can actually perform relational algebra restricts, projects, and joins, as shown here: Lines 4 and 5 represent the two restrict operations. Line 3 represents the join. Line 1 represents the project operation. Line 2, of course, identifies the tables to be used.
 22 : Section 4.3 Relations and Databases 21 Database Integrity The database will be subjected to add (add new information to database), delete (remove obsolete information), and modify (change existing information) operations. An add operation can be carried out by creating a second relation table with the new information and performing a set union of the existing table and the new table. Delete can be accomplished by creating a second relation table with the tuples to be deleted and performing a set difference that subtracts the new table from the existing table. Modify can be achieved by a delete (of the old tuple) followed by an add (of the modified tuple). These operations must be carried out so that the information in the database remains in a correct and consistent state that agrees with the business rules.
 23 : Section 4.3 Relations and Databases 22 Database Integrity Enforcing three “integrity rules” will help maintain the database integrity. Data integrity requires that the values for an attribute do indeed come from that attribute’s domain. In the PLAC enterprise example, for instance, values for the State attribute of Person must be legitimate two-letter state abbreviations (or the null value). Entity integrity requires that no component of a primary key value be null. Null is an “unknown value.” These integrity constraints clearly affect the tuples that can be added to a relation.