What We Will Cover
Elucidations
Homework Questions?
Quiz Questions?
A Note About SQL Export Compatibility
- MySQL has many versions and the later versions have newer features not present in older versions
- However, most ISPs use an older version of MySQL for compatibility with PHP 4
- PHP 5 is set up for version 4.1 and later
- PHP 4 is set up for version 4.0 or earlier
- If you have a later version of MySQL, then phpMyAdmin will show you a SQL export compatibility drop-down list
- We cannot see this in class because we are using an older version of MySQL
- However, you should see this at home
- Later in the course we will be turning in complete databases
- When this happens, you need to export with MYSQL 40 or earlier compatibility
^ top
4.1: Relational Database Fundamentals
Objectives
At the end of the lesson the student will be able to:
- Describe the fundamental characteristics of relational databases
- Choose a primary key for a relational-database table
|
^ top
4.1.1: About Relational Databases
- Relational databases are based on the relational model for database management created by Dr. E. F. Codd
- The term "relational" comes from the mathematical concept of a relation
- A relation is a set of tuples, which is commonly visualized as a table
- Relational databases are the most commonly used type of database
- Part of the popularity of relational databases is their simple data model
Relational Data Model
- In the relational model, data is organized into tables around the concept of an entity
- An entity is a "thing" in the real world: customers, addresses, products, orders, etc.
- By design, each table represents only one entity like:
Customers table
Addresses table
Products table
Orders table
- etc.
- The columns in a table are attributes of the entity
- An attribute is a property, characteristic or trait of an entity
- The attributes you chose are the data you need to store in the database
- For example, attributes of a customer might be: (id, name, address, SSN)
- A row in a table is a specific instance of an entity
- In the
Customers table: an individual customer
- If one attribute depends on another, they are linked through "key" columns
^ top
4.1.2: Primary Keys
- The relational model dictates that each row in a table be unique
- This requirements allows us to choose any row whenever we choose
- We guarantee row uniqueness by designating a primary key
Choosing a Primary Key
- Candidate keys: column(s) that contains unique values in each row of a table
- Each table can have only one primary key
- The primary key must be selected from among candidate keys
- After selecting the primary key, other candidates become alternate keys
- Keys can be simple or composite
- A simple key is made up of one field
- Composite key is made up of two or more fields
- No absolute rule for choosing a primary key from candidate keys
- Some guidelines:
- Choose a key where the type of data will not change
- Choose the key with the fewest columns
- Choose a key that is both simple and familiar to users
For Example
- Consider the following table:
| CustomerID |
LastName |
FirstName |
Address |
City |
Phone |
| 1 |
Bond |
James |
221A Baker St. |
London |
123-456 |
| 2 |
Jones |
John |
123 Lake St |
St. Paul |
123-4567 |
| 3 |
Newton |
Issac |
2345 Apple St. |
London |
123-457 |
- Which of the following columns are possibly candidate keys?
- CustomerID
- Address
- Phone
- LastName + FirstName
- LastName + FirstName + Address
- LastName + FirstName + Address + City
- LastName + FirstName + Address + City + Phone
- CustomerID + LastName + FirstName + Address + City + Phone
- What is the best choice for the primary key? Why?
^ top
4.1.3: Foreign Keys
- Now that we can choose a primary key let us consider foreign keys
- Foreign key: a column in a table that contains some of the values of a primary key in a (usually different) table
- Foreign keys provide the links we need for joining tables in a query
- From our prior example, assume we chose CustomerID as primary key
- Now we define a second table Orders as follows:
| OrderID |
CustomerID |
OrderDate |
| 1 |
1 |
2002-02-01 |
| 2 |
2 |
2002-02-01 |
| 3 |
1 |
2002-02-03 |
- CustomerID is a foreign key in Orders since it refers to a customer described in another table
- Foreign keys and primary keys must share a common meaning and data type
- For example, consider the following diagram
- What is the data type of
Customers.CustomerID? Orders.CustomerID?
- Must primary key and foreign key columns use the same name?
Primary Key and Foreign Key Diagram

^ top
4.1.4: Defining Relationships
- As stated above, we use foreign keys to link tables in a relational database
- Tables can be related in one of three ways:
One-to-Many Relationships
- This is the most common type of table relationship
- Two tables are related one-to-many (1-M) if:
- For every row in the first table
There are zero, one, or many rows in the second table
- But for every row in the second table
There is exactly one row in the first table
- For example in the following two tables:
- Each row in Orders can have multiple rows of OrderItems
- But for every row in OrderItems there is only one row in Orders
- Thus, Orders is related to OrderItems in a 1-M relationship
Orders
| OrderID |
CustomerID |
OrderDate |
| 1 |
1 |
2002-02-01 |
| 2 |
2 |
2002-02-01 |
| 3 |
1 |
2002-02-03 |
OrderItems
| OrderID |
ItemNbr |
ProductID |
PriceEach |
Quantity |
| 1 |
1 |
1 |
22.50 |
1 |
| 2 |
1 |
4 |
6.25 |
3 |
| 3 |
1 |
3 |
4.75 |
1 |
| 3 |
2 |
4 |
1.00 |
10 |
One-to-One Relationships
- Two tables are related one-to-one (1-1) if:
- For every row in the first table
- There is at most one row in the second table
- True one-to-one relationships are rare in databases
- They are sometimes used to get around some limitation of DBMS
- Security limiteations
- Performance improvement
- Exceeding the maximum number of columns allowed
(At one time, Access had a maximum of 255)
- Example useage: split a table if some columns are used more than others
(e.g. Customer -- CustConfidential)
- You probably do NOT need a 1-1 relationship in your project database tables
Many-to-Many Relationships
- Two tables are related many-to-many (M-M) if:
- For every row in the first table
There can be many rows in the second table
- And for every row in the second table
There can be many rows in the first table
- Many-to-many relationships can't be directly modeled in RDBMS
- Instead, relationships are composed of two 1-M relationships
- A third table holds the two 1-M relationships
- We look at an example a M-M relationship in the next exercise
^ top
Exercise 4.1: Database Relationships
In this exercise, we look at the relationships between tables.
Specifications
- Start a text file named exercise4.txt.
- Prepare the exercise header as described in the HowTo on submitting exercises
- Label this exercise: Exercise 4.1
- Consider the tables below and answer the following questions:
- What type of relationship exists between Orders and OrderItems?
- What type of relationship exists between Products and OrderItems?
- What type of relationship exists between Orders and Products?
Orders
| OrderID |
CustomerID |
OrderDate |
| 1 |
1 |
2002-02-01 |
| 2 |
2 |
2002-02-01 |
| 3 |
1 |
2002-02-03 |
OrderItems
| OrderID |
ItemNbr |
ProductID |
PriceEach |
Quantity |
| 1 |
1 |
1 |
22.50 |
1 |
| 2 |
1 |
4 |
6.25 |
3 |
| 3 |
1 |
3 |
4.75 |
1 |
| 3 |
2 |
4 |
1.00 |
10 |
Products
| ProductID |
ProductName |
Price |
InStock |
| 1 |
Canvas |
22.50 |
30 |
| 2 |
Brush, Big |
4.75 |
47 |
| 3 |
Brush, Small |
3.75 |
34 |
| 4 |
Oil Paint |
6.25 |
24 |
| 5 |
Acry. Paint |
6.50 |
65 |
| 6 |
Brusher, Hiliner |
4.50 |
10 |
^ top
4.2: Relational Database Design
Objectives
At the end of the lesson the student will be able to:
- Use the normalization process
- Apply normalization to designing a database
|
^ top
4.2.1: Introduction to Database Design
- In this section we look at how to design a relational database
- The benefits of databases designed with the relational model include:
- Efficient data retrieval, summarization and reporting
- Efficient data entry, updates and deletions
- Easy changes to the database design
- Predictable behavior, since the database follows a well-formulated model
How the Process Works
- The database design process starts out with an initial design
- To each table of the initial design, you apply a series of normalization steps
- Each normalization step is a test of whether your design has certain desirable characteristics
- After meeting normalization criteria, you have a design that is efficient in both memory and speed
- The efficiency and speed results from minimizing redundant data
- You may choose to "denormalize" the design if you want to trade off memory for speed, or speed for memory
^ top
4.2.2: Starting the Design Process
- Before designing, take time to analyze the system you are trying to model
- You need to examine the data and conditions of the problem
- Consider using some of the following techniques to understand the problem:
- Paraphrase the problem: write a statement about the problem in your own words
- Ask questions about the problem and try to get answers to the questions
- Draw a diagram to help visualize the problem
- Look for similarities to previous problems you or others have solved
- Separate the parts: break a big problem down into smaller problems
- You will need a good understanding of the problem to make decisions about your database design
Initial Design
- Start by thinking about your model in terms of real-world objects
- Each type of a real-world object will need its own table
- Customers have names and addresses
- Suppliers offer products and also have names and addresses
- Products have names, prices and suppliers
- Orders have customers, payment info and quantities
- Decide on a preliminary set of fields (attributes) for each table
- Each attribute has data associated with the entity
- Choose a primary key for each table
- Note that your project might be like other database projects
- If so, you can save time by starting your design from these existing designs
Refining the Design
- As we develop our design we are faced with a series of choices:
- How many tables will there be and what will they represent?
- Which columns will go into which tables?
- What relationships do we need between the tables?
- Answering these questions is part of database normalization
- Normal forms are a series of tests that you apply to your database
- Meeting these test criteria removes redundancies and structures data to prevent inconsistencies from arising
- Each higher normal form achieves a better, more efficient design
- We will cover the first three as they are the most basic forms
Foundations of the Normal Form
- A table in the relational model has the following characteristics
- Describes one entity (object)
- Has no duplicate rows, by use of a primary key
- Columns are assumed to be unordered
- Rows are assumed to be unordered
- Relational modeling requires having a primary key for every table
- Does MySQL require a primary key for every table?
^ top
4.2.3: First Normal Form
- First Normal Form (1NF) says that all attributes must be atomic
- There can exist only one value per field -- not an array or list of values
- This means that you make a separate table for each set of related values
- What are the benefits of single-valued fields?
Non-atomic Values
- The following violates 1NF -- column values are not atomic
- What is the difficulty with this arrangement of data?
Orders1
| OrderID |
CustomerID |
Items |
| 1 |
1 |
1 Brush,Big 2 Oil Paint |
| 2 |
2 |
1 Canvas |
| 3 |
1 |
1 Acry. Paint 3 Brush,Big |
- Difficult to retrieve data such as the quantity of
Oil Paint
Repeating Groups
- 1NF also prohibits repeating groups -- even if in different columns
- What are some of the difficulties with this arrangement of data?
Orders2
| OrderID |
CustomerID |
Qty1 |
Item1 |
Qty2 |
Item2 |
| 1 |
1 |
1 |
Brush,Big |
2 |
Oil Paint |
| 2 |
2 |
1 |
Canvas |
|
|
| 3 |
1 |
1 |
Acry. Paint |
3 |
Brush,Big |
- Any query would have to search all QtyX and ItemX columns
- What if more items are needed than the available groups of columns?
- When do you stop adding columns?
Achieving First Normal Form
- The following table meets the requirements of 1NF
Orders3
| OrderID |
ItemNbr |
CustomerID |
OrderDate |
Quantity |
ProductName |
| 1 |
1 |
1 |
2002-02-01 |
1 |
Canvas |
| 2 |
1 |
2 |
2002-02-01 |
3 |
Oil Paint |
| 3 |
1 |
1 |
2002-02-03 |
1 |
Brush, Small |
| 3 |
2 |
1 |
2002-02-03 |
10 |
Oil Paint |
- What is the primary key of Orders3?
^ top
4.2.4: Second Normal Form
- The second normal form only applies to tables having primary keys with multiple columns
- Requirements of the second normal form (2NF):
- The table is in 1NF
- Every non-key column is fully dependent on the entire primary key
- We want to make sure that a table contains data about only one entity (person, place, thing or event)
- If you can look up some data in your table using only part of the primary key, the table does not meet 2NF.
- The following table is in 1NF but not 2NF -- why not?
- Note that the primary key is a composite of both OrderID and ItemNbr
Orders4
| OrderID |
ItemNbr |
CustomerID |
OrderDate |
Quantity |
ProductID |
ProductName |
| 1 |
1 |
1 |
2002-02-01 |
1 |
1 |
Canvas |
| 2 |
1 |
2 |
2002-02-01 |
3 |
4 |
Oil Paint |
| 3 |
1 |
1 |
2002-02-03 |
1 |
3 |
Brush, Small |
| 3 |
2 |
1 |
2002-02-03 |
10 |
4 |
Oil Paint |
- If we use just the OrderID, can we find the CustomerID or OrderDate?
- Thus CustomerID and OrderDate do not depend on the entire primary key
- We see repeated CustomerID and OrderDate values for the same order number
- Thus, these columns are in the wrong table
- We achieve 2NF by breaking the data into two tables: Orders and OrderItems
Orders
| OrderID |
CustomerID |
OrderDate |
| 1 |
1 |
2002-02-01 |
| 2 |
2 |
2002-02-01 |
| 3 |
1 |
2002-02-03 |
OrderItems
| OrderID |
ItemNbr |
Quantity |
ProductID |
ProductName |
| 1 |
1 |
1 |
1 |
Canvas |
| 2 |
1 |
3 |
4 |
Oil Paint |
| 3 |
1 |
1 |
3 |
Brush, Small |
| 3 |
2 |
10 |
4 |
Oil Paint |
- OrderID is a foreign key in OrderItems and is used to rejoin the tables
- Always break apart tables so they can be put back together in a query
^ top
4.2.5: Third Normal Form
- Third Normal Form (3NF) requires
- Table is in 2NF
- All non-key columns are mutually independent
- You should not be able to use a non-key field to look up the value of another field
Calculated Columns
- An obvious example is a calculated column
- It is better to leave a calculated column out of a table entirely
- Saves room in the database
- Avoids having to update calculated field when other fields change
Other Dependencies
- Dependencies that aren't the result of calculations can also exist in a table
- Can you spot the dependencies in the following table?
(Hint: It shows up as redundant data)
OrderItems
| OrderID |
ItemNbr |
Quantity |
ProductID |
ProductName |
| 1 |
1 |
1 |
1 |
Canvas |
| 2 |
1 |
3 |
4 |
Oil Paint |
| 3 |
1 |
1 |
3 |
Brush, Small |
| 3 |
2 |
10 |
4 |
Oil Paint |
- Such dependencies cause problems when adding, updating, or deleting rows
- Consider if you had 1000 rows of OrderItems for the
Brush,Small:
- You have both a ProductID and ProductName for all 1000 records
- What if you change the name of the
Brush,Small to Brush,Petite?
- After you delete all your old orders, how will you know that a ProductID of 3 means
Brush,Petite?
- To correct the problems, move any column that does not depend on the primary key to another table
- In this case, put ProductID and ProductName into a separate table
- ProductID becomes the primary key of new table Products
- OrderItems has a foreign key of ProductID to relate to the primary key of Products.ProductID
- You can easily rejoin the tables using a SQL query
OrderItems
| OrderID |
ItemNbr |
ProductID |
Quantity |
| 1 |
1 |
1 |
1 |
| 2 |
1 |
4 |
3 |
| 3 |
1 |
3 |
1 |
| 3 |
2 |
4 |
10 |
Products
| ProductID |
ProductName |
| 1 |
Canvas |
| 2 |
Brush, Big |
| 3 |
Brush, Small |
| 4 |
Oil Paint |
| 5 |
Acry. Paint |
| 6 |
Brusher, Hiliner |
^ top
4.2.6: Summary
- Normalization is the process of simplifying the design of a database to an optimal structure
- Removes redundant data
- Minimizes the chances of data becoming inconsistent over time
- Achieve 1NF by:
- Making sure each data entry is single-valued (in its own row)
- Achieve 2NF by:
- Moving attributes that are not dependent on the entire primary key to another table
- Creating relationships between the new tables and the original table through the use of foreign keys
- Achieve 3NF by:
- Moving any column that does not depend on the primary key to another table
- After meeting 3NF, all non-key columns depend on the primary key, the whole primary key and nothing but the primary key.
Further Information
^ top
Exercise 4.2: Design a Database
In this exercise we design a database.
Background
Dr. Joe and Dr. Ed are running a small family practice. They have been saving patient visits information in a spreadsheet and want you to design them a database instead. Under strictest confidentiality, they have provided you with the sample data shown below.
| Patient Name |
Patient Address |
Dates Visited |
Attending Physician |
Insurance Name |
Insurance Address
|
| James Bond |
221A Baker St. |
01/1/1, 01/12/25 |
Dr. Joe, Dr. Ed |
HealthCo |
123 Big Bucks Way |
| John Jones |
123 Lake St |
01/12/30 |
Dr. Joe |
NyetCare |
234 Nopay Lane |
| Issac Newton |
2345 Apple St. |
01/2/3, 02/2/4 |
Dr. Ed, Dr. Ed |
FurtherCheck, HealthCo |
345 Seymour St, 123 Big Bucks Way |
Specifications
- Label this exercise: Exercise 4.2: Design a Database.
- Consider the above table and convert it to 3NF.
- For each table you end up with, show the design in the following form:
tablename(field1, field2, ... , fieldn)
- Underline the primary key field in each table. You may use the text convention of starting and ending the field with an underscore (
_ ) character.
tablename(_field1_, field2, ... , fieldn)
- Indicate the presence of foreign-key fields, normally indicated using a dashed underline or italics, by using an asterisk (
* ) to start and end the field.
tablename(field1, *field2*, ... , fieldn)
- Save your final schema in exercise4.txt
^ top
4.3: Catalog of Designs
Objectives
At the end of the lesson the student will be able to:
- Apply common database designs to solve their own design problems
|
- In this section, we consider several examples of database design
- Each section explains an approach to a common design problem
- Many more designs are available in the recommended book:
Data Modeling, G. Lawrence Sanders, boyd & fraser publishing company, ISBN 0-87709-066-1.
- Students are free to start their project database design from these samples
^ top
4.3.1: Auctions
- An online aution offers members the opportunity to buy and sell items
- The
Members table stores information about people buying and selling
- Includes both buyers and sellers
- The
Items table keeps track of items on the auction block
- The
Bid table links buyers with items for sale
- The
Ratings table records the ratings placed by buyers or sellers for a transaction.
Members(Username, Password, FullName, Email)
Bid(ItemID, Buyer, Price, QuantityWanted, BidTime)
Items(ID, Seller, Category, Title, Description, Quantity,
StartPrice, BidIncrement, LastBidReceived, CloseTime)
Rating(ID, ItemID, Buyer, Seller, BuyerRating,
BuyerComment, SellerRating, SellerComment)
^ top
4.3.2: Content Management
- Many web sites feature articles and news stories
- Each web page has a code that is stored in the
Pages table
- The heading for each page is stored in the
Pages table as well
- Assumes that a story will appear on only one page
- The
Stories table contains the content submitted by writers
- Which page a story appears on is controlled by the foreign key
Page
- Each contributing writer has an account in the
Writers table
- Editors use the same table
- Permissions to publish on a page is controlled by the
WritersPages table
WriterPage table used to check which writers can access which pages
- Usually an editor to prevent mayhem
Pages(PageCode, Heading)
Stories(ID, Writer, Page, Headline, StoryText, Picture,
Created, Modified, Published)
Writers(Username, Password, FullName, Email)
WritersPages(Username, PageCode)
^ top
4.3.3: Order Processing
- Internet-based e-commerce site
- Orders are placed by customers via web browsers
- The
Customers table stores information about customers
- However, we store customer addresses in the
Addresses table
- Separate table because customers can have many addresses
- Addresses are related to
Customers using the CustomerID foreign key
- We keep information about an order in the
Orders table
- The
OrderItems table stores the information for each item ordered
- Allows variable numbers of items for each
Order
- The
Products table contains a list of products items
Customers(ID, LName, FName, Email)
Addresses(ID, CustomerID, Address, City, State, Zip, Country)
Orders(ID, CustomerID, DatePlaced, Status)
OrderItems(OrderID, ItemNbr, ProductID, Quantity,
PriceCharged, Status)
Products(ID, Name, Description, Path2Image, PriceEach,
InStock, Weight)
^ top
4.3.4: Shopping Carts
- Internet-based e-commerce site
- Products are viewed by customers via web browsers
- Selected items are placed in a "shopping cart" for eventual purchase
- The
ShoppingCarts table keeps information about items selected
SessionID is a unique number assigned to track each user
- Similar to an
OrderItems table
- Do not need a separate table for the session, since it is just an identifier
ProductID provides the rest of the primary key
- The
Products table contains a list of products available
- The
SessionID is external data provided by PHP
ShoppingCarts(SessionID, ProductID, AddDate,
PriceCharged, Quantity)
Products(ID, Name, Description, Path2Image, PriceEach,
InStock, Weight)
^ top
4.3.5: Reservation Systems
- A system to allow people to reserve inventory for a period of time
- Inventory can include: equipment, rooms, seats for events, etc.
- The
Customers table stores information about people reserving inventory
- The
Inventory table stores information about each item people can reserve
- Use the
Reserves table to reserve inventory items for a period of time
Starting can be a timestamp or an integer number for day of year and the like
Duration is the number of units of time for which customers can reserve inventory
Customers(Username, Password, FullName, Email)
Reserves(Customer, InventoryID, Start, Duration, Status)
Inventory(ID, Description)
^ top
4.3.6: Role-Based Access Control
- A role-based access control subsystem
- More sophisticated access control than most projects need
- The
Users table stores information about the people who can access the system
- The
Role table stores information about each role people can play
- Use the
UserRole table to assign users to their roles
- The
Operations table stores information about each operation people can perform
- Use the
RoleOperation table to assign role-players to the operations they can perform
Users(Username, Password, FullName, Email)
UserRole(Username, RoleID)
Roles(ID, Name, Description)
RoleOperation(RoleID, OperationID)
Operations(ID, Name, Description)
^ top
Exercise 4.3: Design Questions
In this exercise we explore the use of database design patterns.
Specifications
- In your
exercise4.txt file, label this exercise: Exercise 4.3: Design Patterns
- Save your answers to the following questions in your
exercise4.txt file:
- Are database designs reuseable in another application? Why or why not?
- Which design does the physicians database of exercise 4.2 resemble?
- Which of the above designs may be usable in your project?
^ top
4.4: Proposal Presentations
- The presentation should consist of the following parts:
- Statement of the Problem
- Identification of the Expected Output
- List of the Data to Store
- Try to keep the presentation to 5 minutes or less
- Feel free to leave (or stay) after your presentation
- You can present to the instructor alone after all the public presentations are done
^ top
Wrap Up
^ top
Home
| WebCT
| Announcements
| Course info
| Expectations
| Schedule
Project
| Help
| FAQ's
| HowTo's
| Links
Last Updated: March 01 2006 @15:48:30
|