CSc 4402 Introduction to Database Management Systems

Fall 2007

 

Course: Mon & Wed 3:40-5:00 PM  Coates 102

Instructor: Donald H. Kraft     Office: 286 Coates      Phone: 578-2253

Office Hours: Mon & Wed 2:00-3:30 pm . Tues-Thurs 3:00:-4:30 pm

Email: kraft@csc.lsu.edu         Web: http://www.csc.lsu.edu/~kraft

Grader: Anindya Poddar         Email: apodda2@lsu.edu

Text: A. Silberschatz, H.F. Korth, and S. Sudarshan. Database System Concepts, 5th edition, McGraw-Hill, 2007 (http://codex.cs.yale.edu/avi/db-book/)

Reference: L. Welling and L. Thomson. PHP and MySQL Web Development, 3rd edition,

Developers Library, 2004

Additional References: C.J. Date, An Introduction to Database Systems, 8th edition,

Addison-Wesley, 2004

Credits: 3 hours                                                                      Prerequisites: CSc 3102

Course Catalog Description: Network, hierarchical, and relational, and entity-relationship models; data definition, manipulation languages, and conversion among these models; relational database design theory, efficient query evaluation; elementary query optimization techniques.

Course Outcomes: Master the basic concepts and appreciate the applications of database systems; Master the basics of SQL and construct queries using SQL; Be familiar with a commercial relational database system (Access) by writing SQL using the system; Be familiar with the relational database theory, and be able to write relational algebra expressions for queries; Master sound design principles for logical design of databases, including the E-R method and normalization approaches; Be familiar with basic database storage structures and access techniques – file and page organizations and indexing methods including B-trees and hashing; Master the basics of query evaluation techniques and query optimization; Be familiar with the basic issues of transaction processing and concurrency control; and Master working successfully on a team to design and develop a database application (optional).

Topics:

            Basic Concepts                                                                                    Chap. 1

Basic terminology, Notion of a database (DB) system, DBMS, DBA, Data independence (physical and logical), Data abstraction, Advantages of a DB, Data models (e.g., hierarchical, E-R model, relational model), Data storage and query processing, DB system architecture (3-level ANSI/Sparc), Languages (DDL, DML, SQL), Design

            Relational Model                                                                                  Chap. 2

Data structures, Domains, Attributes, Relations, Schema, Instances, Primary keys, Candidate keys, Foreign keys, Properties of relations, Formal theory underlying relational database systems, Relational data structure, Relational data manipulation, Relational data integrity (entity, referential), Relational algebra (RA, expressions)

            SQL Language                                                                         Chaps. 3-4

                        Data definition, Data manipulation, Select statement, Simple queries, Join

queries, Aggregate functions, Subqueries, Set-theoretic operators (group

by, having), Similarity and Differences (views versus base tables), Update, Delete, Insert, Data types, Integrity constraints (check constraints, assertions, authorizations), Embedded SQL, Dynamic SQL

           

PHP                                                                        Welling-Thomson, Chaps. 1,2

MySQL                                                                  Welling-Thomson, Chaps. 9-12

Security                                                                   Welling-Thomson, Chaps. 15,22

Database Design                                                                                   Chaps. 6-7

Design (approaches, pitfalls, redundancy, incompleteness, choices) E-R modeling for DB design (Is-A hierarchy), Cardinality constraints, E-R diagrams, Functional dependencies (fd, Armstrong’s axioms, logical implications), Normalization (normal forms, algorithms to go from 1NF to 2NF to 3NF to BCNF), Examples, Logical design (properties of a good or bad design, Attribute closure, Canonical covers, LLJ and fd-preserving decomposition, BCNF decomposition

            Transaction Processing (Management) and

Concurrency Control                                                                Chaps. 15-16

Notion of transactions; ACID properties of transactions, Concurrent

Schedules, Serializability, Conflict serializable and view serializable schedules, Use precedence graph, Concurrency control, Locking Protocols, waiting compatibility matrix, Two-phase locking protocol, Dead-lo0ck detection and recovery, Wait-for graphs

            Storage and Query Processing                                                  Chaps. 11-14

Storage structures, Storage hierarchy, Raid file organization (fixed-length records), Indexing, B-trees, Dense vs. non-dense indexes, Hashing; Query processing, Measures of query processing costs, Implementation of select and join operators, Query optimization (relational algebra level, RA operator level, implementation level), Transformation of RA expressions

Grading:

            Homework (7 homeworks)                                                                               15%

                        Hard copy if possible, if late up to 3 days then lose 10 points/day

            Program Assignments (5)                                                                                  30%

                        Submit source code and output, can submit electronically, if late

                                    up to 5 days then lose 10 points/day

            Paper (databases in society issues (see below)                                      10%

            Midterm examination                                                                                        20%

            Final examination - Monday, December 10 5:30-7:30 PM                            20%

            Participation                                                                                                      5%

Approximate Grading Scale:

For undergraduate students: 85-100 A, 75-84 B, 65-74 C, 55-64 D, below 55 F.  For graduate students: 90-100 A, 80-89 B, 70-79 C, 60-69 D, below 60 F.

Paper – all students must write a short (ten pages or so, but do not take that number literally), research paper on some aspect of computers and databases in society (e.g., privacy, security, ethics).  This should be along the lines of a bibliographic essay, describing a concept and relaying the state-of-the-art in terms of the social informatics aspects of the topic.  One should not provide a tutorial, but should focus on research and what the literature is saying.  One should not just concatenate several abstracts from papers nor rely solely on textbooks or the web; in fact, the use of journal articles, technical reports, and conference proceedings papers is encouraged.  One should weave a pattern of what is going on with the topic selected. One should also list all references consulted in the paper.  Remember that copying entire sections, especially without citing them, of your consulted articles is plagiarism and is academically and intellectually dishonest – violators will be prosecuted.     The choice of topics must be approved by the instructor by the end of the third week of the semester.    The paper is due by the end of the twelfth week of the semester.  Students taking this course for graduate credit must write a second paper on some other technical aspect of databases (e.g., fuzziness in databases, normalization beyond BCNF) that focuses upon research issues in database management systems topics.

Assignments - the due dates for all homeworks (http://www.csc.lsu.edu/~kraft/courses/csc4402.homeworks.html, each graded as 100 points) are considered absolute - however, late submissions are accepted at the cost of 10 points per day up to 3 days (holiday and weekend counts).  Hard-copies are preferred for homework submissions, but electronic submissions are accepted. No make-ups will be given for exams. If a student missed the mid-term exam for a justified reason (illness, family emergency, etc.), then the student's final exam grade will account for 80 percent of the student's grade for the entire course.  Students are responsible for knowing all information presented in class and materials assigned for self-reading.  The students are strongly encouraged to ask questions in and out of class.  It is recommended that students keep copies of all the submitted materials.  It is important that students observe the student conduct code in doing works in this course.  Each homework assignment and exam must be done by each individual student independently; "team work" for the examinations (midterm and final) and homework is not allowed.  For the project (http://www.csc.lsu.edu/~kraft/courses/csc4402.project.html, program assignments), each team (usually consisting of 3-5 students) has to complete the project together but independently of other teams.  Self study through reading the material BEFORE (and after) class and completing all the homeworks and project is very important for learning in this course.

Good luck!