Jennifer started off the course by providing some useful characteristics of a Database Management System (‘DBMS’):
“Efficient, reliable, , convenient and safe multi-user storage of and access to massive amounts of persistent data”
- Massive scale – Jennifer talked about databases with the potential for terabytes of data.
- Persistent – The database always outlives the programme that will operate on its data.
- Safe – Power or hardware failures for examples shouldn’t affect the database.
- Multi-user – Jennifer talked about concurrency control in this respect.
- Convenient – The convenience of DBMS comes for a large part from ‘physical data independence’, i.e. operations on the data are independent of the way in which the data is laid out.
- Efficient – This relates to database performance, e.g. thousands of queries/updates per second.
- Reliable – Jennifer pointed out that with databases 99.999999% up-time is critical.
Jennifer then went on to break down the structure of a database into to its most basic elements, something which I actually found quite helpful:
- Database is a set of named relations (or tables)
- Each relation has a set of named attributes (or columns)
- Each tuple (or row) has a value for each attribute
- Each attribute has a type (or domain)
- A database schema is a structural description of relations in a database
- An instance contains the actual contents at a given point in time
- “Null” is a special value for “unknown” or “undefined”
- A key is an attribute whose value is unique in each tuple (e.g. a student ID) or a set of attributes whose combined values are unique
- Having unique keys helps to (1) identify specific tuples (columns) (2) fast indexing of the database and (3) refers to tuples of another key
In the session, Jennifer gave a first flavour of how to create relations (table) in SQL, which is commonly used database programming language. I found below example which Jennifer gave to be very helpful:
Create Table Student (ID, name, GPA, photo)
Create Table College (name string*, state char*, enrolment integer*) * These are all attribute types
She then went on to talk a bit about querying relational databases, outlining common steps in creating and using a (relational) database:
- Design schema – Create a schema which describes the relations within a database, using a Data Definition Language (‘DDL’)
- “Bulk load” initial data – Load the initial data into the database
- Repeat – Execute queries and modifications
Jennifer then finished off by giving an example of a SQL query that returns the follow relation: “IDs of students with GPA > 3.7 applying to Stanford”. In SQL this query would look something like this:
Select Student ID
From Student, Apply
Where Student ID = Apply ID
And GPA > 3.7 and college = ‘Stanford”
Main learning point: I’ve clearly got my work cut out for me, but I felt that this first mini-course on relational data models and databases was incredibly helpful and easy to understand. I’m hoping I can build on these first foundations and understand more about how to best structure SQL queries and interpret the relations that these queries return.