Relational Algebra

Reading Time: 5 minutes

Relational algebra, first described by E.F. Codd while at IBM, is a family of algebra with a well-founded semantics used for modelling the data stored in relational databases, and defining queries on it.

To organize the data, first the redundant data and repeating groups of data are removed, which we call normalized. By doing this the data is organized or normalized into what is called first normal form (1NF). Typically a logical data model documents and standardizes the relationships between data entities (with its elements). A primary key uniquely identifies an instance of an entity, also known as a record.

Once the data is normalized and in sets of data (entities and tables), the main operations of the relational algebra can be performed which are the set operations (such as union, intersection, and cartesian product), selection (keeping only some rows of a table) and the projection (keeping only some columns). Set operations are performed in the where statement in SQL, which is where one set of data is related to another set of data.

The main application of relational algebra is providing a theoretical foundation for relational databases, particularly query languages for such databases, chief among which is SQL.

Definition adopted from Vikipedia

Contents

Note on this section

I’ve derived my notes from the book Database Systems – A Practical Approach to Design Implementation, and Management by Thomas Conolly, Carolyn Begg while taking Database Systems module at my University.

Also I’ve found this website http://138.232.66.66/ra/calc.htm which allows you to run native relational algebra queries. I found this web application very useful and easy to use. On the other hand there were other tools which offer you almost nothing and trouble. For each section, I will provide example queries and screenshots from this application. There are three databases available in this application. Rather than those examples, you can always explore more on this web site to advance your skills on the side. For your information this application has been prepared I assume by German students, for that reason naming conversions are in German. Do not get confused I produce outputs in English.

Selection – σ

The selection operation works on a single relation R and defines a relation that contains only those tuples of R that satisfy the specified condition (Predicate)

Example

List marks which are higher than 27000

σ Note > 1 (pruefen)

selection

 

Projection – π

The projection operation works on a single relation R and defines a relation that contains a vertical subset of R, extracting the values of specified attributes and eliminating duplicates

Example

Produce a list of assistants with the details of Personnel Number, Name, Subject and Boss information

π PersNr, Name, Fachgebiet, Boss (Assistenten)

projection

 

Union – ∪

The union of two relations R and S defines a relation that contains all the tuples of R or S or both R and S, duplicate tuples being eliminated. R and S must be union-compatible.

Example

List all students who have taken a test

π MatrNr (Studenten) ∪ π MatrNr (pruefen)

union

 

Set Difference – –

The set difference operation defines a relation consisting of the tuples that are in relation R, but not in S. R and S must be union compatible

Example

List all students who have taken a test but not listed in student table

π MatrNr (Studenten) - π MatrNr (pruefen)

setdifference

 

Intersection – ∩

The intersection operation defines a relation consisting of the set of all tuples that are in both R and S. R and S must be union-compatible. A different way of carrying out this objective is similar to this mathematical formula R∩S = R – (R-s)

Example

List all students who have taken a test and recorded in student table as well

π MatrNr (Studenten) ∩ π MatrNr (pruefen)

intersection

 

 

Cartesian Product – X

The Cartesian product operation defines a relation that is the concatenation of every tuple of relation R with every tuple relation S. The Cartesian product operation multiples two relations to define another relation consisting of all possible pairs of tuples from the two relations. Therefore if one relation has I tuples and N attributes and the other has J tuples and M attributes, the Cartesian product relation will contain (I*J) tuples with (N+M) attributes

Example

Produce a list of Students and exams

σ Studenten.MatrNr = pruefen.MatrNr (π MatrNr, Name, Semester (Studenten)) ⨯ (πMatrNr, VorlNr, PersNr, Note(pruefen))

Theata Join – θ

The theta join defines a relation that contains tuples satisfying the predicate F from the Cartesian product of R and S. The predicate F is of the form Rai, θ Sb1 where θ may be one of the comparison operators.

Example

(π d, b (T)) ⨝ T.d = S.d (π d, b (S))

thetajoin

 

Natural Join – ⨝

The natural joib is an Equi(theta) join of two relations R and S over all common atrributes x. One occurrence of each common attribute is eliminated from the result

Example

(π d, b (T)) ⨝ (π d, b (S))

naturaljoin

Outer/Left Join – ⋊

The (left) outer join is a join in which tuples from R that do not have matching values in the common attributes of S are also included in the result relation. Missing values in the second relation are set to null.

Example

(π d, b (T)) ⨝ T.d = S.d (π d, b (S))

outerjoin

 

Semi Join – ▷

The semi join operation defines a relation that contains the tuples of R that participate in the join of R with S satisfying the predicate F

Example

(π d, b (T)) ▷ (π d, b (S))

semijoin