Computer Science
Database Concept

DATABASE CONCEPTS: COMPLETE STUDY NOTES FOR CUET (COMPUTER SCIENCE)


1. Introduction to Data and Database Concepts

In today’s digital era, we are living in a connected world where information is produced, exchanged, and traced across the globe in real time. 


Data refers to unorganised facts that can be processed to generate a meaningful result or information. 


For example, a person’s name, age, and contact details are items of data. While single data points might not be useful alone, their collection and analysis are crucial for decision-making in fields like education, banking, and governance.


A database is a collection of related tables that store this data systematically. To manage these databases, we use a Relational Database Management System (RDBMS)


Software such as MySQL, Oracle, and SQL Server are popular RDBMS tools that allow users to create, retrieve, and manipulate data using a specialised query language.


2. Difference Between Database and File System

Before the advent of modern databases, data was managed using file systems. While both systems store data on secondary storage media like Hard Disks or SSDs, they differ significantly in efficiency and management.


  • Access Method: In a file system, one has to write specific application programs to access or manipulate data. In a DBMS, we use Structured Query Language (SQL), which consists of descriptive English words and is much easier to learn.


  • Data Retrieval: In a file system, the user must define how to get the data. In a DBMS, the user simply specifies what is to be retrieved, and the system handles the process.


  • Structure and Consistency: File processing has certain limitations regarding data redundancy (duplication) and inconsistency, which are effectively overcome by a DBMS.


  • Relationship Management: A DBMS allows for the creation of "relations" (tables) that can be linked to one another, ensuring that data across different entities (like students and their guardians) remains synchronised.


3. The Relational Data Model

The Relational Data Model is the most widely used model for database management. It organises data into structured tables, known as relations.


3.1 Key Terminology

  1. Relation: A table in a database is called a relation. It is a collection of rows and columns.

  2. Attribute: Each column in a relation represents a particular parameter or characteristic, called an attribute. For example, in a STUDENT table, RollNumber, SName, and SDateofBirth are attributes.

  3. Tuple: Each row in a relation is known as a tuple or a record. A tuple represents a single observation for different attributes.

  4. Domain: The Data Type of an attribute indicates the type of value it can hold, which effectively defines its domain. For instance, an attribute with the INT data type can only hold numeric values, while VARCHAR holds strings of variable length.

  5. Degree: The number of columns (attributes) in a table defines the degree of that relation.

  6. Cardinality: The number of rows (tuples) in a relation is called its cardinality.


4. Database Keys

Keys are essential in the relational model because they ensure that each record can be uniquely identified and that relationships between tables are maintained correctly.


4.1 Candidate Key

Although not explicitly defined as a standalone term in the summary snippets, a candidate key is any attribute (or set of attributes) that has the potential to uniquely identify a row in a table. From the set of candidate keys, the database designer selects the most appropriate one to be the Primary Key.


4.2 Primary Key

The Primary Key is the column (or set of columns) that uniquely identifies each row/record in a table.

  • Uniqueness: No two rows can have the same primary key value.


  • Non-Nullability: By default, primary key columns cannot have NULL values.


  • Composite Primary Key: Sometimes, a single attribute is not enough to identify a row uniquely. In such cases, a primary key is made up of multiple attributes, called a composite key. For example, in an ATTENDANCE table, the combination of AttendanceDate and RollNumber forms a composite primary key.


4.3 Alternate Key

Any candidate key that is not chosen as the Primary Key is known as an Alternate Key. These provide an alternative way to identify records if the primary key is not being used.


4.4 Foreign Key

A Foreign Key is a column that refers to a primary key in another table.

  • Linking Tables: It establishes a logical link between the "referencing" table and the "referenced" table.

  • Constraints: While populating a table with a foreign key, the designer must ensure the records in the referenced table already exist.

  • Example: In a STUDENT table, the attribute GUID (Guardian ID) is a foreign key that refers to the GUID primary key in the GUARDIAN table.


5. Relational Algebra

Relational algebra consists of operations that work on one or more relations to produce a new relation as a result. These operations form the mathematical basis for SQL queries.


5.1 Selection (σ)

The selection operation is used to filter rows that meet a specific condition. In SQL, this is achieved using the WHERE clause.

  • Example: Selecting employees from department 'D01' only.


5.2 Projection (π)

Projection is the process of selecting specific columns (attributes) from a relation while discarding others.

  • Example: A query that retrieves only the EmpNo and Ename of employees rather than the entire table.


5.3 Union ()

The union operation combines the selected rows of two tables.

  • Rules for Union: Both relations must have the same number of attributes, and corresponding attributes must have the same domain.

  • Duplicate Handling: If some rows are the same in both tables, the result of the Union will show those rows only once.

  • Example: Combining a list of students in a DANCE group and a MUSIC group to get a list of students participating in either event.


5.4 Set Difference (Minus —)

This operation is used to get tuples that are present in the first table but not in the second table.

  • Example: To find students who are only in MUSIC and not in DANCE, you would perform MUSIC — DANCE.


5.5 Intersection ()

The intersection operation is used to find the common tuples between two tables.

  • Example: To find students participating in both DANCE and MUSIC, you would use DANCE $\cap$ MUSIC.


5.6 Cartesian Product (X)

The Cartesian product combines every tuple from the first relation with every tuple from the second relation.

  • Resulting Degree: The sum of the degrees of both relations.

  • Resulting Cardinality: The product of the cardinalities of both relations.

  • Example: If the DANCE table has 4 rows and the MUSIC table has 5 rows, the Cartesian product will result in 4×5=20 rows. This operation is used when a query involves more than one table in the FROM clause.


6. SQL Command Categories for Database Management

To implement these concepts, SQL commands are classified into two main types:

  1. Data Definition Language (DDL): Used for defining, modifying, and deleting the structure (schema) of the database.

    • CREATE DATABASE: Creates a new database container.

    • CREATE TABLE: Defines attributes, data types, and constraints for a new table.

    • ALTER TABLE: Used to add/remove columns or constraints (like Primary or Foreign keys) after a table is created.

    • DROP: Permanently removes a table or database from the system.

  2. Data Manipulation Language (DML): Used for managing the data stored within the relations.

    • INSERT INTO: Populates the table with new records.

    • SELECT: The primary command for retrieving data and performing algebra-like operations.

    • UPDATE: Modifies existing data in a table.

    • DELETE: Removes specific records from a table.


7. Summary Table: Relational Model Components

Concept

Description

SQL Implementation Example

Relation

The entire table

CREATE TABLE STUDENT (...)

Attribute

A column (e.g., RollNumber)

RollNumber INT

Tuple

A single row/record

One set of student details

Degree

Total number of columns

Count of attributes in DESC STUDENT

Cardinality

Total number of rows

Result of SELECT COUNT(*) FROM table

Primary Key

Unique, Non-NULL identifier

PRIMARY KEY (RollNumber)

Foreign Key

References another table

FOREIGN KEY (GUID) REFERENCES GUARDIAN(GUID)