Database Management Systems
Unit 1: Introduction to Database Systems
From file-system chaos to relational elegance โ understand why databases power every โน transaction, every Aadhaar lookup, and every Swiggy order you place.
๐ข Oracle & PostgreSQL | ๐ 15 MCQs (Bloom's) | ๐ฌ 5 Lab Exercises | ๐ผ Interview Prep
Why This Chapter Pays Your Salary
Every application you've ever used โ Swiggy, PhonePe, IRCTC, Instagram โ is a pretty front-end sitting on top of a database. The front-end is what users see. The database is where the money lives. If the database goes down, the business goes down. Database skills are the highest-paid, most durable skills in IT. Oracle DBAs at Indian banks earn โน18-35 LPA. Understanding databases is not optional โ it's the foundation of every software system.
๐ข Industry Snapshot
SBI (State Bank of India) โ Operates one of Asia's largest Oracle databases. Their Core Banking Solution (CBS) processes 100+ crore transactions monthly across 22,000 branches. Every rupee passing through India's largest bank touches Oracle DB.
UIDAI (Aadhaar) โ Stores biometric + demographic data for 1.4 billion Indians. The world's largest biometric database uses a combination of Oracle, PostgreSQL, and custom NoSQL stores. 10+ billion authentication transactions since launch.
Flipkart โ Product catalog (50 crore+ items), customer data, order history, seller inventory โ all managed by a combination of PostgreSQL, MySQL, and Redis. Their database architecture handles 10 lakh+ orders on Big Billion Days.
Learning Outcomes โ Bloom's Taxonomy
| Bloom's Level | Outcome Statement |
|---|---|
| L1 โ Remember | List the limitations of file-based systems, name DBMS components, and recall the properties of a relation |
| L2 โ Understand | Explain the three-schema architecture (ANSI/SPARC), differentiate between logical and physical data independence, and describe ER model concepts |
| L3 โ Apply | Draw ER diagrams for given business domains with correct notation for entities, attributes, relationships, cardinality, and participation |
| L4 โ Analyze | Compare data models (hierarchical, network, relational, document) and analyze when to use each; identify candidate keys and superkeys from functional dependencies |
| L5 โ Evaluate | Justify the choice of relational model over alternatives for a given Indian enterprise scenario; critique an ER diagram for design flaws |
| L6 โ Create | Design a complete ER diagram and map it to a relational schema with all key types for a multi-module business system (hospital, e-commerce, banking) |
Concept Explanations
3.1 Purpose of Database Systems โ Why Files Aren't Enough
๐ From File System Chaos to Database Order
A Database Management System (DBMS) is software that stores, retrieves, and manages data in a structured, efficient, and secure way. It was invented to solve the 7 deadly problems of file-based data storage.
๐ REAL-WORLD ANALOGYImagine a hospital where patient records are kept in paper files in individual doctor's cabins (file system). Each doctor has their own copies. Patient X visits 3 doctors โ now there are 3 copies of their allergy information. One doctor updates it, the others don't know. A DBMS is like a centralized digital system (AIIMS HIS) โ one source of truth, accessible by all authorized doctors, always consistent.
โ๏ธ THE 7 PROBLEMS THAT CREATED THE NEED FOR DBMS| # | Problem | File System Example | DBMS Solution |
|---|---|---|---|
| 1 | Data Redundancy | Student's name stored in fees.csv, library.csv, hostel.csv โ 3 copies | Single students table; other tables reference it via student_id (foreign key) |
| 2 | Data Inconsistency | Student changes phone number โ updated in fees.csv but not in hostel.csv | One update in students table instantly consistent everywhere |
| 3 | Data Isolation | Data scattered across different file formats (CSV, Excel, text) โ hard to cross-query | All data in unified tables; SQL joins combine any tables instantly |
| 4 | Integrity Problems | No way to enforce "account balance โฅ 0" across all files | CHECK (balance >= 0) constraint enforced by the DBMS automatically |
| 5 | Atomicity Problems | Bank transfer: debit succeeds, credit fails (power cut) โ money vanishes | ACID transactions: both succeed or both rollback. Zero money loss. |
| 6 | Concurrent Access | Two clerks editing same file simultaneously โ one's changes get overwritten | Locking + MVCC: concurrent access without data corruption |
| 7 | Security Problems | Any user with file access can read/modify everything โ no fine-grained control | GRANT/REVOKE: control who can SELECT, INSERT, UPDATE, DELETE on each table |
IRCTC handled 25 crore+ bookings last year. Imagine managing this with flat files โ every concurrent booking, every waitlist update, every cancellation. File-based systems would collapse in minutes. IRCTC uses Oracle RDBMS to handle 25 million+ daily queries with ACID guarantee: your ticket is confirmed OR your money is refunded โ never lost in between.
3.2 Components of a DBMS
People Components
| User Type | Role | Indian Industry Example |
|---|---|---|
| Database Administrator (DBA) | Installs, configures, tunes, backs up, and secures the database. The "doctor" of the database. | Oracle DBA at SBI managing Core Banking DB (โน18-35 LPA) |
| Application Programmer | Writes code that interacts with the DB via SQL/PL-SQL or ORM frameworks | Java developer at TCS writing Spring Boot APIs that query Oracle |
| End User (Sophisticated) | Writes SQL queries directly โ analysts, data scientists | Data analyst at Flipkart writing SQL to analyze sales trends |
| Naรฏve/Casual User | Uses apps without knowing SQL โ clicks buttons that trigger queries | You booking a ticket on IRCTC app โ the app generates SQL behind the scenes |
System Components
DBMS Architecture
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ APPLICATION (Swiggy App) โ
โ โ SQL queries โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ QUERY PROCESSOR โ
โ โโโโโโโโโโโโ โโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโโโโ โ
โ โ Parser โโโ Optimizer โโโ Execution Engine โ โ
โ โ(syntax) โ โ(best plan) โ โ(runs the plan) โ โ
โ โโโโโโโโโโโโ โโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโโโโ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ STORAGE MANAGER โ
โ โโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโ โ
โ โAuthorization โ โ Buffer โ โ File โ โ
โ โ& Integrity โ โ Manager โ โ Manager โ โ
โ โManager โ โ(RAM cache) โ โ(disk I/O) โ โ
โ โโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ Transaction Manager (ACID enforcement) โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ DISK STORAGE โ
โ Data files โ Index files โ Log files โ Data dictionary โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
| Component | Function |
|---|---|
| Query Processor | Parses SQL, checks syntax/semantics, optimizes (choose best join order, index usage), and executes the query plan |
| Storage Manager | Manages how data is stored on disk, reads/writes data blocks, handles file organization |
| Buffer Manager | Maintains a RAM cache of frequently accessed data blocks โ reduces slow disk I/O. This is why DBAs tune buffer pool size. |
| Transaction Manager | Ensures ACID properties. Manages locks, undo/redo logs, commit/rollback operations. |
| Authorization Manager | Checks user permissions before executing any operation. Enforces integrity constraints. |
3.3 Applications of DBMS in India
| Domain | Organization | DBMS Used | What It Stores |
|---|---|---|---|
| Banking | SBI (Core Banking) | Oracle 19c | 48 crore+ accounts, transactions, loans, KYC, fraud detection |
| Government | UIDAI (Aadhaar) | Oracle + PostgreSQL | 1.4 billion biometric records, demographic data, authentication logs |
| Railways | IRCTC | Oracle | PNR records, train schedules, seat maps, waitlists, payments |
| E-Commerce | Flipkart | PostgreSQL + MySQL | 50 crore+ products, sellers, orders, reviews, recommendations |
| Telecom | Jio | Oracle + Cassandra | 45 crore+ subscriber profiles, call records, billing, recharges |
| Healthcare | AIIMS | Oracle HIS | Patient records, prescriptions, lab results, bed management |
| Educational Platform | EDUARTHA ERP | MySQL/Oracle | Student records, attendance, grades, fees, hostel, placement |
| Payments | NPCI (UPI) | Oracle + custom | 10 billion+ monthly UPI transactions across 300+ banks |
3.4 Three-Tier DBMS Architecture
๐ Presentation โ Logic โ Database
A three-tier architecture separates an application into three independent layers, each responsible for a specific function. This separation allows teams to work independently and technology to be swapped without rewriting everything.
โ๏ธ HOW IT WORKS โ Swiggy Order ExampleThree-Tier Architecture
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ TIER 1 โ PRESENTATION LAYER (What you see) โ
โ Swiggy app (React Native / Flutter) โ
โ You tap "Order Butter Chicken" โ sends HTTP request โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ TIER 2 โ APPLICATION/BUSINESS LOGIC LAYER (Brain) โ
โ Swiggy backend (Node.js / Java Spring Boot) โ
โ Validates: Is restaurant open? Is item available? โ
โ Calculates: Price + GST + delivery charge = โน389 โ
โ Sends SQL query to database โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ TIER 3 โ DATABASE LAYER (Memory) โ
โ PostgreSQL / MySQL cluster โ
โ INSERT INTO orders (customer_id, restaurant_id, total, ...) โ
โ UPDATE restaurant SET current_orders = current_orders + 1 โ
โ Returns confirmation โ Tier 2 โ Tier 1 โ "Order Placed! ๐" โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
๐ข INDUSTRY USE
Why separate? Flipkart can redesign their mobile app (Tier 1) without touching the database. SBI can switch from Oracle to PostgreSQL (Tier 3) without rewriting their customer-facing website. TCS can assign frontend, backend, and DBA teams independently. This is how all modern Indian IT projects are structured.
3.5 Three-Schema Architecture (ANSI/SPARC)
๐ External โ Conceptual โ Internal
The ANSI/SPARC three-schema architecture defines three levels of data abstraction, ensuring that changes at one level don't affect the others.
Three-Schema Architecture
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ EXTERNAL SCHEMA (View Level) โ What each USER sees โ
โ โโโโโโโโโโโโ โโโโโโโโโโโโ โโโโโโโโโโโโ โ
โ โ View 1: โ โ View 2: โ โ View 3: โ โ
โ โ Student โ โ Accounts โ โ HR โ โ
โ โ Portal โ โ Dept โ โ Manager โ โ
โ โ sees: โ โ sees: โ โ sees: โ โ
โ โ name, โ โ fees, โ โ name, โ โ
โ โ grades, โ โ payments,โ โ salary, โ โ
โ โ attendanceโ โ dues โ โ dept โ โ
โ โโโโโโโโโโโโ โโโโโโโโโโโโ โโโโโโโโโโโโ โ
โ โ Logical Data Independence โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ CONCEPTUAL SCHEMA (Logical Level) โ Full logical design โ
โ Tables: students, courses, enrollments, fees, โ
โ departments, employees, salaries, attendance โ
โ Relationships, constraints, data types โ no storage info โ
โ โ Physical Data Independence โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ INTERNAL SCHEMA (Physical Level) โ How data is stored โ
โ Index structures (B-tree, hash), storage allocation, โ
โ data compression, file organization, buffer management โ
โ Specific to disk โ hidden from users and programmers โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
๐ DATA INDEPENDENCE โ THE KEY BENEFIT
| Type | Definition | Example |
|---|---|---|
| Logical Data Independence | Changes to conceptual schema don't affect external views | Adding a middle_name column to students table doesn't break the student portal view that only shows first_name, last_name |
| Physical Data Independence | Changes to internal schema don't affect conceptual schema | DBA changes storage from HDD to SSD, adds a B-tree index on student_id โ SQL queries remain identical |
"Three-tier architecture and three-schema architecture are the same." No. Three-tier is about APPLICATION architecture (presentation/logic/data layers). Three-schema is about DATA ABSTRACTION within the DBMS itself (external/conceptual/internal views of data). They work together but solve different problems.
3.6 Database Schema vs Instance
๐ Blueprint vs Current State
Schema = the structure/design of the database (table definitions, columns, types, constraints). Rarely changes. Instance = the actual data in the database at a particular moment. Changes constantly.
๐ REAL-WORLD ANALOGYSchema is like a class definition in Java: class Student { String name; int rollNo; }. Instance is like an object: new Student("Rahul", 101). The class defines structure; the object holds current data. Schema is designed once; instances change every second (new students, updated grades).
SQL
-- SCHEMA (Blueprint โ defined once)
CREATE TABLE students (
student_id NUMBER(10) PRIMARY KEY,
student_name VARCHAR2(100) NOT NULL,
dept_code CHAR(3),
cgpa NUMBER(3,2) CHECK (cgpa BETWEEN 0 AND 10)
);
-- INSTANCE (Data โ changes constantly)
-- At 10:00 AM:
-- | 101 | Rahul Sharma | CSE | 8.50 |
-- | 102 | Priya Patel | ECE | 9.10 |
-- At 10:05 AM (after new admission):
-- | 101 | Rahul Sharma | CSE | 8.50 |
-- | 102 | Priya Patel | ECE | 9.10 |
-- | 103 | Amit Kumar | CSE | 7.80 | โ new row
3.7 Data Modeling Approaches
| Model | Structure | Era | Status | Example DBMS | Indian Use |
|---|---|---|---|---|---|
| Hierarchical | Tree structure (parent-child) | 1960s | โ ๏ธ Legacy | IBM IMS | Some old government systems |
| Network | Graph (many-to-many links) | 1970s | โ ๏ธ Legacy | IDMS | Rare โ mostly replaced |
| Relational | Tables with rows and columns | 1970-now | โ Dominant | Oracle, PostgreSQL, MySQL | SBI, IRCTC, Flipkart, TCS |
| Object-Oriented | Objects with methods | 1990s | โ ๏ธ Niche | db4o, ObjectDB | CAD/CAM systems |
| Document | JSON/BSON documents | 2009-now | โ Growing | MongoDB, CouchDB | Zomato menu catalog, Ola ride logs |
| Graph | Nodes + edges (relationships) | 2010-now | โ Growing | Neo4j, Amazon Neptune | Social networks, fraud detection |
3.8 Entity-Relationship (ER) Model
๐ The ER Model โ Blueprint Before Building
The ER model is a high-level conceptual data model used to design the database structure BEFORE writing any SQL. It uses entities (things), attributes (properties), and relationships (connections) to visually represent data requirements. Think of it as the architect's floor plan before construction begins.
๐ REAL-WORLD ANALOGYBefore constructing a house, you create a blueprint (ER diagram). Before coding a database, you design the ER diagram. Skipping this step is like building a house without a plan โ walls end up in wrong places, rooms don't connect, and plumbing is a nightmare.
Entities and Entity Sets
Attributes โ Types and Notation
| Attribute Type | Definition | ER Notation | Example |
|---|---|---|---|
| Simple | Atomic, cannot be divided further | Plain oval | student_name, age |
| Composite | Can be divided into sub-attributes | Oval with sub-ovals | address โ (street, city, state, pin) |
| Multivalued | Can have multiple values | Double oval | phone_numbers (a student can have 2-3 numbers) |
| Derived | Computed from other attributes | Dashed oval | age derived from date_of_birth |
| Key Attribute | Uniquely identifies entity | Underlined oval | student_id |
| NULL | Value not applicable or unknown | โ | fax_number for most Indian students = NULL |
Relationships โ Degree, Cardinality, Participation
| Concept | Options | Meaning | Example |
|---|---|---|---|
| Degree | Unary | Entity related to itself | Employee manages Employee (manager-subordinate) |
| Binary | Between 2 entity sets | Student enrolls_in Course | |
| Ternary | Between 3 entity sets | Doctor prescribes Medicine to Patient | |
| Cardinality | 1:1 | One to One | Employee has PAN Card (one PAN per person) |
| 1:N | One to Many | Department has_many Employees | |
| M:N | Many to Many | Student enrolls_in Course (many students in many courses) | |
| Participation | Total (mandatory) | Every entity must participate | Every Employee MUST belong to a Department (double line) |
| Partial (optional) | Some entities may not participate | Not every Employee manages a Department (single line) |
Weak Entity
๐ Weak Entity โ Exists Only Through Its Owner
A weak entity cannot be uniquely identified by its own attributes alone โ it depends on a "strong" (owner) entity. It has a partial key (discriminator) that, combined with the owner's primary key, creates a unique identifier.
๐ EXAMPLEDependent is a weak entity of Employee. "Priya" (dependent name) isn't unique globally โ many employees might have a dependent named "Priya." But (employee_id=101, dependent_name="Priya") IS unique. The dependent can't exist without the employee.
Weak entity: double rectangle. Identifying relationship: double diamond. Partial key: dashed underline.
Extended ER (EER) Concepts
| Concept | Definition | Example |
|---|---|---|
| Generalization | Combining lower-level entities into a higher-level entity (bottom-up) | SavingsAccount + CurrentAccount โ Account (generalize common attributes) |
| Specialization | Dividing a higher-level entity into lower-level sub-entities (top-down) | Employee โ FullTimeEmployee + PartTimeEmployee (add specific attributes) |
| Aggregation | Treating a relationship as a higher-level entity to relate it to another entity | The relationship "Employee works_on Project" is aggregated, then related to "Equipment" (which equipment is used for which employee-project combination) |
ER Diagram โ College Management System
ER Diagram (Chen Notation)
โโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโ
โ DEPARTMENTโ โ โ โ COURSE โ
โ โโโ1:Nโโค OFFERS โโโM:1โโโ โ
โ dept_id โ โ โ โ course_id โ
โ dept_name โ โโโโโโโโโโโโโโโโโโโโ โ title โ
โ hod_id โ โ credits โ
โโโโโโโฌโโโโโโ โโโโโโโฌโโโโโโ
โ 1:N โ M:N
โ โ
โโโโโโโดโโโโโโ โโโโโโโโโดโโโโโโโโ
โ STUDENT โ โ ENROLLS_IN โ
โ โโโโโโโโโโM:Nโโโโโโโโโโโโโโโโโโโ โ
โ student_idโ โ semester โ
โ name โ โ grade โ
โ dob โ โโโโโโโโโโโโโโโโโ
โ cgpa โ
โโโโโโโฌโโโโโโ
โ 1:N (weak)
โโโโโโโงโโโโโโ Weak entity (double rectangle)
โ DEPENDENTโ Depends on Student
โ โ Partial key: dep_name
โ dep_name โ Full PK: (student_id, dep_name)
โ relation โ
โโโโโโโโโโโโโ
3.9 The Relational Model
๐ Tables, Tuples, and Attributes โ The Language of Data
Proposed by E.F. Codd (IBM) in 1970, the relational model represents data as relations (tables). Each relation has tuples (rows) and attributes (columns). All modern SQL databases implement this model.
๐ TERMINOLOGY MAPPING| Formal (Math) | Informal (Common) | SQL | Example |
|---|---|---|---|
| Relation | Table | TABLE | students |
| Tuple | Row / Record | ROW | One student's complete data |
| Attribute | Column / Field | COLUMN | student_name |
| Domain | Allowed values | Data type + CHECK | cgpa: NUMBER(3,2) BETWEEN 0 AND 10 |
| Degree | Number of columns | โ | students has 4 columns โ degree = 4 |
| Cardinality | Number of rows | COUNT(*) | 500 students โ cardinality = 500 |
- Each cell contains exactly one value (atomic โ 1NF)
- Each column has a distinct name
- Values in a column are from the same domain
- Order of rows does not matter
- Order of columns does not matter
- No two rows are identical (unique tuples)
3.10 Keys โ The Identity System of Databases
| Key Type | Definition | Example (students table) |
|---|---|---|
| Super Key | Any set of attributes that uniquely identifies a tuple | {student_id}, {student_id, name}, {student_id, name, cgpa} โ all are super keys |
| Candidate Key | Minimal super key (no proper subset is a super key) | {student_id}, {aadhaar_number} โ both uniquely identify, both minimal |
| Primary Key (PK) | The chosen candidate key โ one per table, NOT NULL + UNIQUE | student_id chosen as PK (more stable than aadhaar) |
| Alternate Key | Candidate keys NOT chosen as primary key | aadhaar_number โ it's a candidate key but not the PK |
| Foreign Key (FK) | Attribute that references the PK of another table | dept_id in students references dept_id in departments |
| Composite Key | PK or candidate key consisting of 2+ attributes | (student_id, course_id) in enrollments table |
| Surrogate Key | System-generated artificial key (no business meaning) | Auto-increment id column โ used when no natural PK exists |
SQL โ Keys in Action
-- โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
-- Example: College Management Schema
-- Business Context: Core tables for student enrollment system
-- Industry: Every Indian university ERP
-- โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
CREATE TABLE departments (
dept_id NUMBER(5) PRIMARY KEY, -- Primary Key
dept_name VARCHAR2(50) NOT NULL UNIQUE, -- Alternate Key
hod_name VARCHAR2(100)
);
CREATE TABLE students (
student_id NUMBER(10) PRIMARY KEY, -- PK (surrogate)
aadhaar_no CHAR(12) UNIQUE, -- Alternate Key
student_name VARCHAR2(100) NOT NULL,
dept_id NUMBER(5) REFERENCES departments(dept_id), -- FK
cgpa NUMBER(3,2) CHECK (cgpa BETWEEN 0 AND 10)
);
CREATE TABLE courses (
course_id VARCHAR2(10) PRIMARY KEY,
course_title VARCHAR2(100) NOT NULL,
credits NUMBER(1) CHECK (credits BETWEEN 1 AND 5),
dept_id NUMBER(5) REFERENCES departments(dept_id)
);
CREATE TABLE enrollments (
student_id NUMBER(10) REFERENCES students(student_id),
course_id VARCHAR2(10) REFERENCES courses(course_id),
semester VARCHAR2(10) NOT NULL,
grade CHAR(2),
PRIMARY KEY (student_id, course_id, semester) -- Composite PK
);
3.11 Mapping ER Diagram to Relational Tables
| ER Construct | Relational Mapping Rule | Example |
|---|---|---|
| Strong Entity | Create a table with all simple attributes. PK = key attribute. | CREATE TABLE departments (dept_id PK, dept_name, ...) |
| Weak Entity | Create table with partial key + owner's PK as composite PK. FK to owner with ON DELETE CASCADE. | CREATE TABLE dependents (emp_id FK, dep_name, PK(emp_id, dep_name)) |
| 1:1 Relationship | Add FK in either table (prefer the one with total participation). Or merge into one table. | Employee has PAN: add pan_number to employees table |
| 1:N Relationship | Add FK in the "many" side table. | Department (1) โ Employee (N): add dept_id FK in employees |
| M:N Relationship | Create a new junction/bridge table with FKs from both entities. PK = composite of both FKs. Add relationship attributes here. | CREATE TABLE enrollments (student_id FK, course_id FK, grade, PK(student_id, course_id)) |
| Multivalued Attribute | Create a separate table with FK to original entity. | CREATE TABLE student_phones (student_id FK, phone_number, PK(student_id, phone_number)) |
| Composite Attribute | Store only the leaf sub-attributes as separate columns. | address โ columns: street, city, state, pincode |
| Derived Attribute | Usually NOT stored โ computed via query or virtual column. | age computed from dob: TRUNC(MONTHS_BETWEEN(SYSDATE, dob)/12) |
| Generalization / Specialization | Option 1: One table with type column. Option 2: Separate tables per subtype. Option 3: Both (parent + child tables). | Option 3: employees(emp_id PK), full_time(emp_id FK, salary), part_time(emp_id FK, hourly_rate) |
"Store phone numbers as a comma-separated string in one column." Never. This violates 1NF (atomicity), makes searching individual numbers impossible with standard SQL, and creates maintenance nightmares. Always create a separate phone_numbers table with a foreign key.
Industry Problems
๐ข Industry Problem #1 โ Design ER Diagram for Aadhaar Enrollment Center
Organization Type: Government (UIDAI)
Scenario: An Aadhaar enrollment center captures biometric (fingerprints, iris scan, face photo) and demographic (name, address, DOB, gender) data. Each center has multiple operators. Each resident enrolls exactly once. Operators work at exactly one center at a time.
Your Task: Design an ER diagram covering: Resident, Enrollment Center, Operator, Biometric Data, Demographic Data. Identify all entities, attributes, relationships, cardinalities, participation constraints, and weak entities.
๐ก Complete Solution
ER Design
ENTITIES:
Resident (STRONG): aadhaar_no (PK), name (composite: first, last),
dob, gender, phone, email
Enrollment_Center (STRONG): center_id (PK), center_name, address,
district, state, pincode
Operator (STRONG): operator_id (PK), operator_name, certification_no
Biometric_Data (WEAK of Resident): bio_type (discriminator: fingerprint/
iris/face), bio_data (BLOB), capture_timestamp
Full PK: (aadhaar_no, bio_type)
Demographic_Data: (part of Resident โ composite attributes, not separate entity)
RELATIONSHIPS:
Resident โโenrolls_atโโ(M:1)โโ Enrollment_Center
Each resident enrolls at ONE center. Each center enrolls MANY residents.
Resident: total participation (every resident must enroll somewhere)
Center: partial participation (a new center may have no enrollments yet)
Operator โโworks_atโโ(M:1)โโ Enrollment_Center
Each operator works at ONE center. Each center has MANY operators.
Operator: total. Center: partial.
Resident โโcaptured_byโโ(M:1)โโ Operator
Each enrollment is done by ONE operator. Operator handles MANY residents.
Relationship attribute: enrollment_date, enrollment_status
SQL โ Relational Mapping
CREATE TABLE enrollment_centers (
center_id NUMBER(10) PRIMARY KEY,
center_name VARCHAR2(100) NOT NULL,
address VARCHAR2(200),
district VARCHAR2(50),
state VARCHAR2(50),
pincode CHAR(6)
);
CREATE TABLE operators (
operator_id NUMBER(10) PRIMARY KEY,
operator_name VARCHAR2(100) NOT NULL,
certification_no VARCHAR2(20) UNIQUE NOT NULL,
center_id NUMBER(10) NOT NULL REFERENCES enrollment_centers(center_id)
);
CREATE TABLE residents (
aadhaar_no CHAR(12) PRIMARY KEY,
first_name VARCHAR2(50) NOT NULL,
last_name VARCHAR2(50),
dob DATE NOT NULL,
gender CHAR(1) CHECK (gender IN ('M','F','O')),
phone VARCHAR2(10),
email VARCHAR2(100),
center_id NUMBER(10) NOT NULL REFERENCES enrollment_centers(center_id),
operator_id NUMBER(10) NOT NULL REFERENCES operators(operator_id),
enrollment_date DATE DEFAULT SYSDATE
);
CREATE TABLE biometric_data (
aadhaar_no CHAR(12) REFERENCES residents(aadhaar_no) ON DELETE CASCADE,
bio_type VARCHAR2(20) CHECK (bio_type IN ('FINGERPRINT','IRIS','FACE')),
bio_data BLOB,
capture_timestamp TIMESTAMP DEFAULT SYSTIMESTAMP,
PRIMARY KEY (aadhaar_no, bio_type) -- Weak entity: composite PK
);
DBA Insight: The actual Aadhaar DB would use partitioning by state for performance, column-level encryption on biometric data (DBMS_CRYPTO in Oracle), and row-level security so operators see only their center's data.
๐ข Industry Problem #2 โ Hospital OPD Database Design
Organization Type: Multi-specialty hospital (like Fortis/Apollo)
Scenario: Design a database for Outpatient Department (OPD): patients visit, see a doctor, get a prescription, pay bills. Support re-visits (same patient, different dates, possibly different doctors).
Entities: Patient, Doctor, Department, Appointment, Prescription, Medicine, Bill
Key Relationships:
- Patient (M:N) Doctor โ via Appointment (bridge entity with date, time, status)
- Appointment (1:N) Prescription โ one appointment generates multiple prescriptions
- Prescription (M:1) Medicine โ each prescription line references one medicine
- Patient (1:N) Bill โ one patient can have multiple bills over time
- Doctor (M:1) Department โ each doctor belongs to one department
Solution Highlights:
SQL
CREATE TABLE patients (
patient_id NUMBER(10) GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
patient_name VARCHAR2(100) NOT NULL,
aadhaar_no CHAR(12) UNIQUE,
phone VARCHAR2(10) NOT NULL,
blood_group VARCHAR2(5) CHECK (blood_group IN ('A+','A-','B+','B-','O+','O-','AB+','AB-')),
dob DATE
);
CREATE TABLE appointments (
appointment_id NUMBER(10) GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
patient_id NUMBER(10) NOT NULL REFERENCES patients(patient_id),
doctor_id NUMBER(10) NOT NULL REFERENCES doctors(doctor_id),
appointment_date DATE NOT NULL,
appointment_time TIMESTAMP,
status VARCHAR2(15) DEFAULT 'SCHEDULED'
CHECK (status IN ('SCHEDULED','COMPLETED','CANCELLED','NO_SHOW'))
);
DBA Insight: Hospital databases require HIPAA-equivalent compliance in India (DISHA Act draft). This means audit trails on patient data access, encryption at rest, and role-based access (doctor sees medical history, billing staff sees only bill amounts).
๐ข Industry Problem #3 โ E-Commerce Product Catalog (Flipkart-scale)
Organization Type: E-commerce platform
Scenario: Design the core product catalog schema for a Flipkart-like platform. Products belong to categories (hierarchical: Electronics โ Mobiles โ Smartphones). Products have multiple images, multiple sellers (same product, different sellers/prices), and customer reviews.
Key Design Challenges:
- Category hierarchy: Self-referencing FK (
parent_category_id REFERENCES categories(category_id)) - Multiple images: Separate
product_imagestable (1:N) โ NOT comma-separated URLs in one column - Multiple sellers for same product:
product_sellersbridge table (M:N) with price, stock, and seller_rating - Reviews:
reviewstable with FK to bothproductsandcustomers
SQL
CREATE TABLE categories (
category_id NUMBER(10) PRIMARY KEY,
category_name VARCHAR2(100) NOT NULL,
parent_category_id NUMBER(10) REFERENCES categories(category_id) -- Self-ref!
);
CREATE TABLE products (
product_id NUMBER(10) PRIMARY KEY,
product_name VARCHAR2(200) NOT NULL,
description CLOB,
category_id NUMBER(10) NOT NULL REFERENCES categories(category_id),
brand VARCHAR2(50),
avg_rating NUMBER(2,1) CHECK (avg_rating BETWEEN 0 AND 5)
);
CREATE TABLE product_sellers (
product_id NUMBER(10) REFERENCES products(product_id),
seller_id NUMBER(10) REFERENCES sellers(seller_id),
price NUMBER(10,2) NOT NULL CHECK (price > 0),
stock_qty NUMBER(10) DEFAULT 0 CHECK (stock_qty >= 0),
PRIMARY KEY (product_id, seller_id)
);
DBA Insight: Flipkart deliberately denormalizes some product data for read performance โ product name, primary image URL, and price are cached in a single "product_listing" denormalized table for homepage display. The normalized tables are the source of truth; the denormalized version is for speed.
Lab Exercises
Exercise 1: Identify File System Problems in a Real Scenario
Schema: No SQL โ conceptual exercise
Task: A small college manages student data using Excel files: admissions.xlsx, fees.xlsx, library.xlsx, hostel.xlsx. Each file has student name, roll number, phone, and department โ plus domain-specific columns.
- List all 7 file system problems (from Section 3.1) that this college faces
- For EACH problem, give a specific example using the college scenario
- Explain how a DBMS would solve each one โ mention specific features (FK, transactions, GRANT, etc.)
Expected Output: A 7-row table: Problem | Excel Example | DBMS Solution
Extension: The college adds an online student portal. How does the three-tier architecture apply?
Exercise 2: Draw ER Diagram for a Library Management System
Schema: Library with: Books, Members, Authors, Publishers, Book Issues, Fines
Task:
- Identify all entities with their attributes (mark PK, composite, multivalued, derived)
- Identify relationships with cardinality (1:1, 1:N, M:N) and participation (total/partial)
- Identify at least ONE weak entity (Book Copy โ same book can have multiple copies, identified by copy_number within a book)
- Draw the ER diagram using Chen notation
- List all candidate keys for each entity
Hints: Book โ Author is M:N (one book can have multiple authors). A "fine" depends on an "issue" (weak entity pattern). Derived attribute: overdue_days from due_date and return_date.
Extension: Add a "Reservation" entity โ a member can reserve a currently-borrowed book.
Exercise 3: Map ER Diagram to Relational Schema
Schema: Use the Library ER diagram from Exercise 2
Task:
- Apply all ER-to-Relational mapping rules from Section 3.11
- Write complete
CREATE TABLEstatements with: PK, FK (with ON DELETE action), NOT NULL, CHECK, DEFAULT, UNIQUE - For the M:N relationship (Book โ Author), create the bridge table
- For the weak entity (Book Copy), implement the composite PK
- Insert at least 5 rows per table with realistic Indian data
Expected Output: Complete runnable SQL script (Oracle or PostgreSQL)
Hints: Use ON DELETE CASCADE for weak entities, ON DELETE SET NULL for optional relationships.
Exercise 4: Identify All Key Types from a Given Table
Given Table:
Table: employees
emp_id | aadhaar_no | pan_no | name | dept_id | email
101 | 234567890123 | ABCDE1234F | Rahul | 10 | rahul@tcs.com
102 | 345678901234 | BCDEF2345G | Priya | 20 | priya@tcs.com
103 | 456789012345 | CDEFG3456H | Amit | 10 | amit@tcs.com
Task:
- List ALL super keys (there are many โ identify at least 8)
- Identify ALL candidate keys (minimal super keys)
- Choose a primary key and justify WHY
- Identify alternate keys
- Identify the foreign key and what it references
- Is there a composite key scenario here?
Expected Output: Categorized key table with justifications
Exercise 5: Design Complete ER for a Banking System
Scenario: Design the database for a bank (like SBI) supporting: Customers, Accounts (Savings, Current โ use specialization), Branches, Transactions (deposit, withdrawal, transfer), Loans, Loan Payments.
Task:
- Draw complete ER diagram with all entities, attributes, relationships, cardinalities
- Use generalization/specialization for Account types
- Model "transfer" as a relationship between TWO accounts (from_account, to_account)
- Map the ER diagram to relational tables (full CREATE TABLE SQL)
- Insert sample data for at least 3 customers, 5 accounts, and 10 transactions
Expected Output: ER diagram + complete runnable SQL script + sample data
Extension: Add a "Joint Account" entity โ an account owned by 2+ customers (M:N between Customer and Account). How does this change your schema?
MCQ Assessment Bank โ 15 Questions
Hover to reveal answer and explanation.
Which of the following is NOT a problem of file-based data storage?
- Data redundancy
- Data inconsistency
- ACID transactions
- Concurrent access anomalies
๐ข GATE CS 2019 โ similar question asked. TCS NQT Database section frequently tests this.
In the three-schema architecture, which level deals with how data is physically stored on disk?
- External schema
- Conceptual schema
- Internal schema
- Application schema
๐ข Oracle OCA (1Z0-071) exam covers this. GATE CS regularly asks about data independence.
A weak entity is represented in an ER diagram using:
- Single rectangle
- Dashed rectangle
- Double rectangle
- Rounded rectangle
๐ข This is a guaranteed question in GATE CS and university exams. Know ALL ER notation symbols.
Physical data independence means:
- Changing the application code doesn't affect the database
- Changing the internal schema (storage, indexing) doesn't require changes to the conceptual or external schemas
- Changing the database vendor doesn't affect SQL queries
- Data can be physically moved between servers
๐ข When SBI migrated from HDD to SSD storage for their Oracle database, zero SQL queries needed modification โ this is physical data independence in action.
Why is a foreign key important in the relational model?
- It automatically encrypts data
- It enforces referential integrity โ ensuring that a value in one table must exist as a primary key in the referenced table
- It makes queries run faster
- It prevents NULL values in columns
dept_id=99 in the students table if dept_id=99 doesn't exist in the departments table. This prevents orphan records โ data pointing to non-existent parents. FKs don't encrypt data (that's DBMS_CRYPTO) or speed up queries directly (that's indexes).๐ข UPI payment systems enforce FK between transaction and account โ you can't debit a non-existent account.
What is the difference between a candidate key and a primary key?
- They are the same thing
- A candidate key is any super key; a primary key is the smallest one
- A candidate key is a minimal super key; a primary key is the ONE candidate key chosen as the main identifier for the table. Other candidate keys become alternate keys
- A primary key can have NULL values but a candidate key cannot
students table has candidate keys: {student_id} and {aadhaar_no}. Choose student_id as PK โ aadhaar_no becomes alternate key.๐ข Infosys SP SQL test asks this distinction. Also common in GATE CS.
A university database has: Students (student_id PK), Courses (course_id PK), and the relationship "enrolls_in" is M:N. How do you represent this in the relational model?
- Add a course_id column in the Students table
- Add a student_id column in the Courses table
- Create a new junction table enrollments(student_id FK, course_id FK) with composite PK (student_id, course_id)
- Merge Students and Courses into one table
enrollments has FKs to both tables and a composite PK. Relationship attributes (grade, semester) are added to this junction table.๐ข This is the most common ER-to-relational mapping question in exams and interviews.
Given: Employee entity with attributes: emp_id, name, phone_numbers (multiple), address (street, city, state, pin). How should this be represented in a relational schema?
- One table with comma-separated phone numbers and address as one column
- One table with phone1, phone2, phone3 columns and address as one column
- employees(emp_id, name, street, city, state, pin) + employee_phones(emp_id FK, phone_number). Composite attribute decomposed, multivalued attribute in separate table
- All attributes in one row including arrays
๐ข This directly tests ER-to-relational mapping knowledge โ asked at Wipro NLTH and TCS NQT.
An Aadhaar enrollment center has Residents and Biometric Data. A biometric record (fingerprint/iris/face) cannot exist without a resident. What ER construct models this?
- Strong entity with FK
- Weak entity with identifying relationship โ Biometric Data has a partial key (bio_type) and depends on Resident's PK (aadhaar_no) for full identification
- M:N relationship between Resident and Biometric
- Generalization/Specialization
๐ข UIDAI's actual schema uses this pattern. GATE CS 2020 had a similar weak entity question.
A database schema uses: accounts(account_id PK, customer_id FK, balance). The constraint CHECK (balance >= 0) exists. A UPI payment of โน500 from account with balance โน300 is attempted. What happens?
- The balance becomes -200
- The DBMS rejects the UPDATE because the CHECK constraint would be violated โ the transaction is rolled back, and balance stays โน300
- The DBMS ignores the CHECK and processes the payment
- The database crashes
UPDATE accounts SET balance = balance - 500 WHERE account_id = ... would result in balance = -200, violating CHECK (balance >= 0). The DBMS raises an error, the statement is rolled back, and the original balance (โน300) is preserved. This is domain integrity in action.๐ข NPCI (UPI) relies on such constraints to prevent overdrafts. This is a practical interview question at Indian fintech companies.
A company stores product reviews in a relational DB. Each review has: review_id, product_id (FK), customer_id (FK), rating (1-5), review_text, helpful_votes. A document database (MongoDB) stores the same as a JSON document embedded within the product document. What are the trade-offs?
- Relational is always better
- Document is always better
- Relational: better for analytical queries across reviews (JOIN, GROUP BY, AVG), enforces referential integrity. Document: faster reads for a single product page (all data in one document), no JOINs needed, but harder to query across products and no FK enforcement
- Both are identical in performance
๐ข This is a system design interview question at Indian product companies (Flipkart, Meesho, Swiggy).
A startup is building an e-commerce platform with 10,000 daily orders. The CTO proposes using MongoDB for everything including payment transactions. Critique this decision.
- Good decision โ MongoDB is faster
- Poor decision โ financial transactions require ACID properties (atomicity for debit-credit, consistency for balance rules). MongoDB historically lacked full ACID support. Use PostgreSQL/Oracle for payments (ACID-compliant) and MongoDB for product catalog (flexible schema). This hybrid approach is industry standard
- Use neither โ build custom file storage
- Use only Oracle for everything
๐ข RBI mandates ACID-compliant database systems for financial transaction processing. This is a real architectural decision every Indian fintech startup faces.
A senior DBA proposes using natural keys (Aadhaar number) as the primary key for all tables in a hospital management system. Evaluate this approach.
- Excellent โ Aadhaar is unique and government-issued
- Risky โ Aadhaar numbers can be entered incorrectly, policy may change (government could restructure), and storing Aadhaar as PK in every related table increases storage and exposes sensitive data. Use a surrogate key (auto-increment patient_id) as PK with Aadhaar as a UNIQUE constraint
- Use patient name as PK instead
- PKs are not important
๐ข This is a real architectural debate in Indian healthcare IT. DISHA Act compliance requires minimizing Aadhaar exposure in database schemas.
Design an ER diagram for an online food delivery app (Zomato/Swiggy). Identify: entities, key attributes, relationships with cardinality, at least one weak entity, and one M:N relationship.
- Single entity: Orders with all data in one table
- Entities: Customer, Restaurant, Menu_Item, Order, Order_Item (weak entity of Order), Delivery_Partner, Address. Relationships: Customer places Order (1:N), Order contains Order_Items (1:N, identifying), Restaurant has Menu_Items (1:N), Order assigned to Delivery_Partner (M:1), Customer has Addresses (1:N). M:N: Restaurant โ Cuisine_Type
- Only two tables: Users and Food
- Use a spreadsheet instead
๐ข This is a common system design question at Swiggy, Zomato, and Dunzo interviews.
Design the relational schema for a ride-sharing app (Ola/Rapido). Must support: drivers, riders, rides (with pickup/drop locations), payments (multiple methods), and driver ratings. Write the CREATE TABLE SQL for the core tables.
- One big table with everything
- drivers(driver_id PK, name, phone UNIQUE, license_no UNIQUE, vehicle_type, rating). riders(rider_id PK, name, phone UNIQUE). rides(ride_id PK, rider_id FK, driver_id FK, pickup_lat, pickup_lng, drop_lat, drop_lng, fare, status CHECK('REQUESTED','ONGOING','COMPLETED','CANCELLED'), ride_date). payments(payment_id PK, ride_id FK UNIQUE, amount, method CHECK('UPI','CARD','WALLET','CASH'), status). ratings(ride_id FK, rider_id FK, rating CHECK(1-5), PK(ride_id))
- Use only NoSQL for everything
- This cannot be modeled relationally
๐ข This level of schema design is expected in system design rounds at Indian product companies.
Chapter Summary
๐ฏ 3 DB Skills This Chapter Unlocks in Industry
- ER Diagram Design โ Every project at TCS/Infosys starts with ER design. If you can design a correct ER diagram for a hospital or e-commerce system, you can handle the first phase of any database project.
- Key Identification โ Knowing PK, FK, candidate key, and composite key is tested in every campus placement SQL test. You'll use FK constraints in every CREATE TABLE you ever write.
- Three-Tier Understanding โ When debugging "why is the website slow?", knowing whether the bottleneck is in Tier 1 (frontend), Tier 2 (backend logic), or Tier 3 (database query) saves hours of wasted investigation.
๐ Quick Reference Card
ER NOTATION (Chen):
Strong entity: [Rectangle] Weak entity: [[Double Rectangle]]
Attribute: (Oval) Key attr: (Underlined Oval)
Multivalued: ((Double Oval)) Derived: ((Dashed Oval))
Relationship: <Diamond> Identifying rel: <<Double Diamond>>
Total participation: โโโ Partial participation: โโโ
KEYS:
Super Key: any uniquely identifying set
Candidate Key: minimal super key
Primary Key: chosen candidate key (one per table)
Alternate Key: non-chosen candidate keys
Foreign Key: references another table's PK
Composite Key: PK with 2+ columns
Surrogate Key: system-generated (auto-increment)
MAPPING RULES:
Strong Entity โ CREATE TABLE (PK = key attribute)
Weak Entity โ composite PK (owner_PK + discriminator) + ON DELETE CASCADE
1:1 โ FK in either table
1:N โ FK in "many" side
M:N โ bridge table with composite PK of both FKs
Multivalued attr โ separate table with FK
Composite attr โ decompose into leaf sub-attributes
๐ What to Practice Next
- Draw ER diagrams for: Railway reservation, Online exam portal, Zomato, UPI payment
- Practice: db-fiddle.com (run SQL online), Oracle Live SQL (free Oracle environment)
- Read: "Database System Concepts" by Silberschatz (the standard textbook for GATE CS)
Interview & Career Preparation
Q1: What is a DBMS and why is it needed?
Model Answer: A DBMS is software that manages structured data โ storing, retrieving, updating, and securing it. It's needed because file-based storage suffers from 7 critical problems: data redundancy, inconsistency, isolation, integrity issues, lack of atomicity, concurrent access conflicts, and weak security. DBMS solves all of these through structured schemas, ACID transactions, constraint enforcement, locking mechanisms, and granular access control.
Q2: Explain the three-schema architecture.
Model Answer: The ANSI/SPARC three-schema architecture has three levels: (1) External schema โ individual user views (student portal sees grades, accounts sees fees). (2) Conceptual schema โ the full logical design (all tables, relationships, constraints). (3) Internal schema โ physical storage details (indexes, file organization). The key benefit is data independence: changing storage (internal) doesn't affect queries (conceptual), and adding columns (conceptual) doesn't break existing views (external).
Q3: What is the difference between a primary key and a unique key?
Model Answer: Primary Key: one per table, NOT NULL + UNIQUE, defines the table's identity. Unique Key: multiple allowed per table, allows one NULL (in most DBMS), enforces uniqueness but doesn't define identity. Example: student_id is PK (never null, never duplicate), email is UNIQUE (unique but some students might not have email โ NULL allowed).
Q4: What is a weak entity? Give an example.
Model Answer: A weak entity cannot be uniquely identified by its own attributes alone โ it depends on a strong (owner) entity. Example: "Dependent" of an Employee. "Priya" (dependent name) isn't globally unique, but (employee_id=101, dependent_name="Priya") is. In ER: double rectangle for weak entity, double diamond for identifying relationship. In relational model: composite PK = owner's PK + discriminator, FK with ON DELETE CASCADE.
Q5: What is the difference between logical and physical data independence?
Model Answer: Logical data independence: changes to the conceptual schema (adding a column, creating a new table) don't affect external views or application queries. Physical data independence: changes to storage (new index, different file organization, SSD migration) don't affect the conceptual schema. Physical independence is easier to achieve; logical independence is harder because views depend on table structure.
Q6: How do you map an M:N relationship to tables?
Model Answer: Create a junction/bridge table containing: FKs referencing both entity tables' PKs, a composite PK made of both FKs, and any relationship attributes. Example: Students (M:N) Courses โ Create enrollments(student_id FK, course_id FK, grade, semester, PK(student_id, course_id, semester)). This is the ONLY correct way โ adding course_id to Students creates 1:N, not M:N.
Q7: What is a surrogate key? When do you use one?
Model Answer: A surrogate key is a system-generated artificial key (auto-increment integer or UUID) with no business meaning. Use it when: (1) no natural candidate key exists, (2) natural keys are too long (Aadhaar = 12 chars vs INTEGER = 4 bytes), (3) natural keys might change (email, phone), (4) natural keys are sensitive (PII like Aadhaar โ shouldn't appear as FK in 20 tables). Industry standard: surrogate PK + natural key as UNIQUE constraint.
Q8: Name 5 real-world applications of DBMS in India.
Model Answer: (1) SBI Core Banking (Oracle) โ processes 100 crore+ monthly transactions. (2) UIDAI Aadhaar (Oracle + PostgreSQL) โ 1.4 billion biometric records. (3) IRCTC (Oracle) โ 25 crore+ annual ticket bookings. (4) UPI/NPCI โ 10 billion+ monthly payment transactions across 300+ banks. (5) Flipkart (PostgreSQL + MySQL) โ 50 crore+ product catalog with real-time inventory management.
Q9: What is the relational model? Who proposed it?
Model Answer: The relational model was proposed by E.F. Codd at IBM in 1970. It represents data as relations (tables) with tuples (rows) and attributes (columns). Properties: atomic values (no multivalued cells), unique tuples, no inherent row/column ordering. All modern SQL databases โ Oracle, PostgreSQL, MySQL, SQL Server โ implement this model. It replaced hierarchical and network models due to simplicity, mathematical foundation (relational algebra), and data independence.
Q10: Compare Oracle and PostgreSQL โ when would you use each?
Model Answer: Oracle: proprietary, enterprise-grade, used by SBI/IRCTC/government (strong Indian presence), excellent support + RAC clustering + partitioning. Costly (lakhs per CPU). PostgreSQL: open-source, free, used by startups (Flipkart, Razorpay), excellent MVCC, JSON support, extensible. Zero licensing cost. Choose Oracle when: enterprise compliance needed, vendor support required, existing Oracle investment. Choose PostgreSQL when: cost matters, modern features needed (JSONB, window functions), cloud-native deployment.
๐ Certification Roadmap
- Oracle Database SQL Certified Associate (1Z0-071) โ Validates SQL fundamentals. Recognized at every Indian IT company. ~$250 exam.
- PostgreSQL Associate Certification โ By EDB (EnterpriseDB). Validates PostgreSQL skills. Growing demand in Indian startups.
- Microsoft DP-900 โ Azure Data Fundamentals. Good entry-level cloud + database cert. ~$100 exam.