Phase 2 • EduArtha
Programming & Software Engineering
Python is the language of AI. You must also understand how to write efficient, scalable code. This book covers Python mastery, scientific computing, software engineering practices, and hardware fundamentals.
⏱ 2-4 months | 13 Chapters | 50+ Exercises
Python Mastery
Core language skills every AI engineer needs
Data Structures & Algorithms
Learning Objectives
- Choose the right data structure for each problem (lists, dicts, sets, tuples)
- Implement stacks, queues, and linked lists
- Understand Big-O notation and analyze algorithm complexity
- Implement binary search, merge sort, and quicksort
Built-in Data Structures
| Structure | Ordered | Mutable | Duplicates | Lookup | Best For |
|---|---|---|---|---|---|
| List | Yes | Yes | Yes | O(n) | Ordered collections |
| Tuple | Yes | No | Yes | O(n) | Immutable records |
| Set | No | Yes | No | O(1) | Membership testing |
| Dict | Yes* | Yes | Keys: No | O(1) | Key-value mapping |
\* Dicts preserve insertion order as of Python 3.7.
Python
# Performance comparison: why choosing the right structure matters
import time
data_list = list(range(1_000_000))
data_set = set(data_list)
# Searching for an element (worst case for the list: it is the last item)
target = 999_999
start = time.perf_counter()
_ = target in data_list  # O(n): scans elements until it finds a match
print(f"List: {time.perf_counter()-start:.6f}s")
start = time.perf_counter()
_ = target in data_set  # O(1) on average: hash lookup
print(f"Set: {time.perf_counter()-start:.6f}s")
# The set lookup is typically orders of magnitude faster at this size
Big-O Notation
| Notation | Name | Example | 1M items |
|---|---|---|---|
| O(1) | Constant | Dict lookup | 1 op |
| O(log n) | Logarithmic | Binary search | 20 ops |
| O(n) | Linear | List scan | 1M ops |
| O(n log n) | Linearithmic | Merge sort | 20M ops |
| O(n²) | Quadratic | Bubble sort | 1T ops |
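To make the table concrete, here is a minimal sketch that counts how many elements a linear scan and a binary search inspect on 1M sorted integers. The helper names `linear_search_steps` and `binary_search_steps` are hypothetical and exist only for this illustration.

```python
def linear_search_steps(items, target):
    """Return how many elements are inspected before target is found."""
    steps = 0
    for value in items:
        steps += 1
        if value == target:
            break
    return steps


def binary_search_steps(sorted_items, target):
    """Return how many midpoints binary search inspects."""
    left, right = 0, len(sorted_items) - 1
    steps = 0
    while left <= right:
        steps += 1
        mid = (left + right) // 2
        if sorted_items[mid] == target:
            break
        if sorted_items[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return steps


data = list(range(1_000_000))
linear_steps = linear_search_steps(data, 999_999)
binary_steps = binary_search_steps(data, 999_999)
print(f"Linear scan:   {linear_steps:,} steps")
print(f"Binary search: {binary_steps} steps (never more than 20 for 1M items)")
```

The linear scan touches every element when the target is last, while binary search halves the remaining range on each step.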
Searching & Sorting
Python
# Binary Search: O(log n)
def binary_search(arr, target):
    left, right = 0, len(arr) - 1
    while left <= right:
        mid = (left + right) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1
# Quick Sort: O(n log n) average
def quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quicksort(left) + middle + quicksort(right)
print(quicksort([38, 27, 43, 3, 9, 82, 10]))
# Stack implementation
class Stack:
    def __init__(self): self.items = []
    def push(self, item): self.items.append(item)
    def pop(self): return self.items.pop()
    def peek(self): return self.items[-1]
    def is_empty(self): return len(self.items) == 0
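The learning objectives also mention queues. As a counterpart to the Stack, here is a minimal FIFO sketch. It assumes `collections.deque` as the backing store, because `list.pop(0)` is O(n) while `deque.popleft()` is O(1). The class name `Queue` and its method names are illustrative.

```python
from collections import deque


class Queue:
    """FIFO queue backed by collections.deque (O(1) at both ends)."""

    def __init__(self):
        self._items = deque()

    def enqueue(self, item):
        self._items.append(item)

    def dequeue(self):
        if not self._items:
            raise IndexError("dequeue from empty queue")
        return self._items.popleft()

    def is_empty(self):
        return not self._items

    def __len__(self):
        return len(self._items)


q = Queue()
for job in ["load", "train", "evaluate"]:
    q.enqueue(job)
print(q.dequeue())  # load (first in, first out)
print(len(q))       # 2
```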
Project: Task Scheduler with Priority Queue
Python
import heapq
class TaskScheduler:
    def __init__(self):
        self.heap = []
        self.counter = 0  # tie-breaker: equal priorities run in insertion order
    def add_task(self, task, priority):
        heapq.heappush(self.heap, (priority, self.counter, task))
        self.counter += 1
    def get_next(self):
        if self.heap:
            priority, _, task = heapq.heappop(self.heap)
            return task
        return None
scheduler = TaskScheduler()
scheduler.add_task("Fix critical bug", 1)
scheduler.add_task("Write docs", 5)
scheduler.add_task("Deploy to prod", 2)
scheduler.add_task("Code review", 3)
while (task := scheduler.get_next()):
    print(f"Executing: {task}")
Exercises
Exercise 1.1: Implement merge sort and explain its time complexity
def merge_sort(arr):
    if len(arr) <= 1: return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    return merge(left, right)
def merge(left, right):
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i]); i += 1
        else:
            result.append(right[j]); j += 1
    result.extend(left[i:]); result.extend(right[j:])
    return result
Time: O(n log n) always. Space: O(n). Divides array in half each time (log n levels), merges n elements at each level.
Exercise 1.2: When would you use a dict over a list?
Use a dict when you need fast O(1) key-based lookup, counting occurrences, or mapping relationships. Use a list when you need ordered elements, indexed access, or iteration in sequence. Example: counting word frequencies → dict. Storing sorted scores → list.
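A short sketch of both cases named in the answer above. The sample sentence and scores are made up for illustration.

```python
# Counting word frequencies: a dict maps each word to its count
text = "the quick brown fox jumps over the lazy dog the end"
word_counts = {}
for word in text.split():
    word_counts[word] = word_counts.get(word, 0) + 1

# Storing sorted scores: a list keeps order and supports indexing
scores = [72, 95, 88, 61]
sorted_scores = sorted(scores, reverse=True)

print(word_counts["the"])  # 3
print(sorted_scores[0])    # 95 (highest score)
```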
Exercise 1.3: Implement a queue using two stacks
class QueueFromStacks:
    def __init__(self):
        self.in_stack = []
        self.out_stack = []
    def enqueue(self, item):
        self.in_stack.append(item)
    def dequeue(self):
        if not self.out_stack:
            # Reverse the inbox so the oldest item ends up on top
            while self.in_stack:
                self.out_stack.append(self.in_stack.pop())
        return self.out_stack.pop()
Exercise 1.4: What is the time complexity of checking if an element exists in a list vs a set?
List: O(n), because it must scan linearly. Set: O(1) on average, because it uses a hash table. For 1M elements, a list needs up to ~1M comparisons, while a set needs roughly one hash lookup. Prefer sets for repeated membership tests on hashable items.
Chapter Summary
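- Choose data structures by access pattern: O(1) lookup → dict/set, ordered → list
- Binary search (O(log n)) requires sorted data; merge sort is O(n log n) in every case, while quicksort is O(n log n) on average and O(n²) in the worst case
- Big-O describes how cost grows with input size (an upper bound, usually quoted for the worst case), which is crucial for scalable code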
- Stacks (LIFO) and queues (FIFO) solve specific ordering problems
Object-Oriented Programming
Learning Objectives
- Design classes with encapsulation, inheritance, and polymorphism
- Implement magic/dunder methods for Pythonic objects
- Apply abstract classes and design patterns
- Build a complete OOP project
Classes & Objects
Python
class NeuralLayer:
    def __init__(self, input_size, output_size, activation='relu'):
        self.weights = [[0.0] * input_size for _ in range(output_size)]
        self.bias = [0.0] * output_size
        self.activation = activation
        self._name = f"Layer({input_size}→{output_size})"  # private by convention
    def __repr__(self):
        return f"NeuralLayer({self._name}, act={self.activation})"
    def __len__(self):
        return len(self.weights)
    def param_count(self):
        return len(self.weights) * len(self.weights[0]) + len(self.bias)
layer = NeuralLayer(128, 64)
print(layer)  # NeuralLayer(Layer(128→64), act=relu)
print(len(layer))  # 64
print(layer.param_count())  # 8256
Inheritance & Polymorphism
Python
import math
from abc import ABC, abstractmethod
class Shape(ABC):
    @abstractmethod
    def area(self): pass
    @abstractmethod
    def perimeter(self): pass
    def describe(self):
        return f"{self.__class__.__name__}: area={self.area():.2f}"
class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius
    def area(self): return math.pi * self.radius ** 2
    def perimeter(self): return 2 * math.pi * self.radius
class Rectangle(Shape):
    def __init__(self, w, h):
        self.w, self.h = w, h
    def area(self): return self.w * self.h
    def perimeter(self): return 2 * (self.w + self.h)
# Polymorphism: same interface, different behavior
shapes = [Circle(5), Rectangle(4, 6), Circle(3)]
for s in shapes:
    print(s.describe())
Design Patterns
Python
# Singleton: only one instance ever
class DatabaseConnection:
    _instance = None
    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance
# Placeholder model classes so the factory below runs (illustrative only)
class LinearModel: pass
class TreeModel: pass
# Factory: create objects without specifying the exact class
class ModelFactory:
    @staticmethod
    def create(model_type):
        models = {'linear': LinearModel, 'tree': TreeModel}
        return models[model_type]()
Project: Library Management System
Python
from datetime import datetime, timedelta
class Book:
    def __init__(self, title, author, isbn):
        self.title, self.author, self.isbn = title, author, isbn
        self.is_available = True
    def __str__(self): return f"'{self.title}' by {self.author}"
class Member:
    def __init__(self, name, member_id):
        self.name, self.id = name, member_id
        self.borrowed = []
class Library:
    def __init__(self, name):
        self.name = name
        self.books, self.members, self.loans = {}, {}, []
    def add_book(self, book):
        self.books[book.isbn] = book
    def borrow(self, isbn, member_id):
        book = self.books.get(isbn)
        member = self.members.get(member_id)
        if book and member and book.is_available:
            book.is_available = False
            due = datetime.now() + timedelta(days=14)
            self.loans.append({'book': book, 'member': member, 'due': due})
            member.borrowed.append(book)
            print(f"✓ {member.name} borrowed {book}. Due: {due:%Y-%m-%d}")
    def return_book(self, isbn):
        book = self.books.get(isbn)
        if book:
            book.is_available = True
            print(f"✓ {book} returned")
lib = Library("City Library")
lib.add_book(Book("Deep Learning", "Goodfellow", "978-0"))
lib.members["M1"] = Member("Alice", "M1")
lib.borrow("978-0", "M1")
Exercises
Exercise 2.1: Implement __add__ and __eq__ for a Vector2D class
class Vector2D:
    def __init__(self, x, y): self.x, self.y = x, y
    def __add__(self, other): return Vector2D(self.x+other.x, self.y+other.y)
    def __eq__(self, other):
        # Let Python fall back to its default handling for other types
        if not isinstance(other, Vector2D): return NotImplemented
        return self.x==other.x and self.y==other.y
    def __repr__(self): return f"Vec({self.x},{self.y})"
v = Vector2D(1,2) + Vector2D(3,4)  # Vec(4,6)
Exercise 2.2: What is the difference between @staticmethod and @classmethod?
@staticmethod: Receives neither the instance nor the class. It is just a function namespaced inside the class, declared without self or cls: def method():
@classmethod: Receives the class as first arg (cls). Can access/modify class state. Used for alternative constructors like Date.from_string("2024-01-15").
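A minimal sketch of the two decorators side by side. `Date.from_string` is the alternative constructor named above; the rest of the `Date` class, including `is_leap_year`, is a hypothetical illustration.

```python
class Date:
    def __init__(self, year, month, day):
        self.year, self.month, self.day = year, month, day

    @classmethod
    def from_string(cls, text):
        """Alternative constructor: parse 'YYYY-MM-DD' and build via cls."""
        year, month, day = (int(part) for part in text.split("-"))
        return cls(year, month, day)

    @staticmethod
    def is_leap_year(year):
        """Plain helper: needs neither the instance nor the class."""
        return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

    def __repr__(self):
        return f"Date({self.year}, {self.month}, {self.day})"


d = Date.from_string("2024-01-15")
print(d)                        # Date(2024, 1, 15)
print(Date.is_leap_year(2024))  # True
```

Because `from_string` builds the object through `cls`, a subclass of `Date` gets instances of its own type from the same constructor.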
Exercise 2.3: Why is composition often preferred over inheritance?
"Favor composition over inheritance": instead of Car extends Engine (a car IS an engine? No!), use Car has-a Engine. Composition is more flexible: you can swap components at runtime, avoid deep inheritance hierarchies, and follow the Single Responsibility Principle.
Chapter Summary
- Classes encapsulate data + behavior; __init__ initializes state
- Inheritance enables code reuse; ABC enforces interfaces
- Dunder methods (__str__, __add__, __len__) make objects behave like built-ins
- Design patterns (Singleton, Factory) solve recurring OOP problems
Functional Programming Concepts
Learning Objectives
- Use higher-order functions: map, filter, reduce, and lambdas
- Build closures and understand their practical applications
- Leverage functools for memoization and partial application
- Write clean data processing pipelines in functional style
First-Class Functions & Lambdas
Python
# Functions are objects: they can be passed, returned, stored
def apply_twice(func, value):
    return func(func(value))
print(apply_twice(lambda x: x * 2, 3))  # 12
print(apply_twice(lambda x: x + 10, 5))  # 25
# map, filter, reduce
nums = [1, 2, 3, 4, 5, 6, 7, 8]
squares = list(map(lambda x: x**2, nums))  # [1,4,9,16,25,36,49,64]
evens = list(filter(lambda x: x%2==0, nums))  # [2,4,6,8]
from functools import reduce
total = reduce(lambda a, b: a + b, nums)  # 36
# List comprehension alternative (more Pythonic)
squares = [x**2 for x in nums]
evens = [x for x in nums if x % 2 == 0]
Closures & functools
Python
# Closure: a function that remembers its enclosing scope
def make_multiplier(factor):
    def multiply(x):
        return x * factor  # 'factor' is captured from outer scope
    return multiply
double = make_multiplier(2)
triple = make_multiplier(3)
print(double(5))  # 10
print(triple(5))  # 15
# Memoization with lru_cache
from functools import lru_cache
@lru_cache(maxsize=None)
def fibonacci(n):
    if n < 2: return n
    return fibonacci(n-1) + fibonacci(n-2)
print(fibonacci(100))  # Instant! Uncached, the recursion needs on the order of 10^21 calls
# Partial application
from functools import partial
import json
pretty_json = partial(json.dumps, indent=2, sort_keys=True)
print(pretty_json({"name": "AI", "type": "ML"}))
Project: Functional Data Pipeline
Python
from functools import reduce
# Pipeline: compose functions left-to-right
def pipeline(*funcs):
    return lambda x: reduce(lambda v, f: f(v), funcs, x)
# Data processing steps
clean = lambda data: [s.strip().lower() for s in data]
remove_empty = lambda data: [s for s in data if s]
remove_dupes = lambda data: list(dict.fromkeys(data))
sort_alpha = lambda data: sorted(data)
process = pipeline(clean, remove_empty, remove_dupes, sort_alpha)
raw = [" Python ", " java", "", "PYTHON", " Go ", "java"]
result = process(raw)
print(result) # ['go', 'java', 'python']
Exercises
Exercise 3.1: Rewrite this loop using map and filter: [x**2 for x in range(20) if x % 3 == 0]
result = list(map(lambda x: x**2, filter(lambda x: x%3==0, range(20))))
# [0, 9, 36, 81, 144, 225, 324]
# List comprehension is more Pythonic, but map/filter is more composable
Exercise 3.2: Implement a memoized factorial using lru_cache
from functools import lru_cache
@lru_cache(maxsize=None)
def factorial(n):
    if n <= 1: return 1
    return n * factorial(n - 1)
print(factorial(100))  # Repeated calls are answered from the cache
Exercise 3.3: What is the advantage of closures over global variables?
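Closures encapsulate state without polluting the global scope. Each call to the enclosing function creates its own set of captured variables, so separate closures do not interfere with one another. That makes them easier to test (no hidden dependencies) and to compose. They are not automatically thread-safe: a closure that mutates captured state from several threads still needs a lock. Global variables create hidden coupling and make debugging harder.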
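A minimal sketch of closure-held state, using a hypothetical `make_counter` factory. Each counter keeps its own `count`, and nothing is stored at module level.

```python
def make_counter():
    count = 0  # lives in the enclosing scope, not in globals

    def increment():
        nonlocal count  # rebind the captured variable instead of creating a local
        count += 1
        return count

    return increment


counter_a = make_counter()
counter_b = make_counter()
counter_a()
counter_a()
print(counter_a())  # 3
print(counter_b())  # 1 (independent of counter_a)
```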
Chapter Summary
- Functions are first-class objects: pass, return, and store them
- map/filter/reduce enable declarative data transformation
- Closures capture enclosing scope, which makes them great for factories and callbacks
- lru_cache provides free memoization for recursive functions
Memory Management & Generators
Learning Objectives
- Understand Python's memory model: reference counting and garbage collection
- Build generators with yield for memory-efficient data processing
- Use itertools for powerful lazy iteration patterns
- Optimize memory with __slots__ and generator expressions
Generators: Lazy Evaluation
Python
import sys
# List vs Generator: memory comparison
big_list = [x**2 for x in range(1_000_000)]
big_gen = (x**2 for x in range(1_000_000))
# getsizeof counts only the container, not the int objects the list points to
print(f"List: {sys.getsizeof(big_list):>10,} bytes")  # ~8.5 MB
print(f"Generator: {sys.getsizeof(big_gen):>10,} bytes")  # ~200 bytes!
# Custom generator with yield
def read_large_file(filepath):
    with open(filepath) as f:
        for line in f:
            yield line.strip()  # One line at a time, not whole file
# itertools: lazy iteration toolkit
from itertools import chain, islice, count, cycle
# Chain multiple iterables
combined = chain([1,2], [3,4], [5,6])  # 1,2,3,4,5,6
# Take first N from infinite generator
first_10_evens = list(islice((x for x in count() if x%2==0), 10))
# __slots__: reduce memory per instance
class PointSlots:
    __slots__ = ['x', 'y']
    def __init__(self, x, y): self.x, self.y = x, y
# Typically uses noticeably less memory than a regular class with __dict__
Project: Process 1M-Row CSV with Generators
Python
import csv
def read_csv_rows(filepath):
    with open(filepath) as f:
        reader = csv.DictReader(f)
        for row in reader:
            yield row
def filter_active(rows):
    for row in rows:
        if row.get('status') == 'active':
            yield row
def extract_emails(rows):
    for row in rows:
        yield row['email']
# Pipeline: processes 1M rows using constant memory!
# rows = read_csv_rows('users_1M.csv')
# active = filter_active(rows)
# emails = extract_emails(active)
# for email in emails:
#     send_newsletter(email)
Exercises
Exercise 4.1: Write a generator that yields Fibonacci numbers infinitely
def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b
from itertools import islice
print(list(islice(fibonacci(), 10)))  # [0,1,1,2,3,5,8,13,21,34]
Exercise 4.2: When would you use __slots__?
Use __slots__ when creating millions of instances of the same class (e.g., nodes in a graph, particles in a simulation). It eliminates the per-instance __dict__, saving roughly 40% memory in typical cases (the exact figure depends on the Python version and the number of attributes). Don't use it for classes with few instances or when you need dynamic attributes.
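A small sketch of the trade-off described above: the slotted class has no per-instance `__dict__`, and so it rejects attributes that are not declared. The class names `PointDict` and `PointWithSlots` are illustrative.

```python
class PointDict:
    def __init__(self, x, y):
        self.x, self.y = x, y


class PointWithSlots:
    __slots__ = ("x", "y")  # fixed attribute set, no per-instance __dict__

    def __init__(self, x, y):
        self.x, self.y = x, y


regular = PointDict(1, 2)
slotted = PointWithSlots(1, 2)

print(hasattr(regular, "__dict__"))  # True
print(hasattr(slotted, "__dict__"))  # False

regular.z = 3  # dynamic attributes work on a regular class
try:
    slotted.z = 3  # not declared in __slots__
    slots_rejected = False
except AttributeError:
    slots_rejected = True
print(slots_rejected)  # True
```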
Exercise 4.3: What is the difference between yield and return?
return exits the function permanently and returns a value. yield pauses the function, returns a value, and remembers its state; the next call resumes from where it paused. yield turns a function into a generator that produces values lazily, one at a time.
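The pause-and-resume behavior can be observed directly. This sketch uses a hypothetical `countdown` generator that records what it does in a `log` list, so the order of execution is visible.

```python
def countdown(start, log):
    log.append("started")
    while start > 0:
        yield start  # pause here; the next next() call resumes below
        log.append(f"resumed after {start}")
        start -= 1
    log.append("finished")


log = []
gen = countdown(3, log)
print(log)             # [] (calling the function runs no code yet)

first = next(gen)      # runs until the first yield
second = next(gen)     # resumes, then pauses at the next yield
remaining = list(gen)  # drains the rest

print(first, second, remaining)  # 3 2 [1]
print(log)
```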
Chapter Summary
- Generators use yield for lazy evaluation: constant memory regardless of data size
- Generator expressions (x for x in ...) are memory-efficient alternatives to list comprehensions
- itertools provides powerful lazy tools: chain, islice, product, combinations
- __slots__ reduces memory for classes with many instances
Decorators & Context Managers
Learning Objectives
- Build decorators from scratch and understand the decorator pattern
- Use @property, @staticmethod, @classmethod effectively
- Create context managers with __enter__/__exit__ and contextlib
- Build a reusable timing and logging framework
Building Decorators
Python
import time
from functools import wraps
# Timer decorator
def timer(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"⏱ {func.__name__} took {elapsed:.4f}s")
        return result
    return wrapper
# Retry decorator with arguments
def retry(max_attempts=3, delay=1):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            last_error = None
            for attempt in range(max_attempts):
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    last_error = e
                    print(f"Attempt {attempt+1} failed: {e}")
                    if attempt < max_attempts - 1:
                        time.sleep(delay)  # no need to wait after the final attempt
            # Chain the last error so the original traceback is kept
            raise RuntimeError(f"Failed after {max_attempts} attempts") from last_error
        return wrapper
    return decorator
@timer
@retry(max_attempts=3)
def fetch_data(url):
    print(f"Fetching {url}...")
    return {"status": "ok"}
Context Managers
Python
# Class-based context manager
class DatabaseTransaction:
    def __enter__(self):
        print("BEGIN TRANSACTION")
        return self
    def __exit__(self, exc_type, exc_val, exc_tb):
        if exc_type:
            print(f"ROLLBACK, error: {exc_val}")
        else:
            print("COMMIT")
        return False  # False: do not suppress the exception
# contextlib: simpler decorator-based approach
from contextlib import contextmanager
@contextmanager
def temp_directory():
    import tempfile, shutil
    dirpath = tempfile.mkdtemp()
    try:
        yield dirpath
    finally:
        shutil.rmtree(dirpath)
with temp_directory() as tmpdir:
    print(f"Working in {tmpdir}")
# Directory is automatically cleaned up
Project: Timing & Logging Framework
Python
import time, logging
from functools import wraps
logging.basicConfig(level=logging.INFO)
def log_and_time(logger=None):
    def decorator(func):
        log = logger or logging.getLogger(func.__module__)
        @wraps(func)
        def wrapper(*args, **kwargs):
            log.info(f"▶ {func.__name__} started")
            start = time.perf_counter()
            try:
                result = func(*args, **kwargs)
                elapsed = time.perf_counter() - start
                log.info(f"✓ {func.__name__} completed in {elapsed:.3f}s")
                return result
            except Exception as e:
                elapsed = time.perf_counter() - start
                log.error(f"✗ {func.__name__} failed after {elapsed:.3f}s: {e}")
                raise
        return wrapper
    return decorator
@log_and_time()
def train_model(epochs):
    time.sleep(0.5)
    return {"accuracy": 0.95}
train_model(10)
Exercises
Exercise 5.1: Write a decorator that caches results in a dictionary (manual memoize)
from functools import wraps
def memoize(func):
    cache = {}
    @wraps(func)
    def wrapper(*args):  # positional, hashable arguments only
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    wrapper.cache = cache
    return wrapper
Exercise 5.2: What does functools.wraps do and why is it important?
Without @wraps, the decorated function loses its original __name__, __doc__, and __module__. help(func) would show "wrapper" instead of the real function name. @wraps copies these attributes from the original function to the wrapper, preserving introspection and debugging ability.
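The difference is easy to see by decorating two functions, one with `@wraps` and one without. The decorator and function names here are hypothetical.

```python
from functools import wraps


def plain_decorator(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


def wrapped_decorator(func):
    @wraps(func)  # copy __name__, __doc__, __module__ onto wrapper
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@plain_decorator
def load_data():
    """Load the training data."""


@wrapped_decorator
def save_model():
    """Save the trained model."""


print(load_data.__name__, load_data.__doc__)    # wrapper None
print(save_model.__name__, save_model.__doc__)  # save_model Save the trained model.
```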
Exercise 5.3: Build a context manager that suppresses specific exceptions
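from contextlib import contextmanager
@contextmanager
def suppress(*exceptions):
    try:
        yield
    except exceptions:
        pass
with suppress(FileNotFoundError):
    open('nonexistent.txt')  # Silently ignored
# The standard library ships an equivalent: contextlib.suppress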
Chapter Summary
- Decorators wrap functions to add behavior: timing, logging, caching, retrying
- Always use @functools.wraps to preserve function metadata
- Context managers (with statement) ensure cleanup: files, connections, locks
- contextlib.contextmanager simplifies context manager creation with yield
Scientific Computing
The data science toolkit
NumPy: Array Operations
Learning Objectives
- Create and manipulate NumPy arrays efficiently
- Master broadcasting, vectorization, and boolean masking
- Perform linear algebra operations for ML
- Benchmark NumPy vs pure Python performance
Array Creation & Indexing
Python
import numpy as np
# Creation
a = np.array([1,2,3,4,5])
zeros = np.zeros((3,4))
rand = np.random.randn(3,3)
grid = np.arange(0, 100, 5)
space = np.linspace(0, 1, 50)
# Boolean masking: powerful filtering
data = np.random.randn(1000)
positives = data[data > 0]  # All positive values
outliers = data[np.abs(data) > 2]  # Beyond 2 std devs
# Vectorized operations: often 10-100x faster than Python loops
import time
size = 1_000_000
a = np.random.randn(size)
start = time.time()
result_loop = [x**2 + 2*x + 1 for x in a]
print(f"Loop: {time.time()-start:.3f}s")
start = time.time()
result_np = a**2 + 2*a + 1
print(f"NumPy: {time.time()-start:.3f}s") # ~50-100x faster
Broadcasting & Linear Algebra
Python
# Broadcasting: (3,3) + (3,1) → auto-expands
matrix = np.ones((3,3))
col_vec = np.array([[10],[20],[30]])
result = matrix + col_vec # Each row gets different offset
# Linear algebra for ML
X = np.random.randn(100, 3)
w = np.random.randn(3)
y = X @ w # Matrix-vector multiplication (predictions)
# Normal equation: w = (XᵀX)⁻¹Xᵀy
X_b = np.c_[np.ones((100,1)), X] # Add bias column
w_optimal = np.linalg.inv(X_b.T @ X_b) @ X_b.T @ y
Project: Linear Regression from Scratch with NumPy
Python
import numpy as np
# Generate data: y = 3x₁ + 5x₂ + 7 + noise
np.random.seed(42)
X = np.random.randn(200, 2)
y = 3*X[:,0] + 5*X[:,1] + 7 + np.random.randn(200)*0.5
# Add bias, split data
X_b = np.c_[np.ones((200,1)), X]
X_train, X_test = X_b[:160], X_b[160:]
y_train, y_test = y[:160], y[160:]
# Gradient descent
w = np.zeros(3)
lr, epochs = 0.01, 1000
for i in range(epochs):
    preds = X_train @ w
    error = preds - y_train
    gradient = (2/len(y_train)) * X_train.T @ error  # gradient of the mean squared error
    w -= lr * gradient
print(f"Learned weights: bias={w[0]:.2f}, w1={w[1]:.2f}, w2={w[2]:.2f}")
# Expected: ~7, ~3, ~5
rmse = np.sqrt(np.mean((X_test @ w - y_test)**2))
print(f"Test RMSE: {rmse:.4f}")
Exercises
Exercise 6.1: Normalize a matrix so each column has mean=0 and std=1
X = np.random.randn(100, 5) * 10 + 50
X_norm = (X - X.mean(axis=0)) / X.std(axis=0)
print(X_norm.mean(axis=0).round(10)) # ~[0, 0, 0, 0, 0]
print(X_norm.std(axis=0).round(10)) # ~[1, 1, 1, 1, 1]
Exercise 6.2: Why is np.dot(a,b) faster than sum(a[i]*b[i] for i in range(n))?
NumPy uses compiled C/Fortran code (BLAS libraries) that operates on contiguous memory blocks with CPU SIMD instructions. Python loops have interpreter overhead per iteration, dynamic type checking, and poor cache utilization. For 1M elements, NumPy is often around two orders of magnitude faster, though the exact ratio depends on the hardware and the BLAS build.
Exercise 6.3: Explain broadcasting rules with an example of incompatible shapes
Rules: dimensions are compared right-to-left. Each pair must be equal OR one of them must be 1. (3,4) + (4,) works: (3,4)+(1,4)→(3,4). (3,4) + (3,) FAILS: 4≠3 and neither is 1. Fix: reshape to (3,1) to broadcast across columns.
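The rules can be written down as a small pure-Python shape checker. This is a sketch of the rule itself, not NumPy's implementation, and `broadcast_shape` is a hypothetical helper (NumPy offers `np.broadcast_shapes` for real use).

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible."""
    result = []
    # Compare dimensions right-to-left; a missing dimension counts as 1.
    pairs = zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1)
    for dim_a, dim_b in pairs:
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"incompatible dimensions: {dim_a} vs {dim_b}")
    return tuple(reversed(result))


print(broadcast_shape((3, 4), (4,)))    # (3, 4)
print(broadcast_shape((3, 4), (3, 1)))  # (3, 4)
try:
    broadcast_shape((3, 4), (3,))
except ValueError as exc:
    print(f"ValueError: {exc}")  # 4 vs 3, and neither is 1
```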
Chapter Summary
- NumPy arrays are often 10-100x faster than Python lists for vectorized numerical operations
- Broadcasting auto-expands dimensions for element-wise operations
- Boolean masking enables powerful data filtering without loops
- Linear algebra functions (dot, inv, eig) are essential for ML implementations
Pandas: Data Wrangling
Learning Objectives
- Create, explore, and manipulate DataFrames
- Handle missing data, merge datasets, and use groupby
- Apply transformations with apply(), map(), and lambda
- Perform a complete data analysis project
Python
import pandas as pd
import numpy as np
# Create from dict
df = pd.DataFrame({
    'name': ['Alice','Bob','Charlie','Diana','Eve'],
    'dept': ['ML','Web','ML','Data','Web'],
    'salary': [95000,82000,105000,78000,88000],
    'exp_years': [5, 3, 8, 2, 4]
})
# Selecting & filtering
ml_team = df[df['dept'] == 'ML']
senior = df.query('exp_years >= 5 and salary > 90000')
# GroupBy: split-apply-combine
dept_stats = df.groupby('dept').agg(
    avg_salary=('salary', 'mean'),
    headcount=('name', 'count'),
    max_exp=('exp_years', 'max')
)
# Missing data handling
df.loc[1, 'salary'] = np.nan
df['salary'] = df['salary'].fillna(df['salary'].median())
# Apply custom function
df['tax_bracket'] = df['salary'].apply(
    lambda s: 'High' if s > 90000 else 'Standard'
)
Project: Titanic Survival Analysis
Python
import pandas as pd
df = pd.read_csv('https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv')
# Exploration
print(df.shape) # (891, 12)
print(df['Survived'].value_counts())
print(df.isnull().sum()) # Age: 177, Cabin: 687 missing
# Survival rate by class and gender
survival = df.groupby(['Pclass', 'Sex'])['Survived'].mean()
print(survival.round(2))
# 1st class females: 97% survived, 3rd class males: 14%
# Feature engineering
df['Age'] = df['Age'].fillna(df['Age'].median())
df['FamilySize'] = df['SibSp'] + df['Parch'] + 1
df['IsAlone'] = (df['FamilySize'] == 1).astype(int)
Exercises
Exercise 7.1: Merge two DataFrames on a common column
orders = pd.DataFrame({'id':[1,2,3], 'product':['A','B','A']})
prices = pd.DataFrame({'product':['A','B'], 'price':[100,200]})
merged = orders.merge(prices, on='product')
print(merged)
Exercise 7.2: Find the top 3 departments by average salary using groupby
top3 = df.groupby('dept')['salary'].mean().nlargest(3)
Exercise 7.3: What is the difference between loc and iloc?
loc: label-based indexing; df.loc[0:5, 'name':'salary'] includes the endpoint. iloc: integer-based indexing; df.iloc[0:5, 0:3] excludes the endpoint (like Python slicing). Use loc with column names, iloc with column positions.
Chapter Summary
- Pandas DataFrames are the standard for tabular data manipulation in Python
- GroupBy enables split-apply-combine analysis patterns
- Handle missing data with dropna/fillna before modeling
- merge/join combines datasets; apply transforms columns with custom logic
Matplotlib & Seaborn: Visualization
Learning Objectives
- Create publication-quality plots with Matplotlib
- Build statistical visualizations with Seaborn
- Design multi-panel dashboards with subplots
- Customize colors, labels, annotations, and themes
Python
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
# Multi-panel dashboard
fig, axes = plt.subplots(2, 2, figsize=(12, 10))
# 1. Line plot
x = np.linspace(0, 10, 100)
axes[0,0].plot(x, np.sin(x), '#6366f1', linewidth=2, label='sin')
axes[0,0].plot(x, np.cos(x), '#f59e0b', linewidth=2, label='cos')
axes[0,0].legend(); axes[0,0].set_title('Trigonometric Functions')
# 2. Histogram
data = np.random.normal(0, 1, 1000)
axes[0,1].hist(data, bins=30, color='#10b981', edgecolor='white')
axes[0,1].set_title('Normal Distribution')
# 3. Scatter
x = np.random.randn(100)
y = 2*x + np.random.randn(100)*0.5
axes[1,0].scatter(x, y, alpha=0.7, c='#ec4899')
axes[1,0].set_title('Correlation')
# 4. Bar chart
categories = ['Python', 'JS', 'Java', 'C++']
values = [85, 72, 68, 55]
axes[1,1].barh(categories, values, color='#6366f1')
axes[1,1].set_title('Language Popularity')
plt.tight_layout()
plt.savefig('dashboard.png', dpi=150)
# Seaborn: statistical plots
sns.set_theme(style='whitegrid')
tips = sns.load_dataset('tips')
plt.figure()  # new figure, so the boxplot is not drawn onto the dashboard axes
sns.boxplot(data=tips, x='day', y='total_bill', hue='time')
plt.title('Bill Distribution by Day')
Exercises
Exercise 8.1: Create a heatmap of a correlation matrix using Seaborn
df = sns.load_dataset('iris')
corr = df.select_dtypes(include='number').corr()
sns.heatmap(corr, annot=True, cmap='coolwarm', center=0)
Exercise 8.2: When would you use a violin plot instead of a box plot?
Violin plots show the full distribution shape (kernel density), while box plots only show quartiles. Use violin when you care about multimodality (e.g., bimodal distributions that box plots miss). Box plots are better for comparing medians across many groups quickly.
Exercise 8.3: How do you add annotations to highlight key data points?
plt.annotate('Peak', xy=(5, 100), xytext=(6, 120),
             arrowprops=dict(arrowstyle='->', color='red'),
             fontsize=12, color='red')
Chapter Summary
- Matplotlib gives full control with figure/axes architecture
- Seaborn provides high-level statistical plots with beautiful defaults
- Always label axes, add titles, and use tight_layout for clean plots
- Save with dpi=150 or higher for publication quality (300 is a common target for print)
SciPy & Jupyter Notebooks
Learning Objectives
- Use scipy.optimize for function minimization and curve fitting
- Perform statistical tests with scipy.stats
- Master Jupyter Notebook best practices and magic commands
- Conduct an A/B test analysis
Python
from scipy import optimize, stats
import numpy as np
# Curve fitting: find the best parameters
def model(x, a, b, c):
    return a * np.exp(-b * x) + c
x_data = np.linspace(0, 4, 50)
y_data = model(x_data, 2.5, 1.3, 0.5) + np.random.normal(0, 0.1, 50)
params, cov = optimize.curve_fit(model, x_data, y_data)
print(f"Fitted: a={params[0]:.2f}, b={params[1]:.2f}, c={params[2]:.2f}")
# T-test: are two groups statistically different?
group_a = np.random.normal(100, 10, 50)
group_b = np.random.normal(105, 10, 50)
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t={t_stat:.3f}, p={p_value:.4f}")
print("Significant!" if p_value < 0.05 else "Not significant")
Jupyter Magic Commands
- %timeit: benchmark code execution time
- %matplotlib inline: show plots in the notebook
- %%writefile: save a cell to a file
- %who: list variables
- !pip install: install packages (the ! prefix runs any shell command)
- %load_ext autoreload: load the extension that auto-reloads changed modules (enable it with %autoreload 2)
Project: A/B Test Analysis
Python
import numpy as np
from scipy import stats
# Control: old button, Treatment: new button
np.random.seed(42)
control_clicks = np.random.binomial(1, 0.12, 1000) # 12% CTR
treatment_clicks = np.random.binomial(1, 0.15, 1000) # 15% CTR
print(f"Control CTR: {control_clicks.mean():.1%}")
print(f"Treatment CTR: {treatment_clicks.mean():.1%}")
# Chi-squared test for proportions
from scipy.stats import chi2_contingency
table = np.array([[control_clicks.sum(), 1000-control_clicks.sum()],
                  [treatment_clicks.sum(), 1000-treatment_clicks.sum()]])
chi2, p, dof, expected = chi2_contingency(table)
print(f"\nChi² = {chi2:.3f}, p-value = {p:.4f}")
print("✓ Deploy new button!" if p < 0.05 else "✗ Keep old button")
Exercises
Exercise 9.1: Minimize f(x) = (x-3)² + 2 using scipy.optimize
from scipy import optimize
result = optimize.minimize(lambda x: (x[0]-3)**2 + 2, x0=0)
print(f"Minimum at x={result.x[0]:.4f}, f(x)={result.fun:.4f}")
Exercise 9.2: What does a p-value of 0.03 mean?
If the null hypothesis (no difference) were true, there would be a 3% probability of observing results at least this extreme. Since 0.03 < 0.05, we reject the null hypothesis at the conventional 5% level and call the difference statistically significant. Note: the p-value is NOT the probability that the null hypothesis is true, and it does NOT tell you the magnitude of the effect; use an effect size for that.
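One way to see what "results at least this extreme under the null hypothesis" means is a permutation test. It swaps in a different technique from the t-test and chi-squared test used in this chapter: shuffle the group labels many times and count how often the shuffled difference in means is at least as large as the observed one. The function name, the synthetic data, and the default of 2000 permutations are assumptions made for this sketch.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Estimate a two-sided p-value for the difference in means by shuffling labels."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # pretend the group labels carry no information
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations


# Synthetic data (hypothetical): two groups drawn with different true means
data_rng = random.Random(42)
group_a = [data_rng.gauss(100, 10) for _ in range(50)]
group_b = [data_rng.gauss(105, 10) for _ in range(50)]
p_value = permutation_p_value(group_a, group_b)
print(f"Estimated p-value: {p_value:.4f}")
```

The returned fraction is the p-value estimate: the share of label shufflings that produce a gap as large as the one actually observed.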
Chapter Summary
- SciPy extends NumPy with optimization, statistics, and signal processing
- curve_fit finds optimal parameters for any model function
- t-tests and chi-squared tests determine statistical significance
- Jupyter notebooks are the standard interactive environment for data science
Software Engineering
Building production-ready code
Version Control with Git
Learning Objectives
- Master core Git commands: init, add, commit, branch, merge
- Work with remote repositories: clone, push, pull
- Handle merge conflicts and use feature branch workflows
- Write good commit messages and maintain .gitignore
Bash
# Initialize & first commit
git init
git add .
git commit -m "Initial commit: project structure"
# Branching workflow
git checkout -b feature/add-login # Create & switch
# ... make changes ...
git add -A
git commit -m "feat: add user login endpoint"
git checkout main
git merge feature/add-login # Merge feature into main
git branch -d feature/add-login # Clean up branch
# Working with remotes
git remote add origin https://github.com/user/repo.git
git push -u origin main
git pull origin main # Fetch + merge
# Undo mistakes
git stash # Save uncommitted changes
git reset --soft HEAD~1 # Undo last commit, keep changes
git log --oneline --graph -10 # Visual history
Commit Message Convention
feat: new feature, fix: bug fix, docs: documentation, refactor: code restructuring, test: adding tests, chore: maintenance. Example: feat: add batch prediction endpoint with caching
Exercises
Exercise 10.1: What is the difference between git merge and git rebase?
Merge creates a new "merge commit" with two parents, preserving full history. Rebase replays your commits on top of the target branch, creating a linear history. Rebase is cleaner but rewrites history β never rebase shared branches. Use merge for team branches, rebase for local cleanup.
Exercise 10.2: How do you resolve a merge conflict?
Git marks conflicts with <<<<<<< HEAD, =======, and >>>>>>> branch-name markers (seven characters each). Open the file, choose the correct code (or combine both), remove the markers, then run git add and git commit. Use git diff to review, and test before committing.
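Leftover markers are a common mistake after a manual resolution. The sketch below scans text for lines that still begin with a conflict marker. The function name `find_conflict_markers` and the sample file contents are hypothetical, and the check is a rough heuristic: a line of seven equals signs can also appear legitimately, for example as a heading underline.

```python
def find_conflict_markers(text):
    """Return 1-based line numbers that still start with a Git conflict marker."""
    markers = ("<<<<<<< ", "=======", ">>>>>>> ")
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if line.startswith(markers)
    ]


conflicted = """\
def greet():
<<<<<<< HEAD
    return "hello"
=======
    return "hi"
>>>>>>> feature/add-login
"""
resolved = 'def greet():\n    return "hello"\n'

print(find_conflict_markers(conflicted))
print(find_conflict_markers(resolved))
```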
Exercise 10.3: Write a .gitignore for a Python ML project
__pycache__/
*.pyc
.env
*.sqlite
node_modules/
dist/
*.egg-info/
.ipynb_checkpoints/
data/*.csv
models/*.pkl
wandb/
Chapter Summary
- Git tracks every committed change, so most mistakes can be undone
- Feature branches isolate work; merge integrates it
- Good commit messages document project evolution
- .gitignore prevents secrets and large files from being tracked
Testing & Debugging
Learning Objectives
- Write unit tests with pytest and understand test types
- Use fixtures, parametrize, and measure code coverage
- Debug with pdb, breakpoints, and logging
- Build a fully tested module
Python
# calculator.py
def add(a, b):
    return a + b
def divide(a, b):
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b
# test_calculator.py
import pytest
from calculator import add, divide
def test_add():
    assert add(2, 3) == 5
    assert add(-1, 1) == 0
    assert add(0, 0) == 0
def test_divide_by_zero():
    with pytest.raises(ValueError):
        divide(10, 0)
# Parametrize: test multiple inputs at once
@pytest.mark.parametrize("a,b,expected", [
    (10, 2, 5), (9, 3, 3), (7, 2, 3.5)
])
def test_divide(a, b, expected):
    assert divide(a, b) == expected
# Fixtures: shared setup
@pytest.fixture
def sample_data():
    return [1, 2, 3, 4, 5]
def test_sum(sample_data):
    assert sum(sample_data) == 15
Bash
# Run tests with coverage
pytest test_calculator.py -v --cov=calculator --cov-report=term-missing
Debugging & Logging
Python
import logging
logging.basicConfig(level=logging.INFO,
                    format='%(asctime)s %(levelname)s %(message)s')
logger = logging.getLogger(__name__)
def train_model(data):
    logger.info(f"Training on {len(data)} samples")
    try:
        # Training logic goes here
        logger.info("Training complete")
    except Exception as e:
        logger.error(f"Training failed: {e}")
        raise
# Debug with breakpoint(): drops into pdb
def buggy_function(x):
    result = x * 2
    breakpoint()  # Execution pauses here; inspect variables
    return result + 1
Exercises
Exercise 11.1: What is the testing pyramid (unit vs integration vs E2E)?
Unit tests (base, most tests): Test individual functions in isolation. Fast, cheap. Integration tests (middle): Test components working together (API + database). E2E tests (top, fewest): Test full user workflows. Slow, expensive. The pyramid shape means: write many unit tests, fewer integration, fewest E2E.
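The bottom two layers of the pyramid can be sketched with the standard library alone. Below, one test exercises a pure function in isolation (unit), and another runs that function together with a real in-memory SQLite database (integration). The function names, table name, and values are hypothetical.

```python
import sqlite3


def apply_discount(price, percent):
    """Pure function: easy to unit test in isolation."""
    return round(price * (1 - percent / 100), 2)


def save_price(conn, name, price):
    """Touches a database, so it needs an integration test."""
    conn.execute("INSERT INTO prices (name, price) VALUES (?, ?)", (name, price))
    conn.commit()


def load_price(conn, name):
    row = conn.execute("SELECT price FROM prices WHERE name = ?", (name,)).fetchone()
    return None if row is None else row[0]


def test_apply_discount_unit():
    # Unit test: no I/O, a single function
    assert apply_discount(100.0, 20) == 80.0


def test_discounted_price_roundtrip_integration():
    # Integration test: function logic plus a real (in-memory) SQLite database
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE prices (name TEXT PRIMARY KEY, price REAL)")
    save_price(conn, "widget", apply_discount(100.0, 20))
    assert load_price(conn, "widget") == 80.0
    conn.close()
```

Both functions follow pytest's `test_` naming, so pytest would collect them; the integration test is slower and has more ways to fail, which is why the pyramid keeps that layer smaller.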
Exercise 11.2: Write a test for a function that reads a file (using tmp_path fixture)
def test_read_file(tmp_path):
    f = tmp_path / "test.txt"
    f.write_text("hello world")
    assert f.read_text() == "hello world"
Exercise 11.3: Why use logging instead of print statements?
Logging provides: severity levels (DEBUG/INFO/WARNING/ERROR), timestamps, configurable output (file/console/remote), can be disabled in production without removing code, and supports structured formatting. Print statements must be manually removed and provide no filtering or context.
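The filtering claim can be seen in a few lines. This sketch sends records to an in-memory buffer, then raises the level to WARNING to mimic a production setting without deleting any logging calls. The logger name `logging_demo` and the messages are made up for illustration.

```python
import io
import logging

# Send log records to an in-memory buffer so the output can be inspected
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

demo_logger = logging.getLogger("logging_demo")  # hypothetical logger name
demo_logger.addHandler(handler)
demo_logger.propagate = False  # keep the demo output out of the root logger

demo_logger.setLevel(logging.DEBUG)
demo_logger.debug("loaded %d rows", 128)      # emitted: DEBUG passes a DEBUG threshold
demo_logger.warning("missing values found")   # emitted

demo_logger.setLevel(logging.WARNING)         # "production" setting, no code removed
demo_logger.debug("loaded %d rows", 256)      # filtered out: DEBUG is below WARNING
demo_logger.warning("disk almost full")       # emitted

print(buffer.getvalue())
```

With print statements, silencing the second debug line would mean editing or deleting code; here it is a one-line configuration change.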
Chapter Summary
- pytest is the standard Python testing framework: simple, powerful, extensible
- Parametrize tests multiple inputs; fixtures share setup code
- Aim for 80%+ code coverage; test edge cases and error paths
- Use logging over print; use breakpoint() for interactive debugging
Code Modularization, REST APIs & Docker
Learning Objectives
- Structure Python projects with modules and packages
- Build REST APIs with Flask and FastAPI
- Containerize applications with Docker
- Deploy an ML prediction API
Clean Code & Modules
Python
# Project structure
# ml_project/
# ├── src/
# │   ├── __init__.py
# │   ├── model.py
# │   ├── preprocess.py
# │   └── api.py
# ├── tests/
# ├── Dockerfile
# └── requirements.txt
FastAPI: Modern Python API
Python
from fastapi import FastAPI
from pydantic import BaseModel
import pickle
import numpy as np
app = FastAPI(title="ML Prediction API")
class PredictionInput(BaseModel):
    features: list[float]
class PredictionOutput(BaseModel):
    prediction: str
    confidence: float
# Load model on startup
with open("model.pkl", "rb") as f:
    model = pickle.load(f)
@app.post("/predict", response_model=PredictionOutput)
async def predict(data: PredictionInput):
    X = np.array(data.features).reshape(1, -1)
    pred = model.predict(X)[0]
    proba = model.predict_proba(X).max()
    return PredictionOutput(prediction=str(pred), confidence=float(proba))
@app.get("/health")
async def health():
    return {"status": "healthy"}
Docker: Containerization
Dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["uvicorn", "src.api:app", "--host", "0.0.0.0", "--port", "8000"]
Bash
# Build and run
docker build -t ml-api .
docker run -p 8000:8000 ml-api
# docker-compose for multi-service
# docker-compose.yml:
# services:
#   api:
#     build: .
#     ports: ["8000:8000"]
#   db:
#     image: postgres:15
Exercises
Exercise 12.1: What is the difference between Flask and FastAPI?
Flask: WSGI-based and synchronous by default, mature, huge ecosystem, minimal by design. FastAPI: async-first (ASGI), auto-generates OpenAPI docs, uses Pydantic for validation, and often benchmarks faster than Flask on I/O-bound workloads. Use FastAPI for new APIs; Flask for existing projects or simpler needs.
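The practical benefit of the async model shows up when a handler spends most of its time waiting on I/O. The sketch below uses only `asyncio`, not Flask or FastAPI, and `fake_io_call` is a hypothetical stand-in for a database query or HTTP request. It compares awaiting three calls one after another with awaiting them concurrently.

```python
import asyncio
import time


async def fake_io_call(delay):
    """Stand-in for a database query or HTTP request (hypothetical)."""
    await asyncio.sleep(delay)
    return delay


async def handle_sequentially(delays):
    # Each call waits for the previous one to finish
    return [await fake_io_call(d) for d in delays]


async def handle_concurrently(delays):
    # All calls wait at the same time on one event loop
    return await asyncio.gather(*(fake_io_call(d) for d in delays))


def timed(coro):
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


delays = [0.1, 0.1, 0.1]
_, sequential_time = timed(handle_sequentially(delays))
_, concurrent_time = timed(handle_concurrently(delays))
print(f"sequential: {sequential_time:.2f}s, concurrent: {concurrent_time:.2f}s")
```

The gain applies to waiting, not to computation: a CPU-bound call such as a heavy model prediction still blocks the event loop while it runs.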
Exercise 12.2: Why containerize ML models with Docker?
Docker turns "it works on my machine" into "it works everywhere." It packages the Python version, libraries, system dependencies, and model files into one portable image. This eliminates version conflicts, enables horizontal scaling, and works with Kubernetes for orchestration.
Exercise 12.3: Explain SOLID principles with one-line examples
Single Responsibility: One class = one job. Open/Closed: Extend via inheritance, don't modify existing code. Liskov Substitution: Subclass should be usable wherever parent is. Interface Segregation: Many specific interfaces > one fat interface. Dependency Inversion: Depend on abstractions, not concrete classes.
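As one illustration, here is a minimal sketch of Dependency Inversion (with Single Responsibility alongside it) in Python 3.9+. All class names are hypothetical: the service depends on a `Model` protocol, so any object with a matching `predict` method can be swapped in without changing the service.

```python
from typing import Protocol


class Model(Protocol):
    """Abstraction the service depends on (hypothetical interface)."""

    def predict(self, features: list[float]) -> float: ...


class MeanModel:
    """One concrete implementation: predicts the mean of the features."""

    def predict(self, features: list[float]) -> float:
        return sum(features) / len(features)


class ConstantModel:
    """Another implementation, handy as a test double."""

    def __init__(self, value: float) -> None:
        self.value = value

    def predict(self, features: list[float]) -> float:
        return self.value


class PredictionService:
    """Single responsibility: turn a raw prediction into a response dict."""

    def __init__(self, model: Model) -> None:
        self.model = model  # depends on the abstraction, not a concrete class

    def respond(self, features: list[float]) -> dict:
        return {"prediction": self.model.predict(features)}


print(PredictionService(MeanModel()).respond([1.0, 2.0, 3.0]))
print(PredictionService(ConstantModel(0.5)).respond([1.0, 2.0, 3.0]))
```

Because both models can replace each other wherever a `Model` is expected, the example also shows Liskov Substitution in a small way.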
Chapter Summary
- Structure code into modules/packages for maintainability
- FastAPI provides modern, fast, auto-documented REST APIs
- Docker containerizes your app with all dependencies for portable deployment
- SOLID principles guide clean, maintainable architecture
Hardware & Computing
Understanding the machine beneath the code
GPU vs CPU, CUDA & Distributed Computing
Learning Objectives
- Understand CPU vs GPU architecture and when GPUs win
- Learn CUDA programming basics and GPU-accelerated Python
- Grasp distributed computing concepts: data and model parallelism
- Understand memory bandwidth bottlenecks and mixed precision training
CPU vs GPU Architecture
| Feature | CPU | GPU |
|---|---|---|
| Cores | 4-64 complex cores | 1000-16000 simple cores |
| Clock Speed | 3-5 GHz | 1-2 GHz |
| Strength | Sequential, complex logic | Massively parallel, simple ops |
| Memory | System RAM (64-512 GB) | VRAM (8-80 GB) |
| Best For | Control flow, I/O, OS | Matrix math, convolutions |
A GPU's thousands of cores can execute the same operation on thousands of data elements simultaneously (SIMD: Single Instruction, Multiple Data; NVIDIA calls its variant SIMT, Single Instruction, Multiple Threads). Matrix multiplication, the core of neural networks, is highly parallel: each output element is an independent dot product.
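The independence of the output elements can be checked in plain Python. This sketch computes each element as its own dot product and fills the result in a shuffled order; the answer is the same for any order, which is what lets a GPU compute all elements at once. The function names and the 2x2 matrices are illustrative.

```python
import random


def output_element(a, b, i, j):
    """One output element: dot product of row i of a and column j of b."""
    return sum(a[i][k] * b[k][j] for k in range(len(b)))


def matmul_any_order(a, b, seed=0):
    """Fill the result in a shuffled order to show that no element depends on another."""
    rows, cols = len(a), len(b[0])
    result = [[None] * cols for _ in range(rows)]
    positions = [(i, j) for i in range(rows) for j in range(cols)]
    random.Random(seed).shuffle(positions)  # a GPU would run these at the same time
    for i, j in positions:
        result[i][j] = output_element(a, b, i, j)
    return result


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(matmul_any_order(a, b))
```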
GPU-Accelerated Python
Python
# CuPy: NumPy on GPU (largely a drop-in replacement)
import cupy as cp
import numpy as np
import time
size = 10000
# CPU (NumPy)
a_cpu = np.random.randn(size, size).astype(np.float32)
b_cpu = np.random.randn(size, size).astype(np.float32)
start = time.time()
c_cpu = a_cpu @ b_cpu
print(f"CPU: {time.time()-start:.3f}s")
# GPU (CuPy)
a_gpu = cp.array(a_cpu)
b_gpu = cp.array(b_cpu)
start = time.time()
c_gpu = a_gpu @ b_gpu
cp.cuda.Stream.null.synchronize()
print(f"GPU: {time.time()-start:.3f}s")
# GPU is typically 10-50x faster for large matrix multiplications
# Numba: JIT-compile Python functions to GPU kernels
from numba import cuda
import math
@cuda.jit
def vector_add_gpu(a, b, result):
    idx = cuda.grid(1)  # Global index of this thread
    if idx < a.size:  # Guard threads past the end of the array
        result[idx] = a[idx] + b[idx]
CUDA Concepts
CUDA organizes parallel execution into a hierarchy:
| Level | Description | Analogy |
|---|---|---|
| Thread | Single execution unit | One worker |
| Block | Group of threads (up to 1024) | One team |
| Grid | Group of blocks | Entire workforce |
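The hierarchy determines how each thread finds its piece of the data. The sketch below simulates that arithmetic on the CPU in plain Python, with no GPU required: it rounds the block count up so every element gets a thread, then computes each thread's global index as block index times block size plus thread index. The sizes `n = 1000` and `threads_per_block = 256` are illustrative choices.

```python
import math


def blocks_needed(n_elements, threads_per_block):
    """Number of blocks so that every element gets its own thread (rounded up)."""
    return math.ceil(n_elements / threads_per_block)


def global_thread_index(block_idx, block_dim, thread_idx):
    """1D global index, the same arithmetic as blockIdx.x * blockDim.x + threadIdx.x."""
    return block_idx * block_dim + thread_idx


n = 1000
threads_per_block = 256
grid_size = blocks_needed(n, threads_per_block)

# Simulate every thread in the grid; threads past the end of the data do nothing,
# which is why kernels guard with a check such as `if idx < n`.
covered = []
for block_idx in range(grid_size):
    for thread_idx in range(threads_per_block):
        idx = global_thread_index(block_idx, threads_per_block, thread_idx)
        if idx < n:
            covered.append(idx)

print(f"{grid_size} blocks x {threads_per_block} threads cover {len(covered)} elements")
```

Because the block count is rounded up, the last block usually holds some idle threads; the bounds check keeps them from writing outside the array.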
Distributed Computing
Python
# PyTorch Data Parallel: split batches across GPUs
import torch
import torch.nn as nn
model = nn.Linear(1000, 100)
# Data Parallelism: same model, split data across GPUs
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)
if torch.cuda.is_available():  # Avoid an error on CPU-only machines
    model = model.cuda()
# Model Parallelism: split model layers across GPUs
# layer1 → GPU0, layer2 → GPU1 (for models too large for one GPU)
Memory Bottlenecks & Mixed Precision
Memory Bandwidth is the Real Bottleneck
GPUs can compute faster than memory can feed data to them. Key optimizations: (1) Mixed precision (FP16): halves memory and can roughly double throughput on tensor cores. (2) Gradient checkpointing: recompute activations instead of storing them. (3) Data prefetching: load the next batch while computing the current one. (4) Model quantization: INT8 inference for up to 4x speedup on edge devices.
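The memory side of these optimizations is simple arithmetic: bytes per value times number of values. The sketch below estimates the storage for model weights alone at three precisions. The 7-billion-parameter count is a hypothetical example, and activations, gradients, and optimizer state would add to these figures.

```python
BYTES_PER_VALUE = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(n_parameters, dtype):
    """Memory needed to store the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return n_parameters * BYTES_PER_VALUE[dtype] / 1e9


n_parameters = 7_000_000_000  # hypothetical 7-billion-parameter model
for dtype in BYTES_PER_VALUE:
    print(f"{dtype}: {weight_memory_gb(n_parameters, dtype):.0f} GB")
```

Halving the bytes per value also halves the data that must cross the memory bus for each operation, which is why lower precision helps when bandwidth is the limit.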
Python
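# Mixed Precision Training with PyTorch
# Assumes model, optimizer, criterion, and dataloader are already defined
from torch.cuda.amp import autocast, GradScaler
scaler = GradScaler()
for inputs, labels in dataloader:
    optimizer.zero_grad()
    with autocast():  # FP16 forward pass
        output = model(inputs)
        loss = criterion(output, labels)
    scaler.scale(loss).backward()  # Backward pass on the scaled loss
    scaler.step(optimizer)  # Unscales gradients, then updates weights
    scaler.update()  # Adjusts the scale factor for the next step
# Often ~2x faster training with ~50% less memory on tensor-core GPUs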
Project: Benchmark CPU vs GPU Matrix Operations
Python
import numpy as np
import time
sizes = [100, 500, 1000, 2000, 5000]
results = []
for n in sizes:
    A = np.random.randn(n, n).astype(np.float32)
    B = np.random.randn(n, n).astype(np.float32)
    start = time.perf_counter()  # Higher resolution than time.time()
    C = A @ B
    cpu_time = time.perf_counter() - start
    gflops = (2 * n**3) / cpu_time / 1e9  # An n x n matmul needs about 2*n^3 floating-point ops
    results.append((n, cpu_time, gflops))
    print(f"Size {n:>5}×{n:<5} → {cpu_time:.3f}s ({gflops:.1f} GFLOPS)")
# With GPU (CuPy); here A, B, and cpu_time come from the last (largest) size:
# try:
#     import cupy as cp
#     A_gpu = cp.array(A); B_gpu = cp.array(B)
#     start = time.perf_counter()
#     C_gpu = A_gpu @ B_gpu
#     cp.cuda.Stream.null.synchronize()
#     gpu_time = time.perf_counter() - start
#     print(f"GPU: {gpu_time:.3f}s → {cpu_time/gpu_time:.1f}x speedup")
# except ImportError:
#     print("CuPy not installed; skipping GPU benchmark")
Exercises
Exercise 13.1: Why can't GPUs replace CPUs entirely?
GPUs excel at data parallelism: the same operation on many elements. But they are poor at: (1) complex branching/control flow, (2) sequential algorithms, (3) operating system tasks, (4) low-latency single-thread operations, (5) irregular memory access patterns. The CPU handles orchestration while the GPU handles computation.
Exercise 13.2: What is the difference between data parallelism and model parallelism?
Data parallelism: Same model replicated across GPUs, each processes different data batches. Gradients are averaged. Works for most models. Model parallelism: Different parts of the model on different GPUs. Required when the model doesn't fit in one GPU's memory (e.g., large language models with hundreds of billions of parameters). More complex to implement.
Exercise 13.3: Why does mixed precision training work without losing accuracy?
FP16 has less precision, but neural networks are robust to small rounding errors. The trick: (1) Forward pass in FP16 (fast, small). (2) Loss scaling prevents tiny gradients from rounding to zero. (3) Weight updates in FP32 (full-precision master copy). (4) Only the final update step needs full precision. Result: nearly identical accuracy with up to about 2x speed and roughly half the memory.
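Loss scaling can be demonstrated without a GPU, because Python's `struct` module can round a value to IEEE 754 half precision (format code `e`). In this sketch a gradient too small for FP16 becomes zero on its own, but survives when multiplied by a scale factor first and divided by it afterwards in full precision. The gradient value and the scale factor of 1024 are illustrative assumptions.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


tiny_gradient = 1e-8   # illustrative value below FP16's smallest positive number
loss_scale = 1024.0    # illustrative scale factor

unscaled = to_fp16(tiny_gradient)              # underflows to zero
scaled = to_fp16(tiny_gradient * loss_scale)   # survives the FP16 round trip
recovered = scaled / loss_scale                # unscale in full precision

print(f"without scaling: {unscaled}")
print(f"with scaling:    {recovered:.3e}")
```

The recovered value is not exact, since FP16 still rounds the scaled number, but a small relative error is far better than a gradient of zero that stops the weight from learning.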
Exercise 13.4: What is MapReduce and how does it relate to distributed ML?
Map: Apply a function to each data chunk independently (parallel). Reduce: Combine results into a single output. In ML: Map = compute gradients on each GPU's data batch. Reduce = average gradients across all GPUs (AllReduce). This is the foundation of distributed SGD used by PyTorch DDP and Horovod.
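The map and reduce steps can be sketched for a one-parameter model y = w * x with mean squared error. Each hypothetical worker computes the gradient on its own shard (map), and the results are averaged (reduce). With equal-sized shards the average matches the gradient computed on the full batch. The data and function names are made up for illustration.

```python
def gradient(w, shard):
    """Map step: gradient of mean squared error for y = w * x on one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def all_reduce_mean(values):
    """Reduce step: average the per-worker results, as an averaging AllReduce would."""
    return sum(values) / len(values)


# Made-up (x, y) pairs, split into equal-sized shards for two hypothetical workers
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
shards = [data[:2], data[2:]]

w = 0.5
per_worker = [gradient(w, shard) for shard in shards]   # map
averaged = all_reduce_mean(per_worker)                  # reduce
full_batch = gradient(w, data)

print(averaged, full_batch)
```

If the shards had different sizes, a plain average would weight the samples unevenly, so the per-worker gradients would need to be weighted by shard size.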
Chapter Summary
- GPUs have thousands of simple cores ideal for parallel matrix operations
- CuPy and Numba bring GPU acceleration to Python with minimal code changes
- Data parallelism splits batches across GPUs; model parallelism splits the model
- Mixed precision (FP16) can roughly double throughput with minimal accuracy loss
- Memory bandwidth, not compute, is often the real bottleneck
Congratulations!
You've completed Programming & Software Engineering. You now have the skills to write clean, efficient, production-ready Python code, from algorithms and OOP to APIs and GPU computing.
© 2025 EduArtha, Programming & Software Engineering Complete Guide