Final Exam Topics
Career and Professional Life
- Professional Ethics in Software Engineering — Engineers regularly face decisions with ethical weight (privacy, bias, security, attribution); applying a principled framework (e.g., ACM Code of Ethics) and not outsourcing moral judgment to tools or managers is a career-long obligation.
- The Changing Entry-Level SWE Landscape — AI coding assistants are raising the baseline productivity expectation for new engineers; the competitive advantage shifts toward engineers who can architect systems, reason about tradeoffs, review AI-generated code critically, and communicate clearly with both technical and non-technical stakeholders.
Design Process
Requirements and User Research
- User-Centered Design: Personas and User Stories — Before writing code, identify who you are building for (personas) and what they need (user stories with acceptance criteria); stories answer "As a [persona], I want [action] so that [benefit]" and make requirements testable and human-readable.
- Epics as Organizing Units of Work — An epic groups related user stories under a single feature theme; epics help teams scope a feature, communicate it to stakeholders, and break it into deliverable increments across sprints.
- Functional vs. Non-Functional Requirements — Functional requirements describe what a system does; non-functional (cross-functional) requirements describe how well it does it (performance, security, reliability, accessibility).
Technical Design
- Wireframes Before Implementation — Low-, mid-, and high-fidelity wireframes communicate UI intent to the team before a line of code is written; they surface design decisions cheaply and align frontend/backend teams on what the product should actually do.
- Data Modeling First — Designing your database tables before writing code forces clarity about what data your feature actually needs; as Fred Brooks observed, once you see the tables, the rest of the design often becomes obvious. Applications are projections of their data structures.
- API-First / Middle-Out Development — Teams move faster when they agree on an API contract first; the frontend can build against mock responses while the backend implements the real service, letting both sides work in parallel without blocking each other.
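The API-first idea above can be sketched in plain Python. This is only an illustration under stated assumptions: the names Item, ItemService, MockItemService, and render_item_title are hypothetical, and typing.Protocol stands in for whatever contract format a team actually agrees on (for example, an OpenAPI document).

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class Item:
    """Hypothetical response shape that both teams agree on up front."""
    id: int
    title: str


class ItemService(Protocol):
    """The agreed contract: what can be called and what comes back."""

    def get_item(self, item_id: int) -> Optional[Item]: ...


class MockItemService:
    """Canned responses so client work can start before the backend exists."""

    def get_item(self, item_id: int) -> Optional[Item]:
        canned = {1: Item(id=1, title="Sample item")}
        return canned.get(item_id)


def render_item_title(service: ItemService, item_id: int) -> str:
    """Client code written against the contract, not a concrete backend."""
    item = service.get_item(item_id)
    return item.title if item is not None else "Not found"
```

Because render_item_title depends only on the contract, the real service can later replace MockItemService without changes to the client code, provided it honors the same method signature and return type.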
Software Engineering Skills
Documentation and Code Quality
- Architectural Decision Records (ADRs) — ADRs capture the context, options considered, and rationale behind a significant technical choice; they create a durable record that helps future team members understand why the codebase is the way it is.
- Refactoring vs. Rewriting — Replacing an implementation underneath a stable interface is refactoring, not rewriting; an incremental, method-by-method approach with tests passing at each step carries less risk than a from-scratch rewrite.
Testing Strategy and Practice
- Automated Testing as a Safety Net — Automated tests catch regressions (unintended breakage of existing behavior) early and cheaply; a project without them accumulates invisible risk with every change.
- Unit, Integration, and End-to-End Tests Each Have a Role — Unit tests prove a component's logic in isolation; integration tests prove that two or more real components coordinate correctly; E2E tests prove the full request/response cycle works together.
- Arrange-Act-Assert (AAA) — Every test has three logical phases: set up preconditions, perform the action under test, and verify the outcome; this structure keeps tests readable and maintainable.
- Test Isolation and Reproducibility — Each test should arrange its own state, act independently, and leave no side effects; function-scoped fixtures that create and tear down a fresh schema per test achieve this for database-backed tests.
- Test Coverage as a Signal — 100% line coverage means every line was executed at least once, not that every path or input was checked; coverage alone does not prove correctness, but a coverage report highlights unexercised lines and branches that still need tests and assertions.
- Testing with Dependency Injection — In unit tests, inject test doubles directly; in end-to-end tests that go through the DI framework (e.g., with FastAPI's TestClient), use dependency_overrides to swap out real dependencies, because the framework, not your test code, calls the route handler.
- Patching at the Right Location — When mocking, patch the name where it is used, not where it is defined; patching the wrong location leaves the real implementation in place and produces misleading test results.
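As a minimal sketch of Arrange-Act-Assert combined with direct injection of a test double, consider the unit tests below. WelcomeService and its mailer collaborator are hypothetical names invented for illustration; only unittest.mock from the standard library is used. This covers the unit-test case only; in an end-to-end test the swap would go through the framework's override mechanism instead.

```python
from unittest.mock import Mock


class WelcomeService:
    """Hypothetical service whose collaborator is injected via the constructor."""

    def __init__(self, mailer):
        self._mailer = mailer

    def register(self, email: str) -> bool:
        if "@" not in email:
            return False
        self._mailer.send(email, "Welcome!")
        return True


def test_register_sends_welcome_email():
    # Arrange: build a fresh test double and inject it directly.
    mailer = Mock()
    service = WelcomeService(mailer)

    # Act: perform exactly one action under test.
    result = service.register("ada@example.com")

    # Assert: check the return value and the interaction with the double.
    assert result is True
    mailer.send.assert_called_once_with("ada@example.com", "Welcome!")


def test_register_rejects_invalid_email():
    # Arrange: each test creates its own state, so tests stay independent.
    mailer = Mock()
    service = WelcomeService(mailer)

    # Act
    result = service.register("not-an-email")

    # Assert: the mailer must not have been touched.
    assert result is False
    mailer.send.assert_not_called()
```

Each test builds its own Mock rather than sharing one, which keeps the tests isolated and lets them run in any order.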
System Design
Application Architecture
- Layered Architecture — Software systems should be organized into distinct layers (routing, services, persistence/entities) with clear responsibilities; changes to implementations of one layer should almost never impact others; changes to the interface of a layer should only impact the layer directly above it integrating with its interface.
- Separation of Concerns across Layers — In our application stack: routes handle HTTP concerns; services handle business logic; persistence handles storage. Keeping these responsibilities separate makes each layer easier to test and change independently. Other domains (robotics, graphics, machine learning, and so on) will have different layers but the same concepts apply.
- Dependency Injection (DI) — DI externalizes the construction of collaborators so components don't hard-code their dependencies; this improves modularity, replaceability, and testability across any language or framework.
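The layering and dependency injection described above might look like the following framework-free sketch. All names (InMemoryUserRepository, UserService, get_greeting_route) are hypothetical, and the route is modeled as a plain function returning a status code and body rather than a real web framework handler.

```python
from typing import Optional


class InMemoryUserRepository:
    """Persistence layer: knows how data is stored, nothing about HTTP."""

    def __init__(self, rows: dict):
        self._rows = rows

    def find_name(self, user_id: int) -> Optional[str]:
        return self._rows.get(user_id)


class UserService:
    """Service layer: business logic. The repository is injected, not constructed here."""

    def __init__(self, repo):
        self._repo = repo

    def greeting(self, user_id: int) -> Optional[str]:
        name = self._repo.find_name(user_id)
        return None if name is None else f"Hello, {name}!"


def get_greeting_route(service: UserService, user_id: int):
    """Route layer: translates the service result into an HTTP-style status and body."""
    greeting = service.greeting(user_id)
    if greeting is None:
        return 404, {"detail": "User not found"}
    return 200, {"message": greeting}


# Wiring happens in one place, outside the components themselves.
service = UserService(InMemoryUserRepository({1: "Ada"}))
```

Swapping InMemoryUserRepository for a database-backed class with the same find_name method would not require changes to UserService or the route, which is the replaceability the bullets describe.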
HTTP and APIs
- HTTP as a Universal Protocol — HTTP is the foundation of modern web APIs; you should understand requests/responses, methods (GET, POST, PUT, PATCH, DELETE), status codes, headers, and bodies.
- Resource-Oriented API Design (REST) — Good APIs are organized around nouns (resources), not verbs; URLs identify resources, HTTP methods express intent, and stateless request handling keeps servers scalable.
- Path Parameters vs. Query Parameters — Path parameters identify a specific resource; query parameters filter, sort, or refine a collection. Choosing correctly keeps APIs intuitive and consistent.
- Request/Response Contracts with Schemas — Defining precise request and response schemas (e.g., with Pydantic) makes an API self-documenting, enables automatic validation, and reduces integration bugs.
- HTTP Status Codes Communicate Outcomes — Returning the right status code (200, 201, 204, 400, 404, 422, 500, etc.) is part of the API contract; clients rely on these codes to handle success and failure correctly.
- OpenAPI / Interactive Docs as a Communication Tool — Auto-generated API documentation (e.g., /docs) is a living contract between backend and frontend teams and a first-class deliverable of good API design.
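To make the path-versus-query distinction and the role of status codes concrete, here is a small sketch using only the standard library. The /tasks resource, the TASKS data, and handle_get are hypothetical; a real framework would do this routing and validation for you.

```python
from http import HTTPStatus
from urllib.parse import parse_qs, urlsplit

# Hypothetical in-memory data standing in for a real persistence layer.
TASKS = {
    1: {"id": 1, "title": "Write tests", "done": True},
    2: {"id": 2, "title": "Deploy", "done": False},
}


def handle_get(url: str):
    """Return (status, body) for GET /tasks and GET /tasks/{task_id}."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    query = parse_qs(parts.query)

    if segments == ["tasks"]:
        # Query parameter: refines the collection, e.g. /tasks?done=true
        tasks = list(TASKS.values())
        if "done" in query:
            wanted = query["done"][0].lower() == "true"
            tasks = [t for t in tasks if t["done"] == wanted]
        return HTTPStatus.OK, tasks

    if len(segments) == 2 and segments[0] == "tasks":
        # Path parameter: identifies one specific resource, e.g. /tasks/2
        try:
            task_id = int(segments[1])
        except ValueError:
            return HTTPStatus.UNPROCESSABLE_ENTITY, {"detail": "task_id must be an integer"}
        task = TASKS.get(task_id)
        if task is None:
            return HTTPStatus.NOT_FOUND, {"detail": "Task not found"}
        return HTTPStatus.OK, task

    return HTTPStatus.NOT_FOUND, {"detail": "Not found"}
```

For example, handle_get("/tasks?done=false") filters the collection, while handle_get("/tasks/99") addresses a single resource that does not exist and so yields a 404 rather than an empty list.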
Databases and ORM
- SQL Fundamentals — The relational model, primary keys, foreign keys, unique constraints, indexes, filtering/sorting/aggregating queries, and ACID transactions are foundational knowledge for any backend engineer.
- Object-Relational Mapping (ORM) — ORMs bridge the gap between Python objects and relational database tables, letting developers work in the language of their domain rather than raw SQL while still leveraging relational storage.
- Sessions as Units of Work — A database session represents a bounded unit of work; changes accumulate in memory and are only persisted to the database on commit().
- Querying with an ORM — Building a query (select(), .where()) is separate from executing it (session.exec()); use .all(), .first(), or .one() intentionally, and prefer session.get() for primary-key lookups.
- Entity Relationships — One-to-many and many-to-many relationships are modeled with foreign keys and join tables; Relationship() declarations and back_populates keep both sides of a relationship in sync in memory.
- N+1 Query Problem — Naively loading a collection and then lazily fetching each related entity in a loop issues one query per row; recognizing and addressing this with joins or eager loading is critical for production performance.
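The N+1 pattern can be observed directly by counting queries. The sketch below uses the standard library's sqlite3 module rather than an ORM, so it is an analogy, not SQLModel code: the author and book tables are hypothetical, and the per-row lookup in the loop plays the role that a lazily loaded relationship attribute would play in an ORM.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE book (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        author_id INTEGER NOT NULL REFERENCES author(id)
    );
    INSERT INTO author (id, name) VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
    INSERT INTO book (title, author_id) VALUES
        ('Notes', 1), ('Compilers', 2), ('Kernels', 3);
""")

selects = []
# Record each SELECT so the two approaches can be compared by query count.
conn.set_trace_callback(
    lambda sql: selects.append(sql) if sql.lstrip().upper().startswith("SELECT") else None
)


def titles_with_authors_naive():
    """One query for the books, then one more query per book (N+1)."""
    result = []
    books = conn.execute("SELECT title, author_id FROM book ORDER BY id").fetchall()
    for title, author_id in books:
        (name,) = conn.execute(
            "SELECT name FROM author WHERE id = ?", (author_id,)
        ).fetchone()
        result.append((title, name))
    return result


def titles_with_authors_joined():
    """A single JOIN returns the same rows in one query."""
    return conn.execute(
        "SELECT book.title, author.name FROM book "
        "JOIN author ON author.id = book.author_id ORDER BY book.id"
    ).fetchall()


naive = titles_with_authors_naive()
naive_queries = len(selects)   # 1 query for books + 3 author lookups

selects.clear()
joined = titles_with_authors_joined()
joined_queries = len(selects)  # 1 query total
```

With three books the naive version issues four SELECT statements and the joined version issues one; the gap grows linearly with the number of rows, which is why the pattern matters at production scale.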
Containers and Deployment
- Containers and Reproducible Environments — A Docker image is an immutable snapshot; a running container is an isolated process started from that image; containers reproduce the same environment across development, CI, and production.
- Multi-Container Environments with Docker Compose — Docker Compose coordinates multiple services (e.g., application + database) so they start, stop, and communicate together; service names become DNS hostnames within the compose network.
- Engines, Connections, and Connection Strings — An engine is the application's entry point to a database; connection strings encode host, port, credentials, and database name, and in containerized environments the service name acts as the hostname.
- Volumes for Durable Data — A container's writable layer is ephemeral; named volumes persist data across container restarts, which is essential for database storage.
- Container Orchestration at a High Level (Kubernetes) — Kubernetes (and platforms like OKD) schedule and manage containers across a cluster, handle restarts, scaling, and networking; knowing that this layer exists and what problem it solves is essential context for any engineer deploying to the cloud.
- Deployment as Part of the Feature — A feature is not done when the code passes tests locally; it is done when it is running in production, which requires understanding deployment pipelines, environment parity, and the path from a commit to a live system.
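To show what a connection string encodes, the sketch below takes one apart with the standard library. The URL is entirely hypothetical: "db" is assumed to be a Compose service name, and the credentials are placeholders that a real project would read from the environment instead of writing in source code.

```python
from urllib.parse import urlsplit

# Hypothetical connection string; "db" would be the Compose service name,
# which resolves as a hostname inside the Compose network.
DATABASE_URL = "postgresql+psycopg://app_user:placeholder@db:5432/app_db"

parts = urlsplit(DATABASE_URL)
config = {
    "driver": parts.scheme,                # dialect and driver
    "username": parts.username,
    "password": parts.password,
    "host": parts.hostname,                # service name acting as hostname
    "port": parts.port,
    "database": parts.path.lstrip("/"),
}
```

Run outside the Compose network, the same string would typically need a different host (such as localhost with a published port), which is one reason connection strings are supplied through configuration instead of being fixed in code.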
Teamwork
- Agile Sprint Cadence — Many leading companies prioritize shipping software in short, time-boxed iterations (sprints); each sprint produces working software, surfaces surprises early, and keeps stakeholders informed with something tangible rather than a status update.
- Code Review as a Team Practice — Code review (per Google Engineering Practices) is not gatekeeping — it is the primary mechanism for spreading knowledge, catching bugs before they ship, maintaining consistent standards, and building collective ownership of the codebase.
- Pair Programming as a Quality Practice — Pairing (driver/navigator, ping-pong, strong-style) produces continuous code review, shared ownership, and faster onboarding — not just faster typing.
Tool Bench for Individual Contributors
Source Control
- Git Branching Model — Commits form a directed acyclic graph; branches are lightweight pointers to commits; understanding branch creation, switching, merging, fast-forwarding vs. merge commits, and the HEAD pointer is foundational for collaborating on any codebase.
- Syncing with Remote Repositories — fetch downloads new remote history without changing local branches; merge integrates it; pull does both. Knowing the difference prevents accidental overwrites and merge surprises.
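The difference between a fast-forward and a merge commit follows from the shape of the commit graph. The toy model below is not how Git is implemented; ToyRepo and its methods are hypothetical names, and the model tracks only commit ids, parent links, branch pointers, and HEAD.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """Toy commit: an id plus the ids of its parent commits."""
    id: str
    parents: tuple = ()


class ToyRepo:
    def __init__(self):
        self.commits = {"c0": Commit("c0")}
        self.branches = {"main": "c0"}  # branch name -> commit id
        self.head = "main"              # HEAD names the checked-out branch

    def _add(self, parents):
        commit = Commit(f"c{len(self.commits)}", tuple(parents))
        self.commits[commit.id] = commit
        self.branches[self.head] = commit.id  # current branch pointer advances
        return commit.id

    def commit(self):
        return self._add([self.branches[self.head]])

    def branch(self, name):
        self.branches[name] = self.branches[self.head]

    def switch(self, name):
        self.head = name

    def is_ancestor(self, ancestor, descendant):
        """Walk parent links from descendant looking for ancestor."""
        stack, seen = [descendant], set()
        while stack:
            current = stack.pop()
            if current == ancestor:
                return True
            if current not in seen:
                seen.add(current)
                stack.extend(self.commits[current].parents)
        return False

    def merge(self, source):
        ours, theirs = self.branches[self.head], self.branches[source]
        if self.is_ancestor(theirs, ours):
            return "already up to date"
        if self.is_ancestor(ours, theirs):
            self.branches[self.head] = theirs  # no divergence: move the pointer
            return "fast-forward"
        self._add([ours, theirs])              # diverged: commit with two parents
        return "merge commit"


repo = ToyRepo()
repo.branch("feature")
repo.switch("feature")
repo.commit()                          # feature moves ahead of main
repo.switch("main")
first_merge = repo.merge("feature")    # main has no new commits of its own

repo.commit()                          # main now diverges from feature
repo.switch("feature")
repo.commit()
repo.switch("main")
second_merge = repo.merge("feature")   # both sides have unique commits
```

In this model the first merge only moves the main pointer, while the second must create a new commit with two parents because neither branch tip is an ancestor of the other.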
Environment and Configuration
- Dependency / Package Managers — Package managers pin and reproduce exact dependency trees across machines and environments; understanding lockfiles, virtual environments, and version constraints is essential for reproducible builds.
- Environment Variables and Secrets Management — API keys, credentials, and other secrets must never be hard-coded or committed to source control; environment variables (and secret stores in production) are the standard way to inject sensitive configuration at runtime.
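Reading a secret from the environment, and failing fast when it is missing, can be sketched as follows. The helper require_env and the variable name EXAMPLE_API_KEY are hypothetical; the setdefault line exists only so the example runs on its own and would not appear in real code.

```python
import os


def require_env(name: str) -> str:
    """Read a required setting from the environment, failing fast if it is missing."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# Demonstration only: a real deployment sets this outside the program
# (shell, Compose file, or secret store), never in source code.
os.environ.setdefault("EXAMPLE_API_KEY", "placeholder-not-a-real-key")

api_key = require_env("EXAMPLE_API_KEY")
```

Raising at startup when a required variable is absent tends to be easier to diagnose than a failure deep inside a request that first tries to use the missing value.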
AI Tools
- LLMs and AI Agents in Software Development — LLMs predict likely next tokens given a context window; AI agents layer tool-calling on top of that loop to take actions. Understanding these mechanics helps engineers use AI tools effectively and audit their output critically.
- Context Engineering — The quality of an AI agent's output is directly shaped by what is in its context window; engineers who structure prompts, supply relevant code/docs, and constrain scope get better results than those who treat the model as a magic oracle.
- Auditing and Owning AI-Generated Code — Generating code with an agent is only the first step; the engineer is responsible for verifying correctness, security, style, and fitness before committing it.
Key Terminology
ACID — A set of guarantees for database transactions: Atomicity (all-or-nothing), Consistency (rules are preserved), Isolation (concurrent transactions don't interfere), Durability (committed data survives crashes).
Architectural Decision Record (ADR) — A short document that records a significant technical decision, the context that motivated it, the options considered, and the rationale for the choice made.
Arrange-Act-Assert (AAA) — A pattern for structuring tests into three phases: set up the preconditions, perform the action under test, then verify the expected outcome.
Branch (git) — A lightweight, movable pointer to a specific commit; branches allow parallel lines of development without affecting each other.
Commit (git) — A snapshot of the repository at a point in time; each commit records changes, a message, and a pointer to its parent(s), forming a directed acyclic graph of history.
Connection String — A URL-formatted string that encodes the driver, host, port, credentials, and database name needed for an application to connect to a database.
Container — An isolated, lightweight process that packages an application with its dependencies using a shared OS kernel; defined by a Docker image and run as an instance of it.
Context Window — The maximum amount of text (tokens) an LLM can consider at once; the quality of generated output is bounded by what fits in and is included in this window.
Dependency Injection (DI) — A design pattern where a component's collaborators are supplied from outside rather than constructed internally, improving modularity and testability.
Docker Compose — A tool for defining and running multi-container applications; services, networks, and volumes are declared in a YAML file and started together with a single command.
Docker Image — An immutable, layered snapshot of a filesystem and runtime configuration used as the blueprint for creating containers.
Engine (SQLAlchemy/SQLModel) — The object that manages a pool of database connections and serves as the application's single entry point to the database.
Epic — A large body of related work (typically a full feature or capability) that is broken down into individual user stories for planning and delivery.
Entity (ORM) — A Python class mapped to a database table; instances of the entity correspond to rows, and class attributes correspond to columns.
Environment Variable — A named value injected into a process from the operating system environment, used to supply configuration (especially secrets) without hard-coding them in source code.
Fast-Forward Merge — A merge where the target branch pointer simply advances to the source branch tip because no divergent history exists; no merge commit is created.
Foreign Key — A column in one table whose value must match a primary key in another table, enforcing a referential integrity constraint between the two.
HEAD (git) — A special pointer that indicates the currently checked-out commit or branch; it moves automatically as you commit or switch branches.
HTTP Method — A verb in an HTTP request (GET, POST, PUT, PATCH, DELETE) that expresses the intended operation on a resource.
HTTP Status Code — A three-digit numeric code in an HTTP response that communicates the outcome of a request (e.g., 200 OK, 404 Not Found, 422 Unprocessable Entity).
Integration Test — A test that exercises two or more real components working together (e.g., a service method against a real database) to verify their coordination.
Kubernetes — An open-source container orchestration platform that automates deployment, scaling, networking, and lifecycle management of containerized applications across a cluster.
Large Language Model (LLM) — A neural network trained on large text corpora that generates likely next tokens given an input context; the foundation of modern AI coding assistants and chat tools.
Layered Architecture — A system design where responsibilities are separated into distinct tiers (e.g., routing → service → persistence) so that each layer only interacts with the one directly below it.
Lockfile — A file generated by a package manager (e.g., poetry.lock, or a fully pinned requirements.txt produced by pip freeze) that records the exact resolved versions of all dependencies for reproducible installs.
Merge Commit — A commit with two parents created when integrating two diverged branches; it preserves the full history of both lines of development.
Mock / Test Double — An object that stands in for a real dependency in a test, allowing the test to control and inspect the dependency's behavior without invoking the real implementation.
N+1 Query Problem — A performance anti-pattern where loading N items and then lazily fetching a related object for each item results in N+1 total database queries instead of one.
OpenAPI — A standard specification format for describing HTTP APIs; frameworks like FastAPI auto-generate an OpenAPI document (visible at /docs) from route and schema definitions.
ORM (Object-Relational Mapper) — A library that maps Python objects to relational database rows, allowing developers to read and write data using their domain model rather than raw SQL.
Package Manager — A tool (e.g., pip, poetry) that resolves, installs, and tracks the library dependencies of a project.
Patching — Replacing a name in a module's namespace with a test double during a test; must be applied at the location where the name is used, not where it is originally defined.
Persona — A named, archetypal description of a target user — their role, goals, and pain points — used to keep design decisions grounded in real human needs.
Primary Key — A column (or set of columns) in a table whose value uniquely identifies each row; every table should have one.
Refactoring — Changing the internal structure of code without changing its observable behavior, typically to improve clarity, reduce duplication, or swap an implementation.
Regression — A bug introduced by a change that breaks behavior that was previously working correctly.
REST (Representational State Transfer) — An architectural style for HTTP APIs centered on stateless requests, resource-oriented URLs (nouns, not verbs), and standard HTTP methods to express intent.
Session (SQLAlchemy/SQLModel) — A unit-of-work object that tracks changes to entities in memory and flushes them to the database on commit(); should be scoped to a single request in web applications.
Sprint — A short, time-boxed iteration (typically 1–2 weeks) in Agile development that ends with a working, demonstrable increment of the product.
SQL — Structured Query Language; the standard language for defining schemas and querying, inserting, updating, and deleting data in relational databases.
Test Coverage — A metric expressing what percentage of lines (or branches) in the production code were executed during the test suite; high coverage is necessary but not sufficient for quality.
Test Fixture — Setup code (often a pytest fixture) that provides a known, isolated starting state for a test, such as a fresh database schema or a pre-populated object.
Tool Calling (AI Agents) — The ability of an LLM-based agent to invoke external functions or APIs (e.g., run a shell command, read a file) as part of generating a response, enabling autonomous multi-step workflows.
Unit Test — A test that exercises a single component in isolation, with all dependencies replaced by test doubles, to verify its logic independently of the rest of the system.
User Story — A short, user-focused requirement written as "As a [persona], I want [action] so that [benefit]," paired with concrete acceptance criteria that define when the story is complete.
Virtual Environment — An isolated Python environment with its own interpreter and installed packages, preventing dependency conflicts between different projects on the same machine.
Volume (Docker) — A named, managed storage location that persists data outside a container's writable layer, surviving container restarts and replacements.
Wireframe — A low-to-mid-fidelity visual sketch of a UI screen that communicates layout, content, and user flow without committing to final visual design.