Cloud Models: IaaS, PaaS, SaaS, FaaS, CaaS

Overview

Cloud computing is fundamentally a shift in how computing resources are provisioned, owned, and operated. Rather than building and maintaining physical infrastructure, organizations consume compute, storage, and networking as a utility — paying for what they use, scaling elastically, and delegating operational concerns to the provider. The critical insight is not just technical but economic: the transformation from capital expenditure (CapEx) to operational expenditure (OpEx) changes the financial risk profile of technology investment entirely.

This file covers the taxonomy of cloud service models, the economics that drive cloud adoption, the architectural distinction between cloud-native and cloud-hosted workloads, and the tradeoffs between public, private, hybrid, and multi-cloud deployments.

Prerequisites

Familiarity with TCP/IP networking and DNS
Understanding of virtualization concepts (hypervisors, VMs)
Basic knowledge of operating systems and process isolation
General awareness of enterprise IT organizational structures

Service Model Pyramid

The classic abstraction ladder from raw hardware to fully managed software:

       +------------------------------+
       |           SaaS               |  <- Gmail, Salesforce, Slack
       |  (Software as a Service)     |     You manage: nothing (just data)
       +------------------------------+
       |           PaaS               |  <- Heroku, Cloud Run, App Engine
       |  (Platform as a Service)     |     You manage: code + config
       +------------------------------+
       |      FaaS / CaaS             |  <- Lambda, Cloud Functions (FaaS)
       |  (Function / Container)      |     EKS, GKE, AKS (CaaS)
       +------------------------------+
       |           IaaS               |  <- EC2, GCE, Azure VMs
       |  (Infrastructure as a Srvc)  |     You manage: OS + runtime + app
       +------------------------------+
       |       On-Premises            |  <- Your data center
       |    (Physical Hardware)       |     You manage: everything
       +------------------------------+

   For each model, the PROVIDER operates every layer of the stack beneath it.
   YOU manage what the right-hand column lists for that model.

Service Model Definitions

IaaS — Infrastructure as a Service

The provider supplies virtual machines, storage volumes, and network primitives. You install and manage the OS, runtime, middleware, and application.

Real examples: AWS EC2, Google Compute Engine, Azure Virtual Machines, DigitalOcean Droplets.

Use when you need full OS control, custom kernel parameters, non-standard runtimes, or when migrating existing applications without refactoring (lift-and-shift).

PaaS — Platform as a Service

The provider manages OS, runtime, middleware, and scaling. You deploy application code and configuration. No server management.

Real examples: Heroku, Google App Engine (Standard), Azure App Service, Fly.io.

Use when you want to focus entirely on application logic and accept the platform's constraints on runtime versions and scaling behavior.

SaaS — Software as a Service

The provider delivers a complete application over the network. You consume functionality; you manage only your data and user configuration.

Real examples: Gmail, Salesforce, Slack, GitHub, Datadog, Snowflake.

The buyer is typically a business, not a developer. SaaS eliminates all infrastructure and application maintenance concerns.

FaaS — Function as a Service

Execute individual functions in response to events. No server provisioning; billing is per-invocation and per-GB-second of execution. Scale to zero means zero cost when idle.
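
The billing formula above can be sketched as a small estimator. This is a rough illustration, not a pricing tool: the function name is hypothetical, both prices are caller-supplied placeholders, and free tiers and per-provider rounding rules are ignored.

```python
def faas_monthly_cost(invocations, avg_duration_ms, memory_mb,
                      price_per_million_requests, price_per_gb_second):
    """Estimate FaaS cost from request count plus GB-seconds of execution.

    Both price arguments are placeholders supplied by the caller; look up
    current rates on the provider's pricing page.
    """
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    request_cost = invocations / 1_000_000 * price_per_million_requests
    compute_cost = gb_seconds * price_per_gb_second
    return request_cost + compute_cost


# Illustrative prices only (not quoted from any provider).
print(faas_monthly_cost(1_000_000, 100, 512, 0.20, 0.00002))
# Scale to zero: an idle function costs nothing under this model.
print(faas_monthly_cost(0, 100, 512, 0.20, 0.00002))  # 0.0
```

Doubling either the configured memory or the average duration doubles the compute term, which is why right-sizing memory is usually the first FaaS cost lever.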

Real examples: AWS Lambda, Google Cloud Functions, Azure Functions, Cloudflare Workers.

The unit of deployment is a function, not a service. Cold starts are the primary operational challenge.
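
A minimal sketch of why cold starts happen, assuming the AWS Lambda Python handler convention of `handler(event, context)`. The `_load_config` helper and the invocation counter are hypothetical and exist only for illustration: module-level code runs once per new execution environment, while the handler body runs on every invocation.

```python
def _load_config():
    """Hypothetical stand-in for expensive setup (SDK clients, models, config)."""
    return {"greeting": "hello"}


# Module scope: executed once per execution environment, i.e. on a cold start.
CONFIG = _load_config()
_invocations = 0


def handler(event, context):
    """Entry point in the style of an AWS Lambda Python handler."""
    global _invocations
    _invocations += 1
    return {
        # True only for the first call served by this environment.
        "cold_start": _invocations == 1,
        "message": f"{CONFIG['greeting']}, {event.get('name', 'world')}",
    }
```

Keeping expensive initialization at module scope means warm invocations reuse it. The cost is paid again whenever the platform creates a new environment.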

CaaS — Container as a Service

Run containerized workloads on a managed container orchestration platform. The provider manages the control plane (scheduler, API server); you manage container images and deployment configurations.

Real examples: AWS EKS, Google GKE, Azure AKS, Amazon ECS.

You get container portability without operating the orchestrator's control plane yourself. The line between CaaS and PaaS is blurring as managed container platforms add more abstraction layers.

Shared Responsibility Model

Security obligations shift with each abstraction layer. Understanding where provider responsibility ends and customer responsibility begins is essential for compliance.

Layer              On-Prem     IaaS        PaaS        SaaS
-----------------------------------------------------------------
Physical security  Customer    Provider    Provider    Provider
Hypervisor/Host    Customer    Provider    Provider    Provider
Network controls   Customer    Shared      Provider    Provider
Operating system   Customer    Customer    Provider    Provider
Runtime/Middleware Customer    Customer    Provider    Provider
Application code   Customer    Customer    Customer    Provider
Data encryption    Customer    Customer    Customer    Customer
Access management  Customer    Customer    Customer    Customer
Data governance    Customer    Customer    Customer    Customer

The most dangerous assumption is treating IaaS as a security guarantee. AWS secures the hypervisor; you are fully responsible for misconfigured S3 buckets, unpatched OS instances, and overprivileged IAM roles. The vast majority of cloud security breaches originate from customer-side misconfigurations, not provider-side failures.
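
One way to read the matrix above: the layers are ordered so that the customer always owns a contiguous tail of the list, and each step up the abstraction ladder moves the boundary further down. The sketch below encodes that reading. The names are hypothetical, and "Shared" is counted as customer-owned because the customer still has work to do there.

```python
LAYERS = [
    "Physical security",
    "Hypervisor/Host",
    "Network controls",
    "Operating system",
    "Runtime/Middleware",
    "Application code",
    "Data encryption",
    "Access management",
    "Data governance",
]

# Index of the first layer where the customer has full or shared
# responsibility, read off the matrix above.
FIRST_CUSTOMER_LAYER = {"On-Prem": 0, "IaaS": 2, "PaaS": 5, "SaaS": 6}


def customer_layers(model):
    """Return the layers the customer must secure under the given model."""
    return LAYERS[FIRST_CUSTOMER_LAYER[model]:]


for model in FIRST_CUSTOMER_LAYER:
    print(model, len(customer_layers(model)))
```

Data encryption, access management, and data governance appear in every result: no service model takes those off the customer's plate.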

Cloud Economics

CapEx to OpEx Transformation

Traditional IT: buy servers (capital expenditure), depreciate over 3-5 years, pay whether or not they're used. Peak capacity must be provisioned for worst-case load, meaning average utilization at large enterprises often sits at 10-20%.

Cloud IT: pay per hour of compute, per GB of storage, per request. Capacity scales with actual demand. Failed experiments cost only the compute time spent failing. This replaces large upfront capital outlays with operating costs that track usage: there is no up-front commitment, but the bill varies from month to month.
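
The utilization argument is simple arithmetic, sketched below. The function names are hypothetical, and the comparison assumes cloud capacity is paid for only while it is in use, which is the best case for cloud.

```python
def cost_per_utilized_hour(hourly_cost, utilization):
    """Effective cost of one hour of useful work on owned hardware."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return hourly_cost / utilization


def cloud_is_cheaper(owned_hourly_cost, utilization, cloud_hourly_price):
    """Compare pay-per-use cloud pricing against amortized owned hardware."""
    return cloud_hourly_price < cost_per_utilized_hour(owned_hourly_cost, utilization)


# At 20% utilization, each useful hour costs 5x the amortized hourly cost.
print(cost_per_utilized_hour(1.0, 0.2))  # 5.0
```

At the 10-20% utilization quoted above, owned hardware costs 5-10x its amortized hourly rate per useful hour, so a cloud price several times higher per hour can still come out ahead.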

Economies of Scale

Cloud providers operate at a scale that enables procurement economics impossible for individual enterprises: bulk hardware discounts, custom silicon (AWS Graviton, Google TPU), negotiated power contracts, and large-scale automation. These savings are partially passed to customers.

AWS estimates typical enterprises achieve 30-50% infrastructure cost reduction by moving workloads to cloud — though this figure is contested and depends heavily on workload characteristics, team cloud maturity, and whether costs are properly attributed.

Pricing Models

On-Demand: Full price, no commitment. Use for unpredictable or short-lived workloads.

Reserved Instances / Committed Use: 1- or 3-year commitment in exchange for a discount of roughly 40-72%, depending on provider, term, and payment option. Use for steady-state base load. Risk: over-committing to instance types that become obsolete.

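Spot / Preemptible: Run on the provider's unused capacity at a discount of roughly 60-90%. The provider can reclaim the instance on short notice (2 minutes on AWS, about 30 seconds on GCP and Azure). AWS no longer requires bidding: you pay the current Spot price, optionally capped by a maximum you set. Use for fault-tolerant batch jobs, ML training, stateless workers. Avoid for databases, stateful services, or latency-sensitive user traffic.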

Savings Plans (AWS): More flexible than RIs — commit to a dollar amount of spend per hour rather than a specific instance type.
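
The choice between on-demand and a commitment reduces to a break-even utilization. With on-demand price P, discount d, and the instance running a fraction u of the term, the commitment costs P(1 - d) for every hour while on-demand costs P * u, so the commitment wins when u > 1 - d. A minimal sketch with hypothetical function names, ignoring upfront-payment options and resale:

```python
def break_even_utilization(discount):
    """Fraction of the term an instance must run for a commitment to pay off."""
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    return 1.0 - discount


def cheaper_option(discount, utilization):
    """Pick the cheaper pricing model for an expected utilization."""
    if utilization > break_even_utilization(discount):
        return "reserved"
    return "on-demand"


print(cheaper_option(0.40, 0.5))  # on-demand: break-even is 60% utilization
print(cheaper_option(0.72, 0.5))  # reserved: break-even is 28% utilization
```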

Cloud-Native vs Cloud-Hosted

This distinction matters architecturally, not just semantically.

Cloud-Hosted (Lift-and-Shift)

Take an existing application designed for on-premises and run it on cloud VMs without modification. You gain on-demand provisioning and OpEx billing, but not automatic elasticity or resilience. A stateful monolith on a single EC2 instance is not cloud-native: it fails exactly like an on-prem server and cannot auto-scale.

Characteristics: stateful, expects persistent local storage, single-instance, assumes stable IP addresses, requires long initialization time.

Cloud-Native

Applications designed from scratch for cloud primitives: infrastructure is treated as ephemeral, state lives in managed services (RDS, S3, DynamoDB), instances are fungible, configuration comes from environment variables, and horizontal scaling is automatic.

The Twelve-Factor App methodology (Heroku, 2011) codified these principles: stateless processes, treat backing services as attached resources, config in environment, disposable processes (fast start/stop).

Characteristics: stateless processes, external state storage, horizontal scaling, health check endpoints, graceful shutdown handling, structured logging to stdout.
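
Several of these characteristics can be sketched in a few lines of standard-library Python. This is an outline, not a production server: the `PORT` and `DATABASE_URL` variable names, the log field names, and the `health` return shape are all assumptions made for illustration.

```python
import json
import os
import signal
import sys
import threading


def load_config(env=None):
    """Read configuration from environment variables (names are assumptions)."""
    env = os.environ if env is None else env
    return {
        "port": int(env.get("PORT", "8080")),
        "database_url": env.get("DATABASE_URL", ""),
    }


def log(event, **fields):
    """Structured logging: one JSON object per line on stdout."""
    sys.stdout.write(json.dumps({"event": event, **fields}) + "\n")
    sys.stdout.flush()


shutting_down = threading.Event()


def handle_sigterm(signum, frame):
    """Graceful shutdown: stop accepting work, let in-flight requests finish."""
    log("shutdown_requested", signal=signum)
    shutting_down.set()


def health():
    """Health check result: report 'draining' once shutdown has begun."""
    return (503, "draining") if shutting_down.is_set() else (200, "ok")


def main():
    signal.signal(signal.SIGTERM, handle_sigterm)
    config = load_config()
    log("started", port=config["port"])
    while not shutting_down.wait(timeout=1.0):
        pass  # a real service would accept and handle requests here
    log("stopped")
```

Calling `main()` starts the loop. An orchestrator that sends SIGTERM before killing the process gives it time to drain, and a failing health check tells the load balancer to stop routing new requests to it.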

Public vs Private vs Hybrid vs Multi-Cloud

Public Cloud

Shared infrastructure operated by a provider (AWS, GCP, Azure). Resources are logically isolated via virtualization. Benefits: scale, breadth of services, no infrastructure management. Concerns: regulatory compliance for sensitive data, vendor lock-in, unpredictable egress costs.

Private Cloud

Cloud-like infrastructure operated exclusively for one organization, either on-premises (VMware vSphere, OpenStack) or in a dedicated hosted facility. Provides compliance control and predictable performance at the cost of scale and operational overhead.

Hybrid Cloud

Combination of private and public cloud with workload portability between them. Typical pattern: sensitive data processing on-premises, burst compute on public cloud. Requires consistent networking (VPN or Direct Connect), consistent identity (SSO/SAML federation), and consistent tooling (Terraform, Kubernetes).

Complexity is the primary cost. Maintaining two operational environments means maintaining two sets of skills, runbooks, and failure modes.

Multi-Cloud

Running workloads across multiple public cloud providers (e.g., both AWS and GCP). Motivations: avoid vendor lock-in, leverage best-of-breed services, regulatory requirements for geographic redundancy.

The operational complexity cost is severe. Each provider has different APIs, IAM models, networking primitives, and SLA definitions. Truly portable workloads require lowest-common-denominator abstractions that sacrifice cloud-native optimizations. Most "multi-cloud" strategies in practice mean "primary cloud + secondary for specific services."
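
The lowest-common-denominator tradeoff can be made concrete with a provider-neutral interface. The sketch below is hypothetical throughout (`ObjectStore`, `InMemoryStore`, and `archive_report` are illustrative names): application code depends only on `put` and `get`, so it ports easily, but anything a specific provider offers beyond those two calls is out of reach through the interface.

```python
from typing import Dict, Protocol


class ObjectStore(Protocol):
    """Smallest interface that every object storage service can satisfy."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class InMemoryStore:
    """Stand-in backend; a real adapter would wrap one provider's SDK."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


def archive_report(store: ObjectStore, report_id: str, body: bytes) -> str:
    """Application code sees only the portable interface."""
    key = f"reports/{report_id}"
    store.put(key, body)
    return key


store = InMemoryStore()
key = archive_report(store, "2024-01", b"quarterly numbers")
print(key, store.get(key))
```

Each additional method added to the interface must be implemented, and behave identically, on every provider, which is where the maintenance cost of portability accumulates.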

Major Cloud Provider Comparison

Service Category  | AWS                    | GCP                   | Azure
------------------+------------------------+-----------------------+------------------------
Compute (VMs)     | EC2                    | Compute Engine        | Virtual Machines
Managed K8s       | EKS                    | GKE                   | AKS
Serverless FaaS   | Lambda                 | Cloud Functions       | Azure Functions
Object Storage    | S3                     | Cloud Storage         | Blob Storage
Block Storage     | EBS                    | Persistent Disk       | Managed Disks
Managed RDBMS     | RDS (Aurora, Postgres) | Cloud SQL / AlloyDB   | Azure SQL Database
NoSQL             | DynamoDB               | Firestore / Bigtable  | Cosmos DB
Data Warehouse    | Redshift               | BigQuery              | Synapse Analytics
CDN               | CloudFront             | Cloud CDN             | Azure CDN / Front Door
DNS               | Route 53               | Cloud DNS             | Azure DNS
VPN/Direct        | Direct Connect         | Cloud Interconnect    | ExpressRoute
Identity          | IAM / Cognito          | Cloud IAM / Identity  | Entra ID (formerly Azure AD)
Container Reg.    | ECR                    | Artifact Registry     | Azure Container Reg.
Service Mesh      | App Mesh               | Traffic Director      | Istio add-on for AKS

AWS leads in market share (roughly 30% by most analyst estimates) and service breadth. GCP leads in data analytics and ML infrastructure (TPUs, BigQuery). Azure leads in enterprise integration (Active Directory, Office 365 ecosystem).

Colocation vs Cloud

Colocation (colo): you own physical servers, rent rack space and power in a third-party data center. You get physical access, predictable performance, no hypervisor overhead, and no egress costs. You sacrifice elasticity and must provision for peak capacity.

Cloud makes sense when: workloads are variable, time-to-provision matters, managed services add significant value (RDS vs self-managing Postgres), team lacks data center operational expertise.

Colo makes sense when: workloads are stable and predictable, egress costs at cloud scale are prohibitive (large data analytics companies), latency-sensitive workloads require dedicated hardware (HFT, real-time audio processing), regulatory requirements mandate physical control over hardware, cost at scale is lower (large companies often break even vs cloud at $1-5M/year cloud spend).

The "cloud is always cheaper" argument breaks down at scale. Dropbox saved approximately $75M over two years by repatriating workloads from AWS to their own infrastructure in 2016-2017.

Debugging Notes

Cloud billing anomalies are almost always caused by data egress, NAT Gateway, or forgotten running resources (idle Elastic IPs, unused EBS volumes). Enable billing alerts and AWS Cost Anomaly Detection.
IaaS security misconfigurations: audit S3 bucket policies, Security Group rules (0.0.0.0/0 ingress), and IAM roles with * actions on * resources using tools like ScoutSuite, Prowler, or AWS Trusted Advisor.
Cold start debugging (FaaS and scale-to-zero PaaS): instrument with distributed tracing (OpenTelemetry) to separate initialization overhead from request handling time.
Multi-cloud network debugging: inter-cloud latency is typically 5-20ms when both endpoints are in US East regions, but packet loss characteristics differ between paths. Use MTR, not just ping.
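
The two misconfiguration checks named above can be approximated in a few lines for quick triage. This sketch works on plain dictionaries: the IAM check follows the documented JSON policy layout (`Statement`, `Effect`, `Action`, `Resource`), while the ingress rule shape (`cidr`, `port`) is a simplified assumption and not the EC2 API response format. It does not replace ScoutSuite or Prowler.

```python
def _as_list(value):
    """IAM allows a single value or a list in most policy fields."""
    return value if isinstance(value, list) else [value]


def wildcard_statements(policy):
    """Return Allow statements that grant every action on every resource."""
    findings = []
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        actions = _as_list(stmt.get("Action", []))
        resources = _as_list(stmt.get("Resource", []))
        if "*" in actions and "*" in resources:
            findings.append(stmt)
    return findings


def open_ingress(rules):
    """Return ingress rules reachable from the whole internet."""
    return [rule for rule in rules if rule["cidr"] == "0.0.0.0/0"]


policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "*", "Resource": "*"},
        {"Effect": "Allow", "Action": ["s3:GetObject"], "Resource": "*"},
    ],
}
print(len(wildcard_statements(policy)))  # 1
```

A real audit also has to handle partial wildcards such as `s3:*`, `NotAction`, and conditions, which is part of why dedicated tools exist.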

Security Implications

Cloud IAM is the new perimeter. Overprivileged IAM roles are among the most common attack vectors in cloud environments. Apply least privilege; use instance roles, not static credentials.
Secrets in environment variables (common PaaS pattern) are accessible to anyone with process introspection. Use a secrets manager (AWS Secrets Manager, HashiCorp Vault) with dynamic short-lived credentials.
Shared responsibility is a legal/compliance concept, not just a technical one. Understanding it is essential for SOC2, HIPAA, and PCI DSS compliance audits.
Multi-cloud increases attack surface: each provider adds IAM systems, credential stores, and network controls that must all be properly configured.

Performance Implications

IaaS VM performance varies based on noisy neighbor effects. Use dedicated instances or hosts for consistent performance (database workloads, latency-sensitive services).
PaaS platforms add layers of abstraction that introduce latency. Heroku routing mesh adds ~1-3ms. Test with realistic traffic patterns.
FaaS cold starts can add hundreds of milliseconds to P99 latency. For latency-sensitive paths, Provisioned Concurrency (Lambda) or minimum instances (Cloud Run) keep instances warm and avoid most cold starts, at the cost of paying for idle capacity.
Object storage (S3, GCS) has high throughput but ~100ms baseline latency per operation. Never use object storage for hot database access patterns.
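
Why cold starts show up at P99 rather than in the median can be checked with a small synthetic example. The latencies below are made up (20 ms warm, a 400 ms cold penalty), the function names are hypothetical, and the percentile uses the nearest-rank method.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def simulate(total, cold_fraction, warm_ms=20.0, cold_penalty_ms=400.0):
    """Build a synthetic latency sample with a given share of cold starts."""
    cold = round(total * cold_fraction)
    return [warm_ms + cold_penalty_ms] * cold + [warm_ms] * (total - cold)


two_percent_cold = simulate(1000, 0.02)
print(percentile(two_percent_cold, 50))  # 20.0
print(percentile(two_percent_cold, 99))  # 420.0
```

With 2% of requests cold, the median is untouched while P99 carries the full penalty. If the cold fraction drops below 1%, the penalty moves out of P99 in this model, although the affected users still experience it.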

Failure Modes

Single-AZ deployment: any AZ-level failure takes down your service. Always span at least 2 AZs for production.
Control plane vs data plane: cloud service control planes (creating/modifying resources) are less reliable than data planes (serving traffic). Design so that your application continues serving traffic even if you cannot make API calls to create new resources.
Cloud provider SLAs cover infrastructure availability, not application correctness. A 99.99% SLA still allows roughly 52 minutes of downtime per year, and SLA credits are a fraction of the revenue lost.
Egress cost surprises: inter-region and internet-facing data transfer costs are often underestimated in architecture planning. Model egress costs explicitly.
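
The availability arithmetic behind the SLA and multi-AZ points can be sketched as follows. The function names are hypothetical, and the redundancy formula assumes replicas fail independently, which correlated failures (a bad deploy, a regional outage) violate in practice.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def allowed_downtime_minutes(availability_percent):
    """Yearly downtime permitted by an availability target."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


def serial_availability(*components):
    """Availability of a request path that needs every component to be up."""
    result = 1.0
    for availability in components:
        result *= availability
    return result


def redundant_availability(single, copies):
    """Availability when any one of several independent replicas suffices."""
    return 1 - (1 - single) ** copies


print(round(allowed_downtime_minutes(99.99), 2))  # 52.56
print(round(redundant_availability(0.99, 2), 6))  # 0.9999
```

Chaining dependencies lowers availability while adding replicas raises it: two 99.9% services in series give about 99.8%, whereas two 99% replicas in parallel give about 99.99% under the independence assumption.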

Modern Usage

Serverless-first architectures are increasingly common for new applications: Lambda + API Gateway + DynamoDB + S3 eliminates all server management. The operational model shifts from "manage servers" to "manage IAM policies and event routing."

FinOps (cloud financial operations) has emerged as a discipline specifically to manage cloud spend — with dedicated tools (CloudHealth, Apptio Cloudability), practices (tagging standards, showback/chargeback), and organizational roles.
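
Showback depends on consistent tagging. As a minimal sketch, the function below groups billing line items by an owner tag and surfaces untagged spend. The line-item shape (`cost`, `tags`) and the `team` tag key are assumptions for illustration, not any provider's billing export format.

```python
from collections import defaultdict


def showback(line_items, tag_key="team"):
    """Sum cost per tag value; spend without the tag lands in 'untagged'."""
    totals = defaultdict(float)
    for item in line_items:
        owner = item.get("tags", {}).get(tag_key, "untagged")
        totals[owner] += item["cost"]
    return dict(totals)


items = [
    {"cost": 10.0, "tags": {"team": "search"}},
    {"cost": 5.0, "tags": {"team": "search"}},
    {"cost": 7.5, "tags": {"env": "dev"}},
    {"cost": 2.5},
]
print(showback(items))  # {'search': 15.0, 'untagged': 10.0}
```

The size of the untagged bucket is a common first metric for tagging compliance.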

Future Directions

Confidential computing (AWS Nitro Enclaves, Azure Confidential VMs, Google Confidential Computing) — process sensitive data in hardware-encrypted enclaves invisible even to the cloud provider.
Edge computing extending cloud primitives to the network edge: Cloudflare Workers, AWS Lambda@Edge, Fastly Compute@Edge.
Sovereign cloud regions: localized deployments meeting national data residency requirements (EU, China, GCC).
AI/ML as primary cloud workload driver: GPU/TPU availability becoming a differentiating factor in cloud provider selection.

Exercises

Calculate the break-even point between on-demand and reserved pricing for a steady-state workload using your cloud provider's pricing calculator.
Audit an existing application's architecture against the Twelve-Factor App criteria. Identify which factors it violates and what changes would be required to make it cloud-native.
Build a shared responsibility matrix for a HIPAA-regulated application deployed on AWS EKS. Identify exactly which controls you are responsible for.
Estimate the monthly egress cost for an application serving 1TB/day of video content from S3 to end users across US, EU, and APAC.
Design a multi-AZ architecture for a stateful PostgreSQL-backed web application that survives a single AZ failure with RTO < 60 seconds and RPO < 5 seconds.

References

Amazon Web Services: Overview of Amazon Web Services (AWS Whitepaper)
NIST Special Publication 800-145: The NIST Definition of Cloud Computing
Twelve-Factor App: https://12factor.net (Wiggins, Heroku, 2011)
Dropbox, Inc.: Form S-1 Registration Statement (SEC, 2018), section on infrastructure optimization cost savings
Google SRE Book, Chapter 26: Data Integrity
Werner Vogels, "A Decade of AWS" (re:Invent 2016)
FinOps Foundation: https://www.finops.org