Container-Optimized OS: A Pragmatic Approach to Running Containers on Google Cloud

As cloud computing continues to evolve, so does the need for efficient, secure, and scalable ways to run containerized applications. In this essay, we’ll look at what makes Container-Optimized OS (COS) stand out: its security advantages, particularly root filesystem immutability and seamless updates, and how it can be used with regional managed instance groups for scaling. We’ll also explore the role of startup scripts and other practical considerations for engineering teams.


A Brief Overview of Container-Optimized OS

Container-Optimized OS is a lightweight, secure operating system image designed by Google specifically for running containers on GCP. Based on the open-source Chromium OS project, COS is tailored to offer a minimal footprint, reducing potential attack surfaces and simplifying maintenance. It comes pre-installed with essential tools like Docker and containerd, enabling teams to deploy containers out of the box without additional setup.

One of the key benefits of COS is its tight integration with GCP services. It’s the default node OS for Google Kubernetes Engine (GKE) and is optimized for Google’s infrastructure, providing automatic updates and security patches directly from Google. For teams already invested in GCP, COS offers an easy path to deploying and managing containerized workloads efficiently.
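
If you manage instance templates with Terraform, the latest stable COS image can be resolved directly from the public cos-cloud project. A minimal sketch (the data source name is arbitrary):

# Resolve the most recent image in the stable COS family
data "google_compute_image" "cos_stable" {
  family  = "cos-stable"
  project = "cos-cloud"
}

# Reference it elsewhere, e.g. as the boot disk image of an instance template:
# source_image = data.google_compute_image.cos_stable.self_link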


Security Review: RootFS Immutability and Easy Updates

Security is often a paramount concern when running applications in the cloud. COS addresses this head-on with several built-in features, notably root filesystem immutability and automatic updates.

Root Filesystem Immutability

The root filesystem in COS is mounted read-only. This design choice significantly enhances security by preventing unauthorized or accidental modifications to core operating system files. Checksums of the root filesystem are computed at build time, and the kernel verifies them on each boot, ensuring the integrity of the system. This approach minimizes the risk of persistent attacks, since the core OS cannot be tampered with at runtime.

On top of this, you can enable Shielded VM features (Secure Boot, vTPM, and integrity monitoring) for additional protection; the instance template example later in this post shows how.

Furthermore, COS employs a stateless configuration for directories like /etc/, which are writable but do not retain changes after a reboot. This means that every time a COS instance restarts, it starts from a clean state, reducing the chances of configuration drift and ensuring consistency across instances.

To keep deployment complexity down, anything that needs to persist should live outside the instance, for example in a Cloud Storage bucket or on a persistent disk mounted under /mnt/disks/.
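
As a hedged sketch, assuming the instance-template resource shown later in this post, persistent data could live on an additional non-boot disk that your startup script formats on first boot and mounts under /mnt/disks/ (the device name and size below are placeholders):

  # Additional blank persistent disk, attached alongside the COS boot disk.
  # The startup script remains responsible for formatting it on first boot
  # and mounting it, e.g. under /mnt/disks/data.
  disk {
    device_name  = "data"   # placeholder
    disk_type    = "pd-ssd"
    disk_size_gb = 50
    auto_delete  = false
    boot         = false
  }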

Automatic and Seamless Updates

COS automatically downloads OS updates in the background. These updates include security patches and performance improvements, which are applied upon reboot. The update mechanism is designed to be non-intrusive, allowing workloads to continue running without interruption until a reboot is scheduled.

For organizations, this means less overhead in managing updates and patches. Since the updates are provided and maintained by Google, teams can rely on timely patches for vulnerabilities without manual intervention. This is particularly advantageous in large-scale deployments where manually updating each instance would be impractical.

This behavior is controlled through the cos-update-strategy instance metadata key; setting it to update_disabled turns automatic updates off.
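
In Terraform this is just another metadata entry on the instance or template; a minimal sketch:

  # Opt out of automatic OS updates for this instance/template;
  # omit the key to keep the default (updates enabled).
  metadata = {
    "cos-update-strategy" = "update_disabled"
  }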


A Scalable Use Case with Regional Managed Instance Groups

While Kubernetes is often the go-to solution for scaling containerized applications, there are scenarios where a simpler setup might suffice. COS can be effectively used with regional managed instance groups (MIGs) to handle scaling without the complexity of Kubernetes.

Scaling with Managed Instance Groups

Managed Instance Groups allow you to deploy a group of identical instances that you can control as a single entity. By using COS as the base image for instances in a MIG, you can leverage its fast boot times and security features while managing scaling policies at the instance group level.

For example, if you have a stateless application packaged in a container that doesn’t require the orchestration features of Kubernetes, deploying it on a COS-based MIG can simplify your architecture. You can set up autoscaling policies based on CPU utilization, load-balancing serving capacity, or custom metrics, allowing your application to scale out and in based on demand.

Regional Distribution for High Availability

By deploying your MIG across multiple zones within a region, you enhance the availability of your application. COS’s quick startup times mean that new instances can be brought online rapidly in response to scaling events or in the case of zone failures.

This approach provides a balance between simplicity and scalability. You get the benefits of automated scaling and high availability without the overhead of managing a full Kubernetes cluster.

Example of a Regional MIG

resource "google_compute_region_instance_group_manager" "test_rigm" {
  name    = "test-rigm"
  region  = var.region
  project = module.project.project_id

  base_instance_name = "test-instance"
  target_size        = 3 # Consider this similar to replicas in k8s

  # Example of a rolling update policy
  update_policy {
    type                         = "PROACTIVE"
    minimal_action               = "REPLACE"
    max_surge_fixed              = 1
    max_unavailable_fixed        = 0
    replacement_method           = "SUBSTITUTE"
    instance_redistribution_type = "PROACTIVE"
  }

  version {
    instance_template = var.instance_template_version
  }

  named_port {
    # If using templates this can be even more dynamic (similar to the `health_check_id` below)
    name = var.named_port_name
    port = var.named_port_number
  }

  auto_healing_policies {
    health_check      = local.instance_templates["test_template"].health_check_id
    initial_delay_sec = 200
  }

  # If you wish to keep it predictable with zoning
  distribution_policy_zones = [
    "${var.region}-b",
    "${var.region}-c"
  ]

  lifecycle {
    ignore_changes = [
      update_policy,
      distribution_policy_zones
    ]
  }

  depends_on = [
    # ...
  ]
}
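
The MIG above pins target_size to 3; if you would rather scale on demand, a regional autoscaler can be attached to it. A minimal sketch (the thresholds are illustrative, and you would typically let the autoscaler own the group size):

resource "google_compute_region_autoscaler" "test_rigm_autoscaler" {
  name    = "test-rigm-autoscaler"
  region  = var.region
  project = module.project.project_id
  target  = google_compute_region_instance_group_manager.test_rigm.id

  autoscaling_policy {
    min_replicas    = 3
    max_replicas    = 10
    cooldown_period = 120 # seconds after boot before metrics are trusted

    cpu_utilization {
      target = 0.6 # scale out above ~60% average CPU utilization
    }
  }
}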

Startup Scripts Explained

Startup scripts are a powerful feature in GCP that allow you to run commands when an instance boots. With COS’s immutable filesystem and stateless design, startup scripts become essential for configuring instances at runtime.

Using Startup Scripts with COS

Since you cannot install software packages directly onto the COS instance due to the lack of a package manager and the immutable root filesystem, startup scripts are used to set up the necessary environment for your containers.

For instance, you might use a startup script to:

  • Pull the latest version of your container image from a registry.
  • Configure environment variables or secrets needed by your application.
  • Set up system configurations that are required at runtime.

Example of a Startup Script

Here’s a simplified example of a startup script that pulls a container image from Artifact Registry and runs it:

#!/bin/bash
# Configure Docker to authenticate against Google registries using the instance's credentials
docker-credential-gcr configure-docker --registries="gcr.io,us-west1-docker.pkg.dev,docker.europe-west3.rep.pkg.dev"
docker-credential-gcr gcr-login

# Pull and run the application container
docker pull gcr.io/your-project/your-image
docker run --rm gcr.io/your-project/your-image

This script configures Docker to authenticate with Artifact Registry and then runs your container image. By including this script in the instance metadata, you ensure that every instance in your MIG starts with the correct configuration.

Persistent Configuration with cloud-init

COS supports cloud-init, allowing you to define your startup scripts in a cloud-config format. This is particularly useful for more complex configurations or when you need to write files, define systemd services, or perform other initialization tasks.
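
On COS, a cloud-config is supplied through the user-data metadata key rather than startup-script. As a hedged sketch in the same Terraform heredoc style used below (the unit name and image are placeholders), a systemd service running your container could look like this:

locals {
  cloud_config = <<-EOF
    #cloud-config

    write_files:
    - path: /etc/systemd/system/myapp.service
      permissions: "0644"
      owner: root
      content: |
        [Unit]
        Description=Run the application container
        Wants=gcr-online.target
        After=gcr-online.target

        [Service]
        ExecStart=/usr/bin/docker run --rm --name=myapp gcr.io/your-project/your-image
        ExecStop=/usr/bin/docker stop myapp
        Restart=always

    runcmd:
    - systemctl daemon-reload
    - systemctl start myapp.service
  EOF
}

# Attach it via instance metadata, e.g.: metadata = { user-data = local.cloud_config }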

A Better Way to Define a Startup Script

locals {
  secure_startup_script = <<-EOF
    #!/bin/bash
    set -e

    # Fetch ENVIRONMENT from instance metadata
    ENVIRONMENT=$(curl -H "Metadata-Flavor: Google" \
      http://metadata.google.internal/computeMetadata/v1/instance/attributes/ENVIRONMENT)
    export ENVIRONMENT

    # Check if ENVIRONMENT is set
    if [ -z "$ENVIRONMENT" ]; then
        echo "Error: ENVIRONMENT is not set."
        exit 1
    fi

    # Create directory for application files
    mkdir -p /mnt/disks/app

    echo "Fetching secrets from Secret Manager for env $${ENVIRONMENT}"
    docker run --rm \
      -v /mnt/disks/app:/app \
      gcr.io/google.com/cloudsdktool/cloud-sdk:alpine \
      sh -c "\
        gcloud secrets versions access latest --secret='your_sec_per_$${ENVIRONMENT}' > /app/config.json && \
        gcloud secrets versions access latest --secret='another_sec_$${ENVIRONMENT}' > /app/pk \
      "
  EOF
}

Two important things to note here:

  • The script fetches the ENVIRONMENT variable from the instance metadata, which can be set when creating the instance.
  • The cloud-sdk container inherits the instance’s service account by default, allowing it to access Secret Manager (provided the service account has been granted access; see the sketch below).
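
For that second point to work, the instance’s service account must be allowed to read the secrets. A minimal sketch (the secret resource name is a placeholder; the service account matches the one used in the locals example further down):

resource "google_secret_manager_secret_iam_member" "app_secret_access" {
  project   = var.project_id
  secret_id = google_secret_manager_secret.app_config.secret_id # placeholder secret
  role      = "roles/secretmanager.secretAccessor"
  member    = "serviceAccount:${google_service_account.test_svc_sa.email}"
}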

Additional Considerations

Beyond the core features, there are several other aspects of COS that are worth understanding.

Monitoring and Logging with Node Problem Detector

COS includes the Node Problem Detector (NPD) agent, which monitors the system’s health and reports to Cloud Monitoring. NPD can help you detect issues such as kernel problems, out-of-memory events, or unhealthy system daemons. While NPD doesn’t monitor individual containers, it provides valuable insight into the underlying VM’s health, which can be critical for diagnosing issues in production environments.

Securing Containers with AppArmor

Security profiles are essential for enforcing the least privilege and preventing containers from performing unauthorized actions. COS supports AppArmor, a Linux kernel security module that restricts the capabilities of processes. You can apply default Docker AppArmor profiles or define custom profiles to tailor the security settings for your containers.

For example, you might create a custom AppArmor profile that prevents a container from accessing raw network sockets, enhancing security for sensitive applications.

Immutable Infrastructure and Deployment Strategies

COS’s design aligns well with immutable infrastructure principles. Since the root filesystem is read-only and instances start fresh on each boot, you can be confident that the environment is consistent across deployments. This reduces the “it works on my machine” problem and simplifies troubleshooting.

For deployment strategies, this means you can adopt patterns like blue-green deployments or rolling updates with greater confidence. By updating your container image and redeploying instances, you ensure that all instances are running the same code in the same environment.

Things do go south on occasion, and at such moments COS ships a toolbox utility that launches a container with common debugging tools. In my experience, though, it is still not production grade and often not that helpful.


When to Choose Container-Optimized OS

While COS offers many benefits, it’s important to assess whether it’s the right choice for your specific needs.

Ideal Use Cases

  • Avoiding Kubernetes: You want to run containers in a scalable and secure manner without introducing Kubernetes.
  • Containerized Applications: If your workloads are already containerized and you don’t require additional software installations on the host OS.
  • Security-Conscious Deployments: Environments where security is a top priority, and the immutable filesystem and automatic updates are beneficial.
  • Simplified Management: Teams that prefer a managed OS experience with minimal maintenance overhead.

Limitations to Consider

  • No Package Manager: You cannot install additional software directly on the OS, which may be a limitation if your containers depend on host-level software.
  • Limited Customization: The locked-down nature of COS means less flexibility in modifying the OS environment.
  • Not Suitable for Non-Containerized Workloads: If your applications are not containerized, COS is not the appropriate choice.

Example of a COS Instance Template

This one can easily be turned into a Terraform module.

#main.tf
resource "google_compute_instance_template" "this" {
  name         = "${var.name_prefix}-template-${var.template_version}"
  machine_type = var.machine_type
  region       = var.region
  project      = var.project_id

  tags   = var.tags
  labels = var.labels

  service_account {
    email  = var.service_account_email
    scopes = var.service_account_scopes
  }

  network_interface {
    network    = var.network_self_link
    subnetwork = var.subnetwork_self_link

    access_config {
      # WARNING: an empty access_config {} block assigns an ephemeral external IP
      # on GCP's premium network tier, which can and will affect your bill.
      # Remove the block entirely if instances only need internal (VPC) connectivity.
    }
  }

  metadata = var.metadata

  metadata_startup_script = var.startup_script

  disk {
    source_image = var.source_image
    auto_delete  = true
    boot         = true
  }

  shielded_instance_config {
    enable_secure_boot          = var.enable_secure_boot
    enable_vtpm                 = var.enable_vtpm
    enable_integrity_monitoring = var.enable_integrity_monitoring
  }

  lifecycle {
    create_before_destroy = true
  }
}

⚠️ Do not forget to review access_config {} and decide explicitly whether your instances should receive a public IP.
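
To wire this template into the MIG shown earlier, the module should also export the template’s identifier. A minimal, hypothetical outputs.tf:

#outputs.tf
output "id" {
  description = "Identifier of the instance template (usable as the MIG version's instance_template)."
  value       = google_compute_instance_template.this.id
}

output "self_link" {
  description = "Self link of the instance template."
  value       = google_compute_instance_template.this.self_link
}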

#variables.tf
variable "name_prefix" {
  description = "Prefix for the instance template name."
  type        = string
}

variable "template_version" {
  description = "Version identifier for the instance template."
  type        = string
}

variable "machine_type" {
  description = "Machine type for the instance."
  type        = string
  default     = "e2-micro"
}

variable "region" {
  description = "GCP region."
  type        = string
}

variable "project_id" {
  description = "GCP project ID."
  type        = string
}

variable "tags" {
  description = "Network tags for the instance."
  type        = list(string)
  default     = []
}

variable "labels" {
  description = "Labels for the instance."
  type        = map(string)
  default     = {}
}

variable "service_account_email" {
  description = "Service account email."
  type        = string
}

variable "service_account_scopes" {
  description = "Scopes for the service account."
  type        = list(string)
  default     = ["https://www.googleapis.com/auth/cloud-platform"]
}

variable "network_self_link" {
  description = "Self link of the VPC network."
  type        = string
}

variable "subnetwork_self_link" {
  description = "Self link of the subnetwork."
  type        = string
}

variable "metadata" {
  description = "Metadata key-value pairs."
  type        = map(string)
  default     = {}
}

variable "startup_script" {
  description = "Startup script for the instance."
  type        = string
}

variable "source_image" {
  description = "Source image for the boot disk."
  type        = string
  default     = "cos-cloud/cos-stable"
}

variable "enable_secure_boot" {
  description = "Enable Secure Boot."
  type        = bool
  default     = true
}

variable "enable_vtpm" {
  description = "Enable vTPM."
  type        = bool
  default     = true
}

variable "enable_integrity_monitoring" {
  description = "Enable Integrity Monitoring."
  type        = bool
  default     = true
}

If you structure it this way, it becomes trivial to change startup scripts or metadata per instance:

# Excerpt of a simple approach leveraging locals
locals {
  instance_templates = {
    test_template = {
      name_prefix           = "test"
      template_version      = var.template_version
      tags                  = ["test-template"]
      labels                = { "service" = "test-svc", "environment" = var.env_name }
      service_account_email = google_service_account.test_svc_sa.email
      metadata = {
        "ENVIRONMENT"                  = var.env_name
        "google-logging-enabled"       = "true"
        "google-logging-use-fluentbit" = "true"
        "enable-oslogin"               = "true" # Required if you wish to leverage IAP!
      }
      startup_script    = local.secure_startup_script
      target_size       = 3
      named_port_name   = "secure-named-port"
      named_port_number = 1337
      health_check_id   = google_compute_health_check.test_instance_health_check_igm.id
    }
  }
}
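
A hypothetical wiring of these locals to the template shown earlier, assuming main.tf and variables.tf live in a module at ./modules/cos-instance-template and that the root module defines matching region, project, and network variables:

module "cos_instance_templates" {
  source   = "./modules/cos-instance-template" # hypothetical module path
  for_each = local.instance_templates

  name_prefix           = each.value.name_prefix
  template_version      = each.value.template_version
  region                = var.region
  project_id            = var.project_id
  tags                  = each.value.tags
  labels                = each.value.labels
  service_account_email = each.value.service_account_email
  network_self_link     = var.network_self_link
  subnetwork_self_link  = var.subnetwork_self_link
  metadata              = each.value.metadata
  startup_script        = each.value.startup_script
}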

Conclusion

Container-Optimized OS presents a robust, secure, and efficient platform for running containerized applications on Google Cloud. Its design principles of immutability, minimalism, and tight integration with GCP services make it a strong candidate for teams looking to simplify their infrastructure and focus on delivering value through their applications.

As with any technology choice, it’s crucial to evaluate how COS aligns with your application’s requirements and your team’s expertise. For many, it provides a pragmatic solution that balances security, performance, and simplicity—key considerations in today’s fast-paced engineering landscape (rarely do I meet a team that’s not on some Rapid Development cycle).