Ryan Scott Brown

I build cloud-based systems for startups and enterprises. My background in operations gives me a unique focus on writing observable, reliable software and automating maintenance work.

I love learning and teaching about Amazon Web Services, automation tools such as Ansible, and the serverless ecosystem. I most often write code in Python, TypeScript, and Rust.

B.S. Applied Networking and Systems Administration, minor in Software Engineering from Rochester Institute of Technology.

Carpe Access: AWS IAM for People & Systems

One rogue * can bypass security controls for an S3 bucket with petabytes of data if you aren’t paying close attention. AWS has 150 services and counting, and as apps rely on more external services they need more granular RBAC. As engineers, we now have to worry about access for team members, LLM agents, and services we run. Managing the right levels of access is great for limiting the blast radius of a bug. Developers (and their agents) can’t ruin what they can’t change.

CARPE is a method that helps me write better IAM policies. It’s extra useful when an app needs a group of services like DynamoDB, S3, and Kinesis. CARPE is a backronym based on the five sections of an IAM policy statement.

Writing Policies

CARPE started out as a mnemonic to use when writing policy statements, standing for:

  • Condition
  • Action
  • Resource
  • Principal
  • Effect

Serverless means writing more IAM policies than usual, and remembering CARPE saves me from looking up the policy structure in the docs every single time.
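As a minimal sketch of the mnemonic in use, the helper below assembles one statement from its five CARPE parts. The function name carpe_statement and its argument order are hypothetical, chosen only to mirror the acronym; the dictionary keys are the standard IAM statement elements.

```python
from typing import Any, Dict, List, Optional


def carpe_statement(
    condition: Optional[Dict[str, Any]],
    action: List[str],
    resource: Optional[List[str]],
    principal: Optional[Dict[str, Any]],
    effect: str = "Allow",
) -> Dict[str, Any]:
    """Assemble one IAM policy statement from its CARPE parts (hypothetical helper)."""
    if effect not in ("Allow", "Deny"):
        raise ValueError("Effect must be 'Allow' or 'Deny'")
    statement: Dict[str, Any] = {"Effect": effect, "Action": action}
    # Parts that are implied or unused are omitted rather than set to null.
    if resource is not None:
        statement["Resource"] = resource
    if principal is not None:
        statement["Principal"] = principal
    if condition is not None:
        statement["Condition"] = condition
    return statement


# Identity-policy style statement: the principal is implied by the attachment.
read_users = carpe_statement(
    condition=None,
    action=["dynamodb:GetItem"],
    resource=["arn:aws:dynamodb:*:*:table/prod-users"],
    principal=None,
)
```

Walking the arguments in C-A-R-P-E order makes it harder to forget a part, even when the answer for that part is "implied".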

Like ogres, onions, and parfaits, IAM evaluation is layered: in a single AWS account, there are 6 layers of policies that a request filters through before activity is allowed.

  • Deny by default
  • Organization resource control policy (RCP)
  • Organization service control policy (SCP)
  • Resource policy
  • Permission boundary
  • Identity policy

Each of these policy layers may have dozens of statements that may or may not apply to you.
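To make the layering concrete, here is a deliberately simplified sketch of a same-account decision. It assumes each layer has already been reduced to one of "allow", "deny", or "none" (no matching statement), and that layers which do not exist for a request are simply left out. The layer names and the is_allowed function are hypothetical, and real IAM evaluation has more cases (cross-account access, session policies, and some resource-policy edge cases) that this ignores.

```python
from typing import Dict

# Layers that can only restrict: if present, each one must allow the request.
GUARDRAILS = ("rcp", "scp", "permission_boundary")
# Layers that can grant: at least one of them must allow the request.
GRANTS = ("resource_policy", "identity_policy")


def is_allowed(decisions: Dict[str, str]) -> bool:
    """Simplified same-account evaluation over per-layer decisions."""
    # An explicit deny in any layer always wins.
    if "deny" in decisions.values():
        return False
    # Every guardrail layer that applies must contain a matching allow.
    for layer in GUARDRAILS:
        if layer in decisions and decisions[layer] != "allow":
            return False
    # Something has to grant access, otherwise the default deny stands.
    return any(decisions.get(layer) == "allow" for layer in GRANTS)
```

For example, is_allowed({"identity_policy": "allow", "scp": "none"}) is False in this model: the identity policy grants the action, but the SCP never allows it, so the request falls through to the default deny.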

CARPE Case Study

At first, a single role per “service” seemed right to me. This might work if you have a microservice that’s small enough, but granular roles are easier to manage. With that setup, each function doesn’t have its own role, and it felt similar to how I’d delegate permissions to a regular web service. It’s also pretty low-management, since a full application might only have a couple of roles along these lines:

  • A role for frontend type functions that talk to DynamoDB or a SQL database. Typically this would be extra-restrictive since untrusted input could be coming in from the outside world.
  • A role for backend functions that handle Kinesis or DynamoDB stream events, but don’t take untrusted input. Data handled here would have been sanitized and validated by public-facing frontend functions.
  • A role for administrivia functions that handle infra tasks like scheduled jobs. These sometimes need super-privileges like managing ECS containers or kicking off tasks.

That works well enough, but there will be functions that need a special (somewhat powerful) permission. Usually it gets tacked on to the administrivia role. Unfortunately, that builds into a grossly overpowered administrivia role. Not even close to the principle of least privilege.

Now CARPE is part of how I divide permission statements between managed policies. To break up a role like the overpowered administrivia role, make one IAM role per function and use a regular naming scheme. I like {app}-{environment}-{function}, for example: myapp-prod-dbbackup. Each role can have up to 10 managed policies attached by default, plus inline policies, so permissions can be quite granular.
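A naming scheme is easiest to keep regular when one small function produces every name. The sketch below is one way to do that; role_name is a hypothetical helper, and the check reflects the IAM limits on role names (up to 64 characters drawn from letters, digits, and + = , . @ _ -).

```python
import re

# IAM role names: 1-64 characters from letters, digits, and + = , . @ _ -
_ROLE_NAME = re.compile(r"[A-Za-z0-9+=,.@_-]{1,64}")


def role_name(app: str, environment: str, function: str) -> str:
    """Build a role name following the {app}-{environment}-{function} scheme."""
    name = f"{app}-{environment}-{function}"
    if not _ROLE_NAME.fullmatch(name):
        raise ValueError(f"not a valid IAM role name: {name!r}")
    return name


backup_role = role_name("myapp", "prod", "dbbackup")
```

Failing early on an over-long or oddly punctuated name is cheaper than finding out at deploy time.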

The five parts of an individual IAM statement are a nice lens to look through when building a shared policy.

Before writing one, I describe it in CARPE terms so it covers everything about the policy. Take the ApplicationDataStoreAccess policy for example:

  • Condition: The environment has to be production, and the function has to be in the VPC for RDS access.
  • Action: Read and write to DynamoDB tables, a Kinesis stream, and an RDS database.
  • Resource: The prod-users and prod-queues DynamoDB tables and the prod-clickstream Kinesis stream. RDS is excluded because that’s handled by the security group, since RDS connections aren’t IAM.
  • Principal: For IAM roles/users the P(rincipal) in CARPE is implied by whatever the policy gets attached to.
  • Effect: Allow
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowProdDataAccess",
      "Effect": "Allow",
      "Action": [
        "dynamodb:GetItem",
        "dynamodb:PutItem",
        "dynamodb:UpdateItem",
        "dynamodb:BatchGetItem",
        "dynamodb:BatchWriteItem",
        "dynamodb:Query",
        "dynamodb:Scan",
        "kinesis:PutRecord",
        "kinesis:PutRecords",
        "kinesis:GetRecords",
        "kinesis:GetShardIterator",
        "kinesis:DescribeStream"
      ],
      "Resource": [
        "arn:aws:dynamodb:*:*:table/prod-users",
        "arn:aws:dynamodb:*:*:table/prod-queues",
        "arn:aws:kinesis:*:*:stream/prod-clickstream"
      ],
      "Condition": {
        "StringEquals": {
          "aws:ResourceTag/environment": "production"
        },
        "StringEqualsIfExists": {
          "aws:SourceVpce": "vpce-xxxxxxxx"
        }
      }
    },
    {
      "Sid": "ExplicitDenyEverythingElse",
      "Effect": "Deny",
      "NotAction": [
        "dynamodb:GetItem",
        "dynamodb:PutItem",
        "dynamodb:UpdateItem",
        "dynamodb:BatchGetItem",
        "dynamodb:BatchWriteItem",
        "dynamodb:Query",
        "dynamodb:Scan",
        "kinesis:PutRecord",
        "kinesis:PutRecords",
        "kinesis:GetRecords",
        "kinesis:GetShardIterator",
        "kinesis:DescribeStream"
      ],
      "Resource": "*"
    }
  ]
}
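The policy above lists the same twelve actions twice, once under Action and once under NotAction, and the two lists have to stay identical. One way to keep them from drifting is to generate both statements from a single list. This is only a sketch: data_access_policy is a hypothetical helper, and the Sid values are copied from the policy above.

```python
import json
from typing import Any, Dict, List, Optional


def data_access_policy(
    actions: List[str],
    resources: List[str],
    condition: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build an allow statement plus a matching deny-everything-else statement."""
    allow: Dict[str, Any] = {
        "Sid": "AllowProdDataAccess",
        "Effect": "Allow",
        "Action": list(actions),
        "Resource": list(resources),
    }
    if condition:
        allow["Condition"] = condition
    deny: Dict[str, Any] = {
        "Sid": "ExplicitDenyEverythingElse",
        "Effect": "Deny",
        # Same list as the allow statement, so the two cannot drift apart.
        "NotAction": list(actions),
        "Resource": "*",
    }
    return {"Version": "2012-10-17", "Statement": [allow, deny]}


policy = data_access_policy(
    actions=["dynamodb:GetItem", "kinesis:PutRecord"],
    resources=[
        "arn:aws:dynamodb:*:*:table/prod-users",
        "arn:aws:kinesis:*:*:stream/prod-clickstream",
    ],
)
policy_json = json.dumps(policy, indent=2)
```

One caution when reusing a policy shaped like this: an explicit deny overrides allows from every other policy on the same role, so the NotAction statement will also block actions granted by any other managed policy attached alongside it.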

For resource policies, the principal has to be stated, but now the resource is implied. A * in a policy attached to a bucket can only affect that bucket. Below, we have a bucket policy statement that grants a specific role access when it connects via a VPC endpoint; the Principal element names the role the statement affects.

    {
      "Sid": "UseVpc",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject*",
        "s3:PutObject*"
      ],
      "Principal": {
        "AWS": [
          "arn:aws:iam::111122223333:role/..."
        ]
      },
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "aws:SourceVpce": "vpce-xxxxxxxx"
        }
      }
    }

Wrapping Up

And that’s CARPE. Remember: one role per task/function, but reuse policies where it makes sense. The fewer places you need to update security rules, the more likely you are to get it right.

Design by Sam Lucidi (samlucidi.com)