On Jan 29, 2022

IAM a Developer - What Do I Really Need to Know About AWS IAM?

Ivan Dwyer, Founder at IAM Pulse

As a software developer, you're constantly in demand... and constantly being asked to learn new things. AWS IAM is a topic that keeps coming up – but how much should you really learn about it?

The following article is part of a series explaining AWS IAM through the lens of specific job functions. IAM (Identity & Access Management) is the underlying permissions system that dictates who can do what under which conditions.

Welcome to the IAM Party

As every company becomes a software company in some form or fashion, developers are the lifeblood of the business. Demand for developers has never been higher, and will continue to grow at a rapid pace for the foreseeable future. This puts you in an excellent career position with a lot of bargaining chips at your disposal. Nice work!

You're a tinkerer, and enjoy digging into learning new technologies, but I'm guessing that IAM isn't a 3-letter acronym that gets you that excited. Ever since your company went all in on AWS, though, it's a term that keeps coming up – seemingly at the most inconvenient times. You know you need to learn more about it, but how far should you really take it?

Why IAM Matters To You

Look, you're a responsible developer who cares about security – you wouldn't be reading this article if you weren't. Whether you're a senior engineer today, or on the path to becoming one, carrying ownership of your code is crucial to your personal and professional growth. At the same time, you want to remain focused on the core aspects of your job, and not branch out too far from your primary responsibilities.

It's a delicate line to walk, but one way to remain balanced is to gain a fundamental understanding of how cloud permissions work. You see, IAM isn't just another service in the AWS catalog, it's the thing that ties it all together – all of the people, data, apps, and services. Learning the internals of computers helped you get here, learning the internals of the cloud will help you continue – that's IAM.

You also have a job to do – and a hard one, nonetheless. You don't want to get bogged down in back and forth discussions around what's allowed and what's not. When you're constantly on the hook to deliver software, the last thing you want to do is argue over permissions. Gaining a fundamental understanding of how IAM works helps you navigate these conversations – your requests will be better received, and you'll be able to better respond to feedback.

What You Really Need to Know About AWS IAM

There's a wealth of documentation for AWS IAM out there that explains the entities such as users, groups, and roles, structural elements such as organizations and accounts, attributes such as tags, and the policy specifications themselves. I'm going to assume a basic understanding of the terminology and core concepts here. Let's dig into the realness with a Top 5 list for the ages.

1. Knowing What "Right Sized" Means

There's a good chance you've heard the term Least Privilege Access, or the Principle of Least Privilege (POLP). This is a fancy way of saying that users should only be granted the bare minimum permissions to be able to perform the task at hand. While a good security outcome to strive for, the term as defined leads to a natural inclination to lock everything down.

Much of the tension that occurs between dev, ops, and security teams stems from a differing opinion as to what should be allowed versus not allowed. While you understand why least privilege is important, you shouldn't be expected to know the details of every possible action, but you should at least be able to describe what data needs to be accessible by which services. With this knowledge, a more practical framing of permissions that balances the speed of the business with the safety of security is "Right Sized".

Think of the wide range of cloud environments, workloads, and data across your organization – some efforts are meant to be more exploratory in the name of R&D, while others are required to fall under strict control in the name of compliance. Different classifications should mean different levels of access, not necessarily least privilege to the purest letter of the law. A super power for developers is being able to speak "Right Sized" with your security counterparts by correctly identifying data classifications, consistently tagging resources according to the nature of the workload, and making requests within the context of the target environment. Saying things like the following will work wonders...

"I know that this S3 bucket contains financial data subject to PCI, but this Lambda function needs to be able to read the contents. Its resource policy only needs to allow Read permissions for this specific Lambda function under the specific condition that it's executed on behalf of a direct customer request."

"I was reviewing the Terraform repo for that new internal dashboard tool we built for the Marketing team, and noticed there were no resource tags on any of the EC2 instances being deployed. I wanted to make sure those weren't confused for production instances, so I added an 'Application=Campaign Dashboard' tag to the repo and submitted a PR for your review."

"We're testing a new app and could use another dedicated AWS account. We're not working with any sensitive info, only mock data. Because we're not yet entirely sure what actions the app will need to perform in a real world setting, I'd like to request wider permissions for the team, but only within the boundaries of this specific account."

As you get more comfortable with "Right Sizing", you'll gain a greater vote of confidence from your security team that, yes, you do in fact know what it is you're doing!
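To make the first request above a little more concrete, here is a minimal Python sketch of what a narrowly scoped bucket policy for it might look like. The bucket name, account ID, and role name are hypothetical placeholders, and the "on behalf of a direct customer request" condition is left out because how to express it depends on your setup.

```python
import json

# Hypothetical names, for illustration only.
BUCKET = "finance-reports"
LAMBDA_ROLE_ARN = "arn:aws:iam::111122223333:role/report-reader-lambda"


def read_only_bucket_policy(bucket: str, role_arn: str) -> dict:
    """Build a bucket policy that lets a single role read objects and nothing else."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowLambdaRead",
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            }
        ],
    }


policy = read_only_bucket_policy(BUCKET, LAMBDA_ROLE_ARN)
print(json.dumps(policy, indent=2))
```

The point is less the exact JSON and more the shape of the ask: one principal, one action, one resource.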

Further Reading: IAM Access Analyzer makes it easier to implement least privilege permissions

2. Navigating Boundaries

Every permission system is built with guard rails that prevent people from going out of bounds. The challenge with IAM lies in the multi-dimensional nature of AWS, and the various types of overlapping boundaries that can exist. There's not a single boundary that can represent the totality of an environment, so it's easy to get yourself tied into knots trying to reason with all the possible dimensions and how permissions could overlap. In keeping with the spirit of this post, here's a simple and logical grouping of boundaries and how they're applied.

Account Boundaries

An AWS account acts as a logical boundary in itself, where IAM users are constrained to accessing resources within the same account. In a one-dimensional world, you could stop here, but alas it's never that simple. Companies very quickly outgrow a single AWS account. This leads to scenarios where users can access many accounts, services in one account need to access resources in another account, or 3rd parties need access granted to resources within an account.

For developers, account boundaries come up when you're frequently context switching, or you need to find a way to cross boundaries. For example:

When using the CLI or interacting with the API, make sure the principal you're acting as is in the right account for the context of the task at hand. It's easy to get tripped up here if you're not the principal you think you are.

When you need to cross accounts to gain access to a service and/or resource in another account, you'll need to establish a trust relationship. This will likely be configured in conjunction with your security counterparts, so be sure to clarify what you need to do.
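As a rough illustration of what that trust relationship looks like, the following Python sketch builds the trust policy for a role that principals in another account are allowed to assume. The account ID is a made-up placeholder, and your security team may well add conditions on top of this.

```python
import json


def cross_account_trust_policy(trusted_account_id: str) -> dict:
    """Trust policy for a role that principals in another account may assume."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Naming the account root delegates the decision to that
                # account's own identity-based policies.
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# "444455556666" is a hypothetical account ID.
trust = cross_account_trust_policy("444455556666")
print(json.dumps(trust, indent=2))
```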

Organizational Boundaries

In multi-account scenarios, it's a best practice to leverage AWS Organizations to consolidate and manage all of your account configurations. Administrators can set boundaries at the organizational level that cascade down to all member accounts. These are called Service Control Policies (SCPs), and their function is to dictate the maximum permissions allowed by any principal in any account within that Organization. SCPs do not grant access in themselves; they are evaluated together with IAM policies during a request, and an action has to be allowed by both to go through.

For developers, organizational boundaries often come up when you find yourself getting "access denied" errors, but you can't figure out why. This is because SCPs aren't readily viewable at the account or identity level. For example:

You try to spin up a resource in a region that the Organization admin has restricted for every member account. You might dig through all of your applicable identity policies that tell you that you should be good to go, only to learn that it was an SCP that nobody told you about. (Note that AWS has improved error messaging for these scenarios.)

You try to spin up an EC2 instance and select an instance type that isn't in the list of approved types specified in an SCP. You could have full EC2 permissions, but still get blocked, ruining your hopes of mining crypto on the company dollar (please don't try that).
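For a sense of what might be sitting behind those two errors, here is a hedged Python sketch of an SCP with one statement per scenario. The approved regions and instance types are invented for illustration; the condition keys aws:RequestedRegion and ec2:InstanceType are the ones commonly used for this kind of guard rail.

```python
import json

# Hypothetical allow-lists an Organization admin might maintain.
ALLOWED_REGIONS = ["us-east-1", "us-west-2"]
ALLOWED_INSTANCE_TYPES = ["t3.micro", "t3.small"]

scp = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyOutsideApprovedRegions",
            "Effect": "Deny",
            "Action": "ec2:*",
            "Resource": "*",
            "Condition": {
                "StringNotEquals": {"aws:RequestedRegion": ALLOWED_REGIONS}
            },
        },
        {
            "Sid": "DenyUnapprovedInstanceTypes",
            "Effect": "Deny",
            "Action": "ec2:RunInstances",
            "Resource": "arn:aws:ec2:*:*:instance/*",
            "Condition": {
                "StringNotEquals": {"ec2:InstanceType": ALLOWED_INSTANCE_TYPES}
            },
        },
    ],
}

print(json.dumps(scp, indent=2))
```

Neither statement shows up when you inspect your own user or role, which is why these denials can feel like they come out of nowhere.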

Identity Boundaries

Permissions Boundaries are very similar to SCPs, but in this case the maximum permissions allowed are attributed to a specific identity instead of cascading down to all member accounts. Again, permissions boundaries do not grant access in themselves; they are used as guard rails around IAM policies. Why would one use a Permissions Boundary over an SCP? It's an individual over the organization consideration, and one that frankly might ruffle some feathers.

But looking beyond the surface, these boundaries are a good thing if done well. They are best used to prevent unwanted privilege escalation paths that you as a developer wouldn't want to accidentally stumble into anyway. For example:

The many thousands of possible actions can generally be classified as List, Read, Write, and Permissions Management. As a developer, List, Read, and Write actions are all but full access for your purposes. You won't need any Permissions Management actions, and should be okay if there's a boundary around you preventing them.

Similarly, you as a developer won't need to interface with the IAM service itself; that's better delegated to security & operations administrators. Having a permissions boundary around the IAM service won't impact your day to day in any way, and it prevents bad things from accidentally happening.

Of the 3 flavors of IAM boundaries specified here, permissions boundaries are the most personal, yet most welcome boundary. Account and Organizational boundaries can make your job harder; Identity boundaries (if done well) should not.
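One way to picture a permissions boundary is as an intersection: an action only goes through if both the identity policy and the boundary allow it. The Python sketch below is a deliberate simplification (it uses fnmatch as a stand-in for IAM's wildcard matching and ignores resources, conditions, and explicit denies), and the policy contents are hypothetical.

```python
from fnmatch import fnmatchcase


def allows(action: str, patterns: list) -> bool:
    """True if any action pattern matches; IAM action names are case-insensitive."""
    return any(fnmatchcase(action.lower(), p.lower()) for p in patterns)


def effective(action: str, identity_allows: list, boundary_allows: list) -> bool:
    """An action is effective only when identity policy AND boundary both allow it."""
    return allows(action, identity_allows) and allows(action, boundary_allows)


# Hypothetical developer setup: a generous identity policy, fenced in by a
# boundary that leaves out the IAM service entirely.
identity_policy = ["s3:*", "iam:*"]
boundary = ["s3:*", "dynamodb:*", "lambda:*"]

for action in ["s3:PutObject", "iam:CreateUser", "dynamodb:GetItem"]:
    print(action, effective(action, identity_policy, boundary))
```

Note that the boundary listing dynamodb:* does not grant anything on its own: the identity policy never allowed DynamoDB, so the request still fails.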

Note: There are also non-IAM-related boundaries such as VPCs, but we'll stay focused for the purposes of this article.

Further Reading: Policies and permissions in IAM

3. Taking a Walk in the PARC

AWS IAM Policy documents remind me of the tagline for the classic board game Othello – A Minute To Learn, A Lifetime to Master. One of the reasons why I believe IAM is so challenging is that you can very quickly read the specification and think to yourself, "I got this", only to get schooled by an "Access Denied" error. The simplicity of IAM lies in the following statement definitions, known as the PARC Model:

  • Principal: who does this statement apply to?
  • Action: what are the actions being allowed or denied?
  • Resource: what specific Amazon resources are affected?
  • Condition: which things can make this statement evaluate to true or false?

Thinking back to the original purpose of a permissions system, this model aligns well to the phrase, who can do what under which conditions. When you look at a single policy through this lens, IAM feels simple… almost too simple. Why is this all so hard then? When you think about all the things we've covered so far, the true challenge is that it's nearly impossible to form a mental model that covers the totality of the IAM specification – there's tens of thousands of potential actions across hundreds of services, resources are ephemeral in nature across environments, there's many possible policies & boundaries to account for, and there's a whole bunch of context & conditions that needs to be factored in. That's hard to do by looking at a JSON document alone.
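If it helps to see PARC as a data structure, here is a small Python sketch that models a single statement and renders it in the shape of a policy document. The Statement class is purely illustrative and not part of any AWS SDK.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Statement:
    """One policy statement expressed through the PARC lens (illustrative only)."""
    effect: str                       # "Allow" or "Deny"
    action: List[str]                 # A: what is being allowed or denied
    resource: List[str]               # R: which resources are affected
    principal: Optional[dict] = None  # P: only present in resource-based policies
    condition: Optional[dict] = None  # C: optional context requirements

    def to_dict(self) -> dict:
        doc = {"Effect": self.effect}
        if self.principal is not None:
            doc["Principal"] = self.principal
        doc["Action"] = self.action
        doc["Resource"] = self.resource
        if self.condition is not None:
            doc["Condition"] = self.condition
        return doc


stmt = Statement(
    effect="Allow",
    action=["s3:PutObject"],
    resource=["arn:aws:s3:::mybucket/*"],
)
print(json.dumps({"Version": "2012-10-17", "Statement": [stmt.to_dict()]}, indent=2))
```

In an identity-based policy the principal is implied by whatever the policy is attached to, which is why it is optional here.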

As a developer, you primarily want to know what your workload needs to be able to do in the target environment. And let's be honest... as a developer, you probably won't be the one setting boundaries or writing deny statements. Thankfully, we can use the same PARC model to reason with outcomes by orienting ourselves to the context of the job-to-be-done.

Say you want to deploy a new Lambda function and you get tasked with authoring the corresponding policy:

  • Principal: I'm orienting myself in the context of this Lambda function. When I spin it up, I'll attach this Role.
  • Action: does this function need to perform any AWS API calls? If so, are there any dependencies that I need to consider as well?
  • Resource: does this function have any data sources or destinations to connect to?
  • Condition: does it matter how this function gets triggered - when or from where?

With Lambda, the policy gets attached to the execution role that the function runs with. If your function just needs to be able to put objects into a specific S3 bucket, you could write an identity-based policy as follows.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowPutS3",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::mybucket/*"
      ]
    }
  ]
}
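A detail worth checking in a policy like this is where the wildcard sits in the Resource ARN. The sketch below uses Python's fnmatch as a rough stand-in for IAM's wildcard matching (an approximation, not the real evaluator) to compare two patterns; the mybucket-archive bucket is hypothetical.

```python
from fnmatch import fnmatchcase

# Two candidate Resource patterns for "objects in mybucket".
SCOPED = "arn:aws:s3:::mybucket/*"   # objects inside mybucket only
LOOSE = "arn:aws:s3:::mybucket*"     # anything whose ARN starts with "mybucket"

own_object = "arn:aws:s3:::mybucket/reports/2022.csv"
other_object = "arn:aws:s3:::mybucket-archive/secret.csv"  # a different bucket

for pattern in (SCOPED, LOOSE):
    print(pattern)
    print("  matches own object:  ", fnmatchcase(own_object, pattern))
    print("  matches other bucket:", fnmatchcase(other_object, pattern))
```

The looser pattern also matches objects in any other bucket whose name happens to begin with the same prefix, which is rarely what you want.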

If you change the orientation, you change the line of questioning, but in the same format. Say you want to spin up a new S3 bucket and get tasked with writing the corresponding resource-based policy:

  • Principal: who needs to be able to access this bucket?
  • Action: what things do users need to be able to do with this bucket and to their objects?
  • Resource: which buckets should this policy apply to, and which contents?
  • Condition: does it matter how this bucket gets accessed - when or from where?

With a service like S3, you have to factor in the bucket itself as well as the contents. An example bucket policy that allows full access to select IAM users, only when coming from a specific IP range could be as follows:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowFullS3",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::111122223333:user/ivan.dwyer",
          "arn:aws:iam::111122223333:user/fortyfivan"
        ]
      },
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::mybucket",
        "arn:aws:s3:::mybucket/*"
      ]
    },
    {
      "Sid": "LimitIPAllow",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::mybucket",
        "arn:aws:s3:::mybucket/*"
      ],
      "Condition": {
        "NotIpAddress": {
          "aws:SourceIp": "54.240.143.0/24"
        }
      }
    }
  ]
}
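The Deny statement with NotIpAddress can read backwards at first glance, so here is a small Python sketch of the check it describes, using only the standard library. It mirrors the CIDR range from the policy above; the sample caller addresses are made up.

```python
import ipaddress

ALLOWED_RANGE = ipaddress.ip_network("54.240.143.0/24")


def deny_applies(source_ip: str) -> bool:
    """NotIpAddress is true when the caller is OUTSIDE the range, so the Deny kicks in."""
    return ipaddress.ip_address(source_ip) not in ALLOWED_RANGE


# Sample caller addresses, for illustration only.
for ip in ["54.240.143.12", "203.0.113.7"]:
    print(ip, "denied" if deny_applies(ip) else "not denied by this statement")
```

A caller inside the range is not automatically allowed; they simply escape this Deny and still need the Allow statement to match.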

These are both simple examples, but this article is meant to be a simple lesson. There's an infinite number of possible IAM Policy document combinations, and there's a wealth of information to become a "grandmaster" should you so desire.

4. Making Sense of the Policy Evaluation Logic

Once you gain a fundamental understanding of individual policy documents, you'll next need to know how many policies come together to form an access decision. Like every good computer, AWS IAM is a deterministic system. There's only two possible outcomes for every request – allow or deny. What goes into that decision is a multi-dimensional array of context and combinations that can quickly turn your brain into a giant bowl of soup. For reference, here's a diagram from AWS that explains the logic – great for deterministic systems, not so great for human understanding.

As a developer, you primarily care about the right things being allowed for your workloads to function, so you'll encounter this evaluation logic any time you face an "Access Denied" error - either in the console, CLI, or via API. While AWS has recently improved their inline messaging when returning an error as mentioned earlier, you're still left with figuring out the root cause. Rather than go through this sequence like a game of permissions whack-a-mole, I prefer to break it down into a sequential series of questions I can better wrap my head around:

Are there any allows? All decisions start with a default deny, so for an action to be allowed, it must be explicitly stated somewhere. If there's no policy attached to the principal explicitly granting access, you can stop your sleuthing right in its tracks. If there is an explicit allow and the result is deny, you're left to figure out where that is coming into play.

What are the principal's boundaries? First, you need to know who is making the request – user, role, or service. From here, I find it most useful to walk backwards from the session to the user to the account to the organization. Along with all of the applicable policies, are there any boundaries that impact the final decision? Put your newfound knowledge of boundaries to the test!

Did any conditions trigger? As we've covered, policies can contain conditional statements, which rely on context to evaluate. Additionally, target resources can have their own resource-based policies attached – an example could be an S3 bucket policy. Now this might be controversial, but I'm grouping resource-based policies and policy conditions together in the same framework here. Why? As a developer, those are related pieces of context that lead to the same outcome, despite being different pieces of the IAM specification that get evaluated independently. We're completing the permissions system sentence with, under which conditions.

This is an area where you'll find your own personal preference as collecting all applicable information and context can be tedious, and often requires a lot of traversal and cross-reference. My general guidance is to find a pattern that makes sense to you, and stick with it. It doesn't necessarily have to follow the exact same logic pattern as the core service itself – it's more important that you and your team understands it, and hopefully follows the same patterns.
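As one possible pattern, the questions above can be sketched as a short Python function. This is a heavily simplified mental model for a single-account request, not a reimplementation of AWS's evaluation logic: it ignores resource-based policies and session policies, and assumes you have already worked out each layer's answer by hand.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Layers:
    """Your findings for one request, one flag per layer (illustrative model)."""
    explicit_deny: bool              # any applicable policy has a matching Deny
    scp_allows: bool                 # the Organization's SCPs allow the action
    boundary_allows: Optional[bool]  # None when no permissions boundary is attached
    identity_allows: bool            # an identity-based policy has a matching Allow


def evaluate(layers: Layers) -> str:
    if layers.explicit_deny:
        return "Deny (explicit deny)"
    if not layers.scp_allows:
        return "Deny (blocked by SCP)"
    if layers.boundary_allows is False:
        return "Deny (outside permissions boundary)"
    if not layers.identity_allows:
        return "Deny (default deny, no allow found)"
    return "Allow"


print(evaluate(Layers(False, True, None, True)))
print(evaluate(Layers(False, False, None, True)))
print(evaluate(Layers(False, True, True, False)))
```

Returning the reason alongside the decision is the useful part when debugging: it tells you which team to talk to next.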

Further Reading: Policy evaluation logic - AWS Identity and Access Management

5. Security Token Service

Last, but not least (privilege), let's go back... way back to the beginning. What's the first thing that comes to mind when you think of IAM? Identity, of course. But the irony of AWS IAM is that identity isn't necessarily who you are in the traditional sense. An identity in AWS IAM is an entity that performs actions. As we've learned so far, an identity could be a human user like yourself, a service account like Terraform, or a given role like Billing Admin.

The AWS IAM service performs two key functions independently – authentication and authorization. These functions are triggered on every single API request, which as we learned from Werner Vogels during his most recent re:Invent keynote happens half a billion times per second. Say it with me… per second. To be properly authenticated, principals need credentials – traditionally, a user has an access key ID and secret access key they use to authenticate themselves. But as long-lived credentials that can easily be lost, stolen, or misused, they are incredibly dangerous. Thankfully, there's a new pattern that mitigates the risk of credential theft.

Security Token Service (STS) is the service that issues short-lived session credentials for assumed roles to perform select actions. It's becoming more commonplace (and highly recommended) to use Roles over Users or Groups because there's no long-lived credentials involved. Users assume a role through an established trust relationship, or services (e.g. EC2 & Lambda) have service roles attached to them. There's two key actions that every developer should be aware of.

sts:AssumeRole

It's unlikely that you'd be the one who has to set up the trust relationship between your user account and a target role, but you will have to know what roles are available to you, and how to assume them. If your organization is using AWS SSO, then you have a nice and clean way to authenticate yourself and assume a role that is available to you. It's commonplace for AWS SSO to be integrated with your corporate identity provider, so if you're using Okta to log in to everything else, for example, it'll be a familiar end user experience.

If not, you may want to politely recommend it to your counterparts on the security and operations teams. It'll save a lot of headaches all around. In the interim, the action to get familiar with is sts:AssumeRole, which is how an authenticated user gets a short-lived session token. To see which roles exist in the account, you can get a list via the AWS CLI. Note that this requires the iam:ListRoles permission, and a role is only available to you if its trust policy allows you to assume it:

$ aws iam list-roles

Then to assume a target role, run the following CLI command, substituting the account, role, and session values:

$ aws sts assume-role --role-arn "arn:aws:iam::111122223333:role/role-name" --role-session-name CLI-Session
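The command returns a JSON document containing temporary credentials. As a sketch of what to do with it, the Python below maps the Credentials block of such a response onto the environment variables that the AWS CLI and SDKs read. The response here is a hand-written sample with fake values, not real output.

```python
def credentials_to_env(response: dict) -> dict:
    """Map an assume-role response onto the standard AWS credential env vars."""
    creds = response["Credentials"]
    return {
        "AWS_ACCESS_KEY_ID": creds["AccessKeyId"],
        "AWS_SECRET_ACCESS_KEY": creds["SecretAccessKey"],
        "AWS_SESSION_TOKEN": creds["SessionToken"],
    }


# Hand-written sample in the shape of an assume-role response (fake values).
sample_response = {
    "Credentials": {
        "AccessKeyId": "ASIAEXAMPLEKEYID",
        "SecretAccessKey": "example-secret",
        "SessionToken": "example-session-token",
        "Expiration": "2022-01-29T13:00:00Z",
    }
}

for name, value in credentials_to_env(sample_response).items():
    print(f"export {name}={value}")
```

Because the credentials expire, losing them is far less damaging than losing a long-lived access key.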

Further Reading: AssumeRole - AWS Security Token Service

sts:GetCallerIdentity

As a developer, you're context switching all the time – not just around AWS, but with everything you work with. One of the easiest mistakes when working with IAM is running things as the wrong principal and/or with wrong context. Harmless in a development mode where you get "access denied" errors, but potentially dangerous in production config scenarios. To avoid making a gaffe, get to know GetCallerIdentity.

If you ever find yourself lost while racing around the CLI or console, and you no longer know who you are or what direction is up, first take a deep breath… then run sts:GetCallerIdentity. This will return helpful information about the account and principal for the context you're in. You can use this as a gut check before performing any sensitive actions, or just to remind yourself of your surroundings.

$ aws sts get-caller-identity
{
    "UserId": "ABCDEFGHIJKLMNOPQRSTU",
    "Account": "111122223333",
    "Arn": "arn:aws:iam::111122223333:user/fortyfivan"
}
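To turn that gut check into a habit, you could wrap it in a small guard that a script runs before doing anything sensitive. The Python sketch below parses output in the shape shown above and compares the account against the one you expect; the function name and the second account ID are my own placeholders.

```python
import json


def in_expected_account(identity_json: str, expected_account: str) -> bool:
    """Return True when the caller identity belongs to the account we intend to touch."""
    identity = json.loads(identity_json)
    return identity["Account"] == expected_account


# Output in the shape returned by `aws sts get-caller-identity`.
sample_output = """
{
    "UserId": "ABCDEFGHIJKLMNOPQRSTU",
    "Account": "111122223333",
    "Arn": "arn:aws:iam::111122223333:user/fortyfivan"
}
"""

print(in_expected_account(sample_output, "111122223333"))
print(in_expected_account(sample_output, "999988887777"))
```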

Further reading: GetCallerIdentity - AWS Security Token Service

While STS is technically a separate service from IAM, they're so closely related that it's worth mentioning in the same breath. As a developer, you never want to have to worry about managing your own credentials, but you also never want to find yourself in painful privileged access management workflows every time you need to do something. A strong role configuration is the best way to keep your accounts secure while working quickly.

Showing Off Your Newly Found IAM Skills

Congrats, you made it! We've taken quite the journey. Now where to put all this newfound knowledge to the test?! We've covered a few scenarios along the way where specific knowledge helps give you an edge, but if you really want to shine 'em (I rewatched Terminator 2 recently), try these on for size:

Dig through your code to see if there are any places where an AWS managed policy is being used – these are defaults provided by AWS that are often overpermissioned. Write a custom least privilege policy for the workload and submit a pull request. Say what?! That's what your security and operations counterparts will think when they see it.

Classify and tag the resources powering your workloads if they're not already. Think about the sensitivity of the data, and what pieces of context might be important. AppName, TeamName, DataLevel, etc. are all things that your security and operations counterparts care about. Be proactive about it and get a similar response – say what?!
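A tagging check is easy to automate once the team agrees on the keys. This Python sketch reports which required tags a resource is missing; the required set reuses the example keys mentioned above, and the resource tags are invented.

```python
# Tag keys taken from the examples above; adjust to your team's convention.
REQUIRED_TAGS = {"AppName", "TeamName", "DataLevel"}


def missing_tags(resource_tags: dict) -> list:
    """Return the required tag keys that a resource does not carry, sorted for stable output."""
    return sorted(REQUIRED_TAGS - set(resource_tags))


# Hypothetical tags on an EC2 instance.
instance_tags = {"AppName": "Campaign Dashboard", "Environment": "dev"}

print(missing_tags(instance_tags))
```

Running something like this in CI against your infrastructure-as-code is one way to be proactive rather than waiting for an audit.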

Let Me Take You Higher

Hopefully this article has given you enough knowledge to deal with IAM through the course of your development work. If you like what you've seen and want to keep going, that's great. Further learning only makes you stronger.

  • AWS IAM Documentation: you can find it all in the docs (for better or for worse). Navigating the docs is a skill in itself, but you can surely find a reference doc or guide for every aspect of IAM and the surrounding AWS service catalog.
  • IAM Pulse Resources: we've scoured the Internet and back to curate a helpful collection of books & guides, open source tools, blogs & podcasts, training courses, and more.
  • IAM Pulse Articles: our members graciously contribute their own knowledge and experience with the community via tutorials, best practice guides, editorial content, and more.
  • IAM Policy Catalog: we're also assembling a catalog of reusable IAM policy documents that anyone can customize and copy for their own environment. Browse the samples, maybe you'll find what you need right away!

IAM is a complex domain, but it doesn't have to be painful or distracting – it can be fun and informative. To connect with your peers in the cloud space, join our community where we bring together professionals like yourself in a shared, positive learning experience. Sign up for free at: https://www.iampulse.com/signup
