Introducing the Next-Generation Postman URL Processor


The essential ingredient of any Postman API request is the Uniform Resource Locator (URL). The URL provides all the details that your browser, servers, and developers need to find the API resource they’re requesting. A URL contains eight individual parts that collectively provide a uniform way to locate any desired resource on the web; that might sound pretty straightforward, but in reality it presents some very big challenges. At Postman, we’re constantly working on tackling these challenges, and this work has culminated in our recent release of the next-generation URL processor.

But let’s back up a bit and take a closer look at those challenges, because therein lies the value of the new URL processor.

What Are The Moving Parts of a URL?

To understand the challenges involved in working with URLs, it helps to know each of the moving parts of a URL, what service providers are putting into them, and the many assumptions being made about the staggering number of URLs that exist across the web.

Here are the eight core elements of any URL, which provide the building blocks of each request that is made on the web:

  • Scheme: The protocol, typically HTTP or HTTPS
  • UserInfo: Any username or password
  • Subdomain: The prefix portion of the domain name
  • Domain: The Internet domain of the URL
  • Port: The machine port to route through
  • Path: The folder path to the resource
  • Query: The parameters defining resources
  • Fragment: A pointer to a specific piece of the document

All of these moving parts result in a complete URL that typically looks something like this:

http://sub.domain.com:80/path/to/the/resource?parameter1=value&parameter2=value#Fragment
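
To make those parts concrete, here is a quick sketch using Node's built-in WHATWG URL class (not Postman's own parser) to pull apart a similar URL. The userinfo and a non-default port are added purely for illustration; note the "?" that separates the path from the query.

// Illustrative only: decomposing a URL with the standard WHATWG URL API.
const url = new URL(
  "http://user:pass@sub.domain.com:8080/path/to/the/resource?parameter1=value&parameter2=value#Fragment"
);

console.log(url.protocol); // "http:"                              -> Scheme
console.log(url.username); // "user"                               -> UserInfo
console.log(url.password); // "pass"                               -> UserInfo
console.log(url.hostname); // "sub.domain.com"                     -> Subdomain + Domain
console.log(url.port);     // "8080"                               -> Port
console.log(url.pathname); // "/path/to/the/resource"              -> Path
console.log(url.search);   // "?parameter1=value&parameter2=value" -> Query
console.log(url.hash);     // "#Fragment"                          -> Fragment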

Browsers, Postman, and other web services and tools use these URLs to navigate the web, and it’s how web and mobile application providers make their data, content, media, and algorithms accessible. As the name suggests, URLs provide us all with a uniform way to locate valuable resources that are made available in a public or private way, using a common, agreed-upon approach. While URLs have done amazing things to standardize how we make digital resources available online, there are two main challenges that Postman (and all other service providers) faces when it comes to using URLs in the wild, which exponentially increase the number of problems we might encounter:

  • Parsing: Ensuring that Postman understands how to break down each part of any single URL
  • Encoding: Understanding the different types of encoding that are possible for each of the parts

Each part of a URL has to be properly parsed and then encoded using a different set of rules that may be interpreted in different ways across different service providers, which introduces a great deal of complexity. Historically, Postman has had just a single approach to processing an entire URL, but with our next-generation URL processor, the Postman platform is more flexible while parsing and encoding URLs. This allows us to establish a clearer boundary for each part of a URL, which then results in a cleaner encoding.
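
As a small illustration of why per-part rules matter (this is generic TypeScript using the standard encodeURIComponent function, not Postman's implementation), the same character can be perfectly safe in one part of a URL and ambiguous in another:

// "#" in a path must be percent-encoded or it will be read as the start of the
// fragment, while "=" and "&" are meaningful as delimiters inside the query.
const rawSegment = "report #1";

// Path position: component-encode the whole segment.
const pathPart = encodeURIComponent(rawSegment); // "report%20%231"

// Query position: encode the value, but leave the "&" and "=" delimiters alone.
const query = `name=${encodeURIComponent(rawSegment)}&page=2`;

console.log(`https://api.example.com/files/${pathPart}?${query}`);
// https://api.example.com/files/report%20%231?name=report%20%231&page=2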

How the Next-Generation URL Processor Works

Postman’s next-generation URL processor leverages knowledge gleaned from the way that a URL is constructed and then provides the least-ambiguous and most-forgiving URL parsing, based on learnings from the countless different URL combinations we’ve encountered over the years. Here are some of the items that make up this next-generation experience:

  • Non-regex parser: This generates meta information about the URL while parsing, keeping it broken down as a data structure (object) through to the lowest depths of our code layers.
  • Path-specific encoding: Postman can enforce the following characters as part of the path: “, <, >, #, ?, {, }, and SPACE. It can then apply precise encoding rules to just the path.
  • Query-specific encoding: Postman can enforce the following characters as part of the query: ", #, &, ', <, =, >, and SPACE. It can then apply precise encoding rules to just the query (see the sketch after this list).
  • User Info-specific encoding: Postman can enforce the following characters as part of the user info: ", <, >, #, ?, {, }, /, :, ;, =, @, [, \, ], ^, |, and SPACE. It can then apply precise encoding rules to just the user info.
  • Encoding unsafe characters: When unsafe characters are included as part of the URL, Postman will now identify and encode them depending on which part of the URL they appear in.
  • Trailing white spaces: Postman will not remove trailing white spaces, allowing more flexibility when it comes to using white spaces as a true troubleshooting and debugging solution.
  • Postman variables: This enables awareness of Postman variables ({{this}}) and addresses potential parsing conflicts, while also allowing the usage of nested variable names in URLs.
  • IPv6 support: We have begun laying the foundation for wider support of IPv6 as part of the overall URL troubleshooting and debugging.
  • IP shorthand: This gives you more flexibility when it comes to allowing IP addresses to be used as part of the URL structure, expanding how the addressing for APIs can work.
  • Double slashes: You now have support for \\ in the path for Windows compatibility, ensuring seamless functionality across different platforms, and tools and services that operate on them.
  • Newlines and whitespace: You’ll get better handling of newlines and whitespace characters, sticking with Postman’s philosophy of making sure we help developers troubleshoot APIs. See the full blog post about this topic.
  • Configurability: With the new approach to parsing URLs, we can now make specific parsing and encoding patterns and algorithms an on/off switch within Postman settings.
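
To make the query-specific rule above concrete, here is a rough sketch (not Postman's actual implementation) of what encoding only the query's unsafe characters could look like, using the character list from the query bullet:

const QUERY_UNSAFE = new Set(['"', "#", "&", "'", "<", "=", ">", " "]);

function encodeQueryValue(value: string): string {
  // Escape only the characters that are unsafe in the query; pass the rest through.
  return [...value]
    .map((ch) =>
      QUERY_UNSAFE.has(ch)
        ? "%" + ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, "0")
        : ch
    )
    .join("");
}

console.log(encodeQueryValue('a=b & "c"')); // a%3Db%20%26%20%22c%22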

Postman’s new approach to parsing and encoding URLs has taken into consideration the widest possible number of scenarios we’ve encountered over the years, while also making sure we are still aligned with the latest web standards. Because this is a significant update, we can’t be certain of the impact the changes will have on legacy applications and integrations, which is why we’ve made the URL parser an optional switch you can turn on or off within your Postman settings. More usage, feedback, and work will be required before we fully understand what additional configuration settings will be needed to further fine-tune how and what the new processor does when working with URLs out in the wild. That said, the foundation has definitely been laid for a more flexible and powerful approach to working with URLs, which are the core element of every API request being made within Postman.

Conclusion

This next-generation URL processor is designed to address a whole spectrum of challenges API developers face. It will provide more flexibility regarding the types of URLs used, while staying true to Postman’s commitment to providing developers with superior visibility into what happens behind the scenes with each API request for debugging and troubleshooting purposes.

The best way to see the new URL processor in action is to turn on the feature in your Postman app settings. You can also browse the multiple GitHub issues that were resolved with this release of the URL processor, so you can see how Postman community feedback drove this (and every) improvement in Postman.

And finally, are you making your voice heard? If you have a bug or an idea, we’d always like to hear your feedback on our GitHub issue tracker.

GitHub issues resolved with this release of the Postman URL processor:

  • Variables with reserved characters are not resolved – #5779, #7316
  • Nested variables are not resolved – #7965
  • Line breaks are not handled properly – #4267
  • Trailing whitespaces gets trimmed – #6097
  • Encoding issues in path – #3469 #7432 #6126 #7155 #7300 #7123
  • Encoding issues in query params – #4555 #4171 #5752

The post Introducing the Next-Generation Postman URL Processor appeared first on Postman Blog.


Recap of Amazon RDS and Aurora features launched in 2019


Amazon Relational Database Service (Amazon RDS) makes it easy to set up, operate, and scale a relational database in the cloud. It provides cost-efficient and resizable capacity. At the same time, it automates time-consuming administration tasks such as hardware provisioning, database setup, patching, and backups. It frees you to focus on your applications so you can give them the fast performance, high availability, security, and compatibility they need.

Moving self-managed databases to managed database services is the new norm, and we’re continuing to add features to Amazon RDS at a rapid pace. 2019 was a busy year, so let’s do a recap of the features launched across the different database engines.

Amazon RDS first launched back in October of 2009, over 10 years ago! We started with Amazon RDS for MySQL; since then we’ve reached a total of seven database engine options: Amazon Aurora MySQL, Amazon Aurora PostgreSQL, PostgreSQL, MySQL, MariaDB, Oracle Database, and Microsoft SQL Server.

In 2019, we launched over 100 features across all Amazon RDS and Aurora database engines. For a quick reference, visit the 2018 recap, and the 2017 recap. We’ll start by covering each database engine and the key releases that we think will have the biggest impact on your database strategy and operations. We then list all of the features that we launched in 2019, categorized for convenience:

  • New instance types, Regions, and versions – Providing you with a variety of database deployment options
  • Manageability – Simplifying database management and providing expert recommendations
  • Developer productivity – Enabling builders to focus on tasks meaningful to the business
  • Performance – Improving database performance and scale to meet the application’s needs
  • Availability/Disaster Recovery – Deploying highly available databases across Availability Zones and AWS Regions
  • Security – Facilitating secure database operation

Key feature launches of 2019

Amazon Aurora MySQL

Aurora Global Database, originally launched at re:Invent 2018, expands your database into multiple Regions for disaster recovery and faster global reads. In 2019, this feature gained support for up to five secondary Regions, MySQL 5.7, and in-place upgrades from single-region databases.

The other two key features that we launched for Aurora MySQL in 2019 were Aurora Multi-Master and Aurora Machine Learning. Aurora Multi-Master increases availability by enabling you to create multiple read/write instances of your Aurora database across multiple Availability Zones. Aurora Machine Learning lets you add machine learning (ML) based predictions to your applications directly in your SQL queries. Read more on the AWS News Blog.

Amazon Aurora PostgreSQL

For Aurora PostgreSQL, we announced support for Serverless, where the database automatically starts up, shuts down, and scales capacity up or down based on your application’s needs. Aurora Serverless benefits use cases such as infrequently used applications, new applications, variable workloads, unpredictable workloads, development and test databases, and multi-tenant applications. We also launched support for Logical Replication using PostgreSQL replication slots, enabling you to use the same logical replication tools that you use with RDS for PostgreSQL. We launched support for Database Activity Streams to provide detailed auditing information in an encrypted JSON stream, Cluster Cache Management to resume the previous database performance after a failover, S3 import to make it easy and fast to load data from CSV files (S3 export was added in early 2020), export of logs to CloudWatch to make it easy to monitor PostgreSQL logs, support for PostgreSQL version 11 to give you access to the latest features from the PostgreSQL community, and FedRAMP HIGH compliance.

Amazon RDS for Oracle

We improved availability and disaster recovery by launching In-Region and Cross-Region Read Replicas using Oracle Active Data Guard. With read replicas, you can easily create up to five fully managed Oracle Active Data Guard standby databases that can be used for read scaling and offloading of long-running analytical queries. You can create read replicas in the same Region or a different Region from the primary instance, and replicas can be promoted into full read/write databases for disaster recovery purposes.

Last year, we also simplified migrations to RDS for Oracle with Amazon S3 Integration for Data Ingress and Egress Capabilities. With the S3 Integration option, you can easily set up fast and secure file transfer between an Amazon RDS for Oracle instance and Amazon Simple Storage Service (S3), significantly reducing the complexity of loading and unloading data.

Amazon RDS for SQL Server

By increasing the maximum number of databases per database instance from 30 to 100, we enable you to further consolidate database instances to save on costs.

Another exciting enhancement was around migrations. When some of our customers perform Native Backups and Restores when migrating to RDS SQL Server, they sometimes experience longer downtime during the final stages of the migration process than they would prefer. With support for Native Differential and Log Backups in conjunction with Full Native Backups, you can reduce downtime to as little as 5 minutes.

Last but not least, we launched Always On Availability Groups for SQL Server 2017 Enterprise Edition. With Always On, we also launched the Always On Listener Endpoint, which supports faster failover times.

Releases across multiple Amazon RDS database engines

We expanded deployment options for Amazon RDS by launching Amazon RDS on VMware for the following database engines: Microsoft SQL Server, MySQL, and PostgreSQL. If you need to run a hybrid (cloud and on-premises) database environment, this gives you the option to use the automation behind Amazon RDS in an on-premises VMware vSphere environment. Jeff Barr wrote a detailed AWS News Blog post highlighting the available features and how to get started.

For simplified single sign-on, you can use Microsoft Active Directory (AD) via AWS Managed Active Directory Service for Amazon RDS PostgreSQL, RDS Oracle, and RDS MySQL (AD is also supported on SQL Server, and was launched on MySQL in early 2020). Now you can use the same AD for different VPCs within the same AWS Region. You can also join instances to a shared Active Directory domain owned by different accounts.

Lastly, we announced the public preview of Amazon RDS Proxy for Amazon RDS MySQL and Aurora MySQL. As its name implies, Amazon RDS Proxy sits between your application and its database to pool and share database connections, improving database efficiency and application scalability. In case of a database failover, Amazon RDS Proxy automatically connects to a standby database instance while preserving connections from your application and reducing failover times for Amazon RDS and Aurora Multi-AZ databases by up to 66%. Additionally, database credentials and access can be managed through AWS Secrets Manager and AWS Identity and Access Management (IAM), eliminating the need to embed database credentials in application code. Support for Amazon RDS for PostgreSQL and Amazon Aurora with PostgreSQL compatibility is coming soon. You can learn more by reading Using Amazon RDS Proxy with AWS Lambda.
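
As an illustration of what "sits between your application and its database" means for application code, here is a minimal Node.js sketch; the proxy endpoint, credentials, and table below are hypothetical. The application simply points its existing MySQL client at the proxy endpoint instead of the database endpoint.

import mysql from "mysql2/promise";

async function queryThroughProxy() {
  // The host is the RDS Proxy endpoint (hypothetical name), not the DB instance endpoint.
  const connection = await mysql.createConnection({
    host: "my-app-proxy.proxy-abc123xyz.us-east-1.rds.amazonaws.com",
    user: "app_user",
    password: process.env.DB_PASSWORD, // ideally sourced from AWS Secrets Manager
    database: "orders",
  });

  const [rows] = await connection.execute("SELECT id, status FROM orders LIMIT 10");
  console.log(rows);

  await connection.end();
}

queryThroughProxy().catch(console.error);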

Features by database engine

Amazon Aurora MySQL

New instance types, Regions, and versions

Manageability

Developer productivity

Performance

Availability/Disaster Recovery

Security

Amazon Aurora PostgreSQL

New instance types, Regions, and versions

Manageability

Developer productivity

Performance

Availability/Disaster Recovery

Security

Amazon RDS for MySQL/MariaDB

New instance types, Regions, and versions

Manageability

Developer productivity

Performance

Security

Amazon RDS for PostgreSQL

New instance types, Regions, and versions

Manageability

Developer productivity

Performance

Security

Amazon RDS for Oracle

New instance types, Regions, and versions

Manageability

Developer productivity

Performance

Availability/Disaster Recovery

Security

Amazon RDS for SQL Server

New instance types, Regions, and versions

Manageability

Developer productivity

Performance

Availability/Disaster Recovery

Security

Across Amazon RDS database engines

New instance types, Regions, and versions

Manageability

Performance

Security

Summary

While the last 10 years have been extremely exciting, it is still Day 1 for our service and we are excited to keep innovating on behalf of our customers! If you haven’t tried Amazon RDS yet, you can try it for free via the Amazon RDS Free Tier. If you have any questions, feel free to comment on this blog post!

About the Authors

Justin Benton is a Sr Product Manager at Amazon Web Services.

Yoav Eilat is a Sr Product Manager at Amazon Web Services.

HashiCorp Nomad Task Dependencies


Nomad 0.11 introduces the lifecycle stanza for tasks, which can be used to express dependencies between tasks in a task group, and it can even be leveraged, together with Consul, to express inter-job task dependencies.

In this blog post, we’ll go over the lifecycle stanza, task dependency patterns, and how to implement a sidecar task dependency in your own job.

The lifecycle stanza

The lifecycle stanza enables users to control when a task is run within the lifecycle of a task group allocation.

lifecycle Parameters

  • hook - Specifies when a task should be launched within the lifecycle of a group. Hooks break up the allocation lifecycle into phases. If a task’s lifecycle specifies a Prestart Hook, that task will be started before the main tasks are started. Currently only Prestart Hooks are supported, but later 0.11 releases will add PostStart, PreStop, and PostStop hooks.
  • sidecar - Indicates if a task should run for the full duration of an allocation (sidecar = true), or whether the task should be ephemeral and run until completion (sidecar = false) before starting the next lifecycle phase in the allocation.

Task Dependency Patterns

The combination of prestart hooks and the sidecar flag creates two task dependency patterns for prestart tasks: init tasks and sidecar tasks.

Init tasks are ephemeral prestart tasks that must run to completion before the main workload is started. They are commonly used to download assets or to create necessary tables for an extract-transform-load (ETL) job.

You can create an init task by adding a lifecycle stanza with hook set to prestart and sidecar to false as below.

lifecycle {
  hook    = "prestart"
  sidecar = false
}

Sidecar tasks are prestart tasks that are started before the main workload starts and run for the lifetime of the main workload. Typical sidecar tasks include log forwarders, proxies, and platform abstractions.

You can create a sidecar task by adding a lifecycle stanza with hook set to prestart and sidecar to true as below.

lifecycle {
  hook    = "prestart"
  sidecar = true
}

Describing Sidecar Tasks in a Nomad Job

Now we will demonstrate how to configure two remote_syslog containers (one for stderr and one for stdout) to ship log events to Papertrail for a sample Redis instance. In this example, the log shippers are sidecar tasks and the Redis instance is the main task.

job "example" {
  datacenters = ["dc1"]

  group "cache" {
    task "remote_syslog_stdout" {
      driver = "docker"
      config {
        image = "octohost/remote_syslog"
        args = [
          # REPLACE placeholders with your Papertrail information.
          "-p", "«papertrail port»",
          "-d", "logs.papertrailapp.com",
          "/alloc/logs/redis.stdout.0"
        ]
      }
      lifecycle {
        sidecar = true
        hook    = "prestart"
      }
    }

    task "remote_syslog_stderr" {
      driver = "docker"
      config {
        image = "octohost/remote_syslog"
        args = [
          # REPLACE placeholders with your Papertrail information.
          "-p", "«papertrail port»",
          "-d", "logs.papertrailapp.com",
          "/alloc/logs/redis.stderr.0"
        ]
      }
      lifecycle {
        sidecar = true
        hook    = "prestart"
      }
    }

    task "redis" {
      driver = "docker"
      config {
        image = "redis:3.2"
        port_map {
          db = 6379
        }
      }
      resources {
        cpu    = 500
        memory = 256
        network {
          mbits = 10
          port "db" {}
        }
      }
    }
  }
}

Once you run the job, you can navigate to your Papertrail UI and view the logs.

Getting Started

We're releasing Nomad's Task Dependencies feature in beta for Nomad 0.11 to get feedback from our practitioners. We’d like to hear what other task dependencies features you are interested in seeing. Feel free to try it out and give us feedback in the issue tracker. To see this feature in action, please register for the upcoming live demo session here. Learn more about Task dependencies at the HashiCorp Learn website.


PowerShell 7 support with AWS Lambda


Recently we released our .NET Core 3.1 AWS Lambda runtime. With our previous .NET Core 2.1 Lambda runtime we released the AWSLambdaPSCore PowerShell module that made it easy to deploy PowerShell scripts to Lambda using PowerShell 6 and the .NET Core 2.1 Lambda Runtime.

Now we have released version 2.0.0 of the PowerShell module AWSLambdaPSCore. This new version deploys PowerShell scripts to Lambda using PowerShell 7 and the new .NET Core 3.1 Lambda Runtime.

For a tutorial on how to use AWSLambdaPSCore to deploy PowerShell scripts see this previous blog post introducing the AWSLambdaPSCore module.

AWS Tools for PowerShell 4.0.0

Since we released the original version of AWSLambdaPSCore, version 4 of the AWS Tools for PowerShell has shipped. This major update contains many new features like support for mandatory parameters, easy-to-use Stream parameters, and a new -Select parameter to control the output of cmdlets. You can read more about this release here.

The biggest change in version 4 of the AWS Tools for PowerShell was a refactor to a more modular version, with a separate PowerShell module for each AWS service, as well as a shared module named AWS.Tools.Common. This means your scripts can import modules for only the services they need. When you write your PowerShell scripts to deploy to Lambda I recommend using the modular variant as opposed to the much larger AWSPowerShell.NetCore module. Your deployment bundle will be much smaller and it can be faster to import the smaller service modules, as opposed to the large all-encompassing AWSPowerShell.NetCore module.

When the deployment bundle is created by AWSLambdaPSCore, the script is inspected for #Requires statements and the requested modules are included in the deployment bundle. Be sure to have a #Requires statement for the AWS.Tools.Common module whenever using the modular AWS.Tools modules. Here is an example of adding the module for Amazon S3 to your PowerShell script.

#Requires -Modules @{ModuleName='AWS.Tools.Common';ModuleVersion='4.0.5.0'}
#Requires -Modules @{ModuleName='AWS.Tools.S3';ModuleVersion='4.0.5.0'}
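
Once the script is written, publishing it is a single AWSLambdaPSCore cmdlet; the script path, function name, and Region below are placeholders:

Publish-AWSPowerShellLambda -ScriptPath ./MyFunction.ps1 -Name MyFunction -Region us-east-1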

Conclusion

PowerShell 7 is an exciting release to the PowerShell community and we are excited to see our AWS PowerShell community take advantage of PowerShell 7 on AWS Lambda.

Our Lambda tooling for PowerShell is open source and you can check it out here. For any feedback on our Lambda PowerShell tooling, let us know there.


Blog: API Priority and Fairness Alpha


Authors: Min Kim (Ant Financial), Mike Spreitzer (IBM), Daniel Smith (Google)

This blog describes “API Priority And Fairness”, a new alpha feature in Kubernetes 1.18. API Priority And Fairness permits cluster administrators to divide the concurrency of the control plane into different weighted priority levels. Every request arriving at a kube-apiserver will be categorized into one of the priority levels and get its fair share of the control plane’s throughput.

What problem does this solve?

Today the apiserver has a simple mechanism for protecting itself against CPU and memory overloads: max-in-flight limits for mutating and for readonly requests. Apart from the distinction between mutating and readonly, no other distinctions are made among requests; consequently, there can be undesirable scenarios where one subset of the requests crowds out other requests.
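
For reference, those limits are the two existing kube-apiserver flags shown below (the values are the upstream defaults), and they are exactly what the new feature generalizes:

--max-requests-inflight=400
--max-mutating-requests-inflight=200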

In short, it is far too easy for Kubernetes workloads to accidentally DoS the apiservers, causing other important traffic, like system controllers or leader elections, to fail intermittently. In the worst cases, a few broken nodes or controllers can push a busy cluster over the edge, turning a local problem into a control plane outage.

How do we solve the problem?

The new feature “API Priority and Fairness” is about generalizing the existing max-in-flight request handler in each apiserver, to make the behavior more intelligent and configurable. The overall approach is as follows.

  1. Each request is matched by a Flow Schema. The Flow Schema states the Priority Level for requests that match it, and assigns a “flow identifier” to these requests. Flow identifiers are how the system determines whether requests are from the same source or not.
  2. Priority Levels may be configured to behave in several ways. Each Priority Level gets its own isolated concurrency pool. Priority levels also introduce the concept of queuing requests that cannot be serviced immediately.
  3. To prevent any one user or namespace from monopolizing a Priority Level, a Priority Level may be configured to have multiple queues. “Shuffle Sharding” is used to assign each flow of requests to a subset of the queues (see the sketch after this list).
  4. Finally, when there is capacity to service a request, a “Fair Queuing” algorithm is used to select the next request. Within each priority level the queues compete with even fairness.
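
Here is a toy sketch of the shuffle-sharding idea from step 3 (TypeScript for illustration, not the apiserver's implementation): each flow identifier is hashed to a small "hand" of queues, and an incoming request joins the shortest queue in that hand, so a single heavy flow cannot fill every queue.

import { createHash } from "crypto";

// Deterministically pick a small "hand" of queue indices for a flow.
function handForFlow(flowId: string, numQueues: number, handSize: number): number[] {
  const hand: number[] = [];
  let i = 0;
  while (hand.length < handSize) {
    const digest = createHash("sha256").update(`${flowId}:${i++}`).digest();
    const q = digest.readUInt32BE(0) % numQueues;
    if (!hand.includes(q)) hand.push(q);
  }
  return hand;
}

// Enqueue a request for a flow into the shortest queue of its hand.
function enqueue(queueLengths: number[], flowId: string, handSize = 8): number {
  const hand = handForFlow(flowId, queueLengths.length, handSize);
  const shortest = hand.reduce((a, b) => (queueLengths[b] < queueLengths[a] ? b : a));
  queueLengths[shortest]++;
  return shortest;
}

// An "elephant" flow sending 1,000 requests fills only its own hand of queues;
// a "mouse" flow still finds a short queue because its hand rarely overlaps completely.
const lengths = new Array(64).fill(0);
for (let i = 0; i < 1000; i++) enqueue(lengths, "elephant-A");
console.log(lengths[enqueue(lengths, "mouse-1")]); // typically 1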

Early results have been very promising! Take a look at this analysis.

How do I try this out?

You need to prepare the following things in order to try out the feature:

  • Download and install kubectl version v1.18.0 or greater
  • Enable the new API groups with the command line flag --runtime-config="flowcontrol.apiserver.k8s.io/v1alpha1=true" on the kube-apiservers
  • Switch on the feature gate with the command line flag --feature-gates=APIPriorityAndFairness=true on the kube-apiservers

After successfully starting your kube-apiservers, you will see a few default FlowSchema and PriorityLevelConfiguration resources in the cluster. These default configurations are designed to provide general protection and traffic management for your cluster. You can examine and customize the default configuration by running the usual tools, e.g.:

  • kubectl get flowschemas
  • kubectl get prioritylevelconfigurations

How does this work under the hood?

Upon arrival at the handler, a request is assigned to exactly one priority level and exactly one flow within that priority level. Understanding how FlowSchema and PriorityLevelConfiguration work will help you manage the request traffic going through your kube-apiservers.

  • FlowSchema: A FlowSchema identifies a PriorityLevelConfiguration object and the way to compute the request’s “flow identifier”. Currently we support matching requests according to: the identity making the request, the verb, and the target object. The identity can match in terms of: a username, a user group name, or a ServiceAccount. And as for the target objects, we can match by apiGroup, resource[/subresource], and namespace. (A sketch of a FlowSchema manifest follows this list.)

    • The flow identifier is used for shuffle sharding, so it’s important that requests have the same flow identifier if they are from the same source! We like to consider scenarios with “elephants” (which send many/heavy requests) vs “mice” (which send few/light requests): it is important to make sure the elephant’s requests all get the same flow identifier, otherwise they will look like many different mice to the system!
    • See the API Documentation here!
  • PriorityLevelConfiguration: Defines a priority level.

    • For apiserver self requests, and any reentrant traffic (e.g., admission webhooks which themselves make API requests), a Priority Level can be marked “exempt”, which means that no queueing or limiting of any sort is done. This is to prevent priority inversions.
    • Each non-exempt Priority Level is configured with a number of “concurrency shares” and gets an isolated pool of concurrency to use. Requests of that Priority Level run in that pool when it is not full, never anywhere else. Each apiserver is configured with a total concurrency limit (taken to be the sum of the old limits on mutating and readonly requests), and this is then divided among the Priority Levels in proportion to their concurrency shares. For example, with a total limit of 600 and three Priority Levels holding 1, 2, and 3 shares, those levels would get 100, 200, and 300 concurrent requests respectively.
    • A non-exempt Priority Level may select a number of queues and a “hand size” to use for the shuffle sharding. Shuffle sharding maps flows to queues in a way that is better than consistent hashing. A given flow has access to a small collection of queues, and for each incoming request the shortest queue is chosen. When a Priority Level has queues, it also sets a limit on queue length. There is also a limit placed on how long a request can wait in its queue; this is a fixed fraction of the apiserver’s request timeout. A request that cannot be executed and cannot be queued (any longer) is rejected.
    • Alternatively, a non-exempt Priority Level may select immediate rejection instead of waiting in a queue.
    • See the API documentation for this feature.
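
To make the FlowSchema description concrete, here is a sketch of a custom FlowSchema that gives a hypothetical controller's ServiceAccount its own flow at the default workload-low priority level. The field names follow the flowcontrol.apiserver.k8s.io/v1alpha1 schema as described above, but treat this as illustrative and consult the API documentation for the authoritative shape.

apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
kind: FlowSchema
metadata:
  name: my-controller-requests        # hypothetical name
spec:
  priorityLevelConfiguration:
    name: workload-low                # one of the default priority levels
  matchingPrecedence: 1000
  distinguisherMethod:
    type: ByUser                      # all requests from this subject share one flow
  rules:
  - subjects:
    - kind: ServiceAccount
      serviceAccount:
        name: my-controller           # hypothetical ServiceAccount
        namespace: kube-system
    resourceRules:
    - verbs: ["get", "list", "watch"]
      apiGroups: ["*"]
      resources: ["*"]
      namespaces: ["*"]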

What’s missing? When will there be a beta?

We’re already planning a few enhancements based on the alpha, and there will be more as users send feedback to our community. Here’s a list of them:

  • Traffic management for WATCH and EXEC requests
  • Adjusting and improving the default set of FlowSchema/PriorityLevelConfiguration
  • Enhancing observability on how this feature works
  • Possibly treating LIST requests differently depending on an estimate of how big their result will be
  • Join the discussion here

How can I get involved?

As always! Reach us on Slack in #sig-api-machinery, or through the mailing list. We have lots of exciting features to build and can use all sorts of help.

Many thanks to the contributors that have gotten this feature this far: Aaron Prindle, Daniel Smith, Jonathan Tomer, Mike Spreitzer, Min Kim, Bruce Ma, Yu Liao!


Automatically Enforcing AWS Resource Tagging Policies


AWS publishes best practices for how to tag your resources for cost tracking, automation, and organization. But how do you enforce that you’re doing it correctly across all of your projects? And is it really necessary to track down all those places where you missed a tag and manually patch things up? In this article, we’ll see how to use Policy as Code to enforce your team’s tagging strategies, in addition to some powerful Infrastructure as Code techniques to automate applying your tags in a consistent way across all of your projects and resources.

Why Tag Your Resources?

A tag is simply a key/value label that you can apply to your AWS infrastructure resources. Tags enable you to manage, search for, and filter resources. There aren’t any predefined tags — you can use whatever makes sense for your scenario and business requirements.

Amazon recommends many tagging strategies, including technical tags like name and environment, automation tags like dates and security requirements, business tags like owner and cost center, and security tags for compliance. Each of these enables you to apply policies.

Specifying a tag in your Infrastructure as Code is easy. Not all resources are taggable (although the most important ones are); to tag a resource, specify a map of key/values using the tags property. For example, this code declares an S3 Bucket that carries three tags that enable cost allocation reporting: "user:Project", "user:Stack", and "user:Cost Center":

let aws = require("@pulumi/aws");
let pulumi = require("@pulumi/pulumi");
// Create an S3 bucket (with tags).
let config = new pulumi.Config();
let bucket = new aws.s3.Bucket("my-bucket", {
tags: {
"user:Project": pulumi.getProject(),
"user:Stack": pulumi.getStack(),
"user:Cost Center": config.require("costCenter"),
},
});
import * as aws from "@pulumi/aws";
import * as pulumi from "@pulumi/pulumi";
// Create an S3 bucket (with tags).
const config = new pulumi.Config();
const bucket = new aws.s3.Bucket("my-bucket", {
tags: {
"user:Project": pulumi.getProject(),
"user:Stack": pulumi.getStack(),
"user:Cost Center": config.require("costCenter"),
},
});
import pulumi
import pulumi_aws as aws
# Create an S3 bucket (with tags).
config = pulumi.Config()
bucket = aws.s3.Bucket('my-bucket',
tags={
'user:Project': pulumi.get_project(),
'user:Stack': pulumi.get_stack(),
'user:Cost Center': config.require('costCenter'),
},
)
package main
import (
"github.com/pulumi/pulumi-aws/sdk/go/aws/s3"
"github.com/pulumi/pulumi/sdk/go/pulumi"
"github.com/pulumi/pulumi/sdk/go/pulumi/config"
)
func main() {
pulumi.Run(func(ctx *pulumi.Context) error {
// Create an S3 Bucket (with tags):
 _, err := s3.NewBucket(ctx, "my-bucket", &s3.BucketArgs{
Tags: pulumi.Map{
"User:Project": pulumi.String(ctx.Project()),
"User:Stack": pulumi.String(ctx.Stack()),
"User:Cost Center": pulumi.String(config.Require(ctx, "costCenter")),
},
})
return err
})
}
using Pulumi;
using Pulumi.Aws.S3;
using System.Collections.Generic;
using System.Threading.Tasks;
class Program {
static Task Main() {
return Deployment.RunAsync(() => {
// Create an S3 Bucket (with tags):
 var config = new Config();
var bucket = new Bucket("my-bucket", new BucketArgs {
Tags = new Dictionary<string, string> {
{ "User:Project", Deployment.Instance.ProjectName },
{ "User:Stack", Deployment.Instance.StackName },
{ "User:Cost Center", config.Get("costCenter") },
},
});
});
}
}

In this example, we’re using the project and stack names from the current project, and requiring an explicit cost center configured using pulumi config set costCenter 11223344. Of course, this is just for illustration purposes — we can easily use anything for these keys and values.

Suppose our team requires certain tags, though: How do we ensure we don’t forget them?

Enforcing Tagging Strategies

Policy as Code is a way to enforce infrastructure policies, such as ensuring we’re not opening databases to the internet, that we’re leveraging strong encryption in all the right places, that we aren’t violating cost policies, and so on. And it’s a great way to ensure we’re tagging everything.

Imagine we forgot to tag our S3 Bucket from earlier:

// Oops -- no tags!
let bucket = new aws.s3.Bucket("my-bucket");
// Oops -- no tags!
const bucket = new aws.s3.Bucket("my-bucket");
# Oops -- no tags!
bucket = aws.s3.Bucket('my-bucket')
// Oops -- no tags!
_, err := s3.NewBucket(ctx, "my-bucket", nil)
return err
// Oops -- no tags!
var bucket = new Bucket("my-bucket");

By applying the policy defined below, deployments that are missing the required tags will fail:

Tag Policy Failed

This approach not only ensures we don’t forget, but if you’re an infrastructure or security engineer, it also lets you apply the policy across your organization to make sure your team doesn’t forget either.

Defining our Tags Enforcer Policy

This policy is ultimately defined as a simple policy pack as follows:

let policy = require("@pulumi/policy");
let isTaggable = require("../lib/taggable").isTaggable;
new policy.PolicyPack("aws-tags-policies", {
policies: [{
name: "check-required-tags",
description: "Ensure required tags are present on all AWS resources.",
configSchema: {
properties: {
requiredTags: {
type: "array",
items: { type: "string" },
},
},
},
validateResource: (args, reportViolation) => {
const config = args.getConfig();
const requiredTags = config.requiredTags;
if (requiredTags && isTaggable(args.type)) {
const ts = args.props["tags"];
for (const rt of requiredTags) {
if (!ts || !ts[rt]) {
reportViolation(
`Taggable resource '${args.urn}' is missing required tag '${rt}'`);
}
}
}
},
}],
});
import * as policy from "@pulumi/policy";
import { isTaggable } from "../lib/taggable";
new policy.PolicyPack("aws-tags-policies", {
policies: [{
name: "check-required-tags",
description: "Ensure required tags are present on all AWS resources.",
configSchema: {
properties: {
requiredTags: {
type: "array",
items: { type: "string" },
},
},
},
validateResource: (args, reportViolation) => {
const config = args.getConfig<AwsTagsPolicyConfig>();
const requiredTags = config.requiredTags;
if (requiredTags && isTaggable(args.type)) {
const ts = args.props["tags"];
for (const rt of requiredTags) {
if (!ts || !ts[rt]) {
reportViolation(
`Taggable resource '${args.urn}' is missing required tag '${rt}'`);
}
}
}
},
}],
});
interface AwsTagsPolicyConfig {
requiredTags?: string[];
}

This project defines a policy pack containing a set of policy rules, in this case, just one. That rule takes in a configurable set of tags to ensure they exist on every taggable AWS resource. If a tag is missing, a violation is reported and the deployment fails.

This leverages a library that helps to identify taggable AWS resources (see this repo for the full example code). Although this policy is written in TypeScript, it can be applied to stacks written in any language.
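
For context, a taggable-resource check like that can be as simple as a lookup against a set of Pulumi type tokens. The sketch below is hypothetical and hard-codes a few tokens, whereas the helper in the linked repo covers the full list of taggable AWS resource types.

const taggableTypes = new Set<string>([
  "aws:s3/bucket:Bucket",
  "aws:ec2/instance:Instance",
  "aws:ec2/securityGroup:SecurityGroup",
  // ...the real helper lists many more taggable AWS resource types
]);

export function isTaggable(t: string): boolean {
  return taggableTypes.has(t);
}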

At this point, we can apply our policy pack to our infrastructure project in two ways: at the CLI or in the SaaS web console.

Applying Tags Enforcement in the CLI

This policy pack is configurable so that you can enforce arbitrary tags without needing to change the pack’s code, making it reusable. For the CLI scenario, we will create a policy-config.json file that specifies the same three required tags shown above:

{
"all": "mandatory",
"check-required-tags": {
"requiredTags": [
"user:Project",
"user:Stack",
"user:Cost Center"
]
}
}

Next, we can manually specify that our policy pack is applied using the CLI’s --policy-pack flag, along with --policy-pack-config to point at our configuration file:

$ pulumi up \
 --policy-pack=./policy \
 --policy-pack-config=./policy-config.json

Applying Tags Enforcement in the SaaS Console

It’s nice to be able to use the CLI for this — and all of that is available in open source, no matter whether you’re using the SaaS or not. However, it requires manually distributing policy packs and remembering to pass --policy-pack with the right configuration for every update.

The Pulumi Enterprise SaaS console lets you manage policy packs and apply them to your organization’s stacks in a central way — and then every update that the pack is applied to automatically runs the policy checks. Let’s give it a try.

First, let’s publish the pack:

$ pulumi policy publish
Obtaining policy metadata from policy plugin
Compressing policy pack
Uploading Policy Pack to Pulumi service
Publishing "aws-tags-policies" - version 0.0.1 to "acmepolicy"
Permalink: https://app.pulumi.com/acmepolicy/policypacks/aws-tags-policies/0.0.1

This stores the policy pack in the Pulumi SaaS:

Policy Pack Page

From there, we can enable it in the UI. First, we go to our organization’s “POLICIES” tab:

Organization Policies Page

Every organization gets a default policy group, and that’s what we’ll use here. However, for sophisticated scenarios, you may want multiple policy groups (for instance, to enforce different tags for your production environments than your development ones).

Let’s “ADD” the new pack. This pops a dialog we can use to select the pack, its version, and our desired Enforcement Level (we’ll pick “mandatory”):

Enable Policy Pack

Finally, further down in the dialog, we can enter the tags we’d like to enforce:

Configure Policy Pack

Now that it is configured, all subsequent updates across the organization will run policy checks.

Automatically Applying Tags

In all cases, after manually fixing our bucket and adding the correct tags, the policy will pass:

Tag Policy Succeeded

This is great — we can now rest assured that all taggable AWS resources will be tagged before we provision them. But it sure is tedious to add these tags to every resource and then get policy violation errors anytime we forget. One of the advantages of using Infrastructure as Code is that we can automate the injection of these tags.

To do that, let’s write a function that detects taggable resources and merges in automatic tags:

let pulumi = require("@pulumi/pulumi");
let isTaggable = require("./taggable").isTaggable;
/**
 * registerAutoTags registers a global stack transformation that merges a set
 * of tags with whatever was also explicitly added to the resource definition.
 */
module.exports = {
registerAutoTags: function (autoTags) {
pulumi.runtime.registerStackTransformation((args) => {
if (isTaggable(args.type)) {
args.props["tags"] = Object.assign(args.props["tags"], autoTags);
return { props: args.props, opts: args.opts };
}
});
},
};
import * as pulumi from "@pulumi/pulumi";
import { isTaggable } from "./taggable";
/**
 * registerAutoTags registers a global stack transformation that merges a set
 * of tags with whatever was also explicitly added to the resource definition.
 */
export function registerAutoTags(autoTags: Record<string, string>): void {
pulumi.runtime.registerStackTransformation((args) => {
if (isTaggable(args.type)) {
args.props["tags"] = { ...args.props["tags"], ...autoTags };
return { props: args.props, opts: args.opts };
}
return undefined;
});
}
import pulumi
from taggable import is_taggable
# registerAutoTags registers a global stack transformation that merges a set
# of tags with whatever was also explicitly added to the resource definition.
def register_auto_tags(auto_tags):
pulumi.runtime.register_stack_transformation(lambda args: auto_tag(args, auto_tags))
# auto_tag applies the given tags to the resource properties if applicable.
def auto_tag(args, auto_tags):
if is_taggable(args.type_):
args.props['tags'] = {**(args.props['tags'] or {}), **auto_tags}
return pulumi.ResourceTransformationResult(args.props, args.opts)
package main
import (
"reflect"
"github.com/pulumi/pulumi/sdk/go/pulumi"
)
// registerAutoTags registers a global stack transformation that merges a set
// of tags with whatever was also explicitly added to the resource definition.
func registerAutoTags(ctx *pulumi.Context, autoTags map[string]string) {
ctx.RegisterStackTransformation(
func(args *pulumi.ResourceTransformationArgs) *pulumi.ResourceTransformationResult {
if isTaggable(args.Type) {
// Use reflection to look up the Tags property and merge the auto-tags.
 ptr := reflect.ValueOf(args.Props)
val := ptr.Elem()
tags := val.FieldByName("Tags")
var tagsMap pulumi.Map
if !tags.IsZero() {
tagsMap = tags.Interface().(pulumi.Map)
} else {
tagsMap = pulumi.Map(map[string]pulumi.Input{})
}
for k, v := range autoTags {
tagsMap[k] = pulumi.String(v)
}
tags.Set(reflect.ValueOf(tagsMap))
return &pulumi.ResourceTransformationResult{
Props: args.Props,
Opts: args.Opts,
}
}
return nil
},
)
}
static ResourceTransformation RegisterAutoTags(Dictionary<string, string> autoTags) {
return args => {
if (IsTaggable(args.Resource.GetResourceType())) {
// Use reflection to look up the Tags property and merge the auto-tags.
 var tagp = args.Args.GetType().GetProperty("Tags");
var tags = (InputMap<object>)tagp.GetValue(args.Args, null) ?? new InputMap<object>();
foreach (var tag in autoTags) {
tags[tag.Key] = tag.Value;
}
tagp.SetValue(args.Args, tags, null);
return new ResourceTransformationResult(args.Args, args.Options);
}
return null;
};
}

Now we can go back to our main program, use this new module, remove the explicit tags, and every taggable AWS resource we create will automatically get the tags we’ve specified:

let aws = require("@pulumi/aws");
let pulumi = require("@pulumi/pulumi");
let registerAutoTags = require("./autotag").registerAutoTags;
// Automatically inject tags.
let config = new pulumi.Config();
registerAutoTags({
"user:Project": pulumi.getProject(),
"user:Stack": pulumi.getStack(),
"user:Cost Center": config.require("costCenter"),
});
// Create a bunch of AWS resources -- with auto-tags!

let bucket = new aws.s3.Bucket("my-bucket");
let group = new aws.ec2.SecurityGroup("web-secgrp", {
ingress: [
{ protocol: "tcp", fromPort: 22, toPort: 22, cidrBlocks: ["0.0.0.0/0"] },
{ protocol: "tcp", fromPort: 80, toPort: 80, cidrBlocks: ["0.0.0.0/0"] },
],
});
let server = new aws.ec2.Instance("web-server-www", {
instanceType: "t2.micro",
ami: "ami-0c55b159cbfafe1f0",
vpcSecurityGroupIds: [ group.id ],
});
import * as aws from "@pulumi/aws";
import * as pulumi from "@pulumi/pulumi";
import { registerAutoTags } from "./autotag";
// Automatically inject tags.
const config = new pulumi.Config();
registerAutoTags({
"user:Project": pulumi.getProject(),
"user:Stack": pulumi.getStack(),
"user:Cost Center": config.require("costCenter"),
});
// Create a bunch of AWS resources -- with auto-tags!

const bucket = new aws.s3.Bucket("my-bucket");
const group = new aws.ec2.SecurityGroup("web-secgrp", {
ingress: [
{ protocol: "tcp", fromPort: 22, toPort: 22, cidrBlocks: ["0.0.0.0/0"] },
{ protocol: "tcp", fromPort: 80, toPort: 80, cidrBlocks: ["0.0.0.0/0"] },
],
});
const server = new aws.ec2.Instance("web-server-www", {
instanceType: "t2.micro",
ami: "ami-0c55b159cbfafe1f0",
vpcSecurityGroupIds: [ group.id ],
});
import pulumi
import pulumi_aws as aws
from autotag import register_auto_tags
# Automatically inject tags.
config = pulumi.Config()
register_auto_tags({
'user:Project': pulumi.get_project(),
'user:Stack': pulumi.get_stack(),
'user:Cost Center': config.require('costCenter'),
})
# Create a bunch of AWS resources -- with auto-tags!
bucket = aws.s3.Bucket('my-bucket')
group = aws.ec2.SecurityGroup('web-secgrp',
ingress=[
{ 'protocol': 'tcp', 'from_port': 22, 'to_port': 22, 'cidr_blocks': ['0.0.0.0/0']},
{ 'protocol': 'tcp', 'from_port': 80, 'to_port': 80, 'cidr_blocks': ['0.0.0.0/0']},
],
)
server = aws.ec2.Instance('web-server-www',
instance_type='t2.micro',
ami='ami-0c55b159cbfafe1f0',
vpc_security_group_ids=[ group.id ],
)
package main
import (
"github.com/pulumi/pulumi-aws/sdk/go/aws/ec2"
"github.com/pulumi/pulumi-aws/sdk/go/aws/s3"
"github.com/pulumi/pulumi/sdk/go/pulumi"
"github.com/pulumi/pulumi/sdk/go/pulumi/config"
)
func main() {
pulumi.Run(func(ctx *pulumi.Context) error {
// Automatically inject tags.
 registerAutoTags(ctx, map[string]string{
"User:Project": ctx.Project(),
"User:Stack": ctx.Stack(),
"User:Cost Center": config.Require(ctx, "costCenter"),
})
// Create a bunch of AWS resources -- with auto-tags!

_, err := s3.NewBucket(ctx, "my-bucket", nil)
if err != nil {
return err
}
grp, err := ec2.NewSecurityGroup(ctx, "web-secgrp", &ec2.SecurityGroupArgs{
Ingress: ec2.SecurityGroupIngressArray{
ec2.SecurityGroupIngressArgs{
Protocol: pulumi.String("tcp"),
FromPort: pulumi.Int(80),
ToPort: pulumi.Int(80),
CidrBlocks: pulumi.StringArray{pulumi.String("0.0.0.0/0")},
},
},
})
if err != nil {
return err
}
_, err = ec2.NewInstance(ctx, "web-server-www", &ec2.InstanceArgs{
InstanceType: pulumi.String("t2.micro"),
Ami: pulumi.String("ami-0c55b159cbfafe1f0"),
VpcSecurityGroupIds: pulumi.StringArray{grp.ID()},
})
return err
})
}
using Pulumi;
using Pulumi.Aws.Ec2;
using Pulumi.Aws.Ec2.Inputs;
using Pulumi.Aws.S3;
using System.Collections.Generic;
using System.Reflection;
class MyStack : Stack {
static Config config = new Config();
public MyStack() : base(
new StackOptions {
ResourceTransformations = {
RegisterAutoTags(new Dictionary<string, string> {
{ "User:Project", Deployment.Instance.ProjectName },
{ "User:Stack", Deployment.Instance.StackName },
{ "User:Cost Center", config.Require("costCenter") },
}),
},
}
)
{
// Create a bunch of AWS resources -- with auto-tags!

var bucket = new Bucket("my-bucket", new BucketArgs());
var grp = new SecurityGroup("web-secgrp", new SecurityGroupArgs {
Ingress = {
new SecurityGroupIngressArgs {
Protocol = "tcp", FromPort = 80, ToPort = 80, CidrBlocks = {"0.0.0.0/0"},
},
}
});
var srv = new Instance("web-server-www", new InstanceArgs {
InstanceType = "t2.micro",
Ami = "ami-0c55b159cbfafe1f0",
VpcSecurityGroupIds = { grp.Id },
});
}
// ...
}

Notice that we didn’t specify any tags by hand for the resource definitions and yet if we run a pulumi preview --diff, we see that the correct tags are applied automatically, thanks to the global stack transformation:

Policy Pack Page

And running an ordinary update now passes:

Tag Policy Succeeded (Many Resources)

Try adding some resources of your own — VPCs, EC2 instances, security groups, EKS clusters, anything — and they will automatically get tagged with these same cost center tags.

In Conclusion

In this post, we’ve seen some ways to enforce AWS tagging best practices. This includes manually applying tags using Infrastructure as Code, checking that the desired tags are applied to the relevant resources using Policy as Code, and even using some advanced techniques to automatically tag resources, reducing manual efforts and the chance of human error.

Check out these resources to get started with Pulumi’s open source platform:

Good luck making sure your team’s resources are tagged early and often with less manual effort!
