Blogs

AWS API Gateway vs. Application Load Balancer (ALB)

In this article, we dive into how these two types of HTTP networking services compare, using the AWS offerings as the baseline: API Gateway and Application Load Balancer (ALB).


Scalability

Both are highly scalable services, to the point that scalability should not be a concern for most use cases. For high-throughput applications, though, there are differences that need to be considered.

API Gateway has a limit of 10,000 RPS (requests per second), which might not be enough for some cases. When we look at Regional and Edge APIs, the limits are a lot more concerning: 600 and 120, respectively. More troublesome is that the last two can’t be increased, while the larger quota can be raised on a per-request basis.

The 10,000 limit also benefits from burst capacity – up to 5,000 additional RPS – during peak demand. However, AWS does not make any hard commitments, and developers can’t control or predict how the burst capacity will be allocated. In practice, it’s risky to rely on it for user-facing endpoints.

ALB, on the other hand, is virtually unlimited. In fact, AWS specifies no limits on connections per second or concurrent connections on the service quotas page. It can easily scale to handle hundreds of thousands of requests per second and, in principle, could go beyond millions of RPS. At these levels, it’s probably a good idea to pre-warm the load balancer with help from the AWS support team, as well as to conduct stress tests and make sure the architecture is well optimized for the load.


Reliability and Availability

Both services are managed by AWS. API Gateway is highly reliable and available out of the box; developers do not have to worry about anything here. ALB requires developers to specify more than one Availability Zone per region to reach a higher level of availability.

Integrations

For Serverless applications, API Gateway was the only way to go until recently, when AWS announced the integration of ALB with Lambda functions.

Apart from Lambda functions, ALB can route requests to EC2 instances, ECS containers, and IP addresses. It also integrates with AWS Cognito for user authentication and authorization purposes.

API Gateway, on the other hand, is much better integrated with AWS’s managed services. Apart from Lambda functions, it can also integrate with virtually any other service that is available through HTTP requests, such as DynamoDB tables, SQS queues, S3 buckets, etc. Even external HTTP endpoints hosted outside of AWS can be integrated through HTTP.

It’s also possible to customize requests before forwarding them to downstream resources, as well as the responses from these resources before sending them back to clients. This way, API Gateway can even replace a Lambda function in many use cases where it would serve simply as an intermediary, cutting costs and improving performance.

Request Routing Capabilities

API Gateway supports path-based routing. In other words, developers can configure which resources will receive incoming API requests based on the URL requested by the client.

ALB, on the other hand, offers a rule-based routing mechanism. Apart from supporting a URL path-based approach similarly to API Gateway, it also provides:

  • Requester Hostname

  • Requester IP address (CIDR blocks)

  • HTTP Headers

  • HTTP Request method (POST, GET, etc)

  • Key/value pairs incoming as query strings


It is possible to combine multiple conditions based on the options listed above, but there are some limitations. Wildcards are also supported, making the rule system flexible enough for most use cases.
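As a rough illustration, here is how one of these rule combinations might be defined with the AWS SDK for Python (boto3). The listener ARN, target group ARN, priority, and condition values are placeholders, not values from this article:

```python
import boto3

elbv2 = boto3.client("elbv2")

# Hypothetical rule: forward GET requests for /api/* on api.example.com
# to a specific target group. ARNs and values are placeholders.
elbv2.create_rule(
    ListenerArn="arn:aws:elasticloadbalancing:us-east-1:123456789012:listener/app/my-alb/abc/def",
    Priority=10,
    Conditions=[
        {"Field": "host-header", "Values": ["api.example.com"]},
        {"Field": "path-pattern", "Values": ["/api/*"]},
        {"Field": "http-request-method",
         "HttpRequestMethodConfig": {"Values": ["GET"]}},
    ],
    Actions=[
        {"Type": "forward",
         "TargetGroupArn": "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/api-tg/123abc"},
    ],
)
```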

Cost

Based on a fully Serverless pricing model, API Gateway charges only for requests received. The price depends on what type of API service is used:

  • Rest APIs: from $1.51 to $3.50 per million requests

  • HTTP APIs: from $0.90 to $1.00 per million requests

  • WebSockets: from $0.80 to $1.00 per million requests, plus $0.25 per million connection minutes


ALB charges based on two dimensions: time and resource usage. The first is straightforward: $0.0225 per hour. The second is a bit more complex: $0.008 per LCU-hour. LCU measures traffic processed by ALB. One LCU can support:

  • 25 new connections per second

  • 3,000 active connections per minute

  • 1 GB of traffic per hour for EC2 instances, or 0.4 GB per hour for Lambda functions

  • 1,000 routing rule evaluations per second

When any of these dimensions is exceeded, the ALB charges additional LCUs for that hour, based on the dimension with the highest usage.
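To make the two pricing models concrete, here is a small back-of-the-envelope calculation in Python using the numbers above. The traffic profile is invented, only the charges quoted in this article are included, and the sketch assumes LCU usage is driven by the busiest dimension:

```python
# Hypothetical workload: 30 days, 50 new connections/s, 6,000 active connections/min,
# 2 GB/hour of traffic to EC2 targets, 500 rule evaluations/s.
hours = 30 * 24

# LCUs consumed = demand on each dimension divided by what one LCU supports,
# taking the busiest dimension as the billed value (assumption).
lcus = max(
    50 / 25,        # new connections per second
    6000 / 3000,    # active connections per minute
    2 / 1.0,        # GB per hour (EC2 targets)
    500 / 1000,     # rule evaluations per second
)

alb_cost = hours * 0.0225 + hours * lcus * 0.008
print(f"ALB: ~${alb_cost:.2f}/month at {lcus:.1f} LCUs")

# API Gateway HTTP API for roughly the same month of traffic,
# assuming 100 million requests at $1.00 per million.
api_gw_cost = 100 * 1.00
print(f"API Gateway (HTTP API): ~${api_gw_cost:.2f}/month")
```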



Best use of Kinesis Stream

Amazon Kinesis is the real-time stream processing service of AWS. Whether you have video, audio, or IoT streaming data to handle, Kinesis is the way to go.

Kinesis is a serverless managed service that integrates nicely with other services like Lambda or S3. Often, you will use it when SQS or SNS is too low-level.

But as with all the other services on AWS, Kinesis is a professional tool that comes with its share of complications. This article will discuss the most common issues and explain how to fix them. So, let’s get going!

1. What Limits Apply when AWS Lambda is Subscribed to a Kinesis Stream?

If your Kinesis stream only has one shard, the Lambda function won’t be called in parallel even if multiple records are waiting in the stream. To scale up to numerous parallel invocations, you need to add more shards to a Kinesis Stream.

Kinesis will strictly serialize all your invocations per shard. This is a nice feature for controlling your parallel Lambda invocations. But it can slow down overall processing if the function takes too long to execute.

If you aren’t relying on previous events, you can use more shards, and Lambda will automatically scale up to more concurrent invocations. But keep in mind that Lambda itself has a soft limit of 1,000 concurrent invocations. You can reach out to AWS to get this limit raised. There isn’t an explicitly defined hard limit above that, but AWS mentions multiples of 10,000.
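For reference, this is roughly what wiring a function to a stream looks like with boto3; the stream ARN and function name are placeholders, and the values simply illustrate the per-shard behavior described above:

```python
import boto3

lambda_client = boto3.client("lambda")

# One Kinesis shard feeds at most one concurrent invocation of this function,
# so total parallelism is bounded by the stream's shard count.
lambda_client.create_event_source_mapping(
    EventSourceArn="arn:aws:kinesis:us-east-1:123456789012:stream/my-stream",  # placeholder
    FunctionName="process-stream-records",                                     # placeholder
    StartingPosition="LATEST",
    BatchSize=100,   # up to 100 records handed to each invocation
)
```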

2. Data Loss with Kinesis Streams and Lambda

If you call put_record in a loop to publish records from a Lambda function to a Kinesis stream, this can fail mid-loop. To fix this, make sure you catch any errors the put_record method throws; otherwise, your function will crash and only partially publish the list of records.

If one Lambda invocation is responsible for publishing multiple records to a Kinesis stream, you have to make sure a crash of the Lambda function doesn’t lose data. Depending on your use case, this could mean you need to use retries or another queue in front of your Lambda function.

You can also try to catch any errors instead of crashing and then put the missing records somewhere else to ensure they don’t get lost.
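A minimal sketch of that defensive pattern, assuming a boto3 Kinesis client and a hypothetical stream name; failed records are collected instead of crashing the function:

```python
import boto3
from botocore.exceptions import ClientError

kinesis = boto3.client("kinesis")

def publish(records):
    """Publish records one by one; return the ones that could not be written."""
    failed = []
    for record in records:
        try:
            kinesis.put_record(
                StreamName="my-stream",          # placeholder stream name
                Data=record["data"],
                PartitionKey=record["key"],
            )
        except ClientError:
            # Don't crash mid-loop: keep the record so it can be retried
            # or parked somewhere durable (e.g. an SQS queue or S3).
            failed.append(record)
    return failed
```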

3. InvokeAccessDenied Error When Pushing Records from Firehose to Lambda

You’re trying to push a record from Kinesis Firehose to a Lambda function but get an error. This is usually a permission issue with IAM roles. To fix this, make sure to assign your firehose the correct IAM role.

In the Resource section of your policy document, you need to make sure all your Lambda functions’ ARNs are listed. You achieve this with either a wildcard in the ARN or an array of ARNs.
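As an illustration only, a policy statement along these lines, attached to the role assigned to the Firehose, covers the invocation permission. The account ID, region, role name, and function name are placeholders:

```python
import json
import boto3

iam = boto3.client("iam")

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        # GetFunctionConfiguration is typically also needed for Firehose transforms.
        "Action": ["lambda:InvokeFunction", "lambda:GetFunctionConfiguration"],
        # Wildcard so every version/alias of the function is covered.
        "Resource": ["arn:aws:lambda:us-east-1:123456789012:function:my-transform-fn*"],
    }],
}

iam.put_role_policy(
    RoleName="my-firehose-delivery-role",      # placeholder role assigned to the Firehose
    PolicyName="allow-lambda-invoke",
    PolicyDocument=json.dumps(policy),
)
```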

But there can be many other permission problems that prevent invocation. Some of them are:

  • Missing the “Action”: [“lambda:InvokeFunction”]

  • Having an “Effect”: “Deny” somewhere

  • Assigning the wrong role to the firehose


4. Error When Trying to Update the Shard Count

You tried to update the shard count too often in a given period. The UpdateShardCount method has rather tight limits. To get around this issue, you can call other functions like SplitShard and MergeShards, with more generous quotas.

Often, you don’t know how many shards are sufficient to handle your load, so you have to update their numbers over time. AWS limits how you meddle with the shard count. To quote the docs here, you can’t

  • Scale more than ten times per rolling 24-hour period per stream

  • Scale up to more than double your current shard count for a stream

  • Scale down below half your current shard count for a stream

  • Scale up to more than 10,000 shards in a stream

  • Scale a stream with more than 10,000 shards down unless the result is fewer than 10,000 shards

  • Scale up to more than the shard limit for your account

If you use other methods, you can get around some of these limitations, which gives you more flexibility around sharding.
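A small sketch of resharding with boto3, with the quota errors from the list above caught explicitly; the stream name and target count are placeholders:

```python
import boto3

kinesis = boto3.client("kinesis")

try:
    kinesis.update_shard_count(
        StreamName="my-stream",          # placeholder
        TargetShardCount=4,              # must stay within the doubling/halving limits above
        ScalingType="UNIFORM_SCALING",
    )
except kinesis.exceptions.LimitExceededException:
    # You hit one of the quotas above (e.g. more than ten scalings in 24 hours).
    # Either wait, or fall back to SplitShard / MergeShards for finer control.
    raise
```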

5. Shard is Not Closed

You interacted too soon after you created a new stream. Creating a new stream can take up to 10 minutes to complete. You can set timeouts after creating a stream or ensure that you retry a few times to fix this.

Creating new streams or shards isn’t an instant action. It happens very quickly, but you might have to wait for minutes in the worst case. As with any distributed system, you have to keep latencies in mind. Otherwise, your logs will be littered with errors.
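One way to handle that delay is to block on the built-in boto3 waiter after creating the stream; a small sketch with placeholder names:

```python
import boto3

kinesis = boto3.client("kinesis")

kinesis.create_stream(StreamName="my-stream", ShardCount=2)   # placeholder values

# Poll the stream's status until it reaches ACTIVE, instead of
# calling put_record immediately and getting errors.
waiter = kinesis.get_waiter("stream_exists")
waiter.wait(StreamName="my-stream", WaiterConfig={"Delay": 10, "MaxAttempts": 18})
```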

Summary

If you have to process your data or media in real-time, it’s best to go for Kinesis on AWS.

Sadly, it’s not as straightforward as SQS and SNS, but it’s also more flexible than those services.

Your best course of action is to learn about the limitations of the service so your logs aren’t littered with avoidable error messages. Also, make sure to program your Lambda functions robustly so they don’t crash with half your data unprocessed.


Going Serverless

Serverless is a tool for radical change for both companies looking to rocket into the future and for developers looking to differentiate themselves in the job market.

Cloud Computing—What is it?

Cloud computing allows anyone to harness the power of services that took tens of thousands of engineering hours to develop and millions of dollars in investment to improve. The majority of cloud computing services expose an easy-to-use interface that developers can leverage to build incredible, world-class applications.

You most likely use cloud computing every day: Gmail, Google Drive, Netflix, and practically every other technology or software running today leverages cloud computing in some form.

Cloud Computing—A Brief History

The history of cloud computing is important so you know why companies like Amazon got involved early on.

Back in the mid-2000s, Amazon.com started facing issues scaling its online store. Amazon had a growing team of developers and was trying to find a way to improve TTV (time to value). Amazon did this by developing internal services that could be shared by other internal teams. This eventually revealed the idea that kicked off cloud computing: other companies were most likely facing exactly what Amazon was facing, and these abstractions built to speed up internal processes could be extended to others.

This meant that Amazon would be improving these services for themselves and for customers at the same time, opening up additional revenue streams and not letting all of that world-class engineering talent go to waste.

Soon after, the cloud was born: in 2006, Amazon Web Services (AWS) released its first service, AWS S3, which, similar in spirit to Google Drive, allows developers to programmatically store images, documents, etc. through a simple API from their applications. Think Instagram image upload or Netflix asset storage for videos, audio, subtitles, and so on. It was easy to use, cheap, and highly scalable.

Since then, AWS has continuously upgraded AWS S3 in the background, making it even more performant, cheaper, and so on. The true beauty of the cloud.

Following the 2006 release of AWS S3, hundreds of additional cloud services have been released by AWS which range from fully managed databases with AWS DynamoDB to fully managed machine learning with AWS Rekognition.

The market took notice of the impact AWS was having, and competitors sprang up, such as Microsoft, Google, IBM, and many more.

Serverless—One Abstraction Further

Fast forward to 2015 and AWS took their vision of abstracting over common use-cases like image storage to the next level. This next level is what we commonly refer to as "serverless".

AWS released the first version of a service called AWS Lambda, which allowed developers to no longer worry about provisioning servers, scaling, and so on. This narrowed a developer's focus to only their application code or business logic.

This gave companies and developers the freedom to stop worrying about all the underlying systems and stay focused on their product.

AWS Lambda had all of this baked in, and the idea was simple: upload your code and run it. When it runs, you pay AWS, and when it doesn't, you're billed zero.

However, serverless was still maturing at this point and the integrations were not enough for most companies to make a hard switch away from their self-managed servers or containers to AWS Lambda.

By 2016, AWS had released AWS API Gateway, and now you had the ability to define a REST API that would connect to your AWS Lambda functions. This was a game changer. Leading up to 2016, the common way to run a REST API was to use a server or container. Now you could run much of the same code as before, with some modification, but with built-in horizontal scaling, pay-per-use pricing, and so on.

No longer did you have to pay for servers or containers to keep your REST APIs online 24 hours a day. This was a massive cost saving for customers and meant less time focused on underlying systems and more time spent innovating for customers.

Serverless—A Background On The Word

Serverless in 2016 was still commonly referred to as FaaS (Functions as a Service) and when someone would say serverless or FaaS they typically meant AWS Lambda.

Shortly after the release of AWS Lambda, other cloud providers started offering somewhat competitive products like Microsoft Azure Functions, Google Cloud Functions, etc.

However, AWS Lambda was first to market with widespread adoption, and AWS has continued innovating since that first release. That is why, as of today (01/2021), AWS Lambda would still be our primary choice for building serverless applications: AWS never stopped innovating, and AWS Lambda has continued to be improved automatically for users in the background, similar to what I described with AWS S3 above.

As companies and developers switched some workloads to AWS Lambda and started taking serverless seriously, the word "serverless" didn't quite mean FaaS or AWS Lambda any longer. The word was limiting and needed to be expanded.

To build and run a modern web/mobile application, as many readers will know, a slew of other resources is required, including databases, asset storage, cron jobs, logging, monitoring, security, and so on.

AWS Lambda may power your custom business logic, but AWS Lambda still needed to connect to other things to make an application useful.

This is where "serverless" turned into BaaS (Backend as a Service) meaning backend resources like databases would now be offered fully-managed for the user similar to AWS Lambda. Meaning companies and developers didn't need to worry about scaling, provisioning servers or containers, and paying 24hrs/day.

Utilizing services under this BaaS umbrella meant savings on the raw resource expense as well as on the human engineering/operations expense.

More importantly, they increased focus on their product by eliminating "undifferentiated heavy lifting," as Werner Vogels would say, which had a direct effect on TTV (time to value), a common measure of innovation.

To highlight a service which is commonly under the serverless or BaaS umbrella we have AWS DynamoDB, a NoSQL database which scales to the moon, has pay-per-use pricing, and just like AWS Lambda most things are handled under-the-hood and automatically upgraded/improved for you.

Nowadays it's very common to hear serverless mean "fully-managed" in addition to FaaS or BaaS.

This definition includes services like:

  • AWS Lambda for business logic

  • AWS DynamoDB for NoSQL database

  • AWS S3 for asset storage

  • AWS Cognito for user authentication

  • AWS API Gateway for your REST APIs

  • AWS AppSync for GraphQL APIs


Each one of these services can work together and AWS Lambda acts as the "glue" connecting them together like a spiderweb of cloud greatness.

With that background out of the way, let's jump back to why serverless is important.

Serverless—Why Learn It, Why Care

The simple answer: it's the present and the future.

It was the future back when I was in code school and first heard the word serverless uttered and it's the same today.

Millions of companies around the world have had success in the past. They've found a market, built a product, and achieved product market fit. However, sometimes our past successes can cripple our ability to stay competitive.

As humans we develop processes, protocols, and eventually we settle into a comfortable status-quo. It's natural progression, but it's not without some drawbacks.

A status quo has the ability to block new ideas and ultimately lead to companies being disrupted by young, lean startups. And the bad part is that even if you see it coming, and even if you know you need to change, technical limitations and company culture can make it so difficult that you simply can't move fast enough.

That's a sad day, and it's what every company needs to take seriously. It's a situation where the phrase "adapt or die" has been tossed around quite a few times in recent years. We've seen many industries be disrupted, and the trend is accelerating.

It's dramatic, but it's unfortunately true.

Some of the technical limitations to being agile and lean stem from the sheer fact that a company was successful previously. When companies have a successful product or service they then need to spend time maintaining it and as the years fade away these once modern applications drift into being legacy applications.

These legacy applications were, in most cases, built long before leveraging serverless, or cloud computing for that matter, was a possibility. After all, we have to live with the limitations of the time we are in and make what we can out of what we have. When everyone had the same limitations, it wasn't a debilitating factor, but it is now because there are other options.

Legacy applications pre-dating cloud computing or serverless can consist of hand-rolled internal services similar to what AWS offers, again because there weren't other options. However, building these internal services took, in some cases, months or years and millions of dollars to stand up all the infrastructure.

And the reality is that whether or not you spend a couple million dollars building an internal service or use a fully-managed pay-per-use service developed by a cloud provider; from the customer perspective they look and feel the same.

This means that there is a gap: a gap between existing companies with past success, tied to hand-rolled internal services and previously established status quos, and a world in which the only thing a company focuses on is the customer need.

It's not that radical of an idea, right?

Stop spending millions of dollars on hand-rolled internal services which have nothing to do with your customers' needs and instead re-invest that money into making your product better and making your customers happy.

That is what we mean by "adapt or die". Startups don't have past successes, and they don't have established status quos. They have fresh perspectives and are starting today with today's best-in-class offerings, which can look like a Formula 1 race car next to a rusty hand-built wagon.

It's an amazing world we live in today, where more people than ever before in history, even individuals, have the ability to string together services that took tens of thousands of engineering hours to build with a few clicks of a button, build a product, release it to the entire world (literally), and compete against the 800-lb enterprise gorillas by rubbing a couple of nickels together.

This is why serverless is worth time learning.

You skip the line. You move to the front. You learn the way to build applications today without the baggage of the past. Both startups and enterprise companies need help and everyone will build applications like this in the future. The revolution has already begun.

That is why years ago, serverless was the future. And that's why years later, serverless is still the future.

It's what we as a community have been moving towards since we had servers taking up entire rooms.

Do more with less. Do higher quality with less. Do more and pay less. When it's not running you don't pay anything.

The promise of serverless is that anyone has the ability to turn their ideas into reality.

You can be part of this future and if you start today you will have the advantage of time. As serverless continues to gain momentum and mature you will be on the forefront of that innovation.



What is AWS Control Tower?

If you’re planning a large-scale AWS deployment, you’re probably wondering how to orchestrate multiple applications and teams on AWS. How do you make sure that every team can access AWS without your accounts turning into sprawling, ungoverned chaos?

For many companies, a multi-account structure can help meet the unique needs of each application team or business group. AWS provides free native tools like AWS Organizations to help provide central orchestration of multiple accounts, so that you can enforce security and billing configurations while still giving each team some degree of autonomy over their account. Still, maintaining multiple AWS accounts can require a lot of annoying administrative setup and is prone to configuration drift.

Recently, AWS launched a series of new services to make that easier. AWS Control Tower is essentially an opinionated architecture that builds out a multi-account architecture with pre-configured security and access settings, plus a dashboard to manage that multi-account architecture over time.


Why Multi-Account?

  • Network isolation. Ensure that services of one account are not affected by the others. By separating applications or teams into completely separate accounts, there’s a better chance that an issue in one account won’t affect all accounts.

  • Separation of concerns & modularity. An architecture that is separated into distinct services allows you to make changes without affecting the rest of the company’s accounts. It often takes less time and coding to make a change to modular infrastructure than to monolithic infrastructure where features are mixed up together.

  • Scalability. Need to spin up or down a new application or SDLC tier? You can do so knowing that the additional account is connected to the Hub and central security requirements.

  • Compliance. Limit the scope of your audits (and cut audit expenses) by maintaining regulated data in a limited number of accounts and by putting non-regulated data into another account. Also, it is often a compliance requirement to separate development and production environments (e.g., SOC 1 and SOC 2). The multi-account model allows you to do this without duplicating security controls for each account.

What is AWS Control Tower?

AWS Control Tower is a solution that helps automate the process of setting up and configuring multiple accounts. (Formerly known as AWS Landing Zone.) Best practices for a multi-account architecture are embedded in the solution, making AWS Control Tower perfect for companies with complex workloads and larger teams that want to quickly migrate to AWS. Control Tower is deeply tied into AWS Organizations, a service that allows you to enroll any number of “child” accounts under a parent account and apply policies across all accounts from a single location. This extends similar functions originally used for Consolidated Billing and provides additional capabilities like AWS CloudFormation “stacksets”. Stacksets allow you to provision infrastructure across child accounts.
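To illustrate the idea (not Control Tower's own internals), provisioning a template across the child accounts of an organizational unit with a service-managed StackSet might look roughly like this in boto3. The StackSet name, OU ID, and template file are placeholders, and trusted access between CloudFormation and Organizations is assumed to be enabled:

```python
import boto3

cfn = boto3.client("cloudformation")

# Hypothetical baseline template applied to every account in an OU.
template_body = open("baseline.yaml").read()

cfn.create_stack_set(
    StackSetName="org-baseline",
    TemplateBody=template_body,
    PermissionModel="SERVICE_MANAGED",   # lets Organizations handle cross-account roles
    AutoDeployment={"Enabled": True, "RetainStacksOnAccountRemoval": False},
)

cfn.create_stack_instances(
    StackSetName="org-baseline",
    DeploymentTargets={"OrganizationalUnitIds": ["ou-xxxx-placeholder"]},
    Regions=["us-east-1"],
)
```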

To start, you might have one account that has the majority of workloads. From this foundation, you can launch individual accounts for applications, environments, business groups, or corporate entities, while keeping them separate from base infrastructure accounts.


Why separate central functions from application accounts?

Control Tower is built on the backbone of AWS Organizations, which allows you to automatically control access and permissions for child accounts. AWS Organizations allows you to define Service Control Policies (SCPs) to limit the services that are available to different accounts within the Organization. You can enforce policies on users of an account and define cross-account permissions to ensure your organization has the guardrails in place to maintain a secure environment. This is particularly useful for setting restrictions on powerful roles in child accounts. If the master account denies a privilege, a child account has no ability to override that restriction. Without the controls available inside an AWS Organizations structure, granting select administrative access is more difficult.
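As a rough sketch of what such a guardrail looks like, the following creates and attaches a hypothetical Service Control Policy from the master account. The policy content, names, and OU ID are examples, not a recommendation from this guide:

```python
import json
import boto3

org = boto3.client("organizations")

# Example guardrail: child accounts cannot tamper with CloudTrail.
scp = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Deny",
        "Action": ["cloudtrail:StopLogging", "cloudtrail:DeleteTrail"],
        "Resource": "*",
    }],
}

policy = org.create_policy(
    Name="deny-cloudtrail-changes",
    Description="Prevent child accounts from disabling audit logging",
    Type="SERVICE_CONTROL_POLICY",
    Content=json.dumps(scp),
)

org.attach_policy(
    PolicyId=policy["Policy"]["PolicySummary"]["Id"],
    TargetId="ou-xxxx-placeholder",     # OU or account ID to receive the guardrail
)
```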

This can be a core function of your security and cost management strategies. Even if a malicious actor accesses one account, there is no way for them to access other accounts, and they may have limited privileges within that account. This limits the blast radius of certain activities. Additionally, by having a cross-account destination for all of your logs, backups and other items you need to archive, you can more easily restrict access to those archives and ensure nothing gets deleted.

AWS Control Tower and AWS Organizations are most compelling for companies with many different IT roles who have different needs. It is also useful if you want to segregate compliance standards but still want default functionality across environments.

What does a default AWS Control Tower include?

  1. Core Organizational Unit with 3 accounts:

  • Master Account – Provides the ability to create and financially manage member accounts. Also used for Account Factory provisioning of accounts, managing Organizational Units, and guardrails

  • Log Archive Account – Contains central Amazon S3 bucket for storing logs of API activities and resource configurations from all accounts in the solution.

  • Audit Account – A restricted account that’s designed to give security and compliance teams read/write access to all accounts in the landing zone. From the audit account, you have programmatic access to review accounts, by means of a role that is granted to Lambda functions only. The audit account does not allow you to log in to other accounts manually.

  2. Within each account, an initial security baseline that includes:

  • AWS CloudTrail, sent to a centrally managed S3 bucket in the Logging Account

  • AWS Config, also sent to a centrally managed S3 bucket in the Logging Account

  • AWS Config Rules enabled for monitoring encryption, IAM password policies, MFA, and security group rules

  • AWS IAM roles, potentially including restrictions applied from the master account

  • An initial Amazon VPC network

  3. An Account Factory – essentially, an AWS Service Catalog product that allows you to automatically create new “child” accounts in the existing Organization that maintain all predefined security baselines

  4. The Control Tower Dashboard – a limited UI for the base Control Tower constructs. Only components deployed and managed by Control Tower are shown in the dashboard.

Control Tower can additionally work with functionality not yet exposed in the Control Tower dashboard interface, but available in the direct configuration of the foundational services. An example of this is repointing AWS SSO to another identity provider directory, including Azure Active Directory (AD) or AWS Managed Active Directory. This AWS SSO configuration works in a Control Tower environment, but is not yet displayed in the Control Tower dashboard itself. Control Tower can also be extended with customizations or “add-ons”.


Launching Control Tower in the Real World

Control Tower is a perfect toolset for any company that needs to segregate business units or SaaS tenants while maintaining central billing and security baselines. And companies that we’ve worked with have been pleased at the result: a secure, well-organized account structure that can expand with their company.

Here are just a few of the multi-account projects that we’ve worked on in the last 12 months:

  • A financial investment management firm used a landing zone to provide dedicated accounts for each portfolio manager or team, with automated baselines and security tooling by default.

  • A franchise-based service delivery organization used a landing zone to provide accounts for every business unit and application development lifecycle.

  • A billing management SaaS provider used a landing zone to deploy dedicated accounts per end tenant for clear infrastructure segmentation by customer. This also allows them to maintain different security and compliance requirements for each customer, if necessary.

Sample AWS Control Tower Architecture

The following are actual architecture diagrams from a project recently completed with a SaaS company. Each account had its own diagram, but for the purposes of this guide, we’ve provided the overall account structure and a look at network flow between various critical components.

A few notes:

  • Account Vending Machine has been renamed Account Factory, as explained above.

  • The customer was a single-tenant SaaS company, so every customer maintained a separate account.

  • This diagram is representative of the core account architecture plus a single customer account. Obviously, the SaaS company managed multiple customer accounts.

Network Flow Detail

This architecture diagram shows how information flows from the on-premises datacenter through the Network Accounts to the App Account.

  • Data flow from on-premises to AWS:

    • Data flows back and forth from their on-premises datacenter through an AWS Direct Connect (dedicated network connection) to the Network Account

    • The Direct Connect Gateway can transfer connections to multiple VPCs, like a Transit Gateway

    • On the App account, a Virtual Private Interface acts as the endpoint from the Direct Connect

  • End user access to the applications:

    • From the internet and Akamai (a CDN) to the load balancer, which distributes the traffic to the instances that are contained in private subnets. This represents all inbound traffic from external users to the eCommerce website assets (all static content)

    • Logs flow from every account (including the Network Account, Sandbox, Dev/Test, QA/STG, and Production accounts) to the Log Account using a Lambda function and IAM access

    • Cross-account IAM permissions allow access from services in the Shared Services account to the App Account

How to Deploy This Architecture

Since AWS Control Tower is a multi-account solution, it’s not possible to give you a CloudFormation template, as we will for other architectures in this Guide. Control Tower isn’t really an AWS service in its truest form. It has no API and you can’t create it with CloudFormation. It’s just a wrapper for other AWS services through the console.

To launch a Control Tower, navigate in the AWS console to https://console.aws.amazon.com/controltower. Once there, you can pick your desired home region, provide details about core OUs, review service permissions, and launch Control Tower.


Summary

In this guide, we discussed the basics of AWS Control Tower and outlined a few best practices. As an implementation example, we introduced the AWS Control Tower solutions that we used to help customers deploy real-life applications. A multi-account architecture is an ideal solution if you’re migrating a large, complex set of applications to AWS. AWS Control Tower is meant to help reduce the complexity of building and managing a multi-account structure long-term.


AWS Fargate vs OpenShift Container Platform

Introduction

Critical differences between AWS Fargate vs OpenShift Container Platform:

  1. AWS Fargate only works with AWS cloud services, while OpenShift has more collaboration options.

  2. OpenShift is an open-source solution, while AWS Fargate’s code is proprietary.

  3. AWS Fargate is only available in select regions, while OpenShift Container Platform can operate anywhere.

AWS Fargate and OpenShift Container Platform are serverless container managers that can help companies develop effective applications. If you don’t know the differences between them, though, you can’t make an informed choice. The following AWS Fargate vs OpenShift Container Platform comparison should help you get started.

What Is AWS Fargate?

AWS Fargate is a serverless manager for containers. As a serverless solution, developers and admins do not need to spend time choosing server types or setting access rules. (Serverless, of course, does not mean that AWS Fargate does not use servers. It simply means that users do not need to concern themselves with servers because AWS Fargate performs that service for them).

The serverless option makes it easier for users to allocate resources to containers instead of dedicating time to the underlying infrastructure. This helps ensure that applications run as expected on a variety of platforms.
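A minimal sketch of that "no servers to pick" experience: launching an already-registered task definition on Fargate with boto3. The cluster name, task definition, and subnet ID are placeholders:

```python
import boto3

ecs = boto3.client("ecs")

# With launchType FARGATE there is no EC2 instance to choose or manage;
# you only declare the networking for the task itself.
ecs.run_task(
    cluster="my-cluster",                      # placeholder
    launchType="FARGATE",
    taskDefinition="my-web-app:1",             # placeholder task definition
    count=1,
    networkConfiguration={
        "awsvpcConfiguration": {
            "subnets": ["subnet-0123456789abcdef0"],   # placeholder subnet
            "assignPublicIp": "ENABLED",
        }
    },
)
```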

What Is OpenShift Container Platform?

OpenShift Container Platform is also a serverless container management platform. Red Hat developed OpenShift to accelerate applications in a variety of environments, including hybrid cloud environments.

Red Hat OpenShift Container Platform can operate independently on a dedicated server, but the company also built versions that work in coordination with IBM Cloud, Microsoft Azure, and other cloud providers. This level of flexibility makes it appealing to many enterprise users who don’t want to abandon ongoing relationships with other companies.

AWS Fargate vs OpenShift Container Platform: Top Features

Although AWS Fargate and OpenShift Container Platform largely perform the same tasks, each has unique features that may make one more attractive than the other to certain users.

Some of the features potential users should know about AWS Fargate include:

  • Its ability to work quickly in the AWS cloud. (Fargate actually works well with the entire AWS ecosystem. Unfortunately, it cannot run on the private cloud. If you try to leave AWS, you will have problems with Fargate.)

  • The option to launch tens of thousands of containers within seconds, which many enterprise users will find useful at certain times.

  • Its ability to scale quickly when users need to launch more applications in a short time.

Red Hat’s developers give OpenShift Container Platform slightly different features that make it stand out as a strong option. In fact, some of its best features stand in opposition to those from AWS Fargate.

People who prefer OpenShift vs Fargate often point to features like:

  • The flexibility to work on numerous cloud platforms, including hybrid and private cloud servers. This feature gives users more control because they can continue using their current cloud services or reconsider their strategy to adopt a more robust option with a lower cost.

  • A user experience dashboard that makes it easy for new users to accomplish basic tasks after a relatively short learning curve. More advanced features, however, do take longer to master.

  • Multiple language support that lets companies develop applications for clients all over the world.

Overall, AWS Fargate vs OpenShift Container Platform could meet your organization’s needs. It just depends on the level of flexibility that you want and whether you prefer using tools outside of the AWS ecosystem.

AWS Fargate: Pros and Cons

Having a short list of AWS Fargate’s pros and cons can make it much easier for you to determine whether you want to learn more about the serverless container solution.

Advantages of choosing AWS Fargate:

  • It coordinates extremely well with other solutions in the AWS ecosystem.

  • Potentially lower costs when you pay close attention to how and when you use Fargate (this depends on several factors, including which solutions you compare it to).

  • AWS Fargate embeds security in its IT infrastructure, so users don’t need to spend as much money or time addressing security issues.

Disadvantages of choosing AWS Fargate:

  • It does not work on cloud providers other than AWS.

  • Although Amazon is working to expand Fargate’s reach, it isn’t available worldwide. Fargate is not available in major economic areas like Montreal, Hong Kong, Osaka, Paris, London, Cape Town, Seoul, Beijing, and even Northern California.

  • Fargate often has higher costs than similar solutions, even though it only charges for the services that you use.


OpenShift Container Platform: Pros and Cons

OpenShift Container Platform operates as one of AWS Fargate’s competitors, so the product needs to excel in areas where Fargate fails. At times, Red Hat’s solution has been successful. At other times, it hasn’t met that need as well as expected.

Some advantages of choosing OpenShift Container Platform include:

  • Excellent flexibility that lets developers decide which other tools and cloud providers meet their needs best while using OpenShift.

  • OpenShift doesn’t have a vendor lock-in, which makes it even easier for companies to explore their options. If you find a cloud service provider that works better than your current one, you can switch without much delay.

  • OpenShift is an open-source platform that makes it easier and faster for experienced developers to get the specific features that they need to take their products to the market as soon as possible.

OpenShift does have a few disadvantages that might affect some people. For example, OpenShift:

  • Has strict security policies that can make it difficult for teams to do things like running a container as root. Inexperienced developers will find this helpful. Those who want more control will likely feel frustrated.

  • The templates that come with OpenShift aren’t as flexible as those from similar solutions. While you may not notice a difference between Fargate and OpenShift Container Platform here, people experienced with options like Kubernetes Helm may wonder why the restrictions exist.

AWS Fargate vs OpenShift Container Platform: User Reviews

AWS Fargate users who write reviews on G2 give the platform 4.5 out of 5 stars. Some of their comments include:

  • “The primary complaint that you will see people have is lack of persistent storage with AWS Fargate.”

  • “We need to replace AWS Fargate for AWS Batch due to compatibility of Fargate to attach EFS to containers.”

  • “The best thing about Fargate is that you can just start out of the gate without setting up servers.”

OpenShift Container Platform earns 4.4 out of 5 stars from users who post reviews on G2. Some of their reviews include:

  • “Broad set of peripheral tools for Monitoring and CI/CD Pipelining.”

  • “I like the ease of use, flexibility, security and gul features like reviewing logs and accessing shell from the console.”

  • I don’t like “lack of documentation in certain aspects such as the relaxing of certain security enforcements.”

Alternatives to AWS Fargate and OpenShift Container Platform

If AWS Fargate vs OpenShift Container Platform leaves you wishing that you had a better option, now is a good time to consider IronWorker. IronWorker is an extremely flexible solution that lets you run applications on any public, private, or on-premises cloud.

Other benefits of IronWorker include:

  • Straightforward pricing.

  • Serverless computing that scales up or down quickly.

  • Excellent security features.

  • Full GPU support.

  • Custom-built solutions for your organization.


AWS Lambda vs. Amazon EC2: Which One Should You Choose?

Amazon is one of the leaders in providing diverse cloud services, boasting several dozen and counting. Amazon EC2 is one of the most popular Amazon services and the main part of the Amazon cloud computing platform, which was introduced in 2006. Amazon EC2 is widely used nowadays, but the popularity of another Amazon service called Lambda (introduced in 2014) is also growing. AWS Lambda vs EC2 – which one is better? Today’s blog post compares these two platforms to help you make the right choice for your environment. The main sections of the blog post are:

  • What is AWS EC2?

  • What Is AWS Lambda?

  • AWS Lambda vs EC2: Use Cases

  • AWS Lambda vs EC2: Working Principle

  • AWS Lambda vs EC2: Versions/Snapshots

  • AWS Lambda vs EC2: Security

  • AWS Lambda vs EC2: Performance and Availability

  • AWS Lambda vs EC2: Pricing Model

What Is AWS EC2?

AWS EC2 (Amazon Web Services Elastic Compute Cloud) is a service that lets you run virtual machines, called EC2 instances, in the cloud and provides scalability. You can change the amount of disk space, CPU performance, memory, etc. whenever you need. You can select a base image with the necessary pre-installed operating system (OS), such as Linux or Windows, and then configure most OS settings as well as install custom applications. You have root access to your Amazon EC2 instances and can create additional users. You can manage everything you need and fully control your EC2 instances, including rebooting and shutting them down. AWS EC2 falls into the Infrastructure as a Service (IaaS) category. AWS EC2 can be used for cloud hosting – you can deploy servers as virtual machines (instances) in the cloud.

What Is AWS Lambda?

AWS Lambda is a computing platform that allows you to run a piece of code written in one of the supported programming languages – Java, JavaScript, or Python – when a trigger linked to an event is fired. You don’t need to configure a virtual server and environment to run an application you have written. Just insert your program code (called a Lambda function in this case) in the AWS Lambda interface, associate the Lambda function with an event, and run the application in the cloud when needed, without taking care of server management and environment configuration. This way, you can focus on your application, not on server management – this is why AWS Lambda is referred to as serverless.

An event after which your application is executed can be uploading a file to the Amazon S3 bucket, making changes in DynamoDB tables, getting an HTTP request to the API Gateway service, etc. After configuring a function to run when an event occurs, your application will be executed automatically after each new event.

As for classification, Lambda is Amazon’s implementation of Function as a Service (FaaS). In the table below, you can see the level of management for each service type, starting from using physical servers, and compare them. The lowest levels (managed by the user) are marked in green and the upper levels (managed by the provider) are marked in blue. Thus, when using physical servers, you manage the hardware and all upper levels. When using Infrastructure as a Service (IaaS) such as AWS EC2, you manage the operating systems on the provided virtual machines (EC2 instances). At the Platform as a Service (PaaS) level, you can run your application, which must be compiled before running. When using Function as a Service (FaaS) such as AWS Lambda, you don’t need to compile your application – just insert your code in the interface provided by the MSP (managed service provider). SaaS (Software as a Service), which is mentioned in the table for comparison, only allows you to use ready-made applications (applications made by vendors) in the cloud via a thin client or a web browser.

AWS EC2 vs Lambda: Use Cases

AWS EC2 has a wide range of use cases since almost everything can be configured when using this service. The most common use cases of AWS EC2 are:

  • Hosting web sites

  • Developing and testing applications or complex environments

  • High performance computing

  • Disaster recovery

General use cases of AWS Lambda:

  • Automating tasks

  • Processing objects uploaded to Amazon S3

  • Real-time log analyzing

  • Real-time filtering and transforming data

Let’s consider a particular example. Imagine that your web site uses an Amazon S3 bucket to store web-site content including pictures, videos, audio files, etc. When a new image or video file is uploaded, you need to create a preview image for your web page that is used as a link to a full size image or video file. Creating preview images manually can be a boring and time-consuming task. In this case, you can create a Lambda function that can automatically resize the image based on the uploaded picture, rename that image, and store the target image in the appropriate directory. You can configure the Lambda function to be executed right after the event of uploading the original image file to the Amazon S3 bucket used by your web site.
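A sketch of such a function, assuming the Pillow imaging library is packaged with the deployment (it is not included in the Lambda runtime) and that previews are written back to the same bucket under a previews/ prefix; bucket names, sizes, and prefixes are illustrative:

```python
import os
import boto3
from PIL import Image   # Pillow must be bundled with the function or added as a layer

s3 = boto3.client("s3")

def handler(event, context):
    # Triggered by an S3 "object created" event on the website's bucket.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]

        local_path = "/tmp/" + os.path.basename(key)
        s3.download_file(bucket, key, local_path)

        # Create a preview image, keeping the aspect ratio.
        preview_path = "/tmp/preview-" + os.path.basename(key)
        with Image.open(local_path) as img:
            img.thumbnail((320, 240))
            img.save(preview_path)

        # Store the preview under a separate prefix (configure the trigger's
        # prefix filter so previews don't retrigger the function).
        s3.upload_file(preview_path, bucket, "previews/" + os.path.basename(key))
```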

AWS EC2 vs Lambda: Working Principle

EC2. As you may recall, when using AWS EC2, you operate with virtual machines (VMs) known as EC2 instances. You can add virtual hardware (virtual disks, network interfaces, processors, memory) to an EC2 instance, as well as start, stop, and reboot a VM instance. EC2 instances can work with two storage types – Elastic Block Storage (EBS) and S3 buckets. You can use a pre-configured image with an installed operating system and create your customized Amazon Machine Image (AMI). The EC2 cloud service provides automatic scaling and load balancing. EC2 instances can work in conjunction with most other Amazon web services, such as S3, ECS, Route53, Cloudwatch, etc.

Lambda. When using AWS Lambda, your application (Lambda function) runs in a container that is invisible to you. The container contains your code and libraries. Resources are provided by Amazon in accordance with the application’s needs, and scaling is automatic and seamless. You can control neither the container running your application nor the EC2 instance on which the container is running (you don’t know anything about them, because the underlying infrastructure is not exposed to AWS Lambda users). Refer to the table above.

AWS Lambda can be considered a framework on top of EC2 Container Service (ECS) that uses containers to run the piece of code representing your application. The life cycle of each container is short. A running Lambda function doesn’t save its state. If you want to save results, they should be kept in some data storage, for example, an Amazon S3 bucket. It is possible to configure a virtual network for a Lambda function, for example, for connecting to Amazon RDS (Amazon Relational Database Service). Lambda consists of multiple parts: layers, the function environment, and a handler. Triggers are Lambda activators: a Lambda is a single function that is executed in response to queries from triggers.

The complete list of available triggers:

  • API Gateway

  • AWS IoT

  • Alexa Skills Kit

  • Alexa Smart Home

  • Application Load Balancer

  • CloudFront

  • CloudWatch Events

  • CloudWatch Logs

  • CodeCommit

  • Cognito Sync Trigger

  • DynamoDB

  • Kinesis

  • S3

  • SNS

  • SQS

API Gateway is a special service that allows developers to connect diverse non-AWS applications to AWS applications and other resources.


AWS EC2 vs Lambda: Versions/Snapshots

EC2. A complex system of snapshots is available for EBS (Elastic Block Storage) volumes of AWS EC2 instances. You can create incremental snapshots and roll back to the needed state of an EC2 instance. Multi-volume snapshots can be used for critical workloads, for example, databases that use multiple EBS volumes.

Lambda. A convenient versioning system is supported for better management of Lambda functions. You can assign a version number to each uploaded copy of code and then add aliases that point to the appropriate code version. Version numbers start from 1 and increment upward. You can categorize Lambda functions into alpha, beta, and production, for example. An Amazon Resource Name (ARN) is assigned to each Lambda function version when it is published and cannot be changed later.
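A brief sketch of that flow with boto3, using placeholder names: publish an immutable version, then point an alias such as production at it:

```python
import boto3

lambda_client = boto3.client("lambda")

# Freeze the currently uploaded code/configuration as a new numbered version.
version = lambda_client.publish_version(
    FunctionName="my-function",              # placeholder
    Description="release candidate",
)["Version"]

# Aliases act as movable pointers (e.g. alpha, beta, production) to fixed versions.
lambda_client.create_alias(
    FunctionName="my-function",
    Name="production",
    FunctionVersion=version,
)
```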

AWS EC2 vs Lambda: Security

EC2. You should take care of your EC2 instances and all components inside the instances. You can manually configure a firewall for your EC2 instance – Amazon provides VPC (Virtual Private Cloud) Firewall to control traffic and ensure security for EC2 instances in the cloud. You can manually set up and configure antivirus software for your EC2 instances, create IAM roles, specify permissions, create security groups, etc. AWS Systems Manager Patch Manager allows you to install OS updates and security patches automatically. You can configure AWS to take a snapshot before installing a patch or update to prevent possible issues. Create key pairs to access EC2 instances if needed. You should pay more attention to security when using AWS EC2 compared to when using AWS Lambda.

Lambda. There are AWS services to which Lambda has access by default. An IAM role is used to define the services that must be available to a Lambda function. For each Lambda function, you configure the IAM role on whose behalf the function will run. After configuring the IAM role, you will be able to connect your Lambda function to the defined Amazon services without using keys or other authorization parameters.

It is possible to configure encryption between a Lambda function and S3 as well as between an API gateway and Lambda with a KMS key. When you create a Lambda function, a default encryption key is created. However, the recommendation is to create your own KMS key.

Compared to EC2 instances, Lambda functions don’t require security updates and patches. Underlying containers and operating systems are updated automatically by Amazon. This is the advantage of using Lambda functions in terms of security.

AWS EC2 vs Lambda: Performance and Availability

EC2. After powering on an EC2 instance, the instance runs until you manually stop it or schedule a shutdown task. When an EC2 instance is running, an application is executed near instantly on that instance. You can run as many applications as you want simultaneously if performance of your EC2 instance allows that. Running applications on EC2 instances is a good solution when applications must be run regularly during the entire day.

Lambda. A Lambda function is always available, but it is not running all the time. By default, the Lambda function is inactive. When a trigger linked to an event is activated, your application (Lambda function) is started. The maximum time a Lambda function can run (its timeout) is limited to 900 seconds (15 minutes). Executing long-running applications in AWS Lambda is accordingly not a good idea. If you need to run applications that require more than 900 seconds to complete successfully, or applications with a variable execution time, consider using AWS EC2. Another limit for a running Lambda function is the maximum amount of memory, which is 3,008 MB.

Between 1,000 and 3,000 Lambda instances can run simultaneously, depending on the region. Contact AWS support if you are interested in running more instances simultaneously.

The delay between sending a request and application execution is up to 100 milliseconds for AWS Lambda, unlike applications running on EC2 instances, which don’t have such a delay. 100 ms is not a long time, but for some types of applications it can be critical. If your application must download data from an Amazon S3 bucket, an additional 1 to 3 seconds may be needed before execution. Keep this delay in mind when planning to use AWS Lambda to run applications.

The cold start time is a drawback of Lambda functions. Latency occurs when a function hasn’t been executed for a long period of time and a container must be started before the function can run in the Amazon cloud. Using AWS Lambda to run applications may be a good solution when you have uneven workloads and applications must run at different times of day with long pauses between executions.

AWS Lambda vs EC2: Pricing Model

Both EC2 and Lambda cloud services use the pay-as-you-go principle. However, let’s consider details and differences.

EC2. You pay for the time when your AWS EC2 instance is running whether or not the function/application is executed. The price per hour depends on the CPU performance, amount of memory, video card performance, and storage capacity used by the EC2 instance. When you need your function/application to be always available due to a high number of regular requests, using AWS EC2 instances may be more rational financially.

Lambda. You pay for the number of application executions and the time needed to finish each execution. The price for each second of running an application depends on the amount of memory provisioned for the application and is $0.00001667 per gigabyte-second. The execution time is counted from the application’s start until it returns a result or stops after a timeout, rounded up to the nearest multiple of 100 ms. When you need on-demand availability, the price of using AWS Lambda to run functions/applications may be better.
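A rough illustration of that calculation in Python, using only the GB-second price quoted above; the request volume, duration, and memory size are invented, and the separate per-request charge is not included:

```python
# Hypothetical workload: 5 million executions/month, ~180 ms each, 512 MB of memory.
executions_per_month = 5_000_000
duration_ms = 180
memory_gb = 512 / 1024

# Duration is billed rounded up to the nearest 100 ms (per the article above).
billed_seconds = ((duration_ms + 99) // 100) * 100 / 1000

gb_seconds = executions_per_month * billed_seconds * memory_gb
compute_cost = gb_seconds * 0.00001667

print(f"{gb_seconds:,.0f} GB-seconds ≈ ${compute_cost:,.2f}/month for compute")
```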

Conclusion

Today’s blog post has compared AWS EC2 and AWS Lambda because AWS Lambda vs EC2 is a popular topic nowadays. AWS EC2 is a service that represents the traditional cloud infrastructure (IaaS) and allows you to run EC2 instances as VMs, configure environments, and run custom applications.

AWS Lambda is the implementation of Function as a Service by Amazon that allows you to run your application without having to worry about underlying infrastructure. AWS Lambda provides you a serverless architecture and allows you to run a piece of code in the cloud after an event trigger is activated. When using AWS Lambda, you have a scalable, small, inexpensive function with version control. You can focus on writing code, not on configuring infrastructure.

If you have calculated that there is a lot of idle time of your application on an EC2 instance that is always running, consider using AWS Lambda with which you don’t need to pay for idle time if there are no requests to run an application. If there is a high number of regular requests to run your application, it may be better to deploy an application on an EC2 instance that is always running.

Using AWS EC2 is good for running high-performance applications, long-running applications, and the applications that must not have a delay at the start time. If you use AWS EC2 instances, don’t forget to back them up to avoid losing your data.

Business Process Management Suite (BPMS)

The Business Process Management Suite (BPMS) is an automation tool that helps analyze, model, implement, and monitor business processes. It identifies vulnerabilities in everyday business practices that are costing the company time and money and helps control them. Through this, it increases the efficiency of the company’s employees.


Processes like account management, employee hiring, invoicing, inventory management, and compliance documentation (which involve a lot of complicated data management) can be automated using BPMS.

Where Can Organizations Use BPMS?

BPMS is applied to processes in an organization to produce a business outcome. The process must be repeatable or done on a regular basis, such as hiring an employee, shipping a package, paying salaries, or managing compliance certificates, licenses, accounts, invoicing, customer service, IT, and finances. The goal is to reduce errors and latencies caused by manual handling.

Some common uses for BPMS in day to day business life include enhancing purchase order processes, optimizing content marketing workflows, and managing healthcare outcomes:

Enhancing Purchase Order Processes

In the course of purchase orders being fulfilled, sometimes a number of necessary details can be lost along the way. This causes a great deal of confusion, wasted time, and a loss of productivity. There are a few major stages of a purchase order, and data can get lost at any point along this journey:

  • The creation of the purchase order and the approval process it goes through

  • The processing of the order

  • The delivery of the order

  • The payment procedures completing the process

Any organization dealing with bulk orders knows the importance of having a fool-proof system to keep the flow of orders moving. Businesses are at a massive risk when they do not do so. This is where BPMS comes into the picture to ensure that the entire process is seamless and does not meet any road blocks (or loss of data) along the way.

Content Marketing

Content marketing can seem quite straightforward—know the product and the client, develop messaging, create the content, and send it out. However, there is much more to it than that. The average content marketing process goes through a long cycle:

  • Writing out content to match a brief, often multiple options

  • Editing, which goes through a hierarchy and to different departments

  • Designing to ensure better branding

  • Publishing material and distributing it across various media

  • Monitoring content’s effect and collecting analytics and insights.

BPMS solutions ensure a smooth workflow from one segment to another. They also give everyone involved the ability to spot redundancies and inefficiencies and to fix them for better results.

Healthcare Management

Hospitalization can be quite traumatic for patients, and any disruption in the admission and discharge process only adds to this distress. The admission process alone has several stages, from collecting information and obtaining medical records to recording insurance details and room preferences. Generating bills involves several departments, from nursing and surgery to housekeeping and ancillary medical needs. BPMS ensures that details are not forgotten or missed during these stages: efficiency increases, the patient is cared for and experiences less stress, and no steps are skipped.

Types of BPMS

BPMS systems can be broadly classified into three types:

  • Integration-centric BPMS: This system handles existing processes that require minimal or no human interaction. It depends on the integration of computer and internet-based applications. For example, the data used by a sales team might be obtained by integration of data received from a marketing tool that stores it in a Customer Relationship Management (CRM) tool. Though the information is used by one team, it has been derived from an integration of multiple points.

  • Human-centric BPMS: This is a more direct approach where humans are the decision makers at each step. They are, however, guided by a visual interface to understand the decision-making process better. An example of this is the hiring of an employee. At each step—from posting a request for hiring to reviewing the request and processing by the HR department—the process is completely done by humans.

  • Document-centric BPMS: This process is driven entirely by a process document and requires multiple approvals at each point of the workflow. An example would be a budget that has to be approved at multiple levels.

How does BPMS work?

Efficient BPMS requires not just the improvement of processes but also their automation, which is handled by software. The software maps the entire process workflow and tests it in a virtual environment, modelling variables and outcomes, identifying bottlenecks, and eliminating them. The newly tested process is then deployed. The BPMS doesn’t stop there: from that point on, it continuously monitors the workflow for effectiveness and efficiency.

BPM Suite workflow is based on the business process management steps: analysis, design, modeling, execution, monitoring, and optimization.

Analysis

This is the process of studying the existing practices and analyzing them for latencies. Every aspect of the workflow is analyzed and metrics are put in place for comparison. This initial version of the process is called ‘as is.’

Design

This process involves correcting the flaws and latencies of the ‘as is’ processes by designing a more efficient workflow. The design aims to correct the workflow and the processes within that lead to bottlenecks and inefficiency. It targets all the alerts and escalations within the standard operation processes and corrects them with a more efficient process.

Modelling

The design is now represented in a flowchart, assigning accountability and removing redundancies at each step. It introduces conditional branches such as ‘if’ and ‘when’, with variables at each point that determine different outcomes, such as the steps to take when the target output isn’t met or when the outcome of the previous step is satisfactory.

Ideal business modelling tools should be easy to read, simple to communicate, inexpensive, up to date with industry standards, and have redundancies in place. The model should have a graphic interface and a workflow editor and simulator. This model of the process is called the ‘to be’ process.
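
To give a rough sense of what such a model captures, here is a small, purely illustrative sketch in Python of a hypothetical ‘to be’ purchase-order process: each step names an owner (accountability) and a routing rule (the ‘if/when’ logic) that determines the next step. The step names, owners, and threshold are invented for illustration:

    from dataclasses import dataclass
    from typing import Callable, Optional

    @dataclass
    class Step:
        name: str
        owner: str                              # who is accountable for this step
        route: Callable[[dict], Optional[str]]  # 'if/when' rule deciding the next step

    steps = {
        "submit_po": Step("submit_po", "requester", lambda case: "approve_po"),
        "approve_po": Step("approve_po", "manager",
                           # hypothetical rule: high-value orders need a second approval
                           lambda case: "cfo_approval" if case["amount"] > 10_000 else "process_po"),
        "cfo_approval": Step("cfo_approval", "cfo", lambda case: "process_po"),
        "process_po": Step("process_po", "procurement", lambda case: None),  # end of process
    }

    def run(case: dict, start: str = "submit_po") -> list:
        """Walk the model and return the path a given case would take."""
        path, current = [], start
        while current:
            path.append(current)
            current = steps[current].route(case)
        return path

    print(run({"amount": 2_500}))    # ['submit_po', 'approve_po', 'process_po']
    print(run({"amount": 25_000}))   # adds the extra 'cfo_approval' level

A real BPMS expresses this visually rather than in code, but the ingredients are the same: named steps, owners, variables, and conditional routing.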

Execution

After successfully modelling and simulating the workflow design, the next step is to execute the process. It is tested on a smaller group before deploying it to larger groups. Access restrictions are put in place to protect sensitive information. These processes are either automated or manual.

Monitoring

Here, the individual processes are tracked and statistics are derived. Performance at each step is analyzed to determine its effectiveness. It also helps identify bottlenecks and security vulnerabilities. Various levels of monitoring can be used, depending on the information the business wants. It can vary from real-time to ad hoc. Monitoring involves process mining, where event logs are analyzed and compared between the current process and the previous process. It exposes the discrepancies and bottlenecks between the two processes.

Optimization

This is the step where the data obtained from monitoring is analyzed and any changes that are required are implemented to make the workflow more efficient.

Features of an Ideal BPMS

The objective of BPMS is to automate as much of the business as possible and run it efficiently for long-term benefits. However, badly designed, unintuitive software can do more damage than good. An ideal BPMS must have the following features:

  • User friendly interface for the process design

  • An intuitive and simplified process diagram

  • Cloud-based storage for better stability and reliability

  • Dashboards and reports that are customizable and integrated

  • Automated real time alerts.

Challenges of BPMS

As with most moves to automation, the biggest challenge is the reliability of the software in producing the right solutions. An effective BPMS will be simple to use and will not need additional services to decode its analytics.

Challenges in Maintenance and Upgrade

Most traditionally packaged BPMS products are hard to maintain and upgrade. They require additional training by experts and in-house talent to operate and interpret, which makes them more expensive and less reliable than originally intended. They come as a one-time purchase, and license upgrades come at a steep cost. They also tend not to keep pace with market conditions and need specialized people operating them at all times, making them hard to rely on over the long term.

Cloud-based BPMS tools overcome these challenges because they follow a software-as-a-service (SaaS) model, which makes them easy to set up and deploy. Data can be accessed anytime from anywhere, with round-the-clock support. Since cloud-based tools come with a subscription instead of a one-time license cost, a company can try them on a smaller scale before expanding. This makes them more cost-efficient than traditionally packaged tools.

Challenges of Multiple Features

A BPMS with too many complex features is also something to avoid, as it forces the organization into additional training and cost just to use those aspects of the software. Too many features can also lead to integration issues: complex features make the software not just harder to understand but also harder to integrate with other platforms such as the MS Office suite, G Suite, or other management suites. Integration is critical for turning the generated information into analytics or basic decision-making, and it helps communicate analytics and other reports without additional intervention.

Cloud-based solutions allow users to streamline the number of features by allowing them to opt out of unneeded ones. These can be added or removed at any time, which is a more budget-friendly option.

Challenges in Acquiring a Well-Designed Suite

Another key challenge is getting a good, intuitive design. A simple interface makes it easier for all stakeholders to weigh in on decision-making without involving additional talent to decode the information. Too many features require many users to interpret them; similarly, too few features oversimplify the decision-making process and might not give accurate results.

Increased User-Friendliness

The idea of BPMS is to simplify business processes and make them more efficient. The process design should not only let every stakeholder set up their own tasks; it should also require no coding and provide a functional dashboard and a user-friendly interface.

On-premise options often lack proper post-sales customer support. They are sold as one-time packages with limited upgrade options, and the time spent training on such a package is wasted when the technology changes. Integration should be a key function of a BPMS: information and data in any organization move across systems and departments, so pre-integration with platforms such as MS Office, G Suite, and accounting and HR suites is ideal for interpreting and understanding the BPMS data.

While the idea of BPMS is automation of processes across departments, too many flowcharts and maps force the entire system to function like a heavily-programmed robot. The aim should be to allow the user to map the processes according to their point of view using a visual model. This allows the system to manage more complex processes rather than rigid yes or no rules that limit outcomes. The systems should follow a more human-centric workflow.

Automation is the key goal when implementing a BPMS. Even though BPMS is a system rather than a specific piece of software or hardware, the underlying technology plays a big part in making it efficient and effective. Software automates most parts of BPMS, such as modelling and monitoring, and eliminates the cost of employing and maintaining specialized skill sets for this purpose. Having software do the heavy lifting also makes it easier to stay mobile with the technology without much overhead. An ideal BPMS should make processes simple, intuitive, automated, and seamless for all stakeholders and users.


MIGRATING MONOLITH TO MICROSERVICES

This article covers how to smoothly refactor a monolith to microservices with minimal risk and maximum benefit for the project. Deploying a microservices architecture can be a good solution to the limitations of a large monolith. It is based on the idea of extracting large components into multiple self-sufficient, deployable functional entities grouped by purpose. Each of these components is responsible for its own specific functions and interacts with other components through an API.

According to Dzone’s research, 63% of enterprises, including such giants as Amazon, Netflix, Uber, and Spotify, have already implemented microservice architectures. This widespread microservices adoption is driven by many advantages such as improvements in resilience, high scalability, faster time to market, and smooth maintenance.

However, migration to a microservice-based ecosystem may turn out to be quite challenging and time-consuming, especially for organizations that operate large and complex systems with a monolithic architecture. The fact that microservices can peacefully coexist with a monolithic app provides useful flexibility in such cases.

First of all, it allows you to spread the microservices migration process (and the corresponding investment) over time, reducing the immediate cost and effort while staying fully operational throughout. Moreover, you don’t necessarily need to move the whole solution to microservices. It may be a good idea to go with a hybrid strategy, extracting only those parts that have become hard to handle inside the monolith and keeping the rest of the functionality unchanged.

GENERAL APPROACH TO MIGRATING FROM MONOLITH TO MICROSERVICES

As with any project related to architectural changes, the transition from monolith to microservices requires thoughtful and careful planning. Our core recommendation here would be to gradually cut off parts of the functionality from the monolith one after another, using an iterative approach and keeping those parts relatively small. This will result in reducing migration risks, more accurate forecasting, and better control over the whole project progress.

An excessive surge of optimism and the desire to gain all the promised benefits as soon as possible may prompt you to move towards a seemingly faster Big Bang rewrite, in which the functionality for all microservices is rewritten from scratch. However, we recommend thinking twice and going with the approach suggested above. It may take a bit longer, but it is definitely less risky and more cost-efficient in the long run.

Of course, there can be cases when parts of the functional entities inside the monolith are so tightly interrelated that the only way to run them as microservices is to rewrite them. However, such cases rarely cover the full functionality. So, we usually suggest extracting everything that can be isolated first and only then making the final decision on the remaining functionality. It can either be rewritten into separate microservices from scratch or left within the monolith, which will become far easier to maintain after most of the functionality has been detached.

With the suggested approach, a typical process would include the following steps:

Planning

  • Identifying microservice candidates, creating a backlog

Iterative implementation (this set of steps is repeated for each microservice from the backlog)

  • Choosing the refactoring strategy

  • Designing the microservice and the corresponding changes to the CI/CD & testing processes

  • Setting up CI/CD

  • Implementing microservice

Competent planning and design are key to a smooth migration, as they significantly increase the chances of success. At the same time, it is also important to pay reasonable attention to the implementation itself and make sure you have specialists available who will take care of the entire migration process, including integration of the extracted functionality with the monolith and thorough testing of each release.

HOW TO CHOOSE MICROSERVICE CANDIDATES

First of all, you need to create a backlog for microservices adoption. This requires identifying candidates (parts of the functionality that should be turned into separate microservices) and prioritizing them to form a queue for migration. Ideally, those candidates should be functional groups that can be isolated and separated from the monolith with relatively little effort and that solve some pressing problems your app already has.

Of course, the most obvious candidates are areas with performance or resource utilization issues, or domain areas that will unblock the separation of other microservices from the monolith (e.g. functionality that is widely used by other parts of the monolith).

However, if you are looking for a more inclusive approach to determining what can be decoupled from the monolith, here is a list of strategies you can apply during your migration to microservices:

DIVIDING THE APPLICATION INTO LAYERS

A standard application includes at least three types of elements:

Presentation layer – elements that handle HTTP requests and expose either a (REST) API or an HTML-based web interface. In an application with a complex user interface, the presentation layer often accounts for a significant amount of code.

Business logic layer – elements that are the nucleus of the application and apply business rules.

Data access layer – elements that access infrastructure parts such as databases and message brokers.

In this strategy, the presentation layer is the most typical candidate to be transformed into a microservice. Such decoupling of the monolith has two main advantages: it allows you to create, deploy, and scale front-end and back-end functionality independently of each other, and it allows presentation-layer developers to iterate quickly on the user interface and carry out microservices testing smoothly.
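
As a rough, purely illustrative sketch of this decoupling (in Python, with hypothetical module, function, and URL names), this is what the boundary change looks like from the presentation layer’s point of view once the back end behind it becomes a separate service:

    import json
    import urllib.request

    # Previously, the presentation layer called the business logic in-process,
    # e.g. billing.calculate_total(order_id) inside the monolith (hypothetical name).
    # After the split, the same lookup goes through the back end's API instead:

    BILLING_SERVICE_URL = "http://billing-service.internal/invoices"  # assumed internal endpoint

    def get_invoice_total(order_id):
        with urllib.request.urlopen(f"{BILLING_SERVICE_URL}/{order_id}") as resp:
            return json.load(resp)["total"]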

Nevertheless, it makes sense to split this layer into smaller parts if the application interface is big and complex.

The same applies to the business logic layer and the data access layer. Either of them is often too large to be implemented as a single microservice; the result would still be too big, too complex, and rather similar to the monolith you already have. So you would need to split them further into smaller functional groups using any of the following approaches.

DIVIDING THE APPLICATION INTO THE DOMAIN AREAS

In case some functionality inside the application layers can be grouped into certain domain areas (e.g. reporting or financial calculations), each of those domain areas can be turned into a microservice.

Since the functionality inside one domain area usually has more connections within the area than with other domain areas, this approach typically provides a good degree of isolation. This means you will be able to separate this functionality from the monolith with fewer difficulties.

DIVIDING THE APPLICATION INTO THE MODULES

If grouping functionality into domain areas does not look like a viable option, it makes sense to consider grouping functional entities into a microservice by the non-functional characteristics they have in common.

If a part of the functionality calls for a fully independent lifecycle (meaning its own path from code commit to production), it should be decoupled into a microservice. For example, if parts of the system grow at different rates, the best idea is to split these components into independent microservices, so that each component has its own life cycle.

Load or bandwidth can also be characteristics by which you distinguish a module. If parts of the system have different load or bandwidth profiles, they are likely to have different scaling needs. The solution is to split these components into independent microservices so they can scale up and down at different rates.

One more good example is functionality that should be isolated to protect your application from certain types of failures: for example, functionality that depends on an external service that is unlikely to meet your availability goals. You can turn it into a microservice to insulate this dependency from the rest of the system and embed a corresponding failover mechanism into the service.
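
A minimal sketch of this failover idea, assuming a hypothetical third-party endpoint and a hypothetical “last known good” fallback value, could look like this in Python: the call to the unreliable dependency is isolated behind a wrapper with a timeout and a fallback, so its outages do not cascade into the rest of the system.

    import json
    import urllib.error
    import urllib.request

    EXTERNAL_RATES_URL = "https://rates.example.com/latest"   # assumed external dependency
    FALLBACK_RATES = {"USD_EUR": 0.92}                        # assumed last known good value

    def get_exchange_rates(timeout_s=2.0):
        try:
            with urllib.request.urlopen(EXTERNAL_RATES_URL, timeout=timeout_s) as resp:
                return json.load(resp)
        except (urllib.error.URLError, TimeoutError):
            # Failover: degrade gracefully instead of failing the whole request.
            return FALLBACK_RATES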

Once you have identified the list of candidates to separate into independent microservices, you need to carefully analyze not only the candidates but also the interdependencies between them. Based on this analysis, you can create the backlog and define the order in which you will migrate the candidates. In some cases, migrating certain functionality into a microservice can greatly simplify and speed up the migration of subsequent components, and this should also be taken into account.

When developing a monolith-to-microservices migration strategy, it also makes sense to plan the availability of the resources required for the migration – either in-house or outside the existing development organization.

MICROSERVICE MIGRATION STEP-BY-STEP

Once you have figured out which components to extract and in which order, it is time to pick the first candidate in your queue and plan further actions: define the refactoring strategy, design the microservice, estimate the effort, and plan the iteration. It is also important to remember that the new approach will require certain changes to your CI/CD and testing processes, so it makes sense to agree on those changes with your team and adjust your processes accordingly.

STEP 1: CHOOSING THE REFACTORING STRATEGY

The next step is to select the right strategy for migrating the chosen functionality into a separate microservice. The two proven strategies we normally rely on in our projects are described below, together with the preconditions under which each one works best.

Strategy 1: Isolating functionality inside the monolith with subsequent separation into a microservice

This strategy implies the gradual removal of connections with other functionality while keeping the candidate inside the monolith until it’s ready for separation. Once all the connections are removed and the new API is designed, the candidate is separated from the monolith and launched as a separate microservice.

Preconditions:

  • Too many connections with other functionality

  • Active development & a lot of changes inside the monolith that affect the candidate

Strategy 2: Copying the functionality and turning the copy into a microservice

This strategy implies creating an independent copy of the functionality that is further developed into the microservice while the initial functionality still remains operational inside the monolith. Once the microservice and its integrations are fully implemented and tested, the initial functionality is deleted from the monolith.

Preconditions:

  • The need for more flexibility in the microservice development process (the microservice is implemented independently, can have its own lifecycle, and can be built & deployed faster)

  • A low probability of changes to the candidate while the microservice is being implemented (otherwise, you’ll need to carefully transfer the corresponding functional changes to the microservice)

Both strategies have proven their efficiency. However, they share one vulnerability: when migration is done in parallel with active development, new dependencies may emerge between the monolith and the candidate functionality while the new microservice is being implemented.

In this case, you would need to carefully track and resolve those dependencies through new API endpoints or through shared libraries containing the common parts of the code. And on each occasion, this takes more time and effort than it would have if the functionality were still a single whole within the monolithic application.

STEP 2: DESIGNING MICROSERVICES ARCHITECTURE AND CHANGES TO CI/CD & TESTING PROCESSES

First of all, you need to design the future microservice and define the changes that should be made to your CI/CD & testing processes. This important step forms the baseline for the successful migration of the microservice.

To design your microservice, you need to:

  • determine the exact code in your app that will be attributed to the microservice

  • design the API interfaces through which the microservice will interact with the monolith and other microservices (see the sketch after this list)

  • choose the technology stack that will be used to implement the microservice and its communication with other components (in certain cases, operational demands dictate the choice of a specific technology, but most commonly, your development team can opt for their favorite tech stack).
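
As an example of the second point, here is a minimal sketch of the kind of API contract an extracted microservice might expose to the monolith and to other services. The service name, route, and payload are hypothetical, and FastAPI is just one possible framework choice:

    from fastapi import FastAPI, HTTPException
    from pydantic import BaseModel

    app = FastAPI(title="reporting-service")   # hypothetical extracted microservice

    class ReportRequest(BaseModel):
        customer_id: str
        period: str          # e.g. "2024-01"

    class ReportResponse(BaseModel):
        customer_id: str
        period: str
        total_orders: int

    @app.post("/reports", response_model=ReportResponse)
    def create_report(req: ReportRequest) -> ReportResponse:
        # A real implementation would query the service's own data store.
        if not req.customer_id:
            raise HTTPException(status_code=400, detail="customer_id is required")
        return ReportResponse(customer_id=req.customer_id, period=req.period, total_orders=0)

Run locally (for example with uvicorn), the monolith and other services would call POST /reports over plain HTTP instead of invoking the code in-process.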

The new microservice will require a corresponding CI/CD (Continuous Integration/Continuous Delivery) process, so your existing process may require significant changes. If you did not use CI/CD before, this may be the right time to introduce it, as it is crucial for any considerable refactoring of existing code. CI/CD helps automate and accelerate all phases from coding to deployment and reduces error detection time.

You should also bear in mind that the transition to microservices requires changes to your testing strategy as well. Once a microservice is launched, testers can interact with each microservice directly. The test team now needs to look for issues at the level of microservice interactions and infrastructure (container/VM deployment and resource use) as well as at the overall application level.

Each of these levels requires its own set of testing tools and resources. Test automation can be of much help in this case. With the right microservices testing strategy in place, your team can seamlessly migrate to microservices at the optimal time and without hassle.

STEP 3: SETTING UP CI/CD

The process of changing CI/CD can start concurrently with microservice development, or even before implementation begins, so that you can successfully maintain the microservice during its development and after it is launched.

Continuous Integration will allow you to easily integrate your code into a shared repository. Continuous Delivery will allow you to take the code stored in the repository and continually deliver it to production.

There is an even more effective way: CI transforms code from the repository into a compiled package (ready to be deployed) and stores it in an artifact repository such as Artifactory, while CD deploys compiled packages from the artifact repository to the destination environment. In this case, the CI and CD processes are independent of each other, have no overlapping steps, and can be changed or improved independently.

STEP 4: MICROSERVICE IMPLEMENTATION & TESTING

The time has come to implement everything that has been planned. At this stage, the development team encapsulates the functionality into an independent microservice, cuts off the tightly-coupled connections with the parent application, and implements the API gateway that becomes a single point through which the microservice can communicate with the parent application and other microservices.

Thorough testing should be an integral part of microservice implementation, as you need to make sure the microservice works exactly as expected and that all possible errors or problems are tackled before the new environment is launched. Test coverage development can start in sync with microservice implementation (or even in advance, when it comes to selecting and planning tools and test types) so that the microservice’s behavior can be monitored as early as possible, which makes test results more accurate. Testing should be considered complete when the microservice operates flawlessly in the new environment.

The final result of the implementation step is a fully functional, independent microservice that operates in isolation from the monolithic application. With proper and detailed planning, the deployment of microservices is completed as forecasted and does not create any delays or hindrances in the overall development process.


AppSync for GraphQL

You may already be thinking that building and operating infrastructure for a GraphQL endpoint are tedious tasks, especially if it needs to be scalable for increasing or ever-changing workloads.

This is where AWS AppSync steps in by providing a fully managed service and a GraphQL interface to build your APIs. It’s feature-rich, allowing you to easily connect resolvers to Lambda, OpenSearch, and Aurora, or even attach DynamoDB tables directly via VTL templates.

Essentially, you’ll get a managed microservice architecture that connects to and serves data from multiple sources with just a few clicks or lines of infrastructure code. AWS will manage your infrastructure, and you can focus on the implementation of resolvers.

Monitoring your endpoints

Sadly, the “managed” in “managed service” doesn’t release you from all duties. There’s no need to monitor infrastructure, as all infrastructure components are AWS’s responsibility, but you still need to monitor your GraphQL endpoints to identify errors and performance bottlenecks.

AppSync integrates with CloudWatch natively and forwards a lot of metrics about your AppSync endpoints without the need for further configuration.

Metrics

The metrics reported by AppSync to CloudWatch are:

  • Number of 4xx errors

  • Number of 5xx errors

  • Request latency – how long it takes to fully process a request, including possible response template mappings

  • Sample count statistics – the number of API requests


If caching is enabled for your AppSync endpoint, the cache statistics include:

  • Hits

  • Misses

  • Current items

  • Evictions

  • Reclaimed
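
For reference, a minimal boto3 sketch of pulling one of these aggregated metrics (the 5xx error count for a single API over the last hour) might look like the following; the API id is a placeholder:

    from datetime import datetime, timedelta, timezone
    import boto3

    cloudwatch = boto3.client("cloudwatch")
    now = datetime.now(timezone.utc)

    resp = cloudwatch.get_metric_statistics(
        Namespace="AWS/AppSync",
        MetricName="5XXError",
        Dimensions=[{"Name": "GraphQLAPIId", "Value": "<your-api-id>"}],
        StartTime=now - timedelta(hours=1),
        EndTime=now,
        Period=300,              # 5-minute buckets
        Statistics=["Sum"],
    )

    for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
        print(point["Timestamp"], int(point["Sum"]))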

These metrics provide a good overview of how your API is doing, but no detailed view per resolver is possible because everything is aggregated. You can’t tell which resolver caused an HTTP 500 error, and there’s no insight into the timing of the resolvers involved in a request. A query may be resolved through many nested resolvers, so a single slow resolver (or a small subset of them) can drive up the latency of the whole query.

This is where logging comes in to help.

Logging

As previously mentioned, it’s important to know all details about AppSync’s processing steps for each request to gather insights about failing queries and other errors. Additionally, you require in-depth information about the performance of your GraphQL APIs and your resolvers.

This can be achieved by enabling logging for your APIs. AppSync allows you to configure this via multiple settings within the console, including:

  • Enable Logs – tells AppSync to start recording logs. It will generate execution and request summary logs that contain useful information we can use to monitor performance and errors.

  • Include verbose content – extends logs to include more details, including request and response headers, context, and evaluated mapping templates.

  • Field resolver log level – set this to Error or All to get further enhanced tracing logs that contain details about how long queries took to resolve per attribute.

Even when verbose content is not enabled for your logs, the request and execution summary logs already show a lot of useful information.

This includes how long the parsing and validation steps took, as well as how long it took to complete the whole request.

With this configuration, we can now monitor the error rate, the errors themselves, and the latency of the top-level resolvers, but nothing about nested resolvers. The field resolver log level setting is required to get all the information needed to debug the performance of nested resolvers. With it, we also get tracing messages per resolved attribute, which lets us analyze query performance in a very fine-grained manner, as we can see how long each resolver takes to collect the result of a field.

Additionally, we’ll get RequestMapping and ResponseMapping logs for each (nested) resolver. By also enabling verbose content, these logs will be enhanced with the GraphQL request and both request and response headers, as well as the context object itself. This means that if we’re investigating a DynamoDB resolver, we can see the mapping that was done by our VTL template and identify issues quickly.

💡 An important detail about setting the field resolver log level to All: the number of generated logs will not increase slightly but by an order of magnitude. For a frequently used AppSync endpoint, this will drastically increase costs due to log ingestion and storage at CloudWatch. A great mitigation strategy to avoid exploding costs is to set a proper log retention period and enable detailed field resolver logs only for small time windows.

The latter can, for example, be achieved with scheduled EventBridge rules and a Lambda function that regularly switches the resolver log-level configuration between “Error” and “All”. Depending on the schedule, you’ll end up with only a fraction of the logs, and therefore of the costs, without losing insight into your AppSync endpoint.

Switching between field resolver log-level settings via Lambda
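
A minimal sketch of such a Lambda handler (Python and boto3) might look like the following. It assumes the scheduled EventBridge rule passes the desired level as input, for example {"fieldLogLevel": "ALL"}, and that the API id is provided through an environment variable; both details are assumptions rather than a prescribed setup:

    import os
    import boto3

    appsync = boto3.client("appsync")
    API_ID = os.environ["APPSYNC_API_ID"]   # assumed environment variable

    def handler(event, context):
        level = event.get("fieldLogLevel", "ERROR")        # "ERROR" or "ALL"
        api = appsync.get_graphql_api(apiId=API_ID)["graphqlApi"]

        # Keep the existing log configuration (role ARN, verbosity) and only
        # change the field resolver log level.
        log_config = dict(api.get("logConfig", {}))
        log_config["fieldLogLevel"] = level

        appsync.update_graphql_api(
            apiId=API_ID,
            name=api["name"],
            authenticationType=api["authenticationType"],
            logConfig=log_config,
        )
        return {"apiId": API_ID, "fieldLogLevel": level}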

AWS X-Ray + AppSync

AppSync integrates with X-Ray, allowing you to get trace segment records and a corresponding timeline graph. This helps you visually identify bottlenecks and pinpoint the resolvers at fault.

Sadly, the subsegments only show rudimentary information about the request and response and don’t provide further details that would help debug problems. Everything is just collected under the AppSync endpoint; it can be filtered, but it’s not flexible enough to cover the requirements of GraphQL.

How to monitor AWS AppSync with Dashbird

As with many other services, you’ll face the classic downsides: a lack of visibility, a lot of noise, and a clunky integration with CloudWatch alarms and logs.

Dashbird is here to help you get around these limitations so that you can focus more on building a great product instead of fighting with the AWS console. We have just added support for AppSync to help you monitor all of your AppSync endpoints without needing to browse dozens of logs or stumble through traces in the X-Ray UI.

There’s no need to set up anything besides our CloudFormation template and the necessary permissions to ingest logs via CloudWatch and X-Ray. Dashbird will then automatically collect all needed information – without any performance impact on your application – and prepare it graphically.

At a single glance, you’ll see fundamental information about the performance and error rate of your AppSync endpoint. Furthermore, all requests will be displayed with their status code, request latency, and a flag indicating whether there were any resolver errors. By clicking on a request, you can drill down into it and look at the query and context.

This shows all the involved resolvers and how much they contributed to the overall response latency, which is crucial information for finding bottlenecks. There’s also a list of any resolver issues. Clicking on a resolver issue will take you to the error overview, giving you more details about when this error first occurred and how many errors of this type have already appeared.

There’s also out-of-the-box alerting via Slack, email, or SNS. Since Dashbird automatically clusters similar errors, noise will be reduced, and there’s no need to configure anything in advance to receive critical information in your inbox or channel without flooding it and causing cognitive overload.

Also, metrics provided by AWS are enhanced to enable better filtering. For example, CloudWatch’s requests metric will only give you the number of requests that all APIs have processed in a single region, but there’s no way to know how many requests have been made to a single API or endpoint. With Dashbird, you can always pinpoint the number of requests per endpoint for any given period of time.


Future outlook

There’s more to come at Dashbird, as we’re already building more features to help you run the best possible AppSync endpoints. This includes a set of well-architected insights to guide you with best practices.

Key takeaways

Monitoring is fundamental to any web application. Even though AppSync offers a high level of abstraction of the underlying infrastructure, keeping your endpoint and its resolvers healthy and maintaining low latency is still your job.

CloudWatch and X-Ray offer tools that enable you to get the logs required to achieve detailed observability for your application relying on AWS’s managed GraphQL solution. Dashbird takes this to the next level by offering you a structured overview of all your AppSync endpoints, which contain all the details you need to debug errors, resolve bottlenecks, and give your team more time for application development instead of stumbling through CloudWatch logs and X-Ray traces.