AWS Guidance Report

Note

The Cost and Security recommendations in the Site24x7 Guidance Report will be available only in ManageEngine CloudSpend as Recommendations Reports. If you use both Site24x7 and CloudSpend, you can continue to get these recommendations from CloudSpend > Reports > Recommendations Reports.

The Recommendations Report in CloudSpend helps you optimize cloud costs, and improve fault tolerance and performance of your cloud infrastructure across AWS, Azure, and GCP accounts. It provides tailored recommendations that can help you achieve significant savings and improve the overall efficiency of your cloud environment.

If you have not subscribed to CloudSpend and want to keep getting these recommendations, you can get started with CloudSpend now.

Site24x7's Guidance Report for Amazon Web Services examines the configuration and resource utilization of AWS services such as EC2, RDS, IAM, S3, and SES, and provides recommendations to improve the performance of your AWS account. Below, we've grouped the Availability best practice checks by AWS service.

Note

Instance Type recommendations work on top of the Guidance Report and recommend which Instance Type will be a better fit by assessing the metrics and configuration details. Instance Type recommendations help you identify a better instance category based on your instance usage.

Resource-level best practice checks

You can now view the best practice recommendations for each AWS resource in the Guidance Report tab. The AWS services for which a built-in dashboard is available are given here.

AWS Best practice checks

The Availability recommendation checks are grouped by the AWS service namespace.

Amazon Elastic Compute Cloud (EC2)

1. Underutilized EC2 instance (Priority: Moderate)

Baseline:

Checks the resource utilization of Amazon Elastic Compute Cloud (EC2) instances and labels them as underutilized if the CPU usage is less than 2% for the past 48 hours.

Recommendation:

For Amazon EC2, you are billed based on the instance type and the number of hours consumed. You can lower your costs by identifying and stopping underutilized instances. The report also shows the Current Instance Type and recommends a Suggested Instance Type that you can downgrade to for further savings.
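
As a rough illustration of the baseline above, the sketch below assumes the CPUUtilization datapoints for the past 48 hours have already been fetched (for example, through cloudwatch:GetMetricStatistics) as a list of percentages. The function name, the sample data, and the choice to require every datapoint to stay under the threshold are assumptions made for this example, not the exact logic used by the Guidance Report.

```python
from typing import Sequence

CPU_THRESHOLD_PERCENT = 2.0  # threshold taken from the baseline above


def is_underutilized(cpu_datapoints: Sequence[float],
                     threshold: float = CPU_THRESHOLD_PERCENT) -> bool:
    """Return True if every CPU datapoint in the window is below the threshold.

    cpu_datapoints is assumed to hold average CPUUtilization values (percent)
    covering the past 48 hours. An empty window is treated as "not enough
    data" and is never flagged.
    """
    if not cpu_datapoints:
        return False
    return all(value < threshold for value in cpu_datapoints)


# Hypothetical hourly averages for two instances over 48 hours.
idle_instance = [0.4, 1.1, 0.9] * 16
busy_instance = [0.4, 1.1, 35.0] * 16

print(is_underutilized(idle_instance))
print(is_underutilized(busy_instance))
```

A stricter or looser rule (for example, comparing the 48-hour average instead of each datapoint) would only change the body of the function.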

Required Permissions:

"ec2:DescribeInstances", "cloudwatch:GetMetricData", "cloudwatch:GetMetricStatistics", and "cloudwatch:ListMetrics"

2. EC2 Security Groups – Unrestricted access on specific ports (Priority: High)

Baseline:

Checks the security group of monitored EC2 instances for rules that allow unrestricted access on the following ports: 20, 21, 22, 1433, 1434, 3306, 3389, 4333, 5432, or 5500.

Description:

Unrestricted access can lead to DDoS attacks or malicious traffic reaching your application.

Recommendation:

Expose only the ports your application needs, typically TCP ports 80 and 443, to the internet, and restrict access on all other ports to minimize the opportunities for an attacker.
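
The baseline for this check can be sketched as follows. The example assumes a security group dictionary shaped like one entry of the SecurityGroups list returned by ec2:DescribeSecurityGroups. The function name and the sample group are hypothetical, and treating both 0.0.0.0/0 and ::/0 as unrestricted is an assumption of this sketch.

```python
from typing import Any, Dict, List

# Ports listed in the baseline above.
SENSITIVE_PORTS = {20, 21, 22, 1433, 1434, 3306, 3389, 4333, 5432, 5500}
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def unrestricted_ports(security_group: Dict[str, Any]) -> List[int]:
    """Return the sensitive ports this group opens to any address, sorted."""
    exposed = set()
    for rule in security_group.get("IpPermissions", []):
        cidrs = {r.get("CidrIp") for r in rule.get("IpRanges", [])}
        cidrs |= {r.get("CidrIpv6") for r in rule.get("Ipv6Ranges", [])}
        if not cidrs & OPEN_CIDRS:
            continue  # rule is limited to specific sources
        protocol = rule.get("IpProtocol")
        if protocol == "-1":
            exposed |= SENSITIVE_PORTS  # "-1" means all protocols and ports
            continue
        if protocol not in ("tcp", "udp"):
            continue  # e.g. ICMP, where FromPort/ToPort are not port numbers
        low, high = rule.get("FromPort"), rule.get("ToPort")
        if low is None or high is None:
            continue
        exposed |= {port for port in SENSITIVE_PORTS if low <= port <= high}
    return sorted(exposed)


# Hypothetical security group: SSH open to the world, MySQL limited to a VPC.
sample_group = {
    "GroupId": "sg-0123456789abcdef0",
    "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        {"IpProtocol": "tcp", "FromPort": 3306, "ToPort": 3306,
         "IpRanges": [{"CidrIp": "10.0.0.0/16"}]},
        {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    ],
}

print(unrestricted_ports(sample_group))
```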

Required Permissions:

"ec2:DescribeInstances" and "ec2:DescribeSecurityGroups"

3. Amazon EC2-VPC instance - Security group with too many rules assigned (Priority: Moderate)

Baseline:

Checks for Amazon EC2-VPC instances that have security groups with more than 50 rules (inbound and outbound).

Description:

When you launch an instance in a VPC, you can specify up to five security groups that get associated with the instance. For each security group, you can add rules that control inbound and outbound traffic. Instance performance can be affected if the security group has a large number of rules.

Recommendation:

Reduce the number of rules configured in the associated VPC security group.

Required Permissions:

"ec2:DescribeInstances" and "ec2:DescribeSecurityGroups"

4. EC2 instances not attached to an AutoScaling group (Priority: Info)

Baseline:

Checks for Elastic Compute Cloud (EC2) instances not associated with any AutoScaling Groups.

Description:

AutoScaling helps you scale your compute resources up and down based on demand. By creating groups of EC2 instances called AutoScaling groups, you can specify the desired capacity or assign policies to ensure that an optimum number of EC2 instances are available to handle incoming application requests.

Recommendation:

Organize your EC2 instances as AutoScaling groups.

Required Permissions:

"ec2:DescribeInstances" and "autoscaling:DescribeAutoScalingGroups"

5. EC2 instances not launched within a VPC (Priority: Moderate)

Baseline:

Checks for Elastic Compute Cloud (EC2) instances launched in the EC2 Classic platform.

Description:

The Amazon EC2 network is categorized into two platforms - EC2 Classic and EC2 VPC. When you launch an instance within the Classic platform, your instance gets launched in a network that is shared by other AWS tenants. Whereas when you launch instances in a VPC, your resources are logically isolated from the other networks.

Recommendation:

Migrate your instances to a VPC.

Required Permissions:

"ec2:DescribeInstances"

6. Scheduled maintenance for EC2 instance (Priority: Moderate)

Baseline:

Checks Elastic Compute Cloud (EC2) instances for scheduled maintenance events.

Description:

From time to time, AWS schedules a system maintenance event for your instance to perform routine maintenance tasks on the underlying physical host.

Recommendation:

Associate the monitored EC2 instance to Site24x7's smart maintenance window to suppress alerts and continue monitoring during downtime.

Required Permissions:

"ec2:DescribeInstances" and "ec2:DescribeInstanceStatus"

7. Amazon EC2 - Para-virtual virtualization type instance (Priority: Moderate)

Baseline:

Checks whether there are Amazon EC2 instances of para-virtual virtualization type present.

Description:

Linux Amazon Machine Images (AMIs) use one of the two types of virtualization: para-virtual (PV) or hardware virtual machine (HVM). The main differences between PV and HVM AMIs are the way in which they boot and whether they can take advantage of special hardware extensions (CPU, network, and storage) for better performance. Instances launched based on HVM type virtualization offer better performance compared to PV-based instances.

Recommendation:

Consider changing the instance to one based on the HVM virtualization type.

Required permission:

"ec2:DescribeInstances"

Amazon Elastic Block Store (EBS)

1. Unattached EBS volumes (Priority: Moderate)

Baseline:

Checks Amazon Elastic Block Store (EBS) volume configuration for associated instance ID.

Description:

Elastic Block Store volumes can persist independently after instance termination, and after you explicitly unmount and detach the volume from the instance. Unattached volumes are still charged for the provisioned storage and IOPS.

Recommendation:

Associate the configured EBS volume with an active instance or release the storage volume.
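
A minimal sketch of this check, assuming a list of volume dictionaries shaped like the Volumes list returned by ec2:DescribeVolumes. The function name and the sample volume IDs are hypothetical, and treating an empty Attachments list as "unattached" is an assumption of this example.

```python
from typing import Any, Dict, List


def unattached_volume_ids(volumes: List[Dict[str, Any]]) -> List[str]:
    """Return the IDs of volumes that are not attached to any instance."""
    return [
        volume["VolumeId"]
        for volume in volumes
        if not volume.get("Attachments")  # empty or missing attachment list
    ]


# Hypothetical volumes: one attached to an instance, one left behind.
sample_volumes = [
    {"VolumeId": "vol-0aaa", "State": "in-use",
     "Attachments": [{"InstanceId": "i-0123", "State": "attached"}]},
    {"VolumeId": "vol-0bbb", "State": "available", "Attachments": []},
]

print(unattached_volume_ids(sample_volumes))
```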

Required Permissions:

"ec2:DescribeVolumes"

AWS Elastic Load Balancing

1. Idle Elastic Load Balancer (Priority: High)

Baseline:

Checks the usage stats of monitored Classic Load Balancers and deems them idle if the number of requests received/routed or the number of TCP connections established with the target instance is less than 100 in the past 48 hours.

Description:

Amazon Web Services bills you for each partial or full hour your load balancer runs. If your load balancer is routing fewer than 100 requests, it is not being used adequately.

Recommendation:

Consider terminating and running your application without a load balancer.

Required Permissions:

"elasticloadbalancing:DescribeLoadBalancers", "cloudwatch:GetMetricData", "cloudwatch:GetMetricStatistics", and "cloudwatch:ListMetrics"

2. ELB not utilizing multiple Availability Zones (Priority: High)

Baseline:

Checks for load balancers operating in a single Availability Zone.

Description:

If you are launching EC2 instances in a single Availability Zone, then any failure occurring in that data center could render all of your instances unavailable. By deploying multiple EC2 instances to different Availability Zones in the same Region, you remove a single point of failure.

Recommendation:

For increased resilience and fault tolerance, ensure that the EC2 instances registered to your load balancer are spread across different Availability Zones.

Required Permissions:

"elasticloadbalancing:DescribeLoadBalancers"

3. Elastic Load Balancer - Access Logs (Priority: Moderate)

Baseline:

Checks the configuration of your load balancers to see if Access Logs is enabled for your ELB.

Description:

Access logs capture and store in-depth information for each request received by the load balancer. Information such as IP address, latencies, request path, and back-end server response gets stored in an Amazon S3 bucket that you specify. AWS account holders can use this to analyze traffic patterns and troubleshoot advanced ELB issues.

Recommendation:

This is an optional feature that is disabled by default. Enable Access logs for your Elastic Load Balancer.

Required Permissions:

"elasticloadbalancing:DescribeLoadBalancers"

4. Elastic Load Balancer - Listener security (Priority: High)

Baseline:

Checks the configuration of Elastic load balancers (Classic and Application type) and warns you when there are no listeners that use the secure protocol (HTTPS or SSL).

Description:

A listener is a process that checks for connection requests. When your Elastic Load Balancer has no configured HTTPS listeners, unauthorized parties can read the data sent across the network between the client and your load balancer.

Recommendation:

Enable SSL/TLS support for the HTTP listener by deploying an SSL certificate on your load balancer.
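
The listener check described above can be sketched for a Classic Load Balancer as follows. The example assumes a dictionary shaped like one entry of the LoadBalancerDescriptions list returned by elasticloadbalancing:DescribeLoadBalancers. The function name and the sample load balancers are hypothetical, and Application Load Balancers (whose listeners are described separately) are not covered by this sketch.

```python
from typing import Any, Dict

SECURE_PROTOCOLS = {"HTTPS", "SSL"}  # secure protocols named in the baseline


def has_secure_listener(load_balancer: Dict[str, Any]) -> bool:
    """Return True if at least one listener uses HTTPS or SSL."""
    for item in load_balancer.get("ListenerDescriptions", []):
        protocol = item.get("Listener", {}).get("Protocol", "")
        if protocol.upper() in SECURE_PROTOCOLS:
            return True
    return False


# Hypothetical load balancers: one HTTP-only, one with an HTTPS listener.
plain_lb = {
    "LoadBalancerName": "web-plain",
    "ListenerDescriptions": [
        {"Listener": {"Protocol": "HTTP", "LoadBalancerPort": 80}},
    ],
}
secure_lb = {
    "LoadBalancerName": "web-secure",
    "ListenerDescriptions": [
        {"Listener": {"Protocol": "HTTP", "LoadBalancerPort": 80}},
        {"Listener": {"Protocol": "HTTPS", "LoadBalancerPort": 443}},
    ],
}

print(has_secure_listener(plain_lb))
print(has_secure_listener(secure_lb))
```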

5. Health checks for Auto Scaling groups associated with a load balancer (Priority: Moderate)

Baseline:

Checks if Auto Scaling groups (ASGs) associated with a load balancer have a health check configuration for the load balancer.

Description:

The default health checks for an Auto Scaling group are EC2 status checks only, and the ASG does not consider an instance unhealthy if it fails the health checks provided by a load balancer. You can determine an instance's health from the load balancer by configuring the ASG to use load balancer health checks.

Recommendation:

Add a load balancer health check to an ASG so instance health is determined by both the EC2 status checks and the load balancer health checks.
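
As an illustration, the sketch below flags Auto Scaling groups that are attached to a load balancer but do not use its health checks. It assumes group dictionaries shaped like the AutoScalingGroups list returned by autoscaling:DescribeAutoScalingGroups. The function name and sample groups are hypothetical.

```python
from typing import Any, Dict, List


def asgs_missing_elb_health_check(groups: List[Dict[str, Any]]) -> List[str]:
    """Return names of ASGs with a load balancer but no ELB health check."""
    flagged = []
    for group in groups:
        attached = bool(group.get("LoadBalancerNames")
                        or group.get("TargetGroupARNs"))
        # HealthCheckType may hold one type or a comma-separated list.
        types = {t.strip() for t in group.get("HealthCheckType", "").split(",")}
        if attached and "ELB" not in types:
            flagged.append(group["AutoScalingGroupName"])
    return flagged


# Hypothetical groups for the example.
sample_groups = [
    {"AutoScalingGroupName": "web-asg", "HealthCheckType": "EC2",
     "LoadBalancerNames": ["web-lb"], "TargetGroupARNs": []},
    {"AutoScalingGroupName": "api-asg", "HealthCheckType": "ELB",
     "LoadBalancerNames": ["api-lb"], "TargetGroupARNs": []},
    {"AutoScalingGroupName": "batch-asg", "HealthCheckType": "EC2",
     "LoadBalancerNames": [], "TargetGroupARNs": []},
]

print(asgs_missing_elb_health_check(sample_groups))
```

Groups without any load balancer, such as batch-asg in the sample data, are skipped because the check only applies to groups associated with a load balancer.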

AWS Identity and Access Management (IAM)

1. AWS root account user - Access keys (Priority: High)

Baseline:

Checks the AWS account root user for active access keys.

Description:

Access keys are used to make secure REST API or HTTP query requests to AWS service APIs. Anyone with AWS root account user access keys can gain unrestricted access to all the resources, including billing data using these keys.

Recommendation:

Delete the access keys for the root user or make them inactive.

2. Ensure the IAM password policy requires at least one uppercase letter, lowercase letter, symbol, and number (Priority: Moderate)

Baseline:

Checks if the IAM password policy requires at least one uppercase letter, lowercase letter, symbol, and number.

Description:

If an administrator doesn't set a custom password policy, IAM user passwords must meet the default AWS password policy by containing at least one uppercase letter, lowercase letter, symbol, and number.

Recommendation:

Maintain the security of your AWS accounts by implementing a strong AWS IAM password and changing it frequently.
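
A minimal sketch of this baseline, assuming a dictionary shaped like the PasswordPolicy object returned by the IAM GetAccountPasswordPolicy API. The function name and the sample policy are hypothetical.

```python
from typing import Any, Dict, List

# Policy flags that correspond to the four requirements in the baseline.
REQUIRED_FLAGS = (
    "RequireUppercaseCharacters",
    "RequireLowercaseCharacters",
    "RequireSymbols",
    "RequireNumbers",
)


def missing_requirements(policy: Dict[str, Any]) -> List[str]:
    """Return the required flags that the password policy does not enable."""
    return [flag for flag in REQUIRED_FLAGS if not policy.get(flag, False)]


# Hypothetical policy that does not require symbols.
sample_policy = {
    "MinimumPasswordLength": 12,
    "RequireUppercaseCharacters": True,
    "RequireLowercaseCharacters": True,
    "RequireSymbols": False,
    "RequireNumbers": True,
}

print(missing_requirements(sample_policy))
```

An empty result means the policy satisfies all four requirements.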

Amazon Simple Storage Service (S3)

1. Amazon S3 - S3 bucket should have cross-region replication enabled (Priority: Moderate)

Baseline:

Checks if S3 buckets have cross-region replication enabled or not.

Description:

With cross-region replication enabled, you can automatically replicate data or copy objects across S3 buckets in different AWS regions. Whether for a disaster recovery plan or performance optimization, data replication will improve the availability and reliability of your application.

Recommendation:

Enable cross-region replication for all S3 buckets.

Amazon Relational Database Service (RDS)

1. Amazon RDS - Event subscription for OS updates (Priority: Moderate)

Baseline:

Checks if the RDS event subscriptions for the instance source type include the Security Patching event category.

Description:

RDS DB instances occasionally require operating system updates to improve database performance and overall security posture. Using event subscriptions, you can set up email or text alerts to get notified as soon as an update becomes available. You can use these alerts to plan for updates that may be required to help you meet your compliance obligations.

Recommendation:

Consider subscribing to the security patching event category for instance source type.

Required Permissions:

"rds:DescribeDBInstances"
"rds:DescribeEventSubscriptions"

Amazon Simple Notification Service (SNS)

1. Amazon SNS - Delivery status logging (Priority: Moderate)

Baseline:

Checks if monitored SNS topics have delivery status logging enabled for notification messages sent to the SNS topic.

Description:

Logging is an important part of maintaining the reliability, availability, and performance of services. Logging notification message delivery status helps provide operational insights such as:

Knowing whether a message was delivered to the Amazon SNS endpoint.
Identifying the response sent from the Amazon SNS endpoint to Amazon SNS.
Determining the message dwell time (the time between the publish timestamp and the hand off to an Amazon SNS endpoint).

Recommendation:

Consider enabling delivery status logging for the SNS topic.

Required Permission:

"sns:GetTopicAttributes"

Amazon DynamoDB

1. Amazon DynamoDB - Point-in-time recovery (Priority: Moderate)

Baseline:

Checks whether Amazon DynamoDB tables have point-in-time recovery (PITR) enabled.

Description:

Backups help you to recover more quickly from a security incident. They also strengthen the resilience of your systems. DynamoDB PITR automates backups for DynamoDB tables. It reduces the time to recover from accidental delete or write operations. DynamoDB tables that have PITR enabled can be restored to any point in time in the last 35 days.

Recommendation:

Consider enabling PITR for DynamoDB tables.
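
The PITR check can be sketched as follows, assuming a response dictionary shaped like the output of dynamodb:DescribeContinuousBackups. The function name and the sample responses are hypothetical.

```python
from typing import Any, Dict


def pitr_enabled(response: Dict[str, Any]) -> bool:
    """Return True if the response reports point-in-time recovery as enabled."""
    description = response.get("ContinuousBackupsDescription", {})
    pitr = description.get("PointInTimeRecoveryDescription", {})
    return pitr.get("PointInTimeRecoveryStatus") == "ENABLED"


# Hypothetical responses for two tables.
protected_table = {
    "ContinuousBackupsDescription": {
        "ContinuousBackupsStatus": "ENABLED",
        "PointInTimeRecoveryDescription": {
            "PointInTimeRecoveryStatus": "ENABLED",
        },
    }
}
unprotected_table = {
    "ContinuousBackupsDescription": {
        "ContinuousBackupsStatus": "ENABLED",
        "PointInTimeRecoveryDescription": {
            "PointInTimeRecoveryStatus": "DISABLED",
        },
    }
}

print(pitr_enabled(protected_table))
print(pitr_enabled(unprotected_table))
```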

Required permission:

"dynamodb:DescribeContinuousBackups"

Amazon Redshift

1. Amazon Redshift - Clusters automatic snapshot (Priority: Moderate)

Baseline:

Checks if the automatic snapshot is enabled for Amazon Redshift clusters.

Description:

Backups help you to recover more quickly from a security incident. They strengthen the resilience of your systems. Amazon Redshift takes periodic snapshots by default.

Recommendation:

Consider updating the snapshot retention period for the cluster to be at least 7 days.
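
As a sketch of this check, the example below assumes cluster dictionaries shaped like the Clusters list returned by redshift:DescribeClusters, where an AutomatedSnapshotRetentionPeriod of 0 means automated snapshots are disabled. The function name, the finding messages, and the sample clusters are hypothetical.

```python
from typing import Any, Dict, List

MIN_RETENTION_DAYS = 7  # minimum suggested in the recommendation above


def snapshot_findings(clusters: List[Dict[str, Any]]) -> Dict[str, str]:
    """Map each non-compliant cluster identifier to a short finding."""
    findings = {}
    for cluster in clusters:
        days = cluster.get("AutomatedSnapshotRetentionPeriod", 0)
        if days == 0:
            findings[cluster["ClusterIdentifier"]] = "automated snapshots disabled"
        elif days < MIN_RETENTION_DAYS:
            findings[cluster["ClusterIdentifier"]] = f"retention is {days} day(s)"
    return findings


# Hypothetical clusters for the example.
sample_clusters = [
    {"ClusterIdentifier": "analytics-a", "AutomatedSnapshotRetentionPeriod": 0},
    {"ClusterIdentifier": "analytics-b", "AutomatedSnapshotRetentionPeriod": 1},
    {"ClusterIdentifier": "analytics-c", "AutomatedSnapshotRetentionPeriod": 7},
]

print(snapshot_findings(sample_clusters))
```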

Required Permission:

"redshift:DescribeClusters"

Amazon CloudFront

1. Amazon CloudFront - Origin failover (Priority: Moderate)

Baseline:

Checks if monitored CloudFront distributions have origin failover configured.

Description:

CloudFront distributions with origin failover configured provide high availability. A distribution can be configured with a primary and a secondary origin, so that when a request to the primary origin meets the configured failover criteria, the secondary origin is used to serve the content.

Recommendation:

Configure failover for distributions by satisfying all the conditions below:

CloudFront distribution should have at least two origins.
CloudFront distribution should have an origin group with primary and secondary origins configured.
The CloudFront distribution origin group should have failover criteria configured.
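
The three conditions above can be sketched as a single check. The example assumes a dictionary shaped like the DistributionConfig object returned by cloudfront:GetDistributionConfig. The function name and the sample configuration are hypothetical.

```python
from typing import Any, Dict


def has_origin_failover(distribution_config: Dict[str, Any]) -> bool:
    """Return True if the distribution meets all three failover conditions."""
    # Condition 1: at least two origins.
    if distribution_config.get("Origins", {}).get("Quantity", 0) < 2:
        return False
    groups = distribution_config.get("OriginGroups", {}).get("Items") or []
    for group in groups:
        # Condition 2: an origin group with a primary and a secondary member.
        members = group.get("Members", {}).get("Quantity", 0)
        # Condition 3: failover criteria (status codes) are configured.
        codes = (group.get("FailoverCriteria", {})
                 .get("StatusCodes", {})
                 .get("Quantity", 0))
        if members >= 2 and codes > 0:
            return True
    return False


# Hypothetical configuration with one origin group and two origins.
sample_config = {
    "Origins": {"Quantity": 2, "Items": [{"Id": "primary"}, {"Id": "backup"}]},
    "OriginGroups": {
        "Quantity": 1,
        "Items": [{
            "Id": "failover-group",
            "FailoverCriteria": {
                "StatusCodes": {"Quantity": 2, "Items": [500, 503]},
            },
            "Members": {
                "Quantity": 2,
                "Items": [{"OriginId": "primary"}, {"OriginId": "backup"}],
            },
        }],
    },
}

print(has_origin_failover(sample_config))
```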

Required permission:

"cloudfront:GetDistributionConfig"

Amazon API Gateway

1. Amazon API Gateway - X-Ray tracing (Priority: Moderate)

Baseline:

Checks if monitored API Gateway resources have X-Ray tracing enabled.

Description:

AWS X-Ray can be used to trace and analyze user requests as they travel through your Amazon API Gateway REST APIs to the underlying services. X-Ray gives you an end-to-end view of an entire request, so that you can analyze latencies in your APIs and their backend services. X-Ray tracing enables a more rapid response to performance changes in the underlying infrastructure.

Recommendation:

Consider enabling X-Ray tracing for API Gateway.

Required Permissions:

"apigateway:RestApis"
"apigateway:GetStages"

2. Amazon API Gateway - Encrypt cache data (Priority: Moderate)

Baseline:

Checks if monitored API Gateway resources with an API cache have cache encryption enabled.

Description:

Encrypting data at rest reduces the risk of data stored on the disk being accessed by a user not authenticated by AWS. It adds another set of access controls to limit an unauthorized user's ability to access the data. For example, API permissions are required to decrypt the data before it can be read. API Gateway REST API caches should be encrypted at rest for an added layer of security.

Recommendation:

Consider enabling cache encryption for API Gateway REST API cache.

Required Permissions:

"apigateway:RestApis"
"apigateway:GetStages"

Amazon Elastic File System (EFS)

1. Amazon EFS - File system backup (Priority: Low)

Baseline:

Checks if automatic backup is enabled for monitored Elastic File System (EFS) volumes.

Description:

Including the EFS file systems in the backup plans prevents deletion and loss of data.

Recommendation:

Consider enabling automatic backup for EFS file system.

Required Permissions:

"elasticfilesystem:DescribeFileSystems"
"elasticfilesystem:DescribeBackupPolicy"

Amazon Route 53

1. Amazon Route 53 - Auto Renew

Baseline:

Checks whether the Auto Renew feature is enabled for your registered domains so that they are renewed automatically.

Description:

Enabling the Auto Renew feature renews your domain before the late-renewal period begins and prevents the domain from becoming available for others to register. Even if you can restore the domain after it expires, restoring a domain costs more than renewing it.

Recommendation:

Enable the Auto Renew option to prevent your domain from expiring.

2. Amazon Route 53 - Domain Expired

Baseline:

Checks whether any registered domain has expired.

Description:

When your domain expires, it is no longer shown in the console. If you don't renew a domain before the renewal period ends, it expires. Some registries for top-level domains (TLDs) allow you to restore the domain before it becomes available for others to register. The price for restoring a domain is always higher than the price for renewal or a new registration, so check the price of restoring the expired domain before you restore it.

Recommendation:

Restoring the domain gives you back full access to your expired domain. Restore the domain before it becomes available for others to register.

Amazon MQ

1. Amazon MQ - Log Exports

Baseline:

Checks whether the Log Exports feature is enabled to publish your broker log events to AWS CloudWatch Logs.

Description:

When the Log Exports feature is enabled, Amazon MQ publishes general and audit logs to AWS CloudWatch Logs, allowing you to maintain continuous visibility into your broker's activity, and meet compliance requirements when it comes to auditing.

Recommendation:

Enable the Log Exports feature for your existing Amazon MQ brokers.

2. Amazon MQ - Deployment Mode

Baseline:

Checks if Amazon MQ brokers are using the active/standby deployment mode for high availability.

Description:

By using the active/standby deployment mode instead of the single-broker mode (enabled by default), you can achieve high availability for your Amazon MQ brokers, because the service provides automatic failover. The active/standby deployment mode consists of two broker instances: an active broker instance in one AZ and a standby broker instance in a different AZ.

Recommendation:

To enable active/standby deployment mode for your existing Amazon MQ brokers, you need to recreate them with the high availability configuration.

3. Amazon MQ - Auto Minor Version Upgrade

Baseline:

Checks if Amazon MQ brokers have the Auto Minor Version Upgrade feature enabled to automatically receive minor engine upgrades.

Description:

With the Auto Minor Version Upgrade feature enabled, version upgrades occur automatically during the maintenance window. This way, your Amazon MQ brokers get new software features, bug fixes, and security patches.

Recommendation:

To enable the Auto Minor Version Upgrade feature for your existing Amazon MQ brokers, you need to recreate the brokers with the necessary configuration.

AWS Certificate Manager (ACM)

1. AWS Certificate Manager - Certificates Validity

Baseline:

Checks if all the requests made during the SSL/TLS certificate issue or renewal process are validated when managed by ACM.

Description:

When your ACM certificates are not validated on time (within 72 hours after the request is made), these become invalid, and you will have to request new SSL/TLS certificates, which could cause interruption to your applications or services.

Recommendation:

Determine if any ACM certificate requests are not currently validated within your AWS account.

2. AWS Certificate Manager - Certificates Renewal

Baseline:

Checks if any SSL/TLS certificates managed by ACM need to be renewed before their validity period ends.

Description:

When ACM certificates are not renewed before their expiration date, they become invalid, and the AWS resource that uses these certificates (for example, a CloudFront distribution) will no longer be secure. The ACM service does not automatically renew certificates that are not in use (i.e., no longer associated with other AWS resources). The renewal process must be performed manually before these certificates become invalid.

Recommendation:

Renew SSL/TLS certificates that are about to expire using the ACM service.
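
One way to find certificates that are about to expire is sketched below. It assumes certificate dictionaries shaped like the Certificate object returned by the ACM DescribeCertificate API, with NotAfter as a timezone-aware datetime. The function name, the 30-day window, and the sample ARNs are assumptions made for this example. The window the Guidance Report uses is not stated here.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List


def expiring_certificates(certificates: List[Dict[str, Any]],
                          now: datetime,
                          within_days: int = 30) -> List[str]:
    """Return ARNs of issued certificates that expire within the window."""
    cutoff = now + timedelta(days=within_days)
    return [
        cert["CertificateArn"]
        for cert in certificates
        if cert.get("Status") == "ISSUED" and now <= cert["NotAfter"] <= cutoff
    ]


# Hypothetical certificates, evaluated against a fixed reference date.
reference_date = datetime(2024, 1, 1, tzinfo=timezone.utc)
sample_certificates = [
    {"CertificateArn": "arn:aws:acm:us-east-1:111122223333:certificate/example-a",
     "Status": "ISSUED",
     "NotAfter": datetime(2024, 1, 15, tzinfo=timezone.utc)},
    {"CertificateArn": "arn:aws:acm:us-east-1:111122223333:certificate/example-b",
     "Status": "ISSUED",
     "NotAfter": datetime(2024, 6, 1, tzinfo=timezone.utc)},
]

print(expiring_certificates(sample_certificates, reference_date))
```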

3. AWS Certificate Manager - Certificates Expired

Baseline:

Checks if all expired SSL/TLS certificates managed by ACM are removed.

Description:

Removing expired ACM certificates eliminates the risk that an invalid SSL/TLS certificate will be deployed accidentally to another resource, such as Elastic Load Balancing (ELB).

Recommendation:

Delete any expired SSL/TLS certificates managed by ACM.

Amazon WorkSpaces

1. Amazon WorkSpaces - Healthy Instances

Baseline:

Checks whether all WorkSpaces instances are healthy and running properly in order to maintain the working state.

Description:

A WorkSpaces instance that doesn't respond to the service health checks is considered unhealthy. The WorkSpaces service periodically sends status requests to the WorkSpaces instances, and an instance is deemed unhealthy when it does not respond to a HealthCheck request.

Recommendation:

Unhealthy WorkSpaces indicators can often be cleared by rebooting.

Amazon Neptune

1. Amazon Neptune - Auto Minor Version Upgrade

Baseline:

Checks whether Neptune database instances have the Auto Minor Version Upgrade feature enabled in order to automatically receive minor engine upgrades.

Description:

The Neptune database engine is upgraded regularly to introduce new software features, bug fixes, security patches, and performance improvements. The automatic upgrades are applied to Neptune instances during the system maintenance window.

Recommendation:

Enable Auto Minor Version Upgrade feature to update the Neptune database instances.

2. Amazon Neptune - Multi-AZ

Baseline:

Ensures that your Neptune graph database clusters are deployed in at least two AZs.

Description:

If a Neptune graph database cluster is deployed in a single AZ, a failure of that AZ makes the cluster unavailable to every resource that depends on it, including resources in other AZs. Create fault tolerance by deploying Neptune graph database clusters in at least two AZs.

Recommendation:

Remove a single point of failure and increase the availability of your application by deploying Neptune graph database clusters in at least two AZs.

3. Amazon Neptune - Backup Retention Period

Baseline:

Checks whether Amazon Neptune graph database clusters have set a minimum backup retention period to retain automated snapshots.

Description:

Setting a minimum backup retention period for Amazon Neptune clusters ensures that backups are taken continuously and incrementally, so you can quickly restore to any point within the backup retention period. Retaining backups for a longer time gives you more room to restore data in the event of a failure.

Recommendation:

Update the Neptune cluster configuration to set up a sufficient backup retention period.

Amazon OpenSearch Service

1. OpenSearch domains should have encryption at rest enabled (Priority: High)

Resource-level description:

OpenSearch domains must be encrypted at rest to safeguard the resource from security attacks.

Baseline:

Determines if OpenSearch domains have encryption at rest enabled or not.

Description:

Encryption of data at rest helps prevent unauthorized access so no malicious activities can be performed on the sensitive data within your OpenSearch domains (clusters) and their storage systems. OpenSearch encryption at rest uses the AWS KMS service to store and manage the encryption keys.

Recommendation:

Ensure that OpenSearch domains are encrypted at rest to protect them from malicious access and to meet any compliance requirements in your organization.

Amazon GuardDuty

1. GuardDuty should be enabled (Priority: Moderate)

Baseline:

Checks if Amazon GuardDuty is enabled or not.

Description:

Amazon GuardDuty is a managed threat detection service that continuously monitors your VPC flow logs, AWS CloudTrail event logs, and DNS logs for malicious or unauthorized behavior. When GuardDuty is enabled, it identifies and generates findings on unauthorized or unusual activities and provides remediation guidance.

Recommendation:

Enable GuardDuty in every region where your AWS resources are available to fortify your infrastructure against security threats.

Amazon CloudTrail

1. Ensure CloudTrail global services is enabled (Priority: High)

Baseline:

Ensure CloudTrail global services is enabled.

Description:

Improve the visibility of the API activity in your AWS accounts by enabling CloudTrail global services. This toughens the security and simplifies management of AWS accounts by also capturing activities that are not region specific, like IAM events. You can also manage trail configurations for all regions from one location and record API calls in unused regions to detect any unusual activity.

Enabling global services in more than one trail generates duplicate entries. To prevent this, enable it in only a single trail and disable it in other trails.

Recommendation:

Enable CloudTrail global services to better manage AWS accounts and fortify the security of your cloud infrastructure.

2. Ensure both a log metric filter and an alarm exist (Priority: Moderate)

Baseline:

Determines if a CloudWatch metric filter and an alarm are present for detecting changes to CloudTrail configurations.

Description:

A metric filter is used to create numeric values from log data coming from CloudTrail to CloudWatch. You can set an alarm based on the incoming logs, and also visualize the stats for the filter in CloudWatch. Each time a configuration change is made at the CloudTrail service level, a CloudWatch alarm, created in your AWS account, is triggered. Use CloudWatch alarms to detect AWS CloudTrail configuration changes to maintain the integrity of the service configuration.

Recommendation:

Ensure that both a log metric filter and an alarm are enabled.

3. CloudTrail should be enabled and configured with at least one multi-region trail (Priority: High)

Baseline:

Checks if CloudTrail is enabled and configured with at least one multi-region trail.

Description:

When you create a multi-region CloudTrail, AWS actually sets up trails in every region (and every account, in the case of an organization trail). They are separate trails that send data to a shared S3 bucket. So, by creating a multi-region trail, the data is collected centrally.

Recommendation:

Turn on CloudTrail and configure at least one multi-region trail.

4. Ensure CloudTrail global services is enabled (Priority: High)

Resource-level description:

Enabling CloudTrail global services captures regional and global events to facilitate better visibility over the API activity of your AWS account.

Baseline:

Ensure CloudTrail global services is enabled.

Description:

Enabling global services in more than one trail generates duplicate entries. To prevent this, enable it in only a single trail and disable it in other trails.

Recommendation:

Enable CloudTrail global services to better manage AWS accounts and fortify the security of your cloud infrastructure.

5. Ensure the S3 bucket CloudTrail logs are not publicly accessible (Priority: High)

Resource-level description:

Public S3 bucket CloudTrail logs can disrupt the security of your resources.

Baseline:

Determines if your S3 bucket CloudTrail logs are publicly accessible or not.

Description:

Amazon S3 buckets and objects are private by default; only the individual who created the bucket can access it and the objects it contains. CloudTrail logs may include detailed events of API activity in your account. If the permissions you provide for your CloudTrail logs to be stored in an S3 bucket are not secure, you may be providing malicious users access to your AWS account log data, which can increase the risk of unauthorized access.

Recommendation:

Ensure that the S3 bucket associated with CloudTrail logging is not publicly accessible, and safeguard your AWS account log data.

6. CloudTrail should be enabled and configured with at least one multi-region trail (Priority: High)

Resource-level description:

CloudTrail trails that aren't configured with at least one multi-region trail can disrupt the availability of resources.

Baseline:

Checks if CloudTrail is enabled and configured with at least one multi-region trail.

Description:

Recommendation:

Turn on CloudTrail and configure at least one multi-region trail.

7. Ensure both a log metric filter and an alarm exist (Priority: Moderate)

Resource-level description:

A log metric filter and an alarm must be enabled to ensure high availability of resources.

Baseline:

Determines if a CloudWatch metric filter and an alarm are present for detecting changes to CloudTrail configurations.

Description:

Recommendation:

Ensure that both a log metric filter and an alarm are enabled.

Amazon Key Management Service (KMS)

1. Ensure rotation for KMS keys is enabled (Priority: Moderate)

Baseline:

Determines if rotation is enabled for KMS keys.

Description:

Rotating KMS keys helps reduce the potential impact of a compromised key since the data encrypted using the new key can't be accessed using the previous, exposed key.

Recommendation:

Ensure rotation of customer-created KMS keys and reduce the chance of a compromised key.

2. Ensure rotation for KMS keys is enabled (Priority: Moderate)

Resource-level description:

Rotation of customer-created KMS keys must be enabled to protect the key from being exposed.

Baseline:

Determines if rotation is enabled for KMS keys.

Description:

Rotating KMS keys helps reduce the potential impact of a compromised key since the data encrypted using the new key can't be accessed using the previous, exposed key.

Recommendation:

Ensure rotation of customer-created KMS keys and reduce the chance of a compromised key.

Amazon Elastic Container Service (ECS)

1. Amazon ECS cluster - Container insights (Priority: Moderate)

Baseline:

Checks if monitored clusters have container insights enabled.

Description:

Monitoring is an important part of maintaining the reliability, availability, and performance of Amazon ECS clusters. Use AWS CloudWatch container insights to collect, aggregate, and summarize metrics and logs from your containerized applications and micro services. CloudWatch automatically collects metrics for many resources, such as CPU, memory, disk, and network. Container insights also provides diagnostic information, such as container restart failures, to help you isolate issues and resolve them quickly.

Recommendation:

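Enable Container Insights for the cluster. For an existing cluster, this can be done by updating the cluster settings (for example, through the ECS UpdateClusterSettings API). Alternatively, consider creating a new cluster with Container Insights enabled.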

Required Permission:

"ecs:DescribeClusters"
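
The check can be sketched in Python as follows, assuming cluster records shaped like the ECS DescribeClusters response when settings are included. The function name and sample clusters are hypothetical.

```python
def clusters_without_container_insights(clusters):
    """Return names of clusters whose containerInsights setting is absent or disabled."""
    flagged = []
    for c in clusters:
        settings = {s["name"]: s["value"] for s in c.get("settings", [])}
        # A missing setting is treated as disabled in this sketch.
        if settings.get("containerInsights", "disabled") == "disabled":
            flagged.append(c["clusterName"])
    return flagged

# Hypothetical sample data.
sample_clusters = [
    {"clusterName": "web", "settings": [{"name": "containerInsights", "value": "enabled"}]},
    {"clusterName": "batch", "settings": [{"name": "containerInsights", "value": "disabled"}]},
    {"clusterName": "legacy", "settings": []},
]
print(clusters_without_container_insights(sample_clusters))
```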

Amazon Virtual Private Cloud (Amazon VPC)

1. Unused Virtual Private Gateways (Priority: Low)

Baseline:

Checks configuration for Amazon Virtual Private Gateways (VGWs) and identifies unused VGWs that are not associated with the VPC side of the VPN connection.

Description:

Remove every unused (detached) AWS Virtual Private Gateway from your AWS account to simplify management and to avoid reaching the service limit.

Recommendation:

Identify and remove any unused Virtual Private Gateways provisioned within your AWS account to avoid reaching the service limit (by default, you are limited to 5 VGWs, attached or detached, per AWS region).

Required Permission:

"ec2:DescribeVpcs"
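
A minimal Python sketch of the idea, assuming gateway records shaped like the EC2 DescribeVpnGateways response (an assumption for illustration; the API calls Site24x7 actually uses are not stated here). The function name and IDs are hypothetical.

```python
def unused_vpn_gateways(vpn_gateways):
    """Return IDs of gateways that have no VPC attachment in the 'attached' state."""
    unused = []
    for gw in vpn_gateways:
        if gw.get("State") == "deleted":
            continue  # already removed; nothing to clean up
        attached = any(a.get("State") == "attached" for a in gw.get("VpcAttachments", []))
        if not attached:
            unused.append(gw["VpnGatewayId"])
    return unused

# Hypothetical sample data.
sample_gateways = [
    {"VpnGatewayId": "vgw-1", "State": "available",
     "VpcAttachments": [{"VpcId": "vpc-1", "State": "attached"}]},
    {"VpnGatewayId": "vgw-2", "State": "available",
     "VpcAttachments": [{"VpcId": "vpc-2", "State": "detached"}]},
    {"VpnGatewayId": "vgw-3", "State": "available", "VpcAttachments": []},
]
print(unused_vpn_gateways(sample_gateways))
```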

2. Amazon VPN Tunnels - UP (Priority: High)

Baseline:

Checks that the state of your AWS Virtual Private Network (VPN) tunnels is UP so that network traffic can flow over your VPN connections.

Description:

Continuous monitoring for your VPN tunnels will help you take immediate actions in the event of a failure, in order to maximize uptime and ensure network traffic flow over your Amazon VPN connections at all times.

Recommendation:

If your AWS VPN connection tunnels are currently down, verify that your firewall policy allows the VPN tunnel traffic.

Required Permission:

"ec2:DescribeVpnConnections"
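
The tunnel state check can be sketched in Python as below, assuming connection records shaped like the EC2 DescribeVpnConnections response, where each connection carries per-tunnel telemetry. The function name, IDs, and addresses are hypothetical.

```python
def down_tunnels(vpn_connections):
    """Return (connection ID, outside IP) pairs for tunnels whose status is not UP."""
    down = []
    for conn in vpn_connections:
        for tunnel in conn.get("VgwTelemetry", []):
            if tunnel.get("Status") != "UP":
                down.append((conn["VpnConnectionId"], tunnel.get("OutsideIpAddress")))
    return down

# Hypothetical sample data (addresses from a documentation range).
sample_connections = [
    {"VpnConnectionId": "vpn-1", "VgwTelemetry": [
        {"OutsideIpAddress": "203.0.113.10", "Status": "UP"},
        {"OutsideIpAddress": "203.0.113.11", "Status": "DOWN"},
    ]},
]
print(down_tunnels(sample_connections))
```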

3. Amazon VPC - Peering Connection Configuration (Priority: Moderate)

Baseline:

Ensures that the Amazon VPC peering connection configuration is compliant with the desired routing policy.

Description:

Properly configured routing tables for a VPC peering connection restrict traffic to the desired resources only. This limits the impact of a security breach, because AWS resources outside these routes are inaccessible to the peered VPC.

Recommendation:

Determine if the routing tables associated with your peered VPCs implement the right routing policy.

Required Permission:

"ec2:DescribeVpcPeeringConnections"
"ec2:DescribeRouteTables"
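
One way to express a "desired routing policy" is a list of permitted destination CIDR blocks per peering connection. The Python sketch below takes that approach, which is an assumption made for illustration. It expects route tables shaped like the EC2 DescribeRouteTables response and considers IPv4 routes only; the function name and IDs are hypothetical.

```python
import ipaddress

def noncompliant_peering_routes(route_tables, allowed):
    """Return (route table ID, peering ID, CIDR) for routes broader than the policy allows."""
    findings = []
    for rt in route_tables:
        for route in rt.get("Routes", []):
            pcx = route.get("VpcPeeringConnectionId")
            cidr = route.get("DestinationCidrBlock")
            if not pcx or not cidr:
                continue  # not an IPv4 route through a peering connection
            dest = ipaddress.ip_network(cidr)
            permitted = [ipaddress.ip_network(c) for c in allowed.get(pcx, [])]
            # The route complies only if it falls inside a permitted block.
            if not any(dest.subnet_of(p) for p in permitted):
                findings.append((rt["RouteTableId"], pcx, cidr))
    return findings

# Hypothetical sample data.
sample_tables = [{"RouteTableId": "rtb-1", "Routes": [
    {"DestinationCidrBlock": "10.1.0.0/16", "VpcPeeringConnectionId": "pcx-1"},
    {"DestinationCidrBlock": "10.2.3.0/24", "VpcPeeringConnectionId": "pcx-2"},
    {"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-1"},
]}]
policy = {"pcx-1": ["10.1.2.0/24"], "pcx-2": ["10.2.0.0/16"]}
print(noncompliant_peering_routes(sample_tables, policy))
```

Here the route to 10.1.0.0/16 is flagged because the policy permits only 10.1.2.0/24 through pcx-1.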

4. Amazon VPC - Flow Logs (Priority: Moderate)

Baseline:

Checks whether the Virtual Private Cloud (VPC) Flow Logs feature is enabled in all applicable AWS regions.

Description:

Once enabled, VPC Flow Logs start collecting data about network traffic to and from your VPC, helping you detect and troubleshoot security issues and make sure network access rules are not overly permissive. You can also get notified when abnormal activity occurs within your VPC network, such as rejected connection requests or unusual levels of data transfer.

Recommendation:

Enable Flow Logs for your AWS VPC.

Required Permissions:

"ec2:DescribeVpcs"
"ec2:DescribeFlowLogs"
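
A minimal Python sketch of this check for a single region, assuming records shaped like the EC2 DescribeVpcs and DescribeFlowLogs responses. The function name and IDs are hypothetical.

```python
def vpcs_without_flow_logs(vpcs, flow_logs):
    """Return IDs of VPCs that have no active flow log attached."""
    covered = {f["ResourceId"] for f in flow_logs if f.get("FlowLogStatus") == "ACTIVE"}
    return [v["VpcId"] for v in vpcs if v["VpcId"] not in covered]

# Hypothetical sample data.
sample_vpcs = [{"VpcId": "vpc-1"}, {"VpcId": "vpc-2"}]
sample_flow_logs = [{"FlowLogId": "fl-1", "ResourceId": "vpc-1", "FlowLogStatus": "ACTIVE"}]
print(vpcs_without_flow_logs(sample_vpcs, sample_flow_logs))
```

Running the same function against each region's data would cover the "all applicable regions" part of the baseline.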

Amazon EC2 Auto Scaling

1. Health checks for Auto Scaling groups associated with a load balancer (Priority: Moderate)

Resource-level description:

ASGs associated with a load balancer must use load balancer health checks to ensure the availability of resources.

Baseline:

Checks if Auto Scaling groups (ASGs) associated with a load balancer have a health check configuration for the load balancer.

Description:

Recommendation:

Add a load balancer health check to an ASG so instance health is determined by both the EC2 status checks and the load balancer health checks.
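
The check can be sketched in Python as below, assuming group records shaped like the Auto Scaling DescribeAutoScalingGroups response. HealthCheckType is treated as a comma-separated string, so a value such as "EC2,ELB" passes. The function name and group names are hypothetical.

```python
def asgs_missing_elb_health_check(groups):
    """Return names of load-balanced ASGs that do not use ELB health checks."""
    flagged = []
    for g in groups:
        uses_lb = bool(g.get("LoadBalancerNames") or g.get("TargetGroupARNs"))
        check_types = g.get("HealthCheckType", "").split(",")
        if uses_lb and "ELB" not in check_types:
            flagged.append(g["AutoScalingGroupName"])
    return flagged

# Hypothetical sample data.
sample_groups = [
    {"AutoScalingGroupName": "web", "LoadBalancerNames": ["web-lb"], "HealthCheckType": "EC2"},
    {"AutoScalingGroupName": "api", "TargetGroupARNs": ["arn:example"], "HealthCheckType": "ELB"},
    {"AutoScalingGroupName": "workers", "LoadBalancerNames": [], "HealthCheckType": "EC2"},
]
print(asgs_missing_elb_health_check(sample_groups))
```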

2. Amazon EC2 Instance Auto Scaling Groups (EC2) - Use launch template (Priority: Moderate)

Baseline:

Checks whether the Amazon EC2 Auto Scaling group is created from an EC2 launch template.

Description:

An EC2 Auto Scaling group can be created from either an EC2 launch template or a launch configuration. However, using a launch template to create an Auto Scaling group ensures that you have access to the latest features and improvements.

Recommendation:

Consider migrating from launch configurations to launch templates for EC2 Auto Scaling groups.

Required permission:

"autoscaling:DescribeAutoScalingGroups"
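
A minimal Python sketch, assuming group records shaped like the DescribeAutoScalingGroups response, where a group built from a launch configuration carries a LaunchConfigurationName field. The function name and sample values are hypothetical.

```python
def asgs_using_launch_configuration(groups):
    """Return names of ASGs that still reference a launch configuration."""
    return [g["AutoScalingGroupName"] for g in groups if g.get("LaunchConfigurationName")]

# Hypothetical sample data.
sample_asgs = [
    {"AutoScalingGroupName": "old-fleet", "LaunchConfigurationName": "old-fleet-lc"},
    {"AutoScalingGroupName": "new-fleet",
     "LaunchTemplate": {"LaunchTemplateId": "lt-1", "Version": "$Latest"}},
]
print(asgs_using_launch_configuration(sample_asgs))
```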

AWS Glue

1. AWS Glue - Automatically scale number of workers (Priority: Medium)

Baseline:

Checks whether the Automatically scale number of workers option is enabled for the Glue Job.

Description:

With automatic scaling, AWS Glue can dynamically adjust the number of workers used by a job depending on the job's workload. While automatic scaling optimizes resource usage, it is important to note that increasing the maximum capacity of resources can lead to higher costs.

Recommendation:

Enable the Automatically scale number of workers option for the Glue Job to increase availability.
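
In job definitions, this console option corresponds to the `--enable-auto-scaling` job argument. The Python sketch below assumes job records shaped like the AWS Glue GetJobs response and looks for that argument in DefaultArguments; the function name and job names are hypothetical.

```python
def jobs_without_auto_scaling(jobs):
    """Return names of Glue jobs whose default arguments do not enable auto scaling."""
    flagged = []
    for job in jobs:
        value = job.get("DefaultArguments", {}).get("--enable-auto-scaling", "")
        if str(value).lower() != "true":
            flagged.append(job["Name"])
    return flagged

# Hypothetical sample data.
sample_jobs = [
    {"Name": "nightly-etl", "DefaultArguments": {"--enable-auto-scaling": "true"}},
    {"Name": "adhoc-load", "DefaultArguments": {"--job-language": "python"}},
]
print(jobs_without_auto_scaling(sample_jobs))
```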

AWS DRS

1. Elastic Disaster Recovery - Data Replication (Priority: Low)

Baseline:

Checks whether data replication is enabled.

Description:

Elastic Disaster Recovery data replication minimizes data loss in the event of a disaster and allows for quick failover to the replicated environment. By replicating data frequently, AWS DRS helps achieve a low recovery point objective (RPO).

Recommendation:

Regular data replication prevents data loss and keeps the recovery process reliable. Therefore, enable data replication for Elastic Disaster Recovery.
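
As a rough Python sketch, the check could inspect the replication state reported for each source server. The record shape below follows the AWS DRS DescribeSourceServers response as an assumption, and the set of states treated as "not replicating" is a choice made for this example; the function name and server IDs are hypothetical.

```python
# States treated as "replication not running" in this sketch (an assumption).
NOT_REPLICATING = {"STOPPED", "PAUSED", "STALLED", "DISCONNECTED"}

def servers_not_replicating(source_servers):
    """Return IDs of source servers whose data replication is missing or not running."""
    flagged = []
    for server in source_servers:
        state = server.get("dataReplicationInfo", {}).get("dataReplicationState")
        if state is None or state in NOT_REPLICATING:
            flagged.append(server["sourceServerID"])
    return flagged

# Hypothetical sample data.
sample_servers = [
    {"sourceServerID": "s-1", "dataReplicationInfo": {"dataReplicationState": "CONTINUOUS"}},
    {"sourceServerID": "s-2", "dataReplicationInfo": {"dataReplicationState": "STOPPED"}},
    {"sourceServerID": "s-3"},
]
print(servers_not_replicating(sample_servers))
```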

Compliance checks

Site24x7 conducts compliance checks for the best practice recommendations provided in the AWS Guidance Report. These checks look for security vulnerabilities and help you analyze if your cloud infrastructure complies with global security and compliance standards. You can identify the practices that are not compliant and receive recommendations to comply.

Site24x7 carries out compliance checks for the following security standards and certifications:

PCI DSS: The Payment Card Industry Data Security Standard (PCI DSS) ensures that all entities maintain a secure environment for the credit card information that is processed, stored, or transmitted.
GDPR: The General Data Protection Regulation (GDPR) is a pan-European regulation that requires businesses to protect the personal data and privacy of customers while processing their personal data.
NIST: Compliance with the National Institute of Standards and Technology (NIST) ensures that federal agencies meet requirements of the Federal Information Security Management Act (FISMA).
APRA: The Australian Prudential Regulation Authority (APRA) requires organizations in the financial and insurance sectors to strengthen their information security framework.
MAS: The Monetary Authority of Singapore (MAS) enforces guidance for financial institutions on individual accountability and conduct grounds.
HIPAA: The Health Insurance Portability and Accountability Act (HIPAA) of 1996 is a federal law. This law prohibits the disclosure of sensitive patient health information without the patient's consent or knowledge.
CIS: Center for Internet Security (CIS) Benchmarks are security standards for defending IT systems and data against cyberattacks.

Viewing compliance checks

You can view the compliance data along with best practice recommendations for your AWS instances.

Frequently Asked Questions (FAQs)

1. What is Site24x7's Guidance Report for AWS?

The Guidance Report inspects your AWS environment and helps you identify opportunities to effectively use resources like EC2 instances, EBS volumes, ELB nodes and more.

2. Is the Guidance Report available to all users?

Yes. Site24x7's Guidance Report for AWS is available to all Site24x7 subscription holders, both paid and evaluation. All you need to do is enable access, either by creating an IAM user or by using a cross-account IAM role, and connect your AWS account with Site24x7.

3. Limitations of Guidance Report

Currently, the Guidance Report offers recommendation checks for select AWS services only.
The compliance of only monitored resources is examined. Resources that were excluded using various auto discovery filters are not taken into account.

4. How can I access the Guidance Report?

For already monitored AWS accounts

Sign in to the Site24x7 web console. Choose AWS from the left navigation pane and choose the AWS account for which you want to view recommendations.
In the drop-down menu, choose Guidance Report.

For newly integrated AWS accounts

Building the Guidance Report takes one hour from the time of AWS account integration. Once done, you can sign in to the Site24x7 console and choose the monitored AWS account > Guidance Report to view the recommendations.

5. How frequently will the report update?

The Guidance Report is updated every week from the time of AWS account integration.

6. What about email notifications?

Weekly email updates are sent to the Super Admin contact associated with your Site24x7 subscription account.

7. Can I schedule the report?

Yes. You can choose the frequency (daily, weekly, or monthly), the time of day, and the user group using the Scheduled Reports feature.

8. Will newly monitored resources show up in the Guidance Report?

Yes. The report is refreshed every week. Any new resources that are discovered and monitored during this period and do not comply with our checks are included in the report.

9. How does Site24x7 collect the data required to make the recommendation?

Site24x7 makes use of various AWS service-level APIs to collect configuration information. The resource usage metrics collected by polling CloudWatch APIs are used to identify idle or unused resources.

10. How does Site24x7 make Instance Type recommendations?

With the Guidance Report enabled, Site24x7 analyzes key metrics of your instance, like CPU and memory usage, along with its current Instance Type. Based on both, Site24x7 suggests the right Instance Type for your usage pattern.
