Sending CloudWatch Alarms to Slack via SNS and AWS Lambda using Terraform

AWS CloudWatch tracks the performance and health of your AWS resources. To keep your infrastructure stable, however, alarms also need to reach the right teams promptly.

In this article, we’ll use Terraform to configure a pipeline that sends AWS CloudWatch alarms to a Slack channel via SNS and AWS Lambda. This setup keeps your team informed so it can act quickly on critical issues.

Prerequisites

Before we start, make sure you have the following:

  • An AWS account with sufficient permissions to create IAM roles, SNS topics, Lambda functions, and CloudWatch alarms.
  • A Slack workspace where you can create an incoming webhook for notifications.
  • Terraform installed on your local machine.

Step 1: Set Up a Slack Incoming Webhook

  1. Create a Slack channel or use an existing one.
  2. Create a workflow by selecting the Webhook option.
  3. Create a variable named SubscribeURL. This name is crucial for the workflow.
  4. Add the above variable in the message body of the workflow.
  5. Publish the workflow and obtain the URL.

Example URL: https://hooks.slack.com/services/webhook_id/randomhash/id
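
Before wiring the webhook into AWS, it can help to confirm that it accepts a message at all. The following is a minimal sketch, assuming the webhook accepts a JSON body with a text field in the way Slack incoming webhooks do; a workflow webhook may instead expect the variables defined in the workflow. The function names and the sample text are illustrative and not part of the original setup.

```python
import json
import urllib.request

# Placeholder URL from the article; replace it with your own webhook URL.
WEBHOOK_URL = "https://hooks.slack.com/services/webhook_id/randomhash/id"


def build_webhook_request(url: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a JSON message."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> int:
    """Send the request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


request = build_webhook_request(WEBHOOK_URL, "Test message from the alerting setup")
# Uncomment once WEBHOOK_URL points at a real webhook:
# print(send(request))
```

Building the request separately from sending it makes it possible to inspect the payload without touching the network.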

Step 2: Terraform Configuration

We’ll use Terraform to create the AWS resources needed for this setup, including an SNS topic, a Lambda function, and CloudWatch alarms.

1. SNS Topic and Lambda Function

Create an SNS topic that triggers a Lambda function whenever a CloudWatch alarm is fired. The Lambda function will then send the alarm data to Slack.

resource "aws_sns_topic" "this" {
  name = "slack_trigger_topic"
}

resource "aws_sns_topic_subscription" "lambda_subscription" {
  topic_arn  = aws_sns_topic.this.arn
  protocol   = "lambda"
  endpoint   = aws_lambda_function.slack.arn
  depends_on = [aws_lambda_function.slack]
}

## Use an existing IAM role or create a new one
data "aws_iam_role" "lambda" {
  name = "sns-lambda-role"
}

data "archive_file" "lambda" {
  type        = "zip"
  source_file = "${path.module}/index.py"
  output_path = "${path.module}/lambda_function_payload.zip"
}

resource "aws_lambda_function" "slack" {
  filename         = "${path.module}/lambda_function_payload.zip"
  function_name    = "lambda_slack"
  role             = data.aws_iam_role.lambda.arn
  handler          = "index.lambda_handler"
  source_code_hash = data.archive_file.lambda.output_base64sha256
  runtime          = "python3.9"
  
  environment {
    variables = {
      SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/webhook_id/randomhash/id"
    }
  }
}

resource "aws_lambda_permission" "with_sns" {
  statement_id  = "AllowExecutionFromSNS"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.slack.function_name
  principal     = "sns.amazonaws.com"
  source_arn    = aws_sns_topic.this.arn
}
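
For reference, this is roughly what the Lambda function receives when the SNS subscription above fires. The event below is an abridged, hand-written approximation: real events carry more fields, and every value here (account ID, region, alarm details) is made up for illustration. The point it shows is that SNS delivers the CloudWatch alarm as a JSON string inside Records[].Sns.Message, so the handler has to decode it a second time.

```python
import json

# Illustrative CloudWatch alarm notification; all values are made up.
alarm_message = {
    "AlarmName": "EC2-CPU-Utilization",
    "AlarmDescription": "Example alarm",
    "NewStateValue": "ALARM",
    "OldStateValue": "OK",
    "NewStateReason": "Threshold Crossed: 1 datapoint was greater than or equal to the threshold (80.0).",
    "Region": "US East (N. Virginia)",
    "Trigger": {
        "MetricName": "CPUUtilization",
        "Namespace": "AWS/EC2",
        "Statistic": "AVERAGE",
        "Period": 60,
        "EvaluationPeriods": 1,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "Threshold": 80.0,
    },
}

# SNS wraps the alarm JSON as a *string* in Records[].Sns.Message.
sample_event = {
    "Records": [
        {
            "EventSource": "aws:sns",
            "Sns": {
                "Type": "Notification",
                "TopicArn": "arn:aws:sns:us-east-1:123456789012:slack_trigger_topic",
                "Subject": 'ALARM: "EC2-CPU-Utilization" in US East (N. Virginia)',
                "Message": json.dumps(alarm_message),
            },
        }
    ]
}

# Decode each record the same way a handler would.
for record in sample_event["Records"]:
    parsed = json.loads(record["Sns"]["Message"])
    print(parsed["AlarmName"], parsed["NewStateValue"])
```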

2. Lambda Function Code

The Lambda function parses the incoming CloudWatch alarm and sends it to Slack. The following Python script is packaged into the Lambda function:

index.py
--------
import os
import json
import urllib.request

def lambda_handler(event, context):
    for record in event['Records']:
        try:
            process_message(record)
        except Exception as e:
            print(f"Error processing record: {e}")

def process_message(record):
    try:
        message = record['Sns']['Message']
        slack_webhook_url = os.environ['SLACK_WEBHOOK_URL']
        
        try:
            json_message = json.loads(message)
            alarm_name = json_message.get('AlarmName', 'Unknown Alarm')
            new_state_value = json_message.get('NewStateValue', 'UNKNOWN')
            
            color = "danger" if new_state_value == "ALARM" else ("good" if new_state_value == "OK" else "#808080")

            formatted_message = f"*Alarm Name*: {alarm_name}\n\n```{json.dumps(json_message, indent=2)}```"
        except json.JSONDecodeError:
            formatted_message = f"*Alarm Name*: Unknown\n\n```{message}```"
            color = "#808080"
        
        slack_message = {
            "attachments": [
                {
                    "color": color,  
                    "text": formatted_message
                }
            ]
        }
        
        slack_payload = json.dumps(slack_message).encode('utf-8')
        
        req = urllib.request.Request(
            slack_webhook_url,
            data=slack_payload,
            headers={'Content-Type': 'application/json'}
        )
        with urllib.request.urlopen(req) as response:
            response_body = response.read().decode('utf-8')

        if response.getcode() == 200:
            print("Message sent to Slack successfully")
        else:
            print(f"Error sending message to Slack: {response.getcode()} - {response_body}")
            raise Exception(f"Error sending message to Slack: {response_body}")
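    except Exception as e:
        # Log the failure and re-raise so lambda_handler can report it for this record
        print(f"Failed to forward SNS message to Slack: {e}")
        raise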

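The formatting logic in index.py can also be separated from the network call, which makes it easy to check locally without a webhook. This is a minimal sketch of that idea, not the script deployed above: the names build_slack_message, color_for_state, CODE_FENCE, and GREY are hypothetical, and unlike the original it also falls back to the grey "Unknown" message when the SNS message is valid JSON but not an object.

```python
import json

CODE_FENCE = "`" * 3  # three backticks: Slack's delimiter for a preformatted block
GREY = "#808080"


def color_for_state(state: str) -> str:
    """Map a CloudWatch alarm state to a Slack attachment color."""
    if state == "ALARM":
        return "danger"
    if state == "OK":
        return "good"
    return GREY  # INSUFFICIENT_DATA or anything unexpected


def build_slack_message(message: str) -> dict:
    """Turn the raw SNS message string into a Slack payload without sending it."""
    try:
        alarm = json.loads(message)
    except json.JSONDecodeError:
        alarm = None

    if isinstance(alarm, dict):
        name = alarm.get("AlarmName", "Unknown Alarm")
        color = color_for_state(alarm.get("NewStateValue", "UNKNOWN"))
        body = json.dumps(alarm, indent=2)
    else:
        # Plain-text notification, or JSON that is not an object
        name, color, body = "Unknown", GREY, message

    text = f"*Alarm Name*: {name}\n\n{CODE_FENCE}{body}{CODE_FENCE}"
    return {"attachments": [{"color": color, "text": text}]}


alarm_payload = build_slack_message(
    json.dumps({"AlarmName": "RDS-CPU-Utilization", "NewStateValue": "ALARM"})
)
plain_payload = build_slack_message("not json")
print(alarm_payload["attachments"][0]["color"])  # danger
print(plain_payload["attachments"][0]["color"])  # #808080
```
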
3. CloudWatch Alarms

Finally, we’ll create CloudWatch alarms for EC2 and RDS instances. These alarms will trigger when specific thresholds are breached, and the alerts will be sent to Slack.

main.tf
-------
resource "aws_cloudwatch_metric_alarm" "this" {
  for_each = var.alarms

  alarm_name                = each.value.alarm_name
  alarm_description         = each.value.alarm_description
  comparison_operator       = each.value.comparison_operator
  evaluation_periods        = each.value.evaluation_periods
  threshold                 = each.value.threshold
  period                    = each.value.period
  unit                      = each.value.unit
  namespace                 = each.value.namespace
  metric_name               = each.value.metric_name
  statistic                 = each.value.statistic
  alarm_actions             = [aws_sns_topic.this.arn]
  # The alarms variable has no insufficient_data_actions attribute, so no action is configured for that state
  insufficient_data_actions = []
  ok_actions                = [aws_sns_topic.this.arn]
}

variable.tf
-----------
variable "alarms" {
  description = "Map of alarms for RDS and EC2 instances"
  type = map(object({
    alarm_name                = string
    alarm_description         = string
    comparison_operator       = string
    evaluation_periods        = number
    threshold                 = number
    period                    = number
    unit                      = string
    namespace                 = string
    metric_name               = string
    statistic                 = string
  }))
  default = {
    rds_cpu_utilization = {
      alarm_name                = "RDS-CPU-Utilization"
      alarm_description         = "Triggers when CPU utilization for the RDS instance exceeds 80% for 1 minute. High CPU usage could indicate resource constraints."
      comparison_operator       = "GreaterThanOrEqualToThreshold"
      evaluation_periods        = 1
      threshold                 = 80
      period                    = 60
      unit                      = "Percent"
      namespace                 = "AWS/RDS"
      metric_name               = "CPUUtilization"
      statistic                 = "Average"
    }
    ec2_cpu_utilization = {
      alarm_name                = "EC2-CPU-Utilization"
      alarm_description         = "Triggers when average CPU utilization for the EC2 instance reaches or exceeds 80% over a 1-minute period. Consistently high CPU usage may indicate that the instance is under heavy load."
      comparison_operator       = "GreaterThanOrEqualToThreshold"
      evaluation_periods        = 1
      threshold                 = 80
      period                    = 60
      unit                      = "Percent"
      namespace                 = "AWS/EC2"
      metric_name               = "CPUUtilization"
      statistic                 = "Average"
    }
    # Add more alarms here
  }
}
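
To make the interplay of threshold, comparison_operator, and evaluation_periods more concrete, here is a simplified model of how an alarm like the ones above is evaluated. It is only a sketch under stated assumptions: every period has a datapoint, and all of the most recent evaluation_periods datapoints must breach the threshold. CloudWatch's handling of missing data and its "M out of N" datapoints option are deliberately left out, and the would_alarm helper is hypothetical.

```python
import operator

# CloudWatch comparison operator names mapped to Python comparisons.
COMPARATORS = {
    "GreaterThanOrEqualToThreshold": operator.ge,
    "GreaterThanThreshold": operator.gt,
    "LessThanThreshold": operator.lt,
    "LessThanOrEqualToThreshold": operator.le,
}


def would_alarm(datapoints, threshold, comparison_operator, evaluation_periods):
    """Return True if the most recent `evaluation_periods` datapoints all breach.

    Simplification: one datapoint per period, none missing.
    """
    if len(datapoints) < evaluation_periods:
        return False
    compare = COMPARATORS[comparison_operator]
    recent = datapoints[-evaluation_periods:]
    return all(compare(value, threshold) for value in recent)


# One average CPU value per 60-second period, oldest first (made-up numbers).
cpu = [42.0, 85.0, 91.0, 79.5, 80.0]

one_period = would_alarm(cpu, 80, "GreaterThanOrEqualToThreshold", 1)
five_periods = would_alarm(cpu, 80, "GreaterThanOrEqualToThreshold", 5)
strict = would_alarm(cpu, 80, "GreaterThanThreshold", 1)
print(one_period, five_periods, strict)  # True False False
```

With evaluation_periods = 1 and period = 60, a single one-minute datapoint at or above 80 is enough to change the alarm state; raising evaluation_periods to 5 would require five consecutive breaching minutes.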

4. Deploy the Terraform Configuration

  • Run terraform init, terraform plan, and terraform apply to deploy the resources.
  • After deploying, test the setup by manually triggering a CloudWatch alarm and verify that the notification is sent to Slack.
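
One way to trigger an alarm manually is the AWS CLI's set-alarm-state command, which temporarily forces an alarm into a given state; CloudWatch returns the alarm to its actual state at the next evaluation. The sketch below only builds and prints the command, assuming the AWS CLI is installed and configured. The helper name set_alarm_state_command and the default reason text are illustrative.

```python
import shlex


def set_alarm_state_command(alarm_name, state="ALARM",
                            reason="Manual test of Slack notifications"):
    """Build the AWS CLI call that temporarily forces an alarm into `state`."""
    if state not in ("OK", "ALARM", "INSUFFICIENT_DATA"):
        raise ValueError(f"unsupported alarm state: {state}")
    return [
        "aws", "cloudwatch", "set-alarm-state",
        "--alarm-name", alarm_name,
        "--state-value", state,
        "--state-reason", reason,
    ]


command = set_alarm_state_command("EC2-CPU-Utilization")
print(shlex.join(command))  # copy-pasteable shell command

# To execute it directly (requires the AWS CLI and valid credentials):
# import subprocess
# subprocess.run(command, check=True)
```

The alarm name must match one of the alarm_name values defined in the alarms variable.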

Conclusion

By following these steps, you can set up an effective alerting system that sends AWS CloudWatch alarm notifications directly to your Slack channels. This Terraform-based solution enhances your monitoring capabilities, ensuring that your team is promptly informed about critical issues, allowing them to take immediate action and maintain the stability of your infrastructure.

I hope you find this article and my experience helpful. Thanks for taking the time to read it.

Author

oDesk Software