This guide walks through streaming logs from Amazon CloudWatch Logs to Amazon OpenSearch Service via Amazon Kinesis Data Streams and Amazon Data Firehose (formerly Kinesis Data Firehose), with a transformation step in AWS Lambda.
Architecture Diagram.
Prerequisites:
- An active AWS account with administrative access, so that you can create and manage the necessary resources.
- An IAM user with attached policies granting full access to Amazon CloudWatch, Kinesis Data Streams, Firehose, AWS Lambda, and Amazon OpenSearch Service.
- An OpenSearch Service domain that is up and running.
- The process is based on logs and alerts aggregated in a CloudWatch log group in the security account. This is the same account that has the OpenSearch Service domain deployed.
Step 1: Creating a CloudWatch Log Group
- Access CloudWatch:
- Open the AWS Management Console.
- Navigate to CloudWatch under the Management & Governance section.
- Create Log Group:
- In the CloudWatch dashboard, find the ‘Logs’ section in the sidebar.
- Click ‘Create log group’ and enter a descriptive name (e.g., “/aws/lambda/my-function-logs”).
- Configure retention settings to determine how long you want to keep the logs.
- Log Streams:
- In your log group, create a log stream to categorize logs further.
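The console steps above can also be scripted. The following is a minimal sketch, assuming the boto3 CloudWatch Logs client; the helper names `log_group_calls` and `apply_calls` and the example stream name are hypothetical and used only for illustration.

```python
def log_group_calls(group_name, stream_name, retention_days):
    """Return (method, kwargs) pairs for a CloudWatch Logs client, in call order."""
    return [
        ("create_log_group", {"logGroupName": group_name}),
        ("put_retention_policy",
         {"logGroupName": group_name, "retentionInDays": retention_days}),
        ("create_log_stream",
         {"logGroupName": group_name, "logStreamName": stream_name}),
    ]


def apply_calls(client, calls):
    """Invoke each planned call on the given client (for example a boto3 client)."""
    for method, kwargs in calls:
        getattr(client, method)(**kwargs)
```

With boto3 installed and credentials configured, the plan could be applied with `apply_calls(boto3.client("logs"), log_group_calls("/aws/lambda/my-function-logs", "my-first-stream", 30))`. Keeping the plan separate from the client makes it possible to inspect the calls before anything is created.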
Step 2: Configuring Kinesis Data Stream
- Open Kinesis Service:
- Go to the Kinesis service from the AWS Management Console.
- Create Data Stream:
- Click ‘Create data stream’ and enter a name (e.g., “MyCloudWatchLogsStream”).
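- Decide on the number of shards (in provisioned mode). A shard provides a fixed unit of capacity: writes of up to 1 MB per second or 1,000 records per second, and reads of up to 2 MB per second. More shards mean higher throughput.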
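As a rough illustration of the shard decision, the sketch below estimates a shard count from an expected write load. It assumes the documented per-shard write limits for provisioned mode (1 MB per second and 1,000 records per second, with 1 MB approximated as 10^6 bytes); the function name `estimate_shards` is hypothetical.

```python
import math

# Assumed per-shard write limits for provisioned-mode streams.
WRITE_BYTES_PER_SEC = 1_000_000
WRITE_RECORDS_PER_SEC = 1_000


def estimate_shards(bytes_per_sec, records_per_sec):
    """Return the smallest shard count that covers both write limits."""
    by_bytes = math.ceil(bytes_per_sec / WRITE_BYTES_PER_SEC)
    by_records = math.ceil(records_per_sec / WRITE_RECORDS_PER_SEC)
    # A stream always has at least one shard.
    return max(1, by_bytes, by_records)
```

For example, 2.5 MB per second at 800 records per second is limited by bytes and needs three shards. The estimate covers steady-state writes only, so leave headroom for bursts.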
Step 3: Streaming from CloudWatch to Kinesis Data Stream
- Set Up Streaming:
- Return to the CloudWatch log group you created.
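- Select your log group and, from the ‘Actions’ dropdown, choose ‘Subscription filters’ and then ‘Create Kinesis subscription filter’ (the exact label can differ between console versions).
- In the wizard, select the data stream you created in Step 2 and an IAM role that allows CloudWatch Logs to put records into that stream.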
- Configure Log Format:
- Choose the log format that matches the events in your log group.
- Optionally set a subscription filter pattern. The pattern selects which log events are forwarded to the stream; leave it empty to forward every event.
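The same subscription can be expressed as API parameters. This is a sketch, assuming the boto3 `put_subscription_filter` call and an IAM role that CloudWatch Logs can assume; the helper names and the filter name are hypothetical.

```python
def logs_trust_policy():
    """Trust policy letting the CloudWatch Logs service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "logs.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }


def put_record_policy(stream_arn):
    """Permissions policy letting that role write to one Kinesis data stream."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "kinesis:PutRecord",
            "Resource": stream_arn,
        }],
    }


def subscription_filter_args(log_group, stream_arn, role_arn, pattern=""):
    """Keyword arguments for put_subscription_filter; an empty pattern matches all events."""
    return {
        "logGroupName": log_group,
        "filterName": "to-kinesis",  # hypothetical filter name
        "filterPattern": pattern,
        "destinationArn": stream_arn,
        "roleArn": role_arn,
    }
```

With boto3 available, the result could be passed as `boto3.client("logs").put_subscription_filter(**args)`. A pattern such as `"ERROR"` would forward only events containing that term.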
Step 4: Transforming Logs with Lambda
- Create Lambda Function:
- Go to the Lambda service from the AWS Management Console.
- Click ‘Create function,’ select ‘Author from scratch,’ and input a name for your function (e.g., “TransformLogData”).
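- Choose an execution role. Because Firehose invokes the function and delivers the transformed records itself (Step 5), the function needs only basic permissions to write its own logs to CloudWatch. The role used by Firehose is the one that needs permission to read from Kinesis, invoke the function, and write to OpenSearch.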
- Write Transformation Code:
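- Enter your transformation logic in the Lambda code editor, or upload a ZIP file if your code is more complex. CloudWatch Logs delivers each batch of log events as a gzip-compressed JSON payload, so the function must decompress the data before parsing it.
- Set environment variables if your function requires them.
- Configure timeout and memory settings according to the expected workload. Firehose allows a transformation invocation to run for up to 5 minutes.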
- Deploy and Test Function:
- Save and deploy your function.
- You can test it by configuring a test event in the Lambda console that simulates a batch of Firehose records, each with a record ID and base64-encoded data.
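A transformation function for this pipeline might look like the sketch below. It assumes the Firehose transformation contract (each input record carries a `recordId` and base64-encoded `data`, and each output record reports `Ok`, `Dropped`, or `ProcessingFailed`) and the gzip-compressed CloudWatch Logs payload. The shape of the output document is an illustrative choice, not something the pipeline requires.

```python
import base64
import gzip
import json


def transform_record(record):
    """Turn one Firehose record into a single JSON document for OpenSearch."""
    payload = json.loads(gzip.decompress(base64.b64decode(record["data"])))

    # CONTROL_MESSAGE records only probe the destination and carry no log data.
    if payload.get("messageType") != "DATA_MESSAGE":
        return {"recordId": record["recordId"], "result": "Dropped"}

    # Assumed document shape: one document per batch of log events.
    document = {
        "logGroup": payload["logGroup"],
        "logStream": payload["logStream"],
        "events": [
            {"timestamp": e["timestamp"], "message": e["message"]}
            for e in payload["logEvents"]
        ],
    }
    encoded = base64.b64encode(json.dumps(document).encode("utf-8"))
    return {
        "recordId": record["recordId"],
        "result": "Ok",
        "data": encoded.decode("ascii"),
    }


def lambda_handler(event, context):
    """Entry point invoked by Firehose with a batch of records."""
    results = []
    for record in event["records"]:
        try:
            results.append(transform_record(record))
        except Exception:
            # Mark only this record as failed instead of failing the whole batch.
            results.append(
                {"recordId": record["recordId"], "result": "ProcessingFailed"}
            )
    return {"records": results}
```

Every input `recordId` appears exactly once in the response, which is what Firehose expects. Records marked `ProcessingFailed` are treated as delivery failures and go to the backup location configured in Step 5.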
Step 5: Setting Up Kinesis Firehose Delivery Stream
- Create Firehose Stream:
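- Navigate to Amazon Data Firehose (formerly Kinesis Data Firehose) in the AWS Management Console.
- Click ‘Create Firehose stream’ (labeled ‘Create delivery stream’ in older console versions) and select ‘Amazon Kinesis Data Streams’ for the source.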
- Choose the data stream from the list provided.
- Configure Lambda Transformation:
- Select the Lambda function you created for data transformation.
- Set buffer size and interval to group records before sending them to the function.
- Configure Destination:
- Select ‘Amazon OpenSearch Service’ for the destination.
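- Provide the domain details and specify the index name, type, and retry options. Leave the type empty for OpenSearch domains and for Elasticsearch 7.x or later, where mapping types are no longer supported.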
- Back Up Failed Data:
- Choose an S3 bucket for backup. Firehose can back up only the records it fails to deliver to OpenSearch, or all records.
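The settings above map onto the parameters of the Firehose `create_delivery_stream` API. The sketch below builds them as a dictionary, assuming boto3 parameter names; the function name, the default index name, and the buffering and retry values are hypothetical placeholders to adjust for your workload.

```python
def firehose_stream_args(name, source_stream_arn, domain_arn, lambda_arn,
                         backup_bucket_arn, role_arn,
                         index_name="cloudwatch-logs"):
    """Keyword arguments for a Kinesis-sourced stream that delivers to OpenSearch."""
    lambda_processor = {
        "Type": "Lambda",
        "Parameters": [
            {"ParameterName": "LambdaArn", "ParameterValue": lambda_arn},
            # Buffering applied before records are sent to the function.
            {"ParameterName": "BufferSizeInMBs", "ParameterValue": "1"},
            {"ParameterName": "BufferIntervalInSeconds", "ParameterValue": "60"},
        ],
    }
    return {
        "DeliveryStreamName": name,
        "DeliveryStreamType": "KinesisStreamAsSource",
        "KinesisStreamSourceConfiguration": {
            "KinesisStreamARN": source_stream_arn,
            "RoleARN": role_arn,
        },
        "AmazonopensearchserviceDestinationConfiguration": {
            "RoleARN": role_arn,
            "DomainARN": domain_arn,
            "IndexName": index_name,
            "IndexRotationPeriod": "OneDay",
            "BufferingHints": {"IntervalInSeconds": 60, "SizeInMBs": 5},
            "RetryOptions": {"DurationInSeconds": 300},
            "S3BackupMode": "FailedDocumentsOnly",
            "S3Configuration": {
                "RoleARN": role_arn,
                "BucketARN": backup_bucket_arn,
            },
            "ProcessingConfiguration": {
                "Enabled": True,
                "Processors": [lambda_processor],
            },
        },
    }
```

With boto3 available, the result could be passed as `boto3.client("firehose").create_delivery_stream(**args)`. A single role is used for every part here to keep the sketch short; separate, narrower roles are a reasonable alternative.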
Step 6: Establishing OpenSearch Domain
- Create or Choose Domain:
- If you don’t have an OpenSearch domain, create one by navigating to the OpenSearch Service and clicking ‘Create a new domain.’
- If you already have a domain, ensure it’s configured to accept incoming data from Firehose.
- Secure Your Domain:
- Apply access policies to control who can send data to and query your OpenSearch domain.
- If the domain runs inside a VPC, ensure that its subnets and security group settings allow traffic from Firehose.
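An access policy that lets the Firehose role write to the domain could be sketched as follows. This assumes a resource-based policy on the domain and a dedicated IAM role for Firehose; the function name is hypothetical, and the action list is a narrow starting point rather than a complete set for every configuration.

```python
import json


def domain_access_policy(domain_arn, firehose_role_arn):
    """Resource-based policy allowing one role to send HTTP requests to the domain."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": firehose_role_arn},
            "Action": ["es:ESHttpGet", "es:ESHttpPost", "es:ESHttpPut"],
            # The trailing /* covers index and document paths under the domain.
            "Resource": f"{domain_arn}/*",
        }],
    }


def access_policy_json(domain_arn, firehose_role_arn):
    """Serialize the policy, since the service API takes it as a JSON string."""
    return json.dumps(domain_access_policy(domain_arn, firehose_role_arn))
```

With boto3 available, the string could be applied with `boto3.client("opensearch").update_domain_config(DomainName=..., AccessPolicies=policy_json)`. If fine-grained access control is enabled on the domain, the role typically also needs to be mapped to a role inside OpenSearch.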
Step 7: Monitoring and Validation
- Monitor Log Group:
- Check the CloudWatch log group to confirm that logs are being generated, and check the IncomingRecords metric of the Kinesis data stream to confirm that they are arriving.
- Check Lambda Invocation:
- In the Lambda console, monitor the function’s invocation metrics to ensure it’s being triggered as expected.
- Validate Firehose Delivery:
- In the Firehose console, review the monitoring graphs and logs to confirm that records are being transformed and sent to OpenSearch.
- Verify in OpenSearch:
- Log in to OpenSearch Dashboards.
- Verify that new indices are created and that logs are indexed correctly.
- Set Up Alerts:
- Configure CloudWatch alarms or OpenSearch alerts to notify you of any issues in the pipeline.
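As one example of such an alert, the sketch below builds parameters for a CloudWatch alarm on the transformation function's `Errors` metric. It assumes the boto3 `put_metric_alarm` call and an existing SNS topic for notifications; the function name, alarm name, and threshold are hypothetical.

```python
def lambda_error_alarm_args(function_name, sns_topic_arn):
    """Keyword arguments for an alarm that fires on any Lambda error in 5 minutes."""
    return {
        "AlarmName": f"{function_name}-errors",
        "Namespace": "AWS/Lambda",
        "MetricName": "Errors",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Statistic": "Sum",
        "Period": 300,  # seconds
        "EvaluationPeriods": 1,
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",
        # No data points means no invocations, which is not treated as a failure here.
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],
    }
```

With boto3 available, the result could be passed as `boto3.client("cloudwatch").put_metric_alarm(**args)`. The same pattern applies to metrics from the Kinesis and Firehose stages by changing the namespace, metric name, and dimensions.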
Conclusion: By completing these steps, you have set up a pipeline that monitors and analyzes log data in near real time. You can now use OpenSearch to gain insight into your application’s performance and troubleshoot issues.