Tuesday, May 16, 2017

Accessing Files in S3 via a Lambda Function in a VPC using an S3 Endpoint

This post explores creating a Lambda function inside a VPC that retrieves a file from an S3 bucket over an S3 endpoint. The Lambda function below is written in Python.

Why Use a VPC S3 Endpoint? 

An S3 VPC endpoint creates a private connection between the specified VPC and the S3 service, so traffic over it stays within the Amazon network. By creating the appropriate policies on our bucket and on the role used by our Lambda function, we can force requests for files in the bucket to travel over the S3 endpoint. If we only allow GetObject via the endpoint, any request for a file must come from within our VPC. By putting a policy on the VPC endpoint, we can limit which S3 actions the Lambda role can take on our bucket over that connection, and by adding further restrictions to the bucket policy we can limit who can upload to the bucket, require MFA, or restrict requests to specific IP addresses. All of these controls work together to protect the data in the bucket.

Of course, anyone who has permission to change these policies can remove the restrictions and get to the data in the bucket, so grant permission to change permissions sparingly and consider segregation of duties.

For a more detailed explanation of how data flows via an S3 endpoint see this post:


CloudFormation Templates

The following resources need to be created before we can write the Lambda function and run our test:

VPC: 
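A minimal sketch of the VPC resource, written as if all the resources in this section lived in a single template (logical names, CIDR blocks, and similar values below are placeholders, not the values from the original stack):

Resources:
  FireboxVPC:                      # placeholder logical name
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16       # placeholder CIDR
      EnableDnsSupport: true
      EnableDnsHostnames: true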

Subnet and Security group for Lambda:
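Again a sketch, assuming a single private subnet and a security group whose only egress rule points at the S3 prefix list (the prefix list ID is a placeholder - look up the one for your region, as described further down):

  LambdaSubnet:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref FireboxVPC
      CidrBlock: 10.0.1.0/24                    # placeholder CIDR

  LambdaSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Firebox Lambda security group
      VpcId: !Ref FireboxVPC
      SecurityGroupEgress:
        - IpProtocol: tcp
          FromPort: 443
          ToPort: 443
          DestinationPrefixListId: pl-xxxxxxxx  # S3 prefix list ID for your region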

S3 bucket: 
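A sketch of the bucket resource (the bucket name is a placeholder):

  FireboxBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: firebox-private-bucket        # placeholder name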

Lambda Role:
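A sketch of the execution role, assuming the AWS-managed AWSLambdaVPCAccessExecutionRole policy for the ENI and logging permissions plus an inline policy that allows GetObject on our bucket:

  FireboxLambdaRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: lambda.amazonaws.com
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole
      Policies:
        - PolicyName: firebox-bucket-read       # placeholder policy name
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action: s3:GetObject
                Resource: !Sub "arn:aws:s3:::${FireboxBucket}/*"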

S3 Bucket Policy that allows our Lambda role to access the bucket:
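A sketch of the bucket policy, granting the Lambda role GetObject on the keys and ListBucket on the bucket itself (the two resource ARNs called out in the Access Denied section below):

  FireboxBucketPolicy:
    Type: AWS::S3::BucketPolicy
    Properties:
      Bucket: !Ref FireboxBucket
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Sid: lambda-role-get
            Effect: Allow
            Principal:
              AWS: !GetAtt FireboxLambdaRole.Arn
            Action: s3:GetObject
            Resource: !Sub "arn:aws:s3:::${FireboxBucket}/*"
          - Sid: lambda-role-list
            Effect: Allow
            Principal:
              AWS: !GetAtt FireboxLambdaRole.Arn
            Action: s3:ListBucket
            Resource: !Sub "arn:aws:s3:::${FireboxBucket}"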

S3 Endpoint, with a route added to the subnet route table, and an S3 Endpoint Policy that grants access to the bucket:
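A sketch of the route table, its association with the Lambda subnet, and the S3 endpoint whose policy only allows GetObject on our bucket. Note the wildcard principal in the endpoint policy; as discussed in the troubleshooting section below, a role ARN will not work there:

  LambdaRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref FireboxVPC

  LambdaSubnetRouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref LambdaSubnet
      RouteTableId: !Ref LambdaRouteTable

  FireboxS3Endpoint:
    Type: AWS::EC2::VPCEndpoint
    Properties:
      ServiceName: !Sub "com.amazonaws.${AWS::Region}.s3"
      VpcId: !Ref FireboxVPC
      RouteTableIds:
        - !Ref LambdaRouteTable
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Sid: allow-get-from-vpc
            Effect: Allow
            Principal: "*"
            Action: s3:GetObject
            Resource: !Sub "arn:aws:s3:::${FireboxBucket}/*"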

Lambda Function:
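A sketch of the function resource, assuming the deployment package has already been uploaded to a separate deployment bucket (bucket, key, handler module, and function names below are placeholders). The VpcConfig block and the Bucket environment variable are the parts that matter for this post:

  FireboxLambdaFunction:
    Type: AWS::Lambda::Function
    Properties:
      FunctionName: configure-firebox           # placeholder name
      Runtime: python2.7
      Handler: firebox.configure_firebox        # module name assumed
      Role: !GetAtt FireboxLambdaRole.Arn
      Timeout: 60
      Code:
        S3Bucket: my-deployment-bucket          # placeholder deployment bucket
        S3Key: firebox-lambda.zip               # placeholder key
      Environment:
        Variables:
          Bucket: !Ref FireboxBucket
      VpcConfig:
        SubnetIds:
          - !Ref LambdaSubnet
        SecurityGroupIds:
          - !Ref LambdaSecurityGroup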

Resources:

The S3 Endpoint was created and assigned to the desired route table. The Route Tables tab displays the reference to the subnet and route table where the route has been added for the S3 VPC Endpoint.


A look at the route table for the subnet shows the route listed for the S3 endpoint - a route that allows access to S3 via a Prefix List.

Wait...what's a Prefix List? The service behind a VPC endpoint (S3, in this case) is identified by a prefix list - the name and ID of the service for a region. A prefix list ID takes the form pl-xxxxxxx, and that ID needs to be added to the outbound rules of the security group so that resources in the security group can reach the service (in this case S3 in the Oregon or us-west-2 region). Basically, it appears to allow traffic to that service to be routed within the AWS network.
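
If you want to find the prefix list ID for your region programmatically instead of reading it off the console, a quick boto3 lookup (a sketch; the region and filter value are assumptions) looks like this:

import boto3

# Look up the prefix list for S3 in us-west-2 (Oregon); the name
# follows the com.amazonaws.<region>.<service> convention.
ec2 = boto3.client('ec2', region_name='us-west-2')
response = ec2.describe_prefix_lists(
    Filters=[{'Name': 'prefix-list-name', 'Values': ['com.amazonaws.us-west-2.s3']}]
)
for prefix_list in response['PrefixLists']:
    print(prefix_list['PrefixListId'], prefix_list['PrefixListName'])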



A security group was created and an egress (outbound) rule was added to it that allows access to S3 via the Prefix List.



The Lambda function shows that it has been created inside a VPC, using our subnet and specified security group. You'll notice one thing that's a bit odd - the destination for our S3 VPC Endpoint rule is blank. But will it still work?


Lambda Python Code

The Lambda Python code retrieves a file from the bucket - in my case an EC2 key file used for administration purposes:

from __future__ import print_function
import boto3
import os
import subprocess  # presumably used in the portion of the function elided below

def configure_firebox(event, context):

    # Create an S3 client; from inside the VPC this request is routed
    # over the S3 endpoint rather than the public internet.
    s3 = boto3.client('s3')

    # The bucket name is passed in through a Lambda environment variable.
    bucket = os.environ['Bucket']
    key = "firebox-cli-ec2-key.pem"

    # Retrieve the key file from the bucket.
    response = s3.get_object(Bucket=bucket, Key=key)
    ...


Success

If you have successfully created your networking and policies, you will be able to access the files in your bucket over the S3 endpoint (and only over the S3 endpoint if you desire). In fact you can restrict access to the bucket to the S3 endpoint only using the following policy:

{
   "Version": "2012-10-17",
   "Id": "Policy1415115909152",
   "Statement": [
     {
       "Sid": "Access-to-specific-VPCE-only",
       "Action": "s3:*",
       "Effect": "Deny",
       "Resource": ["arn:aws:s3:::examplebucket",
                    "arn:aws:s3:::examplebucket/*"],
       "Condition": {
         "StringNotEquals": {
           "aws:sourceVpce": "vpce-1a2b3c4d"
         }
       },
       "Principal": "*"
     }
   ]
}


Time Out

If you get a timeout error, likely one of the networking rules is not set up correctly. 

  • Make sure the route is in the route table associated with the subnet used by the Lambda function.
  • Make sure the Security Group used by the Lambda function has an outbound rule to the S3 Prefix List.
  • Make sure the Lambda function is assigned to the subnet and security group that contain the rules shown above.
  • Make sure the S3 endpoint policy allows access to the bucket for the Lambda role. I learned from AWS that you cannot use a role as the principal in an S3 endpoint policy. It seems logical that you should be able to, and this stumps a lot of people who simply change the ARN to the role ARN. Sorry...it doesn't work. You can, however, limit the actions that can be taken on the bucket, such as GetObject, PutObject, and DeleteObject - see the example policy just after this list.
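
For example, an endpoint policy along these lines (a sketch - the bucket name and the allowed actions are illustrative) uses a wildcard principal but still restricts what can be done on the bucket through the endpoint:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "limit-actions-via-endpoint",
      "Effect": "Allow",
      "Principal": "*",
      "Action": [
        "s3:GetObject"
      ],
      "Resource": "arn:aws:s3:::examplebucket/*"
    }
  ]
}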

Access Denied

If this error occurs:
An error occurred (AccessDenied) when calling the GetObject operation: Access Denied: ClientError
Check the following:
  • The bucket policy allows s3:GetObject for the Lambda role on the individual keys (files): "arn:aws:s3:::MyExampleBucket/*"
  • The bucket policy allows s3:ListBucket for the Lambda role on the bucket itself: "arn:aws:s3:::MyExampleBucket"
  • The Lambda role allows access to the S3 bucket.
  • Make sure the file name and the bucket name are correct.
  • Make sure you have a principal in your S3 Endpoint Policy. At the time of this writing, CloudFormation allows you to create the policy without a principal, which results in this error.

Timeout Errors

There seems to be an issue around 8:30 p.m. PST right now which AWS is working to fix and will likely be resolved soon. See this post: http://websitenotebook.blogspot.com/2017/07/timeout-connecting-to-s3-endpoint-from.html