Upgrade the Upload Flow to Save File Metadata with DynamoDB

Save upload metadata from an S3-triggered Lambda function into a DynamoDB table with a composite primary key.

25 min

Introductory

AWS Free Tier

All services used in this lesson are covered by the AWS Free Tier.

AWS Services Used

DynamoDB: 25 GB + 25 RCU/WCU always free
Lambda: 1M requests/month always free

Learning Outcomes

By the end of this lesson, you will be able to:

Save upload metadata from an S3-triggered Lambda function into DynamoDB.
Explain why DynamoDB is a good fit for simple metadata records in a serverless workflow.
Use a composite primary key to store metadata by bucket and object key.
Use Lambda environment variables for table configuration.
Explain one overwrite risk with PutItem and one way to reduce it.

Key Terms

DynamoDB table: A collection of items. Each item is uniquely identified by its primary key. DynamoDB supports a simple primary key or a composite primary key made of a partition key and sort key.
Item: A collection of attributes in a DynamoDB table. Basic CRUD operations include PutItem, GetItem, UpdateItem, and DeleteItem.
Partition key / sort key: In a composite primary key, the partition key groups related items and the sort key distinguishes items within that partition.
Environment variable: A configuration value stored on the Lambda function, useful for passing operational parameters like a table name without hard-coding them in code.

The Core Idea

Your S3 event already contains useful metadata, including the bucket name, object key, object size, eTag, event time, and a sequencer value. The object key in the event is URL-encoded, so your Lambda code should decode it before saving it.
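As a small illustration of the decoding step, the sketch below uses Python's standard `urllib.parse.unquote_plus`. The helper name `decode_object_key` and the sample key are hypothetical and chosen only for this example.

```python
from urllib.parse import unquote_plus


def decode_object_key(raw_key: str) -> str:
    """Decode an object key as it appears in an S3 event.

    S3 URL-encodes the key in the event payload, and spaces arrive as '+'.
    """
    return unquote_plus(raw_key, encoding="utf-8")


# Hypothetical key, used only for illustration.
encoded_key = "incoming/my+notes+%281%29.txt"
print(decode_object_key(encoded_key))
```

Without this step, a file uploaded as `my notes (1).txt` would be stored under its encoded form, and later lookups by the real key would miss it.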

In this lesson, you will keep the S3 trigger from the previous lesson, then upgrade the Lambda function so it writes one metadata record into DynamoDB for each uploaded object. DynamoDB is a strong fit here because it is schemaless beyond the primary key attributes, and PutItem gives you a simple way to create an item.

What You Will Build

You will create:

One DynamoDB table for upload metadata
One Lambda environment variable named TABLE_NAME
One small IAM permission update so Lambda can write to DynamoDB
One Lambda function that stores:
- bucket
- object_key
- size
- etag
- event_time
- event_name
- sequencer

Table Design for this Lesson

Use this DynamoDB key design:

Partition key: bucket (String)
Sort key: object_key (String)

Why this design works:

Many uploads can belong to the same bucket.
Each object key distinguishes one upload record within that bucket.
Composite keys are a common DynamoDB pattern when related items share a grouping value (the partition key) and each item needs its own identifier within that group (the sort key).

A simple table like this is also flexible because DynamoDB does not require you to predefine non-key attributes and their data types ahead of time.

Part 1: Create the DynamoDB Table

Open DynamoDB and create a table with:

Table name: upload_metadata
Partition key: bucket (String)
Sort key: object_key (String)

When you create a DynamoDB table, you must specify the primary key. For a composite primary key, you provide both the partition key and the sort key.
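If you prefer to script this step instead of using the console, the same table can be described in code. The following is a minimal sketch, not the lesson's required method. The helper names are hypothetical, and the provisioned capacity of 1 read unit and 1 write unit is an assumption made for this example.

```python
def build_table_definition(table_name: str = "upload_metadata") -> dict:
    """Return keyword arguments for the DynamoDB CreateTable call."""
    return {
        "TableName": table_name,
        # Only key attributes are declared; other attributes need no schema.
        "AttributeDefinitions": [
            {"AttributeName": "bucket", "AttributeType": "S"},
            {"AttributeName": "object_key", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "bucket", "KeyType": "HASH"},       # partition key
            {"AttributeName": "object_key", "KeyType": "RANGE"},  # sort key
        ],
        # Assumption: small provisioned capacity for a learning lab.
        "ProvisionedThroughput": {"ReadCapacityUnits": 1, "WriteCapacityUnits": 1},
    }


def create_table(table_name: str = "upload_metadata") -> dict:
    """Create the table. Requires boto3 and AWS credentials."""
    import boto3  # imported here so the definition can be inspected without boto3

    client = boto3.client("dynamodb")
    return client.create_table(**build_table_definition(table_name))
```

Calling `create_table()` from a machine with suitable credentials should produce the same table as the console steps above.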

Part 2: Update the Lambda Execution Role

Your Lambda function now needs permission to write to the DynamoDB table. A minimal policy for this lesson can look like this:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "WriteMetadataToDynamoDB",
      "Effect": "Allow",
      "Action": ["dynamodb:PutItem"],
      "Resource": "arn:aws:dynamodb:REGION:ACCOUNT_ID:table/upload_metadata"
    },
    {
      "Sid": "WriteLogs",
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "*"
    }
  ]
}

Attach that to the Lambda execution role, replacing the placeholders with your account details. The role is the identity Lambda uses to call AWS services on your behalf.
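One way to avoid typos when replacing the placeholders is to generate the policy document from your own values. This is a hedged sketch: the function name is hypothetical, and the region and account ID in the usage line are made-up example values.

```python
import json


def build_execution_policy(region: str, account_id: str,
                           table_name: str = "upload_metadata") -> dict:
    """Fill in the policy from this lesson with concrete values."""
    table_arn = f"arn:aws:dynamodb:{region}:{account_id}:table/{table_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "WriteMetadataToDynamoDB",
                "Effect": "Allow",
                "Action": ["dynamodb:PutItem"],
                "Resource": table_arn,
            },
            {
                "Sid": "WriteLogs",
                "Effect": "Allow",
                "Action": [
                    "logs:CreateLogGroup",
                    "logs:CreateLogStream",
                    "logs:PutLogEvents",
                ],
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    # Example values only; substitute your own region and account ID.
    print(json.dumps(build_execution_policy("us-east-1", "123456789012"), indent=2))
```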

Part 3: Add an Environment Variable

Set a Lambda environment variable:

Key: TABLE_NAME
Value: upload_metadata

AWS recommends using environment variables to pass operational parameters instead of hard-coding them. Lambda environment variables are stored as function configuration and are available to your code at runtime.
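A missing variable is an easy mistake to make, so it can help to fail with a clear message. The sketch below shows one possible way to read the value. The helper `get_table_name` is hypothetical and not part of the lesson's function.

```python
import os


def get_table_name(env=os.environ) -> str:
    """Read TABLE_NAME from the environment and fail loudly if it is missing."""
    name = env.get("TABLE_NAME", "").strip()
    if not name:
        raise RuntimeError("TABLE_NAME environment variable is not set")
    return name
```

Accepting the environment as a parameter is a design choice for this example: it lets you check the helper with a plain dictionary, such as `get_table_name({"TABLE_NAME": "upload_metadata"})`, without touching real configuration.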

Part 4: Update the Lambda Code

Replace the simple logging-only function with this version:

import json
import os
import urllib.parse
import boto3

# Create the table handle once, outside the handler, so warm invocations reuse it.
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table(os.environ["TABLE_NAME"])

def lambda_handler(event, context):
    records = event.get("Records", [])
    written = 0

    for record in records:
        # Skip anything that is not an S3 event record.
        if "s3" not in record:
            continue

        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded (spaces as "+"), so decode before saving.
        object_key = urllib.parse.unquote_plus(
            record["s3"]["object"]["key"],
            encoding="utf-8"
        )

        # bucket and object_key form the composite primary key;
        # the remaining fields are ordinary attributes.
        item = {
            "bucket": bucket,
            "object_key": object_key,
            "size": int(record["s3"]["object"].get("size", 0)),
            "etag": record["s3"]["object"].get("eTag", ""),
            "event_time": record.get("eventTime", ""),
            "event_name": record.get("eventName", ""),
            "sequencer": record["s3"]["object"].get("sequencer", "")
        }

        # PutItem replaces any existing item that has the same primary key.
        table.put_item(Item=item)
        print(f"Saved metadata for {bucket}/{object_key}")
        written += 1

    return {
        "statusCode": 200,
        "body": json.dumps({"written": written})
    }

Why this code works:

The S3 event carries the bucket name, object key, size, eTag, event time, and sequencer.
The object key must be URL-decoded before you treat it like a normal path or filename.
PutItem creates the DynamoDB item for the metadata record.
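To see the mapping from event to item without deploying anything, you can run the same extraction logic against a trimmed sample record. The sketch below is an illustration only. The function `record_to_item` is a hypothetical helper, and every value in `SAMPLE_RECORD` is made up, though the field names follow the ones the handler reads.

```python
from urllib.parse import unquote_plus


def record_to_item(record: dict) -> dict:
    """Build the metadata item from one S3 event record (no AWS calls)."""
    s3_object = record["s3"]["object"]
    return {
        "bucket": record["s3"]["bucket"]["name"],
        "object_key": unquote_plus(s3_object["key"], encoding="utf-8"),
        "size": int(s3_object.get("size", 0)),
        "etag": s3_object.get("eTag", ""),
        "event_time": record.get("eventTime", ""),
        "event_name": record.get("eventName", ""),
        "sequencer": s3_object.get("sequencer", ""),
    }


# Made-up sample record containing only the fields used in this lesson.
SAMPLE_RECORD = {
    "eventTime": "2024-01-01T12:00:00.000Z",
    "eventName": "ObjectCreated:Put",
    "s3": {
        "bucket": {"name": "example-upload-bucket"},
        "object": {
            "key": "incoming/my+notes.txt",
            "size": 1024,
            "eTag": "0123456789abcdef0123456789abcdef",
            "sequencer": "0A1B2C3D4E5F678901",
        },
    },
}

print(record_to_item(SAMPLE_RECORD))
```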

Part 5: Test the Flow

Upload a file like incoming/notes.txt. Because your S3 trigger already listens for uploads in incoming/, Lambda should run asynchronously and process the event.

Then verify two places:

CloudWatch Logs: You should see a log line like Saved metadata for ... from the function.
DynamoDB Table Items: You should see one item with keys bucket and object_key, plus the extra attributes.
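You can also check the item from a script rather than the console. The following is a minimal sketch that assumes `table` is a boto3 DynamoDB `Table` resource. The helper name `fetch_metadata` is hypothetical. Note that the execution role policy in this lesson grants only `dynamodb:PutItem`, so this read would need to run with credentials that allow `dynamodb:GetItem`.

```python
def fetch_metadata(table, bucket: str, object_key: str):
    """Look up one metadata item by its full composite key.

    Returns the item as a dict, or None if no item has that key.
    """
    response = table.get_item(Key={"bucket": bucket, "object_key": object_key})
    return response.get("Item")


# Example usage (requires boto3, credentials, and your own bucket name):
#   import boto3
#   table = boto3.resource("dynamodb").Table("upload_metadata")
#   print(fetch_metadata(table, "your-bucket-name", "incoming/notes.txt"))
```

Both key attributes are required here because GetItem needs the complete primary key.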

Important Note about Overwrites

PutItem writes an item for the key you provide. If an item with the same primary key already exists, the new write replaces the whole existing item.

For this lesson, that means:

If the same bucket and object key are uploaded again, you may overwrite the previous metadata row.
That is acceptable for a learning lab.
Later, you can add a condition expression to the PutItem call, or include a version or timestamp in the key, if you want every event stored separately.
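As a sketch of the condition approach, the helper below writes an item only when no item with the same primary key exists yet. It assumes `table` is a boto3 DynamoDB `Table` resource. The function name `put_if_absent` is hypothetical, and the error handling is one possible style, not the only one.

```python
def put_if_absent(table, item: dict) -> bool:
    """Write the item only if its primary key is not already in the table.

    Returns True if the item was written, False if one already existed.
    """
    try:
        table.put_item(
            Item=item,
            # The condition is evaluated against the existing item with the
            # same primary key, so checking one key attribute is enough.
            ConditionExpression="attribute_not_exists(object_key)",
        )
        return True
    except Exception as exc:
        # boto3 raises botocore.exceptions.ClientError; the error code is
        # carried in its `response` dictionary.
        code = getattr(exc, "response", {}).get("Error", {}).get("Code")
        if code == "ConditionalCheckFailedException":
            return False
        raise
```

This keeps the first record for a key and ignores later ones. If you would rather keep the latest record, a different condition would be needed.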

Optional Improvement: Think about Ordering and Duplicates

AWS notes two useful things for S3 event processing:

Event notifications are not guaranteed to arrive in the exact order events occurred.
The sequencer field can help compare order for events on the same object key.

Separately, Lambda best practices recommend writing idempotent code, because duplicate processing can happen in event-driven systems.

For now, just store the sequencer field so you have it available later.
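When you do come back to ordering, a comparison of two stored sequencer values might look like the sketch below. It rests on assumptions that go beyond this lesson: that the sequencer is a hexadecimal string, that values may differ in length, and that right-padding the shorter one with zeros before comparing is an acceptable way to order them. The function name `is_newer` is hypothetical. Only compare values that belong to the same object key.

```python
def is_newer(candidate: str, stored: str) -> bool:
    """Return True if `candidate` sorts after `stored`.

    Assumption: both are hexadecimal sequencer strings for the SAME object
    key. The shorter value is right-padded with zeros so that the two
    strings can be compared character by character.
    """
    width = max(len(candidate), len(stored))
    left = candidate.upper().ljust(width, "0")
    right = stored.upper().ljust(width, "0")
    return left > right
```

A function like this could later guard writes so that an older, late-arriving event does not replace metadata from a newer one.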

Lab Checklist

Step	Success Condition
Create DynamoDB table	`upload_metadata` exists
Add Lambda permissions	Execution role can call `dynamodb:PutItem`
Add environment variable	`TABLE_NAME=upload_metadata` is set
Update Lambda code	Function writes item to DynamoDB
Upload a test file	S3 trigger fires
Check logs	Function logs success
Check table	One metadata item appears

Micro-activity 1: Inspect the Saved Item

Think about it

After your test upload, check the saved item: What bucket value was saved? What object key? What file size? What event name? Was a sequencer value stored? These values come from the S3 event structure that Lambda receives.

Micro-activity 2: Match DynamoDB Operations

Match each DynamoDB operation to what it does

Summary

In this lesson, you upgraded a simple S3-to-Lambda workflow into a real metadata pipeline. S3 sent the upload event, Lambda parsed the event, and DynamoDB stored a structured metadata record.

You also used two good serverless habits:

Storing configuration in a Lambda environment variable instead of hard-coding it.
Using a DynamoDB primary key that matches the shape of the event data you are saving.

Quiz

Knowledge Check

Which AWS service is used in this lesson to store metadata records?