TL;DR

Deploying a LangGraph agent to AWS Lambda is a fit when traffic is bursty, you want pay-per-call billing, and your graph runs in under 15 minutes. Steps:

  1. Package the agent code + dependencies as a container image (faster cold starts than zip for graphs > 50 MB).
  2. Set Lambda memory to 1024–3008 MB. Lambda allocates CPU in proportion to memory, so more memory speeds up imports and local processing (latency of the model API calls themselves is unaffected).
  3. Set timeout to your graph’s longest expected run (max 15 min).
  4. Pull secrets from AWS Secrets Manager / Parameter Store at cold start, cache in module scope.
  5. Use Provisioned Concurrency on the entry handler if multi-second cold starts are unacceptable.
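Step 3 sets a hard ceiling, so it can help to stop a run cleanly before Lambda kills it. The following is a minimal sketch, assuming the compiled graph exposes `stream()` and that the Lambda `context` object is passed through. The `run_with_deadline` helper and the `RESERVE_MS` margin are hypothetical names chosen for illustration.

```python
from typing import Any

RESERVE_MS = 10_000  # hypothetical safety margin left for cleanup and the response


def run_with_deadline(graph: Any, state: dict, context: Any,
                      reserve_ms: int = RESERVE_MS) -> dict:
    """Collect graph updates, stopping once the Lambda deadline is near.

    The check runs after each streamed update, so a single long step cannot
    be interrupted midway. A run whose final step ends inside the reserve
    window is also reported as "partial".
    """
    steps = []
    for update in graph.stream(state):
        steps.append(update)
        # get_remaining_time_in_millis() is provided by the Lambda context object
        if context.get_remaining_time_in_millis() < reserve_ms:
            return {"status": "partial", "steps": steps}
    return {"status": "complete", "steps": steps}
```

Returning a partial result lets the caller decide whether to retry or resume, instead of receiving a generic timeout error.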

When Lambda is the right fit

Lambda fits LangGraph agents when traffic is bursty, you want pay-per-call billing, and every graph run finishes in under 15 minutes.

Lambda is not the right fit when a single graph run can exceed the 15-minute maximum timeout.

Packaging

Use a container image, not a zip. LangGraph plus LangChain plus your model SDK plus a vector store client easily exceeds the 250 MB unzipped Lambda zip limit. Container images go up to 10 GB.

A minimal Dockerfile:

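# AWS-provided Lambda base image for Python 3.12
FROM public.ecr.aws/lambda/python:3.12
# Install dependencies first so this layer stays cached when only code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy the agent package into the Lambda task root
COPY app/ ${LAMBDA_TASK_ROOT}/app/
# Handler in module.function form: app/handler.py, function lambda_handler
CMD ["app.handler.lambda_handler"]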
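The `CMD` line above points at `app/handler.py`. Here is a minimal sketch of what that module might look like. It assumes an API Gateway or function URL style event with a JSON string in `body`. The `AgentState` fields and the echo node are placeholders rather than a real agent. The graph is built once and reused across warm invocations.

```python
import json
from typing import Any, TypedDict


class AgentState(TypedDict):
    question: str
    answer: str


_graph = None  # compiled graph, cached for the life of the execution environment


def build_graph():
    # Imported lazily so the heavy import happens only when the graph is needed.
    from langgraph.graph import StateGraph, START, END

    def respond(state: AgentState) -> dict:
        # Placeholder node: a real agent would call a model or tools here.
        return {"answer": f"echo: {state['question']}"}

    builder = StateGraph(AgentState)
    builder.add_node("respond", respond)
    builder.add_edge(START, "respond")
    builder.add_edge("respond", END)
    return builder.compile()


def get_graph():
    global _graph
    if _graph is None:
        _graph = build_graph()
    return _graph


def handle(event: dict, graph: Any) -> dict:
    """Run one request through the graph; kept separate so it is easy to test."""
    body = json.loads(event.get("body") or "{}")
    result = graph.invoke({"question": body.get("question", ""), "answer": ""})
    return {"statusCode": 200, "body": json.dumps({"answer": result["answer"]})}


def lambda_handler(event, context):
    return handle(event, get_graph())
```

Keeping `handle` free of global state means the request logic can be exercised locally with a stub graph before the image is ever built.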

Cold start mitigation

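Cold starts on a 2 GB-memory Lambda with LangGraph + LangChain are typically 1.5–3 seconds. To mitigate, use Provisioned Concurrency on the entry handler, and cache secrets at module scope so that setup work is paid only once per cold start.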
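Provisioned Concurrency can be enabled from the console, from infrastructure-as-code, or with boto3. The sketch below uses boto3's `put_provisioned_concurrency_config`. The function name, alias, and instance count are hypothetical values, and the setting has to target a published version or alias rather than `$LATEST`.

```python
def provisioned_concurrency_request(function_name: str, alias: str, instances: int) -> dict:
    """Build the request parameters; separated out so they can be inspected offline."""
    if alias == "$LATEST":
        # Provisioned Concurrency cannot be attached to the unpublished version.
        raise ValueError("use a published version or alias, not $LATEST")
    if instances < 1:
        raise ValueError("instances must be at least 1")
    return {
        "FunctionName": function_name,
        "Qualifier": alias,
        "ProvisionedConcurrentExecutions": instances,
    }


def enable_provisioned_concurrency(function_name: str, alias: str, instances: int) -> dict:
    """Apply the setting; needs AWS credentials with permission to configure the function."""
    import boto3  # imported here so the helper above works without boto3 installed

    client = boto3.client("lambda")
    params = provisioned_concurrency_request(function_name, alias, instances)
    return client.put_provisioned_concurrency_config(**params)


# Hypothetical usage:
# enable_provisioned_concurrency("my-langgraph-agent", "live", 2)
```

Provisioned instances are billed while they are configured, whether or not they serve traffic, so this trades some of the pay-per-call benefit for latency.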

Secrets

Don’t bake API keys into the image. Use Secrets Manager or Parameter Store:

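import json

import boto3

# Created once per execution environment and reused across warm invocations
ssm = boto3.client("ssm")
_secrets = None  # module-scope cache, filled on the first call

def get_secrets():
    """Return the decrypted secrets dict, fetching it from Parameter Store only once."""
    global _secrets
    if _secrets is None:
        response = ssm.get_parameter(Name="/myagent/secrets", WithDecryption=True)
        _secrets = json.loads(response["Parameter"]["Value"])
    return _secrets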

Cache at module scope (top-level _secrets = None) so the SSM call only happens on cold start.
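The same pattern works with Secrets Manager. This is a sketch under the assumption that the secret is stored as a JSON string. The secret name `myagent/secrets` is hypothetical, and the optional `client` argument exists only to make the function testable without AWS access.

```python
import json

_secret_cache: dict = {}  # module-scope cache keyed by secret id


def get_secret(secret_id: str, client=None) -> dict:
    """Fetch and parse a JSON secret once, then serve it from the cache."""
    if secret_id not in _secret_cache:
        if client is None:
            import boto3  # deferred so the module loads even where boto3 is absent

            client = boto3.client("secretsmanager")
        response = client.get_secret_value(SecretId=secret_id)
        _secret_cache[secret_id] = json.loads(response["SecretString"])
    return _secret_cache[secret_id]


# Hypothetical usage inside the handler:
# api_key = get_secret("myagent/secrets")["MODEL_API_KEY"]
```

Because the cache lives for the life of the execution environment, a rotated secret is only picked up after the next cold start unless an expiry is added.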

IAM: scope tool permissions tight

Give the Lambda an IAM role that only allows the AWS APIs the agent’s tools need. If your agent reads from S3, s3:GetObject on one prefix only — never s3:*. The agent will eventually try to do something it shouldn’t; least-privilege is your safety net.
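As an illustration of the S3 case, the sketch below generates a policy document that allows `s3:GetObject` on a single prefix and nothing else. The bucket name, prefix, and `Sid` are hypothetical; the policy structure follows the standard IAM JSON format.

```python
import json


def read_only_prefix_policy(bucket: str, prefix: str) -> dict:
    """Build an IAM policy allowing object reads under one S3 prefix only."""
    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadAgentDocs",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                # Object-level ARN limited to the prefix; no bucket-wide wildcard
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(read_only_prefix_policy("my-agent-bucket", "docs/"), indent=2))
```

Anything the agent's tools attempt outside that prefix is denied by default, since IAM grants nothing that is not explicitly allowed.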

Cost math

For an agent invocation taking 4 seconds at 2 GB, Lambda bills 4 s × 2 GB = 8 GB-seconds of compute plus one request.

Token cost dominates. Lambda overhead is rounding error.
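The comparison can be worked through with a short calculation. The Lambda rates below are the commonly published x86 on-demand prices for us-east-1 and should be checked against current pricing for your region. The token counts and per-million-token prices are purely hypothetical placeholders, not figures from any specific model provider.

```python
# Assumed Lambda list prices (x86, us-east-1); verify before relying on them.
GB_SECOND_USD = 0.0000166667
REQUEST_USD = 0.20 / 1_000_000


def lambda_cost(duration_s: float, memory_gb: float) -> float:
    """Compute charge plus the per-request charge for one invocation."""
    return duration_s * memory_gb * GB_SECOND_USD + REQUEST_USD


def token_cost(input_tokens: int, output_tokens: int,
               input_usd_per_million: float, output_usd_per_million: float) -> float:
    """Model charge for one invocation at the given per-million-token prices."""
    return (input_tokens * input_usd_per_million
            + output_tokens * output_usd_per_million) / 1_000_000


compute = lambda_cost(duration_s=4, memory_gb=2)
# Hypothetical workload: 3,000 input and 500 output tokens at $3 / $15 per million.
tokens = token_cost(3_000, 500, 3.00, 15.00)

print(f"Lambda: ${compute:.6f}")
print(f"Tokens: ${tokens:.6f}")
print(f"Lambda share of total: {compute / (compute + tokens):.1%}")
```

At these assumed rates the Lambda charge is about $0.00013 per invocation, which stays a small fraction of the total unless the token bill is itself well under a cent.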

April 24, 2026 Musketeers Tech