Overview

Spilled is a network observability platform for collecting, storing, and querying network telemetry. It is designed for interactive investigation: instead of relying only on predefined dashboards, operators can ask questions of their network data as an incident unfolds.

Spilled uses a Bring Your Own Cloud (BYOC) deployment model. Fusion, the Spilled executable described below, runs in your cloud and handles telemetry ingestion and querying, while the control plane remains managed by Spilled. Raw telemetry data is stored in your own object storage, such as Amazon S3. Object storage scales independently of compute, which makes petabyte-scale retention practical while keeping storage costs manageable.

Fusion is the stateless Spilled executable responsible for both flow ingestion and query execution.

Fusion can be deployed anywhere you can run container images, including Kubernetes, Nomad, Docker, and cloud platforms that support container workloads.

Deploying Fusion

Fusion is the all-in-one runtime for smaller workloads. It combines flow ingestion and query execution in a single deployment.

Fusion deployments are stateless. They can be replaced or rescheduled freely because durable data lives in object storage.

The exact image name depends on how you publish Fusion. The example below uses a placeholder image reference and the default ingest ports.

docker run -d \
  --name spilled-fusion \
  -p 2055:2055/udp \
  -p 2056:2056/udp \
  -p 6343:6343/udp \
  -e AWS_ACCESS_KEY_ID=<access-key> \
  -e AWS_SECRET_ACCESS_KEY=<secret-key> \
  -e AWS_ENDPOINT_URL_S3=<s3-endpoint> \
  -e AWS_REGION=<region> \
  -e SPX_INGEST_API_KEY=<ingest-api-key> \
  -e SPX_QUERY_API_KEY=<query-api-key> \
  -e SPX_CLUSTER_ID=<cluster-id> \
  -e SPX_BUCKET_URL=<bucket-url> \
  <image>

This configuration gives Fusion credentials for object storage, access to the Spilled control plane, and UDP listeners for incoming telemetry.

Required configuration

The following environment variables are required for a standard Fusion deployment.

Variable                Description
AWS_ACCESS_KEY_ID       Access key for the S3-compatible object store.
AWS_SECRET_ACCESS_KEY   Secret key for the S3-compatible object store.
AWS_ENDPOINT_URL_S3     Object storage endpoint URL. Required for S3-compatible providers.
AWS_REGION              Object storage region. Some providers use values such as auto.
SPX_INGEST_API_KEY      Credential used by the ingest path.
SPX_QUERY_API_KEY       Credential used by the query path.
SPX_CLUSTER_ID          Spilled cluster identifier.
SPX_BUCKET_URL          Bucket URL used for persisted flow data, for example s3://spilled?region=us-east-1.
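
A preflight check can catch a missing variable before the container starts. The sketch below is one possible approach, not part of Spilled itself: the function name missing_fusion_vars and the REQUIRED_VARS list are hypothetical helpers, and the list simply mirrors the table above. If your provider does not need AWS_ENDPOINT_URL_S3, you may want to drop it from the list.

```shell
#!/bin/sh
# Preflight sketch: report which Fusion variables are unset or empty.
# The variable names come from the table above; everything else is illustrative.

REQUIRED_VARS="AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_ENDPOINT_URL_S3 AWS_REGION SPX_INGEST_API_KEY SPX_QUERY_API_KEY SPX_CLUSTER_ID SPX_BUCKET_URL"

# Prints the name of each missing variable on its own line.
# Returns 0 when every variable is set and non-empty, 1 otherwise.
missing_fusion_vars() {
  missing=0
  for name in $REQUIRED_VARS; do
    # Look up the value of the variable named by $name.
    # eval is acceptable here because the names come from the fixed list above.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      printf '%s\n' "$name"
      missing=1
    fi
  done
  return "$missing"
}
```

A wrapper script could call missing_fusion_vars and refuse to run docker if it returns a non-zero status, printing the reported names so the operator knows what to fix.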

Ingress ports

By default, Fusion listens on the following UDP ports.

Port   Protocol   Purpose
2055   UDP        IPFIX ingestion.
2056   UDP        NetFlow ingestion.
6343   UDP        sFlow ingestion.

Only expose the protocols you intend to receive. If flow exporters are internal to the same VPC or cluster network, keep these listeners private.
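
One way to publish only the listeners you need is to generate the docker -p flags from a list of protocol names. This is a minimal sketch: the helper names fusion_port and fusion_publish_flags are hypothetical, and the port numbers are the defaults listed above.

```shell
#!/bin/sh
# Sketch: build docker publish flags for a chosen subset of ingest protocols.

# Map a protocol name to its default Fusion UDP port (values from the table above).
fusion_port() {
  case "$1" in
    ipfix)   echo 2055 ;;
    netflow) echo 2056 ;;
    sflow)   echo 6343 ;;
    *)       echo "unknown protocol: $1" >&2; return 1 ;;
  esac
}

# Print one "-p <port>:<port>/udp" flag per protocol given as an argument.
# Fails without printing flags if any protocol name is not recognized.
fusion_publish_flags() {
  flags=""
  for proto in "$@"; do
    port="$(fusion_port "$proto")" || return 1
    flags="$flags -p $port:$port/udp"
  done
  # Strip the leading space before printing.
  printf '%s\n' "${flags# }"
}
```

For example, a deployment that only receives sFlow could substitute the output of fusion_publish_flags sflow for the three -p lines in the docker run command shown earlier.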

Object storage

Object storage is the durable backing store for Spilled. We recommend using a dedicated bucket for your Spilled data. All data written by Fusion is stored under `/spilled`.

Bucket URL

SPX_BUCKET_URL identifies the bucket Spilled should use for persisted data. The exact URL shape depends on the object store implementation.

Provider                     Example
AWS S3                       s3://my-bucket?region=us-east-1
Cloudflare R2                s3://my-bucket?region=auto
Other S3-compatible stores   s3://my-bucket?region=<region>

When using an S3-compatible provider, set AWS_ENDPOINT_URL_S3 to the provider-specific API endpoint. This allows Fusion to talk to the correct object storage service even when it is not AWS S3 itself.
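
The examples in the table share one shape, s3://<bucket>?region=<region>. The helper below is a small sketch that assembles a value of that shape; the function name fusion_bucket_url is hypothetical, and it assumes your provider accepts this form.

```shell
#!/bin/sh
# Sketch: build an SPX_BUCKET_URL value of the form s3://<bucket>?region=<region>.

fusion_bucket_url() {
  bucket="${1:-}"
  region="${2:-}"
  # Both parts are needed; fail loudly rather than emit a partial URL.
  if [ -z "$bucket" ] || [ -z "$region" ]; then
    echo "usage: fusion_bucket_url <bucket> <region>" >&2
    return 1
  fi
  printf 's3://%s?region=%s\n' "$bucket" "$region"
}
```

The result can be exported as SPX_BUCKET_URL before starting Fusion. For a non-AWS provider, AWS_ENDPOINT_URL_S3 still has to be set separately, since the endpoint is not part of the bucket URL in the examples above.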

Permissions

Fusion needs permission to create, read, list, and delete objects in the bucket. Those operations are required for normal writes, reads, and cleanup workflows such as compaction and retention.

In production, prefer workload identity, instance roles, or another short-lived credential mechanism instead of long-lived static access keys.
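
On AWS, the four operations above correspond roughly to the IAM actions s3:PutObject, s3:GetObject, s3:ListBucket, and s3:DeleteObject. The sketch below prints a policy document granting them on a single bucket. It is an assumption about what a minimal policy could look like, not a policy published by Spilled: the function name fusion_bucket_policy is hypothetical, and Fusion may need further actions that this page does not list.

```shell
#!/bin/sh
# Sketch: print an AWS IAM policy document for one bucket.
# Assumption: the four actions below are sufficient; verify against your deployment.

fusion_bucket_policy() {
  bucket="${1:-}"
  if [ -z "$bucket" ]; then
    echo "usage: fusion_bucket_policy <bucket>" >&2
    return 1
  fi
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::${bucket}"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::${bucket}/*"
    }
  ]
}
EOF
}
```

Listing is granted on the bucket itself, while object reads, writes, and deletes are granted on the keys inside it, which is how IAM separates bucket-level and object-level actions. Attaching the policy to a role rather than a user fits the short-lived credential guidance above.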

Network placement

Keep object storage traffic on private networking where possible. For example, use an S3 VPC endpoint or the equivalent private path provided by your cloud. This reduces latency, avoids unnecessary egress, and keeps telemetry movement inside your infrastructure boundary.

Production guidance

High availability

Run more than one Fusion instance in production. Because Fusion instances share object storage and do not depend on host-local durability, horizontal scaling is usually simpler than scaling up a single large instance.

Failure handling

Health checks and restart policies should operate at the container level. If either of Fusion's internal components (fluid or siphon) stops making progress, the safest default is to replace the whole Fusion instance.

Secrets management

Store Spilled API keys and object storage credentials in your platform's secret manager. Avoid baking them into images or committing static environment files to source control.

Capacity planning

Fusion sizing is mostly a function of ingest rate, query concurrency, and object storage latency. Start with a small replicated deployment, observe queueing and query latency under real traffic, and scale out before scaling up.

Next documentation to add

This page covers the base deployment model. The most useful follow-up pages would cover operational details:

  • Publishing and versioning the Fusion image.
  • Kubernetes deployment examples.
  • Provider-specific object storage configuration.
  • Monitoring, health checks, and operational troubleshooting.