🔶

Amazon Kinesis Data Firehose

Purpose
  • Fully managed service for loading streaming data into AWS destinations without manual infrastructure management.
  • No need to write your own consumer applications; Firehose handles scaling, buffering, transformation, and delivery automatically (see the producer sketch after this list).
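A minimal producer sketch, assuming boto3 with valid AWS credentials and a delivery stream that already exists; the stream name and event fields below are placeholders:

    import json
    import boto3

    firehose = boto3.client("firehose")

    def send_event(event: dict) -> None:
        # Firehose buffers, scales, and delivers; the producer only calls PutRecord.
        firehose.put_record(
            DeliveryStreamName="example-stream",  # placeholder stream name
            Record={"Data": (json.dumps(event) + "\n").encode("utf-8")},
        )

    send_event({"user_id": 123, "action": "click"})

  For higher throughput, PutRecordBatch sends up to 500 records per call instead of one record per request.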
Typical Destinations
  • Amazon S3 (raw storage; see the delivery-stream sketch after this list)
  • Amazon Redshift (via S3 + COPY)
  • Amazon OpenSearch Service (for search and analytics)
  • Custom HTTP endpoints
  • Datadog, Splunk, New Relic (third-party integrations)
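To illustrate the S3 destination, here is a hedged sketch of creating a Direct PUT delivery stream with boto3; the stream name, role ARN, and bucket ARN are placeholders, and the buffering and compression values are examples only:

    import boto3

    firehose = boto3.client("firehose")

    firehose.create_delivery_stream(
        DeliveryStreamName="example-stream",   # placeholder name
        DeliveryStreamType="DirectPut",        # producers write via PutRecord
        ExtendedS3DestinationConfiguration={
            "RoleARN": "arn:aws:iam::123456789012:role/firehose-delivery-role",  # placeholder
            "BucketARN": "arn:aws:s3:::example-bucket",                          # placeholder
            "Prefix": "raw/",                                                    # S3 key prefix
            # Flush whichever buffering threshold is reached first (size or interval).
            "BufferingHints": {"SizeInMBs": 5, "IntervalInSeconds": 60},
            "CompressionFormat": "GZIP",       # compress objects before storage
        },
    )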
Key Features
  • Near real-time delivery: records are buffered and flushed when the buffer interval (e.g., 60 seconds) or the buffer size threshold is reached, whichever comes first.
  • Automatic scaling: no shard management, unlike Kinesis Data Streams.
  • Data transformation: optional AWS Lambda function for filtering, enrichment, or reshaping records (e.g., CSV → JSON); a sketch follows this list.
  • Data format conversion: built-in conversion of JSON records to Apache Parquet or ORC for analytics efficiency (schema defined in the AWS Glue Data Catalog).
  • Compression: GZIP, Snappy, or ZIP before storage.
  • Encryption:
    • In-flight via HTTPS
    • At-rest via AWS KMS
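A minimal sketch of the optional Lambda transformation mentioned above: Firehose invokes the function with base64-encoded records and expects each record back with its recordId, a result of Ok, Dropped, or ProcessingFailed, and the re-encoded data. The field names such as user_id are illustrative only.

    import base64
    import json

    def lambda_handler(event, context):
        output = []
        for record in event["records"]:
            payload = json.loads(base64.b64decode(record["data"]))

            # Filtering example: drop records missing a required field.
            if "user_id" not in payload:
                output.append({
                    "recordId": record["recordId"],
                    "result": "Dropped",
                    "data": record["data"],
                })
                continue

            # Enrichment example: tag the record before re-encoding it.
            payload["processed"] = True
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(
                    (json.dumps(payload) + "\n").encode("utf-8")
                ).decode("utf-8"),
            })
        return {"records": output}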
Pricing
  • Pay per GB of data ingested into Firehose (for Direct PUT, each record is rounded up to the nearest 5 KB).
  • Additional charges for optional features such as data format conversion; Lambda transformations are billed separately as Lambda invocations.
When to Use Firehose
  • You want automatic delivery to storage or analytics services without building consumer apps.
  • You don't need fine-grained replay or custom consumers (those are Kinesis Data Streams use cases).