Introduction to Loggie
Loggie is a lightweight, high-performance, cloud-native agent and aggregator written in Golang, designed for building scalable log data platforms in cloud environments. It supports multiple pipeline configurations, offers pluggable components for data transfer, filtering, parsing, and alerting, and uses Kubernetes Custom Resource Definitions (CRDs) for operation and management. Loggie is built with observability, reliability, and automation in mind, making it suitable for production use.
Features of Loggie
Advanced Log Collection and Transmission
CRD-Based Pipeline Building
Loggie lets you set up data pipelines through simple YAML configuration, using CRDs such as LogConfig, ClusterLogConfig, Interceptor, and Sink. This makes it versatile for collecting, processing, and sending log data.
Example configuration:
apiVersion: loggie.io/v1beta1
kind: LogConfig
metadata:
  name: tomcat
  namespace: default
spec:
  selector:
    type: pod
    labelSelector:
      app: tomcat
  pipeline:
    sources: |
      - type: file
        name: common
        paths:
          - stdout
          - /usr/local/tomcat/logs/*.log
    sinkRef: default
    interceptorRef: default
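The sinkRef and interceptorRef fields point to Sink and Interceptor resources defined separately. A minimal sketch of what these could look like, based on the Loggie quickstart; the dev sink and rateLimit interceptor shown here are illustrative choices:

# Sink "default": where matched logs are sent (dev simply prints events)
apiVersion: loggie.io/v1beta1
kind: Sink
metadata:
  name: default
spec:
  sink: |
    type: dev
    printEvents: true
---
# Interceptor "default": processing applied before sending (rate limiting here)
apiVersion: loggie.io/v1beta1
kind: Interceptor
metadata:
  name: default
spec:
  interceptors: |
    - type: rateLimit
      qps: 90000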
Supports Multiple Architectures
- Agent Architecture: Loggie can be deployed as a DaemonSet, enabling log collection without requiring containers to mount volumes.
- Sidecar Architecture: Allows for non-intrusive sidecar injection without manual addition in Deployment/StatefulSet templates.
- Aggregator Architecture: Loggie can also be deployed independently as an aggregator that receives, processes, and forwards data from multiple agents or other sources (see the sketch below).
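As a rough illustration of the aggregator pattern, an agent pipeline can forward events to an aggregator over gRPC, and the aggregator pipeline receives them with a matching source. The sketch below assumes Loggie's grpc source and sink types and an aggregator service named loggie-aggregator on port 6066; the service address, port, and field details are illustrative and should be checked against the Loggie reference:

# Agent side: forward collected events to the aggregator
sink:
  type: grpc
  host: "loggie-aggregator.loggie.svc:6066"   # illustrative service address

# Aggregator side: receive events forwarded by agents
sources:
  - type: grpc
    name: agents
    port: 6066   # illustrative listen port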
Enhanced Performance
Performance Benchmarking
Loggie demonstrates strong performance compared to other log collection tools such as Filebeat. When configured under the same conditions (collecting log files and sending them to a Kafka topic), Loggie uses fewer resources while achieving higher transmission rates.
| Agent | File Size | Sink Concurrency | CPU | MEM (rss) | Transmission Rate |
|---|---|---|---|---|---|
| Filebeat | 3.2G | 3 | 8c | 63.8MiB | 75.9MiB/s |
| Loggie | 3.2G | 3 | 2c | 60MiB | 120MiB/s |
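For reference, a pipeline comparable to the benchmark scenario (tailing files and producing to a Kafka topic) could be sketched as follows; the broker address, topic name, and exact field names are illustrative and should be confirmed against the Kafka sink reference:

pipelines:
  - name: bench
    sources:
      - type: file
        name: access
        paths:
          - /var/log/access/*.log   # illustrative log path
    sink:
      type: kafka
      brokers: ["kafka-0:9092"]     # illustrative broker address
      topic: "loggie-bench"         # illustrative topic name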
Adaptive Sink Concurrency
Loggie dynamically adjusts the parallelism of data sending based on downstream response time, making full use of downstream capacity without creating bottlenecks or degrading performance.
Streamlined Data Analysis and Monitoring
Loggie transforms raw log data into actionable insights by providing:
- Real-time data parsing and transformation through transformer interceptors and configurable actions (see the sketch after this list).
- Effective anomaly detection and alerting, integrating seamlessly with custom alert channels.
- Log data aggregation capabilities, converting logs into meaningful metrics.
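As an example of real-time parsing, a transformer interceptor can decode JSON log bodies into structured fields. The sketch below assumes the transformer interceptor type and a jsonDecode action as described in the Loggie user guide; the resource name and the action list are illustrative:

apiVersion: loggie.io/v1beta1
kind: Interceptor
metadata:
  name: json-parse               # illustrative name
spec:
  interceptors: |
    - type: transformer
      actions:
        - action: jsonDecode(body)   # parse the raw body as JSON into event fields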
Observability and Troubleshooting
Loggie exposes comprehensive metrics and ready-made dashboards that integrate easily with Grafana, making it straightforward to troubleshoot issues and analyze end-to-end data transmission.
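Metrics exposure is configured in Loggie's system configuration file (commonly loggie.yml). A minimal sketch, assuming the monitor listeners and HTTP endpoint described in the documentation; listener names and the port shown are illustrative and should be verified against the current reference:

loggie:
  monitor:
    logger:
      enabled: true
      period: 30s        # how often internal metrics are logged
    listeners:           # which components report metrics
      filesource: ~
      filewatcher: ~
      sink: ~
      queue: ~
  http:
    enabled: true
    port: 9196           # metrics exposed over HTTP for Prometheus/Grafana scraping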
Comparisons
Compared to similar tools such as Filebeat, Fluentd, Logstash, and Flume, Loggie offers low resource usage, support for multiple pipelines and outputs, built-in log alerting, and CRD-based collection of Kubernetes container logs.
Documentation and Resources
Loggie provides extensive documentation, featuring quickstart guides, setup instructions for Kubernetes and Node environments, architectural overviews, and detailed user guides for monitoring and reference materials.
Future Roadmap and Contributions
Loggie continues to evolve along a published roadmap aimed at enhancing its features and usability. Community contributions are welcome, including pull requests, feedback, and suggestions. The project is released under the Apache 2.0 license, enabling open-source collaboration.