Logging with Fluent Bit and Fluentd in Kubernetes, pt.3

Fluent Bit is a fast and lightweight log processor, stream processor, and forwarder. It has gained popularity as the younger sibling of Fluentd thanks to its tiny memory footprint (~650 KB compared to Fluentd’s ~40 MB) and zero dependencies, making it ideal for cloud and edge computing use cases.

This post is part 3 in a series of posts about logging in Kubernetes, using Fluent Bit as the log forwarder and Fluentd as the aggregator, and it describes the steps to deploy both.

Other posts in this series:

  • Part 1: Motivation and Architecture Overview.

  • Part 2: Deploying a single-node Elasticsearch, along with Kibana and Cerebro.

  • Part 4: Viewing the End Result.

Deploying Fluentd

Fluentd is the log aggregation and processing stage in front of Elasticsearch, and we will deploy it now using the Bitnami Fluentd Helm chart.

  1. Extend the Bitnami image by installing the rewrite_tag_filter plugin. We will push this up to Docker Hub as a custom image, to be used later.

    CUSTOM_DOCKER_IMG=georgegoh/fluentd:1.10.4-debian-10-r2-rewrite_tag_filter
    cat <<EOF | docker build -t ${CUSTOM_DOCKER_IMG} -
    FROM bitnami/fluentd:1.10.4-debian-10-r2
    LABEL maintainer "Bitnami <containers@bitnami.com>"
    
    ## Install custom Fluentd plugins
    RUN fluent-gem install 'fluent-plugin-rewrite-tag-filter'
    EOF
    docker push ${CUSTOM_DOCKER_IMG}
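
    To confirm the plugin is present in the image (optional; this assumes
    fluent-gem is on the image’s PATH, as it is in the Bitnami base image):

    docker run --rm --entrypoint fluent-gem ${CUSTOM_DOCKER_IMG} list | grep rewrite-tag-filter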
  2. Add the Bitnami Helm repo.

    helm repo add bitnami https://charts.bitnami.com/bitnami
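
    Then refresh the local chart index so the latest chart versions are
    visible:

    helm repo update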
  3. Create a custom ConfigMap that sends output to Elasticsearch. Substitute es.lab.example.com in the ES_HOST variable below with your own Elasticsearch hostname.

    ES_HOST=es.lab.example.com
    cat <<EOF | sed "s/<elasticsearch-host>/${ES_HOST}/" > fluentd-elasticsearch-output-configmap.yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: fluentd-elasticsearch-output
      namespace: k8s-system-logging
    data:
      fluentd.conf: |
        # Prometheus Exporter Plugin
        # input plugin that exports metrics
        <source>
          @type prometheus
          port 24231
        </source>
    
        # input plugin that collects metrics from MonitorAgent
        <source>
          @type prometheus_monitor
          <labels>
            host \${hostname}
          </labels>
        </source>
    
        # input plugin that collects metrics for output plugin
        <source>
          @type prometheus_output_monitor
          <labels>
            host \${hostname}
          </labels>
        </source>
    
        # Ignore fluentd own events
        <match fluent.**>
          @type null
        </match>
    
        # TCP input to receive logs from the forwarders
        <source>
          @type forward
          bind 0.0.0.0
          port 24224
        </source>
    
        # HTTP input for the liveness and readiness probes
        <source>
          @type http
          bind 0.0.0.0
          port 9880
        </source>
    
        # Throw the healthcheck to the standard output instead of forwarding it
        <match fluentd.healthcheck>
          @type stdout
        </match>
    
        # rewrite tags based on which namespace the logs come from.
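        # Rules are evaluated in order and the first match wins: logs from
        # system namespaces are re-tagged ops.<tag>, all others prod.<tag>.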
        <match kube.**>
          @type rewrite_tag_filter
          <rule>
            key     \$['kubernetes']['namespace_name']
            pattern /^(kube-system|kubeapps|k8s-system-[\S]+)$/
            tag     ops.\${tag}
          </rule>
          <rule>
            key     \$['kubernetes']['namespace_name']
            pattern /.+/
            tag     prod.\${tag}
          </rule>
        </match>
    
        <match ops.kube.**>
          @type copy
          @id output_copy_ops
          <store>
            @type elasticsearch_dynamic
            @id output_elasticsearch_ops
            host <elasticsearch-host>
            port 9200
            logstash_format true
            logstash_prefix kube-ops.\${record['kubernetes']['namespace_name']}
            <buffer>
              @type file
              path /opt/bitnami/fluentd/logs/buffers/ops-logs.buffer
              flush_thread_count 2
              flush_interval 5s
            </buffer>
          </store>
        </match>
    
        <match prod.kube.**>
          @type copy
          @id output_copy
          <store>
            @type elasticsearch_dynamic
            @id output_elasticsearch
            host <elasticsearch-host>
            port 9200
            logstash_format true
            logstash_prefix kube.\${record['kubernetes']['namespace_name']}
            <buffer>
              @type file
              path /opt/bitnami/fluentd/logs/buffers/logs.buffer
              flush_thread_count 2
              flush_interval 5s
            </buffer>
          </store>
        </match>
    EOF
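    # The ConfigMap targets the k8s-system-logging namespace; create it first
    # if it does not already exist (it may have been created back in part 2).
    kubectl get namespace k8s-system-logging >/dev/null 2>&1 || \
        kubectl create namespace k8s-system-logging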
    kubectl apply -f fluentd-elasticsearch-output-configmap.yaml
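
    A quick check that the ConfigMap landed in the namespace the chart will
    look in:

    kubectl get configmap fluentd-elasticsearch-output -n k8s-system-logging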
  4. Install the Fluentd Helm chart. Substitute the image.repository and image.tag values with the relevant values from step 1.

    helm install fluentd \
         --set image.repository=georgegoh/fluentd \
         --set image.tag=1.10.4-debian-10-r2-rewrite_tag_filter \
         --set forwarder.enabled=false \
         --set aggregator.enabled=true \
         --set aggregator.replicaCount=1 \
         --set aggregator.port=24224 \
         --set aggregator.configMap=fluentd-elasticsearch-output \
         bitnami/fluentd -n k8s-system-logging
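
    Once deployed, check that the aggregator pod is running and that the
    forward input is listening on port 24224 (the pod name fluentd-0 assumes
    the single replica configured above):

    kubectl get pods -n k8s-system-logging
    kubectl logs fluentd-0 -n k8s-system-logging | grep 24224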

    Consider using a Horizontal Pod Autoscaler for the fluentd StatefulSet to react to higher volumes of incoming logs; a sketch of what that might look like follows.
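
    A minimal sketch (this assumes metrics-server is installed, the
    autoscaling/v2beta2 API is available, and CPU requests are set on the
    aggregator pods, e.g. via the chart’s aggregator.resources value):

    cat <<EOF | kubectl apply -f -
    apiVersion: autoscaling/v2beta2
    kind: HorizontalPodAutoscaler
    metadata:
      name: fluentd
      namespace: k8s-system-logging
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: StatefulSet
        name: fluentd
      minReplicas: 1
      maxReplicas: 5
      metrics:
        - type: Resource
          resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 75
    EOF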

    Now that Fluentd, Elasticsearch and Kibana are in place, it’s time to move on to Fluent Bit and complete the logging pipeline.

Deploying Fluent Bit

  1. Add the Helm repo that hosts the stable/fluent-bit chart (the values file in the next step follows this chart’s schema).

    helm repo add stable https://charts.helm.sh/stable
  2. Save the following in a file called values.yaml, substituting the backend.forward.host value with the address at which the Fluentd aggregator deployed earlier is reachable.

    on_minikube: false
    
    image:
      fluent_bit:
        repository: fluent/fluent-bit
        tag: v1.3.7
      pullPolicy: Always
    
    # When enabled, exposes json and prometheus metrics on {{ .Release.Name }}-metrics service
    metrics:
      enabled: true
      service:
        labels:
          k8s-app: fluent-bit
        annotations:
          'prometheus.io/path': "/api/v1/metrics/prometheus"
          'prometheus.io/port': "2020"
          'prometheus.io/scrape': "true"
        port: 2020
        type: ClusterIP
      serviceMonitor:
        enabled: false
        additionalLabels: {}
        # namespace: monitoring
        # interval: 30s
        # scrapeTimeout: 10s
    
    backend:
      type: forward
      forward:
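        # Replace this host with the address at which the Fluentd aggregator
        # deployed earlier is reachable from your cluster nodes.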
        host: fluentd-0.lab.spodon.com
        port: 24224
        tls: "off"
        tls_verify: "on"
        tls_debug: 1
        shared_key: thisisunsafe
    
    parsers:
      enabled: true
      ## List the respective parsers in key: value format per entry
      ## Regex required fields are name and regex. JSON and Logfmt required field
      ## is name.
      regex:
        - name: cri-mod
          regex: "^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<log>.*)$"
          timeFormat: "%Y-%m-%dT%H:%M:%S.%L%z"
          timeKey: time
        - name: catchall
          regex: "^(?<message>.*)$ }"
      logfmt: []
      json: []
    
    env: []
    podAnnotations: {}
    fullConfigMap: false
    existingConfigMap: ""
    rawConfig: |-
      @INCLUDE fluent-bit-service.conf
      @INCLUDE fluent-bit-input.conf
      @INCLUDE fluent-bit-filter.conf
      @INCLUDE fluent-bit-output.conf
    
    extraEntries:
      input: ""
      audit: ""
      filter: |
        Merge_Parser     catchall
        Keep_Log         Off
      output: ""
    
    extraPorts: []
    
    extraVolumes: []
    
    extraVolumeMounts: []
    
    resources: {}
    hostNetwork: false
    dnsPolicy: ClusterFirst
    tolerations: []
    nodeSelector: {}
    affinity: {}
    service:
      flush: 1
      logLevel: info
    
    input:
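      # Tail container logs on every node; lines are parsed with the cri-mod
      # parser defined above (for CRI runtimes such as containerd and CRI-O).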
      tail:
        memBufLimit: 5MB
        parser: cri-mod
        path: /var/log/containers/*.log
        ignore_older: ""
      systemd:
        enabled: false
        filters:
          systemdUnit:
            - docker.service
            - kubelet.service
            - node-problem-detector.service
        maxEntries: 1000
        readFromTail: true
        stripUnderscores: false
        tag: host.*
    
    audit:
      enable: false
      input:
        memBufLimit: 35MB
        parser: docker
        tag: audit.*
        path: /var/log/kube-apiserver-audit.log
        bufferChunkSize: 2MB
        bufferMaxSize: 10MB
        skipLongLines: true
        key: kubernetes-audit
    
    filter:
      kubeURL: https://kubernetes.default.svc:443
      kubeCAFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      kubeTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubeTag: kube
      kubeTagPrefix: kube.var.log.containers.
      mergeJSONLog: true
      mergeLogKey: ""
      enableParser: true
      enableExclude: true
      useJournal: false
    
    rbac:
      create: true
      pspEnabled: false
    
    taildb:
      directory: /var/lib/fluent-bit
    
    serviceAccount:
      # Specifies whether a ServiceAccount should be created
      create: true
      # Annotations to add to the service account
      annotations: {}
      # The name of the ServiceAccount to use.
      # If not set and create is true, a name is generated using the fullname template
      name:
    
    ## Specifies security settings for a container
    ## Ref: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-container
    securityContext: {}
      # securityContext:
      #   privileged: true
    
    ## Specifies security settings for a pod
    ## Ref: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod
    podSecurityContext: {}
      # podSecurityContext:
      #   runAsUser: 1000
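
    A note on the cri-mod parser defined above: nodes running a CRI runtime
    (containerd or CRI-O) write container logs in the CRI format rather than
    Docker’s JSON format, and the regex splits each line into time, stream,
    logtag and log fields. An illustrative (made-up) log line:

    2020-05-30T09:17:32.123456789+08:00 stdout F Hello from my container

    parses into time=2020-05-30T09:17:32.123456789+08:00, stream=stdout,
    logtag=F (a full, unsplit line) and log=Hello from my container.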
  3. Install the Fluent Bit Helm chart, using the values file created in the previous step.

    helm install fluent-bit -f values.yaml stable/fluent-bit -n k8s-system-logging
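
    The chart runs Fluent Bit as a DaemonSet, so after the install expect one
    pod per node (a quick check):

    kubectl get pods -n k8s-system-logging -o wide | grep fluent-bit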

Summary

In this post, I shared the steps for deploying Fluentd and Fluent Bit in a Forwarding pattern.

In Part 4, I’ll wrap up by creating index patterns in Kibana and viewing the results.