# Log collection

This document explains how to enable centralized log collection for the MoAI Inference Framework using Loki (log aggregation) and Vector (log collection agent).

## Overview

```mermaid
flowchart TB
    pods["`**Inference Service Pods**`"]
    vector["`**Vector**`"]
    grafana["`**Grafana**`"]

    subgraph log_storage["Log Storage"]
        loki["`**Loki**`"]
        minio[("**MinIO**")]
    end

    pods -->|"container logs"| vector
    vector -->|"transforms + labels"| loki
    loki -->|"LogQL"| grafana
```

## Architecture details

### Loki

| Property | Value |
| --- | --- |
| Helm chart | `grafana/loki` v6.30.0 |
| App version | 3.5.1 |
| Storage backend | S3 (MinIO), TSDB index |
| Retention | 90 days (2160 h) |
| Ingestion limit | 30 MB/s, 60 MB burst |
| Max entries/query | 50 000 |
| Deployment | Distributed (gateway / read / write / backend) |

### Vector

| Property | Value |
| --- | --- |
| Helm chart | `vector/vector` v0.39.0 |
| Deployment | DaemonSet (Agent mode, one pod per node) |
| Log source | Pods labelled `mif.moreh.io/log.collect=true` (`kubernetes_logs`) |
| Log format | JSON parsing applied only to pods labelled `mif.moreh.io/log.format=json` |
| Tolerations | unschedulable, compute, `amd.com/gpu` |

### MinIO

| Property | Value |
| --- | --- |
| Helm chart | `minio/minio` v5.4.0 |
| Mode | Standalone |
| Bucket | `loki` (created via post-install Job on startup) |
| Loki credentials | Dedicated `loki` user with S3 policy scoped to `loki` bucket |
| Resources | 2 Gi memory (requests) |
| Persistence | emptyDir (ephemeral by default) |
| Deployment | Single pod |

### Component naming

Service names are derived from the Helm release name. With the default release name `mif`:

| Service | Name (same-namespace access) |
| --- | --- |
| MinIO | `mif-minio` |
| Loki gateway | `mif-loki-gateway` |
| Loki read | `mif-loki-read` |
| Loki write | `mif-loki-write` |

Vector connects to Loki using the release-prefixed service name since all components are co-located in the same namespace.
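As a rough illustration of the naming rule, the sketch below derives the same-namespace service names and a cluster-internal DNS name from the release name. The helper names `service_names` and `service_fqdn` are hypothetical. The namespace `mif` is an assumption taken from the `kubectl` commands in this guide, and `cluster.local` is the Kubernetes default cluster domain, which may differ in your cluster.

```shell
# Sketch: derive service names from the Helm release name.
# Assumption: the chart is installed as release "mif" in namespace "mif".
release="mif"
namespace="mif"

service_names() {
  for component in minio loki-gateway loki-read loki-write; do
    # Short name, usable from pods in the same namespace.
    printf '%s-%s\n' "$release" "$component"
  done
}

service_fqdn() {
  # Standard Kubernetes service DNS form: <service>.<namespace>.svc.cluster.local
  printf '%s.%s.svc.cluster.local\n' "$1" "$namespace"
}

service_names
service_fqdn "${release}-loki-gateway"
```

Changing `release` shows how every service name shifts with a non-default release name.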

## Prerequisites

The `moai-inference-framework` Helm chart must be installed (or be in the process of being installed).

> **Info:** MinIO, Loki, and Vector are all enabled by default in the `moai-inference-framework` chart. No additional configuration is required to get started.

## Installation

Log collection is installed as part of the `moai-inference-framework` Helm chart. See Prerequisites for the required values and install command.

## Verifying the installation

Check that all Loki components are running.

```shell
kubectl get pods -n mif -l app.kubernetes.io/name=loki
```

Expected output (all pods `Running`; pod names may carry the Helm release prefix, for example `mif-loki-write-0`):

```text
NAME                           READY   STATUS    RESTARTS   AGE
loki-backend-0                 1/1     Running   0          2m
loki-gateway-xxxxxxxxx-xxxxx   1/1     Running   0          2m
loki-read-xxxxxxxxx-xxxxx      1/1     Running   0          2m
loki-write-0                   1/1     Running   0          2m
```

Check that Vector is running on all nodes.

```shell
kubectl get pods -n mif -l app.kubernetes.io/name=vector
```

Expected output (one pod per node, all `Running`):

```text
NAME           READY   STATUS    RESTARTS   AGE
vector-xxxxx   1/1     Running   0          2m
vector-yyyyy   1/1     Running   0          2m
```

Check Vector logs to confirm it is shipping to Loki without errors.

```shell
kubectl logs -n mif -l app.kubernetes.io/name=vector --tail=50
```
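The two Vector checks above can be made less manual. The sketch below is one possible approach, not part of the chart: it compares ready and desired pod counts on the DaemonSet and filters recent Vector logs for warning or error lines. The function names are hypothetical, and it assumes the DaemonSet carries the same `app.kubernetes.io/name=vector` label as its pods, which this guide does not confirm.

```shell
# Sketch: compare ready vs. desired Vector pods on the DaemonSet.
# Assumption: the DaemonSet is labelled app.kubernetes.io/name=vector, like its pods.
vector_daemonset_status() {
  ns="${1:-mif}"
  kubectl get daemonset -n "$ns" -l app.kubernetes.io/name=vector \
    -o jsonpath='{range .items[*]}{.metadata.name}{" ready="}{.status.numberReady}{" desired="}{.status.desiredNumberScheduled}{"\n"}{end}'
}

# Sketch: show only warning/error lines from recent Vector logs.
vector_recent_errors() {
  ns="${1:-mif}"
  kubectl logs -n "$ns" -l app.kubernetes.io/name=vector --tail=200 \
    | grep -iE 'error|warn' || echo "no error or warning lines found"
}
```

Call `vector_daemonset_status` and `vector_recent_errors` with no arguments to use the `mif` namespace, or pass another namespace as the first argument. Matching ready and desired counts indicate that every schedulable node runs a Vector pod.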

## Enabling log collection for a pod

Vector collects logs only from pods that explicitly opt in. Two pod labels control this behavior.

### Opt-in label

Add the `mif.moreh.io/log.collect=true` label to a pod to include its logs in Vector's collection. Pods without this label are ignored entirely.

```yaml
metadata:
  labels:
    mif.moreh.io/log.collect: "true"
```
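For pods created by a controller, the label has to be set on the pod template rather than on a running pod, otherwise replacement pods lose it. The sketch below assumes the workload is a plain Deployment; workloads managed by an operator or another controller may overwrite such a patch. The function names are hypothetical.

```shell
# Sketch: add the opt-in label to a Deployment's pod template.
# Patching the template triggers a rollout, so new pods start with the label.
enable_log_collection() {
  deployment="$1"
  ns="${2:-default}"
  kubectl patch deployment "$deployment" -n "$ns" --type merge \
    -p '{"spec":{"template":{"metadata":{"labels":{"mif.moreh.io/log.collect":"true"}}}}}'
}

# List every pod that has opted in, across all namespaces.
list_collected_pods() {
  kubectl get pods --all-namespaces -l mif.moreh.io/log.collect=true
}
```

For example, `enable_log_collection my-app default` followed by `list_collected_pods` should show the new pods of `my-app` once the rollout completes (`my-app` is a placeholder name).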

### Log format label

Add the `mif.moreh.io/log.format=json` label to enable structured JSON log parsing for a pod. When set, Vector parses each log line as JSON and promotes the following fields:

| JSON field | Mapped to |
| --- | --- |
| `msg` | `message` |
| `time` | `timestamp` |
| `level` | `level` (Loki label) |
| others | merged into the event |

Without this label, each log line is forwarded as-is, with no JSON parsing.

```yaml
metadata:
  labels:
    mif.moreh.io/log.collect: "true"
    mif.moreh.io/log.format: "json"
```

> **Info:** The `level` Loki label is only populated for pods with `mif.moreh.io/log.format=json`. For plain-text pods, `level` remains empty.
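To make the field mapping concrete, the sketch below shows a hypothetical `log_json` helper that a container entrypoint script could use to write one JSON object per line with the `time`, `level`, and `msg` fields. It assumes that an RFC 3339 UTC timestamp is an acceptable value for `time`; this guide does not specify the expected timestamp format.

```shell
# Sketch: emit one JSON log line with the fields that Vector promotes.
# Assumption: RFC 3339 UTC timestamps are accepted for "time".
log_json() {
  level="$1"; shift
  # Escape backslashes and double quotes so the message stays valid JSON.
  # Newlines and other control characters are not handled in this sketch.
  msg=$(printf '%s' "$*" | sed 's/\\/\\\\/g; s/"/\\"/g')
  printf '{"time":"%s","level":"%s","msg":"%s"}\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$level" "$msg"
}

log_json info "model loaded"
log_json error 'request failed: "timeout"'
```

With the `mif.moreh.io/log.format=json` label set, lines like these would have `msg` mapped to `message`, `time` to `timestamp`, and `level` exposed as a Loki label, as described in the table above.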

## Searching logs in Grafana

### Accessing Grafana

If you have not yet accessed Grafana, follow the Accessing Grafana guide to retrieve admin credentials, set up port forwarding, and log in.

### Opening the Explore view

After logging in to Grafana, click the Explore icon (compass) in the left sidebar. The Explore view opens with a query editor.

### Selecting the Loki datasource

If the datasource is not already set to Loki, click the datasource dropdown at the top of the page and select Loki.

### Switching to Code mode

The query editor defaults to Builder mode, which provides a visual query builder. To write LogQL queries directly, click the Code button to switch to Code mode.

### Running a log query

Enter a LogQL query in the query editor and click Run query (or press Shift+Enter). For example, `{namespace="default"}` returns all logs from the `default` namespace. The results can include both plain-text and JSON-formatted logs collected from different pods.

### Labels available for log search

Vector enriches every log entry with the following labels, which can be used as LogQL selectors:

| Label | Source | Example value |
| --- | --- | --- |
| `namespace` | `kubernetes.pod_namespace` | `default` |
| `inference_service` | pod label `app.kubernetes.io/instance` | `llama-3-2-1b` |
| `pool_name` | pod label `mif.moreh.io/pool` | `heimdall` |
| `role` | pod label `mif.moreh.io/role` | `prefill`, `decode` |
| `app` | pod label `app.kubernetes.io/name` | `vllm` |
| `node_name` | `VECTOR_SELF_NODE_NAME` env var (injected by Vector) | `gpu-node-01` |
| `level` | parsed from JSON log field `level` (pods with `mif.moreh.io/log.format=json` only) | `info`, `warn`, `error` |

### Query examples

Filter by a single label (each line is a separate query):

```logql
{namespace="default"}
{inference_service="llama-3-2-1b"}
{pool_name="heimdall"}
{role="decode"}
```

Combine multiple labels and search for a keyword in the log line:

```logql
{namespace="default", inference_service="llama-3-2-1b", role="prefill"} |= "error"
```

Filter by log level (available only for JSON-formatted pods):

```logql
{namespace="default", level="error"}
```

> **Info:** The `level` label is only available for pods with the `mif.moreh.io/log.format=json` label. To filter plain-text logs by level, use a line filter instead, as in the following query. The `|=` filter is case-sensitive.

```logql
{namespace="default"} |= "ERROR"
```
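The same LogQL queries can also be sent to Loki's HTTP API (`/loki/api/v1/query_range`) from a terminal. The sketch below is one way to do that with `curl`. The function name `loki_query` and the `LOKI_URL` variable are hypothetical. It assumes the gateway is reachable on `localhost:3100` (for example through a port-forward to the `mif-loki-gateway` service, whose service port of 80 is itself an assumption) and that no tenant header (`X-Scope-OrgID`) is required in this deployment.

```shell
# Sketch: run a LogQL query against Loki's HTTP API instead of Grafana.
# Assumptions: the gateway is reachable on localhost:3100, for example via
#   kubectl port-forward -n mif svc/mif-loki-gateway 3100:80
# and no tenant header (X-Scope-OrgID) is required.
loki_query() {
  query="$1"
  limit="${2:-100}"
  # -G sends the url-encoded fields as query-string parameters.
  curl -sG "${LOKI_URL:-http://localhost:3100}/loki/api/v1/query_range" \
    --data-urlencode "query=${query}" \
    --data-urlencode "limit=${limit}"
}
```

For example, `loki_query '{namespace="default", level="error"}' 20` requests up to 20 matching entries. Without explicit `start` and `end` parameters, Loki searches the most recent hour.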

## Using an external MinIO

If MinIO is already deployed outside this chart, set `minio.enabled: false` and configure `lokiBucket` with the host and credentials of a MinIO user that has read/write access to the `loki` bucket.

**Same namespace:** if the existing MinIO service name matches `<release>-minio`, only credentials are required:

```yaml
# moai-inference-framework-values.yaml
minio:
  enabled: false
lokiBucket:
  accessKey: <accessKey>
  secretKey: <secretKey>
```

**Different namespace:** set `lokiBucket.host` to the service FQDN so that Loki can resolve it across namespaces:

```yaml
# moai-inference-framework-values.yaml
minio:
  enabled: false
lokiBucket:
  host: <minio.minio.svc.cluster.local>
  accessKey: <accessKey>
  secretKey: <secretKey>
```
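The FQDN follows the usual Kubernetes pattern `<service>.<namespace>.svc.cluster.local`. As a small sketch, the hypothetical helper below prints the cross-namespace override fragment from a service name, namespace, and credentials. All four arguments are illustrative, and `cluster.local` assumes the default cluster domain.

```shell
# Sketch: print an override fragment for an external MinIO in another namespace.
# Arguments are illustrative: service name, namespace, access key, secret key.
external_minio_values() {
  svc="$1"; ns="$2"; access_key="$3"; secret_key="$4"
  cat <<EOF
minio:
  enabled: false
lokiBucket:
  host: ${svc}.${ns}.svc.cluster.local
  accessKey: ${access_key}
  secretKey: ${secret_key}
EOF
}
```

For example, `external_minio_values minio storage "$ACCESS_KEY" "$SECRET_KEY"` prints a fragment whose `host` is `minio.storage.svc.cluster.local`. The output can be merged into the values file by hand or saved to a separate override file.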

## Disabling log collection

To turn off log collection, disable all three components in the chart values:

```yaml
# moai-inference-framework-values.yaml
minio:
  enabled: false
loki:
  enabled: false
vector:
  enabled: false
```