Grafana Loki, AlertManager - unable to read rule dir /tmp/loki/rules/fake: open /tmp/loki/rules/fake: no such file or directory

2024-6-2

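I have deployed promtail, Grafana, Loki, and AlertManager with Helm charts on a k3d cluster on my local machine. I want to set up some rules in Loki so that AlertManager is notified when certain events occur. For now I have only tried a simple rule, just to check whether it works.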

My Loki version: {"version":"2.6.1","revision":"6bd05c9a4","branch":"HEAD","buildUser":"root@ea1e89b8da02","buildDate":"2022-07-18T08:49:07Z","goVersion":""}

My Grafana version:

The Loki configuration is as follows:

loki:
  # should loki be deployed on cluster?
  enabled: true

  image:
    repository: grafana/loki
    pullPolicy: Always
    pullSecrets:
      - registry
  priorityClassName: normal
  resources:
    limits:
      memory: 3Gi
      cpu: 0
    requests:
      memory: 0
      cpu: 0
  config:
    chunk_store_config:
      max_look_back_period: 30d
    table_manager:
      retention_deletes_enabled: true
      retention_period: 30d
    query_range:
      split_queries_by_interval: 0
      parallelise_shardable_queries: false
    querier:
      max_concurrent: 2048
    frontend:
      max_outstanding_per_tenant: 4096
      compress_responses: true
    ingester:
      wal:
        enabled: true
        dir: /tmp/wal
    schema_config:
      configs:
        - from: 2022-12-05
          store: boltdb-shipper
          object_store: filesystem
          schema: v11
          index:
            prefix: index_
            period: 24h
    storage_config:
      boltdb_shipper:
        active_index_directory: /tmp/loki/boltdb-shipper-active
        cache_location: /tmp/loki/boltdb-shipper-cache
        cache_ttl: 24h         # Can be increased for faster performance over longer query periods, uses more disk space 
        shared_store: filesystem
      filesystem:
        directory: /tmp/loki/chunks
    compactor:
      working_directory: /tmp/loki/boltdb-shipper-compactor
      shared_store: filesystem
    ruler:
      storage:
        type: local
        local:
          directory: /tmp/loki/rules/
      ring:
        kvstore:
          store: inmemory
      rule_path: /tmp/loki/rules-temp
      alertmanager_url: http://onprem-kube-prometheus-alertmanager.svc.mylocal-monitoring:9093
      enable_api: true
      enable_alertmanager_v2: true
  write:
    extraVolumeMounts:
      - name: rules-config
        mountPath: /tmp/loki/rules/fake/
    extraVolumes:
      - name: rules-config
        configMap:
          name: rules-cfgmap
          items:
            - key: "rules.yaml"
              path: "rules.yaml"
  read:
    extraVolumeMounts:
      - name: rules-config
        mountPath: /tmp/loki/rules/fake/
    extraVolumes:
      - name: rules-config
        configMap:
          name: rules-cfgmap
          items:
            - key: "rules.yaml"
              path: "rules.yaml"

promtail:
  image:
    registry: docker
    pullPolicy: Always
  imagePullSecrets:
    - name: registry
  priorityClassName: normal
  resources:
    limits:
      memory: 256Mi
      cpu: 0
    requests:
      memory: 0
      cpu: 0
  livenessProbe:
    failureThreshold: 5
    httpGet:
      path: /ready
      port: http-metrics
    initialDelaySeconds: 10
    periodSeconds: 10
    successThreshold: 1
    timeoutSeconds: 1
  config:
    snippets:
      pipelineStages:
        - cri: {}
      common:
        - action: replace
          source_labels:
            - __meta_kubernetes_pod_node_name
          target_label: node_name
        - action: replace
          source_labels:
            - __meta_kubernetes_namespace
          target_label: namespace
        - action: replace
          source_labels:
            - __meta_kubernetes_pod_container_name
          target_label: container
        - action: replace
          replacement: /var/log/pods/*$1/*.log
          separator: /
          source_labels:
            - __meta_kubernetes_pod_uid
            - __meta_kubernetes_pod_container_name
          target_label: __path__
        - action: replace
          replacement: /var/log/pods/*$1/*.log
          regex: true/(.*)
          separator: /
          source_labels:
            - __meta_kubernetes_pod_annotationpresent_kubernetes_io_config_hash
            - __meta_kubernetes_pod_annotation_kubernetes_io_config_hash
            - __meta_kubernetes_pod_container_name
          target_label: __path__

monitoring:
  enabled: false

networkPolicies:
  enabled: false

The problem is that when I try to check the rules with curl -X GET localhost:3100/loki/api/v1/rules, I get: unable to read rule dir /tmp/loki/rules/fake: open /tmp/loki/rules/fake: no such file or directory.

It seems that it cannot find the rules file.
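For context, here is a sketch of how the path in the error is probably put together. With local ruler storage, Loki looks for rule files in a per-tenant subdirectory of the configured directory, and `fake` is the tenant ID Loki uses when multi-tenancy is disabled. The `auth_enabled: false` line is an assumption, since that setting is not shown in the configuration above; the other values are copied from it.

```yaml
# Assumption: multi-tenancy is off, so every request is handled as tenant "fake".
auth_enabled: false

ruler:
  storage:
    type: local
    local:
      # Loki reads rules from <directory>/<tenant>/<file>, so with the
      # settings here it expects /tmp/loki/rules/fake/rules.yaml.
      directory: /tmp/loki/rules/
  # Scratch directory the ruler writes to; it must differ from the
  # storage directory above.
  rule_path: /tmp/loki/rules-temp
```

If that reading is right, the error means the `fake` directory is missing from the container that serves the request, which points at the volume mount rather than at the ruler settings.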

I also tried changing the configuration like this:

  write:
    extraVolumeMounts:
      - name: rules-conf
        mountPath: /tmp/loki/rules/fake/rules.yaml
    extraVolumes:
      - name: rules-conf
  read:
    extraVolumeMounts:
      - name: rules-conf
        mountPath: /tmp/loki/rules/fake/rules.yaml
    extraVolumes:
      - name: rules-conf
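One thing worth noting about this variant: when `mountPath` names a file, Kubernetes mounts the whole volume at that path as a directory unless `subPath` selects a single key. A minimal sketch of a single-file mount follows. It assumes the chart passes `extraVolumes` and `extraVolumeMounts` through to the pod spec unchanged, and it spells out the volume source using the ConfigMap name from the first attempt.

```yaml
  write:
    extraVolumeMounts:
      - name: rules-conf
        # Mount only the rules.yaml key as a file at this path.
        mountPath: /tmp/loki/rules/fake/rules.yaml
        subPath: rules.yaml
    extraVolumes:
      - name: rules-conf
        configMap:
          name: rules-cfgmap   # ConfigMap name taken from the first attempt
```

A `subPath` mount does not pick up later changes to the ConfigMap until the pod restarts, so mounting the directory, as in the first attempt, may be the more convenient form once the mount itself works.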

My ConfigMap:

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: rules-cfgmap
  namespace: mylocal-monitoring
data:
  rules.yaml: |
    groups:
      - name: PrometheusAlertsGroup
        rules:
          - alert: test1
            expr: |
              1 > 0
            for: 0m
            labels:
              severity: critical
            annotations:
              summary: "TEST: testing test"
              description: test

The rules file:

groups:
  - name: PrometheusAlertsGroup
    rules:
      - alert: test1
        expr: |
          1 > 0
        for: 0m
        labels:
          severity: critical
        annotations:
          summary: "TEST: testing test"
          description: test

But the problem is the same. Any ideas?

It finally worked when I created /tmp/loki/rules/fake/rules.yaml by hand, but having to create it manually defeats the purpose.
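Since the rules load once the file exists, the ruler settings themselves appear to work, and what remains open is whether the ConfigMap volume actually reaches the pod that runs the ruler. For comparison, the rendered pod spec would need to contain something like the fragment below. This is only a sketch: the container name `loki` is hypothetical, and whether the chart in use honors the `write` and `read` keys at all is an assumption that should be checked against the rendered manifests.

```yaml
spec:
  containers:
    - name: loki               # hypothetical container name
      volumeMounts:
        - name: rules-config
          # Directory mount: each ConfigMap key becomes a file here,
          # giving /tmp/loki/rules/fake/rules.yaml.
          mountPath: /tmp/loki/rules/fake
          readOnly: true
  volumes:
    - name: rules-config
      configMap:
        name: rules-cfgmap     # must be in the same namespace as the pod
```

If neither the volume nor the mount appears in the running pod, the values are probably not being applied to the component that serves /loki/api/v1/rules.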
