dehio3’s diary

Notes on work, life, and hobbies

datadog-agent cannot collect Docker information

Environment

datadog-agent

# /opt/datadog-agent/bin/agent/agent version
Agent 6.9.0 - Commit: 4bbd2c9 - Serialization version: 4.7.1

datadog-agent was installed with Ansible, using the following role:

github.com

$ ansible-galaxy info Datadog.datadog

Role: Datadog.datadog
        description: Install Datadog agent and configure checks
        active: True
        commit: 81e3921afa069678e60ece56ea6a164494d55b6c
        commit_message: Release 2.4.0
        commit_url: https://api.github.com/repos/DataDog/ansible-datadog/git/commits/81e3921afa069678e60e
        company: 
        created: 2018-07-23T20:15:50.491026Z
        download_count: 399542
        forks_count: 125
        github_branch: master
        github_repo: ansible-datadog
        github_user: DataDog
        id: 27743
        imported: 2018-10-25T17:52:51.891854-04:00
        is_valid: True
        issue_tracker_url: https://github.com/DataDog/ansible-datadog/issues
        license: Apache2
        min_ansible_version: 2.2
        modified: 2018-10-25T21:52:51.892008Z
        open_issues_count: 19
        path: ['/Users/s04270/.ansible/roles', '/usr/share/ansible/roles', '/etc/ansible/roles']
        role_type: ANS
        stargazers_count: 136
        travis_status_url: 

Symptom

Checking the docker check with /opt/datadog-agent/bin/agent/agent status shows the following error:

    docker
    ------
      Instance ID: docker [ERROR]
      Total Runs: 95
      Metric Samples: Last Run: 0, Total: 0
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      Error: permanent failure in dockerutil: retry number exceeded
      No traceback
      Warning: Error initialising check: permanent failure in dockerutil: retry number exceeded

Cause

The dd-agent user does not belong to the docker group, so it has no permission to retrieve the Docker information.

github.com

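An issue has been filed and a corresponding pull request is open, but it has not been merged yet, so upgrading the role does not fix this (in fact, 2.4.0 is the latest version at the time of writing).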
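
To confirm this cause on a host, something like the following might help. It is only a sketch: user_in_group is a hypothetical helper name, and /var/run/docker.sock is assumed to be the Docker socket path (the usual default, but it may differ on your setup).

```shell
# Succeeds (exit 0) when user $1 is a member of group $2.
user_in_group() {
  id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF -- "$2"
}

if user_in_group dd-agent docker; then
  echo "dd-agent is in the docker group"
else
  echo "dd-agent is NOT in the docker group"
fi

# Assumption: default socket path. Show its owner, group, and mode if present.
if [ -S /var/run/docker.sock ]; then
  ls -l /var/run/docker.sock
fi
```

If the socket is group-owned by docker and dd-agent is not in that group, the agent cannot read from it, which matches the error above.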

Fix

Add the dd-agent user to the docker group.

As the issue also describes, with Ansible you run the following task:

    - name: ensure dd-agent is in docker group
      become: yes
      user:
        name: dd-agent
        groups: docker
        append: yes
      notify: restart datadog-agent
      tags: datadog
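
On a host that is not managed by Ansible, roughly the same thing could be done by hand. This is a sketch under assumptions: a systemd host where the agent service is named datadog-agent, and add_to_docker_group and DRY_RUN are hypothetical names introduced here for illustration.

```shell
# Append the given user to the docker group, then restart the agent
# so the new group membership takes effect for the running process.
# Set DRY_RUN=1 to print the commands instead of executing them.
add_to_docker_group() {
  user="$1"
  run="${DRY_RUN:+echo}"
  # -a with -G appends the group without removing existing memberships,
  # like "append: yes" in the Ansible task.
  $run sudo usermod -aG docker "$user"
  # Assumption: systemd service name is datadog-agent.
  $run sudo systemctl restart datadog-agent
}
```

Running `DRY_RUN=1 add_to_docker_group dd-agent` first shows what would be executed; dropping DRY_RUN applies the change.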

After running it, the status becomes OK:

    docker
    ------
      Instance ID: docker [OK]
      Total Runs: 2
      Metric Samples: Last Run: 92, Total: 184
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 1, Total: 2
      Average Execution Time : 23ms