How to detect an NVIDIA GPU with Puppet

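I have some tasks that I only want to run on machines with an NVIDIA GPU. Is there a good way to determine with Puppet whether a particular agent has an NVIDIA GPU? In bash I can do this by checking whether /usr/bin/nvidia-smi exists, but I'm not sure how to do the same in Puppet. Also, if there is a better way to do this in bash, please let me know.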

Answer 1

You should create a custom fact that checks whether /usr/bin/nvidia-smi exists (if that is sufficient), for example:

Facter.add(:nvidia_gpu) do
  confine :kernel => 'Linux'
  setcode do
    FileTest.executable?('/usr/bin/nvidia-smi')
  end
end

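Alternatively, you could do a more thorough check for the specific PCI device, if the GPU shows up as one, by parsing the output of lspci or by walking the /sys/bus/pci directory.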
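A minimal sketch of the /sys/bus/pci approach, assuming the GPU is exposed as a PCI device with NVIDIA's vendor ID (0x10de) and a display-controller class code (0x03xxxx). The helper name nvidia_pci_device? and its sysfs_root parameter are hypothetical and only there to make the check easy to exercise outside Facter:

```ruby
# Hypothetical helper: true if any PCI device under sysfs_root is an
# NVIDIA (vendor 0x10de) display-class (class 0x03xxxx) device.
NVIDIA_VENDOR_ID = '0x10de'

def nvidia_pci_device?(sysfs_root = '/sys/bus/pci/devices')
  Dir.glob(File.join(sysfs_root, '*')).any? do |dev|
    vendor_file = File.join(dev, 'vendor')
    class_file  = File.join(dev, 'class')
    # Skip entries that do not expose both attributes.
    next false unless File.readable?(vendor_file) && File.readable?(class_file)

    vendor    = File.read(vendor_file).strip.downcase
    pci_class = File.read(class_file).strip.downcase
    vendor == NVIDIA_VENDOR_ID && pci_class.start_with?('0x03')
  end
end

# Register the fact only when Facter is loaded, so the helper can also
# be run on its own.
if defined?(Facter)
  Facter.add(:nvidia_gpu) do
    confine :kernel => 'Linux'
    setcode { nvidia_pci_device? }
  end
end
```

Unlike the nvidia-smi check, this does not depend on the driver or toolkit being installed, so it may suit cases where Puppet itself is meant to install them.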

In your Puppet manifests, you can then use the value of $facts['nvidia_gpu'] to control what you do.

Answer 2

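The pci_devices fact can be modified to detect whether a GPU is installed in the machine. It uses lspci rather than looking for the toolkit, so the toolkit itself can then be installed with Puppet.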

# Copyright: Pieter Lexis <[email protected]>

# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.

# There are no dependencies needed for this script, except for lspci.
# This script is only tested on Debian (Lenny and Squeeze), if you
# have any improvements, send a pull request, ticket or email.
# The latest version of this script is available on github at
# https://github.com/kumina/fact-pci_devices

def add_fact(fact, code)
  Facter.add(fact) { setcode { code } }
end

case Facter.value(:operatingsystem)
  when /Debian|Ubuntu/i
    lspci = "/usr/bin/lspci"
  when /RedHat|CentOS|Fedora|Scientific|SLES/i
    lspci = "/sbin/lspci"
  else
    lspci = ""
end

# We can't do this if we don't know the location of lspci
if !lspci.empty? and File.exist?(lspci)
  # Create a hash of ALL PCI devices, the key is the PCI slot ID.
  # { SLOT_ID => { ATTRIBUTE => VALUE }, ... }
  slot=""
  # after the following loop, devices will contain ALL PCI devices and the info returned from lspci
  devices = {}
  %x{#{lspci} -v -mm -k}.each_line do |line|
    if not line =~ /^$/ # We don't need to parse empty lines
      splitted = line.split(/\t/)
      # lspci has a nice syntax:
      # ATTRIBUTE:\tVALUE
      # We use this to fill our hash
      if splitted[0] =~ /^Slot:$/
        slot=splitted[1].chomp
        devices[slot] = {}
      else
        # The chop is needed to strip the ':' from the string
        devices[slot][splitted[0].chop] = splitted[1].chomp
      end
    end
  end

  # To create your own facts, edit the following code:
  raid_counter = 0
  raidcontrollers = []
  gpus = {}
  scsicontrollers = {}
  devices.each_key do |a|
    case devices[a].fetch("Class")
    when /^RAID/
      # ignore AHCI "fake" RAID, because we don't use it
      if devices[a].fetch('Driver') != "ahci"
        add_fact("raidcontroller_#{raid_counter}_vendor", "#{devices[a].fetch('Vendor')}")
        add_fact("raidcontroller_#{raid_counter}_model", "#{devices[a].fetch('SDevice')}")
        raid_counter += 1
        raidcontrollers.insert(-1,"#{devices[a].fetch('Driver')}")
      end
    # NVIDIA cards can be listed as "3D controller" or "VGA compatible controller"
    when /^(3D|VGA)/
       if gpus.key?("#{devices[a].fetch('Device')}")
         gpus["#{devices[a].fetch('Device')}"]['count'] += 1
       else
         gpus["#{devices[a].fetch('Device')}"] = {
           'count' => 1, 
           'vendor' => "#{devices[a].fetch('Vendor')}",
         }
         # Driver might not be defined
         if devices[a].key?('Driver')
           gpus["#{devices[a].fetch('Device')}"]['driver'] = "#{devices[a].fetch('Driver')}"
         end
       end
    when /.*SCSI controller.*/
       if scsicontrollers.key?("#{devices[a].fetch('Device')}")
         scsicontrollers["#{devices[a].fetch('Device')}"]['count'] += 1
       else
         scsicontrollers["#{devices[a].fetch('Device')}"] = {
           'count' => 1, 
           'vendor' => "#{devices[a].fetch('Vendor')}",
           'driver' => "#{devices[a].fetch('Driver')}"
         }
       end
    end
  end
  add_fact("raidcontrollers", raidcontrollers.join(","))
  add_fact("gpus", gpus)
  add_fact("scsicontrollers", scsicontrollers)
end
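
To answer the original question from the gpus fact, one option is to derive a simple NVIDIA-specific value from it. The following is only a sketch: the helper and the nvidia_gpu_count fact name are hypothetical, and it assumes the gpus fact resolves to a hash shaped like the one built above, with lspci reporting a vendor string that contains "NVIDIA":

```ruby
# Hypothetical helper: given a hash shaped like the "gpus" fact
# ({ device_name => { 'count' => n, 'vendor' => ..., 'driver' => ... } }),
# return the total number of devices whose vendor string mentions NVIDIA.
def nvidia_gpu_count(gpus)
  gpus.values.sum do |info|
    info['vendor'].to_s =~ /nvidia/i ? info['count'].to_i : 0
  end
end

# Register the derived fact only when Facter is loaded.
if defined?(Facter)
  Facter.add(:nvidia_gpu_count) do
    setcode do
      gpus = Facter.value(:gpus)
      # Fall back to 0 when the gpus fact is missing or not a hash.
      gpus.is_a?(Hash) ? nvidia_gpu_count(gpus) : 0
    end
  end
end
```

A manifest could then test whether $facts['nvidia_gpu_count'] is greater than zero before applying GPU-only resources.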
