I run a 3-node GlusterFS 3.10 cluster based on Heketi, with Kubernetes provisioning and deprovisioning storage automatically. There are currently 20 active volumes; most are at the minimum allowed size of 10GB, but each holds only a few hundred MB of data. Each volume is replicated across two nodes (comparable to RAID-1).
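For reference, this is roughly how I inspect the setup (a sketch; heketi-cli and the gluster CLI are assumed to be available on a node, and <volumeId> stands for any of the provisioned volumes):

# heketi-cli volume list
# gluster volume info <volumeId>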
However, the gluster processes consume a large amount of memory on each node (~13GB). Taking a statedump of every volume shows each one using only between 1 and 30MB of memory:
# for i in $(gluster volume list); do gluster volume statedump $i nfs; done
# grep mallinfo_uordblks -hn *.dump.*
11:mallinfo_uordblks=1959056
11:mallinfo_uordblks=20888896
11:mallinfo_uordblks=2793760
11:mallinfo_uordblks=23316944
11:mallinfo_uordblks=1917536
11:mallinfo_uordblks=29287872
11:mallinfo_uordblks=14807280
11:mallinfo_uordblks=2170592
11:mallinfo_uordblks=2077088
11:mallinfo_uordblks=15463760
11:mallinfo_uordblks=2030032
11:mallinfo_uordblks=2079856
11:mallinfo_uordblks=2079920
11:mallinfo_uordblks=2167808
11:mallinfo_uordblks=2396160
11:mallinfo_uordblks=34000240
11:mallinfo_uordblks=2649920
11:mallinfo_uordblks=1683776
11:mallinfo_uordblks=6316944
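Summing those heap figures confirms how little the volumes themselves account for (a sketch against the same dump files; mallinfo_uordblks is in bytes):

# grep -h mallinfo_uordblks *.dump.* | awk -F= '{sum+=$2} END {printf "%.1f MB total\n", sum/1024/1024}'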
All volumes are running with default settings. For some reason, the cache size is listed twice, once as 32MB and once as 128MB (see the note after the listing):
# gluster volume get <volumeId> all | grep performance | sort
performance.cache-capability-xattrs true
performance.cache-ima-xattrs true
performance.cache-invalidation false
performance.cache-max-file-size 0
performance.cache-min-file-size 0
performance.cache-priority
performance.cache-refresh-timeout 1
performance.cache-samba-metadata false
performance.cache-size 128MB
performance.cache-size 32MB
performance.cache-swift-metadata true
performance.client-io-threads off
performance.enable-least-priority on
performance.flush-behind on
performance.force-readdirp true
performance.high-prio-threads 16
performance.io-cache on
performance.io-thread-count 16
performance.lazy-open yes
performance.least-prio-threads 1
performance.low-prio-threads 16
performance.md-cache-timeout 1
performance.nfs.flush-behind on
performance.nfs.io-cache off
performance.nfs.io-threads off
performance.nfs.quick-read off
performance.nfs.read-ahead off
performance.nfs.stat-prefetch off
performance.nfs.strict-o-direct off
performance.nfs.strict-write-ordering off
performance.nfs.write-behind on
performance.nfs.write-behind-window-size 1MB
performance.normal-prio-threads 16
performance.open-behind on
performance.parallel-readdir off
performance.quick-read on
performance.rda-cache-limit 10MB
performance.rda-high-wmark 128KB
performance.rda-low-wmark 4096
performance.rda-request-size 131072
performance.read-after-open no
performance.read-ahead on
performance.read-ahead-page-count 4
performance.readdir-ahead on
performance.resync-failed-syncs-after-fsync off
performance.stat-prefetch on
performance.strict-o-direct off
performance.strict-write-ordering off
performance.write-behind on
performance.write-behind-window-size 1MB
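I have not confirmed which translators the two cache-size entries belong to; my assumption is that io-cache (32MB default) and quick-read (128MB default) each report the option under the same name. The translator sections in the statedumps could be checked to verify this (a sketch; the default dump directory /var/run/gluster is assumed):

# grep -E 'io-cache|quick-read' /var/run/gluster/*.dump.* | head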
Regardless, even when adding up all the caches and values, I can only account for around 2.5GB of memory per node.
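That 2.5GB is back-of-the-envelope arithmetic, dominated by the larger of the two cache-size values across the 20 volumes:

# echo $((20 * 128))MB
2560MB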
Restarting the daemons does not reduce the memory usage, and I cannot find any further information on how to bring it down. ~750MB of memory per volume seems far too much and will quickly lead to serious problems.
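For completeness, this is what I mean by restarting (a sketch; a systemd-managed glusterd is assumed, and since restarting glusterd alone leaves the brick processes running, I also cycled the volumes):

# systemctl restart glusterd
# gluster volume stop <volumeId> && gluster volume start <volumeId>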
Any hints?