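We installed OpsCenter and the DataStax agent from the DataStax community packages on a cluster with a private network. The configuration seems fine, and we have no connection problems on jmx or stomp. Only one error occurs, related to permissions for the df command, but after the cluster is registered no data is available. Any ideas? Below you can see our address.yaml configuration file and the opscenter + datastax-agent logs. Thanks a lot.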
address.yaml:
# DataStax Agent address.yaml file (20141219)
# Set stomp_interface to OpsCenter server
stomp_interface: 10.0.0.2
local_interface: 10.0.0.4
local_address: 10.0.0.4
stomp_port: 61620
# Set SSL mode to 0
use_ssl: 0
cassandra_conf: /etc/cassandra/conf/cassandra.yaml
cassandra_log_location: /data/cassandra/log
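To back up the claim that jmx and stomp are reachable, the endpoints can be probed with a plain TCP connect. This is a minimal sketch, assuming Python 3 is available on the nodes; the hosts and ports come from the address.yaml above and the logs below, while the helper names port_is_open and report are made up for illustration. It only shows that a port accepts connections, not that the service behind it is healthy.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


# Endpoints taken from address.yaml and the logs in this post.
ENDPOINTS = {
    "stomp (agent -> opscenterd)": ("10.0.0.2", 61620),
    "agent HTTP API (opscenterd -> agent)": ("10.0.0.4", 61621),
    "JMX (agent -> local Cassandra)": ("127.0.0.1", 7199),
}


def report(endpoints=ENDPOINTS):
    """Check every endpoint and return a {label: reachable} mapping."""
    return {
        label: port_is_open(host, port)
        for label, (host, port) in endpoints.items()
    }
```

Calling report() prints nothing by itself; it returns a dictionary that can be printed. Each check is only meaningful from the machine that initiates that connection: the stomp and JMX checks from the agent node, the agent HTTP API check from the OpsCenter server.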
Agent log
INFO [main] 2014-12-19 16:21:40,446 Waiting for the config from OpsCenter
INFO [main] 2014-12-19 16:21:40,449 Using 10.0.0.4 as the cassandra broadcast address
INFO [main] 2014-12-19 16:21:40,493 cassandra RPC address is nil
INFO [main] 2014-12-19 16:21:40,493 agent RPC address is 10.0.0.4
INFO [main] 2014-12-19 16:21:40,493 agent RPC broadcast address is 10.0.0.4
INFO [main] 2014-12-19 16:21:40,495 Clearing ssl.truststore
INFO [main] 2014-12-19 16:21:40,495 Clearing ssl.truststore.password
INFO [main] 2014-12-19 16:21:40,496 Setting ssl.store.type to JKS
INFO [main] 2014-12-19 16:21:40,496 Clearing kerberos.service.principal.name
INFO [main] 2014-12-19 16:21:40,496 Clearing kerberos.principal
INFO [main] 2014-12-19 16:21:40,496 Clearing kerberos.useTicketCache
INFO [main] 2014-12-19 16:21:40,496 Clearing kerberos.ticketCache
INFO [main] 2014-12-19 16:21:40,496 Clearing kerberos.useKeyTab
INFO [main] 2014-12-19 16:21:40,497 Clearing kerberos.keyTab
INFO [main] 2014-12-19 16:21:40,497 Clearing kerberos.renewTGT
INFO [main] 2014-12-19 16:21:40,497 Clearing kerberos.debug
INFO [main] 2014-12-19 16:21:40,497 Starting Stomp
INFO [main] 2014-12-19 16:21:40,498 SSL communication is disabled
INFO [main] 2014-12-19 16:21:40,498 Creating stomp connection to 10.0.0.2:61620
INFO [thrift-init] 2014-12-19 16:21:40,499 Connecting to Cassandra cluster: 10.0.0.4 (port 9160)
INFO [StompConnection receiver] 2014-12-19 16:21:40,501 Reconnecting in 0s.
INFO [thrift-init] 2014-12-19 16:21:40,506 Downed Host Retry service started with queue size -1 and retry delay 10s
INFO [thrift-init] 2014-12-19 16:21:40,508 Registering JMX me.prettyprint.cassandra.service_Agent Cluster:ServiceType=hector,MonitorType=hector
INFO [main] 2014-12-19 16:21:40,514 Starting Jetty server: {:port 61621, :host nil, :ssl? false, :join? false}
INFO [StompConnection receiver] 2014-12-19 16:21:40,516 Connected to 10.0.0.2:61620
INFO [pdp-loader] 2014-12-19 16:21:40,534 in execute with client org.apache.cassandra.thrift.Cassandra$Client@37571a58
INFO [thrift-init] 2014-12-19 16:21:40,534 Connected to Cassandra cluster: project1
INFO [thrift-init] 2014-12-19 16:21:40,537 in execute with client org.apache.cassandra.thrift.Cassandra$Client@37571a58
INFO [thrift-init] 2014-12-19 16:21:40,537 Using partitioner: org.apache.cassandra.dht.Murmur3Partitioner
INFO [pdp-loader] 2014-12-19 16:21:40,541 Attempting to load stored metric values.
INFO [Jetty] 2014-12-19 16:21:40,545 Jetty server started
INFO [StompConnection receiver] 2014-12-19 16:21:40,552 Got new config from OpsCenter: {:kerberos_use_keytab true, :rollups300_ttl 2419200, :kerberos_use_ticket_cache true, :rollups60_ttl 604800, :thrift_port 9160, :ec2_metadata_api_host "169.254.169.254", :metrics_enabled 1, :rollups7200_ttl 31536000, :thrift_ssl_truststore nil, :metrics_ignored_column_families "", :cassandra_log_location "/var/log/cassandra/system.log", :thrift_rpc_interface "10.0.0.4", :config_md5 "c8a350163b1373d9ec5b84d5141ca026", :jmx_port 7199, :provisioning 0, :use_ssl 0, :kerberos_debug false, :rollups86400_ttl -1, :api_port "61621", :storage_keyspace "OpsCenter", :kerberos_renew_tgt true, :metrics_ignored_solr_cores "", :thrift_ssl_truststore_type "JKS", :metrics_ignored_keyspaces "system, system_traces, system_auth, dse_auth, OpsCenter", :rollup_subscriptions [], :cassandra_install_location ""}
INFO [StompConnection receiver] 2014-12-19 16:21:40,553 Starting up agent collection.
INFO [StompConnection receiver] 2014-12-19 16:21:40,568 Starting OS metric collectors (Linux)
INFO [StompConnection receiver] 2014-12-19 16:21:40,573 Starting Cassandra JMX metric collectors
INFO [StompConnection receiver] 2014-12-19 16:21:40,580 New JMX connection (127.0.0.1:7199)
INFO [jmx-metrics-1] 2014-12-19 16:21:45,577 New JMX connection (127.0.0.1:7199)
ERROR [os-metrics-8] 2014-12-18 17:44:11,029 Short os-stats collector failed: Process failed: df --print-type --no-sync --block-size=1G --local
Exit val: 1
Output:
df: `/var/named/chroot/etc/named': Permission denied
df: `/var/named/chroot/var/named': Permission denied
df: `/var/named/chroot/etc/named.conf': Permission denied
df: `/var/named/chroot/etc/named.rfc1912.zones': Permission denied
df: `/var/named/chroot/etc/rndc.key': Permission denied
df: `/var/named/chroot/usr/lib64/bind': Permission denied
df: `/var/named/chroot/etc/named.iscdlv.key': Permission denied
df: `/var/named/chroot/etc/named.root.key': Permission denied
Filesystem Type 1G-blocks Used Available Use% Mounted on
rootfs rootfs 20 2 17 8% /
/dev/root ext3 20 2 17 8% /
devtmpfs devtmpfs 16 1 16 1% /dev
tmpfs tmpfs 16 0 16 0% /dev/shm
/dev/md3 ext3 2753 1 2614 1% /data
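The df failure above is worth a closer look: df exits with status 1 because of the unreadable /var/named/chroot paths, yet it still prints a complete table. The following sketch (Python 3 and GNU df assumed) runs the same command as in the log and parses whatever table is printed, regardless of the exit status. The function names run_df and parse_df are hypothetical, and this is not how the DataStax agent itself is implemented; it only illustrates that the disk data is present in the output.

```python
import subprocess

# Same command line as reported in the agent log.
DF_COMMAND = ["df", "--print-type", "--no-sync", "--block-size=1G", "--local"]


def run_df():
    """Run df and return (exit_code, stdout, stderr) without raising on exit 1."""
    proc = subprocess.run(DF_COMMAND, capture_output=True, text=True, check=False)
    return proc.returncode, proc.stdout, proc.stderr


def parse_df(output):
    """Parse df table text into a list of dicts, skipping 'df: ...' error lines."""
    rows = []
    header_seen = False
    for line in output.splitlines():
        if line.startswith("df:"):
            continue  # per-path error such as "Permission denied"
        fields = line.split()
        if not header_seen:
            # Everything before the header row is ignored.
            if fields[:2] == ["Filesystem", "Type"]:
                header_seen = True
            continue
        if len(fields) != 7:
            continue  # mount points containing spaces are not handled here
        filesystem, fstype, size, used, avail, pct, mount = fields
        rows.append({
            "filesystem": filesystem,
            "type": fstype,
            "size_gb": int(size),
            "used_gb": int(used),
            "avail_gb": int(avail),
            "use_pct": int(pct.rstrip("%")),
            "mount": mount,
        })
    return rows
```

Running run_df() as the agent's user and feeding the captured text to parse_df shows whether the same permission errors occur outside the agent and what data survives them.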
opscenterd.log
2014-12-19 16:21:30+0100 [project1] WARN: Ignoring scheduled job with type=best-practice, which is only supported with DataStax Enterprise.
2014-12-19 16:21:30+0100 [project1] WARN: Ignoring scheduled job with type=best-practice, which is only supported with DataStax Enterprise.
2014-12-19 16:21:30+0100 [project1] WARN: Ignoring scheduled job with type=best-practice, which is only supported with DataStax Enterprise.
2014-12-19 16:21:30+0100 [project1] WARN: Ignoring scheduled job with type=best-practice, which is only supported with DataStax Enterprise.
2014-12-19 16:21:30+0100 [project1] INFO: Done loading persisted scheduled job descriptions
2014-12-19 16:21:30+0100 [project1] INFO: Done initializing event storage.
2014-12-19 16:21:30+0100 [project1] INFO: OpsCenter starting up.
2014-12-19 16:21:30+0100 [] DEBUG: Persisting config file /etc/opscenter/clusters/project1.conf
2014-12-19 16:21:30+0100 [] INFO: Finished starting new cluster services for project1
2014-12-19 16:21:30+0100 [project1] INFO: Stopping repair service
2014-12-19 16:21:30+0100 [project1] WARN: HTTP request http://node4.project:61621/load-stomp-conf? failed: [<twisted.python.failure.Failure <class 'twisted.internet.error.ConnectionLost'>>]
2014-12-19 16:21:30+0100 [project1] INFO: Agent automatic setup failed for (10.0.0.6,10.0.0.10,10.0.0.3): Connection refused.
2014-12-19 16:21:30+0100 [project1] INFO: Agent automatic setup failed for (10.0.0.4): [<twisted.python.failure.Failure <class 'twisted.internet.error.ConnectionLost'>>]
2014-12-19 16:21:35+0100 [project1] DEBUG: Sending agent config to 10.0.0.4 on stomp channel /424343461/conf
2014-12-19 16:21:35+0100 [project1] INFO: Agent for ip 10.0.0.4 is version u'5.0.2'
2014-12-19 16:21:35+0100 [project1] DEBUG: Version for node None changed from {'search': None, 'jobtracker': None, 'tasktracker': None, 'spark': {u'master': None, u'version': None, u'worker': None}, 'dse': None, 'cassandra': u'2.0.11'} to {'search': None, 'jobtracker': None, 'tasktracker': None, 'spark': {u'master': None, u'version': None, u'worker': None}, 'dse': None, 'cassandra': u'2.0.11'}
2014-12-19 16:21:35+0100 [project1] INFO: Node 10.0.0.4 changed its mode to normal
2014-12-19 16:21:36+0100 [project1] INFO: Using 10.0.0.4 as the RPC address for node 10.0.0.4
2014-12-19 16:21:36+0100 [project1] INFO: Using 10.0.0.4 as the RPC address for node 10.0.0.4
2014-12-19 16:21:50+0100 [] DEBUG: EC2 Instance type: None
2014-12-19 16:21:50+0100 [] DEBUG: EC2 Instance ami: None
2014-12-19 16:21:50+0100 [] INFO: Package Manager: Unknown
2014-12-19 16:21:59+0100 [project1] DEBUG: Collecting node/token list over Thrift
2014-12-19 16:22:20+0100 [] DEBUG: Average opscenterd CPU usage: 2.14%, memory usage: 45 MB
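The opscenterd log above reports "Agent automatic setup failed" for several nodes, so it may help to list exactly which addresses are affected and why. A small sketch, assuming Python 3 and the log line format shown above; the function name failed_agent_setups is hypothetical, and the regular expression is based only on the two failure lines in this post.

```python
import re

# Matches e.g. "... Agent automatic setup failed for (10.0.0.6,10.0.0.10): Connection refused."
FAILED_SETUP = re.compile(r"Agent automatic setup failed for \(([^)]*)\): (.*)$")


def failed_agent_setups(log_lines):
    """Return {ip: reason} for every node whose automatic agent setup failed."""
    failures = {}
    for line in log_lines:
        match = FAILED_SETUP.search(line.rstrip("\n"))
        if match is None:
            continue
        reason = match.group(2)
        # One log line can list several comma-separated addresses.
        for ip in match.group(1).split(","):
            failures[ip.strip()] = reason
    return failures
```

The function accepts any iterable of lines, so an open file object for opscenterd.log can be passed directly. For the log above it would report 10.0.0.6, 10.0.0.10 and 10.0.0.3 with "Connection refused." and 10.0.0.4 with the ConnectionLost failure.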