我正在运行 CentOS 6.2 服务器 1GB 内存 98GB 磁盘大小
Elasticsearch 版本 1.4.2,包含大约 9000 个文档,并使用 JDBC 从 mysql 数据库检索记录
在过去几周里,它平均每天崩溃 2 次。发生这种情况时,服务器无法访问,除非重新启动。(没有 ssh 访问权限,尝试访问网站时出现 500 错误等)
频繁重启 Elasticsearch 似乎可以防止服务器宕机,但我更愿意找到并修复问题的根源。以下是几天前发生的崩溃的日志:
Elasticsearch.log
[2015-10-04 10:20:58,247][WARN ][monitor.jvm ] [Speedball] [gc][young][152940][2627] duration [2.2s], collections [1]/[3.9s], total [2.2s]/[12.5m], memory [89.8mb]->[35.9mb]/[197.2mb], all_pools {[young] [54.1mb]->[110.1kb]/[54.5mb]}{[survivor] [472.9kb]->[656.6kb]/[6.7mb]}{[old] [35.2mb]->[35.2mb]/[136mb]}
[2015-10-04 10:21:05,538][INFO ][index.shard.service ] [Speedball] [rondreis][0] updating refresh_interval from [1s] to [-1000]
[2015-10-04 10:21:06,795][INFO ][index.shard.service ] [Speedball] [tblproductoffer][0] updating refresh_interval from [1s] to [-1000]
[2015-10-04 10:22:54,653][WARN ][monitor.jvm ] [Speedball] [gc][young][153002][2628] duration [1.3s], collections [1]/[1.9s], total [1.3s]/[12.5m], memory [90.3mb]->[36.8mb]/[197.2mb], all_pools {[young] [54.5mb]->[24kb]/[54.5mb]}{[survivor] [656.6kb]->[1.5mb]/[6.7mb]}{[old] [35.2mb]->[35.2mb]/[136mb]}
[2015-10-04 10:22:57,406][INFO ][index.shard.service ] [Speedball] [rondreis][0] updating refresh_interval from [-1000] to [1s]
[2015-10-04 10:22:59,971][WARN ][monitor.jvm ] [Speedball] [gc][young][153006][2630] duration [1.7s], collections [1]/[2.2s], total [1.7s]/[12.5m], memory [83.1mb]->[36.8mb]/[197.2mb], all_pools {[young] [42.8mb]->[45.3kb]/[54.5mb]}{[survivor] [5mb]->[835.8kb]/[6.7mb]}{[old] [35.2mb]->[35.9mb]/[136mb]}
[2015-10-04 10:23:08,215][INFO ][river.jdbc.RiverMetrics ] pipeline org.xbib.elasticsearch.plugin.jdbc.RiverPipeline@f106d7f complete: river jdbc/rondreis metrics: 3304 rows, 0.1295726757143051 mean, (6.604331351617896 1.365257914334395 0.45761655479553015), ingest metrics: elapsed 2 minutes 2 seconds, 915.27 KB bytes, 283.0 bytes avg, 0.007 MB/s
[2015-10-04 10:25:33,021][WARN ][monitor.jvm ] [Speedball] [gc][young][153151][2632] duration [1.5s], collections [1]/[1s], total [1.5s]/[12.5m], memory [64.7mb]->[91.3mb]/[197.2mb], all_pools {[young] [27.8mb]->[54.5mb]/[54.5mb]}{[survivor] [917.6kb]->[2.1mb]/[6.7mb]}{[old] [35.9mb]->[35.9mb]/[136mb]}
[2015-10-04 10:25:43,724][INFO ][monitor.jvm ] [Speedball] [gc][young][153154][2639] duration [1.4s], collections [2]/[2.2s], total [1.4s]/[12.6m], memory [91.7mb]->[45.3mb]/[197.2mb], all_pools {[young] [48.4mb]->[168.9kb]/[54.5mb]}{[survivor] [4.3mb]->[4.2mb]/[6.7mb]}{[old] [39mb]->[40.9mb]/[136mb]}
[2015-10-04 10:25:57,116][WARN ][monitor.jvm ] [Speedball] [gc][young][153165][2641] duration [3s], collections [1]/[3.3s], total [3s]/[12.7m], memory [90.7mb]->[52.1mb]/[197.2mb], all_pools {[young] [44.5mb]->[2kb]/[54.5mb]}{[survivor] [4.2mb]->[6.7mb]/[6.7mb]}{[old] [41.8mb]->[45.4mb]/[136mb]}
[2015-10-04 10:26:06,283][WARN ][monitor.jvm ] [Speedball] [gc][young][153171][2642] duration [3.6s], collections [1]/[3.9s], total [3.6s]/[12.7m], memory [89.4mb]->[54.8mb]/[197.2mb], all_pools {[young] [37.2mb]->[4.3kb]/[54.5mb]}{[survivor] [6.7mb]->[5mb]/[6.7mb]}{[old] [45.4mb]->[49.8mb]/[136mb]}
[2015-10-04 10:26:12,194][WARN ][monitor.jvm ] [Speedball] [gc][young][153172][2643] duration [4.9s], collections [1]/[5.9s], total [4.9s]/[12.8m], memory [54.8mb]->[57.9mb]/[197.2mb], all_pools {[young] [4.3kb]->[4.1kb]/[54.5mb]}{[survivor] [5mb]->[3.7mb]/[6.7mb]}{[old] [49.8mb]->[54.2mb]/[136mb]}
[2015-10-04 10:26:38,171][WARN ][monitor.jvm ] [Speedball] [gc][young][153176][2644] duration [3s], collections [1]/[22.2s], total [3s]/[12.9m], memory [101.9mb]->[42.2mb]/[197.2mb], all_pools {[young] [43.9mb]->[4.1kb]/[54.5mb]}{[survivor] [3.7mb]->[0b]/[6.7mb]}{[old] [54.2mb]->[42.2mb]/[136mb]}
[2015-10-04 10:26:39,186][WARN ][monitor.jvm ] [Speedball] [gc][old][153176][209] duration [18.3s], collections [1]/[22.2s], total [18.3s]/[4.4m], memory [101.9mb]->[42.2mb]/[197.2mb], all_pools {[young] [43.9mb]->[4.1kb]/[54.5mb]}{[survivor] [3.7mb]->[0b]/[6.7mb]}{[old] [54.2mb]->[42.2mb]/[136mb]}
[2015-10-04 10:26:47,155][WARN ][monitor.jvm ] [Speedball] [gc][young][153182][2646] duration [2.6s], collections [1]/[2.9s], total [2.6s]/[12.9m], memory [102.9mb]->[67.5mb]/[197.2mb], all_pools {[young] [47mb]->[6.1kb]/[54.5mb]}{[survivor] [6.7mb]->[6.7mb]/[6.7mb]}{[old] [49.1mb]->[60.8mb]/[136mb]}
[2015-10-04 10:26:54,247][WARN ][monitor.jvm ] [Speedball] [gc][young][153186][2647] duration [1.4s], collections [1]/[3.1s], total [1.4s]/[12.9m], memory [106mb]->[77.7mb]/[197.2mb], all_pools {[young] [38.4mb]->[842.4kb]/[54.5mb]}{[survivor] [6.7mb]->[6.7mb]/[6.7mb]}{[old] [60.8mb]->[70.2mb]/[136mb]}
[2015-10-04 10:27:03,745][WARN ][monitor.jvm ] [Speedball] [gc][young][153189][2648] duration [1.6s], collections [1]/[7s], total [1.6s]/[13m], memory [118.3mb]->[51.7mb]/[197.2mb], all_pools {[young] [41.3mb]->[8.1kb]/[54.5mb]}{[survivor] [6.7mb]->[0b]/[6.7mb]}{[old] [70.2mb]->[51.7mb]/[136mb]}
[2015-10-04 10:27:58,542][INFO ][index.shard.service ] [Speedball] [tblproductoffer][0] updating refresh_interval from [-1000] to [1s]
[2015-10-04 10:28:27,313][INFO ][river.jdbc.RiverMetrics ] pipeline org.xbib.elasticsearch.plugin.jdbc.RiverPipeline@72a191c1 complete: river jdbc/tblproductoffer metrics: 76480 rows, 2.962182052284631 mean, (98.523212205934 28.85622315906601 10.274778138333478), ingest metrics: elapsed 7 minutes 21 seconds, 16.72 MB bytes, 229.0 bytes avg, 0.038 MB/s
消息
Oct 4 10:47:33 kernel: ps invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
Oct 4 10:47:35 kernel: ps cpuset=/ mems_allowed=0
Oct 4 10:47:35 kernel: Pid: 31557, comm: ps Not tainted 2.6.32-504.3.3.el6.x86_64 #1
Oct 4 10:47:35 kernel: Call Trace:
Oct 4 10:47:35 kernel: [<ffffffff810d40c1>] ? cpuset_print_task_mems_allowed+0x91/0xb0
Oct 4 10:47:35 kernel: [<ffffffff81127300>] ? dump_header+0x90/0x1b0
Oct 4 10:47:35 kernel: [<ffffffff8122eabc>] ? security_real_capable_noaudit+0x3c/0x70
Oct 4 10:47:35 kernel: [<ffffffff81127782>] ? oom_kill_process+0x82/0x2a0
Oct 4 10:47:35 kernel: [<ffffffff811276c1>] ? select_bad_process+0xe1/0x120
Oct 4 10:47:35 kernel: [<ffffffff81127bc0>] ? out_of_memory+0x220/0x3c0
Oct 4 10:47:35 kernel: [<ffffffff811344df>] ? __alloc_pages_nodemask+0x89f/0x8d0
Oct 4 10:47:35 kernel: [<ffffffff8116c69a>] ? alloc_pages_current+0xaa/0x110
Oct 4 10:47:35 kernel: [<ffffffff811246f7>] ? __page_cache_alloc+0x87/0x90
Oct 4 10:47:35 kernel: [<ffffffff811240de>] ? find_get_page+0x1e/0xa0
Oct 4 10:47:35 kernel: [<ffffffff81125697>] ? filemap_fault+0x1a7/0x500
Oct 4 10:47:35 kernel: [<ffffffff8114eae4>] ? __do_fault+0x54/0x530
Oct 4 10:47:35 kernel: [<ffffffff8114f0b7>] ? handle_pte_fault+0xf7/0xb00
Oct 4 10:47:35 kernel: [<ffffffff81198b9b>] ? pipe_wait+0x5b/0x80
Oct 4 10:47:35 kernel: [<ffffffff8114fcea>] ? handle_mm_fault+0x22a/0x300
Oct 4 10:47:35 kernel: [<ffffffff8104d0d8>] ? __do_page_fault+0x138/0x480
Oct 4 10:47:35 kernel: [<ffffffff8128c245>] ? _atomic_dec_and_lock+0x55/0x80
Oct 4 10:47:35 kernel: [<ffffffff811b07d0>] ? mntput_no_expire+0x30/0x110
Oct 4 10:47:35 kernel: [<ffffffff8118f981>] ? __fput+0x1a1/0x210
Oct 4 10:47:35 kernel: [<ffffffff8152ff7e>] ? do_page_fault+0x3e/0xa0
Oct 4 10:47:35 kernel: [<ffffffff8152d335>] ? page_fault+0x25/0x30
Oct 4 10:47:35 kernel: Mem-Info:
Oct 4 10:47:35 kernel: Node 0 DMA per-cpu:
Oct 4 10:47:35 kernel: CPU 0: hi: 0, btch: 1 usd: 0
Oct 4 10:47:35 kernel: Node 0 DMA32 per-cpu:
Oct 4 10:47:35 kernel: CPU 0: hi: 186, btch: 31 usd: 173
Oct 4 10:47:35 kernel: active_anon:87918 inactive_anon:88009 isolated_anon:5376
Oct 4 10:47:35 kernel: active_file:136 inactive_file:122 isolated_file:332
Oct 4 10:47:35 kernel: unevictable:0 dirty:0 writeback:284 unstable:0
Oct 4 10:47:35 kernel: free:12258 slab_reclaimable:3027 slab_unreclaimable:14974
Oct 4 10:47:35 kernel: mapped:478 shmem:22 pagetables:36389 bounce:0
Oct 4 10:47:35 kernel: Node 0 DMA free:4648kB min:668kB low:832kB high:1000kB active_anon:4400kB inactive_anon:4648kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):768kB isolated(file):48kB present:15368kB mlocked:0kB dirty:0kB writeback:20kB mapped:44kB shmem:0kB slab_reclaimable:36kB slab_unreclaimable:244kB kernel_stack:16kB pagetables:864kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:12 all_unreclaimable? no
Oct 4 10:47:35 kernel: lowmem_reserve[]: 0 994 994 994
Oct 4 10:47:35 kernel: Node 0 DMA32 free:44384kB min:44384kB low:55480kB high:66576kB active_anon:347272kB inactive_anon:347388kB active_file:544kB inactive_file:488kB unevictable:0kB isolated(anon):20736kB isolated(file):1280kB present:1018016kB mlocked:0kB dirty:0kB writeback:1116kB mapped:1868kB shmem:88kB slab_reclaimable:12072kB slab_unreclaimable:59652kB kernel_stack:3440kB pagetables:144692kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:3034 all_unreclaimable? yes
Oct 4 10:47:35 kernel: lowmem_reserve[]: 0 0 0 0
Oct 4 10:47:35 kernel: Node 0 DMA: 34*4kB 64*8kB 22*16kB 2*32kB 8*64kB 8*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 4648kB
Oct 4 10:47:35 kernel: Node 0 DMA32: 2240*4kB 2782*8kB 373*16kB 81*32kB 34*64kB 3*128kB 2*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 44384kB
Oct 4 10:47:35 kernel: 17908 total pagecache pages
Oct 4 10:47:35 kernel: 17292 pages in swap cache
Oct 4 10:47:35 kernel: Swap cache stats: add 2986384, delete 2969092, find 1357308/1573932
Oct 4 10:47:35 kernel: Free swap = 0kB
Oct 4 10:47:35 kernel: Total swap = 2097148kB
Oct 4 10:47:35 kernel: 262127 pages RAM
Oct 4 10:47:35 kernel: 7514 pages reserved
Oct 4 10:47:35 kernel: 132906 pages shared
Oct 4 10:47:35 kernel: 231782 pages non-shared
Oct 4 10:47:35 kernel: [ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name
Oct 4 10:47:35 kernel: [ 294] 0 294 2725 36 0 -17 -1000 udevd
Oct 4 10:47:35 kernel: [ 812] 0 812 6905 74 0 -17 -1000 auditd
Oct 4 10:47:35 kernel: [ 842] 0 842 62271 98 0 0 0 rsyslogd
Oct 4 10:47:35 kernel: [ 863] 25 863 40013 68 0 0 0 named
Oct 4 10:47:35 kernel: [ 878] 81 878 5358 38 0 0 0 dbus-daemon
Oct 4 10:47:35 kernel: [ 905] 0 905 1020 37 0 0 0 acpid
Oct 4 10:47:35 kernel: [ 925] 0 925 16672 39 0 -17 -1000 sshd
Oct 4 10:47:35 kernel: [ 937] 0 937 13145 14 0 0 0 vsftpd
Oct 4 10:47:35 kernel: [ 974] 0 974 27075 41 0 0 0 mysqld_safe
Oct 4 10:47:35 kernel: [ 1289] 27 1289 237303 9646 0 0 0 mysqld
Oct 4 10:47:35 kernel: [ 1369] 0 1369 4816 35 0 0 0 dovecot
Oct 4 10:47:35 kernel: [ 1375] 97 1375 3246 43 0 0 0 anvil
Oct 4 10:47:35 kernel: [ 1376] 0 1376 3278 45 0 0 0 log
Oct 4 10:47:35 kernel: [ 1378] 0 1378 3589 44 0 0 0 config
Oct 4 10:47:35 kernel: [ 1394] 498 1394 271775 3584 0 0 0 java
Oct 4 10:47:35 kernel: [ 1408] 93 1408 23430 42 0 0 0 exim
Oct 4 10:47:35 kernel: [ 1420] 0 1420 112748 522 0 0 0 httpd
Oct 4 10:47:35 kernel: [ 1437] 0 1437 16578 52 0 0 0 nginx
Oct 4 10:47:35 kernel: [ 1439] 496 1439 16713 139 0 0 0 nginx
Oct 4 10:47:35 kernel: [ 1441] 496 1441 16762 137 0 0 0 nginx
Oct 4 10:47:35 kernel: [ 1442] 496 1442 16579 85 0 0 0 nginx
Oct 4 10:47:35 kernel: [ 1450] 0 1450 13814 11 0 0 0 vesta-nginx
Oct 4 10:47:35 kernel: [ 1452] 502 1452 13814 29 0 0 0 vesta-nginx
Oct 4 10:47:35 kernel: [ 1462] 0 1462 34253 56 0 0 0 vesta-php
Oct 4 10:47:35 kernel: [ 1471] 502 1471 34253 31 0 0 0 vesta-php
Oct 4 10:47:35 kernel: [ 1472] 502 1472 34253 31 0 0 0 vesta-php
Oct 4 10:47:35 kernel: [ 1482] 0 1482 29333 68 0 0 0 crond
Oct 4 10:47:35 kernel: [ 1499] 0 1499 204119 485 0 0 0 fail2ban-server
Oct 4 10:47:35 kernel: [ 1539] 0 1539 1016 34 0 0 0 mingetty
Oct 4 10:47:35 kernel: [ 1541] 0 1541 1016 34 0 0 0 mingetty
Oct 4 10:47:35 kernel: [ 1543] 0 1543 1016 34 0 0 0 mingetty
Oct 4 10:47:35 kernel: [ 1545] 0 1545 1016 34 0 0 0 mingetty
Oct 4 10:47:35 kernel: [ 1547] 0 1547 1016 34 0 0 0 mingetty
Oct 4 10:47:35 kernel: [ 1549] 0 1549 1016 34 0 0 0 mingetty
Oct 4 10:47:35 kernel: [ 1560] 0 1560 2724 31 0 -17 -1000 udevd
Oct 4 10:47:35 kernel: [ 1561] 0 1561 2724 31 0 -17 -1000 udevd
Oct 4 10:47:35 kernel: [10565] 0 10565 62866 239 0 0 0 httpd
Oct 4 10:47:35 kernel: [10567] 503 10567 141456 1109 0 0 0 httpd
Oct 4 10:47:35 kernel: [10568] 503 10568 141469 1260 0 0 0 httpd
Oct 4 10:47:35 kernel: [10569] 503 10569 142692 1186 0 0 0 httpd
Oct 4 10:47:35 kernel: [10570] 503 10570 141602 1183 0 0 0 httpd
Oct 4 10:47:35 kernel: [10571] 503 10571 141610 589 0 0 0 httpd
Oct 4 10:47:35 kernel: [10572] 503 10572 141756 1256 0 0 0 httpd
Oct 4 10:47:35 kernel: [10573] 503 10573 141157 1020 0 0 0 httpd
Oct 4 10:47:35 kernel: [10574] 503 10574 142386 1227 0 0 0 httpd
Oct 4 10:47:35 kernel: [12531] 503 12531 141259 606 0 0 0 httpd
Oct 4 10:47:35 kernel: [12532] 503 12532 141667 945 0 0 0 httpd
Oct 4 10:47:35 kernel: [12533] 503 12533 141454 609 0 0 0 httpd
Oct 4 10:47:35 kernel: [29641] 502 29641 35055 52 0 0 0 crond
Oct 4 10:47:35 kernel: [29645] 0 29645 43896 51 0 0 0 sudo
Oct 4 10:47:35 kernel: [29648] 0 29648 26582 46 0 0 0 v-backup-users
Oct 4 10:47:35 kernel: [29958] 0 29958 26681 46 0 0 0 v-backup-user
Oct 4 10:47:35 kernel: [30067] 0 30067 28084 67 0 0 0 find
Oct 4 10:47:35 kernel: [30068] 0 30068 30701 45 0 0 0 tar
Oct 4 10:47:35 kernel: [30826] 503 30826 116293 1098 0 0 0 httpd
Oct 4 10:47:35 kernel: [30834] 503 30834 116141 630 0 0 0 httpd
Oct 4 10:47:35 kernel: [30835] 503 30835 116329 1236 0 0 0 httpd
Oct 4 10:47:35 kernel: [30837] 503 30837 116329 997 0 0 0 httpd
Oct 4 10:47:35 kernel: [30838] 503 30838 116397 1242 0 0 0 httpd
Oct 4 10:47:35 kernel: [30839] 503 30839 116329 1057 0 0 0 httpd
Oct 4 10:47:35 kernel: [30844] 503 30844 116236 1131 0 0 0 httpd
Oct 4 10:47:35 kernel: [30845] 503 30845 116108 673 0 0 0 httpd
Oct 4 10:47:35 kernel: [30848] 503 30848 140602 988 0 0 0 httpd
Oct 4 10:47:35 kernel: [30849] 503 30849 116104 614 0 0 0 httpd
—
——continues for a while—
—
Oct 4 10:47:35 kernel: [31576] 503 31576 112875 833 0 0 0 httpd
Oct 4 10:47:35 kernel: [31577] 503 31577 112879 978 0 0 0 httpd
Oct 4 10:47:35 kernel: [31578] 503 31578 112785 797 0 0 0 httpd
Oct 4 10:47:35 kernel: [31579] 503 31579 112876 841 0 0 0 httpd
Oct 4 10:47:35 kernel: [31580] 503 31580 112781 496 0 0 0 httpd
Oct 4 10:47:35 kernel: [31581] 503 31581 112782 726 0 0 0 httpd
Oct 4 10:47:35 kernel: [31582] 503 31582 112875 845 0 0 0 httpd
Oct 4 10:47:35 kernel: [31583] 48 31583 112748 480 0 0 0 httpd
Oct 4 10:47:35 kernel: [31584] 48 31584 112748 480 0 0 0 httpd
Oct 4 10:47:35 kernel: [31585] 48 31585 112748 480 0 0 0 httpd
Oct 4 10:47:35 kernel: [31586] 0 31586 112748 445 0 0 0 httpd
Oct 4 10:47:35 kernel: [31587] 48 31587 112748 441 0 0 0 httpd
Oct 4 10:47:35 kernel: [31588] 0 31588 112748 414 0 0 0 httpd
Oct 4 10:47:35 kernel: [31589] 48 31589 112748 434 0 0 0 httpd
Oct 4 10:47:35 kernel: [31590] 48 31590 112748 441 0 0 0 httpd
Oct 4 10:47:35 kernel: [31591] 0 31591 112748 423 0 0 0 httpd
Oct 4 10:47:35 kernel: [31592] 0 31592 112748 404 0 0 0 httpd
Oct 4 10:47:35 kernel: [31593] 0 31593 112748 445 0 0 0 httpd
Oct 4 10:47:35 kernel: Out of memory: Kill process 1394 (java) score 107 or sacrifice child
Oct 4 10:47:35 kernel: Killed process 1394, UID 498, (java) total-vm:1087100kB, anon-rss:14132kB, file-rss:204kB
以及今天发生故障时的弹性搜索日志(这次能够通过使用“service elasticsearch reboot”使用 ssh 恢复它,而无需重新启动)。
[2015-10-09 10:21:08,062][INFO ][index.shard.service ] [Black Panther] [tblproductoffer][0] updating refresh_interval from [1s] to [-1000]
[2015-10-09 10:22:22,647][DEBUG][action.admin.indices.settings.put] [Black Panther] failed to update settings on indices [rondreis]
org.elasticsearch.cluster.metadata.ProcessClusterEventTimeoutException: failed to process cluster event (update-settings) within 30s
at org.elasticsearch.cluster.service.InternalClusterService$2$1.run(InternalClusterService.java:263)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[2015-10-09 10:22:23,002][WARN ][monitor.jvm ] [Black Panther] [gc][young][83250][1221] duration [5.5s], collections [1]/[14.2s], total [5.5s]/[3.8m], memory [125.8mb]->[74.4mb]/[197.2mb], all_pools {[young] [53.1mb]->[1.5mb]/[54.5mb]}{[survivor] [1018kb]->[1.1mb]/[6.7mb]}{[old] [71.7mb]->[71.7mb]/[136mb]}
[2015-10-09 10:22:23,101][ERROR][river.jdbc.SimpleRiverFlow] failed to process cluster event (update-settings) within 30s
org.elasticsearch.cluster.metadata.ProcessClusterEventTimeoutException: failed to process cluster event (update-settings) within 30s
at org.elasticsearch.cluster.service.InternalClusterService$2$1.run(InternalClusterService.java:263)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[2015-10-09 10:25:02,504][WARN ][monitor.jvm ] [Black Panther] [gc][young][83345][1222] duration [3.4s], collections [1]/[4.2s], total [3.4s]/[3.9m], memory [127.3mb]->[76mb]/[197.2mb], all_pools {[young] [54.5mb]->[1.1mb]/[54.5mb]}{[survivor] [1.1mb]->[3.1mb]/[6.7mb]}{[old] [71.7mb]->[71.7mb]/[136mb]}
[2015-10-09 10:25:42,385][WARN ][monitor.jvm ] [Black Panther] [gc][young][83361][1223] duration [8.8s], collections [1]/[9.7s], total [8.8s]/[4m], memory [128.8mb]->[75.4mb]/[197.2mb], all_pools {[young] [53.9mb]->[66kb]/[54.5mb]}{[survivor] [3.1mb]->[2.8mb]/[6.7mb]}{[old] [71.7mb]->[72.5mb]/[136mb]}
[2015-10-09 10:25:45,132][INFO ][river.jdbc.RiverMetrics ] pipeline org.xbib.elasticsearch.plugin.jdbc.RiverPipeline@322e6dcf complete: river jdbc/rondreis metrics: 3288 rows, 0.12815869356635873 mean, (6.572349117469625 1.358646495863042 0.45540049399748883), ingest metrics: elapsed 4 minutes 38 seconds, 908.23 KB bytes, 282.0 bytes avg, 0.003 MB/s
[2015-10-09 10:27:44,274][INFO ][node ] [Black Panther] stopping ...
[2015-10-09 10:28:01,713][WARN ][river.jdbc.RiverThread ] interrupted
[2015-10-09 10:29:45,413][WARN ][common.jna ] Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out. Increase RLIMIT_MEMLOCK (ulimit).
[2015-10-09 10:30:01,866][INFO ][node ] [Vance Astrovik] version[1.4.2], pid[11796], build[927caff/2014-12-16T14:11:12Z]
[2015-10-09 10:30:01,867][INFO ][node ] [Vance Astrovik] initializing ...
[2015-10-09 10:30:04,408][INFO ][plugins ] [Vance Astrovik] loaded [jdbc-1.4.0.6-664f819], sites []
[2015-10-09 10:32:14,502][INFO ][node ] [Vance Astrovik] initialized
[2015-10-09 10:32:14,502][INFO ][node ] [Vance Astrovik] starting ...
[2015-10-09 10:32:17,555][INFO ][transport ] [Vance Astrovik] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/185.13.224.246:9300]}
[2015-10-09 10:32:18,209][INFO ][discovery ] [Vance Astrovik] elasticsearch/3s3H31MrSCmyEze2t2L_jA
[2015-10-09 10:32:22,383][INFO ][cluster.service ] [Vance Astrovik] new_master [Vance Astrovik][3s3H31MrSCmyEze2t2L_jA][sensead.localdomain][inet[/185.13.224.246:9300]], reason: zen-disco-join (elected_as_master)
[2015-10-09 10:32:22,864][INFO ][http ] [Vance Astrovik] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/185.13.224.246:9200]}
[2015-10-09 10:32:22,864][INFO ][node ] [Vance Astrovik] started
[2015-10-09 10:32:34,032][INFO ][gateway ] [Vance Astrovik] recovered [3] indices into cluster_state
[2015-10-09 10:32:37,313][DEBUG][action.search.type ] [Vance Astrovik] All shards failed for phase: [query]
[2015-10-09 10:32:40,213][INFO ][river.jdbc.JDBCRiver ] scheduled river instance with cron expressions [0 21 * * * ?]
[2015-10-09 10:32:42,423][INFO ][river.jdbc.JDBCRiver ] scheduled river instance with cron expressions [0 21 * * * ?]
[2015-10-09 10:33:14,578][WARN ][indices.ttl ] [Vance Astrovik] failed to execute ttl purge
每 60 秒将最后一行打印到日志中。
我已经尝试解决这个问题几个星期了。我是不是忽略了一些简单的事情,还是发生了一些更大的事情?
答案1
根据日志条目:
Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out. Increase RLIMIT_MEMLOCK (ulimit).
除了升级 RAM 之外,你还需要更改 elasticsearch 配置这里
答案2
好吧,这条消息清楚地表明您遇到的问题是内存不足:
Oct 4 10:47:35 kernel: Out of memory: Kill process 1394 (java) score 107 or sacrifice child
或者限制 java/httpd 实例的内存大小。