AWS-hosted Elasticsearch occasionally "fails to respond"

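We have an Elasticsearch cluster hosted on the Amazon Elasticsearch Service (AWS).

We are using the Jest Java HTTP REST client for Elasticsearch.

Occasionally (roughly 1 in 10,000 requests), the service appears to close the connection without sending a response.

The stack trace in our application looks like this: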

ERROR [2016-04-11 09:18:43,497] io.dropwizard.jersey.errors.LoggingExceptionMapper: Error handling a request: b9b9ee1e4eefadd2
! org.apache.http.NoHttpResponseException: search-xxx.eu-west-1.es.amazonaws.com:443 failed to respond
! at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:143) ~[my-app-0.0.1.jar:0.0.1]
! at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57) ~[my-app-0.0.1.jar:0.0.1]
! at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261) ~[my-app-0.0.1.jar:0.0.1]
! at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:165) ~[my-app-0.0.1.jar:0.0.1]
! at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:167) ~[my-app-0.0.1.jar:0.0.1]
! at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:272) ~[my-app-0.0.1.jar:0.0.1]
! at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:124) ~[my-app-0.0.1.jar:0.0.1]
! at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:271) ~[my-app-0.0.1.jar:0.0.1]
! at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184) ~[my-app-0.0.1.jar:0.0.1]
! at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88) ~[my-app-0.0.1.jar:0.0.1]
! at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110) ~[my-app-0.0.1.jar:0.0.1]
! at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184) ~[my-app-0.0.1.jar:0.0.1]
! at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82) ~[my-app-0.0.1.jar:0.0.1]
! at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107) ~[my-app-0.0.1.jar:0.0.1]
! at io.searchbox.client.http.JestHttpClient.execute(JestHttpClient.java:48) ~[my-app-0.0.1.jar:0.0.1]

The relevant code in org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead is:

final int i = sessionBuffer.readLine(this.lineBuf);
if (i == -1 && count == 0) {
    // The server just dropped connection on us
    throw new NoHttpResponseException("The target server failed to respond");
}

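As far as I know, Amazon does not give me access to the Elasticsearch server logs.

So:

  1. How can I diagnose and fix the cause of this error?
  2. If the best fix is for my application to retry these failures, is there a simple way to retry with Jest? I don't see any configuration option that does this automatically.

TIA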

Answer 1

1: (Don't know yet)

2: You can configure Jest to retry Elasticsearch operations that fail because of network errors, for example:

import io.searchbox.client.JestClientFactory;
import org.apache.http.impl.client.DefaultHttpRequestRetryHandler;
import org.apache.http.impl.client.HttpClientBuilder;

JestClientFactory factory = new JestClientFactory() {
    @Override
    protected HttpClientBuilder configureHttpClient(HttpClientBuilder builder) {
        builder = super.configureHttpClient(builder);

        // See DefaultHttpRequestRetryHandler.requestSentRetryEnabled
        //
        // true if it's OK to retry non-idempotent requests that have been sent
        // and then fail with network issues (not HTTP failures).
        //
        // "true" here will retry POST requests which have been sent but where
        // the response was not received. This arguably is a bit risky.
        //
        // Retries are logged at INFO level to org.apache.http.impl.execchain.RetryExec
        boolean requestSentRetryEnabled = true;

        // Retry each failed request up to 3 times.
        builder.setRetryHandler(new DefaultHttpRequestRetryHandler(
                3,
                requestSentRetryEnabled));

        return builder;
    }
};
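
If you would rather retry in the application, as question 2 suggests, so that you can choose which calls are safe to repeat, a minimal sketch could look like the following. IoRetry is a hypothetical helper written for this answer; it is not part of Jest or Apache HttpClient. It assumes the failure reaches your code as an IOException (NoHttpResponseException is a subclass of IOException), and it retries immediately without any backoff.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration; not part of Jest or Apache HttpClient. */
final class IoRetry {
    private IoRetry() {}

    /**
     * Runs the operation up to maxAttempts times. Only IOException
     * (for example NoHttpResponseException) triggers another attempt;
     * any other exception is propagated immediately.
     */
    static <T> T call(Callable<T> operation, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.call();
            } catch (IOException e) {
                last = e; // network-level failure: remember it and try again
            }
        }
        throw last; // every attempt failed with an IOException
    }
}
```

Assuming an existing JestClient named client and a Search action named search (both placeholders), a call would then look like IoRetry.call(() -> client.execute(search), 3). The same caveat as above applies: only wrap requests that are safe to send more than once.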
