Invalid Argument on EPoll.wait #1226

@stefanos-kalantzis

Java API client version

9.3.2

Java version

21

Elasticsearch Version

9.3.2

Problem description

We first saw this issue with Elasticsearch client/server 8.19.12. We are now running 9.3.2 and the exact same thing happens.

We are using the elasticsearch-java client directly.

When the error occurs, the service behaves as follows:

  1. At some seemingly random point in time the EPoll.wait exception is logged
  2. For the next ~30 seconds, the next 5 Elasticsearch requests fail with: "I/O reactor has been shut down"
  3. After that, all subsequent Elasticsearch requests fail with: "thread waiting for the response was interrupted"

In short, once the "Invalid argument" error from EPoll.wait has occurred, no Elasticsearch request succeeds and the application has to be restarted.
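
As a possible stopgap (not a fix for the underlying EPoll.wait failure), the client could be rebuilt when a request fails because the reactor is dead, instead of restarting the whole application. The sketch below uses only the JDK and is not tied to any real client API: `RecreatingClientHolder`, `Request`, `factory` and `closer` are hypothetical names. It assumes the failure can be recognised by the "I/O reactor has been shut down" message somewhere in the cause chain, as in Exception 2 below. Whether rebuilding also clears the later "interrupted" failures is untested.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical helper: holds a client and swaps in a fresh one after a reactor shutdown. */
final class RecreatingClientHolder<C> {

    /** A request against the client; may throw checked exceptions such as IOException. */
    interface Request<C, R> {
        R apply(C client) throws Exception;
    }

    private final Supplier<C> factory;   // builds a new client (assumed to be cheap enough to repeat)
    private final Consumer<C> closer;    // releases the resources of a dead client
    private final AtomicReference<C> current;

    RecreatingClientHolder(Supplier<C> factory, Consumer<C> closer) {
        this.factory = factory;
        this.closer = closer;
        this.current = new AtomicReference<>(factory.get());
    }

    /** Walks the cause chain looking for the reactor-shutdown message seen in the traces. */
    static boolean isReactorShutdown(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            String msg = cur.getMessage();
            if (msg != null && msg.contains("I/O reactor has been shut down")) {
                return true;
            }
        }
        return false;
    }

    /** Runs the request; on a reactor-shutdown failure, replaces the client, then rethrows. */
    <R> R call(Request<C, R> request) throws Exception {
        C client = current.get();
        try {
            return request.apply(client);
        } catch (Exception e) {
            if (isReactorShutdown(e)) {
                replace(client);
            }
            throw e; // the failed request is not retried here; the caller decides
        }
    }

    private synchronized void replace(C dead) {
        if (current.get() != dead) {
            return; // another thread already swapped in a new client
        }
        current.set(factory.get());
        try {
            closer.accept(dead);
        } catch (RuntimeException ignored) {
            // closing a client whose reactor is already dead may itself fail
        }
    }
}
```

In the service, `factory` would build the transport and client from scratch and `closer` would close the old transport. The failing request is rethrown rather than retried, so a non-idempotent index call is never silently repeated.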

Infrastructure: the application is containerized and runs on Kubernetes, on AWS nodes running Bottlerocket OS (latest version). The Docker image is based on public.ecr.aws/docker/library/ibm-semeru-runtimes:open-21.0.7_6-jre-focal.


Exception 1:

java.io.IOException: Invalid argument
	at java.base/sun.nio.ch.EPoll.wait(Native Method)
	at java.base/sun.nio.ch.EPollSelectorImpl.doSelect(Unknown Source)
	at java.base/sun.nio.ch.SelectorImpl.lockAndDoSelect(Unknown Source)
	at java.base/sun.nio.ch.SelectorImpl.select(Unknown Source)
	at org.apache.hc.core5.reactor.SingleCoreIOReactor.doExecute(SingleCoreIOReactor.java:113)
	at org.apache.hc.core5.reactor.AbstractSingleCoreIOReactor.execute(AbstractSingleCoreIOReactor.java:86)
	at org.apache.hc.core5.reactor.IOReactorWorker.run(IOReactorWorker.java:44)
	at java.base/java.lang.Thread.run(Unknown Source)

Exception 2:

java.lang.RuntimeException: I/O reactor has been shut down
	at co.elastic.clients.transport.rest5_client.low_level.Rest5Client.extractAndWrapCause(Rest5Client.java:953)
	at co.elastic.clients.transport.rest5_client.low_level.Rest5Client.performRequest(Rest5Client.java:308)
	at co.elastic.clients.transport.rest5_client.low_level.Rest5Client.performRequest(Rest5Client.java:293)
	at co.elastic.clients.transport.rest5_client.Rest5ClientHttpClient.performRequest(Rest5ClientHttpClient.java:93)
	at co.elastic.clients.transport.ElasticsearchTransportBase.performRequest(ElasticsearchTransportBase.java:153)
	at co.elastic.clients.elasticsearch.ElasticsearchClient.index(ElasticsearchClient.java:3148)
	at co.elastic.clients.elasticsearch.ElasticsearchClient.index(ElasticsearchClient.java:3359)


...


Caused by: org.apache.hc.core5.reactor.IOReactorShutdownException: I/O reactor has been shut down
	at org.apache.hc.core5.reactor.IOWorkers.validate(IOWorkers.java:51)
	at org.apache.hc.core5.reactor.IOWorkers.access$000(IOWorkers.java:31)
	at org.apache.hc.core5.reactor.IOWorkers$PowerOfTwoSelector.next(IOWorkers.java:67)
	at org.apache.hc.core5.reactor.AbstractIOReactorBase.connect(AbstractIOReactorBase.java:53)
	at org.apache.hc.client5.http.impl.nio.MultihomeIOSessionRequester$2.executeNext(MultihomeIOSessionRequester.java:136)
	at org.apache.hc.client5.http.impl.nio.MultihomeIOSessionRequester$2.run(MultihomeIOSessionRequester.java:185)
	at org.apache.hc.client5.http.impl.nio.MultihomeIOSessionRequester.connect(MultihomeIOSessionRequester.java:189)
	at org.apache.hc.client5.http.impl.nio.DefaultAsyncClientConnectionOperator.connect(DefaultAsyncClientConnectionOperator.java:100)
	at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager.connect(PoolingAsyncClientConnectionManager.java:449)
	at org.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime.connectEndpoint(InternalHttpAsyncExecRuntime.java:216)
	at org.apache.hc.client5.http.impl.async.AsyncConnectExec.proceedToNextHop(AsyncConnectExec.java:201)
	at org.apache.hc.client5.http.impl.async.AsyncConnectExec.access$000(AsyncConnectExec.java:82)
	at org.apache.hc.client5.http.impl.async.AsyncConnectExec$1.completed(AsyncConnectExec.java:153)
	at org.apache.hc.client5.http.impl.async.AsyncConnectExec$1.completed(AsyncConnectExec.java:142)
	at org.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime$1.completed(InternalHttpAsyncExecRuntime.java:119)
	at org.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime$1.completed(InternalHttpAsyncExecRuntime.java:110)
	at org.apache.hc.core5.concurrent.BasicFuture.completed(BasicFuture.java:123)
	at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager$3$1.leaseCompleted(PoolingAsyncClientConnectionManager.java:328)
	at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager$3$1.completed(PoolingAsyncClientConnectionManager.java:313)
	at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager$3$1.completed(PoolingAsyncClientConnectionManager.java:274)
	at org.apache.hc.core5.concurrent.BasicFuture.completed(BasicFuture.java:123)
	at org.apache.hc.core5.pool.StrictConnPool.fireCallbacks(StrictConnPool.java:402)
	at org.apache.hc.core5.pool.StrictConnPool.lease(StrictConnPool.java:220)
	at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager$3.<init>(PoolingAsyncClientConnectionManager.java:271)
	at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager.lease(PoolingAsyncClientConnectionManager.java:266)
	at org.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime.acquireEndpoint(InternalHttpAsyncExecRuntime.java:105)
	at org.apache.hc.client5.http.impl.async.AsyncConnectExec.execute(AsyncConnectExec.java:141)
	at org.apache.hc.client5.http.impl.async.AsyncExecChainElement.execute(AsyncExecChainElement.java:54)
	at org.apache.hc.client5.http.impl.async.AsyncProtocolExec.internalExecute(AsyncProtocolExec.java:207)
	at org.apache.hc.client5.http.impl.async.AsyncProtocolExec.execute(AsyncProtocolExec.java:172)
	at org.apache.hc.client5.http.impl.async.AsyncExecChainElement.execute(AsyncExecChainElement.java:54)
	at org.apache.hc.client5.http.impl.async.AsyncHttpRequestRetryExec.internalExecute(AsyncHttpRequestRetryExec.java:97)
	at org.apache.hc.client5.http.impl.async.AsyncHttpRequestRetryExec.execute(AsyncHttpRequestRetryExec.java:184)
	at org.apache.hc.client5.http.impl.async.AsyncExecChainElement.execute(AsyncExecChainElement.java:54)
	at org.apache.hc.client5.http.impl.async.AsyncRedirectExec.internalExecute(AsyncRedirectExec.java:112)
	at org.apache.hc.client5.http.impl.async.AsyncRedirectExec.execute(AsyncRedirectExec.java:278)
	at org.apache.hc.client5.http.impl.async.AsyncExecChainElement.execute(AsyncExecChainElement.java:54)
	at org.apache.hc.client5.http.impl.async.InternalAbstractHttpAsyncClient.executeImmediate(InternalAbstractHttpAsyncClient.java:347)
	at org.apache.hc.client5.http.impl.async.InternalAbstractHttpAsyncClient.lambda$doExecute$0(InternalAbstractHttpAsyncClient.java:205)
	at org.apache.hc.core5.http.nio.support.BasicRequestProducer.sendRequest(BasicRequestProducer.java:93)
	at org.apache.hc.client5.http.impl.async.InternalAbstractHttpAsyncClient.doExecute(InternalAbstractHttpAsyncClient.java:178)
	at org.apache.hc.client5.http.impl.async.CloseableHttpAsyncClient.execute(CloseableHttpAsyncClient.java:97)
	at org.apache.hc.client5.http.impl.async.CloseableHttpAsyncClient.execute(CloseableHttpAsyncClient.java:107)
	at co.elastic.clients.transport.rest5_client.low_level.Rest5Client.performRequest(Rest5Client.java:302)

Exception 3:

java.lang.RuntimeException: thread waiting for the response was interrupted
	at co.elastic.clients.transport.rest5_client.low_level.Rest5Client.extractAndWrapCause(Rest5Client.java:914)
	at co.elastic.clients.transport.rest5_client.low_level.Rest5Client.performRequest(Rest5Client.java:308)
	at co.elastic.clients.transport.rest5_client.low_level.Rest5Client.performRequest(Rest5Client.java:293)
	at co.elastic.clients.transport.rest5_client.Rest5ClientHttpClient.performRequest(Rest5ClientHttpClient.java:93)
	at co.elastic.clients.transport.ElasticsearchTransportBase.performRequest(ElasticsearchTransportBase.java:153)
	at co.elastic.clients.elasticsearch.ElasticsearchClient.healthReport(ElasticsearchClient.java:2936)


...


Caused by: java.lang.InterruptedException: null
	at java.base/java.lang.Object.waitImpl(Native Method)
	at java.base/java.lang.Object.wait(Unknown Source)
	at java.base/java.lang.Object.wait(Unknown Source)
	at org.apache.hc.core5.concurrent.BasicFuture.get(BasicFuture.java:83)
	at co.elastic.clients.transport.rest5_client.low_level.Rest5Client.performRequest(Rest5Client.java:304)
