JCR-4369: Avoid S3 Incomplete Read Warning by elegant aborting #61
Open
woonsan wants to merge 7994 commits into apache:trunk from woonsan:feature/JCR-4369
Conversation
The AWS S3 SDK recommends aborting the S3ObjectInputStream when the caller does not intend to consume the data: the HTTP connection pool behavior of the underlying HttpClient (which consumes any remaining data on close so the connection can be reused) can cause a significant performance problem when a large amount of unread data would have to be drained. From the AWS S3 SDK's perspective, it is better to simply abort the underlying HttpRequestBase and evict the connection from the pool.
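The close-versus-abort trade-off can be sketched with a toy stream using only the JDK. This is a minimal model of the SDK behavior described above, not the AWS SDK itself; the class name AbortableStream and its drained counter are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Toy model: close() drains any unread bytes so the pooled HTTP connection
// could be reused, while abort() gives up the connection and skips the drain.
// The name and the "drained" counter are illustrative only.
class AbortableStream extends FilterInputStream {
    private boolean aborted;
    int drained; // bytes consumed by close() purely to free the connection

    AbortableStream(InputStream in) {
        super(in);
    }

    /** Give up the underlying connection instead of draining it on close(). */
    void abort() {
        aborted = true;
    }

    @Override
    public void close() throws IOException {
        if (!aborted) {
            // HttpClient-style behavior: consume the remainder before reuse.
            while (in.read() != -1) {
                drained++;
            }
        }
        super.close();
    }
}
```

For a small object the drain is cheap, but when only the first few bytes of a large object have been read, a plain close() would pull the whole remainder over the wire; abort() avoids that at the cost of a discarded connection.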
In a multi-threaded environment (due to concurrent requests and/or the proactiveCaching mode of CachingDataStore), the read-and-store path in o.a.j.c.data.CachingDataStore.getStream(DataIdentifier) can fall into the else block of o.a.j.core.data.LocalCache.store(String, InputStream) even though a file by that name already exists by the time the else block executes. In that case, the S3ObjectInputStream is never read and never aborted, so com.amazonaws.services.s3.internal.S3AbortableInputStream#close() ends up warning about an input stream that was neither aborted nor read fully. Therefore, my fix includes the following:
- LocalCache checks whether the backend resource input stream is abortable and, if so, tries to abort it. For this purpose, the BackendResourceAbortable interface is introduced in jackrabbit-data.
- S3Backend wraps the S3ObjectInputStream to implement BackendResourceAbortable by leveraging commons-io's ProxyInputStream.
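The two pieces of the fix can be sketched as follows. This is a self-contained illustration: the interface name BackendResourceAbortable and the wrapping role of S3Backend/ProxyInputStream come from the description above, but every signature here is assumed, and FakeS3Stream is a stand-in for the real S3ObjectInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Sketch of the new jackrabbit-data interface; the real signature may differ.
interface BackendResourceAbortable {
    void abort();
}

// Stand-in for the AWS SDK's S3ObjectInputStream, which exposes abort().
class FakeS3Stream extends ByteArrayInputStream {
    boolean aborted;

    FakeS3Stream(byte[] buf) {
        super(buf);
    }

    void abort() {
        aborted = true;
    }
}

// S3Backend-style wrapper: in the actual fix this role is played by a
// commons-io ProxyInputStream subclass; here a plain FilterInputStream
// delegates abort() to the wrapped stream.
class BackendStream extends FilterInputStream implements BackendResourceAbortable {
    private final FakeS3Stream s3Stream;

    BackendStream(FakeS3Stream s3Stream) {
        super(s3Stream);
        this.s3Stream = s3Stream;
    }

    @Override
    public void abort() {
        s3Stream.abort();
    }
}

// LocalCache-side check (illustrative): when the cached file already exists
// and the backend stream will never be read, abort it before closing so the
// SDK does not warn about an incomplete read.
class LocalCacheSketch {
    static void discardUnread(InputStream backendStream) throws IOException {
        if (backendStream instanceof BackendResourceAbortable) {
            ((BackendResourceAbortable) backendStream).abort();
        }
        backendStream.close();
    }
}
```

The instanceof check keeps LocalCache decoupled from any S3 types: only backends whose streams opt into BackendResourceAbortable get the abort treatment, and every other backend behaves exactly as before.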