Releases: GoogleCloudDataproc/hadoop-connectors

2.2.22

29 Apr 09:48
  1. Upgrade java-storage to 2.37.0
  2. [Performance] Remove buffer copy in write path for gRPC

2.2.21

18 Mar 05:20
  1. [Bug Fix] Set user agent in gRPC requests

2.2.20

23 Feb 09:29
  1. Fix an issue where downscoping did not work when gRPC was enabled
  2. Add support for renaming folders through the rename backend API for hierarchical namespace buckets
  3. Upgrade java-storage to 2.32.1 and upgrade the version of related dependencies
  4. Introduce separate config for enabling operation tracing.

2.2.19

09 Jan 10:05
  1. Upgrade java-storage to 2.29.0
  2. Upgrade guava to 32.1.2-jre

3.0.0

20 Dec 10:08
  1. Remove Hadoop 2.x support.
  2. Update all dependencies to the latest versions.
  3. Add support for downscoped tokens in AccessTokenProvider.
  4. Implement FileSystem.openFile to take advantage of the FileStatus if
    passed.
  5. Remove an obsolete AuthorizationHandler and related properties:
    fs.gs.authorization.handler.impl
    fs.gs.authorization.handler.properties.<AUTHORIZATION_HANDLER_PROPERTY>
    
  6. Remove support for Apache HTTP transport and related property:
    fs.gs.http.transport.type
    
  7. Support GCS fine-grained action in AuthorizationHandlers.
  8. Decrease log level for hflush rate limit log message.
  9. Remove Cooperative Locking support for directory operations and related
    properties:
    fs.gs.cooperative.locking.enable
    fs.gs.cooperative.locking.expiration.timeout.ms
    fs.gs.cooperative.locking.max.concurrent.operations
    
  10. Migrate authentication to com.google.auth.Credentials and remove obsolete
    properties:
    fs.gs.auth.service.account.email
    fs.gs.auth.service.account.keyfile
    fs.gs.auth.service.account.private.key
    fs.gs.auth.service.account.private.key.id
    
  11. Refactor authentication configuration to use an explicit fs.gs.auth.type
    enum property, instead of relying on inference of the authentication type
    based on the set configuration properties, and remove obsolete properties:
    fs.gs.auth.null.enable
    fs.gs.auth.service.account.enable
    
  12. Add support for a new USER_CREDENTIALS authentication type that retrieves
    a refresh token using the authorisation code grant flow configured via the
    following properties:
    fs.gs.auth.client.id
    fs.gs.auth.client.secret
    fs.gs.auth.refresh.token
    
  13. Merge the functionality of all output stream types into the default
    output stream, which behaves similarly to the FLUSHABLE_COMPOSITE stream,
    and remove the obsolete fs.gs.outputstream.type property.
  14. Set default value for fs.gs.list.max.items.per.call property to 5000.
  15. Set socket read timeout (fs.gs.http.read-timeout) as early as possible on
    new sockets returned from the custom SSLSocketFactory. This guarantees the
    timeout is enforced during TLS handshakes when using Conscrypt as the
    security provider.
  16. The Google Cloud Storage Connector can now be used as a
    Hadoop Credential Provider.
  17. Added dependency on the Cloud Storage Client Library
    (google-cloud-storage).
  18. Rename fs.gs.rewrite.max.bytes.per.call property to
    fs.gs.rewrite.max.chunk.size.
  19. Remove support for the deprecated fs.gs.io.buffersize.write property.
  20. Add support for size suffixes (k, m, g, etc) in values of size-related
    properties:
    fs.gs.inputstream.inplace.seek.limit
    fs.gs.inputstream.min.range.request.size
    fs.gs.outputstream.buffer.size
    fs.gs.outputstream.pipe.buffer.size
    fs.gs.outputstream.upload.cache.size
    fs.gs.outputstream.upload.chunk.size
    fs.gs.rewrite.max.chunk.size
    
  21. Remove .ms suffix from names and add support for time suffixes (ms, s,
    m, etc) in values of time-related properties:
    fs.gs.http.connect-timeout
    fs.gs.http.read-timeout
    fs.gs.max.wait.for.empty.object.creation
    fs.gs.outputstream.sync.min.interval
    fs.gs.performance.cache.max.entry.age
    
  22. Change default values of properties:
    fs.gs.http.connect-timeout (default: 20s -> 5s)
    fs.gs.http.read-timeout (default: 20s -> 5s)
    fs.gs.outputstream.upload.chunk.size (default: 64m -> 24m)
    
  23. Upgrade Hadoop to 3.3.5.
  24. Upgrade java-storage to 2.25.0
  25. Add support for the WORKLOAD_IDENTITY_FEDERATION_CREDENTIAL_CONFIG_FILE authentication type, which retrieves a refresh token using the workload identity federation configuration defined in: fs.gs.auth.workload.identity.federation.credential.config.file
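
Items 20 to 22 above change how size and time values are written in configuration. The sketch below illustrates how suffixed values could translate into bytes and milliseconds. It is not the connector's own parser: the helper names `parse_size` and `parse_duration_ms`, the binary (1024-based) multipliers, and the short unit tables are assumptions made for illustration only.

```python
import re

# Assumed multipliers: binary (1024-based) units. Only k, m and g are
# covered here; the release note mentions further suffixes ("etc").
SIZE_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}

# Assumed time units, expressed in milliseconds.
TIME_UNITS_MS = {"ms": 1, "s": 1000, "m": 60 * 1000}


def parse_size(value):
    """Convert a size value such as '24m' into a number of bytes."""
    match = re.fullmatch(r"(\d+)([kmg]?)", value.strip().lower())
    if match is None:
        raise ValueError("unrecognized size value: %r" % value)
    number, unit = match.groups()
    return int(number) * SIZE_UNITS[unit]


def parse_duration_ms(value):
    """Convert a duration such as '5s' into milliseconds.

    This sketch requires an explicit unit suffix.
    """
    match = re.fullmatch(r"(\d+)(ms|s|m)", value.strip().lower())
    if match is None:
        raise ValueError("unrecognized duration value: %r" % value)
    number, unit = match.groups()
    return int(number) * TIME_UNITS_MS[unit]


# New defaults taken from item 22 of the list above.
new_defaults = {
    "fs.gs.http.connect-timeout": parse_duration_ms("5s"),
    "fs.gs.http.read-timeout": parse_duration_ms("5s"),
    "fs.gs.outputstream.upload.chunk.size": parse_size("24m"),
}
```

Under these assumptions, the new 24m upload chunk size corresponds to 25165824 bytes, and the new 5s timeouts correspond to 5000 milliseconds.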

3.0.0-RC01

05 Dec 18:30
Pre-release
  1. Remove Hadoop 2.x support.
  2. Update all dependencies to the latest versions.
  3. Add support for downscoped tokens in AccessTokenProvider.
  4. Implement FileSystem.openFile to take advantage of the FileStatus if
    passed.
  5. Remove an obsolete AuthorizationHandler and related properties:
    fs.gs.authorization.handler.impl
    fs.gs.authorization.handler.properties.<AUTHORIZATION_HANDLER_PROPERTY>
    
  6. Remove support for Apache HTTP transport and related property:
    fs.gs.http.transport.type
    
  7. Support GCS fine-grained action in AuthorizationHandlers.
  8. Decrease log level for hflush rate limit log message.
  9. Remove Cooperative Locking support for directory operations and related
    properties:
    fs.gs.cooperative.locking.enable
    fs.gs.cooperative.locking.expiration.timeout.ms
    fs.gs.cooperative.locking.max.concurrent.operations
    
  10. Migrate authentication to com.google.auth.Credentials and remove obsolete
    properties:
    fs.gs.auth.service.account.email
    fs.gs.auth.service.account.keyfile
    fs.gs.auth.service.account.private.key
    fs.gs.auth.service.account.private.key.id
    
  11. Refactor authentication configuration to use an explicit fs.gs.auth.type
    enum property, instead of relying on inference of the authentication type
    based on the set configuration properties, and remove obsolete properties:
    fs.gs.auth.null.enable
    fs.gs.auth.service.account.enable
    
  12. Add support for a new USER_CREDENTIALS authentication type that retrieves
    a refresh token using the authorisation code grant flow configured via the
    following properties:
    fs.gs.auth.client.id
    fs.gs.auth.client.secret
    fs.gs.auth.refresh.token
    
  13. Merge the functionality of all output stream types into the default
    output stream, which behaves similarly to the FLUSHABLE_COMPOSITE stream,
    and remove the obsolete fs.gs.outputstream.type property.
  14. Set default value for fs.gs.list.max.items.per.call property to 5000.
  15. Set socket read timeout (fs.gs.http.read-timeout) as early as possible on
    new sockets returned from the custom SSLSocketFactory. This guarantees the
    timeout is enforced during TLS handshakes when using Conscrypt as the
    security provider.
  16. The Google Cloud Storage Connector can now be used as a
    Hadoop Credential Provider.
  17. Added dependency on the Cloud Storage Client Library
    (google-cloud-storage).
  18. Rename fs.gs.rewrite.max.bytes.per.call property to
    fs.gs.rewrite.max.chunk.size.
  19. Remove support for the deprecated fs.gs.io.buffersize.write property.
  20. Add support for size suffixes (k, m, g, etc) in values of size-related
    properties:
    fs.gs.inputstream.inplace.seek.limit
    fs.gs.inputstream.min.range.request.size
    fs.gs.outputstream.buffer.size
    fs.gs.outputstream.pipe.buffer.size
    fs.gs.outputstream.upload.cache.size
    fs.gs.outputstream.upload.chunk.size
    fs.gs.rewrite.max.chunk.size
    
  21. Remove .ms suffix from names and add support for time suffixes (ms, s,
    m, etc) in values of time-related properties:
    fs.gs.http.connect-timeout
    fs.gs.http.read-timeout
    fs.gs.max.wait.for.empty.object.creation
    fs.gs.outputstream.sync.min.interval
    fs.gs.performance.cache.max.entry.age
    
  22. Change default values of properties:
    fs.gs.http.connect-timeout (default: 20s -> 5s)
    fs.gs.http.read-timeout (default: 20s -> 5s)
    fs.gs.outputstream.upload.chunk.size (default: 64m -> 24m)
    
  23. Upgrade Hadoop to 3.3.5.
  24. Upgrade java-storage to 2.25.0
  25. Add support for the WORKLOAD_IDENTITY_FEDERATION_CREDENTIAL_CONFIG_FILE authentication type, which retrieves a refresh token using the workload identity federation configuration defined in: fs.gs.auth.workload.identity.federation.credential.config.file
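
The removals and the rename in the list above lend themselves to a quick pre-upgrade check of an existing configuration. The following is a minimal sketch, not a tool shipped with the connector: `check_config` and the plain dictionary input are hypothetical, and the tables contain only property names that appear explicitly in the list above.

```python
# Properties that the list above names as removed.
REMOVED = {
    "fs.gs.authorization.handler.impl",
    "fs.gs.http.transport.type",
    "fs.gs.cooperative.locking.enable",
    "fs.gs.cooperative.locking.expiration.timeout.ms",
    "fs.gs.cooperative.locking.max.concurrent.operations",
    "fs.gs.auth.service.account.email",
    "fs.gs.auth.service.account.keyfile",
    "fs.gs.auth.service.account.private.key",
    "fs.gs.auth.service.account.private.key.id",
    "fs.gs.auth.null.enable",
    "fs.gs.auth.service.account.enable",
    "fs.gs.outputstream.type",
    "fs.gs.io.buffersize.write",
}

# Removed property families, matched by prefix.
REMOVED_PREFIXES = ("fs.gs.authorization.handler.properties.",)

# Old name -> new name, from item 18.
RENAMED = {"fs.gs.rewrite.max.bytes.per.call": "fs.gs.rewrite.max.chunk.size"}


def check_config(config):
    """Return one message per property that needs attention before upgrading."""
    findings = []
    for key in sorted(config):
        if key in RENAMED:
            findings.append("%s: renamed to %s" % (key, RENAMED[key]))
        elif key in REMOVED or key.startswith(REMOVED_PREFIXES):
            findings.append("%s: removed" % key)
    return findings


# Example input: a hypothetical pre-3.0.0 configuration.
old_config = {
    "fs.gs.outputstream.type": "FLUSHABLE_COMPOSITE",
    "fs.gs.rewrite.max.bytes.per.call": "536870912",
    "fs.gs.auth.type": "USER_CREDENTIALS",
}

for message in check_config(old_config):
    print(message)
```

Properties that are not listed in either table, such as fs.gs.auth.type, pass through without a finding. The time-related properties from item 21 are left out on purpose, because the list gives only their new names.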

2.2.18

03 Nov 11:08
  1. Upgrade java-storage to 2.28.0
  2. Integrate the journaling, bufferToDiskThenUpload, and ParallelCompositeUpload APIs.

2.2.17

17 Aug 05:56
2609846
  1. Upgrade java-storage to 2.25.0

2.2.16

30 Jun 15:06
2ffc8e6
  1. Upgrade java-storage to 2.23.0

2.2.15

02 Jun 10:09
8b79f02
  1. Upgrade java-storage to 2.22.3.
  2. Add more instrumentation to GCS connector.