SOLR-15111: Use JDK8 Base64 instead of own implementation #2252

asalamon74 · 2021-01-27T13:41:07Z

Description

JDK8 has a built-in Base64 encoder and decoder, so there is no need to maintain our own implementation.

Solution

Remove our own implementation and use java.util.Base64 instead.

Tests

Unit tests.

Checklist

Please review the following and check all that apply:

I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
I have created a Jira issue and added the issue ID to my pull request title.
I have given Solr maintainers access to contribute to my PR branch. (optional but recommended)
I have developed this patch against the master branch.
I have run ./gradlew check.
I have added tests for my changes.
I have added documentation for the Ref Guide (for Solr changes only).

madrob

I'm a little surprised to see StandardCharsets.ISO_8859_1 in use; most of the bytes we work with are UTF-8. Can you clarify?

madrob · 2021-01-28T21:12:17Z

solr/core/src/java/org/apache/solr/handler/admin/LukeRequestHandler.java

@@ -298,7 +299,7 @@ private static String getFieldFlags( SchemaField f )

      BytesRef bytes = field.binaryValue();
      if (bytes != null) {
-        f.add( "binary", Base64.byteArrayToBase64(bytes.bytes, bytes.offset, bytes.length));
+        f.add( "binary", StandardCharsets.ISO_8859_1.decode(Base64.getEncoder().encode(bytes.wrapToByteBuffer())).toString());


I don't think this is correct. Why are we decoding something that we just encoded?

asalamon74 · 2021-01-29T07:40:38Z

@madrob

ISO_8859_1

In the first version I used UTF_8, but later I checked the source code of java.util.Base64 (http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/util/Base64.java) and saw that it uses ISO-8859-1 internally, so I decided to use the same.

public byte[] decode(String src) {
    return decode(src.getBytes(StandardCharsets.ISO_8859_1));
}

Either character encoding would probably work, since Base64-encoded strings use only a small subset of characters, all of them ASCII.
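To illustrate the point about the two charsets, here is a minimal standalone sketch (not code from this PR; the class and method names are made up for the example). It Base64-encodes some arbitrary bytes and converts the encoded bytes to a String once with ISO_8859_1 and once with UTF_8. Because the standard Base64 alphabet is pure ASCII, both conversions should give the same result.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demo class, not part of Solr.
class Base64CharsetDemo {

    // Base64-encode the raw bytes, then interpret the encoded bytes
    // as text using the given charset.
    static String encodeToString(byte[] raw, Charset charset) {
        byte[] encoded = Base64.getEncoder().encode(raw);
        return new String(encoded, charset);
    }

    public static void main(String[] args) {
        // Arbitrary binary input, including bytes outside the ASCII range.
        byte[] raw = {(byte) 0x00, (byte) 0xFF, (byte) 0x80, (byte) 0x7F, (byte) 0xC3, (byte) 0xA9};

        String latin1 = encodeToString(raw, StandardCharsets.ISO_8859_1);
        String utf8 = encodeToString(raw, StandardCharsets.UTF_8);

        // The encoded bytes are all ASCII, so both charsets map them
        // to the same characters.
        System.out.println(latin1.equals(utf8));
    }
}
```

The charset only matters for the encoded output, never for the raw input bytes, which is why the choice between ISO_8859_1 and UTF_8 makes no observable difference here.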

encode + decode

StandardCharsets.ISO_8859_1.decode(Base64.getEncoder().encode(bytes.wrapToByteBuffer())).toString());

First I Base64-encode the ByteBuffer, which gives me another ByteBuffer. Then I convert that ByteBuffer to a String using the Charset's decode method (https://stackoverflow.com/a/39845152). So it is an encode followed by a decode, but they are two different kinds of encoding/decoding: one is Base64, the other is bytes-to-characters.
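For comparison, a small standalone sketch of the two steps described above next to the JDK's one-step Encoder.encodeToString. This is only an illustration under assumptions: the class and method names are hypothetical, a plain (bytes, offset, length) triple stands in for Lucene's BytesRef, and it is not necessarily what the final patch uses.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical demo class, not part of Solr.
class Base64SliceDemo {

    // Two steps: Base64-encode the slice into a ByteBuffer of ASCII bytes,
    // then turn those bytes into characters with a charset decode.
    static String viaByteBuffer(byte[] bytes, int offset, int length) {
        ByteBuffer raw = ByteBuffer.wrap(bytes, offset, length);
        ByteBuffer encoded = Base64.getEncoder().encode(raw);
        return StandardCharsets.ISO_8859_1.decode(encoded).toString();
    }

    // One step: copy the slice and let the encoder build the String itself.
    static String viaEncodeToString(byte[] bytes, int offset, int length) {
        byte[] slice = Arrays.copyOfRange(bytes, offset, offset + length);
        return Base64.getEncoder().encodeToString(slice);
    }

    public static void main(String[] args) {
        // Only the middle six bytes ("foobar") are part of the value.
        byte[] backing = "xxfoobarxx".getBytes(StandardCharsets.US_ASCII);

        System.out.println(viaByteBuffer(backing, 2, 6));
        System.out.println(viaEncodeToString(backing, 2, 6));
    }
}
```

Both methods should print the same Base64 text for the same slice. The one-step variant costs an extra array copy when the value is a slice of a larger array, but it avoids the explicit charset decode that prompted the question.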

madrob · 2021-01-29T13:32:29Z

Unfortunately, we can’t use the solution provided from Stack Overflow: http://apache.org/legal/resolved.html#stackoverflow

Can you contact the original author and ask for permission to use this? Otherwise we will need somebody who hasn’t looked at this code to create a clean-room implementation.

asalamon74 · 2021-02-04T11:49:43Z

We also used a different approach for String conversion elsewhere, so I modified these lines accordingly.

madrob reviewed Jan 28, 2021

Use JDK8 Base64 instead of own implementation

856d60d

asalamon74 force-pushed the SOLR-15111 branch from 0d759fc to 856d60d Compare February 4, 2021 11:49

asalamon74 mentioned this pull request Mar 17, 2021

SOLR-15111 Use JDK8 Base64 instead of own implementation apache/solr#24

Merged
