
💡 [Feature] - Ability to set CCSID for MQ connector to non-default value #124

Open
theodore-evans opened this issue Dec 15, 2023 · 6 comments · May be fixed by #125

Comments

@theodore-evans

Description

The current implementation does not allow for a configuration value to be passed in to com.ibm.mq.jms.MQConnectionFactory.setCCSID, resulting in only the default Latin-1 string encoding (CCSID=819) being supported.

Some IBM MQ installations use an encoding other than 819, which limits the utility of this Kafka connector.

Suggested Solution

Add a ccsid configuration parameter to the mq configuration group for MQSourceConnector, and pass this to the MQConnectionFactory.setCCSID method. This parameter should have a default of 819, making the default behavior explicit.

Alternatives

No response

Additional Context

No response

@theodore-evans theodore-evans linked a pull request Dec 15, 2023 that will close this issue
9 tasks
@dalelane
Contributor

dalelane commented Jan 11, 2024

Hi @theodore-evans - thanks very much for the issue, sorry it's taken a while to get to this.

Can you describe a little more about why you feel this is needed, please? MQConnectionFactory.setCCSID is not a config option I'm very familiar with, so I've had to ask around about it - and from what I've learned, it is apparently used to define character encoding for the initial connection to the queue manager.

If you're looking to control the encoding for the content of text messages, this is instead something that we would need to set on the MQQueue (using setCCSID and setReceiveCCSID).

Or have you found that it's important to have this specifically for the initial connection?

@theodore-evans
Author

theodore-evans commented Feb 6, 2024

Hi @dalelane, no problem, I am also only just coming back to this topic

This is required for consuming messages from an RMQ MQ where the text encoding is non-ASCII (i.e. not ccsid=819). In cases where the MQ is not under the control of the maintainers of the connector, this needs to be configurable on the connector side, i.e. it can be passed through the connector config and on to MQConnectionFactory:

mqConnFactory.setCCSID(getCCSIDOrDefault(props));

I created a draft PR to resolve this issue a while back (see above) but ran into some issues, because it seemed that the default value for the ccsid configuration parameter was not yet available at the point where this would be passed to setCCSID (the validate method of ConfigDef having not yet been called).

This raised the question of whether to set the default in the connector config to 819 (i.e. Latin-1 encoding) and include some custom logic to handle the default, or to leave the configuration default as null and rely on the MQConnectionFactory's own defaults.

Do you have a suggestion about how to resolve this?
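One way to sidestep the ordering problem described above is to resolve the default by hand from the raw properties map, before any ConfigDef processing has happened. The following is only a minimal sketch under stated assumptions: the property name `mq.ccsid`, the class name `CcsidConfig`, and the body of `getCcsidOrDefault` are hypothetical and are not taken from the connector or from the linked pull request. It uses only the JDK, so the call into the MQ classes is left out.

```java
import java.util.HashMap;
import java.util.Map;

public class CcsidConfig {
    // Hypothetical property name, chosen for illustration only.
    public static final String CONFIG_NAME_MQ_CCSID = "mq.ccsid";
    // 819 is the Latin-1 CCSID that the issue describes as the current default.
    public static final int DEFAULT_CCSID = 819;

    // Reads the CCSID from the raw connector properties, falling back to the
    // default when the property is absent or blank.
    public static int getCcsidOrDefault(Map<String, String> props) {
        String raw = props.get(CONFIG_NAME_MQ_CCSID);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_CCSID;
        }
        int ccsid;
        try {
            ccsid = Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(
                    CONFIG_NAME_MQ_CCSID + " must be an integer, got: " + raw, e);
        }
        if (ccsid <= 0) {
            throw new IllegalArgumentException(
                    CONFIG_NAME_MQ_CCSID + " must be positive, got: " + ccsid);
        }
        return ccsid;
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        // No property set: the explicit default is used.
        System.out.println(getCcsidOrDefault(props));

        // Property set by the user: the parsed value is used.
        props.put(CONFIG_NAME_MQ_CCSID, "500");
        System.out.println(getCcsidOrDefault(props));
    }
}
```

The trade-off is that the default then lives in two places (the helper and the config definition), so the two would need to be kept in sync. Leaving the property unset and skipping the setter call entirely would avoid that, at the cost of the default no longer being explicit.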

@dalelane
Contributor

dalelane commented Feb 6, 2024

> This is required for consuming messages from an RMQ MQ where the text encoding is non-ASCII (i.e. not ccsid=819). In cases where the MQ is not under the control of the maintainers of the connector...

I think perhaps my message wasn't clear.

I was saying that my understanding is that the way to achieve what you describe is to add a setReceiveCCSID() call to the MQQueue created here

@theodore-evans
Author

Ah, I understand now. Thank you for the clarification.

Yes, I believe the implementation you are suggesting would resolve this issue in a canonical way, i.e. that we do not need to set the ccsid on the connection factory for the initial connection.

The other points regarding configuration parsing remain open though, I think?

@theodore-evans
Author

@dalelane Is there any update on this issue?

@dalelane
Contributor

dalelane commented Feb 26, 2024

Sorry about the delay - I was on vacation.

This is an area of MQ behaviour that I'm not familiar with (I'm more of a Kafka person than an MQ person), so I've been getting advice from the MQ developers on what they feel would be helpful here.

What they pointed out to me was that if you're sending text-based messages, these would map to a JMSTextMessage. Apparently the payload of a JMSTextMessage is always a Java UTF-16 string.

As a result, the MQ classes for JMS that the Kafka Connector uses are always converting the message data to UTF-16. This is done automatically, based on the CCSID contained in the MQMD portion of the message.

The EBCDIC/CCSID 500 message payload you describe will always be converted to UTF-16. The setting we've been discussing above doesn't alter that - it simply means that the conversion would be performed by the queue manager rather than the JVM.

They pointed me at some documentation for this at https://www.ibm.com/docs/en/ibm-mq/9.3?topic=conversion-queue-manager-data which has useful context. In particular:

> The reason for changing WMQ_RECEIVE_CCSID is specialized; the chosen CCSID makes no difference to the text objects created in the JVM. However, some JVMs, on some platforms, might not be able to handle conversion from the CCSID of text in the message into Unicode. The option gives you a choice of CCSID for any text delivered to the client in the message. Some JMS client platforms have had problems with message text being delivered in UTF-8.
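To illustrate the point that the wire encoding does not change the resulting text object, here is a small JDK-only sketch. It does not use the MQ classes for JMS at all. It assumes the usual correspondence of CCSID 819 to ISO-8859-1 and CCSID 1208 to UTF-8, and the class and method names (`CcsidDecodeDemo`, `decode`) are made up for the example. The same text is encoded two ways, the byte arrays differ, and both decode to the same Java String.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CcsidDecodeDemo {
    // Decodes a byte payload into a Java String using the given charset.
    // A Java String is UTF-16 internally whatever the source encoding was.
    public static String decode(byte[] payload, Charset charset) {
        return new String(payload, charset);
    }

    public static void main(String[] args) {
        // "caf\u00e9" contains one non-ASCII character (e with acute accent).
        String text = "caf\u00e9";

        // The same text as it would appear on the wire in two encodings.
        byte[] latin1Bytes = text.getBytes(StandardCharsets.ISO_8859_1); // CCSID 819
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);        // CCSID 1208

        // The byte representations differ (4 bytes versus 5 bytes).
        System.out.println("same bytes: " + Arrays.equals(latin1Bytes, utf8Bytes));

        // Decoding each payload with its own charset yields equal Strings.
        String fromLatin1 = decode(latin1Bytes, StandardCharsets.ISO_8859_1);
        String fromUtf8 = decode(utf8Bytes, StandardCharsets.UTF_8);
        System.out.println("same text: " + fromLatin1.equals(fromUtf8));
    }
}
```

If this reading is right, a receive-CCSID option would only matter where the JVM cannot decode the message's own CCSID, which is the specialized case the quoted documentation describes.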

Before we add the option I think it would be helpful to understand if it will solve your problem. I fear that adding this option might not change what you're currently seeing.

Please can you describe the problem you're trying to solve, in terms of the current configuration you are using and the current behaviour it results in. We can hopefully use this to better understand the right fix to make.
