Create BreakIterator with report locale #280

Open
digulla opened this issue Jun 9, 2022 · 3 comments

digulla commented Jun 9, 2022

Currently, BreakIterator in SimpleTextLineWrapper and ComplexTextLineWrapper is created using

BreakIterator.getCharacterInstance()

instead of

BreakIterator.getCharacterInstance(Locale)

The former uses the VM's default locale, while the latter uses the supplied one, which should be filled with the value from JRParameter.REPORT_LOCALE. This can cause problems with languages such as Italian, which uses U+2019 as the apostrophe, whereas English uses U+0027. So even though the report locale is Locale.ITALIAN, the text d’investimento is split into two words.

Test case: After creating a BreakIterator with ITALIAN, the text una strategia d’investimento should yield three words and two split positions (after una and after strategia).

https://github.com/TIBCOSoftware/jasperreports/blob/master/jasperreports/src/net/sf/jasperreports/engine/fill/ComplexTextLineWrapper.java#L106

https://github.com/TIBCOSoftware/jasperreports/blob/master/jasperreports/src/net/sf/jasperreports/engine/fill/SimpleTextLineWrapper.java
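
The test case above could be checked with a small standalone program along these lines. This is only a sketch: the class name `BreakPositions` and the helper `boundaries` are made up for illustration, and it assumes a line instance is the relevant iterator because the test case is about split positions between words. It prints the boundaries rather than asserting them, since the result depends on the break iterator implementation in use.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class, not part of JasperReports.
class BreakPositions {

    // Collects every boundary the iterator reports, including 0 and text.length().
    static List<Integer> boundaries(BreakIterator iterator, String text) {
        iterator.setText(text);
        List<Integer> result = new ArrayList<>();
        for (int pos = iterator.first(); pos != BreakIterator.DONE; pos = iterator.next()) {
            result.add(pos);
        }
        return result;
    }

    public static void main(String[] args) {
        // U+2019 is the apostrophe used in the Italian example.
        String text = "una strategia d\u2019investimento";
        // Stand-in for the value of JRParameter.REPORT_LOCALE.
        Locale reportLocale = Locale.ITALIAN;

        // Iterator bound to the VM's default locale.
        System.out.println("default locale: "
                + boundaries(BreakIterator.getLineInstance(), text));
        // Iterator bound to the explicitly supplied locale.
        System.out.println("report locale:  "
                + boundaries(BreakIterator.getLineInstance(reportLocale), text));
    }
}
```

A boundary at index 16 (directly after the apostrophe) in the second line of output would mean the iterator still splits d’investimento despite the Italian locale.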

digulla (Author) commented Jun 9, 2022

Alternatively, allow supplying a factory for BreakIterator via a report parameter. That would make it possible to inject custom solutions or the BreakIterator from ICU4J.
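
To make the suggestion concrete, such a factory might look roughly like the sketch below. Both `BreakIteratorFactory` and `JdkBreakIteratorFactory` are hypothetical names invented for this example; nothing like them is known to exist in JasperReports, and how the factory would be passed as a report parameter is left open.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Hypothetical extension point; not an existing JasperReports interface.
interface BreakIteratorFactory {
    // Iterator used to find line break opportunities.
    BreakIterator createLineInstance(Locale locale);

    // Iterator used when a line has to be cut inside a word.
    BreakIterator createCharacterInstance(Locale locale);
}

// Possible default: delegate to the JDK, but always with an explicit locale.
class JdkBreakIteratorFactory implements BreakIteratorFactory {
    @Override
    public BreakIterator createLineInstance(Locale locale) {
        return BreakIterator.getLineInstance(locale);
    }

    @Override
    public BreakIterator createCharacterInstance(Locale locale) {
        return BreakIterator.getCharacterInstance(locale);
    }
}
```

A custom implementation could return any java.text.BreakIterator subclass from these methods, so the line wrappers would only need to call the factory with the report locale.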

dadza (Collaborator) commented Jun 17, 2022

Using java.text.BreakIterator.getLineInstance (getCharacterInstance is only called when truncating the last line inside a word) with Locale.ITALIAN doesn't seem to work in this case; it still breaks after U+2019. Still, I assume there are cases where using the report locale for java.text.BreakIterator.getLineInstance would make a difference.

ICU4J's BreakIterator.getLineInstance (with either Locale.ITALIAN or Locale.ENGLISH) does work. As a note, using ICU4J's BreakIterator would involve creating an adapter to java.text.BreakIterator as we use the break iterator with java.awt.font.LineBreakMeasurer.

Allowing custom implementations does of course make sense.

dadza (Collaborator) commented Dec 7, 2022

FWIW, you can use ICU4J as the Java locale provider by including the icu4j and icu4j-localespi jars in your classpath and adding -Djava.locale.providers=SPI,CLDR,COMPAT when launching the Java process.
