L10N module (for discussion only) #5059

Draft
wants to merge 16 commits into
base: master

hiiamboris (Collaborator) commented Jan 31, 2022

You can build the consoles to check it out (I added the module temporarily into them for testing):

>> ? system/locale
SYSTEM/LOCALE is an object! with the following words and values:
     name               string!       "English (United States)"
     lang-name          string!       "English"
     region-name        string!       "United States"
     locale             word!         en_US
     language           string!       "en"
     region             string!       "US"
     currency           word!         USD
     numbers            map!          [system latn ordinal-suffixes]
     calendar           map!          [standalone format masks day1]
     months             block!        length: 12  ["January" "February" "...
     days               block!        length: 7  ["Monday" "Tuesday" "Wed...
     list               map!          [af af_ZA ar ar_AE ar_EG ar_SA bg b...
     numbering-systems  map!          [adlm ahom arab arabext bali beng b...
     cardinal           map!          [af bg tr ar cs de en es jp ko zh f...
     ordinal            map!          [af ar bg cs de es he jp ko pl pt r...
     tools              object!       [get-user-locale-id* get-user-local...
     currencies         object!       [names list on-change* on-deep-chan...

Biggest questions are (as always): data structure and naming.

See the Files changed tab for more.

The biggest issue is how to load the best locale for the user.

I added loading code into the console init script for now, but for compiled scripts one will have to manually insert do bind [load-locale get-best-locale-id] system/locale/tools before the code, which is something I'd like to avoid.
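For reference, the manual workaround would sit at the very top of a compiled script, roughly as sketched here. This is only an illustration: the `Needs: L10N` header follows the planned usage listed further down, and the snippet assumes a toolchain built from this branch, since `load-locale` and `get-best-locale-id` exist only in `system/locale/tools` of this PR.

```red
Red [Needs: L10N]    ;-- assumed header, per the planned usage in this PR

;-- workaround: pick and load the best locale before any user code runs
do bind [load-locale get-best-locale-id] system/locale/tools

;-- user code starts here and can rely on the loaded locale
print system/locale/name
```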

Problem is, currently compiled code has the following structure: boot.red -> modules -> user's script. There is no step that runs after the modules (including data from chosen locales) have been loaded and before the user script starts, so there is nowhere to put the locale loading code.

Another problem is that modules cannot declare their own needs (they can, but it will just be ignored), so writing a module that depends on the L10N module being loaded before it is impossible until we get needs that can include other needs. Consequently needs is also currently ordered, although I managed to write it so that one can specify modules in any order.

Also, I think the Format module should include Locales/red and its dependencies, because /red contains the data most useful for Format. This is another place where needs requires dependency resolution.

Ultimately it should be possible to write a module that not only includes L10N but also lets it load the best available locale before evaluating itself.

Planned usage

  • Needs: L10N to include every available locale and needed functions
  • Needs: [L10N-Core Locales] same as above
  • Needs: [L10N-Core Locales/red Locales/en Locales/root] minimal set needed to be able to format dates and times for networking (red is based on en which is based on root - this is where I wish I could write Needs: Locales/en in the header of red and Locales/root in the header of en).
  • Needs: [L10N-Core Locales/ru Locales/ru_RU Locales/cs Locales/cs_CZ Locales/root] - example to support only 2 locales Russia and Czech Rep (again, list is long because there is no implicit inheritance, and if I used #include I would end up with multiple inclusions of the same data both between different #includes and between #includes and needs itself)
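The implicit inheritance wished for in the list above (red is based on en, which is based on root) amounts to expanding a needs list with its dependencies, each included once. A minimal sketch of that expansion follows. It is not how the compiler handles needs today: the `based-on` table and the `expand-needs` function are hypothetical names used only for illustration, locale names are shortened to plain words, and the table is assumed to contain no cycles.

```red
Red []

;-- hypothetical dependency table: each locale lists the locales it is based on
based-on: make map! [
    red   [en]
    en    [root]
    ru_RU [ru]
    ru    [root]
]

expand-needs: function [
    "Return NEEDS with every dependency added once, dependencies first"
    needs [block!] "Locale names requested by the script"
    deps  [map!]   "Dependency table (assumed acyclic)"
    /into          "Accumulate into an existing block (used by recursion)"
        result [block!]
][
    result: any [result copy []]
    foreach name needs [
        unless find result name [
            ;-- include what this locale is based on before the locale itself
            if sub: select deps name [expand-needs/into sub deps result]
            append result name
        ]
    ]
    result
]

probe expand-needs [red ru_RU] based-on
;-- prints [root en red ru ru_RU]
```

With this kind of resolution, root is included only once even though both en and ru depend on it, which is the duplication problem described for #include.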

Licensing

If I understood Unicode license at all, derivative data still has to be licensed under their license. This however gets rather blurry to me, for example for plural.red file which contains my code based on their data. Suggestions?

@@ -105,14 +103,61 @@ red: context [
iterators: [loop until while repeat foreach forall forever remove-each]

standard-modules: [
;-- Name ------ Entry file -------------- OS availability -----
View %modules/view/view.red [Windows macOS Linux]
;-- Name ---------- Entry file -------------------- OS availability -----
hiiamboris (Collaborator, Author) commented:

I decided to list all locales instead of loading them by mask, because:

  • security (e.g. someone leaves malicious script in %temp%/modules/l10n/ and it gets included into the runtime)
  • I always have a lot of temporary scripts in the Red tree, don't want them to affect the build

@@ -310,26 +310,33 @@ system: context [
ports: context []

locale: context [
language:
language*: ;-- in locale language
hiiamboris (Collaborator, Author) commented:

I removed the * names because IMO they are very cryptic. I couldn't figure out what they are for until I found comments in the code.

numbers: none ;-- digits, symbols, numeric masks
calendar: none ;-- standalone, format, date masks
months: none ;-- shortcut for standalone month names (R2-compatibility)
days: none ;-- shortcut for standalone day names (R2-compatibility)
hiiamboris (Collaborator, Author) commented:

months and days seem to come from R2, but I question their usefulness.
We have data for both standalone (used in calendar) and formatted (used in dates) month and day names as system/locale/calendar/format/months/full and ../standalone/months/full (and days), but these two are shortcuts for the standalone version only.
Moreover, in our data we have days as a map: #(sun "Sunday" .. mon "Monday") etc, because first day of the week is different in different locales (in fact, about 50/50 split between Monday and Sunday), and days order as a block of 1-7 indexes where 1=Monday (as it came from R2) may be confusing.
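To illustrate why a map of day names is easier to work with than a fixed 1-7 block, here is a small sketch that produces the display order for any first day of the week. The names `day-names`, `week-order` and `week-from` are hypothetical and the data is sample English data, not the module's actual structure.

```red
Red []

;-- hypothetical sample data, shaped like the days map described above
day-names: make map! [
    mon "Monday"   tue "Tuesday"  wed "Wednesday" thu "Thursday"
    fri "Friday"   sat "Saturday" sun "Sunday"
]
week-order: [mon tue wed thu fri sat sun]

week-from: function [
    "Return day names in display order, starting at FIRST-DAY"
    first-day [word!] "One of the keys in week-order"
][
    start: find week-order first-day
    ;-- rotate: days from FIRST-DAY to the end, then the days before it
    keys: append copy start copy/part week-order start
    collect [foreach key keys [keep select day-names key]]
]

probe week-from 'sun
;-- prints ["Sunday" "Monday" "Tuesday" "Wednesday" "Thursday" "Friday" "Saturday"]
```

A locale whose week starts on Monday would call `week-from 'mon` against the same data, so no locale has to reinterpret index 1.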

Reply from the second participant:

I believe that this area is not easy to get right in all its complexity. But having months / days available under calendar/format/months/full makes me ask: what is format here? What is full here? It sounds quite cryptic, but I can understand that the inner structure is most probably not something for an end user to care about, if there are accessor functions to easily get such elements.

hiiamboris (Collaborator, Author) commented:

The full structure is too big to include here. See the cs.red file for an example.

hiiamboris (Collaborator, Author) commented:

The data is meant not only for format functions (in fact, they use only the format branch), but also for cases where you want to localize your UI or something similar. You can also write your own locale or modify an existing one (e.g. by adding a strings: map for the text of all UI elements). So it's for more involved users.

;@@ we should expect rule for n=10 to succeed but it won't
;@@ TODO: reconsider `n` value

system/locale/cardinal: to map! to block! object [
hiiamboris (Collaborator, Author) commented:

The lack of an object-to-map conversion, or of the ability to put functions into maps directly, makes this a little awkward.

2 participants