Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WASM demo takes 40 seconds to load and is not interactive #2769

Open
sffc opened this issue Oct 19, 2022 · 20 comments · May be fixed by #4684
Open

WASM demo takes 40 seconds to load and is not interactive #2769

sffc opened this issue Oct 19, 2022 · 20 comments · May be fixed by #4684
Assignees
Labels
C-process Component: Team processes good first issue Good for newcomers help wanted Issue needs an assignee S-medium Size: Less than a week (larger bug fix or enhancement) T-enhancement Type: Nice-to-have but not required

Comments

@sffc
Copy link
Member

sffc commented Oct 19, 2022

During the 40 seconds it takes to download the latest full-data WASM demo, the page is not interactive.

Short-term solutions:

  1. Go back to using testdata
  2. Add a loading spiral that goes away when WASM is finished loading (this would take about 10 lines of code)
  3. Print an error upon input if the WASM is not loaded

Medium-term solution: download the locale data on demand. Steps to make that work:

  1. The wasm-demo datagen should generate a directory of language-specific postcard files
  2. A data provider should be hooked up that informs JS when root data is being used
  3. JS code that fetches the requested language pack when that callback occurs, and then re-tries the rendering

Steps 1 and 3 should already be achievable via public APIs. Step 2 is doable in Rust but may need an FFI hook. It will be easier to write once callback support is added.

@sffc sffc added T-enhancement Type: Nice-to-have but not required good first issue Good for newcomers help wanted Issue needs an assignee C-process Component: Team processes S-medium Size: Less than a week (larger bug fix or enhancement) labels Oct 19, 2022
@Manishearth
Copy link
Member

Without callbacks we can possibly make it work by having an introspecting adapter like you added in #2766

@sffc sffc added the backlog label Dec 1, 2022
@sffc sffc added this to the Backlog milestone Dec 22, 2022
@sffc sffc removed the backlog label Dec 22, 2022
@ashu26jha
Copy link
Contributor

ashu26jha commented Feb 27, 2024

Hello, can I take this one? My WASM demo is now running

I think short-term solutions has already been implemented by @sffc and implementing the medium term solution remains

@sffc
Copy link
Member Author

sffc commented Feb 27, 2024

Yes! This should be a good starter project now that you have the wasm running in your development environment. Happy to review an implementation plan or answer any questions.

@ashu26jha
Copy link
Contributor

Here is my findings so far, feel free to point anything out:

  1. As we are currently using ICU4XDataProvider.create_compiled() which is not an efficient approach. We could have the following directory specification, if we want to use fixed_decimal with bn locale:

    |-generated
      |-fixed-decimal
        |-bn.json 
      |-segmenter
      |-date-time
    

    Now to download the locale, we could fetch to get locale data from something like:

    https://raw.githubusercontent.com/unicode-org/icu4x/main/provider/datagen/tests/data/json/decimal/symbols%401/bn.json
    
  2. We need to keep track whether root data is being used or not. It should use an FFI hook to call a JavaScript function if root data is being used and display the message in our frontend. (I am slightly unsure about this one, apologies)

  3. This one should be comparitively easy as we need to re-render

Thoughts and question about second point:

  1. I am unsure about how js be informed, as diplomat is used to call into Rust libraries that too in a unidirectional fashion.
  2. I am still learning about diplomat so will update this comment if I could find something.

@Manishearth
Copy link
Member

It should use an FFI hook to call a JavaScript function if root data is being used and display the message in our frontend

We don't have callback support for Diplomat and are unlikely to have it soon. The introspection should be done on the JS side using the APIs from icu_capi, we can add more if needed.

@ashu26jha
Copy link
Contributor

We could create an adapter which wraps our DataProvider and expose an API which returns a boolean. The design I have in my mind is something like:

pub struct DynamicLocaleAdapter<P> {
    inner: P,
}

impl<P> DynamicLocaleLoaderAdapter<P> {
    pub fn new(provider: P) -> Self {
        Self { inner: provider }
    }
}

impl<P, M> DataProvider<M> for DynamicLocaleLoaderAdapter<P>
where
    P: DataProvider<M>,
    M: KeyedDataMarker,
{
    fn load(&self, req: DataRequest) -> Result<DataResponse<M>, DataError> {
        let response = self.inner.load(req);

        if let Ok(ref data_response) = response {
            if data_response.metadata.locale.is_langid_und() {
               // Do something here
            }
        }
        response
    }
}

This needs to be written in our ffi/capi/src/provider.rs. As our ultimate goal is to use in our frontend.

Thoughts regarding this?

@sffc
Copy link
Member Author

sffc commented Mar 6, 2024

This could work, but another approach that might be simpler would be

  1. In JS, make an array or Set of the locales that have been loaded; the set starts empty
  2. When a new locale is requested in the GUI, we check if data is loaded for it. If yes, we are done. If not, continue.
  3. Try to load the locale bundle file from the server. If you get a 404, use the LocaleFallbacker to request the parent locale. Continue doing this until you either find the locale bundle or reach root.
  4. When you finally get the bundle, add it to the data provider as a fork-by-locale provider. I think there is an FFI for this but if not we can add one.

Also, the bundle should be in postcard format, not JSON, and I think for the purposes of this demo it can contain all keys so we need to load it only one for all components.

@ashu26jha
Copy link
Contributor

Thanks @sffc for suggesting a much easier approach. Though I have couple of points:

  1. load the locale bundle file from the server

    I am unsure about the URL to load the files from. As for the URL I could only find URL for chakma locale:

     https://storage.googleapis.com/static-493776/icu4x_2023-11-03/ccp.blob
    

    Or do I have to use icu4x-datagen tools to generate it?

  2. I think there is an FFI

    Yes! We do have fork_by_locale in our FFI declarations

@sffc
Copy link
Member Author

sffc commented Mar 6, 2024

Yes, we have the ccp.postcard sample file. Part of this project is generating similar files for all other locales. They can likely be hosted in the same place as the WASM code.

@ashu26jha
Copy link
Contributor

So basically I have to expose the DatagenDriver over the FFI declaration to be able to use in JavaScript (generating them on demand). I guess currently it is not available.

@sffc
Copy link
Member Author

sffc commented Mar 6, 2024

The data files should be generated during the build. You shouldn't need to run Datagen in WASM.

@ashu26jha
Copy link
Contributor

I could do the following in package.json

  "scripts": {
    "clean": "rm -rf dist/*",
    "prebuild": "npm run generate-locale-data",
    "build": "webpack",
    "start": "webpack serve --mode development --port 12349",
    "tsc": "tsc -p",
    "generate-locale-data": "icu4x-datagen --keys all --locales en --format blob2 --out dist/my_data_blob.postcard"
  }

Now running npm run build would generate the files.

PS: Thank you for being so patient and thoughtful 🚀

@ashu26jha
Copy link
Contributor

ashu26jha commented Mar 9, 2024

@sffc Performance issue may occur just in one case:

If the user doesn't have icu_datagen installing it would take quite sometime.

@sffc
Copy link
Member Author

sffc commented Mar 9, 2024

Generating the files via an npm build seems like the right approach. There should not be performance issues for end users because datagen is run only when generating the WASM demo. Only ICU4X developers (and CI) should need to install icu4x-datagen.

@ashu26jha
Copy link
Contributor

ashu26jha commented Mar 13, 2024

Almost done, just a slight question.

It's being correctly formatted, but any reason why am I getting this in the console?

Screenshot 2024-03-13 at 9 12 08 PM

Also I have implemented a fallback mechanism which logs to where did it fallback. Should I submit a PR if its difficult to point to out?

This is how fallback is being printed
Screenshot 2024-03-13 at 9 25 37 PM

@sffc
Copy link
Member Author

sffc commented Mar 13, 2024

The LocaleFallbacker API should silence the errors and run the fallback algorithm; are you using it? I think in FFI there is also enable_locale_fallback() which you should use.

@ashu26jha
Copy link
Contributor

ashu26jha commented Mar 13, 2024

Yes, I am already using enProvider.enable_locale_fallback_with(this.fallbacker);. But I wanted to add where it fallback. Like using de-AT-u-co-phonebk should fallback to de which is working fine.
Screenshot 2024-03-13 at 11 36 49 PM

@sffc
Copy link
Member Author

sffc commented Mar 13, 2024

de-AT-u-co-phonebk should reach de-AT and then de with fallback.

@ashu26jha
Copy link
Contributor

ashu26jha commented Mar 13, 2024

I think the issue is with async await of javascript, here is the outline of my code:

private async init() {

        let enProvider = await this.loadLocale('und');
        let unProvider = await this.loadLocale('und');

        let fallbacker = ICU4XLocaleFallbacker.create(unProvider);
        let fallbackConfig = new ICU4XLocaleFallbackConfig();
        fallbackConfig.priority = ICU4XLocaleFallbackPriority.Language;
        let fallbackerWithConfig = fallbacker.for_config(fallbackConfig);

        this.fallbacker = fallbackerWithConfig;
        enProvider.enable_locale_fallback_with(this.fallbacker);
        this.dataProvider = enProvider;
}

public async trackLocaleFallback(fallbacker: ICU4XLocaleFallbacker, locale: ICU4XLocale): Promise < string > {
        let localeSet = new Set(localeDefault.default);
        let fallbackIterator = fallbacker.fallback_for_locale(locale);
        while(fallbackIterator.get().to_string() != 'und') {
            const fallbackLocale = fallbackIterator.get().to_string();
            if(localeSet.has(fallbackLocale)) {
                return fallbackLocale;
            }
            fallbackIterator.step();
        }
        return 'un';
}
    
    
public async loadLocale(newLocale: string): Promise < ICU4XDataProvider > {
        const icu4xLocale = ICU4XLocale.create_from_string(newLocale);
        // Gets the fallback locale from the iterator
        const fallbackLocale = await this.trackLocaleFallback(this.fallbacker, icu4xLocale);
        const newFilePath = `dist/${fallbackLocale}.postcard`;
        let newProvider = await this.createDataProviderFromBlob(newFilePath);
        await newProvider.fork_by_locale(this.dataProvider);
        this.dataProvider = newProvider;
        this.loadedLocales.add(newLocale);
        this.fallbackLocale = ICU4XLocale.create_from_string(fallbackLocale);
        return newProvider;
 }
Screenshot 2024-03-14 at 12 18 06 AM

Is it the right direction?

Also in the above image it fallbacks from de-AT-u-co-phonebk to de-AT to de.

Personally I feel the error is because it takes a while to load the file, when we get the file we update the provider and eventually update the formatter to use the new formatter.

@ashu26jha
Copy link
Contributor

I think the issue is with async await of javascript

@sffc Fixed, finally!!!!

@ashu26jha ashu26jha linked a pull request Mar 14, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-process Component: Team processes good first issue Good for newcomers help wanted Issue needs an assignee S-medium Size: Less than a week (larger bug fix or enhancement) T-enhancement Type: Nice-to-have but not required
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants