Ensure igraph can be installed in webR #1284

Open
krlmlr opened this issue Mar 6, 2024 · 6 comments

Comments

@krlmlr
Contributor

krlmlr commented Mar 6, 2024

@georgestagg: Is there anything we can do to help? I'd like to create a webR demo for igraph and dm.

@georgestagg

georgestagg commented Mar 6, 2024

Also reported at r-wasm/webr#341.

I had another look at this today. The main blocker is the Fortran library arpack. The Fortran compiler that we are using, a patched version of LLVM Flang, does not yet support generating WebAssembly output for Fortran COMMON blocks (a form of global variables). Supporting COMMON blocks will require changes to LLVM itself.

The COMMON blocks in question are used in arpack-ng for reporting both debugging information and numerical statistics, defined in debug.h and stat.h.

If you aren't actually using the debugging functionality in arpack-ng, I think you can comment the COMMON blocks out without loss of any other functionality. I have never used {igraph} before, so I don't know how much, if at all, the debug.h and stat.h features are actually used.

I just tried it for myself, commenting out the blocks:

      common /debug/
     &       logfil, ndigit, mgetv0,
     &       msaupd, msaup2, msaitr, mseigt, msapps, msgets, mseupd,
     &       mnaupd, mnaup2, mnaitr, mneigh, mnapps, mngets, mneupd,
     &       mcaupd, mcaup2, mcaitr, mceigh, mcapps, mcgets, mceupd

and

      common /timing/
     &       nopx, nbx, nrorth, nitref, nrstrt,
     &       tsaupd, tsaup2, tsaitr, tseigt, tsgets, tsapps, tsconv,
     &       tnaupd, tnaup2, tnaitr, tneigh, tngets, tnapps, tnconv,
     &       tcaupd, tcaup2, tcaitr, tceigh, tcgets, tcapps, tcconv,
     &       tmvopx, tmvbx, tgetv0, titref, trvec

by prepending a c character in the first column of those lines.
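As a rough illustration of that edit, the sketch below marks a single fixed-form Fortran line as a comment by putting a `c` in column 1. The helper name `demo_comment_out` is hypothetical and is not part of arpack-ng, igraph, or webR; in practice the lines were presumably edited by hand.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `line` into `out` with a 'c' prepended in
 * column 1, which fixed-form Fortran treats as a comment marker.
 * Returns 0 on success, or -1 if `out` is too small to hold
 * the 'c', the original line, and the terminating NUL. */
int demo_comment_out(const char *line, char *out, size_t out_size)
{
    size_t len = strlen(line);
    if (len + 2 > out_size) {
        return -1;
    }
    out[0] = 'c';
    memcpy(out + 1, line, len + 1); /* copies the trailing NUL as well */
    return 0;
}
```

Applied to each line of a COMMON block, including its `&` continuation lines, this turns the whole statement into comments.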

With this change, the R package can be compiled for WebAssembly.


Next, once compiled, the package still does not load, failing with this error:

> library(igraph)
Error: package or namespace load failed for ‘igraph’ in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/usr/lib/R/library/igraph/libs/igraph.so':
  Could not load dynamic lib: /usr/lib/R/library/igraph/libs/igraph.so
LinkError: WebAssembly.Instance(): Import #226 module="env" function="dgesv_": imported function does not match the expected type

This is a WebAssembly issue occurring when loading a Lapack symbol. Unlike on most systems, function signatures must be declared consistently for WebAssembly symbols. R's built-in Lapack implementation declares Fortran subroutines as returning void, but it looks like the source in this package declares Lapack subroutines as returning int:

int igraphdgetrf_(int *m, int *n, double *a, int *lda, int *ipiv,
                  int *info);
int igraphdgetrs_(char *trans, int *n, int *nrhs, double *a,
                  int *lda, int *ipiv, double *b, int *ldb,
                  int *info);
int igraphdgesv_(int *n, int *nrhs, double *a, int *lda,
                 int *ipiv, double *b, int *ldb, int *info);

In addition, extra so-called "hidden" Fortran character length arguments are missing from some of the function signatures in that file. Normally, these differences do not matter much, but under WebAssembly they do.

I continued to experiment, switching int to void and adding the extra length arguments to lapack_internal.h and where those functions are called, and recompiled the igraph package.
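A minimal sketch of what such a prototype change might look like is below. Everything here is illustrative: `demo_dgetrs_` is a hypothetical stand-in written in C with a shortened argument list, not the real LAPACK routine, and the `fc_len_t` typedef is an assumption about the type the Fortran compiler uses for hidden length arguments.

```c
#include <stddef.h>

/* Assumption: each CHARACTER argument's length is passed by value as a
 * trailing argument of this type. The actual type depends on the compiler. */
typedef size_t fc_len_t;

/* Before: declared as returning int, with no hidden length argument:
 *     int demo_dgetrs_(char *trans, int *n, int *info);
 * After: a Fortran SUBROUTINE returns nothing, and the CHARACTER argument
 * `trans` gets its length appended after the visible arguments. */
void demo_dgetrs_(const char *trans, const int *n, int *info,
                  fc_len_t trans_len);

/* Hypothetical stand-in body so the sketch links without LAPACK. It only
 * validates its arguments, following the LAPACK convention that
 * info = -i reports an invalid i-th argument. */
void demo_dgetrs_(const char *trans, const int *n, int *info,
                  fc_len_t trans_len)
{
    if (trans_len < 1 ||
        (trans[0] != 'N' && trans[0] != 'T' && trans[0] != 'C')) {
        *info = -1;
    } else if (*n < 0) {
        *info = -2;
    } else {
        *info = 0;
    }
}

/* Call site: the length of the one-character argument is passed explicitly. */
int demo_call(char trans, int n)
{
    int info = 0;
    demo_dgetrs_(&trans, &n, &info, (fc_len_t)1);
    return info;
}
```

The point is that the declaration, the definition, and every call site agree on both the return type and the full argument list, which is what the WebAssembly loader checks.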

After these further changes, the package loads under webR. I am not very familiar with how to use igraph, but basic functionality seems to work:

[Image: Screenshot 2024-03-06 at 13 42 55]

I have updated the igraph package on the webR binary Wasm package repository, so you can try this version out for yourself at https://webr.r-wasm.org/latest/ now using webr::install("igraph"). You might be able to trigger problems that I have not seen.

And, a summary of the changes I've made is here: main...r-wasm:rigraph:webr.

While these workarounds seem to get things up and running for Wasm, I don't really know how safe they are for the more traditional R systems. Applying them as-is means editing vendored source code, and might even break the package for normal Linux, Windows and macOS users.

Nevertheless, this should give you an idea of what the long-term path to webR compatibility looks like. For formal changes to the C source code, Wasm-specific changes can be gated behind #ifdef __EMSCRIPTEN__ blocks if they are problematic on other systems. I've used this technique before to make packages work on webR without affecting other systems, and it works pretty well.
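As a sketch of that gating technique, the return type of a subroutine-style function could be selected by the preprocessor. The macro names `DEMO_SUBROUTINE_RET` and `DEMO_SUBROUTINE_RETURN` and the function `demo_scale_` are hypothetical and are not taken from the igraph or webR sources; only `__EMSCRIPTEN__` is the macro that Emscripten itself defines.

```c
/* Hypothetical macros: choose the subroutine return type per platform. */
#ifdef __EMSCRIPTEN__
  /* Under Wasm, match R's Lapack declarations: subroutines return nothing. */
  #define DEMO_SUBROUTINE_RET void
  #define DEMO_SUBROUTINE_RETURN return
#else
  /* Elsewhere, keep the existing int return type unchanged. */
  #define DEMO_SUBROUTINE_RET int
  #define DEMO_SUBROUTINE_RETURN return 0
#endif

/* Hypothetical Fortran-style routine: scales x[0..n-1] by alpha in place. */
DEMO_SUBROUTINE_RET demo_scale_(const int *n, double *x, const double *alpha)
{
    for (int i = 0; i < *n; i++) {
        x[i] *= *alpha;
    }
    DEMO_SUBROUTINE_RETURN;
}

/* Callers ignore the return value, so the same call compiles either way. */
double demo_use(void)
{
    double x[3] = {1.0, 2.0, 3.0};
    double alpha = 2.0;
    int n = 3;
    demo_scale_(&n, x, &alpha);
    return x[0] + x[1] + x[2];
}
```

With this arrangement, builds for Linux, Windows and macOS see exactly the declarations they had before, and only the Emscripten build sees the void signatures.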

Another option might be to simply wait until our version of the LLVM Flang compiler better supports COMMON blocks, rather than hacking around the issue by commenting them out. Unfortunately, I don't have an idea of the timescale for that.

In the meantime, I am happy to maintain the fork at https://github.com/r-wasm/rigraph/tree/webr (we already do so for some other R packages that require patches for Wasm). That might be the simplest solution in the short term.

@szhorvat
Member

szhorvat commented Mar 6, 2024

With the C core we bundle an older version of ARPACK that was translated to C with f2c. This would probably trigger compiler warnings on CRAN, so it's not an option there, but could it be used with webR?

@georgestagg

> With the C core we bundle an older version of ARPACK that was translated to C with f2c [...] could it be used with webR?

Perhaps. Can you please send me a link to one of the f2c-converted ARPACK source files? I'll take a look.

However, note that the f2c-converted version of Lapack in this repo also gives subroutines an integer return type, which is inconsistent with R's version of the Lapack symbols:

/* Subroutine */ int igraphdgeev_(char *jobvl, char *jobvr, integer *n, doublereal *
	a, integer *lda, doublereal *wr, doublereal *wi, doublereal *vl,
	integer *ldvl, doublereal *vr, integer *ldvr, doublereal *work,
	integer *lwork, integer *info)

So, we'd still have to deal with the int -> void mapping when compiling under Emscripten.

@szhorvat
Member

szhorvat commented Mar 7, 2024

I'm sorry, you are correct. The int return type is still there. The translated LAPACK and ARPACK source files are all here: https://github.com/igraph/igraph/tree/master/vendor/lapack (an example of an ARPACK one is dgetv0.c).

The lack of a standard interface between C and Fortran (with older Fortran) tends to be an issue ...

@krlmlr
Contributor Author

krlmlr commented Mar 7, 2024

Should we use the f2c translation also for the R package? Any downsides?

@szhorvat
Member

szhorvat commented Mar 7, 2024

Many downsides: worse performance, likely compiler warnings that are not realistic to fix manually, and an outdated ARPACK version. Upsides: no need for a Fortran compiler, no issues with calling conventions (as discussed here), and anyone can compile the igraph C core with minimal technical experience (which is important given our userbase).

The upsides don't all apply for the R interface ...
