Suffixed 64-bit integer symbols names #12

Open
grisuthedragon opened this issue Dec 15, 2020 · 17 comments

Comments

@grisuthedragon
Member

I did some research on the use of suffixed function names for the 64-bit integer build, and I found an inconsistency in the way Julia introduced this suffix idea. There is no question that we need suffixed symbol names to be able to mix 32-bit and 64-bit integer code without running into trouble. @ViralBShah said that they added 64_ to all symbol names. That is true for Julia's OpenBLAS version and the ones compiled by the Fedora team, which have the *64_ packages (@Enchufa2). But taking a closer look at the symbols, we get the following cases:

  • Helper functions
    openblas_set_num_threads -> openblas_set_num_threads64_

  • CBLAS functions
    cblas_dgemm -> cblas_dgemm64_

  • LAPACKE functions
    LAPACKE_dgetrf -> LAPACKE_dgetrf64_

These three types are fine, and there is nothing to complain about. But if somebody tries to use this inside their Fortran code (calling from Fortran):

  DGEMM -> DGEMM_64 

or using the Fortran interface from C (or from any other language via a C-like direct call):

 dgemm_  -> dgemm_64_ 

Both cases are valid if one looks at the generated symbol name, but they look strange and, from my point of view, conflict with the way the direct C symbols are transformed. I would suggest doing the following: first add the suffix to the function names and then apply the Fortran-to-symbol name mangling scheme, because there are still compilers around that do not do the underscore thing the way the GNU compiler does. Even adding 64_ instead of 64 can cause trouble, since some Fortran compilers add a second _ to the end of the symbol name whenever a function name contains a _.
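
The ordering proposed here can be sketched with a small Python model. This is not FlexiBLAS or OpenBLAS code: the function names and the three convention labels ("gnu", "f2c", "upper") are made up for illustration, and the conventions are simplified to the cases named in this comment (lowercase plus one underscore, a second underscore when the name already contains one, and no underscore at all).

```python
def mangle(fortran_name: str, convention: str) -> str:
    """Map a Fortran routine name to a linker symbol (simplified model)."""
    name = fortran_name.lower()
    if convention == "gnu":      # lowercase + one trailing underscore
        return name + "_"
    if convention == "f2c":      # extra underscore if the name contains one
        return name + ("__" if "_" in name else "_")
    if convention == "upper":    # uppercase, no trailing underscore
        return fortran_name.upper()
    raise ValueError(f"unknown convention: {convention}")


def suffix_then_mangle(fortran_name: str, suffix: str, convention: str) -> str:
    """Proposed order: suffix the Fortran name, then let the compiler mangle."""
    return mangle(fortran_name + suffix, convention)


def mangle_then_suffix(fortran_name: str, suffix: str, convention: str) -> str:
    """Symbol-level order: mangle first, then append the suffix to the symbol."""
    return mangle(fortran_name, convention) + suffix


if __name__ == "__main__":
    for conv in ("gnu", "f2c", "upper"):
        print(conv,
              suffix_then_mangle("DGEMM", "_64", conv),
              suffix_then_mangle("DGEMM", "64", conv),
              mangle_then_suffix("DGEMM", "64_", conv))
```

In this model the two orders give the same symbol only under the "gnu" convention. Appending a plain 64 to the Fortran name keeps the name free of underscores, so the double-underscore rule never triggers.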

So guys, @ViralBShah and @Enchufa2, I am open to a discussion about this.

@ViralBShah

I might have not got the placement of underscores right. There's so many combinations. Here's the output of a snippet from nm libopenblas.dylib on my mac:

000000000360abb0 T _zunml2_64_
000000000360c3f0 T _zunmlq_64_
000000000360cd00 T _zunmql_64_

I think the right thing for flexiblas is to allow user provided suffixes during build time for ILP64 versions. Everyone can then pick what they want for their system.

@grisuthedragon
Member Author

> I might have not got the placement of underscores right. There's so many combinations. Here's the output of a snippet from nm libopenblas.dylib on my mac:
>
> 000000000360abb0 T _zunml2_64_
> 000000000360c3f0 T _zunmlq_64_
> 000000000360cd00 T _zunmql_64_
>
> I think the right thing for flexiblas is to allow user provided suffixes during build time for ILP64 versions. Everyone can then pick what they want for their system.

As I said, there are strange compilers around. ;-)

Making them completely user-defined would be one way, but a very hard one. Personally, I prefer to set standards, since that makes it easy to work together. At the moment, as far as I can tell, there are only a few cases where this is used, so we still have some time to define a good standard. What compiler did you use on the Mac? I have rarely seen prefixed symbols.

@ViralBShah

We pretty much use gfortran for everything.

@grisuthedragon
Member Author

> We pretty much use gfortran for everything.

I need to get myself a Mac system at work next year to analyze the symbol generation there.

But back to the problem: we should think about the apparent difference between the transformed C and Fortran function names.

@ViralBShah

And here's linux:

000000000174c200 T zunmtr_64_
000000000174c8d0 T zupgtr_64_
000000000174cce0 T zupmtr_64_

@grisuthedragon
Member Author

Then the _ prefix seems to be defined by the MacOS ABI. Other symbols should be affected by this as well.
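
To compare the two nm listings in this thread, the platform prefix can be stripped before looking at the suffix. A minimal sketch in Python follows. The helper names are hypothetical, and it assumes the leading underscore is the one that Mach-O object files put in front of every C-level symbol.

```python
def parse_nm_line(line: str):
    """Return (type, symbol) from one line of `nm` output.

    Undefined symbols have no address column, so only the last two
    whitespace-separated fields are used.
    """
    fields = line.split()
    return fields[-2], fields[-1]


def normalize(symbol: str, macho: bool) -> str:
    """Drop the single leading underscore used on Mach-O (macOS) symbols."""
    if macho and symbol.startswith("_"):
        return symbol[1:]
    return symbol


if __name__ == "__main__":
    mac = "000000000360abb0 T _zunml2_64_"   # from the macOS listing
    linux = "000000000174c200 T zunmtr_64_"  # from the Linux listing
    print(normalize(parse_nm_line(mac)[1], macho=True))
    print(normalize(parse_nm_line(linux)[1], macho=False))
```

After normalization, both platforms show the same `_64_` ending, so the suffix scheme itself does not differ between them.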

@Enchufa2

I don't have a strong opinion here. The broader the consensus upstream, the better for us.

In Fedora, we primarily follow the trail of OpenBLAS. They support SYMBOLSUFFIX since OpenMathLib/OpenBLAS/pull/459, and then there was some discussion around standardizing a suffix in e.g. OpenMathLib/OpenBLAS/issues/646. Here and there I see some Julia people stating that the 64_ suffix is being used in many places, but I have to say that we only have Julia and SuiteSparse using this in Fedora.

Anyway, we ported that to reference BLAS/LAPACK via some objcopy magic, and so we ship a version of both OpenBLAS and Netlib with SYMBOLSUFFIX=64_. This wasn't ported to either ATLAS or BLIS, though.

@grisuthedragon
Member Author

The objcopy thing is really a bit tricky. In general, the 64-bit suffix is what we need, but the quick-and-dirty way it was introduced, judging from the OpenBLAS discussion, leads to this difference between the C and the Fortran naming schemes. I would like to write in Fortran

 CALL DGEMM64(....) 

and from C either

 dgemm64_();

or

 cblas_dgemm64() 

because then one can easily argue: "Everything works like in the normal case, only with a 64 added". With the 64 added to the Fortran function name, the C interface automatically gets "64_" in the GNU/GCC world. For the Julia guys this isn't a problem, since the name mangling is implemented as the blasfunc macro there.

The way this suffix was included in OpenBLAS seems to have been considered only from the C interface point of view, without regard to what happens if you want to use it from Fortran.

@Enchufa2

Maybe we should summon @tkelman @stevengj @susilehtola to this discussion.

@stevengj

stevengj commented Dec 15, 2020

If you want to add _64 to the Fortran names, and then make the corresponding change to the linker symbols, that seems reasonable to me. It is equivalent to appending 64_ to the symbols on most modern Fortran compilers (that use lowercase symbols + underscore), especially on Mac and Linux systems, but there are still a few exceptions.

I just checked the manual, and it seems that Intel Fortran ifort on Windows (but not macOS or Linux) still defaults to using all-uppercase symbols with no underscore (since that was the convention set by its predecessor Digital Visual Fortran decades ago). Some now-ancient compilers (g77, following f2c) used lowercase+underscore, but append two underscores to Fortran identifiers containing an underscore, so they would effectively add 64__ to the symbols.

@grisuthedragon
Member Author

> If you want to add _64 to the Fortran names, and then make the corresponding change to the linker symbols, that seems reasonable to me. It is equivalent to appending 64_ to the symbols on most modern Fortran compilers (that use lowercase symbols + underscore), especially on Mac and Linux systems, but there are still a few exceptions.
>
> I just checked the manual, and it seems Intel Fortran ifort on Windows (but not macOS or Linux) still defaults to using all-uppercase symbols with no underscore (since that was the convention set by its predecessor Digital Visual Fortran decades ago). Some now-ancient compilers (g77, following f2c) used lowercase+underscore, but append two underscores to Fortran identifiers containing an underscore, so they would effectively add 64__ to the symbols.

I do not want to have the _ in the middle of the Fortran names. This is my issue with the suffix added by the OpenBLAS makefile. The makefile option makes people think 64_ is added, but the Fortran name is now "_64", which is the opposite.

@tkelman

tkelman commented Dec 15, 2020

There are one or more new LLVM-world Fortran compilers that are relevant to keep in mind. I suspect/hope they will implement gfortran-compatible mangling by default. But they might choose to do something different on windows given the ifort precedent, I haven't followed closely enough to know.

@stevengj

stevengj commented Dec 15, 2020

> I do not want to have the _ in between of the fortran names.

So you just want to append 64 to the Fortran names, e.g. dgemm64 rather than dgemm_64? That would be a reasonable choice, except that it conflicts with the convention set by the SunPerf BLAS, which was then copied by Fedora EPEL, which was then copied by Julia and OpenBLAS, and is currently under consideration by Intel MKL and AMD. The specific choice doesn't seem to matter so much, but it would be better to avoid pointless diversity here.

@grisuthedragon
Member Author

@stevengj
As I explained in the beginning, the current way, as used and implemented by Julia, leads to an inconsistency between the CBLAS/LAPACKE interface and the Fortran interface. I would like to have only the 64 added in Fortran and C, but I would also be happy with any consistent interface. That means we have either

    cblas_dgemm -> cblas_dgemm64
    LAPACKE_dgetrf -> LAPACKE_dgetrf64 
    DGEMM -> DGEMM64  (Fortran, gfortran symbol name dgemm64_)

or

    cblas_dgemm -> cblas_dgemm_64
    LAPACKE_dgetrf -> LAPACKE_dgetrf_64 
    DGEMM -> DGEMM_64  (Fortran, gfortran symbol name dgemm_64_)

But not the mixed scheme implemented at the moment in OpenBLAS.
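
The two consistent options and the current mixed scheme can be put side by side with a short Python sketch. The scheme labels, the `SCHEMES` table and the helper names are assumptions for illustration only. The "mixed" entry encodes the OpenBLAS behaviour as it is described in this thread, not something read from the OpenBLAS sources, and only the gfortran-style mangling is modelled.

```python
def gfortran_symbol(fortran_name: str) -> str:
    """gfortran-style mangling: lowercase plus one trailing underscore."""
    return fortran_name.lower() + "_"


# Suffix applied to C-level names (CBLAS/LAPACKE) and to Fortran names.
SCHEMES = {
    "plain64": {"c": "64", "fortran": "64"},
    "underscore64": {"c": "_64", "fortran": "_64"},
    "mixed": {"c": "64_", "fortran": "_64"},  # as described in this thread
}


def names(scheme: str) -> dict:
    """Return the renamed entry points for one scheme."""
    s = SCHEMES[scheme]
    fortran = "DGEMM" + s["fortran"]
    return {
        "cblas_dgemm": "cblas_dgemm" + s["c"],
        "LAPACKE_dgetrf": "LAPACKE_dgetrf" + s["c"],
        "DGEMM": fortran,
        "gfortran symbol": gfortran_symbol(fortran),
    }


if __name__ == "__main__":
    for scheme in SCHEMES:
        print(scheme, names(scheme))
```

In the two consistent schemes the C-level and Fortran-level names carry the same suffix. In the mixed one, `cblas_dgemm64_` sits next to `DGEMM_64`, which is the mismatch this issue is about.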

SunPerf BLAS is one of the less-used BLAS libraries, so I would like to establish a consistent interface rather than supporting a more or less quick-and-dirty-looking solution.
AMD is supporting BLIS, and its BLAS is only a wrapper, so the changes here affect only a small interface. I don't know about Intel, but implementing an inconsistent scheme is not their way, I think. I looked at the IBM ESSL stuff, and they do not provide anything.

@stevengj

The reason we followed the SunPerf BLAS option was because it is supported by SuiteSparse, IIRC.

The second option will be mostly compatible with what we are doing now (since Julia, SuiteSparse, and many other libraries use only the Fortran API).

@grisuthedragon
Member Author

grisuthedragon commented Dec 15, 2020

I looked at UMFPACK/Source/cholmod_blas.h and there are Sun-like defines, you are right. So I will implement this for the Fortran interface in FlexiBLAS. The CBLAS/LAPACKE interface will also add _64 (and not 64_ as done by OpenBLAS at the moment).

Do we agree that the _64 is added first and the Fortran name mangling is done afterwards? This would remove the problem of the different ways compilers translate the symbols. @stevengj, you already mentioned ifort on Windows, but XLF on Linux/PPC or PGI do similarly strange stuff, IIRC.

Edit:
Regarding LAPACK functions like DGGES3, it is better to have DGGES3_64 instead of DGGES364.

@grisuthedragon
Member Author

Internal Issue No. 89.
