Update ocaml-base-compiler, ocaml-system and ocaml-variants 4.13.0+ with support for native Windows #25861

Open
wants to merge 13 commits into
base: master

dra27
Member

@dra27 dra27 commented May 10, 2024

This PR augments the three compiler packages (ocaml-base-compiler, ocaml-system and ocaml-variants) with support for the MSVC and mingw-w64 native Windows ports for 4.13.0+. I intend to extend this support back to 4.08 as part of an ongoing overhaul of the compiler packages, but that is beyond the scope of this PR.

Principles

The implementation presented here stems from three underlying principles:

  1. All four native Windows ports should be able to be co-installed in the same opam root (i.e. ~/.opam) as separate switches (as with opam-repository-mingw). That is, the user should not be forced to choose permanently between MSVC/mingw-w64 and/or amd64/i686 at opam init.
  2. The Windows ports should not be made to look like Tier 2 or alternate platforms; i.e. the instructions to create a Windows OCaml switch should not be fundamentally different from other platforms.
  3. Depexts requirements should be precise, so the installation of a conf- package should not speculatively install more dependencies than necessary for the given switch. In particular, if a switch is configured with i686 OCaml, the installation of conf-libfoo should install the required packages for i686 libfoo, not both i686 and x86_64 libfoo.
Corollaries

There are some immediate observations and problems:

  • The first principle matches the existing capability of opam-repository-mingw.
  • The second principle prohibits either requiring opam exec or needing a Windows-specific wrapper (such as with-dkml or ocaml-env).
  • The second principle also has implications for the ocaml-base-compiler package, since there is no concept of a "default" C compiler on Windows. The mingw-w64 and MSVC ports of OCaml are not interchangeable in the way that GCC or Clang-based OCaml is.
  • The depext system in OCaml for Windows was based on opam 2.0's opam-depext plugin, and never adopted opam 2.1's integrated support. While the use of the plugin means that it suffers from the same solving problems that caused depext to be integrated directly into the solver in opam 2.1, it did mean that opam depext knew which compiler was installed in the switch, because it was necessarily run after the switch had been created.
  • The depexts section of an opam file can only be filtered on global variables. While we can (just about) set switch-specific global variables, this would be awkward (users would need to know of an extra step) and would also contradict the second principle by adding a Windows-specific step to the workflow. This means that the depexts for Windows need to be recorded in separate conf- packages, and the dependency graph of a switch needs to record sufficient information about the architecture and C compiler to select the correct package (in short, we require more base--like packages).
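
As a rough sketch of the last point, a conf- package could select its depexts through the dependency graph rather than through a depexts filter. Everything named libfoo below is hypothetical (conf-libfoo, the two helper packages and the Cygwin package name); only the host-arch- packages and the os and os-distribution variables come from this PR and from opam itself.

```opam
# conf-libfoo (hypothetical): picks one helper according to the host-arch-
# package which the compiler has already installed in the switch.
opam-version: "2.0"
depends: [
  ("host-arch-x86_64" & "conf-mingw-w64-libfoo-x86_64") |
  ("host-arch-x86_32" & "conf-mingw-w64-libfoo-i686")
]

# conf-mingw-w64-libfoo-i686 (hypothetical, a separate opam file): its
# depexts filter only needs global variables.
opam-version: "2.0"
depexts: [
  ["mingw64-i686-libfoo"] {os = "win32" & os-distribution = "cygwin"}
]
```

Under these assumptions, an i686 switch would only ever ask for the i686 system package, which is what the third principle requires.
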
Other related work

This work overlaps with some considerable ongoing additional work on the compiler's opam packages:

I was originally (back in September) of the opinion that it would be better to solve these two issues first and then add Windows support as an extension of these fixes. However, while it's tempting to engineer things such that Windows becomes a "minor" addition, the compatibility concerns for these two fixes make them much higher risk than the Windows packages, which have no compatibility story to worry about. I've therefore restructured the changes so that the alterations are made Windows-only for now, with the fixes to Unix following later, lifting these "limitations".

How it works

With opam 2.2.0 beta2, opam init (if pointed to this branch) will create an OCaml 5.1.1 switch. Concretely, on a clean Windows 11 system:

winget install Git.Git
winget install opam

and then, in a fresh terminal:

rem Accept all defaults for opam init
opam init git+https://github.com/dra27/opam-repository.git#windows-initial
opam exec -- ocaml

will give OCaml 5.1.1! When creating a switch, an arch- or system- package can simply be added just as for the ocaml-option- packages. For example, assuming the user has installed Visual Studio which, amongst other methods, may be done with:

winget install Microsoft.VisualStudio.2022.BuildTools --override "--add Microsoft.VisualStudio.Workload.VCTools --includeRecommended --passive"

then a 32-bit MSVC 4.14.2 switch may be created with:

opam switch create 4.14.2-msvc32 ocaml.4.14.2 system-msvc arch-x86_32
echo print_endline "Hello, world" > hello.ml
opam exec -- ocamlopt -o hello.exe hello.ml

Note that these steps do not require the user to start a Visual Studio Tools Command Prompt or do anything beyond installing Visual Studio.

Under the hood

In more detail, at present, the ocaml.x.y.z package encodes the version of OCaml being installed. To this, at present for Windows only, I have added two more sets of packages:

  • arch-x86_32 and arch-x86_64 allow the choice between the i686 and amd64 architectures.
  • system-mingw and system-msvc provide the choice between the mingw-w64 and Microsoft Visual Studio (MSVC) ports.

Both ocaml-base-compiler and ocaml-variants use these two sets of packages. ocaml-base-compiler remains "OCaml in its default configuration", but it becomes possible to control exactly which C compiler configuration it's using.

The default compiler is amd64 mingw-w64 (i.e. arch-x86_64 and system-mingw will be automatically added if no other arch- or system- package has been selected) for two reasons:

  • We can't detect MSVC using opam's depext system at present (but we can automatically install mingw-w64), so mingw-w64 as a default means that opam init always builds a working OCaml
  • Cygwin (and MSYS2) are not available for 32-bit systems, so users will be on 64-bit Windows.

For now, it is intentionally not possible to install the Cygwin port of OCaml using native Windows opam.

Whereas the user installs a system- and an arch- package, there are also new sets of host-arch- and host-system- packages, which are installed by all opam switches. The idea here is that one host-system- and one host-arch- package are always installed in a switch (be that ocaml-base-compiler, ocaml-variants or ocaml-system).

More information for users

The key rule is that arch- and system- should never be used in opam files because there are packages (such as ocaml-system) which don't use them. These are also packages which may disappear in the future if opam gains a way to specify configuration options for packages at installation time.

From the user's perspective, specifying arch-x86_64 vs host-arch-x86_64 is similar to the difference between specifying ocaml-base-compiler.4.14.1 vs ocaml.4.14.1. ocaml-base-compiler.4.14.1 instructs opam to build 4.14.1 from source, where ocaml.4.14.1 permits the use of a system compiler.

The motivation for this change is to be able to indicate precisely where packages are not supported:

  • available: os != "win32" is the sledgehammer: no Windows support at all
  • conflicts: "host-system-msvc": this package works with the mingw-w64 ports, but doesn't work with the Visual Studio ports
  • conflicts: "host-arch-x86_32": this package doesn't work on 32-bit Intel
  • depends: "host-system-mingw": this package only works with the mingw-w64 ports.
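
A minimal sketch of how one of these constraints might sit in a package's opam file. The package itself is hypothetical, and the version bound on ocaml is only there to make the fragment look complete.

```opam
# Hypothetical package which builds with mingw-w64 (and on Unix) but not
# with the Visual Studio ports
opam-version: "2.0"
synopsis: "Example bindings that rely on GCC-specific C code"
depends: [
  "ocaml" {>= "4.13.0"}
]
conflicts: [
  "host-system-msvc"
]
# A package with no Windows support at all would instead use:
#   available: os != "win32"
# and a package which only works with the mingw-w64 ports would add
#   "host-system-mingw"
# to its depends field.
```

Note that only host- packages appear here, in line with the rule above that arch- and system- are never used in opam files.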

Notes

I've attempted to organise the changes into a meaningful commit series, which is slightly easier to review than the entire diff in one go. I've added missing metadata fields to packages in order to pass opam lint.

@dra27
Member Author

dra27 commented May 10, 2024

I have a lot of markdown notes on the full compiler package infrastructure, as well as further notes on the changes here which I'll consolidate into a single piece of markdown which can finally form some documentation (probably to go in the wiki here).

@dra27
Member Author

dra27 commented May 10, 2024

(Some of the packages are triggering a lint warning which will be addressed in opam 2.2.0 beta3 by the final version of ocaml/opam#5927)

@jonahbeckford
Contributor

If you are making changes to ocaml-config, I have been stuck not being able to submit DkML into the opam repository because of https://gitlab.com/dkml/distributions/dkml/-/releases/2.1.1#likely-permanent-incompatibilities

The root cause is my changes at https://github.com/diskuv/diskuv-opam-repository/tree/main/packages/ocaml-config so that the delegate executables (ex. ocamlc-real.exe) used by the DkML shims (ex. ocamlc.exe) are recognized. There was some restriction on OCaml 5 only for ocaml-config.3 which makes it impossible for me to upstream.

@dra27
Member Author

dra27 commented May 11, 2024

ocaml-config.3 is OCaml 5 only when not on Windows - is that insurmountable? The change made here is to be explicit that Windows always requires ocaml-config.3. I think this may be a separate discussion (especially if those delegate executables relate to relocation).

This package is installed if the underlying OCaml compiler is for
64-bit IBM POWER.

Precisely, this means `ocamlopt -config-var architecture` equals `power`.
Member

Just a "nit" question here: why ppc64 instead of the OCaml power term in the package?

Member Author

Indeed - the namespacing here is just horrible from a usability perspective (recalling near errors over riscv vs riscv64, etc.!) - so many different names, and often in strings 😱

I opted that opam-based consistency was the best thing (or "least-worst"), so went with the values opam would use for %{arch}% on the basis that both are likely to be used in opam files and while it's odd that ocaml -config-var architecture gives power, it's probably stranger to have to write "host-arch-power" {arch = "ppc64"} in the opam files.

opam's arch variable is whatever uname -m gives with normalisations as given in OpamSysPoll.normalise_arch which pretty much reduce them to x86_64/x86_32, ppc64/ppc32, arm64/arm32, falling back to the uname -m value.
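
To make the trade-off concrete, here is how a dependency might read under each naming scheme. host-arch-ppc64 is the name used in this PR and host-arch-power is the alternative discussed above; where exactly such a line would appear is an assumption made for illustration.

```opam
# Names follow opam's arch variable: the package name and the filter agree
depends: [ "host-arch-ppc64" {arch = "ppc64"} ]

# Names follow ocamlopt -config-var architecture: the two spellings differ
# depends: [ "host-arch-power" {arch = "ppc64"} ]
```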

@avsm
Member

avsm commented May 12, 2024

A very charismatic PR, thank you @dra27 :-) I'm not quite sure what the best way to review this is; might you have any suggestions? There are three considerations:

  • new arch/host package scheme: this looks useful for other packages as well (for example for simpler selection of 64-bit architectures ?).
  • changes to ocaml-base-compiler packages for existing non-Windows architectures: is the expected standard that "no behavioural changes are expected" here?
  • do the new compilers work on Windows: do we have a story aside from manual testing yet? Manual testing is just fine for now, but it would be a useful guidance to opam-repo maintainers to be explicit here.

@dra27
Member Author

dra27 commented May 12, 2024

Thanks 🙂 I certainly intend to extend the arch- and system- packages to allow better control of the compiler build on Unix systems too (for example, selecting between gcc and clang and being able to control the selection of arm64 or x86_64 on macOS). The host-arch- and host-system- packages work completely on Windows because opam is able to encode the defaults, but to do this on Unix will require some changes in opam (which, probably unsurprisingly, I'm already scheming towards!), but the packages as given here should at least encode the current slightly limited behaviour (cf. things like ocaml/opam#5949). The main thing is that we need a way to communicate a tiny bit more information about the C compiler to the solver (as, for example, we already do about any OCaml "system" compiler).

It should indeed be the case that nothing has changed for non-Windows apart from the installation of host-system-other and a host-arch- package.

There isn't a testing story as yet beyond manual testing, but there isn't a story for the Unix compiler packages either. At the moment, ocaml-variants packages are over-tested because of a "bug" in docker-base-images and ocaml-base-compiler packages aren't tested at all.

@dra27
Member Author

dra27 commented May 12, 2024

I wonder whether a synchronous review may be best?

@raphael-proust
Collaborator

I think a synchronous review would definitely help me… I'm pinging you offband about it

@dra27
Member Author

dra27 commented May 13, 2024

At the moment, ocaml-variants packages are over-tested because of a "bug" in docker-base-images and ocaml-base-compiler packages aren't tested at all.

A slight refinement - new compiler releases aren't tested at all, but patches to the latest release of existing branches are (although, again, it is more that they're tested by accident than design; cf. ocaml-base-compiler.4.14.2, ocaml-base-compiler.5.0.0, ocaml-base-compiler.5.1.1)

Comment on lines 4 to 5
This package specifies OCaml built with Microsoft Visual Studio and is presently
presently available for i386/x86_32 and amd64/x86_64.
Collaborator

Suggested change
-This package specifies OCaml built with Microsoft Visual Studio and is presently
-presently available for i386/x86_32 and amd64/x86_64.
+This package specifies OCaml built with Microsoft Visual Studio and is presently
+available for i386/x86_32 and amd64/x86_64.

"presently" is duplicated

Member Author

Oops - fixed!

license: "CC0-1.0+"
homepage: "https://opam.ocaml.org"
bug-reports: "https://github.com/ocaml/opam-repository/issues"
conflict-class: "ocaml-host-architecture"
Collaborator

Suggested change
-conflict-class: "ocaml-host-architecture"
+conflict-class: "host-arch"

to match the now-reserved prefix for package names

needs to be applied to other packages too

Collaborator

actually maybe using just the prefix is a bit short, especially for some other packages like arch-. So possibly we should have meta-package-<prefix> or opam-setup-selection-<prefix> or something like that.

We'll discuss this during the meeting (or anyone should feel free to drop comments here) and decide on something.

Member Author

What was the conclusion for these ones? The PR at the moment introduces:

  • msys2-environment
  • ocaml-architecture-setting & ocaml-system-setting
  • ocaml-host-architecture & ocaml-host-system
  • ocaml-mingw-env & ocaml-msvc-env

Prior to this PR, opam-repository's only use of conflict-class is ocaml-core-compiler (which is why the feature was added)
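
For reference while deciding on names: packages which share a conflict-class cannot be installed together in the same switch. A minimal sketch using the class name currently in the PR (the two fragments belong to separate opam files, and all other fields are omitted):

```opam
# host-arch-x86_64 (fragment of its opam file)
conflict-class: "ocaml-host-architecture"

# host-arch-x86_32 (fragment of its opam file)
conflict-class: "ocaml-host-architecture"
```

Whatever name is chosen, it has to be identical across every package in the group, which is why a rename needs to be applied to all of them at once.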

@@ -4,7 +4,12 @@ authors: "C++ compiler developers"
homepage: "https://github.com/ocaml/opam-repository"
bug-reports: "https://github.com/ocaml/opam-repository/issues"
license: "GPL-2.0-or-later"
-build: ["c++" "--version"]
+build: [
+"i686-w64-mingw32-c++" {os = "win32" & host-arch-x86_32:installed}
Collaborator

@dra27 IIUC this line

  • is currently "dead code" (as in, the condition cannot be met because we can't have windows 32bit platforms),
  • but it is included because (a) it's nothing to do with this here conf package that makes it dead and (b) in the future it should become not-dead

Is that a correct understanding?

(Or is the comment in packages/arch-x86_64/arch-x86_64.1/opam outdated?)

Member Author

Not quite - 64-bit hosts can run 32-bit executables, so 32-bit is alive and well, it's just that you can't (easily) run opam on actual 32-bit Windows 10.

The point of the comment is that I've put (arch = "x86_64" | arch = "arm64"), but we're extremely unlikely ever to see arch = "x86_32"

Comment on lines +38 to +46
("arch-x86_64" {os = "win32" & arch = "x86_64"} &
("system-mingw" & "mingw-w64-shims" {os-distribution = "cygwin" & post} |
"system-msvc") |
# i686 mingw-w64 / MSVC
"arch-x86_32" {os = "win32"} &
("system-mingw" & "mingw-w64-shims" {os-distribution = "cygwin" & post} |
"system-msvc") |
# Non-Windows systems
"host-system-other" {os != "win32" & post})
Collaborator

I'm pretty sure this uses the (implicit) operator precedence correctly, but I'm tempted to add parentheses. Opinions from anyone?

Member Author

Suggested change
-("arch-x86_64" {os = "win32" & arch = "x86_64"} &
-("system-mingw" & "mingw-w64-shims" {os-distribution = "cygwin" & post} |
-"system-msvc") |
-# i686 mingw-w64 / MSVC
-"arch-x86_32" {os = "win32"} &
-("system-mingw" & "mingw-w64-shims" {os-distribution = "cygwin" & post} |
-"system-msvc") |
-# Non-Windows systems
-"host-system-other" {os != "win32" & post})
+(("arch-x86_64" {os = "win32" & arch = "x86_64"} &
+(("system-mingw" & "mingw-w64-shims" {os-distribution = "cygwin" & post}) |
+"system-msvc")) |
+# i686 mingw-w64 / MSVC
+("arch-x86_32" {os = "win32"} &
+(("system-mingw" & "mingw-w64-shims" {os-distribution = "cygwin" & post}) |
+"system-msvc")) |
+# Non-Windows systems
+"host-system-other" {os != "win32" & post})

I'm not sure it's clearer, but I don't mind either way - what do you think?

@@ -8,25 +8,65 @@ license: "LGPL-2.1-or-later WITH OCaml-LGPL-linking-exception"
authors: "Xavier Leroy and many contributors"
homepage: "https://ocaml.org"
bug-reports: "https://github.com/ocaml/opam-repository/issues"
-dev-repo: "git+https://github.com/ocaml/ocaml"
+dev-repo: "git+https://github.com/ocaml/ocaml#5.0"
Collaborator

Why is this not done on the 4* packages?

Member Author

Hah - I think this is an artefact of how I applied the patches across them - I would have diff'd 4.14.1 and 5.0.0 and not spotted the diff on the dev-repo but when I applied the changes to 5.1.0 I saw the diff, corrected the 5.0 URL but then didn't make the connection about the 4.13 and 4.14 packages! Done...!

@dra27
Member Author

dra27 commented May 23, 2024

First rebase to clear conflicts with #25875 and #25911; now addressing review

dra27 added 7 commits May 23, 2024 15:31
Add the system-mingw and system-msvc packages to specify either the
mingw-w64 or MSVC ports when compiling OCaml and the host-system-mingw
and host-system-msvc packages to be used for the dependency graph.

The intention is to complete this for non-Windows systems, but as there
will always be a chance of an unknown system, host-system-other is added
to be used for all the other ports. This package is significant, as
compiler packages must always install a host-system- package (or the
user could attempt to install another incorrect one).

A package is available for each supported OCaml architecture.
host-arch-unknown is added to ensure that each compiler package is
always able to install one of these packages.

The intention is to complete this for non-Windows systems, and in
particular to ensure that this is fully compatible with
ocaml-option-32bit. For now, only arch-x86_32 and arch-x86_64 are
available, as these are the two supported Windows architectures.

This is a "legacy" package, given how long there has been temporary
packaging for OCaml 5 with mingw-w64 support. This package provides
ocaml-option-mingw in a mechanism compatible with the existing
ocaml-option- layout, that is to say it requires the
ocaml-variants.x.yy.x+options package to be used and conflicts with all
the ocaml-options-only- packages.

Users are expected instead to use system-mingw and either their default
architecture or arch-x86_32/arch-x86_64 to select the mingw-w64 port
(which also works for ocaml-base-compiler).

Also added missing license field to these files.

Adds support for the mingw-w64 and MSVC native Windows ports of OCaml
for OCaml 4.13.0 onwards.

Two minor updates are required to the options packages:
- ocaml-option-nnpchecker is supported by the 64-bit MSVC port (but
  not by the mingw-w64 port, because it relies on SEH, which mingw-w64 GCC
  doesn't support)
- ocaml-option-tsan is not supported on any Windows ports (sadly)

The conf-msvc32 and conf-msvc64 packages can be co-installed, but only
one compiler may be activated at a time. This is expressed by the
ocaml-msvc-env package, which ensures that only one configuration is set in
the environment. Placing these updates in a separate package also ensures
that the setenv updates are only ever considered when actually needed
(avoiding the issues with opam 2.0.10 and 2.0.4 not supporting += "" in
environment updates).

ocaml-system is updated to install the appropriate host-arch- package
dependent on the opam 2.1 sys-ocaml-arch variable. This variable is the
value of ocamlc -config-var architecture, but with amd64 changed to
x86_64 and i386 changed to i686. If this variable is not defined by opam
(for example, for opam 2.0, or where an opam root was upgraded from 2.0
to 2.1), then host-arch-unknown is installed. For Windows, this variable
must be defined. host-system-mingw, host-system-msvc or
host-system-other are installed dependent on the opam 2.1 sys-ocaml-libc
("msvc" for mingw-w64 and MSVC ports and "libc" for everything else) and
sys-ocaml-cc ("cl" for the MSVC port and "cc" for everything else)
variables.
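
A simplified sketch of how such a selection could be written in the ocaml-system opam file, based only on the description above. The exact filters in the PR may well differ: the host-arch-unknown fallback and the remaining architectures are left out, and the use of post here is an assumption.

```opam
depends: [
  # Architecture, from sys-ocaml-arch
  ("host-arch-x86_64" {sys-ocaml-arch = "x86_64" & post} |
   "host-arch-x86_32" {sys-ocaml-arch = "i686" & post})
  # C toolchain, from sys-ocaml-libc and sys-ocaml-cc
  ("host-system-msvc" {sys-ocaml-libc = "msvc" & sys-ocaml-cc = "cl" & post} |
   "host-system-mingw" {sys-ocaml-libc = "msvc" & sys-ocaml-cc = "cc" & post} |
   "host-system-other" {sys-ocaml-libc = "libc" & post})
]
```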

ocaml-base-compiler and ocaml-variants both recognise the arch-x86_64
and arch-x86_32 packages for Windows which allow selecting between the
32-bit and 64-bit variants of the Windows ports (note that this is
distinct from the somewhat ad hoc ocaml-option-32bit package) and
similarly system-mingw and system-msvc to select between the mingw-w64
and MSVC ports. Both packages use the flexdll source package to
bootstrap flexlink and the FlexDLL runtime objects as part of the
compiler build.

All three packages will configure either the appropriate depexts and
mingw-w64 shims or Microsoft Visual Studio Tools environment for the
given compiler port, and ensure that these are placed into the
environment as part of opam env.

opam 2.x has no way to determine if Visual Studio is available or to
cause it to be installed. Flagging ocaml-msvc-env as avoid-version
therefore makes the MSVC ports "opt-in" only. Hopefully this situation
can be improved with enhanced depexts in opam 3.0.

Use avoid-version to steer Windows towards a 64-bit compiler by default.
Note that although opam can be compiled for 32-bit Windows, 32-bit
Windows is deprecated (Windows 10 is the last version available as a
32-bit host), and Cygwin is only available for 64-bit.
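
For readers unfamiliar with it, avoid-version is an ordinary package flag (available from opam 2.1). A fragment of what the last two commit messages describe might look like the following; which exact package carries the flag for the 32-bit case is an assumption here.

```opam
# ocaml-msvc-env (fragment): the solver steers away from this package
# unless the user explicitly asks for an MSVC switch
flags: avoid-version

# arch-x86_32 (fragment, assumed): keeps 64-bit as the default choice
flags: avoid-version
```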
4 participants