Welcome to the ffi-experimental wiki!
Experimental work on next-generation FFI bindings into the C++ libtorch library, in preparation for 0.0.2, which targets the 1.0 backend.
The PyTorch project provides prebuilt binaries of the C++ libtorch library on its official page, and a Debian package for Ubuntu. By using these reliable binaries, we can start running Haskell programs in various environments quickly. (Compiling C++ libtorch with CUDA support takes a long time.)
The PyTorch project develops very quickly, so its API changes frequently. This makes it difficult to maintain a Haskell API for it by hand.
There is a plan for Declarations.yaml to become the single, externally visible API; see this issue. Accordingly, code generation uses the generated Declarations.yaml spec instead of header parsing.
Declarations.yaml is located at ffi-experimental/deps/pytorch/build/aten/src/ATen/Declarations.yaml. The file is generated by building the libtorch binary or by running deps/get-deps.sh.
It covers the Native, TH, and NN functions, but it does not cover the methods of C++ classes. The code for methods is generated from spec/cppclass/*.yaml.
The dataflow is as follows:

spec/Declarations.yaml (pytorch) --+
spec/cppclass/*.yaml (this repo) --+--> codegen (a program of this repo) --> ffi (FFI bindings of this repo)
Use inline-c-cpp to bind the C++ API instead of the C API. inline-c-cpp generates C++ code and Haskell code at compilation time, using Template Haskell.
Technically, the symbols of the generated C++ code are wrapped with extern "C" (see How to mix C and C++).
The generated Haskell code uses the FFI.
The original inline-c-cpp does not support C++ namespaces and templates. To support them, we use a modified inline-c-cpp; see this PR.
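Because the generated C++ symbols are wrapped in extern "C", the Haskell side can bind them with ordinary foreign imports. The following is a minimal, self-contained sketch of that mechanism, using the standard C sqrt function in place of a libtorch symbol (the real bindings are generated by inline-c-cpp rather than written by hand):

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

-- A sketch of the shape of a generated binding: the C++ side exposes
-- an extern "C" symbol, and the Haskell side imports it with a plain
-- foreign import. Here the standard C sqrt stands in for a libtorch
-- wrapper function.
foreign import ccall unsafe "math.h sqrt"
  c_sqrt :: Double -> Double

main :: IO ()
main = print (c_sqrt 4.0)
```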
C++ has two storage locations for objects: the heap and the stack.
libtorch functions return objects on the stack.
When a function that holds an object in a local variable returns, the object on the stack is deleted.
For example, in the code below, when test() returns, the Tensor a on the stack is deleted:
void test(){
  at::Tensor a = at::ones({2, 2}, at::kInt);
  at::Tensor b = at::randn({2, 2});
  auto c = a + b.to(at::kInt);
}
So this FFI puts the object on the heap using new, so that it is not deleted:
at::Tensor* ones_for_haskell(){
  at::Tensor a = at::ones({2, 2}, at::kInt);
  return new at::Tensor(a);
}
C data is passed to function arguments directly, by value. C++ objects are passed to function arguments by object pointer.
Likewise, at the end of a function call, C data is returned by value, while a C++ object is returned as a pointer to a heap copy made with new.
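As a concrete illustration of these two calling conventions, the sketch below binds two standard C functions: abs receives plain data by value, while strlen receives its argument through a pointer (here a C string stands in for a C++ object):

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CInt (..), CSize (..))

-- Plain C data crosses the FFI boundary by value:
foreign import ccall unsafe "abs"
  c_abs :: CInt -> CInt

-- Object-like data crosses by pointer (a C string here):
foreign import ccall unsafe "strlen"
  c_strlen :: CString -> IO CSize

main :: IO ()
main = do
  print (c_abs (-7))
  withCString "hello" c_strlen >>= print
```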
Use GHC's garbage collection.
The generated FFI code consists of unmanaged code (ffi-experimental/ffi/src/Aten/Unmanaged/*) and managed code (ffi-experimental/ffi/src/Aten/Managed/*).
Unmanaged code uses the Ptr type, which corresponds to a raw C/C++ pointer.
Managed code uses the ForeignPtr type, which is managed by GHC.
To convert unmanaged code to managed code, a C++ object has to be an instance of the CppObject type class, and the managed code wraps the unmanaged code using the cast functions of the Castable type class. You can see the details of cast in ffi-experimental/ffi/src/Aten/Cast.hs:
class CppObject a where
  fromPtr :: Ptr a -> IO (ForeignPtr a)

class Castable a b where
  cast :: a -> (b -> IO r) -> IO r
  uncast :: b -> (a -> IO r) -> IO r

instance (CppObject a) => Castable (ForeignPtr a) (Ptr a) where
  cast x f = withForeignPtr x f
  uncast x f = fromPtr x >>= f

cast0 :: (Castable a ca) => IO ca -> IO a
cast0 f = f >>= \ca -> uncast ca return

cast1 :: (Castable a ca, Castable y cy)
      => (ca -> IO cy) -> a -> IO y
cast1 f a = cast a $ \ca -> f ca >>= \cy -> uncast cy return

...
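To make the cast pattern concrete, here is a self-contained, runnable sketch. It replaces the C++ object with a malloc'ed Int so it compiles with only base: rawDouble plays the role of a generated unmanaged binding that returns a new heap object, and cast1 lifts it to a managed function whose result GHC's garbage collector finalizes. (The toy Castable instance uses finalizerFree where the real Cast.hs attaches a finalizer that calls the C++ delete.)

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}
import Foreign

-- An "unmanaged" function in the style of the generated raw bindings:
-- it takes a raw pointer and returns a freshly malloc'ed result,
-- mimicking a C++ wrapper that returns `new T(...)`.
rawDouble :: Ptr Int -> IO (Ptr Int)
rawDouble p = do
  x <- peek p
  q <- malloc
  poke q (2 * x)
  return q

class Castable a b where
  cast   :: a -> (b -> IO r) -> IO r
  uncast :: b -> (a -> IO r) -> IO r

-- Toy instance: the real code installs a finalizer that calls the
-- C++ delete; here free (finalizerFree) is enough for malloc'ed data.
instance Castable (ForeignPtr Int) (Ptr Int) where
  cast x f   = withForeignPtr x f
  uncast p f = newForeignPtr finalizerFree p >>= f

cast1 :: (Castable a ca, Castable y cy) => (ca -> IO cy) -> a -> IO y
cast1 f a = cast a $ \ca -> f ca >>= \cy -> uncast cy return

-- The "managed" wrapper: GHC's GC now owns the result.
managedDouble :: ForeignPtr Int -> IO (ForeignPtr Int)
managedDouble = cast1 rawDouble

main :: IO ()
main = do
  fp <- mallocForeignPtr
  withForeignPtr fp (`poke` 21)
  r  <- managedDouble fp
  withForeignPtr r peek >>= print
```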
When a C++ function of libtorch fails, it throws an exception.
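The repository's actual error handling lives in the generated bindings; as a hypothetical illustration only, the sketch below shows one common pattern for surfacing a failed foreign call as a Haskell exception that callers can catch with try (rawCall, checked, and LibtorchError are invented names for this sketch, not part of this repo):

```haskell
import Control.Exception

-- Hypothetical exception type for failures reported by the C++ side.
newtype LibtorchError = LibtorchError String deriving Show
instance Exception LibtorchError

-- Pretend raw call that signals failure with Nothing
-- (e.g. standing in for a null pointer result).
rawCall :: IO (Maybe Int)
rawCall = return Nothing

-- Convert the failure signal into a Haskell exception.
checked :: IO (Maybe a) -> IO a
checked act =
  act >>= maybe (throwIO (LibtorchError "libtorch call failed")) return

main :: IO ()
main = do
  r <- try (checked rawCall)
  case r of
    Left (LibtorchError msg) -> putStrLn ("caught: " ++ msg)
    Right v                  -> print (v :: Int)
```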
- For now, use stack. (To use cabal-v2, update shell.nix and cabal.project)
- CircleCI
- Ubuntu18.04
- stack
- Use a pinned libtorch binary
# Download libtorch-binary and generate 'Declarations.yaml'
> pushd deps
> ./get-deps.sh
> popd
# Generate the FFI code into the output directory.
> stack exec codegen-exe
# Check difference and copy the generated codes.
> diff -r output/Aten ffi/src/Aten
> cp -r output/Aten ffi/src/
# Build and test
> stack test ffi
See MemorySpec.hs.
See BasicTest.hs.
- The prebuilt libtorch uses the old gcc ABI to maintain backwards compatibility. Pass -D_GLIBCXX_USE_CXX11_ABI=0 to gcc.
- Integrate this FFI into hasktorch/hasktorch.
- What does a generated function's suffix mean? e.g. the tts of add_tts.
- C++ supports overloading; Haskell does not. We use the suffix so that function names do not conflict in Haskell.
- Is torch::Tensor the same as at::Tensor?
- Yes.
- Why not use fficxx?
- fficxx does not support managed code using ForeignPtr.
- What are native_functions.yaml and nn.yaml?
- These files are used to generate Declarations.yaml.
Please feel free to update this document and add FAQ entries.