[WIP] Remove obsolete SIMD and BLAS dependencies (for cpu) #152

Open

hikettei opened this issue May 14, 2024 · 1 comment

Comments

hikettei (Owner) commented May 14, 2024

We aim to generalize the APIs and optimization techniques across different computer architectures; that is, we also have to add GPU support by removing the CPU-specific dependencies, because cl-waffe2 was originally designed that way (easy to extend, easy to fuse multiple kernels). cl-waffe2 is essentially a set of tensor abstraction APIs and more, including the fastest autodiff in Common Lisp.

As of now, I'm working on a deep learning compiler for multiple targets, including AVX, Neon, NVIDIA, AMD, and more (it also extends the easy-to-extend design concept):

https://github.com/hikettei/AbstractTensor.lisp

The approach is similar to tinygrad; even a clean tinygrad port to Common Lisp could be a good direction.

This might involve breaking changes and belongs to my future work (that's why I have created a new issue), but I believe this modification will let us get an Int8-quantized Llama3 model running in Common Lisp with minimal dependencies. This could become one of the reasons to use Common Lisp, because it would be hard to reproduce with such a small footprint in Python or other language communities.

Workload to implement LLAMA3

  • (nearly) complete tinygrad port to Common Lisp
  • Fast Conv2D kernel implementation (and Winograd)
  • Support more fuse patterns
  • GPU support (NVIDIA, Metal, and AMD are all not that difficult with our approach)
  • Improve the data type interface, especially cast ops and quantization op support with the JIT (see the quantization sketch below).
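
Since int8 quantization shows up both in the Llama3 goal above and in the data type item, here is a minimal sketch of the numeric idea in plain Common Lisp. It does not use any cl-waffe2 API; `quantize-int8` and `dequantize-int8` are hypothetical names for illustration only. Symmetric per-tensor quantization stores `round(x / scale)` as an int8 together with a single float scale, and dequantization multiplies back:

```lisp
;; Sketch only (not cl-waffe2 API): symmetric per-tensor int8 quantization.
;; q = clamp(round(x / scale), -128, 127) with scale = max|x| / 127;
;; dequantization recovers x ≈ q * scale.
(defun quantize-int8 (xs)
  "Quantize a list of floats XS; returns (values int8s scale)."
  (let* ((absmax (reduce #'max xs :key #'abs :initial-value 0.0))
         (scale  (if (zerop absmax) 1.0 (/ absmax 127.0))))
    (values (mapcar (lambda (x) (max -128 (min 127 (round x scale)))) xs)
            scale)))

(defun dequantize-int8 (qs scale)
  "Recover approximate floats from int8 values QS and the stored SCALE."
  (mapcar (lambda (q) (* q scale)) qs))

;; (multiple-value-bind (qs scale) (quantize-int8 '(0.1 -0.5 0.9))
;;   (dequantize-int8 qs scale)) ; => values close to the inputs
```

A real kernel would work on packed simple-arrays and accumulate in a wider integer type, but this scale bookkeeping is what the cast/quantization ops and the JIT's data type interface need to express.
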
hikettei (Owner, Author) commented Jun 3, 2024

Memo

    1. SIMD/Unroll/Parallel/GPU/Scheduler
    2. New nodes: Cast/ChangeFacetNode
    3. Support JIT Compilation for cl-waffe2/distribution
    4. Im2Col/Gemm (Aten Op) (see the im2col sketch below)
    5. Export2C Mode
    6. From ONNX Mode
  • Update the docgen system
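
For the Im2Col/Gemm item, here is a tiny plain-Common-Lisp illustration of lowering a single-channel Conv2D (no padding, no stride) to an im2col transform followed by an ordinary GEMM. This is a sketch of the general technique, not the actual Aten op:

```lisp
;; Sketch only: single-channel conv2d as im2col + naive GEMM.
(defun im2col (input kh kw)
  "Unfold a 2D array INPUT into a (kh*kw) x (oh*ow) matrix of patches."
  (let* ((h (array-dimension input 0))
         (w (array-dimension input 1))
         (oh (1+ (- h kh)))
         (ow (1+ (- w kw)))
         (cols (make-array (list (* kh kw) (* oh ow)) :initial-element 0.0)))
    (dotimes (i oh cols)
      (dotimes (j ow)
        (dotimes (ki kh)
          (dotimes (kj kw)
            (setf (aref cols (+ (* ki kw) kj) (+ (* i ow) j))
                  (aref input (+ i ki) (+ j kj)))))))))

(defun gemm (a b)
  "Naive matrix multiply of 2D arrays A (m x k) and B (k x n)."
  (let* ((m (array-dimension a 0))
         (k (array-dimension a 1))
         (n (array-dimension b 1))
         (c (make-array (list m n) :initial-element 0.0)))
    (dotimes (i m c)
      (dotimes (j n)
        (dotimes (p k)
          (incf (aref c i j) (* (aref a i p) (aref b p j))))))))

;; Flattening the kernel to a 1 x (kh*kw) row and multiplying it by the
;; im2col matrix yields the convolution output as a 1 x (oh*ow) row, so a
;; fast GEMM (or later, Winograd) is all the backend has to provide.
```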

#155
