[llama.cpp] Update llama.cpp to latest release b2581 (2024-03-30) #3055

howlger · 2024-04-01T11:14:02Z

To support BERT-based sentence embedding models such as BAAI/bge-base-en-v1.5 or mixedbread-ai/mxbai-embed-large-v1, update llama.cpp from b1696 (2023-12-12) to the current latest release, b2581 (2024-03-30).

BERT support was added to llama.cpp in February 2024: ggerganov/llama.cpp#5423

This change has not been tested yet. Updating the Gradle property llamacpp_version may not be enough; ai_djl_llama.cpp may need to be adapted as well. If so, please do so.
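For reference, a minimal sketch of the version bump described above. The property name and both version tags come from this description; the file it lives in (assumed here to be the root `gradle.properties`) is an assumption and may differ in the actual repository layout.

```properties
# Assumed location: gradle.properties at the repository root (not confirmed in this PR).
# Before this change: llamacpp_version=b1696
llamacpp_version=b2581
```

If the JNI wrapper no longer compiles against the newer llama.cpp sources, this one-line change alone would not be sufficient.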

* Implement PtNDArraryEx.multiboxDetection
* MultiboxDetection - code cleanup
* MultiboxDetection - code cleanup
* MultiboxDetection - code cleanup
* MultiboxDetection - code cleanup
* format code
* Fix, add tests, and pass CI

---------

Co-authored-by: Zach Kimberg <kimbergz@amazon.com>

…valibrary#2806)

* [api] Added Early stopping configuration (deepjavalibrary#38)
* [api] Added Builder for Early stopping configuration (deepjavalibrary#38)
* Explicitly set NDManager for dataset in EarlyStoppingListenerTest to make the test run on JDK11 in gradle.

This creates an abstraction for combining devices into a single device. The main use case for now is in DJL Serving TP_parallel. It will allow us to create a WorkerGroup and a PyPredictor for a set of devices and then track the usage of devices properly. It could also be used later for multi-gpu training or other multi-device cases.

* Added 2D FFT
* Format java
* Add default fft2
* Convert array to vectors
* Add inverse fft2
* Add better assertion in ifft2 test
* Add really better assertion in ifft2 test
* Move cast below ifft2 for unsupported exception
* Format java
* changed dims to axes
* changed dims to axes

In order to get support for BERT based sentence embedding models like BAAI/bge-base-en-v1.5, mixedbread-ai/mxbai-embed-large-v1, or others, update llama.cpp from b1696 (2023-12-12): https://github.com/ggerganov/llama.cpp/releases/tag/b1696 to the current latest release b2581 (2024-03-30): https://github.com/ggerganov/llama.cpp/releases/tag/b2581

BERT support was added to llama.cpp in February 2024: ggerganov/llama.cpp#5423

frankfliu

The llama.cpp implementation has changed, so just bumping the version won't work; we have to change the JNI code to make it compile.

howlger · 2024-04-02T08:25:38Z

I see. Thanks for taking the time to try it.

The native libraries are built for all platforms with Native S3 llama.cpp, right? I can't find the log of the failed build. Could you please share it?

SidneyLann and others added 30 commits September 19, 2023 17:36

To support Yolov8 (deepjavalibrary#2776)

950340f

---------

Co-authored-by: Administrator <Administrator@tech8>
Co-authored-by: KexinFeng <fenkexin@amazon.com>

[onnxruntime] Upgrades OnnxRuntime to 1.16.0 (deepjavalibrary#2784)

da15713

build ft for sm90 (deepjavalibrary#2785)

15fd0d0

Updates LightGBM to 1.7.6 (deepjavalibrary#2793)

3a90d0a

Revert "Updates LightGBM to 1.7.6 (deepjavalibrary#2793)" (deepjavalibrary#2796)

8fd79db

This reverts commit 3a90d0a.

[tokenizer] Allows import non-english model (deepjavalibrary#2797)

27c6a57

allow to just build for 1 flow (deepjavalibrary#2798)

d432a65

[api] Fixed NDList decode numpy file bug (deepjavalibrary#2804)

458933c

[api] Allows cancel Input (deepjavalibrary#2805)

2f4ebee

[ci] Fixes out of diskspace issue (deepjavalibrary#2808)

90059cd

[docs] Fixes markdown headers (deepjavalibrary#2812)

298ea1f

This fixes the markdown headers to be h1 so they render correctly in docs.

Bump up DJL version to 0.25.0 (deepjavalibrary#2809)

f0b4334

add gpu flag build for triton client (deepjavalibrary#2815)

fe86680

[xgb] Add .xgb file extension support (deepjavalibrary#2810)

1c5aef8

[tokenizers] Upgrade huggingface tokenizers to 1.14.1 (deepjavalibrary#2818)

23e07cf

[huggingface] Adds CrossEncoderTranslator (deepjavalibrary#2817)

85d9e85

Update README with release update (deepjavalibrary#2823)

7d68857

[api] Replace double-check singlton with lazy initialization (deepjavalibrary#2826)

3927867

[api] Refactor PublisherBytesSupplier.java (deepjavalibrary#2831)

9b0c8c9

Updates doc versions to 0.24.0 (deepjavalibrary#2829)

6981d76

* Updates doc versions to 0.24.0. Also moves android gradle.properties to the new 0.25.0.
* Remove android change

Updates XGBoost to 2.0.1 (deepjavalibrary#2833)

715e620

* Updates XGBoost to 2.0.1
* Use devtools 8
* Updates based on new Xgboost JNI API.

---------

Co-authored-by: Frank Liu <frankfliu2000@gmail.com>

[tokenizer] Fixes tokenizer bug (deepjavalibrary#2843)

9f55189

Fixes deepjavalibrary#2840

Add erf and atan2 (deepjavalibrary#2842)

f84d3bb

* Added element-wise gauss error function (ERF)
* Added element-wise arctan2
* Format java
* Fixed docs
* added * to other_ptr in Atan2

only build triton binaries (deepjavalibrary#2847)

8f6ff7c

* only build triton binaries
* install requests library
* remove script

[tokenizer] Update import script for huggingface_hub api change (deepjavalibrary#2850)

e8ceef3

[docs] Update serving configuration nav (deepjavalibrary#2853)

e315554

Updates the navigation as a followup to deepjavalibrary/djl-serving#1316.

frankfliu and others added 23 commits March 1, 2024 09:51

[api] Adds suffix to percentile metric name (deepjavalibrary#3011)

2a75b18

[api] Adds dimension for prediction metric (deepjavalibrary#3013)

93123ad

Thread-safe FaceDetectionTranslator.java by using TranslatorContext (deepjavalibrary#3016)

97cfc19

[docs] Disable progress bar for jupyter notebook convertion (deepjavalibrary#3017)

ad076c8

[api] Upgrades commons compress to 1.26.0 for CVE (deepjavalibrary#3018)

b144377

[onnx] Upgrades OnnxRuntime to 1.17.1 (deepjavalibrary#3019)

c3b1e88

Avoid duplicated loading native library (deepjavalibrary#3020)

ff3978c

[ci] Fixes nightly build for onnx 1.17.1 (deepjavalibrary#3021)

063fb8d

[pytorch] Fixes detecting wrong flavor on macOS issue (deepjavalibrary#3027)

0fa0db2

[api] Allows to use relative jar uri for cache folder name (deepjavalibrary#3026)

8c5ed49

Fixes: deepjavalibrary#3025

[example] Adds document about how to trace gpt2 model (deepjavalibrary#3028)

7567277

[docs] update mkdocs structure for new lmi documentation (deepjavalibrary#3029)

b6b8729

support includeTokenTypes in TextEmbeddingBatchTranslator (deepjavalibrary#3032)

6b32ef2

* support includeTokenTypes in TextEmbeddingBatchTranslator

Co-authored-by: Frank Liu <frankfliu2000@gmail.com>

[bom] Fixes djl-serving packages in BOM (deepjavalibrary#3039)

e3a8e4c

[api] Support encode/decode String tensor (deepjavalibrary#3034)

b3b04f5

Fixes: deepjavalibrary#3033

[tokenizer] Adds includeTokenTypes for all translators (deepjavalibrary#3035)

ee93305

Updates dependencies version to latest (deepjavalibrary#3040)

fa28fbe

[pytorch] Allows to exclude certain DLL from pytorch directory (deepjavalibrary#3043)

091a41c

Update checkstyle tool version to 10.14.2 (deepjavalibrary#3047)

5eb2b9e

Upgrade dependency version (deepjavalibrary#3049)

2c1e7fa

Increase DJL version to 0.27.0 (deepjavalibrary#3046)

a22360a

* Increase DJL version to 0.27.0
* Update README

Increase build version to 0.28.0 (deepjavalibrary#3050)

22b46a9

howlger requested review from zachgk, frankfliu and a team as code owners April 1, 2024 11:14

frankfliu approved these changes Apr 1, 2024

frankfliu requested changes Apr 1, 2024

frankfliu force-pushed the master branch from ec89a66 to c68f8a7 Compare April 26, 2024 19:14
