The GPU was used in spite of the `-ng` parameter, and there was zero ANE usage.

I have followed the steps in the README for Core ML, and the log looks like this:
```shell
make clean
WHISPER_COREML=1 make -j
```
```
I whisper.cpp build info:
I UNAME_S:  Darwin
I UNAME_P:  arm
I UNAME_M:  arm64
I CFLAGS:   -I. -O3 -DNDEBUG -std=c11 -fPIC -D_XOPEN_SOURCE=600 -D_DARWIN_C_SOURCE -pthread -DGGML_USE_ACCELERATE -DACCELERATE_NEW_LAPACK -DACCELERATE_LAPACK_ILP64 -DGGML_USE_METAL
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -D_XOPEN_SOURCE=600 -D_DARWIN_C_SOURCE -pthread -DWHISPER_USE_COREML -DGGML_USE_METAL
I LDFLAGS:  -framework Accelerate -framework Foundation -framework CoreML -framework Foundation -framework Metal -framework MetalKit
I CC:       Apple clang version 15.0.0 (clang-1500.3.9.4)
I CXX:      Apple clang version 15.0.0 (clang-1500.3.9.4)

cc  -I. -O3 -DNDEBUG -std=c11 -fPIC -D_XOPEN_SOURCE=600 -D_DARWIN_C_SOURCE -pthread -DGGML_USE_ACCELERATE -DACCELERATE_NEW_LAPACK -DACCELERATE_LAPACK_ILP64 -DGGML_USE_METAL -c ggml.c -o ggml.o
cc  -I. -O3 -DNDEBUG -std=c11 -fPIC -D_XOPEN_SOURCE=600 -D_DARWIN_C_SOURCE -pthread -DGGML_USE_ACCELERATE -DACCELERATE_NEW_LAPACK -DACCELERATE_LAPACK_ILP64 -DGGML_USE_METAL -c ggml-alloc.c -o ggml-alloc.o
cc  -I. -O3 -DNDEBUG -std=c11 -fPIC -D_XOPEN_SOURCE=600 -D_DARWIN_C_SOURCE -pthread -DGGML_USE_ACCELERATE -DACCELERATE_NEW_LAPACK -DACCELERATE_LAPACK_ILP64 -DGGML_USE_METAL -c ggml-backend.c -o ggml-backend.o
cc  -I. -O3 -DNDEBUG -std=c11 -fPIC -D_XOPEN_SOURCE=600 -D_DARWIN_C_SOURCE -pthread -DGGML_USE_ACCELERATE -DACCELERATE_NEW_LAPACK -DACCELERATE_LAPACK_ILP64 -DGGML_USE_METAL -c ggml-quants.c -o ggml-quants.o
c++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -D_XOPEN_SOURCE=600 -D_DARWIN_C_SOURCE -pthread -DWHISPER_USE_COREML -DGGML_USE_METAL -c whisper.cpp -o whisper.o
c++ -O3 -I . -fobjc-arc -c coreml/whisper-encoder.mm -o whisper-encoder.o
c++ -O3 -I . -fobjc-arc -c coreml/whisper-encoder-impl.m -o whisper-encoder-impl.o
cc  -I. -O3 -DNDEBUG -std=c11 -fPIC -D_XOPEN_SOURCE=600 -D_DARWIN_C_SOURCE -pthread -DGGML_USE_ACCELERATE -DACCELERATE_NEW_LAPACK -DACCELERATE_LAPACK_ILP64 -DGGML_USE_METAL -c ggml-metal.m -o ggml-metal.o
c++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -D_XOPEN_SOURCE=600 -D_DARWIN_C_SOURCE -pthread -DWHISPER_USE_COREML -DGGML_USE_METAL examples/main/main.cpp examples/common.cpp examples/common-ggml.cpp examples/grammar-parser.cpp ggml.o ggml-alloc.o ggml-backend.o ggml-quants.o whisper.o whisper-encoder.o whisper-encoder-impl.o ggml-metal.o -o main -framework Accelerate -framework Foundation -framework CoreML -framework Foundation -framework Metal -framework MetalKit
c++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -D_XOPEN_SOURCE=600 -D_DARWIN_C_SOURCE -pthread -DWHISPER_USE_COREML -DGGML_USE_METAL examples/bench/bench.cpp ggml.o ggml-alloc.o ggml-backend.o ggml-quants.o whisper.o whisper-encoder.o whisper-encoder-impl.o ggml-metal.o -o bench -framework Accelerate -framework Foundation -framework CoreML -framework Foundation -framework Metal -framework MetalKit
c++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -D_XOPEN_SOURCE=600 -D_DARWIN_C_SOURCE -pthread -DWHISPER_USE_COREML -DGGML_USE_METAL examples/quantize/quantize.cpp examples/common.cpp examples/common-ggml.cpp examples/grammar-parser.cpp ggml.o ggml-alloc.o ggml-backend.o ggml-quants.o whisper.o whisper-encoder.o whisper-encoder-impl.o ggml-metal.o -o quantize -framework Accelerate -framework Foundation -framework CoreML -framework Foundation -framework Metal -framework MetalKit
c++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -D_XOPEN_SOURCE=600 -D_DARWIN_C_SOURCE -pthread -DWHISPER_USE_COREML -DGGML_USE_METAL examples/server/server.cpp examples/common.cpp examples/common-ggml.cpp examples/grammar-parser.cpp ggml.o ggml-alloc.o ggml-backend.o ggml-quants.o whisper.o whisper-encoder.o whisper-encoder-impl.o ggml-metal.o -o server -framework Accelerate -framework Foundation -framework CoreML -framework Foundation -framework Metal -framework MetalKit
./main -h
```
```
./main -ng -l fi -otxt -ovtt -osrt -olrc -m "./models/ggml-large-v3.bin" -pp -f "../Henkilöauton ajokoe.wav" -of "whcpp-ml"

whisper_init_from_file_with_params_no_state: loading model from './models/ggml-large-v3.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51866
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 1280
whisper_model_load: n_audio_head  = 20
whisper_model_load: n_audio_layer = 32
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 1280
whisper_model_load: n_text_head   = 20
whisper_model_load: n_text_layer  = 32
whisper_model_load: n_mels        = 128
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 5 (large v3)
whisper_model_load: adding 1609 extra tokens
whisper_model_load: n_langs       = 100
whisper_model_load: CPU total size = 3094.36 MB
whisper_model_load: model size    = 3094.36 MB
whisper_init_state: kv self size  = 220.20 MB
whisper_init_state: kv cross size = 245.76 MB
whisper_init_state: loading Core ML model from './models/ggml-large-v3-encoder.mlmodelc'
whisper_init_state: first run on a device may take a while ...
whisper_init_state: Core ML model loaded
whisper_init_state: compute buffer (conv)   = 10.92 MB
whisper_init_state: compute buffer (cross)  = 9.38 MB
whisper_init_state: compute buffer (decode) = 209.26 MB

system_info: n_threads = 4 / 8 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | METAL = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | CUDA = 0 | COREML = 1 | OPENVINO = 0

main: processing '../Henkilöauton ajokoe.wav' (20244723 samples, 1265.3 sec), 4 threads, 1 processors, 5 beams + best of 5, lang = fi, task = transcribe, timestamps = 1 ...

[00:00:00.000 --> 00:00:02.000]   Terve. Moikka.
[00:00:02.000 --> 00:00:04.000]   Mä voin mennä sajelemaan. Joo.
[00:00:04.000 --> 00:00:06.000]   Joo, Mika. Petra, moi.
[00:00:06.000 --> 00:00:08.000]   Okei, mä voin ottaa sulta ne paperit sieltä.
[00:00:08.000 --> 00:00:10.000]   Siitä se ja se.
[00:00:10.000 --> 00:00:12.000]   All right, mennään tonne autoon. Joo.
[00:00:12.000 --> 00:00:14.000]   Katsotaan siellä loppuun.
[00:00:14.000 --> 00:00:23.000]   All right, sit mä tarkastan, et mul on oikee puski mukana.
```
Don't add `-ng`; instead, adjust the following parameter in the code to enable ANE + CPU only:

whisper.cpp/coreml/whisper-encoder.mm, lines 25 to 29 in 3928dbd
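The embedded snippet from that file didn't survive here; as a rough sketch (paraphrased, not verbatim — check the file at commit 3928dbd yourself), the relevant lines select the Core ML compute units, and switching the active line to `MLComputeUnitsCPUAndNeuralEngine` keeps the encoder off the GPU:

```objc
// coreml/whisper-encoder.mm (sketch) — MLModelConfiguration.computeUnits
// tells Core ML which hardware it may schedule the model on.
MLModelConfiguration *config = [[MLModelConfiguration alloc] init];

//config.computeUnits = MLComputeUnitsCPUAndGPU;          // default in the file
config.computeUnits = MLComputeUnitsCPUAndNeuralEngine;   // ANE + CPU only
//config.computeUnits = MLComputeUnitsAll;                // let Core ML decide
```

Note that `-ng` only affects the Metal (ggml) side; the Core ML encoder picks its hardware from this configuration, which is why the GPU was still busy. Rebuild after changing it.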
Thanks for the compiling tip. It does not seem that I can monitor Neural Engine usage with powermetrics the way I can for the CPU and GPU.
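For what it's worth, on Apple Silicon the `cpu_power` sampler of `powermetrics` can include an "ANE Power" line in its output (this seems to vary by macOS version and machine, so treat it as a thing to try rather than a guarantee):

```shell
# Sample power once per second and filter for the per-block power lines;
# on machines/OS versions that report it, ANE activity shows up as
# a non-zero "ANE Power" value while the encoder runs.
sudo powermetrics --samplers cpu_power -i 1000 | grep -E "ANE|GPU|Power"
```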