Is Gemma on device really this slow ? #379

MJ1998 · 2024-04-30T14:36:57Z

I used the llm_inference sample with gemma-2b-it-cpu-int4.bin on a Pixel 8 Pro emulator.

Prefill seems to take minutes.

Pixel 8 Pro configuration:
RAM: 22 GB, VM heap: 512 MB

Reference video
https://github.com/googlesamples/mediapipe/assets/22965002/c7730dba-48e8-4eec-ae68-fe847d2778f2


PaulTR · 2024-04-30T14:40:33Z

Oh boy, no, definitely not. It's not really intended to be run on the emulator, so your results are going to vary wildly. Here's a presentation I did last week with a slide showing Gemma running on a device in real time (not sped up or altered, just recorded and turned into a gif): https://docs.google.com/presentation/d/1uetAcmkNWDXHEJaCt6WoBflDM1iMUU1N1ahzQof6PLM/edit#slide=id.g26cd5c56ad9_1_30

MJ1998 · 2024-04-30T14:48:41Z

I saw a post suggesting that an emulator with increased RAM performs similarly.
Here it is: link (search for "Creating an Android Emulator with Increased RAM").

What makes a physical device so much faster? Is it specifically customized for Gemma?

Thanks for the prompt response!

PaulTR · 2024-04-30T14:56:55Z

No idea at that level of detail. My general experience over the last 10+ years of Android development, though, has always been "Eh, emulators are OK, but never as good as a real device."

MJ1998 · 2024-05-02T08:59:12Z

Time to first token is still pretty slow compared to the video you shared: around 15 seconds for both the 4-bit and 8-bit CPU versions of Gemma 2B.
The physical device I am using is a Pixel 7 Pro.
