Support for Video (MP4) with Gemini Pro 1.5 #332

Open
schnee opened this issue May 10, 2024 · 11 comments
Labels
component:other (Questions unrelated to SDK), status:awaiting user response (Awaiting a response from the author), type:bug (Something isn't working)

Comments

@schnee

schnee commented May 10, 2024

Description of the bug:

When processing video/mp4 mime-type assets, an exception is thrown. No exception is thrown when processing other assets, such as image/png.

import google.generativeai as genai
from google.ai.generativelanguage import Part, Blob
from google.ai.generativelanguage import GenerationConfig
import os
from dotenv import load_dotenv
load_dotenv()  # read GEMINI_API_KEY from a local .env file; the import was otherwise unused
api_key = os.getenv("GEMINI_API_KEY")

genai.configure(api_key=api_key)

model = genai.GenerativeModel('models/gemini-1.5-pro-latest')

prompt_part = Part()
prompt_part.text = """describe this video in a few words."""
    
image_fn = "/path/to/short_video.mp4" # 374KB
with open(image_fn, "rb") as asset_data:
    asset = Blob()
    asset.mime_type = "video/mp4"
    asset.data = asset_data.read()

asset_part = Part()
asset_part.inline_data = asset

response = model.generate_content([prompt_part, asset_part],
                                  generation_config=GenerationConfig(temperature=0.0,),
                                 )

Actual vs expected behavior:

actual behavior:

following exception is thrown:

 /Users/schnee/projects/pmg/github/bw-impossible/creative_kitchen/repl.py
Traceback (most recent call last):
  File "/Users/schnee/projects/pmg/github/bw-impossible/creative_kitchen/repl.py", line 25, in <module>
    response = model.generate_content([prompt_part, asset_part],
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/schnee/Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/generativeai/generative_models.py", line 262, in generate_content
    response = self._client.generate_content(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/schnee/Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/ai/generativelanguage_v1beta/services/generative_service/client.py", line 791, in generate_content
    response = rpc(
               ^^^^
  File "/Users/schnee/Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/gapic_v1/method.py", line 131, in __call__
    return wrapped_func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/schnee/Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/retry/retry_unary.py", line 293, in retry_wrapped_func
    return retry_target(
           ^^^^^^^^^^^^^
  File "/Users/schnee/Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/retry/retry_unary.py", line 153, in retry_target
    _retry_error_helper(
  File "/Users/schnee/Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/retry/retry_base.py", line 212, in _retry_error_helper
    raise final_exc from source_exc
  File "/Users/schnee/Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/retry/retry_unary.py", line 144, in retry_target
    result = target()
             ^^^^^^^^
  File "/Users/schnee/Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/timeout.py", line 120, in func_with_timeout
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/schnee/Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/grpc_helpers.py", line 78, in error_remapped_callable
    raise exceptions.from_grpc_error(exc) from exc
google.api_core.exceptions.InternalServerError: 500 An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting

expected behavior: gemini-1.5-pro-latest consumes the prompt and video file.

Any other information you'd like to share?

Note: over in Vertex AI land, the model is gemini-1.5-pro-preview-0409, and it does appear to process videos. I'm not sure whether my issue is a model capability or not; that model does not appear to be available to this SDK.

The video I'm using is quite short and small: < 10 seconds and ~384KB. It is also H.264 encoded, which appears to be accepted by Gemini.

schnee added the component:python sdk (Issue/PR related to Python SDK) and type:bug (Something isn't working) labels May 10, 2024
@singhniraj08

@schnee, thank you for reporting this issue. The "500 An internal error has occurred" error looks intermittent, and the request should work now.
This repository is for bugs and improvements related to the Gemini Python SDK. For feature requests related to the Gemini API, we suggest using the "Send Feedback" option in the Gemini docs (see the screenshot below).

image

singhniraj08 added the status:awaiting user response (Awaiting a response from the author) and component:other (Questions unrelated to SDK) labels and removed the component:python sdk (Issue/PR related to Python SDK) label May 14, 2024
@schnee
Author

schnee commented May 14, 2024

@singhniraj08 Thank you. For an intermittent error, it happens to me every time, including 2 minutes ago. Can you share the MP4 you used to validate that it works for you?

I have ported over to the VertexAI Python SDK and am able to use it to process the same test case video.

@MarkDaoust
Collaborator

We only announced video support in the API and SDK today, so it probably wasn't active when you tried it. Try again?

@schnee
Author

schnee commented May 14, 2024

Thank you @MarkDaoust

I updated to google-ai-generativelanguage 0.6.3 and google-generativeai 0.5.3, and I am getting a different error now.

But you answered my base question: did the SDK support video, and the answer was no. This issue can be closed. I'll research the stack trace below and possibly open another ticket.

Traceback (most recent call last):
  File "/Users//projects//github/bw-impossible/creative_kitchen/mre.py", line 25, in <module>
    response = model.generate_content([prompt_part, asset_part],
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users//Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/generativeai/generative_models.py", line 262, in generate_content
    response = self._client.generate_content(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users//Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/ai/generativelanguage_v1beta/services/generative_service/client.py", line 812, in generate_content
    response = rpc(
               ^^^^
  File "/Users//Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/gapic_v1/method.py", line 131, in __call__
    return wrapped_func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users//Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/retry/retry_unary.py", line 293, in retry_wrapped_func
    return retry_target(
           ^^^^^^^^^^^^^
  File "/Users//Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/retry/retry_unary.py", line 153, in retry_target
    _retry_error_helper(
  File "/Users//Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/retry/retry_base.py", line 212, in _retry_error_helper
    raise final_exc from source_exc
  File "/Users//Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/retry/retry_unary.py", line 144, in retry_target
    result = target()
             ^^^^^^^^
  File "/Users//Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/timeout.py", line 120, in func_with_timeout
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users//Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/grpc_helpers.py", line 78, in error_remapped_callable
    raise exceptions.from_grpc_error(exc) from exc
google.api_core.exceptions.InvalidArgument: 400 * GenerateContentRequest.generation_config.response_schema.type: field predicate failed: $ != TYPE_UNSPECIFIED
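
When comparing behavior before and after an upgrade like the one described above, it can help to confirm which versions are actually installed in the active virtualenv. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical, and the two package names are the ones mentioned in this comment.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


# Package names taken from the comment above.
print(installed_versions(["google-generativeai", "google-ai-generativelanguage"]))
```

Pasting this output next to a traceback makes it easier for maintainers to tell whether an error predates a given release.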

@MarkDaoust
Collaborator

MarkDaoust commented May 14, 2024

Video works. Try the cookbook video tutorial: https://colab.research.google.com/github/google-gemini/cookbook/blob/main/quickstarts/Video.ipynb

> GenerateContentRequest.generation_config.response_schema.type: field predicate failed: $ != TYPE_UNSPECIFIED

That error is about the generation_config.response_schema parameter. What did you pass it? I can't see it in that paste.
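
The thread never establishes what set `response_schema`; the original snippet only passed `temperature`. One thing that may be worth ruling out (an assumption, not a confirmed diagnosis) is the proto `GenerationConfig` imported from `google.ai.generativelanguage`, by passing a plain dict instead, which `generate_content` also accepts for `generation_config`. The helper name `compact_generation_config` below is hypothetical.

```python
def compact_generation_config(**fields):
    """Keep only the generation settings that were explicitly given a value."""
    return {key: value for key, value in fields.items() if value is not None}


# Only temperature is set, so no response_schema key is sent at all.
config = compact_generation_config(temperature=0.0, response_schema=None)
print(config)

# Usage with the snippet from the original report (needs the SDK and an API key,
# so it is left commented out here):
# response = model.generate_content([prompt_part, asset_part], generation_config=config)
```

If the 400 error disappears with the dict form, that would point at how the config object is serialized rather than at the video itself.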

@schnee
Author

schnee commented May 14, 2024

@MarkDaoust - I literally updated the packages and reran the code from the top.

I'm good with someone (me?) closing this ticket, and I appreciate the pointers to make the code run.

@Bikatr7

Bikatr7 commented May 15, 2024

> @MarkDaoust - I literally updated the packages and reran the code from the top.
>
> I'm good with someone (me?) closing this ticket, and I appreciate the pointers to make the code run.

You can close it yourself if you resolved the issue.

@joja16

joja16 commented May 17, 2024

I encountered the error:
ValueError('The response.parts quick accessor only works for a single candidate, but none were returned. Check the response.prompt_feedback to see if the prompt was blocked.')

Has anyone encountered the same?
The video I uploaded was over 5 minutes long, here is my code:

safety_settings=[
    {
        "category": "HARM_CATEGORY_HARASSMENT",
        "threshold": "BLOCK_NONE",
    },
    {
        "category": "HARM_CATEGORY_HATE_SPEECH",
        "threshold": "BLOCK_NONE",
    },
    {
        "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
        "threshold": "BLOCK_NONE",
    },
    {
        "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
        "threshold": "BLOCK_NONE",
    },
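]

def process_video(filepath):  # the enclosing function was not shown in the original post; this name is hypothetical
    video_file = genai.upload_file(path=filepath)

    # Re-fetch the file on every pass so the loop sees the updated state.
    while video_file.state.name == "PROCESSING":
        print('Waiting for video to be processed.')
        time.sleep(10)
        video_file = genai.get_file(video_file.name)

    # Only call the model once the file is ready; stop on any other final state.
    if video_file.state.name != "ACTIVE":
        raise ValueError(f"Video processing ended in state {video_file.state.name}")

    response = model.generate_content(['what is this?', video_file], safety_settings=safety_settings,
                                      request_options={"timeout": 600})

    genai.delete_file(video_file.name)
    return response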
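
The wait-for-processing step in the code above can also be written with an upper bound on the wait. This is a sketch under stated assumptions: `wait_until_processed` is a hypothetical helper, the file lookup is passed in as a callable (in real use that would be `genai.get_file`), and the stand-in objects at the bottom only mimic the `name` and `state.name` attributes used in this thread.

```python
import time
from types import SimpleNamespace


def wait_until_processed(file, get_file, poll_seconds=10.0, timeout_seconds=600.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll get_file(file.name) until the file leaves PROCESSING, then return it."""
    deadline = clock() + timeout_seconds
    while file.state.name == "PROCESSING":
        if clock() >= deadline:
            raise TimeoutError(f"{file.name} still PROCESSING after {timeout_seconds}s")
        sleep(poll_seconds)
        file = get_file(file.name)  # refresh, so the loop condition sees the new state
    if file.state.name != "ACTIVE":
        raise RuntimeError(f"{file.name} ended in state {file.state.name}")
    return file


# Stand-in for the File API, for illustration only: one more PROCESSING, then ACTIVE.
_states = iter(["PROCESSING", "ACTIVE"])


def _fake_get_file(name):
    return SimpleNamespace(name=name, state=SimpleNamespace(name=next(_states)))


uploaded = SimpleNamespace(name="files/example", state=SimpleNamespace(name="PROCESSING"))
ready = wait_until_processed(uploaded, _fake_get_file, sleep=lambda seconds: None)
print(ready.state.name)
```

Passing the lookup, sleep, and clock in as arguments keeps the loop testable without network access; with the SDK installed, the call would look like `wait_until_processed(video_file, genai.get_file)`.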

@MarkDaoust
Collaborator

MarkDaoust commented May 17, 2024

Try printing the response object that you got back.
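
Beyond printing the whole object, the ValueError quoted earlier points at `response.prompt_feedback`, so a small check before touching `response.parts` or `response.text` may be useful. This is a minimal sketch: `describe_response` is a hypothetical helper, it relies only on the `candidates`, `prompt_feedback`, `block_reason`, and `finish_reason` attributes, and the object at the bottom is a stand-in shaped like a blocked response rather than real API output.

```python
from types import SimpleNamespace


def describe_response(response) -> str:
    """Summarize whether any candidates came back, without raising."""
    candidates = list(getattr(response, "candidates", None) or [])
    if not candidates:
        # No candidates usually means the prompt itself was blocked.
        feedback = getattr(response, "prompt_feedback", None)
        reason = getattr(feedback, "block_reason", None)
        return f"no candidates returned; prompt_feedback.block_reason={reason}"
    finish_reason = getattr(candidates[0], "finish_reason", None)
    return f"{len(candidates)} candidate(s); first finish_reason={finish_reason}"


# Stand-in object for illustration only.
blocked = SimpleNamespace(candidates=[],
                          prompt_feedback=SimpleNamespace(block_reason="SAFETY"))
print(describe_response(blocked))
```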

MarkDaoust reopened this May 17, 2024
@joja16

joja16 commented May 18, 2024

Thanks for your reply.
This is the response object I got back:
image

And I changed to async code:
image

@joja16

joja16 commented May 18, 2024

I found the reason. My video shows a doll character being created, and it starts with the body parts and clothes being made. So Google apparently flagged the video as sexually explicit.
Is there any way to fix it?

5 participants