I have read and followed the docs and still think this is a bug
Description
I have created subclips of a video in .mp4 using ffmpeg (through moviepy):
# moviepy.video.io.ffmpeg_tools.ffmpeg_extract_subclip
def ffmpeg_extract_subclip(filename, t1, t2, targetname=None):
    """ Makes a new video file playing video file ``filename`` between
    the times ``t1`` and ``t2``. """
    name, ext = os.path.splitext(filename)
    if not targetname:
        T1, T2 = [int(1000*t) for t in [t1, t2]]
        targetname = "%sSUB%d_%d.%s" % (name, T1, T2, ext)

    cmd = [get_setting("FFMPEG_BINARY"), "-y",
           "-ss", "%0.2f" % t1,
           "-i", filename,
           "-t", "%0.2f" % (t2-t1),
           "-map", "0", "-vcodec", "copy", "-acodec", "copy", targetname]

    subprocess_call(cmd)
Output:
The subclip path is passed to VideoUrl:
subclip=VideoUrl("<subclip_path>")
Trying to load the tensors fails:
tensors=subclip.load()
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[21], line 1
----> 1 tensors = subclip.load()
File ~/Projects/chrisammon3000/experiments/docarray/docarray-test/.venv/lib/python3.11/site-packages/docarray/typing/url/video_url.py:96, in VideoUrl.load(self, **kwargs)
33 """
34 Load the data from the url into a `NamedTuple` of
35 [`VideoNdArray`][docarray.typing.VideoNdArray],
(...)
93 [`NdArray`][docarray.typing.NdArray] of the key frame indices.
94 """
95 buffer = self.load_bytes(**kwargs)
---> 96 return buffer.load()
File ~/Projects/chrisammon3000/experiments/docarray/docarray-test/.venv/lib/python3.11/site-packages/docarray/typing/bytes/video_bytes.py:86, in VideoBytes.load(self, **kwargs)
84 audio = parse_obj_as(AudioNdArray, np.array(audio_frames))
85 else:
---> 86 audio = parse_obj_as(AudioNdArray, np.stack(audio_frames))
88 video = parse_obj_as(VideoNdArray, np.stack(video_frames))
89 indices = parse_obj_as(NdArray, keyframe_indices)
File ~/Projects/chrisammon3000/experiments/docarray/docarray-test/.venv/lib/python3.11/site-packages/numpy/core/shape_base.py:449, in stack(arrays, axis, out, dtype, casting)
447 shapes = {arr.shape for arr in arrays}
448 if len(shapes) != 1:
--> 449 raise ValueError('all input arrays must have the same shape')
451 result_ndim = arrays[0].ndim + 1
452 axis = normalize_axis_index(axis, result_ndim)
ValueError: all input arrays must have the same shape
Stepping through the code shows that the first audio frame has only 16 samples, while the second and all subsequent frames have 1024 samples. The audio arrays therefore have different shapes, and np.stack raises the ValueError shown above.
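The failure can be reproduced without any video file. The sketch below uses made-up arrays of shape (1, 16) and (1, 1024) to stand in for the decoded audio frames; the channel dimension of 1 is an assumption, not something taken from the actual subclips.

```python
from collections import Counter

import numpy as np

# Stand-ins for decoded audio frames: one short first frame, then full frames.
# The (1, N) layout is assumed for illustration only.
frames = [np.zeros((1, 16))] + [np.zeros((1, 1024)) for _ in range(3)]

# Count how many frames have each shape to spot the odd one out.
shape_counts = Counter(f.shape for f in frames)
print(shape_counts)

# np.stack requires identical shapes, so this raises ValueError.
try:
    np.stack(frames)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Counting shapes this way is a quick check for whether a given subclip will hit the error before it is passed to DocArray.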
What I've tried:
I have tried adjusting the ffmpeg options, such as converting the audio to AAC and specifying the audio channels. This does fix the problem, but it takes about 10 times longer to create the subclips.
Using a preprocessing step to pad the arrays before reading them into DocArray would require reading and writing each subclip again.
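One variation that may be worth timing is to keep copying the video stream and re-encode only the audio. The sketch below just builds such a command; `build_subclip_cmd` is a hypothetical helper, not part of moviepy, and whether this combination avoids the short first audio frame or is meaningfully faster has not been tested here.

```python
def build_subclip_cmd(source, start, end, target, ffmpeg_binary="ffmpeg"):
    """Hypothetical helper: copy the video stream, re-encode only the audio."""
    return [
        ffmpeg_binary, "-y",
        "-ss", "%0.2f" % start,         # seek to the start time before the input
        "-i", str(source),
        "-t", "%0.2f" % (end - start),  # duration of the subclip
        "-map", "0",
        "-c:v", "copy",                 # video is copied, not re-encoded
        "-c:a", "aac",                  # audio is re-encoded to AAC
        "-ac", "2",                     # force two audio channels (assumption)
        str(target),
    ]


cmd = build_subclip_cmd("video.mp4", 0, 60, "video__0_60.mp4")
print(" ".join(cmd))
```

The resulting list could be passed to `subprocess.run` in place of the moviepy call, assuming an `ffmpeg` binary is on the PATH.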
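If there is a way to handle the shape mismatch inside DocArray, that would be great, because it would let me create the subclips and model them as quickly as possible. The handling would need to be added to the block in docarray/typing/bytes/video_bytes.py that stacks the audio frames (lines 83 to 86, visible in the traceback above).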
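As a rough illustration of what such handling could look like, the sketch below zero-pads every frame along its last axis to the longest frame before stacking. This is not the DocArray implementation and not necessarily what the eventual fix does; `pad_and_stack` is a hypothetical name, and the sketch assumes the sample axis is the last axis of each frame array.

```python
import numpy as np


def pad_and_stack(frames):
    """Zero-pad arrays along the last axis to a common length, then stack.

    Hypothetical helper; assumes a non-empty list whose arrays differ
    only in the size of their last (sample) axis.
    """
    target = max(f.shape[-1] for f in frames)
    padded = []
    for f in frames:
        # Pad only the end of the last axis; leave all other axes untouched.
        pad_width = [(0, 0)] * (f.ndim - 1) + [(0, target - f.shape[-1])]
        padded.append(np.pad(f, pad_width, mode="constant"))
    return np.stack(padded)


# Stand-in frames: a 16-sample first frame followed by 1024-sample frames.
frames = [np.ones((1, 16), dtype=np.float32)]
frames += [np.ones((1, 1024), dtype=np.float32) for _ in range(3)]

audio = pad_and_stack(frames)
print(audio.shape)  # (4, 1, 1024)
```

Padding with zeros appends silence to the short frame, which changes the audio slightly; dropping the short frame would be another option with a different trade-off.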
Example Code
import os
from pathlib import Path

import numpy as np
from docarray.typing import VideoUrl
from moviepy.video.io.ffmpeg_tools import ffmpeg_extract_subclip


def generate_subclips(parent_path, video_id, video_uri, video_duration, duration=60):
    subclips_path = Path(parent_path) / "subclips"
    subclips_path.mkdir(exist_ok=True)

    start_times = np.arange(0, video_duration, duration)
    end_times = np.append(start_times[1:], video_duration)
    clip_times = list(zip(start_times, end_times))

    for start_time, end_time in clip_times:
        # filename should have start_end seconds as part of the name
        output_file_path = subclips_path / f"{video_id}__{start_time}_{end_time}.{video_uri.suffix[1:]}"
        ffmpeg_extract_subclip(video_uri, start_time, end_time, targetname=output_file_path)

# Example usage
# parent_path = 'path/to/parent/directory'
# video_id = 'example_video_id'
# video_uri = Path('path/to/video.mp4')
# video_duration = 1200  # for example, 20 minutes
# generate_subclips(parent_path, video_id, video_uri, video_duration, duration=60)


def sort_key(path):
    """Sorts by the start time in the subclip file name
    For example: Fu7YkoRWKB8_Y__0_60.mp4 will sort by `0`
    """
    # Extract the integer after "__" from the filename
    return int(path.stem.split('__')[1].split('_')[0])


# generate_subclips creates the "subclips" directory under parent_dir
parent_dir = Path(os.getcwd()).parent
subclips_dir = parent_dir / "subclips"

# create subclips
generate_subclips(parent_dir, <video_id>, <video_uri>, <video_duration>, duration=60)

subclips_paths = sorted(subclips_dir.iterdir(), key=sort_key)
video_urls = [VideoUrl(str(subclip)) for subclip in subclips_paths]

# load tensors
# the first subclip might work...
subclip0 = VideoUrl(str(subclips_paths[0]))
subclip0_tensors = subclip0.load()

# but the second and other subclips throw the shape mismatch error
subclip1 = VideoUrl(str(subclips_paths[1]))
subclip1_tensors = subclip1.load()
chrisammon3000 changed the title to "Loading audio tensors fails: ValueError: all input arrays must have the same shape" (Mar 6, 2024)
@JoanFM Created a pull request for this (#1880). The fix applies to audio from video only, since sometimes tools like FFMPEG downsample blank frames when they create subclips, which is what I was doing when I ran into this error. Please let me know if I should resolve the failed check for signed commits in the PR.
Referenced block: docarray/docarray/typing/bytes/video_bytes.py, lines 83 to 86 in f71a5e6