
Not able to generate a depthmap that is longer than 3 to 5 minutes [FEATURE REQUEST maybe??] #411

Open
eyeEmotion opened this issue Feb 23, 2024 · 3 comments
Labels: orange


eyeEmotion commented Feb 23, 2024

After testing which depthmap model suits my needs (I want to generate depthmaps to convert old feature films to 3D), I discovered that I can't process videos longer than around 3 to 5 minutes, even when the file size is moderate.

Even with my 32 GB of RAM I still get out-of-memory errors, so I'm assuming it first wants to extract every frame before generating the depthmap frames. But that makes it impossible to ever generate a depthmap video for longer videos.
Isn't it better to have it:

  • extract a certain number of frames, process them and generate the video for that batch
  • extract further frames, starting from the point where it left off, and process the next batch
  • generate the video for that batch and append it to the depthmap video created up to that point

and continue on until the entire video has been processed? (Roughly like the sketch below.)
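
Roughly what I mean, as a sketch. I'm not saying this is how the extension should do it internally; it assumes ffmpeg is available, and depth_of() is just a made-up stand-in for whatever per-frame depthmap call the extension already has:

```python
import glob
import os
import subprocess

def depth_of(png_path, out_path):
    # Made-up stand-in for the existing single-image depthmap call.
    raise NotImplementedError

def video_to_depth(video, out="depth.mp4", chunk_seconds=60, fps=24, workdir="chunks"):
    os.makedirs(workdir, exist_ok=True)
    # 1) Split the source into fixed-length chunks (re-encoded, so the cuts
    #    are not limited to keyframes).
    subprocess.run(["ffmpeg", "-i", video, "-f", "segment",
                    "-segment_time", str(chunk_seconds), "-reset_timestamps", "1",
                    os.path.join(workdir, "part_%04d.mp4")], check=True)
    depth_parts = []
    for part in sorted(glob.glob(os.path.join(workdir, "part_*.mp4"))):
        frames_dir = part + "_frames"
        os.makedirs(frames_dir, exist_ok=True)
        # 2) Extract only this chunk's frames and run the depth model on them.
        subprocess.run(["ffmpeg", "-i", part,
                        os.path.join(frames_dir, "%06d.png")], check=True)
        for png in sorted(glob.glob(os.path.join(frames_dir, "??????.png"))):
            depth_of(png, png.replace(".png", "_depth.png"))
        # 3) Encode this chunk's depth frames into a partial depth video.
        depth_part = part.replace(".mp4", "_depth.mp4")
        subprocess.run(["ffmpeg", "-framerate", str(fps),
                        "-i", os.path.join(frames_dir, "%06d_depth.png"),
                        "-c:v", "libx264", "-pix_fmt", "yuv420p", depth_part], check=True)
        depth_parts.append(depth_part)
    # 4) Append the finished chunks into one depth video (concat demuxer).
    concat_list = os.path.join(workdir, "parts.txt")
    with open(concat_list, "w") as f:
        f.writelines(f"file '{os.path.abspath(p)}'\n" for p in depth_parts)
    subprocess.run(["ffmpeg", "-f", "concat", "-safe", "0", "-i", concat_list,
                    "-c", "copy", out], check=True)
```

That way only one chunk's worth of frames ever sits on disk at a time, so the length of the film stops mattering.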
Or render/process it the way video editors do; they also have to deal with a lot of frames. I use DaVinci Resolve, and it is able to generate a depthmap, apply it to the video to create stereoscopic 3D (SBS) and render the result.
The reason I don't want to use DaVinci Resolve's depthmap is that it doesn't capture the general outline very well, not like MiDaS at least; it produces some unwanted extrusions and is prone to wobbly effects. It's fast, since it can create a depthmap in an instant, but you're stuck with the level of detail DaVinci Resolve has set, with no way to choose whether to sacrifice some speed for more detail.

I already tried cutting the movie into pieces of 3 to 5 minutes, but it's not easy to cut exactly where the previous piece left off. And with a film lasting 1h30 to 2 hours, that's a lot of cutting and rendering, only to then have to append all the parts of the processed depthmap video and keep them exactly in sync with the movie's frames.

I hope there is just something that I'm missing and this is already possible.

Cheers


semjon00 (Collaborator) commented:

This indeed would be a great addition to the program! Sadly I am busy with other things and can't promise to add it anytime soon.

eyeEmotion (Author) commented:

> This indeed would be a great addition to the program! Sadly I am busy with other things and can't promise to add it anytime soon.

I understand. I'm just putting it out there.

In the meantime, I tried it again with the 5-minute video. This time I copied the errors I got; I don't know whether they will be helpful to anybody.

During 'computing output', the Virtual Memory goes up to around 90 GB. Then it starts generating the depthmaps. During that process, I can see the Virtual Memory go up to 126 GB (I still have plenty of space left on my SSD). But then I get the errors below and everything falls apart.


To create a public link, set share=True in launch().
Startup time: 54.5s (prepare environment: 16.8s, import torch: 9.6s, import gradio: 4.6s, setup paths: 7.9s, initialize shared: 1.3s, other imports: 4.4s, setup codeformer: 1.2s, setup gfpgan: 0.4s, list SD models: 0.1s, load scripts: 7.6s, create ui: 0.3s, gradio launch: 0.7s).
Creating model from config: D:\Documenten\stable-diffusion-webui\configs\v1-inference.yaml
Applying attention optimization: Doggettx... done.
Model loaded in 56.4s (load weights from disk: 39.0s, create model: 0.7s, apply weights to model: 1.3s, apply half(): 8.5s, load textual inversion embeddings: 0.1s, calculate empty prompt: 6.7s).
Generating depthmaps for the video frames
DepthMap v0.4.6 (500ee72)
device: cuda
Loading model(s) ..
Loading model weights from ./models/midas/dpt_beit_large_384.pt
Computing output(s) ..
100%|██████████████████████████████████████████████████████████████████████████████| 7322/7322 [50:16<00:00, 2.43it/s]
Computing output(s) done.
All done.

Processing generated depthmaps
Traceback (most recent call last):
File "D:\Documenten\stable-diffusion-webui\extensions\stable-diffusion-webui-depthmap-script\src\common_ui.py", line 457, in run_generate
ret = video_mode.gen_video(
File "D:\Documenten\stable-diffusion-webui\extensions\stable-diffusion-webui-depthmap-script\src\video_mode.py", line 150, in gen_video
input_depths = process_predicitons(input_depths, smoothening)
File "D:\Documenten\stable-diffusion-webui\extensions\stable-diffusion-webui-depthmap-script\src\video_mode.py", line 126, in process_predicitons
a, b = np.percentile(np.stack(processed), [0.5, 99.5])
File "<array_function internals>", line 180, in percentile
File "D:\Documenten\stable-diffusion-webui\venv\lib\site-packages\numpy\lib\function_base.py", line 4166, in percentile
return _quantile_unchecked(
File "D:\Documenten\stable-diffusion-webui\venv\lib\site-packages\numpy\lib\function_base.py", line 4424, in _quantile_unchecked
r, k = _ureduce(a,
File "D:\Documenten\stable-diffusion-webui\venv\lib\site-packages\numpy\lib\function_base.py", line 3725, in _ureduce
r = func(a, **kwargs)
File "D:\Documenten\stable-diffusion-webui\venv\lib\site-packages\numpy\lib\function_base.py", line 4590, in _quantile_ureduce_func
arr = a.flatten()
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 20.9 GiB for an array with shape (11246592000,) and data type float16
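
For what it's worth, the traceback shows the allocation failing at np.percentile(np.stack(processed), ...): all 7322 depth frames get stacked into a single float16 array first (11,246,592,000 values × 2 bytes ≈ 20.9 GiB). A per-frame pass like the sketch below would keep memory bounded; this is only an illustration of the idea, not the extension's actual code, and the names are made up:

```python
import numpy as np

def clipping_range(depth_frames, lo=0.5, hi=99.5):
    # Hypothetical alternative to the percentile step in video_mode.py:
    # instead of np.percentile(np.stack(processed), [lo, hi]) over every
    # frame at once, take per-frame percentiles and keep the widest range.
    # Only one frame is held in memory at a time, and the returned range
    # always contains the true global (lo, hi) percentile range.
    low, high = np.inf, -np.inf
    for frame in depth_frames:
        f = np.asarray(frame, dtype=np.float32)  # float32 copy of one frame only
        low = min(low, float(np.percentile(f, lo)))
        high = max(high, float(np.percentile(f, hi)))
    return low, high
```

The range it returns is slightly wider than the exact global percentiles (it takes the min/max of the per-frame values), which should still work for clipping.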


petermg commented Mar 14, 2024

> This indeed would be a great addition to the program! Sadly I am busy with other things and can't promise to add it anytime soon.

Seriously! I'm trying to do this as well. If you implemented the suggestions made by the OP, that would be insane: we could convert an entire feature-length film to 3D with minimal interaction! Right now I am outputting my video to PNG files, and even then I get an OOM error after about 3300 frames. I find that bizarre, since I expected it to just process each frame individually; I don't know what it's doing that causes the OOM, but it seems unnecessary. I figured batch-processing image files would avoid it.

semjon00 added the orange label on Jun 4, 2024