How to prepare training data for ControlNeXt-SVD-v2? #29

Open
JWargrave opened this issue Aug 22, 2024 · 1 comment

@JWargrave

Hi, thank you for your great work!

I want to fine-tune ControlNeXt-SVD-v2 on my own dataset, and I have some questions about data preprocessing.


First, guide_path in meta_info.json. Judging from preprocess.py, I think the guide_path for a given train_video.mp4 is the pose_video.mp4 produced by the code below.

import cv2
import decord
import numpy as np
from tqdm import tqdm

from dwpose.dwpose_detector import dwpose_detector as dwprocessor
from dwpose.util import draw_pose


def write_mp4(list_of_rgb_np_img, fps, output_filename):
    # Encode a list of RGB frames (H, W, 3) as an mp4; OpenCV expects BGR input.
    height, width, _ = list_of_rgb_np_img[0].shape
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
    video_writer = cv2.VideoWriter(output_filename, fourcc, fps, (width, height))
    for frame in list_of_rgb_np_img:
        video_writer.write(cv2.cvtColor(frame, cv2.COLOR_RGB2BGR))
    video_writer.release()


video_path = 'train_video.mp4'
vr = decord.VideoReader(video_path, ctx=decord.cpu(0))

# Decode every frame of the training video: shape (num_frames, H, W, 3).
frames = vr.get_batch(list(range(len(vr)))).asnumpy()
height, width = frames.shape[1], frames.shape[2]

# The transpose assumes draw_pose returns a channel-first (3, H, W) image.
detected_poses = [
    np.array(draw_pose(dwprocessor(frm), height, width)).transpose((1, 2, 0))
    for frm in tqdm(frames, desc="DWPose")
]
dwprocessor.release_memory()
write_mp4(detected_poses, vr.get_avg_fps(), './pose_video.mp4')

For example:

train_video.mp4
pose_video.mp4

Is it right?
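
A minimal sketch of a sanity check for this step, assuming the guide video is expected to match the training video frame for frame (same frame count and resolution). The helper name `check_pose_alignment` is hypothetical and is not part of the ControlNeXt repository; it works on the in-memory arrays before anything is written to disk.

```python
import numpy as np


def check_pose_alignment(frames, poses):
    """Return a list of mismatches between source frames and rendered poses.

    Both inputs are expected as (num_frames, H, W, 3) arrays or lists of
    (H, W, 3) arrays. An empty list means the two sequences line up.
    """
    frames = np.asarray(frames)
    poses = np.asarray(poses)
    problems = []
    if frames.ndim != 4 or poses.ndim != 4:
        problems.append("expected 4-D arrays shaped (num_frames, H, W, 3)")
        return problems
    if frames.shape[0] != poses.shape[0]:
        problems.append(
            f"frame count differs: {frames.shape[0]} vs {poses.shape[0]}"
        )
    if frames.shape[1:3] != poses.shape[1:3]:
        problems.append(
            f"resolution differs: {frames.shape[1:3]} vs {poses.shape[1:3]}"
        )
    if poses.shape[-1] != 3:
        problems.append(f"pose frames have {poses.shape[-1]} channels, not 3")
    return problems
```

With the variables from the script above, `check_pose_alignment(frames, detected_poses)` should return an empty list before `write_mp4` is called.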


Second, the meta_info field in meta_info.json (i.e., meta_info_example/meta_info/1.json) contains boxes, hands_boxes and hands_score for every frame. Could you tell me how these three values are calculated?
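
For illustration only, here is one possible way such boxes could be derived from pose keypoints. This is a guess, not the maintainers' method: it assumes keypoints are given as normalized (x, y) coordinates in [0, 1] with one confidence score per point, and that a box is the pixel-space extent of the confident points. The helper name `keypoints_to_box`, the `[x0, y0, x1, y1]` layout and the `score_thr` default are all hypothetical and may not match the format used in 1.json.

```python
import numpy as np


def keypoints_to_box(points, scores, width, height, score_thr=0.3):
    """Bounding box [x0, y0, x1, y1] in pixels around confident keypoints.

    points: (N, 2) normalized (x, y) coordinates in [0, 1] (assumed format).
    scores: (N,) per-keypoint confidence values.
    Returns None when no keypoint reaches score_thr.
    """
    points = np.asarray(points, dtype=np.float32).reshape(-1, 2)
    scores = np.asarray(scores, dtype=np.float32).reshape(-1)
    keep = scores >= score_thr
    if not keep.any():
        return None
    # Scale normalized coordinates to pixels before taking the extent.
    xy = points[keep] * np.array([width, height], dtype=np.float32)
    x0, y0 = xy.min(axis=0)
    x1, y1 = xy.max(axis=0)
    return [float(x0), float(y0), float(x1), float(y1)]
```

Under the same assumption, a per-hand score could be something like the mean of that hand's keypoint confidences, but the actual definitions of boxes, hands_boxes and hands_score still need to be confirmed by the maintainers.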


Thanks a lot.

@JWargrave

I would also like to know what ref_w was passed to draw_pose when you trained ControlNeXt-SVD-v2. Was it the default, 2160?
