Take a walk in the generated world.
- use a depth map to keep the camera angle from changing during i2i
- implement a hybrid t2i and i2i pipeline
- solve degradation of i2i
- 716x716
- upgrade MiDaS
- switch MiDaS to a stronger model
- use DDIM?
- prompt engineering for sure
- avoid overlapping depth models between pipe and dsd
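
One possible shape for the hybrid t2i/i2i item, as a rough sketch only: generate the first frame with t2i, continue with i2i, and fall back to t2i every few frames so i2i errors do not keep accumulating. The function name `plan_walk` and the `refresh_every` parameter are hypothetical and not part of any existing pipeline; the fixed refresh interval is an assumption, not a tested fix.

```python
from typing import List


def plan_walk(num_frames: int, refresh_every: int = 8) -> List[str]:
    """Return the generation mode for each frame of a walk.

    Hypothetical scheduling helper: frame 0 and every `refresh_every`-th
    frame after it use "t2i"; all other frames use "i2i".
    """
    if num_frames < 0:
        raise ValueError("num_frames must be non-negative")
    if refresh_every < 1:
        raise ValueError("refresh_every must be at least 1")
    # Index 0 is always a t2i frame because 0 % n == 0 for any n >= 1.
    return ["t2i" if i % refresh_every == 0 else "i2i" for i in range(num_frames)]


if __name__ == "__main__":
    print(plan_walk(5, refresh_every=3))
```

A smaller `refresh_every` limits drift at the cost of more visible jumps between frames; the right value would need to be found by experiment.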