Create safe slices from cdef_line_buf #981

Closed
wants to merge 53 commits into from
53 commits
0e4fb3c
`struct DisjointMutBounds`: Store `Location::caller`.
kkysen Apr 17, 2024
428787c
`trait DisjointMutIndex`: Improve error messages to match `std`'s and…
kkysen Apr 17, 2024
99fc2e4
`struct DisjointMut`: Store `Location::caller()` and improve out of b…
kkysen Apr 18, 2024
82570be
`struct DisjointMut`: Fix missing initializations of `DisjointMut` fi…
kkysen Apr 17, 2024
00df89d
`struct DisjointMut`: Fix missing initializations of `DisjointMut` fi…
kkysen Apr 18, 2024
8097a91
`fn load_tmvs_c`: Initialize a `refmvs_temporal_block` all at once.
kkysen Apr 15, 2024
c365d13
`fn load_tmvs_c`: Add an extra safe `DisjointMut` arg for `rp_proj`.
kkysen Apr 15, 2024
57bdd9e
`fn Rav1dRefmvsDSPContext::load_tmvs`: Refactor out.
kkysen Apr 15, 2024
9792f54
`fn load_tmvs_c`: Replace an indexing loop with an iterator.
kkysen Apr 15, 2024
5a5dfc5
`fn Rav1dRefmvsDSPContext::save_tmvs`: Make `fn rav1d_refmvs_save_tmv…
kkysen Apr 15, 2024
75d0d06
`fn Rav1dRefmvsDSPContext::load_tmvs`: Inline `fn RefMvsFrame::as_mut…
kkysen Apr 18, 2024
0ad9f33
`fn get_prev_frame_segid`: Fix overslicing.
kkysen Apr 17, 2024
dcdb372
`fn load_tmvs_c`: Simplify `rp_proj_offset`.
kkysen Apr 15, 2024
6b4ef97
`struct Rav1dRefmvsDSPContext`: Make `fn` ptr fields private now that…
kkysen Apr 15, 2024
d3f252e
`struct RefMvsFrame::{rp,rp_ref}`: Remove raw ptr fields and pass thr…
kkysen Apr 18, 2024
3eb384e
`fn save_tmvs_c`: Make `stride` a `usize` (it already is on the other…
kkysen Apr 19, 2024
6420e5a
`struct Rav1dFrameData::{mvs,ref_mvs}`: `Arc`ify with `Option`<Disjoi…
kkysen Apr 18, 2024
baa4212
`struct Rav1dContext::refmvs_pool`: Remove (for now) now-unused `refm…
kkysen Apr 19, 2024
990ff36
`struct Rav1dContext::cdf_pool`: Remove (for now) now-unused `cdf_poo…
kkysen Apr 19, 2024
2030345
`fn get_prev_frame_segid`: Fix overslicing (#979)
kkysen Apr 19, 2024
4501b38
`fn load_tmvs_c`: Add safe `rp_proj` arg (#975)
kkysen Apr 19, 2024
d862eea
`struct Rav1dRefmvsDSPContext`: Add wrapper methods (#976)
kkysen Apr 19, 2024
9ee877e
`struct RefMvsFrame::{rp,rp_ref}`: Remove raw ptr fields and pass thr…
kkysen Apr 19, 2024
d5c1619
`struct Rav1dFrameData::{mvs,ref_mvs}`: `Arc`ify with `Option<Disjoin…
kkysen Apr 19, 2024
ebfe7ba
`struct Rav1dContext::cdf_pool`: Remove now-unused `cdf_pool` (#985)
kkysen Apr 19, 2024
aa35bf9
`fn *cdef_dsp_init*`: Make safe by making `c` arg a ref.
kkysen Apr 19, 2024
97a381d
`mod itx`: Isolate the `#[rustfmt::skip]`s to closures.
kkysen Apr 19, 2024
258e619
`fn *itx_dsp_init*`: Make safe by making `c` arg a ref.
kkysen Apr 19, 2024
5a08062
`fn *loop_filter_dsp_init*`: Make safe by making `c` arg a ref.
kkysen Apr 19, 2024
baaa8a9
`fn *mc_dsp_init*`: Make safe by making `c` arg a ref.
kkysen Apr 19, 2024
34a993e
`fn rav1d_get_cpu_flags`: Remove `#[cfg(feature = "asm")]`; it will j…
kkysen Apr 19, 2024
3c168aa
`fn Rav1dFilmGrainDSPContext::new`: Add `CpuFlags` arg.
kkysen Apr 19, 2024
213fcaf
`fn Rav1dCdefDSPContext::new`: Initialize directly.
kkysen Apr 19, 2024
6b96706
`fn Rav1dIntraPredDSPContext::new`: Initialize directly.
kkysen Apr 19, 2024
4a9441c
`fn Rav1dInvTxfmDSPContext::new`: Initialize directly.
kkysen Apr 19, 2024
723e3ee
`fn Rav1dLoopFilterDSPContext::new`: Initialize directly.
kkysen Apr 19, 2024
11a02db
`fn Rav1dLoopRestorationDSPContext::new`: Initialize directly.
kkysen Apr 19, 2024
ef6ff68
`struct Rav1dMCDSPContext::w_mask`: Use `wrap_fn_ptr!` to provide a d…
kkysen Apr 19, 2024
263bb73
`struct Rav1dMCDSPContext::w_mask`: Use `enum_map!`.
kkysen Apr 19, 2024
bd3ef93
`struct Rav1dMCDSPContext::mc`: Use `wrap_fn_ptr!`.
kkysen Apr 19, 2024
f06c06b
`struct Rav1dMCDSPContext::mc_scaled`: Use `wrap_fn_ptr!`.
kkysen Apr 19, 2024
e263531
`struct Rav1dMCDSPContext::mct`: Use `wrap_fn_ptr!`.
kkysen Apr 19, 2024
b45e601
`struct Rav1dMCDSPContext::mct_scaled`: Use `wrap_fn_ptr!`.
kkysen Apr 19, 2024
56f11c8
`struct Rav1dMCDSPContext::mc{,t}{,_scaled}`: Make `enum_map!`s.
kkysen Apr 19, 2024
96b1e25
`struct Rav1dMCDSPContext::new`: Initialize directly.
kkysen Apr 19, 2024
dd2a181
`fn Rav1dMCDSPContext::init_x86`: Use `bpc_fn!`.
kkysen Apr 19, 2024
7667ae3
`fn Rav1d*DSPContext::default`: Rename from `new_c`.
kkysen Apr 19, 2024
cb534ed
`fn Rav1dDSPContext::new`: Initialize directly.
kkysen Apr 19, 2024
6dc1735
`fn Rav1dDSPContext::get`: Lazily initialize with a `OnceLock`, stori…
kkysen Apr 19, 2024
15e995a
`fn Rav1dDSP*Context::init*`: Mark `#[inline(always)]` like C to redu…
kkysen Apr 19, 2024
3ad56bb
`fn Rav1dDSPContext::get`: Lazily initialize with `OnceLock` and stor…
kkysen Apr 19, 2024
b8b83d5
Create safe slices from `cdef_line_buf`
rinon Apr 18, 2024
217cd21
Fix cdef_line_buf strides
rinon Apr 18, 2024
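Several of the commits above mention lazily initializing the DSP context with a `OnceLock`. A minimal sketch of that pattern follows, assuming a single process-wide table of function pointers. `Flags`, `DspTable`, `add_plain`, and `add_saturating` are hypothetical stand-ins for illustration and are not names from this PR.

```rust
use std::sync::OnceLock;

/// Hypothetical stand-in for detected CPU feature flags.
#[derive(Clone, Copy)]
pub struct Flags(pub u32);

/// Hypothetical stand-in for a table of DSP function pointers.
pub struct DspTable {
    pub add: fn(i32, i32) -> i32,
}

fn add_plain(a: i32, b: i32) -> i32 {
    a + b
}

fn add_saturating(a: i32, b: i32) -> i32 {
    a.saturating_add(b)
}

impl DspTable {
    /// Build the table directly from the flags (bit 0 selects the
    /// "optimized" function in this sketch).
    pub fn new(flags: Flags) -> Self {
        let mut table = Self { add: add_plain };
        if flags.0 & 1 != 0 {
            table.add = add_saturating;
        }
        table
    }

    /// Initialize the shared table on first use and hand out a
    /// `&'static` reference. Later calls ignore `flags` and return
    /// the table built by the first call.
    pub fn get(flags: Flags) -> &'static Self {
        static TABLE: OnceLock<DspTable> = OnceLock::new();
        TABLE.get_or_init(|| Self::new(flags))
    }
}
```

Because `get` returns a `&'static` reference, callers can store it without reference counting or raw pointers. The trade-off in this sketch is that the first caller's flags win for the lifetime of the process.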
264 changes: 140 additions & 124 deletions src/cdef.rs
@@ -5,19 +5,14 @@ use crate::include::common::bitdepth::LeftPixelRow2px;
use crate::include::common::intops::apply_sign;
use crate::include::common::intops::iclip;
use crate::include::common::intops::ulog2;
+use crate::src::cpu::CpuFlags;
use crate::src::tables::dav1d_cdef_directions;
use bitflags::bitflags;
use libc::ptrdiff_t;
use std::cmp;
use std::ffi::c_int;
use std::ffi::c_uint;

-#[cfg(feature = "asm")]
-use cfg_if::cfg_if;
-
-#[cfg(feature = "asm")]
-use crate::src::cpu::{rav1d_get_cpu_flags, CpuFlags};

#[cfg(feature = "asm")]
use crate::include::common::bitdepth::BPC;

@@ -1030,98 +1025,6 @@ unsafe fn cdef_find_dir_rust<BD: BitDepth>(
return best_dir;
}

-#[inline(always)]
-#[cfg(all(feature = "asm", any(target_arch = "x86", target_arch = "x86_64"),))]
-unsafe fn cdef_dsp_init_x86<BD: BitDepth>(c: *mut Rav1dCdefDSPContext) {
-let flags = rav1d_get_cpu_flags();
-
-match BD::BPC {
-BPC::BPC8 => {
-if !flags.contains(CpuFlags::SSE2) {
-return;
-}
-
-(*c).fb[0] = dav1d_cdef_filter_8x8_8bpc_sse2;
-(*c).fb[1] = dav1d_cdef_filter_4x8_8bpc_sse2;
-(*c).fb[2] = dav1d_cdef_filter_4x4_8bpc_sse2;
-
-if !flags.contains(CpuFlags::SSSE3) {
-return;
-}
-
-(*c).dir = dav1d_cdef_dir_8bpc_ssse3;
-(*c).fb[0] = dav1d_cdef_filter_8x8_8bpc_ssse3;
-(*c).fb[1] = dav1d_cdef_filter_4x8_8bpc_ssse3;
-(*c).fb[2] = dav1d_cdef_filter_4x4_8bpc_ssse3;
-
-if !flags.contains(CpuFlags::SSE41) {
-return;
-}
-
-(*c).dir = dav1d_cdef_dir_8bpc_sse4;
-(*c).fb[0] = dav1d_cdef_filter_8x8_8bpc_sse4;
-(*c).fb[1] = dav1d_cdef_filter_4x8_8bpc_sse4;
-(*c).fb[2] = dav1d_cdef_filter_4x4_8bpc_sse4;
-
-#[cfg(target_arch = "x86_64")]
-{
-if !flags.contains(CpuFlags::AVX2) {
-return;
-}
-
-(*c).dir = dav1d_cdef_dir_8bpc_avx2;
-(*c).fb[0] = dav1d_cdef_filter_8x8_8bpc_avx2;
-(*c).fb[1] = dav1d_cdef_filter_4x8_8bpc_avx2;
-(*c).fb[2] = dav1d_cdef_filter_4x4_8bpc_avx2;
-
-if !flags.contains(CpuFlags::AVX512ICL) {
-return;
-}
-
-(*c).fb[0] = dav1d_cdef_filter_8x8_8bpc_avx512icl;
-(*c).fb[1] = dav1d_cdef_filter_4x8_8bpc_avx512icl;
-(*c).fb[2] = dav1d_cdef_filter_4x4_8bpc_avx512icl;
-}
-}
-BPC::BPC16 => {
-if !flags.contains(CpuFlags::SSSE3) {
-return;
-}
-
-(*c).dir = dav1d_cdef_dir_16bpc_ssse3;
-(*c).fb[0] = dav1d_cdef_filter_8x8_16bpc_ssse3;
-(*c).fb[1] = dav1d_cdef_filter_4x8_16bpc_ssse3;
-(*c).fb[2] = dav1d_cdef_filter_4x4_16bpc_ssse3;
-
-if !flags.contains(CpuFlags::SSE41) {
-return;
-}
-
-(*c).dir = dav1d_cdef_dir_16bpc_sse4;
-
-#[cfg(target_arch = "x86_64")]
-{
-if !flags.contains(CpuFlags::AVX2) {
-return;
-}
-
-(*c).dir = dav1d_cdef_dir_16bpc_avx2;
-(*c).fb[0] = dav1d_cdef_filter_8x8_16bpc_avx2;
-(*c).fb[1] = dav1d_cdef_filter_4x8_16bpc_avx2;
-(*c).fb[2] = dav1d_cdef_filter_4x4_16bpc_avx2;
-
-if !flags.contains(CpuFlags::AVX512ICL) {
-return;
-}
-
-(*c).fb[0] = dav1d_cdef_filter_8x8_16bpc_avx512icl;
-(*c).fb[1] = dav1d_cdef_filter_4x8_16bpc_avx512icl;
-(*c).fb[2] = dav1d_cdef_filter_4x4_16bpc_avx512icl;
-}
-}
-};
-}

#[inline(always)]
#[cfg(all(feature = "asm", any(target_arch = "arm", target_arch = "aarch64"),))]
unsafe extern "C" fn cdef_filter_8x8_neon_erased<BD: BitDepth>(
@@ -1274,37 +1177,150 @@ unsafe extern "C" fn cdef_filter_4x4_neon_erased<BD: BitDepth>(
}
}

-#[inline(always)]
-#[cfg(all(feature = "asm", any(target_arch = "arm", target_arch = "aarch64"),))]
-unsafe fn cdef_dsp_init_arm<BD: BitDepth>(c: *mut Rav1dCdefDSPContext) {
-let flags = rav1d_get_cpu_flags();
impl Rav1dCdefDSPContext {
pub const fn default<BD: BitDepth>() -> Self {
Self {
dir: cdef_find_dir_c_erased::<BD>,
fb: [
cdef_filter_block_8x8_c_erased::<BD>,
cdef_filter_block_4x8_c_erased::<BD>,
cdef_filter_block_4x4_c_erased::<BD>,
],
}
}

#[cfg(all(feature = "asm", any(target_arch = "x86", target_arch = "x86_64")))]
#[inline(always)]
const fn init_x86<BD: BitDepth>(mut self, flags: CpuFlags) -> Self {
match BD::BPC {
BPC::BPC8 => {
if !flags.contains(CpuFlags::SSE2) {
return self;
}

self.fb[0] = dav1d_cdef_filter_8x8_8bpc_sse2;
self.fb[1] = dav1d_cdef_filter_4x8_8bpc_sse2;
self.fb[2] = dav1d_cdef_filter_4x4_8bpc_sse2;

if !flags.contains(CpuFlags::SSSE3) {
return self;
}

self.dir = dav1d_cdef_dir_8bpc_ssse3;
self.fb[0] = dav1d_cdef_filter_8x8_8bpc_ssse3;
self.fb[1] = dav1d_cdef_filter_4x8_8bpc_ssse3;
self.fb[2] = dav1d_cdef_filter_4x4_8bpc_ssse3;

if !flags.contains(CpuFlags::SSE41) {
return self;
}

self.dir = dav1d_cdef_dir_8bpc_sse4;
self.fb[0] = dav1d_cdef_filter_8x8_8bpc_sse4;
self.fb[1] = dav1d_cdef_filter_4x8_8bpc_sse4;
self.fb[2] = dav1d_cdef_filter_4x4_8bpc_sse4;

#[cfg(target_arch = "x86_64")]
{
if !flags.contains(CpuFlags::AVX2) {
return self;
}

self.dir = dav1d_cdef_dir_8bpc_avx2;
self.fb[0] = dav1d_cdef_filter_8x8_8bpc_avx2;
self.fb[1] = dav1d_cdef_filter_4x8_8bpc_avx2;
self.fb[2] = dav1d_cdef_filter_4x4_8bpc_avx2;

if !flags.contains(CpuFlags::AVX512ICL) {
return self;
}

self.fb[0] = dav1d_cdef_filter_8x8_8bpc_avx512icl;
self.fb[1] = dav1d_cdef_filter_4x8_8bpc_avx512icl;
self.fb[2] = dav1d_cdef_filter_4x4_8bpc_avx512icl;
}
}
BPC::BPC16 => {
if !flags.contains(CpuFlags::SSSE3) {
return self;
}

self.dir = dav1d_cdef_dir_16bpc_ssse3;
self.fb[0] = dav1d_cdef_filter_8x8_16bpc_ssse3;
self.fb[1] = dav1d_cdef_filter_4x8_16bpc_ssse3;
self.fb[2] = dav1d_cdef_filter_4x4_16bpc_ssse3;

if !flags.contains(CpuFlags::SSE41) {
return self;
}

-if !flags.contains(CpuFlags::NEON) {
-return;
self.dir = dav1d_cdef_dir_16bpc_sse4;

#[cfg(target_arch = "x86_64")]
{
if !flags.contains(CpuFlags::AVX2) {
return self;
}

self.dir = dav1d_cdef_dir_16bpc_avx2;
self.fb[0] = dav1d_cdef_filter_8x8_16bpc_avx2;
self.fb[1] = dav1d_cdef_filter_4x8_16bpc_avx2;
self.fb[2] = dav1d_cdef_filter_4x4_16bpc_avx2;

if !flags.contains(CpuFlags::AVX512ICL) {
return self;
}

self.fb[0] = dav1d_cdef_filter_8x8_16bpc_avx512icl;
self.fb[1] = dav1d_cdef_filter_4x8_16bpc_avx512icl;
self.fb[2] = dav1d_cdef_filter_4x4_16bpc_avx512icl;
}
}
};

self
}

-(*c).dir = match BD::BPC {
-BPC::BPC8 => dav1d_cdef_find_dir_8bpc_neon,
-BPC::BPC16 => dav1d_cdef_find_dir_16bpc_neon,
-};
-(*c).fb[0] = cdef_filter_8x8_neon_erased::<BD>;
-(*c).fb[1] = cdef_filter_4x8_neon_erased::<BD>;
-(*c).fb[2] = cdef_filter_4x4_neon_erased::<BD>;
-}
#[cfg(all(feature = "asm", any(target_arch = "arm", target_arch = "aarch64")))]
#[inline(always)]
const fn init_arm<BD: BitDepth>(mut self, flags: CpuFlags) -> Self {
if !flags.contains(CpuFlags::NEON) {
return self;
}

self.dir = match BD::BPC {
BPC::BPC8 => dav1d_cdef_find_dir_8bpc_neon,
BPC::BPC16 => dav1d_cdef_find_dir_16bpc_neon,
};
self.fb[0] = cdef_filter_8x8_neon_erased::<BD>;
self.fb[1] = cdef_filter_4x8_neon_erased::<BD>;
self.fb[2] = cdef_filter_4x4_neon_erased::<BD>;

-#[cold]
-pub unsafe fn rav1d_cdef_dsp_init<BD: BitDepth>(c: *mut Rav1dCdefDSPContext) {
-(*c).dir = cdef_find_dir_c_erased::<BD>;
-(*c).fb[0] = cdef_filter_block_8x8_c_erased::<BD>;
-(*c).fb[1] = cdef_filter_block_4x8_c_erased::<BD>;
-(*c).fb[2] = cdef_filter_block_4x4_c_erased::<BD>;
self
}

-#[cfg(feature = "asm")]
-cfg_if! {
-if #[cfg(any(target_arch = "x86", target_arch = "x86_64"))] {
-cdef_dsp_init_x86::<BD>(c);
-} else if #[cfg(any(target_arch = "arm", target_arch = "aarch64"))] {
-cdef_dsp_init_arm::<BD>(c);
#[inline(always)]
const fn init<BD: BitDepth>(self, flags: CpuFlags) -> Self {
#[cfg(feature = "asm")]
{
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
{
return self.init_x86::<BD>(flags);
}
#[cfg(any(target_arch = "arm", target_arch = "aarch64"))]
{
return self.init_arm::<BD>(flags);
}
}

#[allow(unreachable_code)] // Reachable on some #[cfg]s.
{
let _ = flags;
self
}
}

pub const fn new<BD: BitDepth>(flags: CpuFlags) -> Self {
Self::default::<BD>().init::<BD>(flags)
}
}
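The new `impl Rav1dCdefDSPContext` builds the table by value: `default` fills in the portable functions, `init` overrides entries tier by tier and returns early at the first missing CPU feature, and `new` chains the two. A stripped-down sketch of that shape, under the assumption that two feature tiers are enough to show the idea. `Flags`, `Table`, `Kernel`, and the `kernel_*` functions are hypothetical and stand in for `CpuFlags`, the context struct, and the C/asm functions.

```rust
/// Hypothetical flag set; the real `CpuFlags` is a `bitflags` type.
#[derive(Clone, Copy)]
pub struct Flags(u32);

impl Flags {
    pub const NONE: Self = Self(0);
    pub const FAST: Self = Self(1 << 0);
    pub const FASTER: Self = Self(1 << 1);

    pub const fn union(self, other: Self) -> Self {
        Self(self.0 | other.0)
    }

    pub const fn contains(self, other: Self) -> bool {
        self.0 & other.0 == other.0
    }
}

pub type Kernel = fn(u8) -> u8;

fn kernel_c(x: u8) -> u8 {
    x
}

fn kernel_fast(x: u8) -> u8 {
    x.saturating_add(1)
}

fn kernel_faster(x: u8) -> u8 {
    x.saturating_add(2)
}

pub struct Table {
    pub dir: Kernel,
    pub fb: [Kernel; 3],
}

impl Table {
    /// Portable fallbacks for every slot.
    pub const fn default() -> Self {
        Self {
            dir: kernel_c,
            fb: [kernel_c, kernel_c, kernel_c],
        }
    }

    /// Override slots tier by tier; stop at the first missing tier.
    const fn init(mut self, flags: Flags) -> Self {
        if !flags.contains(Flags::FAST) {
            return self;
        }
        self.fb[0] = kernel_fast;

        if !flags.contains(Flags::FASTER) {
            return self;
        }
        self.dir = kernel_faster;
        self.fb[0] = kernel_faster;

        self
    }

    pub const fn new(flags: Flags) -> Self {
        Self::default().init(flags)
    }
}
```

Taking `mut self` by value and returning `Self` removes the need for a `*mut` argument and for `unsafe` in the initializer. It also keeps the early-return cascade of the original: a higher tier is only considered once every lower tier is present.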
9 changes: 4 additions & 5 deletions src/cdef_apply.rs
@@ -171,7 +171,6 @@ pub(crate) unsafe fn rav1d_cdef_brow<BD: BitDepth>(
BPC::BPC8 => 0,
BPC::BPC16 => f.cur.p.bpc - 8,
};
-let dsp = &*f.dsp;
let mut edges: CdefEdgeFlags = if by_start > 0 {
CdefEdgeFlags::HAVE_BOTTOM | CdefEdgeFlags::HAVE_TOP
} else {
@@ -304,7 +303,7 @@ pub(crate) unsafe fn rav1d_cdef_brow<BD: BitDepth>(

let mut variance = 0;
let dir = if y_pri_lvl != 0 || uv_pri_lvl != 0 {
-(dsp.cdef.dir)(
+(f.dsp.cdef.dir)(
bptrs[0].cast(),
f.cur.stride[0],
&mut variance,
@@ -370,7 +369,7 @@ pub(crate) unsafe fn rav1d_cdef_brow<BD: BitDepth>(
if y_pri_lvl != 0 {
let adj_y_pri_lvl = adjust_strength(y_pri_lvl, variance);
if adj_y_pri_lvl != 0 || y_sec_lvl != 0 {
-dsp.cdef.fb[0](
+f.dsp.cdef.fb[0](
bptrs[0].cast(),
f.cur.stride[0],
lr_bak[bit as usize][0].as_mut_ptr().cast(),
@@ -385,7 +384,7 @@ pub(crate) unsafe fn rav1d_cdef_brow<BD: BitDepth>(
);
}
} else if y_sec_lvl != 0 {
-dsp.cdef.fb[0](
+f.dsp.cdef.fb[0](
bptrs[0].cast(),
f.cur.stride[0],
(lr_bak[bit as usize][0]).as_mut_ptr().cast(),
@@ -469,7 +468,7 @@ pub(crate) unsafe fn rav1d_cdef_brow<BD: BitDepth>(
bot = bptrs[pl].offset((8 >> ss_ver) * uv_stride);
}

-dsp.cdef.fb[uv_idx as usize](
+f.dsp.cdef.fb[uv_idx as usize](
bptrs[pl].cast(),
f.cur.stride[1],
lr_bak[bit as usize][pl].as_mut_ptr().cast(),
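The call sites in this file use two spellings: `(f.dsp.cdef.dir)(...)` with parentheses and `f.dsp.cdef.fb[0](...)` without. That is ordinary Rust syntax: a function pointer stored in a named field must be parenthesized so it is not parsed as a method call, while an indexed element can be called directly. The sketch below illustrates this under the assumption that `dsp` is held as a shared `&'static` reference. `Frame`, `Dsp`, `Cdef`, and the helper functions are hypothetical simplifications and not the crate's real types.

```rust
pub type Filter = fn(&mut [u8], u8);

pub struct Cdef {
    pub dir: fn(&[u8]) -> usize,
    pub fb: [Filter; 3],
}

pub struct Dsp {
    pub cdef: Cdef,
}

/// Hypothetical frame state holding a shared reference to the DSP table.
pub struct Frame {
    pub dsp: &'static Dsp,
}

/// Return the index of the largest sample (0 for an empty slice).
fn find_dir(px: &[u8]) -> usize {
    px.iter()
        .enumerate()
        .max_by_key(|&(_, &v)| v)
        .map_or(0, |(i, _)| i)
}

/// Add `strength` to every sample, saturating at 255.
fn add_strength(px: &mut [u8], strength: u8) {
    for p in px.iter_mut() {
        *p = p.saturating_add(strength);
    }
}

fn no_op(_px: &mut [u8], _strength: u8) {}

pub static DSP: Dsp = Dsp {
    cdef: Cdef {
        dir: find_dir,
        fb: [add_strength, add_strength, no_op],
    },
};

pub fn filter_block(f: &Frame, px: &mut [u8], strength: u8) -> usize {
    // A function pointer in a field needs parentheses; `f.dsp.cdef.dir(px)`
    // would be parsed as a call to a method named `dir`.
    let dir = (f.dsp.cdef.dir)(px);
    // An indexed element is already an expression, so it is called directly.
    f.dsp.cdef.fb[0](px, strength);
    dir
}
```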
1 change: 0 additions & 1 deletion src/cpu.rs
@@ -180,7 +180,6 @@ static rav1d_cpu_flags: AtomicU32 = AtomicU32::new(0);
/// so it shouldn't be performance sensitive.
static rav1d_cpu_flags_mask: AtomicU32 = AtomicU32::new(!0);

-#[cfg(feature = "asm")]
#[inline(always)]
pub(crate) fn rav1d_get_cpu_flags() -> CpuFlags {
let flags =
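The hunk above is cut off before the body of `rav1d_get_cpu_flags`, so the real implementation is not shown here. As a rough illustration of how a flags value and a mask kept in two `AtomicU32` statics can be combined, here is a sketch with hypothetical names (`CPU_FLAGS`, `CPU_FLAGS_MASK`, `set_detected`, `set_mask`, `get_flags`) and an assumed `Relaxed` ordering. None of this is taken from the crate.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Hypothetical statics mirroring the shape of the two atomics in the diff:
// detected features start empty, the mask starts with every bit set.
static CPU_FLAGS: AtomicU32 = AtomicU32::new(0);
static CPU_FLAGS_MASK: AtomicU32 = AtomicU32::new(!0);

/// Record the detected feature bits (assumed to happen once at startup).
pub fn set_detected(flags: u32) {
    CPU_FLAGS.store(flags, Ordering::Relaxed);
}

/// Restrict which feature bits callers are allowed to see.
pub fn set_mask(mask: u32) {
    CPU_FLAGS_MASK.store(mask, Ordering::Relaxed);
}

/// Effective flags: detected bits that the mask lets through.
pub fn get_flags() -> u32 {
    CPU_FLAGS.load(Ordering::Relaxed) & CPU_FLAGS_MASK.load(Ordering::Relaxed)
}
```

With no `#[cfg(feature = "asm")]` gate, a function of this shape is available in every build, which is what lets the `new(flags)` constructors take a flags argument unconditionally.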