This repository has been archived by the owner on Jan 7, 2023. It is now read-only.

Simd32 integration #104

Open
wants to merge 368 commits into
base: master

Conversation

vbajaj1986

No description provided.

tanty and others added 30 commits September 8, 2018 00:26
Fixes crash with
  piglit/bin/map_buffer_range-invalidate CopyBufferSubData \
                               increment-offset -auto -fbo

* Resize the resource storage already when the count is equal to the
  allocated size, fixes:

  Invalid write of size 8
  at 0xB72E4CF: virgl_drm_add_res (virgl_drm_winsys.c:629)
  by 0xB72E4CF: virgl_drm_emit_res (virgl_drm_winsys.c:663)
  by 0xB72A44A: virgl_encode_resource_copy_region (virgl_encode.c:776)
  by 0xB40CD12: st_copy_buffer_subdata (st_cb_bufferobjects.c:585)
  by 0xB244A3B: _mesa_CopyBufferSubData (bufferobj.c:2940)
  by 0x109A1E: upload (invalidate.c:169)
  by 0x109C2F: piglit_display (invalidate.c:215)
  by 0x4F80FBE: run_test (piglit_fbo_framework.c:52)
  by 0x4F66E5F: piglit_gl_test_run (piglit-framework-gl.c:229)
  by 0x10949D: main (invalidate.c:47)
  Address 0xbe07d30 is 0 bytes after a block of size 4,096 alloc'd
  at 0x4C31B25: calloc (in
       /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
  by 0xB72DAAF: virgl_drm_cmd_buf_create (virgl_drm_winsys.c:567)

* Also resize the space allocated for the handles, fixes:

  Invalid write of size 4
  at 0xB72E4F0: virgl_drm_add_res (virgl_drm_winsys.c:631)
  by 0xB72E4F0: virgl_drm_emit_res (virgl_drm_winsys.c:663)
  by 0xB72A44A: virgl_encode_resource_copy_region (virgl_encode.c:776)
  by 0xB40CD12: st_copy_buffer_subdata (st_cb_bufferobjects.c:585)
  by 0xB244A3B: _mesa_CopyBufferSubData (bufferobj.c:2940)
  by 0x109A1E: upload (invalidate.c:169)
  by 0x109C2F: piglit_display (invalidate.c:215)
  by 0x4F80FBE: run_test (piglit_fbo_framework.c:52)
  by 0x4F66E5F: piglit_gl_test_run (piglit-framework-gl.c:229)
  by 0x10949D: main (invalidate.c:47)
  Address 0xbe08570 is 0 bytes after a block of size 2,048 alloc'd
  at 0x4C2FB0F: malloc (
    in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
  by 0xB72DAC8: virgl_drm_cmd_buf_create (virgl_drm_winsys.c:572)

Fixes: 4b15b5e ("virgl: resize resource bo allocation if we need to.")

v2: - Use REALLOC macro and avoid memory leak when re-allocation fails
    - add Fixes tag (both Emil Velikov)
    - reorder commit message

Signed-off-by: Gert Wollny <[email protected]>
(cherry picked from commit 9b0e8d8)
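
The two Valgrind traces above describe writes one element past the end of two parallel arrays. A minimal sketch of the growth pattern the message describes follows. It assumes a hypothetical `struct res_list` and `res_list_add()` and uses plain `realloc()` rather than Mesa's REALLOC macro, so it is an illustration, not the virgl code. The list grows as soon as the count equals the capacity, both arrays grow together, and a failed reallocation leaves the old blocks valid.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a command buffer's resource bookkeeping. */
struct res_list {
    void    **res;      /* resource pointers */
    uint32_t *handles;  /* one handle per resource */
    size_t    count;    /* entries in use */
    size_t    capacity; /* entries allocated in BOTH arrays */
};

/* Append one entry, growing both arrays when the list is full.
 * Returns 0 on success, -1 if a reallocation fails; in that case the
 * existing contents stay valid and nothing is leaked. */
static int res_list_add(struct res_list *l, void *res, uint32_t handle)
{
    /* Grow when count == capacity; waiting until count > capacity
     * would let the write below land one element past the end. */
    if (l->count >= l->capacity) {
        size_t new_cap = l->capacity ? l->capacity * 2 : 16;

        /* Assign through a temporary so the old pointer survives failure. */
        void **new_res = realloc(l->res, new_cap * sizeof(*new_res));
        if (new_res == NULL)
            return -1;
        l->res = new_res;

        uint32_t *new_handles =
            realloc(l->handles, new_cap * sizeof(*new_handles));
        if (new_handles == NULL)
            return -1;  /* capacity is unchanged, so the list stays consistent */
        l->handles = new_handles;

        l->capacity = new_cap;
    }

    l->res[l->count] = res;
    l->handles[l->count] = handle;
    l->count++;
    return 0;
}
```

Updating `capacity` only after both reallocations succeed keeps the recorded size a lower bound on what is really allocated, which is the safe direction to err in.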
We require a single version of libdrm for all of our libdrm
dependencies (core and driver), but the way this is structured can make
the error message less than helpful, as one driver might be the one
setting the libdrm requirement, while another might be the one that
generates the version failure.

This adds a simple message to the output announcing which libdrm module
set the version, which might be more helpful.

v2: - Use message suggested by Eric Engestrom

Fixes: c445b1d
       ("meson: Use the same version for all libdrm checks")
Reviewed-by: Eric Engestrom <[email protected]>
(cherry picked from commit d25a27e)
The brw_vs_prog_data::double_inputs_read field comes directly from
shader_info::double_inputs, which may contain inputs that are not
actually read.  Instead of using it directly, AND it with inputs_read,
which contains only the inputs that are read.  Otherwise, we may end up
subtracting too many elements when computing elem_count.

Cc: [email protected]
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103241
Reviewed-by: Lionel Landwerlin <[email protected]>
(cherry picked from commit 7b26741)
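
The masking step in the message above can be sketched as a single bitwise AND. The function name `double_inputs_actually_read()` and the 64-bit mask width are assumptions made for illustration; the real fields live in `brw_vs_prog_data` and `shader_info`.

```c
#include <stdint.h>

/* Hypothetical helper: keep only the double-typed inputs that the
 * shader actually reads. Each bit stands for one vertex input slot. */
static uint64_t double_inputs_actually_read(uint64_t double_inputs,
                                            uint64_t inputs_read)
{
    return double_inputs & inputs_read;
}
```

For example, with `double_inputs = 0x0C` (slots 2 and 3 are doubles) and `inputs_read = 0x05` (slots 0 and 2 are read), only slot 2 survives, so an element count derived from the result is not reduced for the unread slot 3.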
Fix another regression caused by
mesa: Make gl_vertex_array contain pointers to first order VAO members.
The regression showed up with drivers using the tnl module and
was reproducible using xonotic-glx -benchmark demos/the-big-keybench.dem.

Fixes: 64d2a20
    mesa: Make gl_vertex_array contain pointers to first order VAO members.
Tested-by: Ville Syrjälä <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Mathias Fröhlich <[email protected]>
(cherry picked from commit a6232b6)
Each invocation of va_copy() must be matched by a
corresponding invocation of va_end()

Reviewed-by: Eric Engestrom <[email protected]>
Fixes: 51691f0 "darwin: Use ASL for logging"
Signed-off-by: Andrii Simiklit <[email protected]>
(cherry picked from commit 267ed29)
The first use of a 'va_list' instance may modify it, so the same
instance must not be used twice.

Reviewed-by: Eric Engestrom <[email protected]>
Fixes: 864148d "util: add util_vasprintf() for Windows (v2)"
Signed-off-by: Andrii Simiklit <[email protected]>
(cherry picked from commit 570cacb)
The function 'util_vasprintf' should return the error code -1
when 'malloc' returns NULL.

Reviewed-by: Eric Engestrom <[email protected]>
Fixes: 864148d "util: add util_vasprintf() for Windows (v2)"
Signed-off-by: Andrii Simiklit <[email protected]>
(cherry picked from commit 65cfe69)
MSDN:
"va_end must be called on each argument list that's initialized
 with va_start or va_copy before the function returns."

Reviewed-by: Eric Engestrom <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107810
Fixes: c6267eb "gallium/util: Stop bundling our snprintf implementation."
Signed-off-by: Andrii Simiklit <[email protected]>
(cherry picked from commit 2930b76)
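
The four va_list fixes above share one pattern, sketched below under some assumptions. `my_vasprintf()` and `my_asprintf()` are hypothetical names, not Mesa's `util_vasprintf()`, and the sketch relies on C99 `vsnprintf()` semantics (a NULL buffer with size 0 returns the required length). It measures the output with a copy of the argument list, ends that copy, returns -1 if `malloc()` fails, and only then formats with the original list.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: format into a freshly allocated string.
 * Returns the string length, or -1 on error. */
static int my_vasprintf(char **out, const char *fmt, va_list args)
{
    va_list args_copy;
    int len;

    /* Measure with a copy: the first use of a va_list may change it,
     * so the original must stay untouched for the second pass. */
    va_copy(args_copy, args);
    len = vsnprintf(NULL, 0, fmt, args_copy);
    va_end(args_copy);  /* every va_copy needs a matching va_end */

    if (len < 0)
        return -1;

    *out = malloc((size_t)len + 1);
    if (*out == NULL)
        return -1;      /* report allocation failure to the caller */

    return vsnprintf(*out, (size_t)len + 1, fmt, args);
}

/* Variadic front end: va_start is paired with va_end on every path. */
static int my_asprintf(char **out, const char *fmt, ...)
{
    va_list args;
    int ret;

    va_start(args, fmt);
    ret = my_vasprintf(out, fmt, args);
    va_end(args);
    return ret;
}
```

A caller would write `char *s; if (my_asprintf(&s, "%s-%d", "sse", 2) >= 0) { ...; free(s); }` and must free the result.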
If we have something like:

   #ifdef NOT_DEFINED
   #define A_MACRO(x) \
	if (x)
   #endif

The # on the #define is not skipped, but the define itself is, so
the following line then gets recognised as #if.

Until 28a3731 this didn't happen because we ended up in
<HASH>{NONSPACE} where BEGIN INITIAL was called stopping the
problem from happening.

This change makes sure we never call RETURN_TOKEN_NEVER_SKIP for
if/else/endif when processing a define.

Cc: Ian Romanick <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107772
Tested-By: Eero Tamminen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
(cherry picked from commit b9fe8ff)
…achable branch""

This reverts commit 2fd6f06.

Take back 28a3731 ("glsl: skip stringification in preprocessor if in
unreachable branch") after b9fe8ff ("glsl: fixer lexer for
unreachable defines") has made it to the branch.

Signed-off-by: Andres Gomez <[email protected]>
fixes: This commit was immediately reverted by commit 2dce117.

Signed-off-by: Andres Gomez <[email protected]>
In 32-bit builds of the library, compiling with msse2 seems to cause
stack corruption or incorrect instructions.
Adding mstackrealign fixes that case.

v2: Fixed meson.

v3: Definition of c_sse2_args moved to the top (L.Landwerlin).
    Added mstackrealign for Android's mks where msse4.1 is used.

v4: Added for Vulkan also.

v5: Commit message correction.

CC: <[email protected]>
Fixes: 6b05c08 (i965: Compile with -msse3)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107779
Signed-off-by: Sergii Romantsov <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
(cherry picked from commit d709f12)
Fixes
dEQP-GLES3.functional.fragment_ops.blend.default_framebuffer.rgb_func_alpha_func.dst.src_alpha_saturate_src_alpha_saturate
and friends with --deqp-egl-config-name=rgb565d0s0

Cc: "18.2" <[email protected]>
(cherry picked from commit f73f748)
gen9 hardware has a bug in the sampler cache that can cause GPU hangs
whenever a texture with aux compression enabled is in the sampler cache
together with an ASTC5x5 texture.  Because we can't control what the
client binds at any given time, we have two options: resolve the CCS or
decompress the ASTC.  Doing a CCS or HiZ resolve is far less drastic
and will likely have a smaller performance impact.

Cc: [email protected]
Reviewed-by: Kristian H. Kristensen <[email protected]>
Tested-by: Tapani Pälli <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
(cherry picked from commit f9e630e)
Some of the bits of VERTEX_BUFFER_STATE such as access type, instance
data step rate, and pitch come from the pipeline.

Cc: [email protected]
Reviewed-by: Kenneth Graunke <[email protected]>
(cherry picked from commit c643c5e)
I have no idea if I'm correct about what's going wrong or if this is the
correct fix.  However, in my multiple weeks of banging my head on this
hang, a VUE reference counting bug seems to match all the symptoms and
it definitely fixes the hang.

Cc: [email protected]
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107280
Reviewed-by: Kenneth Graunke <[email protected]>
(cherry picked from commit b08b4b2)
Building 32-bit mesa with meson fails with:
"implicit declaration of function ‘__builtin_ia32_clflush’".
Fixed by adding the msse2 compilation flag.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107843
Fixes: 314879f (i965: Fix asynchronous mappings on !LLC platforms.)
Signed-off-by: Sergii Romantsov <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
(cherry picked from commit 97fcccb)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <[email protected]>

Conflicts:
	src/intel/tools/meson.build
There were two bugs working together to make things mostly work: I wasn't
dividing the VPM output size available by the size of a batch (vertex),
but I also had the size of the VPM reduced by a factor of 8.

Fixes dEQP-GLES3.functional.vertex_array_objects.all_attributes and,
it seems, my intermittent varying failures as well.

Fixes: 1561e49 ("v3d: Emit the VCM_CACHE_SIZE packet.")
(cherry picked from commit a91b158)
The Vulkan 1.1.81 spec says:

    "It is legal for offset.x + extent.width or offset.y + extent.height
    to exceed the dimensions of the framebuffer - the scissor test still
    applies as defined above. Rasterization does not produce fragments
    outside of the framebuffer, so such fragments never have the scissor
    test performed on them."

Elsewhere, the Vulkan 1.1.81 spec says:

    "The application must ensure (using scissor if necessary) that all
    rendering is contained within the render area, otherwise the pixels
    outside of the render area become undefined and shader side effects
    may occur for fragments outside the render area. The render area
    must be contained within the framebuffer dimensions."

Unfortunately, there's some room for interpretation here as to what the
consequences are of having the render area set to exactly the
framebuffer dimensions and having a scissor that is larger than the
framebuffer.  Given that GL and other APIs provide automatic clipping to
the framebuffer, it makes sense that applications would assume that
Vulkan does this as well.  It costs us very little to play it safe and
just clamp client-provided scissors to the framebuffer dimensions.
Fortunately, the user is required to provide us with at least one
scissor so we don't need to handle the case where they don't.

Fixes: fb2a5ce "anv: Emit DRAWING_RECTANGLE once at driver..."
Reviewed-by: Kenneth Graunke <[email protected]>
(cherry picked from commit 465e5a8)
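
The clamping described above can be sketched as follows. `struct rect` and `clamp_scissor()` are hypothetical names that only mirror the offset and extent layout of a Vulkan scissor rectangle; this is not the anv implementation. The sketch assumes non-negative offsets and does the arithmetic in 64 bits so that offset plus extent cannot overflow.

```c
#include <stdint.h>

/* Hypothetical rectangle type: signed offset, unsigned extent. */
struct rect {
    int32_t  x, y;
    uint32_t width, height;
};

/* Clamp a client-provided scissor to a framebuffer of fb_w x fb_h.
 * Assumes s.x >= 0 and s.y >= 0. A scissor lying entirely outside
 * the framebuffer collapses to an empty rectangle on its edge. */
static struct rect clamp_scissor(struct rect s, uint32_t fb_w, uint32_t fb_h)
{
    /* 64-bit math so that offset + extent cannot wrap around. */
    uint64_t x0 = (uint64_t)s.x;
    uint64_t y0 = (uint64_t)s.y;
    uint64_t x1 = x0 + s.width;
    uint64_t y1 = y0 + s.height;

    if (x0 > fb_w) x0 = fb_w;
    if (y0 > fb_h) y0 = fb_h;
    if (x1 > fb_w) x1 = fb_w;
    if (y1 > fb_h) y1 = fb_h;

    struct rect out = {
        (int32_t)x0, (int32_t)y0,
        (uint32_t)(x1 - x0), (uint32_t)(y1 - y0)
    };
    return out;
}
```

With a 1920x1080 framebuffer, a scissor at offset (10, 20) with extent 2000x2000 is reduced to extent 1910x1060, while a scissor that already fits is returned unchanged.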
VI uses addrlib so it's unaffected.

Cc: 18.1 18.2 <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
(cherry picked from commit a1b9a00)
… on SI/CI

Cc: 18.2 <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
(cherry picked from commit d4e5228)
Cc: 18.1 18.2 <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
(cherry picked from commit da72b62)
important for debugging

Cc: 18.1 18.2 <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
(cherry picked from commit 662db03)
Cc: 18.2 <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
(cherry picked from commit a5f35aa)
Acked-by: Bas Nieuwenhuizen <[email protected]>
(cherry picked from commit 34a17a4)
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
(cherry picked from commit 6f00785)
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
CC: 18.2 <[email protected]>
(cherry picked from commit f6e09db)
Since commit af3685d various OpenGL applications regressed
on the classic mesa radeon driver.

Signed-off-by: Christopher Egert <[email protected]>
CC: 18.1 18.2 <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
(cherry picked from commit 51995f6)
This fixes the situation where we'd send a shader with just the
header and no data.

piglit/glsl-max-varyings test was causing this to happen, and
the renderer fix was breaking it.

v2: drop fprintf

Fixes: a8987b8 "virgl: add driver for virtio-gpu 3D (v2)"
Reviewed-by: Erik Faye-Lund <[email protected]>
(cherry picked from commit 240af61)
tpalli and others added 10 commits November 21, 2018 23:58
This change makes the following test pass:
	dEQP-VK.api.info.device.extensions

Test: dEQP-VK.api.info.device.extensions
Signed-off-by: Tapani Pälli <[email protected]>

[strassek: carry this patch until the extensions are whitelisted in CTS]
…UsageANDROID

Android P and earlier expect that the surface supports storage images, and
so many of the tests fail when the framework checks for that support. The
framework also includes various image format and usage combinations that are
invalid for the hardware.

Drop the STORAGE restriction from the HAL and whitelist a pair of
formats so that existing versions of Android can pass these tests.

Fixes:
   dEQP-VK.wsi.android.*

Signed-off-by: Kevin Strasser <[email protected]>

(am from https://patchwork.freedesktop.org/patch/247681/)
…StorageOES

In the same fashion as is done for glEGLImageTextureTarget2D.

v2: share the fallback which sets baseformat and internalformat correctly
    which makes both of the tests pass (Tapani)

Fixes android.hardware.nativehardware.cts.AHardwareBufferNativeTests:

   #SingleLayer_ColorTest_GpuColorOutputCpuRead_R8G8B8X8_UNORM
   #SingleLayer_ColorTest_GpuColorOutputIsRenderable_R8G8B8X8_UNORM

Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Gurchetan Singh <[email protected]>
(cherry picked from commit 47e3338)
Set bit when initializing a device.

Signed-off-by: Rafael Antognolli <[email protected]>
(am from https://patchwork.freedesktop.org/patch/210949/)
Set bit when initializing context.

Signed-off-by: Rafael Antognolli <[email protected]>
(am from https://patchwork.freedesktop.org/patch/210950/)
Gen9 hardware requires some workarounds to disable preemption depending
on the type of primitive being emitted.

We implement this by adding a new atom that tracks BRW_NEW_PRIMITIVE.
Whenever it happens, we check the current type of primitive and
enable/disable object preemption.

For now, we just ignore blorp.  The only primitive it emits is
3DPRIM_RECTLIST, and since it's not listed in the workarounds, we can
safely leave preemption enabled when it happens. Or it will be disabled
by a previous 3DPRIMITIVE, which should be fine too.

Signed-off-by: Rafael Antognolli <[email protected]>
Cc: Kenneth Graunke <[email protected]>
(am from https://patchwork.freedesktop.org/patch/210952/)
Apparently, we're supposed to look at the texture object's built-in
sampler object's sRGB decode setting in order to decide whether to
decode/downsample/re-encode, or simply downsample as-is.  Previously,
I had always done the decoding/encoding.

Fixes SKQP's Skia_Unit_Tests.SRGBMipMaps test.

Reviewed-by: Tapani Pälli <[email protected]>
(cherry picked from commit 337a808)
…pport

Fixes Skqp's unitTest_EGLImageTest test.

For Intel platforms, we support external textures only for EGLImages
created with EGL_EXT_image_dma_buf_import. This restriction seems to
be Intel specific and not present for other platforms.

While running the SKQP test unitTest_EGLImageTest, GL_INVALID is
returned to the test because of this restriction.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105301
Signed-off-by: Aditya Swarup <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
(cherry picked from commit a5c39ed)
…ort it on gen8+

Dual source blending behaviour is undefined when the shader doesn't
have a second color output. Dismissing the fragment in such a situation
leads to a hang on gen8+ if the depth test is enabled.

Since blending cannot be gracefully fixed in such case and the result
is undefined - blending is simply disabled.

v2 (Kenneth Graunke):
 - Listen to BRW_NEW_FS_PROG_DATA in 3DSTATE_PS_BLEND
 - Also whack BLEND_STATE[] to keep the two in sync, since we're not
   sure exactly which copy of the redundant info the hardware will use.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107088
Signed-off-by: Danylo Piliaiev <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
(cherry picked from commit eca4a65)
@js0701
Contributor

js0701 commented Nov 22, 2018

@tpalli
@strassek
Please review the patch and share the comments

@tpalli
Contributor

tpalli commented Nov 22, 2018

Do we have performance data on these changes, what's the impact?

@vbajaj1986
Author

vbajaj1986 commented Nov 22, 2018 via email

Contributor

@strassek strassek left a comment


LGTM. Has this series changed at all since the RFC [1] Toni posted to mesa-dev? I'd like to maintain links to the upstream submission if possible.
[1] https://patchwork.freedesktop.org/series/51006/

@strassek
Contributor

strassek commented Dec 5, 2018

Merged and pushed to master

@strassek strassek closed this Dec 5, 2018
@strassek
Contributor

These patches are causing visual artifacts on the Celadon home screen, so I've included a revert in the master branch. If this feature is still needed, the issues will need to be resolved.
