Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[all]: Rework Neon load/store and load/store pair #804

Merged
merged 2 commits into from
Mar 21, 2024

Conversation

murzinv
Copy link

@murzinv murzinv commented Feb 29, 2024

Currently Neon load/store is split in two instructions: one for handling immediate/register offset and another for handling post-index variant. The similar applies to Neon load/store pair. Note, missing support for pre-index variant for Neon load/store.

Let's refactor those instructions to take advantage of existing infrastructure for handling addressing modes. That removes need for special instruction for post-index variant and as a bonus naturally brings support for pre-index variant for Neon load/store.

@murzinv
Copy link
Author

murzinv commented Feb 29, 2024

Following #749 (comment)

Copy link
Member

@relokin relokin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two comments but otherwise looks good to me.

| I_STUR_SIMD(var,r1,rA,k) ->
let access_size = tr_simd_variant var and
k = K (match k with Some k -> k | None -> 0) in
simd_str access_size rA r1 k S_NOEXT ii
k = match k with Some k -> k | None -> 0 in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit annoying, I think now you could also fix it in the parser (use k0 instead of k0_opt)? It looks like we might be able to get rid of k0_opt.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

pp_memo memo ^ " " ^
pp_vsimdreg v r1 ^ "," ^
pp_vsimdreg v r2 ^ "," ^
do_pp_idx true ra idx in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is getting a bit complicated but if you made the change below you wouldn't have to change the hashes of tests with LDP/STP. I think this means that we need to change the declaration of do_pp_idx and make sure that the argument from_pair is set to true for non-neon pair instructions only.

Suggested change
do_pp_idx true ra idx in
do_pp_idx false ra idx in

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact pp_smemp is used by Neon pair instructions only, so it is safe to change it to do_pp_idx false ra idx or use existing helper pp_idx ra idx. And, yes, once changed no need to update hashes.

Copy link
Member

@relokin relokin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two quick comments

Comment on lines 2652 to 2653
get_ea_reg rA v kr sext s ii >>= fun addr ->
simd_str access_size addr r1 ii
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has the unfortunate consequence of changing the intra-instruction relations. So I would suggest instead doing this:

Suggested change
get_ea_reg rA v kr sext s ii >>= fun addr ->
simd_str access_size addr r1 ii
let ma = get_ea_reg rA v kr sext s ii in
simd_str access_size ma r1 ii

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the same applies to all store variants in this PR. The loads are ok.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch!

Comment on lines 1903 to 1904
let simd_str_p sz addr rd rs k ii =
read_reg_neon true rd ii >>= fun v ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check my comment below but to retain the iico_* relations as before, then you could do:

Suggested change
let simd_str_p sz addr rd rs k ii =
read_reg_neon true rd ii >>= fun v ->
let simd_str_p sz ma rd rs k ii =
ma >>|
read_reg_neon true rd ii >>= fun (addr, v) ->

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack.

Copy link
Member

@relokin relokin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great thanks for the quick fix!

If @maranget doesn't see any problems with the hash change we can merge. Let's give people some time to react. Let's aim to have this merged by the end of Thursday?

@maranget
Copy link
Member

If @maranget doesn't see any problems with the hash change we can merge. Let's give people some time to react. Let's aim to have this merged by the end of Thursday?

I have no objection, LDUR does not appear in our hardware logs.

@relokin
Copy link
Member

relokin commented Mar 21, 2024

@murzinv can you please rebase on top of master and then we can merge this.

Vladimir Murzin added 2 commits March 21, 2024 17:08
Currently Neon load/store is split in two instructions: one for
handling immediate/register offset and another for handling post-index
variant. The similar applies to Neon load/store pair. Note, missing
support for pre-index variant for Neon load/store.

Let's refactor those instructions to take advantage of existing
infrastructure for handling addressing modes. That removes need for
special instruction for post-index variant and as a bonus naturally
brings support for pre-index variant for Neon load/store.

Signed-off-by: Vladimir Murzin <[email protected]>
We can improve processing of LDUR instruction by eliminating special
handling of the optional immediate. Parser provides `k0` rule which
already handle absence of immediate offset by defaulting it to zero,
so use it instead of `k0_opt`. Since there are no more users of
`k0_opt` left, kill it for good.

Signed-off-by: Vladimir Murzin <[email protected]>
@relokin relokin merged commit 5f30dae into herd:master Mar 21, 2024
2 checks passed
@relokin
Copy link
Member

relokin commented Mar 21, 2024

Thanks @murzinv!

@murzinv murzinv deleted the use-idx-for-neon branch March 21, 2024 17:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants