Refer mapping symbol into R_RISCV_RELAX for rvc/norvc relaxations. #393

Nelson1225 · 2023-07-21T00:41:33Z

Hey guys,

Original discussion,
#116

The idea came from @jrtc27 one year ago, to resolve the problem that both rvc and norvc codes are in the same object. Not really sure how to write the psabi correctly, so needs your helps to review the changes. Thanks :-)

cc @kito-cheng @jrtc27 @jim-wilson @rui314 @MaskRay @asb

Nelson1225 · 2023-07-21T00:44:34Z

The implementation of binutils for this proposal,

RISC-V: Refer mapping symbol to R_RISCV_RELAX for rvc relaxations, https://sourceware.org/pipermail/binutils/2022-September/123191.html

MaskRay · 2023-07-24T02:25:02Z

RISC-V: Refer mapping symbol to R_RISCV_RELAX for rvc relaxations, https://sourceware.org/pipermail/binutils/2022-September/123191.html

Thank you for proposing this! Oh, I did not know the patch has been there for a long time.

I think using the symbol index part of r_info nicely describes the semantics of a R_RISCV_RELAX relocation.
Technically, we can make R_RISCV_RELAX relocations do more ~~awful~~magical relaxation for various ISA extensions...
But more likely, R_RISCV_RELAX will remain used only to distinguish RVC and non-RVC, as new relaxations should use new relocation types, there is not much we need from a marker relocation type.

I wonder whether we should use the r_addend field:

0: e_flags & EF_RISCV_RVC decides the RVC state
1: non-RVC, i.e., c.j/c.jal cannot be used for call/tail relaxation

By utilizing r_addend, the linker doesn't have to scan the referenced symbol name.
As a minor issue, we won't be able to use REL, but we already cannot use REL due to R_RISCV_ALIGN.

(I'll be out of town for ~2 weeks and will be slow in responding.)

jrtc27 · 2023-07-24T02:32:26Z

CHERI-RISC-V is an example of where you can do more things. Zce's existence shows that RVC alone isn't enough. Plus you have all the fun of things like 48-bit encodings. And then who knows what other extensions will exist. Let's not try and be clever here and cause even more issues in the future, let's just do the general thing that should always work and requires no thought. I.e. no, abusing the addend like that is too narrow.

MaskRay · 2023-07-24T02:43:05Z

CHERI-RISC-V is an example of where you can do more things. Zce's existence shows that RVC alone isn't enough. Plus you have all the fun of things like 48-bit encodings. And then who knows what other extensions will exist. Let's not try and be clever here and cause even more issues in the future, let's just do the general thing that should always work and requires no thought. I.e. no, abusing the addend like that is too narrow.

Do you have a self-contained example that there are clever uses of R_RISCV_RELAX and introducing a new one is insufficient?

The r_addend suggestion is primarily about linker performance, but is also relevant to another merit I want to uphold: the linker behavior should not be changed by magical symbol names (_c in the mapping symbol referenced by the R_RISCV_RELAX relocation).

jrtc27 · 2023-07-24T03:34:23Z

is also relevant to another merit I want to uphold: the linker behavior should not be changed by magical symbol names

Then come up with a viable alternative that has the same flexibility as using the ISA string without reinventing the wheel.

kito-cheng · 2023-07-24T06:25:12Z

I prefer pointer to dummy mapping symbol since that's more flexible to deal with different ISA extension including the vendor extensions.

Also r_addend is only 32 bits (due to RV32...), and we will consume at least 5 bits immediately for c, zcmt, zcf, zcd and zcb for this scheme, so it might...run out soon in next few years I suspect.

MaskRay · 2023-07-24T06:47:59Z

Can someone describe how these extensions are going to use R_RISCV_RELAX in a way that is close to the spirit of the current R_RISCV_RELAX, so that we should consider extensibility of R_RISCV_RELAX, instead of creating new relocation types, even if parsing referenced symbol names is a very odd thing to do? This is why I asked for a self-contained example.

kito-cheng · 2023-07-24T08:14:29Z

Let me try to made one with Zcmt:

1. rv32imafdc_zcmt

foo:
  # R_RISCV_RELAX symbol($xrv32imafdc_zcmt)
  # R_RISCV_CALL
  call bar # This can be relax to jal, c.jal or cm.jalt

2. rv32imafdc

bar:
  # R_RISCV_RELAX symbol($xrv32imafdc)
  # R_RISCV_CALL symbol(zoo)
  call zoo # This can be relax to jal, c.jal

3. rv32imafd

zoo:
  # R_RISCV_RELAX symbol($xrv32imafd)
  # R_RISCV_CALL symbol(abc)
  call abc # This can be relax to jal

And adding new relocation like R_RISCV_RELAX_NORVC can't distinguish 1 and 2.

kito-cheng · 2023-08-03T06:58:23Z

@Nelson1225 could you also add some more comment to R_RISCV_RELAX relocation? maybe like this:

diff --git a/riscv-elf.adoc b/riscv-elf.adoc
index 7eeddc5..1ba7f40 100644
--- a/riscv-elf.adoc
+++ b/riscv-elf.adoc
@@ -430,7 +430,7 @@ Description:: Additional information about the relocation
                                             <| S + A
 .2+| 47-50   .2+| *Reserved*                          .2+| -       |                   .2+| Reserved for future standard use
                                             <|
-.2+| 51      .2+| RELAX         .2+| Static  |                   .2+| Instruction can be relaxed, paired with a normal relocation at the same address
+.2+| 51      .2+| RELAX         .2+| Static  |                   .2+| Instruction can be relaxed, paired with a normal relocation at the same address. This relocation might point to an optional dummy mapping symbol, indicating which extension is permitted for use during the linker relaxation process.
                                             <|
 .2+| 52      .2+| SUB6          .2+| Static  | _word6_           .2+| Local label subtraction
                                             <| V - S - A

Nelson1225 · 2023-08-04T00:48:59Z

@Nelson1225 could you also add some more comment to R_RISCV_RELAX relocation? maybe like this:

diff --git a/riscv-elf.adoc b/riscv-elf.adoc
index 7eeddc5..1ba7f40 100644
--- a/riscv-elf.adoc
+++ b/riscv-elf.adoc
@@ -430,7 +430,7 @@ Description:: Additional information about the relocation
                                             <| S + A
 .2+| 47-50   .2+| *Reserved*                          .2+| -       |                   .2+| Reserved for future standard use
                                             <|
-.2+| 51      .2+| RELAX         .2+| Static  |                   .2+| Instruction can be relaxed, paired with a normal relocation at the same address
+.2+| 51      .2+| RELAX         .2+| Static  |                   .2+| Instruction can be relaxed, paired with a normal relocation at the same address. This relocation might point to an optional dummy mapping symbol, indicating which extension is permitted for use during the linker relaxation process.
                                             <|
 .2+| 52      .2+| SUB6          .2+| Static  | _word6_           .2+| Local label subtraction
                                             <| V - S - A

Sure, fixed :-)

kito-cheng · 2023-09-28T07:40:54Z

We found an interesting case during implementing zcmt linker optimization, should we do zcmt optimization IF attached mapping symbol didn't contained zcmt?

If yes: do we have way to disable that? It will hurt performance (a lots) in general, so we might want to disable that in some code region, e.g. performance critical function.

If no: that means we must build multi-lib with zcmt to enable that, which may increase lots of multi lib configuration in practice, otherwise libraries can't optimize with zcmt.

jrtc27 · 2023-09-28T12:02:56Z

Mapping symbols are for disassemblers and have no other semantic meaning. Putting the ISA string information in the RELAX relocation is the way to solve these kinds of problems.

kito-cheng · 2023-09-29T02:20:43Z

Yeah, anyway, I am not super concern about the case I throw, I think that would be sort of implementation details.

rui314 · 2023-09-29T02:49:34Z

If you want to identify the set of instruction extensions for a given piece of code, isn't that information already in the object file? For example, after the $xrv64imafdc symbol, the code is considered to be RV64GC until another mapping symbol appears. That information alone should be sufficient for the linker to learn what relaxation is allowed for that code region. It seems we don't need to repeat it for each RELAX relocation.

jrtc27 · 2023-09-29T02:52:20Z

Mapping symbols don't convey semantic information, that's the point. You can legitimately strip them and get a working (if potentially hard to disassemble) result. Doing it based off the actual symbols is fragile and a bad idea.

rui314 · 2023-09-29T02:56:50Z

Then I'd suggest we add a new relocation type or redefine RELAX to switch the current set of extensions instead of repeating the same thing for each RELAX relocation.

jrtc27 · 2023-09-29T03:23:15Z

Then I'd suggest we add a new relocation type or redefine RELAX to switch the current set of extensions instead of repeating the same thing for each RELAX relocation.

Relocations should not be stateful. They should only affect relocations at the same address. Otherwise you have to process them in-order (or sort them and search them) and permuting sections changes the semantics.

rui314 · 2023-09-29T03:35:01Z

Why shouldn't it be stateful? The same argument could have been applied to symbols, but with the introduction of mapping symbols, the symbols have become stateful in a sense. I don't see a strong reason to avoid it. The linker processes relocations in address order, so it is actually natural for the linker to handle them in-order. Permuting sections doesn't alter the semantics, as each"state change" relocation still points to the location where a new code region begins.

rui314 · 2023-09-29T03:47:57Z

For RISC-V object files, we can't really handle relocations in an arbitrary order because of the relaxation that reduces section size. Unlike other psABIs, a relocation's r_offset is not sufficient to identify the location where the relocation is applied to. We need to know how many bytes have been removed by other relocations that appears before it to apply it to the correct location. So they are already stateful and needs to be processed in the address-increasing order.

jrtc27 · 2023-09-29T04:16:11Z

Only ALIGN needs to be processed in-order today. Everything else can be done in an arbitrary address order.

rui314 · 2023-09-29T04:31:44Z

I don't think that's true. For example, if the first relocation with RELAX shrinks the section by, say, 2 bytes, the r_offset of all relocations with an address greater than the first one must be decreased by 2. To do so, they need to be processed in the address-increasing order.

jrtc27 · 2023-09-29T04:50:32Z

Depends how you track things. And if you've resolved the relocation then you don't need to care about its r_offset.

sorear · 2024-01-28T04:16:39Z

Are ISA string parsers reliable and interoperable enough now to consider using them for something load-bearing? I remember a lot of issues a few years ago.

kito-cheng · 2024-01-30T07:38:56Z

Are ISA string parsers reliable and interoperable enough now to consider using them for something load-bearing? I remember a lot of issues a few years ago.

ISA string is kinda stable than before after the adding huge extensions last two years I think, the only unstable thing is RISC-V ISA seems intend to split extension into more fine grained, e.g. M become M + Zmmul, A become A + Zalrsc + Zaamo, but that is less concern since that is not really backward incompatible change, and ISA string canonicalation could resolve that.

Back to the problem I throw before (#393 (comment)): the zcmt issue, we found a (little complicated) solution for that, which is allow some extension can be ignore that limitation IF the target ISA string isn't using (encoding) conflicted extensions, but anyway that's implementation define behavior, we just need make sure the spec has leave flexibility here.

Other than that I think it's viable solution, not sure how linker other guys think? @MaskRay @rui314

sorear · 2024-02-26T00:59:21Z

This is an incomplete solution, since not all forms of linker code generation or rewriting occur in the context of a relocation. The recently proposed instruction table standard extension can rewrite arbitrary 32-bit instructions, and the FDPIC proposal I'm working on benefits from RVC in the PLT. How do the existing Andes etc toolchains with instruction tables know what regions to enable them in?

jrtc27 · 2024-02-26T04:34:35Z

I don't know how you can hope to rewrite instructions without a relocation. You don't even know where the start of each instruction is. And you need the ability to turn it off for some assembly so you get exactly what you wrote.

kito-cheng · 2024-02-26T05:46:52Z

This is an incomplete solution, since not all forms of linker code generation or rewriting occur in the context of a relocation. The recently proposed instruction table standard extension can rewrite arbitrary 32-bit instructions,

Are you refer to Zcmt (cm.jt and cm.jalt)? that will still require a relocation to relax.

and the FDPIC proposal I'm working on benefits from RVC in the PLT.

Do you mind give more context about that? and that's good to know you are working FDPIC :)

How do the existing Andes etc toolchains with instruction tables know what regions to enable them in?

My memory tell me they are using relocation pair to disable that within specific region (e.g. R_START_NO_TABLE_INST_REGION, R_END_NO_TABLE_INST_REGION).

sorear · 2024-02-27T00:44:22Z

Are you refer to Zcmt (cm.jt and cm.jalt)? that will still require a relocation to relax.

No, I'm referring to the instruction table extension Zcmi.

Do you mind give more context about that? and that's good to know you are working FDPIC :)

I just created #429 so that there is a point of reference for people on GitHub. #343 (comment) has more information on the requirements and design process for the relocation scheme.

kito-cheng · 2024-03-01T12:54:44Z

No, I'm referring to the instruction table extension Zcmi.

Thanks for that info, found the spec in the thread, but just a bookmark here to let other easier access that: https://drive.google.com/file/d/1qCAjZs6pZWK8q2nPVSsuVqs4aSBqkRTw/view

And it seems very similar to Andes's extension which I am family with...so yeah, the comment in my last comment for that still valid and right: using relocation pair to mark a region can't be optimized with table jump instruction.

Refer mapping symbol into R_RISCV_RELAX for rvc/norvc relaxations.

12925d5

Nelson1225 force-pushed the mapping-rvc branch from cf29462 to 12925d5 Compare August 4, 2023 00:47

kito-cheng mentioned this pull request Sep 28, 2023

Relax GOT indirection into PC-relative addressing #397

Merged

kito-cheng requested review from rui314, MaskRay, jrtc27 and kito-cheng September 29, 2023 02:20

ilovepi mentioned this pull request Nov 30, 2023

[LLD][RISCV] Fix incorrect call relaxation when mixing +c and -c objects llvm/llvm-project#73977

Merged

MaskRay mentioned this pull request Jan 26, 2024

Embedding R_RISCV_RELAX to another relocations #401

Open

pz9115 mentioned this pull request Aug 26, 2024

.option norvc disables all 16-bit instructions in binutils riscv-collab/riscv-gnu-toolchain#1542

Closed

lenary mentioned this pull request Nov 1, 2024

Relocations, ELF Markers, and >32-bit Instructions #453

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refer mapping symbol into R_RISCV_RELAX for rvc/norvc relaxations. #393

Refer mapping symbol into R_RISCV_RELAX for rvc/norvc relaxations. #393

Nelson1225 commented Jul 21, 2023

Nelson1225 commented Jul 21, 2023

MaskRay commented Jul 24, 2023 •

edited

Loading

jrtc27 commented Jul 24, 2023 •

edited

Loading

MaskRay commented Jul 24, 2023

jrtc27 commented Jul 24, 2023

kito-cheng commented Jul 24, 2023

MaskRay commented Jul 24, 2023

kito-cheng commented Jul 24, 2023

kito-cheng commented Aug 3, 2023

Nelson1225 commented Aug 4, 2023

kito-cheng commented Sep 28, 2023

jrtc27 commented Sep 28, 2023

kito-cheng commented Sep 29, 2023

rui314 commented Sep 29, 2023

jrtc27 commented Sep 29, 2023

rui314 commented Sep 29, 2023

jrtc27 commented Sep 29, 2023

rui314 commented Sep 29, 2023 •

edited

Loading

rui314 commented Sep 29, 2023 •

edited

Loading

jrtc27 commented Sep 29, 2023

rui314 commented Sep 29, 2023

jrtc27 commented Sep 29, 2023

sorear commented Jan 28, 2024

kito-cheng commented Jan 30, 2024

sorear commented Feb 26, 2024

jrtc27 commented Feb 26, 2024

kito-cheng commented Feb 26, 2024

sorear commented Feb 27, 2024

kito-cheng commented Mar 1, 2024

Refer mapping symbol into R_RISCV_RELAX for rvc/norvc relaxations. #393

Are you sure you want to change the base?

Refer mapping symbol into R_RISCV_RELAX for rvc/norvc relaxations. #393

Conversation

Nelson1225 commented Jul 21, 2023

Nelson1225 commented Jul 21, 2023

MaskRay commented Jul 24, 2023 • edited Loading

jrtc27 commented Jul 24, 2023 • edited Loading

MaskRay commented Jul 24, 2023

jrtc27 commented Jul 24, 2023

kito-cheng commented Jul 24, 2023

MaskRay commented Jul 24, 2023

kito-cheng commented Jul 24, 2023

1. rv32imafdc_zcmt

2. rv32imafdc

3. rv32imafd

kito-cheng commented Aug 3, 2023

Nelson1225 commented Aug 4, 2023

kito-cheng commented Sep 28, 2023

jrtc27 commented Sep 28, 2023

kito-cheng commented Sep 29, 2023

rui314 commented Sep 29, 2023

jrtc27 commented Sep 29, 2023

rui314 commented Sep 29, 2023

jrtc27 commented Sep 29, 2023

rui314 commented Sep 29, 2023 • edited Loading

rui314 commented Sep 29, 2023 • edited Loading

jrtc27 commented Sep 29, 2023

rui314 commented Sep 29, 2023

jrtc27 commented Sep 29, 2023

sorear commented Jan 28, 2024

kito-cheng commented Jan 30, 2024

sorear commented Feb 26, 2024

jrtc27 commented Feb 26, 2024

kito-cheng commented Feb 26, 2024

sorear commented Feb 27, 2024

kito-cheng commented Mar 1, 2024

MaskRay commented Jul 24, 2023 •

edited

Loading

jrtc27 commented Jul 24, 2023 •

edited

Loading

rui314 commented Sep 29, 2023 •

edited

Loading

rui314 commented Sep 29, 2023 •

edited

Loading