Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RELOCID for vendor-specific relocations #423

Merged
merged 1 commit into from
Aug 12, 2024

Conversation

MaelCravero
Copy link
Contributor

This expands the current support for custom relocations to have a vendor identifier which would allow to avoid conflicts between vendor relocations in tools. This would also allow a greater number of relocations, such that these could eventually be supported in upstream tools.

A proof of concept implementation can be found in https://github.com/embecosm/corev-binutils-gdb-vendor-relocs.

@kito-cheng
Copy link
Collaborator

Thanks for raising this! I think this solution is kinda discussed for a while, and also agreed from many different parties, I guess the only only remaining controversy is the vendor ID: build our own list or use JEDEC manufacturer ID like mvendorid.

Added this into today's psABI meeting agenda, thanks again :)

Copy link
Collaborator

@jrtc27 jrtc27 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the addend precludes the use of Elf_Rel, which some platforms (Fuschia) use for dynamic relocations. It needs to be encoded in the symbol instead.

<|
.2+| 192-255 .2+| *Reserved* .2+| - | .2+| Reserved for nonstandard ABI extensions
<|
|===

Nonstandard extensions are free to use relocation numbers 192-255 for any
purpose. These relocations may conflict with other nonstandard extensions.
purpose. These vendor-specific relocations must be preceded by a
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You say must yet the next paragraph says where possible? It can’t be must, that’s not compatible with the historic definition.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed re compatibility with existing objects, this is an optional addition, we will fix that in the text.

The intent for the following paragraph was to talk about presentation in objdump, etc. where if it is possible to show for the custom reloc a name that takes into account the full pair of relocations it should (e.g. R_RISCV_MYVENDOR_MEANINGFUL_NAME, rather than say R_RISCV_CUSTOM192), rather than emphasising/making optional the existence of a differentiator reloc.

@simonpcook
Copy link

Using the addend precludes the use of Elf_Rel, which some platforms (Fuschia) use for dynamic relocations. It needs to be encoded in the symbol instead.

Is this suggesting encoding in the symbol in all cases, or being generic enough that Sym (if Elf_Rel) or Sym+Addend (if Elf_Rela; probably the absolute section symbol) results in the "vendor identifier"? I understand the need to cover the Elf_Rel case if we have users that need that (and forbidding such functionality because you don't have addends in your ELF file makes no sense), but I'm not sure of an immediate strong argument of always adding symbols to hold this information if addends are available

@kito-cheng
Copy link
Collaborator

One idea for resolve both issue: 1) Organization don't have JEDEC manufacturer ID, 2) REL type relocation.

RELA: Encoding JEDEC manufacturer ID in addend, and 0 for those organization don't have JEDEC manufacturer ID, relocation symbol will point to a special symbol $v<vendor-name> if addend is 0.

REL: Relocation symbol will point to a special symbol: $jedec0x<jedecid-in-hex> or $v<vendor-name>.

@MaelCravero
Copy link
Contributor Author

I think it would be more straightforward to only use symbols with the vendor name: this way we have a single way to use this relocation for both REL and RELA without needing to differentiate for vendors with or without a JEDEC ID. Also, if a vendor really wants to use their JEDEC ID, they are free to use it in the symbol name.

@kito-cheng
Copy link
Collaborator

I am happy with special/dummy symbol only scheme.

@jrtc27 does it address your concern?

@kito-cheng
Copy link
Collaborator

Just cc more linker experts to make sure it's not insane to linker implementation.

Some background:

We have only limited relocation can be used within the RISC-V psABI due to RV32, we already divide few portion of relocation to vendor extension, however we didn't have a mechanism to well maintain the vendor relocation, that cause those relocation never go back to upstream, therefore, some vendor extension start upstream, like Core-V, they have few vendor relocation, and I believe they are not along...SiFive has also defined some vendor relocation within downstream, also IIRC CHERRI also defined few relocation...so having a mechanism defined in psABI would be great.

@rui314 @MaskRay

@jrtc27
Copy link
Collaborator

jrtc27 commented Mar 11, 2024

CHERI is less relevant here; as a custom research extension we have no need to coexist with other vendor extensions, and the standard extension we're working towards now in the CHERI SIG+TG won't be a vendor extension so will coexist just fine.

@rui314
Copy link
Collaborator

rui314 commented Mar 12, 2024

It seems the fundamental issue here is that the r_type member is just too small in 32 bits. So, as a workaround, maybe we can require a vendor-specific relocation to be represented as a pair of relocations instead of just a single relocation? This effectively extends the vendor-specific range from 2^6 (64) to 2^14 (16384), allowing vendors to use non-overlapping relocation types.

(This idea was inspired by the concept of UTF-16 surrogates.)

@jrtc27
Copy link
Collaborator

jrtc27 commented Mar 12, 2024

We cannot change the meaning of any of the existing relocations reserved for custom use. The whole point of reserving them for custom use like that is so that others can rely on being able to use them in perpetuity.

@rui314
Copy link
Collaborator

rui314 commented Mar 12, 2024

Then how about defining [176, 191] as "vendor-specific relocation prefixes"? That allows 16*256=4096 new vendor-specific relocations.

@jrtc27
Copy link
Collaborator

jrtc27 commented Mar 12, 2024

How are you encoding all vendor relocations in 12 bits?

@rui314
Copy link
Collaborator

rui314 commented Mar 12, 2024

Actually we can just reserve a single relocation number, say, 191, as a vendor-specific prefix and use UTF-8-like encoding scheme by using following relocations to represent any number of relocation types, eliminating the need to use different relocation numbers for RV32 and RV64 for those who need more than 64 relocation types.

@rui314
Copy link
Collaborator

rui314 commented Mar 12, 2024

Alternative approach would be to define 191 as the first relocation of a vendor-specific relocation pair, whose actual r_type is encoded in r_addend. The second relocation would contain actual data for the vendor-specific relocation (except r_type).

@rui314
Copy link
Collaborator

rui314 commented Mar 12, 2024

So, my proposal is to resolve the issue that RV32's r_type being too small, instead of resolving conflicts in the small space by adding vendor tags. It could be achieved by adding a clause like this to the psABI:

"A relocation whose r_type is greater than 255 is represented using two consecutive relocation records in RV32. The fist relocation has r_type of 191, and the actual relocation type is encoded in its r_offset. Other members, namely r_offset, r_sym and r_addend are taken from the second relocation."

And then we can reserve a plenty space (say, 0x8000'0000 to 0xffff'ffff) for vendor specific relocations, allowing vendors to use non-overlapping regions.

How does this sound to you guys?

@jrtc27
Copy link
Collaborator

jrtc27 commented Mar 12, 2024

You need r_offset to point at the right location otherwise every single tool will need to have special casing in what is normally highly generic code.

@rui314
Copy link
Collaborator

rui314 commented Mar 13, 2024

We can use r_addend instead.

@jrtc27
Copy link
Collaborator

jrtc27 commented Mar 13, 2024

We can use r_addend instead.

#423 (review)

@MaskRay
Copy link
Collaborator

MaskRay commented Mar 13, 2024

I think REL disallowing r_addend as an identifier is less relevant since there is no requirement to make REL work.
People may not need it. I added -z rel to ld.lld and Fuchsia seems to use it, but it is pretty minor size optimization when RELR is utilized.
Do we have ideas how many platform-specific relocations end up as dynamic relocations?

How urgent do people need reserved relocation types in the specification?
If my RELLEB proposal https://groups.google.com/g/generic-abi/c/yb0rjw56ORw is adopted, RV32 can use effectively unlimited number of relocation types.

@rui314
Copy link
Collaborator

rui314 commented Mar 13, 2024

@jrtc27 I agree with MaskRay on Fuchsia. Fuchsia (or anyone) reserves the right to deviate from this psABI, and that's what they are doing for dynamic relocations, but the consequence would also be on them if they choose to do so. We are not obligated to make nonstandard deviations work smoothly with new psABI features. If I were a Fuchsia developer, I'd use r_offset in the dynamic relocation table instead of r_addend (which shouldn't be hard), so I don't think that's going to be a problem, but how to implement it is up to them.

@kito-cheng
Copy link
Collaborator

@MaskRay :

Sorry for late reply, I was stuck at releasing stuffs.

Thanks you bring RELR and RELLEB(CREL), it's my first time know that, I think both are useful for code size reduction around the relocation info, I got few code size request around the kernel module (relocatable files), so I was considering to introducing mixing REL and RELA (*1), but I realized RELLEB(CREL) could do better here, anyway I am happy to see this happen, let me know if I can help anything here.

*1: AArch64 ABI allow mixing REL and RELA[1]

[1] https://github.com/ARM-software/abi-aa/blob/main/aaelf64/aaelf64.rst#addends-and-pc-bias

Back to the the vendor relocation, I am inclined to using dummy symbol scheme at this moment, since I think new relocation format may still require more than one release cycle to implement that into all tools, however vendor instruction has merged into upstream recently, I don't want to block this behind that.

At the end, let me emphasize again, I am really looking forward to RELR and RELLEB(CREL) can improve the code size issue, I saw RELR already merged into glibc since 2.36, so I gonna to prepare some PR to mention that in the RISC-V psABI :)

riscv-elf.adoc Outdated Show resolved Hide resolved
@MaelCravero
Copy link
Contributor Author

Updated to use symbols instead of the addend as vendor ID.

riscv-elf.adoc Outdated Show resolved Hide resolved
riscv-elf.adoc Outdated Show resolved Hide resolved
Copy link
Collaborator

@kito-cheng kito-cheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM :)

This PR already provide toolchain PoC, so the only missing part is we need few blessing from linker experts per the rule we set before, @rui314 @MaskRay @Nelson1225 do you mind give one?

riscv-elf.adoc Outdated Show resolved Hide resolved
riscv-elf.adoc Outdated Show resolved Hide resolved
riscv-elf.adoc Outdated Show resolved Hide resolved
@MaskRay
Copy link
Collaborator

MaskRay commented May 19, 2024

No objection to vendor relocations taking one type code.

However, the inclusion of a relocation symbol for vendor identification raises some questions.

Why does a vendor relocation need to identify the vendor name?
A vendor identifier table could introduce maintenance overhead, as vendors might register even without immediate need, especially considering the large number of RISC-V vendors.

We reserve one code so that vendor relocations will not conflict with standard relocations.
Ensuring two vendors can exchange relocatable files might not be a goal of this specification.
I am also concerned with toolchain implementation overhead (I especially care about LLVM binary utilities) to recognize these vendor names.

That said, vendor identification might be useful for other purposes and we probably need a per-file mechanism.
With that, do we really need a per-relocation identification?

@jrtc27
Copy link
Collaborator

jrtc27 commented May 19, 2024

Ensuring two vendors can exchange relocatable files might not be a goal of this specification.

Unfortunately that has been expressed very explicitly as a goal, hence how we've ended up here.

@kito-cheng kito-cheng mentioned this pull request May 22, 2024
@kito-cheng
Copy link
Collaborator

We reserve one code so that vendor relocations will not conflict with standard relocations.
Ensuring two vendors can exchange relocatable files might not be a goal of this specification.
I am also concerned with toolchain implementation overhead (I especially care about LLVM binary utilities) to recognize these vendor names.

That said, vendor identification might be useful for other purposes and we probably need a per-file mechanism.
With that, do we really need a per-relocation identification?

We may have multi-version function for different vendor, so that may contain relocation from more than one vendor in theory.

@MaskRay
Copy link
Collaborator

MaskRay commented May 23, 2024

Can you share an example of how vendor relocations from various vendors are combined?
Are we comfortable committing to long-term maintenance now, or should we gain more experience first?

What actions should upstream linkers take? I think it's crucial that they avoid recognizing vendor-specific semantics.

(A dummy symbol, if ever adopted, can be implemented with STB_WEAK STV_HIDDEN so that after linking there is only one copy.)

@kito-cheng
Copy link
Collaborator

@MaskRay

Are we comfortable committing to long-term maintenance now, or should we gain more experience first?

What actions should upstream linkers take? I think it's crucial that they avoid recognizing vendor-specific semantics.

RISC-V community agree taking vendor-specific extensions, so that may introduce new relocation you could imagine :P, but require vendor to maintain that, one agreement here is that require an open spec on toolchian-convention, give some piratical example is we have vendor extension from T-head, Core-V (OpenHW), Ventana and SiFive on upstream now.

Can you share an example of how vendor relocations from various vendors are combined?

So back to this question, because we allow vendor upstream their extension, so yes they may contain within single file.

Copy link
Collaborator

@kito-cheng kito-cheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This proposal has discussed for long time, and address most concern so far, give an explicit LGTM to notify everyone this is ready to land.

@kito-cheng
Copy link
Collaborator

@MaskRay any further concern?

Copy link
Collaborator

@MaskRay MaskRay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kito-cheng kito-cheng merged commit 456834e into riscv-non-isa:master Aug 12, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants