[pauthabielf64] Define R_AARCH64_AUTH_TLSDESC_CALL. #395

Open
smithp35 wants to merge 5 commits into ARM-software:main from smithp35:tlsauthdesccall

@smithp35

Contributor

The TLSDESC sequence for accessing an authenticated pointer is similar to the traditional TLSDESC sequence. Because pointer signing must be done at run-time, there are far fewer opportunities to relax a TLSDESC AUTH sequence. We introduce an R_AARCH64_AUTH_TLSDESC_CALL relocation that permits a static linker to transform the blraa to a nop when relaxation is possible.

A new relocation code has been introduced rather than reusing R_AARCH64_TLSDESC_CALL so that a static linker can assume that the destination symbol is signed, without having to derive it from other R_AARCH64_TLSDESC_* relocations.

TLSDESC sequence

adrp x0, :tlsdesc:v             //R_AARCH64_TLSDESC_ADR_PAGE21
ldr  x1, [x0, #:tlsdesc_lo12:v] //R_AARCH64_TLSDESC_LD64_LO12
add  x0, x0, #:tlsdesc_lo12:v   //R_AARCH64_TLSDESC_ADD_LO12
.tlsdesccall v                  //R_AARCH64_TLSDESC_CALL
blr  x1

TLSDESC AUTH sequence

adrp x0, :tlsdesc_auth:v              //R_AARCH64_AUTH_TLSDESC_ADR_PAGE21
ldr  x16, [x0, #:tlsdesc_auth_lo12:v] //R_AARCH64_AUTH_TLSDESC_LD64_LO12
add  x0, x0, #:tlsdesc_auth_lo12:v    //R_AARCH64_AUTH_TLSDESC_ADD_LO12
.tlsdescauthcall v                    //R_AARCH64_AUTH_TLSDESC_CALL
blraa x16, x0

fixes #393

@kovdan01

Comment thread pauthabielf64/pauthabielf64.rst Outdated
ldr x16, [x0, :tlsdesc_auth_lo12: undefined_weak // R_AARCH64_AUTH_TLSDESC_LD64_LO12
add x0, x0 :tlsdesc_auth_lo12: undefined_weak // R_AARCH64_AUTH_TLSDESC_ADD_LO12
.tlddescauthcall undefined_weak // R_AARCH64_AUTH_TLSDESC_CALL
autia x0, x8

Contributor

#393 (comment) gives this as blraa x16, x0 instead?

Contributor Author

Thanks for spotting. Will update.

Comment thread pauthabielf64/pauthabielf64.rst Outdated
autia x0, x8

// After relaxation, assuming undefined_weak is known to be 0 at static-link time.
mov x0, #0x0

Contributor

If the resolver is returning a (potentially between distinct allocations) offset from TP, wouldn't this cause a change in behaviour from giving "NULL" to giving "TP"? At least within FreeBSD our normal AArch64 TLSDESC resolver for undefined weak symbols is to return -TP(+A) so adding it to TP gives NULL(+A). I know undefined weak TLS objects are historically very cursed and break in all kinds of ways, but this would be a regression over non-PAuth TLSDESC, I think.

Contributor

(This also loses the addend entirely)

Contributor Author

This is a good point, @kovdan01 is your TLSDESC resolver function for an undefined weak capable of returning -TP? I believe AArch64 glibc does this too for undefined weak TLS symbols.

I think this would be preferable to relaxing the sequence to 0 as currently that 0 would get added to TP, which would not result in a value of 0 for an undefined weak.
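
To make the difference concrete, here is a minimal sketch in Python that models only the 64-bit arithmetic discussed above. It is an illustration, not part of the ABI or of any resolver implementation: the function names and the thread pointer value are hypothetical, and the addend `A` is simply passed in as a number.

```python
MASK64 = (1 << 64) - 1  # AArch64 general-purpose registers are 64 bits wide


def tls_address(tp: int, resolver_result: int) -> int:
    """Model the caller adding the resolver's return value to TP."""
    return (tp + resolver_result) & MASK64


def resolver_undefined_weak(tp: int, addend: int = 0) -> int:
    """Model a resolver that returns -TP (+A) for an undefined weak symbol."""
    return (-tp + addend) & MASK64


tp = 0x0000_7FFF_F7D0_0000  # hypothetical thread pointer value

# Relaxing to `mov x0, #0`: the caller computes TP + 0, which is TP.
relaxed_to_zero = tls_address(tp, 0)

# Resolver returning -TP: the caller computes TP + (-TP), which is 0.
via_resolver = tls_address(tp, resolver_undefined_weak(tp))

# Resolver returning -TP + A: the caller computes A.
via_resolver_addend = tls_address(tp, resolver_undefined_weak(tp, addend=8))

print(hex(relaxed_to_zero), hex(via_resolver), hex(via_resolver_addend))
```

Under these assumptions only the resolver-style result gives a null address (or the bare addend) once TP has been added.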

There's always the possibility of altering the TLSDESC sequence to check for 0 before adding to TP, as 0 isn't a valid offset (the first offset is TCB size + alignment padding, which is a minimum of 16). However, this seems worse than the resolver.
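
As a rough illustration of that alternative, the Python sketch below models a sequence that treats an offset of 0 as a sentinel. It assumes a 16-byte TCB and that the first TLS offset is the TCB size rounded up to the TLS segment alignment; the function names are hypothetical and nothing here is proposed for the ABI.

```python
MASK64 = (1 << 64) - 1
TCB_SIZE = 16  # assumed TCB size in bytes


def first_tls_offset(tls_align: int) -> int:
    """Smallest TP-relative offset: TCB size rounded up to a power-of-two alignment."""
    return (TCB_SIZE + tls_align - 1) & ~(tls_align - 1)


def address_with_zero_check(tp: int, offset: int) -> int:
    """Hypothetical sequence that tests for 0 before adding the offset to TP."""
    if offset == 0:
        # 0 can never be a real offset, so use it to mean "undefined weak".
        return 0
    return (tp + offset) & MASK64


print(first_tls_offset(1), first_tls_offset(16), first_tls_offset(64))
print(hex(address_with_zero_check(0x1000, 0)))
print(hex(address_with_zero_check(0x1000, first_tls_offset(16))))
```

The cost is an extra compare and branch at every access site, which is one way to read the concern that this is worse than handling the case in the resolver.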

As an aside, I've not been able to create a TLSDESC sequence with a non-zero addend from some simple compiled code. It always seems like the TLSDESC sequence calculates the address of the symbol, and the code then does an addition, or a load with an immediate offset, instead.

Contributor

Oh, that may be true in practice, just like GOT entries typically don't actually have addends (except when turned to relative ones, of course) as it's annoying to have potentially multiple entries per symbol to track in the linker.

@kovdan01 is your TLSDESC resolver function for an undefined weak capable of returning -TP? I believe AArch64 glibc does this too for undefined weak TLS symbols.

@smithp35 Yes, glibc does this, and we match that behavior in our PoC reference musl implementation - see https://github.com/access-softek/musl/blob/v1.2.5-pauth-rev2025-11-21/src/ldso/aarch64/tlsdesc.S#L63-L70

kovdan01 added a commit to llvm/llvm-project that referenced this pull request May 25, 2026
The R_AARCH64_AUTH_TLSDESC_CALL is introduced to allow linker relaxation of
AUTH TLSDESC call sequences for non-preemptible undefined weak symbols.

The lld patch introducing the relaxation: #194636

Corresponding ARM docs PR: ARM-software/abi-aa#395
jollaitbot pushed a commit to sailfishos-mirror/llvm-project that referenced this pull request May 26, 2026
* use blraa as in commit message.
* mention that a value of 0 when added to the thread pointer is invalid.
@smithp35

smithp35 commented May 27, 2026

Contributor Author

I've fixed the typos and added a note that when 0 is added to the TP it will point at the thread control block.

Do we still need this? EDIT: yes, as on other platforms there may be other relaxations possible.

@smithp35

Contributor Author

Reading through https://www.fsfla.org/~lxoliva/writeups/TLS/RFC-TLSDESC-ARM.txt again, there is a relaxable sequence that could be used when static linking (so we know that the weak reference won't be defined). In effect, this inlines the resolver function that returns -TP:

mrs x0, TPIDR_EL0
neg x0, x0 // alias of sub x0, xzr, x0
nop
nop


.. note::

Relocation code ``R_AARCH64_AUTH_TLSDESC_CALL`` is needed to permit

Contributor Author

I think I'll move the details about the relaxation to a separate document in the design-documents folder.

I think it is a platform-specific choice whether both fields of the TLS descriptor are signed. If only the resolver function address is signed, then more relaxations are possible.

Likely to be next week before I can do that.

Contributor

Well, it's not about whether the second word of the descriptor is signed. The generated code never uses that, either it passes a pointer to it as an opaque blob to the resolver or it relaxes the entire sequence so there is no descriptor to sign? The question is whether the TLS data is being signed like globals can be, and therefore whether &tls_var - TP is the same for all threads or differs in the high bits due to signing. If the data isn't signed, you can just relax to that constant (which will be the same as non-PAuth, and is either an LE immediate or an IE run-time constant), it's only if the data is signed that IE/LE are fundamentally broken?

Contributor Author

Thanks for the comment. I'll probably just take the rationale/relaxation bits out of the main document for now until I've got the time to work through this slowly.

Reading through the initial issue again #393. I think I've put too much weight on the comment:

Broader relaxations (such as GD->IE or GD->LE with non-statically-known-NULL symbols) are not possible because pointer authentication requires signing at program start-up.

I missed the non-statically-known-NULL symbols part in my haste reading. I had been trying to reconcile why other relaxations weren't possible with the code sequence the compiler uses for TLSDESC and a signed GOT.

As an aside:

Empirically using: clang --target=aarch64-linux -march=armv8.3-a -S -O2 tlsdesc.c -o - -fptrauth-elf-got -mabi=pauthtest with a trivial __thread int x; int val() { return x; }

I get:

        pacibsp
        stp     x29, x30, [sp, #-16]!           // 16-byte Folded Spill
        mov     x29, sp
        adrp    x0, :tlsdesc_auth:x
        ldr     x16, [x0, :tlsdesc_auth_lo12:x]
        add     x0, x0, :tlsdesc_auth_lo12:x
        blraa   x16, x0
        mrs     x8, TPIDR_EL0
        ldr     w0, [x8, x0]
        ldp     x29, x30, [sp], #16             // 16-byte Folded Reload
        retab

If I'm reading that correctly, the return value from the TLS resolver function isn't signed. Nor is the value of x. It looks like the only thing that is signed is the descriptor in the GOT.

That looks relaxable in principle, although maybe not in practice to initial exec. I'd expect that if it were initial exec, the (&tls_var - TP) would be signed in the GOT. There are enough spare instructions and registers to extract the unsigned (&tls_var - TP), but there aren't enough spare instructions to test whether the authentication failed, which I believe is a requirement for -fptrauth-traps (for systems without FEAT_FPAC).

* Clarify that TLSDESC refers to the dialect
* Clarify that signing the parameter to a resolver function is
  a contract between the dynamic linker and the resolver function.
@smithp35

smithp35 commented Jun 4, 2026

Contributor Author

I've split out the commentary/rationale about TLS into a separate design document based on my understanding of how TLS is being used today, and what could be done in a different signing-schema (defined separately from the PAuthABI).

@jrtc27 jrtc27 left a comment

Contributor

Thanks, a lot clearer to me now as to what this is actually trying to achieve and how. Some comments inline.

Comment thread design-documents/pauthabi-tls.rst Outdated
dialects. The "traditional" dialect and the "descriptor" dialect. In
the traditional dialect global and local dynamic TLS use the
``R_<CLS>_TLSGD`` and ``R_<CLS>_TLSLD`` prefixed relocations. These
create a pair of GOT entries relocated by ``R_<CLS>_TLS_DTPMOD``. In

Contributor

and DTPREL

Contributor Author

ACK

Comment thread design-documents/pauthabi-tls.rst Outdated
TLS are handled the same way in both dialects.

The `PAUTHABIELF64`_ only supports the descriptor based dialect,
primarily because clang only supports the "descriptor" based dialect.

Contributor

I think a stronger reason these days is because the traditional one is legacy so a new ABI should just follow the new approach? Clang supports traditional for a bunch of architectures so adding it for AArch64 would be straightforward.

Contributor Author

ACK

Comment thread design-documents/pauthabi-tls.rst Outdated
defined for TLSDESC, but not for Initial Exec.

Local dynamic TLS does not use the GOT so it can be handled by the
``R_<CLS>_TLSLE`` prefixed relocations defined in `AAELF64`_.

Contributor

Maybe reiterate "from the base ABI" like a few paragraphs above to be clearer it's unchanged?

Contributor Author

ACK

Comment thread design-documents/pauthabi-tls.rst Outdated
signing-schema for the platform. For example a signing-schema may only
sign GOT entries containing code-pointers, which would permit Initial
Exec TLS using the ``R_<CLS>_TLSIE`` prefixed relocations defined in
`AAELF64`_. Alternatively a signing-schema may sign all GOT entries.

Contributor

"..., which would require AUTH variant static and dynamic relocations to be defined for Initial Exec"?

Contributor Author

ACK, have added that in.

Comment thread design-documents/pauthabi-tls.rst Outdated

The static linker may relax a more general TLS model to a more
constrained model when TLS variables meet the requirements for using
the constrained model, and the relaxed sequence is permitted by the

Contributor

I feel this comma hurts legibility, as the "and" relates to the "when" rather than the whole clause? At least, I had to backtrack in my head to parse it properly.

Contributor Author

Removed the comma.

Comment thread design-documents/pauthabi-tls.rst Outdated

.. code

adrp x0, :gottprel:v // R_AARCH64_AUTH_TLSIE_ADR_GOTTPREL_PAGE21 v

Contributor

For this hypothetical support would you not need auth somewhere in the modifiers(?) to get the right relocation, like :got_auth(_lo12): (presumably :gottprel_auth(_lo12):)?

Contributor Author

Yes, I've added in auth to match the other ones.

Comment thread pauthabielf64/pauthabielf64.rst Outdated
linker optimization of TLS descriptor code sequences involving
authenticated pointers, when undefined weak non-preemptible symbols
are known to resolve to 0; this can only be done if all relevant uses
of TLS descriptors are marked to permit accurate relaxation.

Contributor

This should probably have a very brief outline of the cases discussed in the design doc and refer to it, rather than only list undefined weak non-preemptible symbols as the case that can be relaxed?

Contributor Author

I've removed the part about undefined weak, thought I'd done that already but my eyes were deceiving me.

I've replaced it with a reference to the design document.

* Add DTPREL dynamic relocation for traditional dialect.
* Emphasised traditional dialect is legacy.
* Removed comma making sentence difficult to parse.
* Added auth to theoretical initial exec.
* Reference design doc from main PAuthABI, further simplifying
  the section on relaxation.

@smithp35 smithp35 left a comment

Contributor Author

Thanks very much for the review, and apologies for the delay in responding. I'll upload a new version.

Comment thread pauthabielf64/pauthabielf64.rst Outdated
Relocation code ``R_AARCH64_AUTH_TLSDESC_CALL`` is needed to permit
linker optimization of TLS descriptor code sequences involving
signed GOT entries. Further information, including possible
relaxations is available in the `PAUTHABITLS`_ design document.

Contributor

nit:

Suggested change
relaxations is available in the `PAUTHABITLS`_ design document.
relaxations, is available in the `PAUTHABITLS`_ design document.

Contributor Author

ACK, have adopted.

Development

Successfully merging this pull request may close these issues.

[PAUTHABIELF64] Introduce R_AARCH64_AUTH_TLSDESC_CALL and .tlsdescauthcall

3 participants