Clarify some interning details#158642
Conversation
| fn alloc_import(&'ra self, import: ImportData<'ra>) -> Import<'ra> { | ||
| // FIXME: `Interned::new_unchecked`'s argument must be unique (e.g. via interning) but it's | ||
| // not clear that this value is unique! And yet, somehow things seem to work ok. Perhaps | ||
| // the IDs and/or the spans within `import` guarantee uniqueness? |
There was a problem hiding this comment.
There isn't much to fix here (and in the other 3 cases), different Imports are unique by definition, even if they have the same data.
The address of the ImportData on the arena is the ID.
In theory we could introduce some separate ImportId(u32)s and add them to ImportData, and alloc_import would generate a fresh ImportId every time, but why doing that if the pointer itself is enough.
Previously arena allocated data in the resolver didn't even use Interned, it used a different type named PtrKey that simply provided pointer-like comparison and hashing for references.
But then PtrKey was merged into Interned (maybe by you actually), despite slightly different requirements.
Upd, relevant PR:
There was a problem hiding this comment.
different Imports are unique by definition, even if they have the same data.
Can you expand on that? I don't understand.
If you write a SAFETY: <comment> explanation to replace the FIXME comments I'm happy to change them.
There was a problem hiding this comment.
Suppose DeclData, for example, loses all the auxiliary information and is left only with
struct DeclData {
res: Res,
span: Span,
}Then you can end up with two different declarations in different modules having the same DeclData (the spans can always be dummy).
But they are still different unique declarations, and the DeclData address on the arena is used as their unique ID, and they can therefore be compared by address.
It can be represented as "desugaring"
struct DeclData {
id: *const () = /* address of self */, // this would make the data unique too
res: Res,
span: Span,
}, except the address is not kept explicitly.
(In C++ maps keyed on pointers like this are used right and left, including in LLVM.)
There was a problem hiding this comment.
Thanks for the info. I never really understood this properly. I have adjusted the PR significantly to account for it.
We could consider renaming Interned back to PtrKey, to better reflect that it covers more than just the interning case. But I haven't done that yet.
de8a386 to
b097cca
Compare
|
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
This comment has been minimized.
This comment has been minimized.
`Interned` does pointer equality/hashing, which is valid in two cases. - The values are guaranteed to be unique (e.g. via interning, or construction). This is how `rustc_middle` uses `Interned`. - The type has "identity" and different values should be considered distinct even if they are identical. This is how `rustc_resolve` uses `Interned`. PR 137202 tried to clarify things by adding a `T: Hash` constraint to `Interned<'a, T>`. This constraint isn't actually used, because `Interned` is hashed based on pointer value, not contents. But it was intended to communicate the idea that a type stored in `Interned` is actually interned, which is likely to be done with hashing. Panicking impls of `Hash` were added for the relevant `rustc_resolve` types to work around the fact that it doesn't use hashing-based interning. In my opinion PR 137202 didn't improve things. The `T: Hash` constraint is only aimed at the interning case, and even for that case it's not quite right because you could use a `BTreeMap` to intern instead of a `HashMap`. This commit does several things. - Removes the `T: Hash` constraint and the `Hash` impls for `rustc_resolve` types added in PR 137202. - Improves the comments on `Interned` to cover the non-interning cases. - Removes the `PartialOrd`/`Ord` impls on `Interned` because (a) they're not used, and (b) their meaning is unclear for the "identity" case. - Improves the documentation in `rustc_resolve` to explain how `Interned` usage is valid there.
b097cca to
fdfba67
Compare
Details in the commit message.
r? @Mark-Simulacrum