Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[llvm-dlltool] Fix renamed imports without a separate regular import entry #98229

Merged
merged 1 commit into from
Jul 16, 2024

Conversation

mstorsjo
Copy link
Member

@mstorsjo mstorsjo commented Jul 9, 2024

Normally, when doing renamed imports, we do this by providing a
weak alias, towards another regular import, for the symbol we
want to actually import. In a def file, this looks like this:

regularfunc
renamedfunc == regularfunc

However, if we want to link against a function in a DLL, where we
(intentionally) don't provide a regular import for that symbol
with the name in its DLL, doing the renamed import with a weak
alias doesn't work, as there's no symbol that the weak alias can
point towards.

We can't make up such an import either, as we may intentionally
not want to provide a regular import for that name.

This situation can either be resolved by using the "long" import
library format (as e.g. produced by GNU dlltool), or by using the
new short import library name type "export as".

This patch implements it by using the "export as" name type.

When producing a renamed import, defer emitting it until all regular
imports have been produced. If the renamed import refers to a
symbol that does exist as a regular import entry, produce a
weak alias, just as before. (This implementation also avoids needing
to know whether the symbol that the alias points towards actually
is prefixed or not, too.)

If the renamed import points at a symbol that isn't otherwise
available (or is available as a renamed symbol itself), generate
an "export as" import entry.

This name type is new, and is normally used in ARM64EC import
libraries, but can also be used for other architectures.

@mstorsjo mstorsjo requested a review from cjacek July 9, 2024 21:36
@mstorsjo
Copy link
Member Author

mstorsjo commented Jul 9, 2024

This goes on top of #98228, only the topmost commit is for review here.

@llvmbot
Copy link
Collaborator

llvmbot commented Jul 9, 2024

@llvm/pr-subscribers-lld-coff

@llvm/pr-subscribers-lld

Author: Martin Storsjö (mstorsjo)

Changes

Normally, when doing renamed imports, we do this by providing a
weak alias, towards another regular import, for the symbol we
want to actually import. In a def file, this looks like this:

regularfunc
renamedfunc == regularfunc

However, if we want to link against a function in a DLL, where we
(intentionally) don't provide a regular import for that symbol
with the name in its DLL, doing the renamed import with a weak
alias doesn't work, as there's no symbol that the weak alias can
point towards.

We can't make up such an import either, as we may intentionally
not want to provide a regular import for that name.

This situation can either be resolved by using the "long" import
library format (as e.g. produced by GNU dlltool), or by using the
new short import library name type "export as".

This patch implements it by using the "export as" name type.

When producing a renamed import, defer emitting it until all regular
imports have been produced. If the renamed import refers to a
symbol that does exist as a regular import entry, produce a
weak alias, just as before. (This implementation also avoids needing
to know whether the symbol that the alias points towards actually
is prefixed or not, too.)

If the renamed import points at a symbol that isn't otherwise
available (or is available as a renamed symbol itself), generate
an "export as" import entry.

This name type is new, and is normally used in ARM64EC import
libraries, but can also be used for other architectures.


Full diff: https://github.com/llvm/llvm-project/pull/98229.diff

6 Files Affected:

  • (modified) lld/test/COFF/lib.test (+8-8)
  • (modified) llvm/lib/Object/COFFImportFile.cpp (+62-14)
  • (modified) llvm/lib/Object/COFFModuleDefinition.cpp (-2)
  • (modified) llvm/test/tools/llvm-dlltool/coff-decorated.def (+17-1)
  • (modified) llvm/test/tools/llvm-dlltool/coff-weak-exports.def (+15-4)
  • (added) llvm/test/tools/llvm-dlltool/renaming.def (+39)
diff --git a/lld/test/COFF/lib.test b/lld/test/COFF/lib.test
index 7525ef4226cda..82abca6ec9307 100644
--- a/lld/test/COFF/lib.test
+++ b/lld/test/COFF/lib.test
@@ -1,6 +1,14 @@
 # RUN: lld-link /machine:x64 /def:%S/Inputs/library.def /out:%t.lib
 # RUN: llvm-nm %t.lib | FileCheck %s
 
+CHECK: 00000000 R __imp_constant
+CHECK: 00000000 R constant
+
+CHECK: 00000000 D __imp_data
+
+CHECK: 00000000 T __imp_function
+CHECK: 00000000 T function
+
 CHECK: 00000000 a @comp.id
 CHECK: 00000000 a @feat.00
 CHECK: 00000000 W alias
@@ -11,11 +19,3 @@ CHECK: 00000000 a @feat.00
 CHECK: 00000000 W __imp_alias
 CHECK:          U __imp_function
 
-CHECK: 00000000 R __imp_constant
-CHECK: 00000000 R constant
-
-CHECK: 00000000 D __imp_data
-
-CHECK: 00000000 T __imp_function
-CHECK: 00000000 T function
-
diff --git a/llvm/lib/Object/COFFImportFile.cpp b/llvm/lib/Object/COFFImportFile.cpp
index fd8aca393e90b..6922ac6994469 100644
--- a/llvm/lib/Object/COFFImportFile.cpp
+++ b/llvm/lib/Object/COFFImportFile.cpp
@@ -12,6 +12,8 @@
 
 #include "llvm/Object/COFFImportFile.h"
 #include "llvm/ADT/ArrayRef.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/StringMap.h"
 #include "llvm/ADT/Twine.h"
 #include "llvm/Object/Archive.h"
 #include "llvm/Object/ArchiveWriter.h"
@@ -52,18 +54,12 @@ StringRef COFFImportFile::getFileFormatName() const {
   }
 }
 
-StringRef COFFImportFile::getExportName() const {
-  const coff_import_header *hdr = getCOFFImportHeader();
-  StringRef name = Data.getBuffer().substr(sizeof(*hdr)).split('\0').first;
-
+static StringRef applyNameType(ImportNameType Type, StringRef name) {
   auto ltrim1 = [](StringRef s, StringRef chars) {
     return !s.empty() && chars.contains(s[0]) ? s.substr(1) : s;
   };
 
-  switch (hdr->getNameType()) {
-  case IMPORT_ORDINAL:
-    name = "";
-    break;
+  switch (Type) {
   case IMPORT_NAME_NOPREFIX:
     name = ltrim1(name, "?@_");
     break;
@@ -71,6 +67,24 @@ StringRef COFFImportFile::getExportName() const {
     name = ltrim1(name, "?@_");
     name = name.substr(0, name.find('@'));
     break;
+  default:
+    break;
+  }
+  return name;
+}
+
+StringRef COFFImportFile::getExportName() const {
+  const coff_import_header *hdr = getCOFFImportHeader();
+  StringRef name = Data.getBuffer().substr(sizeof(*hdr)).split('\0').first;
+
+  switch (hdr->getNameType()) {
+  case IMPORT_ORDINAL:
+    name = "";
+    break;
+  case IMPORT_NAME_NOPREFIX:
+  case IMPORT_NAME_UNDECORATE:
+    name = applyNameType(static_cast<ImportNameType>(hdr->getNameType()), name);
+    break;
   case IMPORT_NAME_EXPORTAS: {
     // Skip DLL name
     name = Data.getBuffer().substr(sizeof(*hdr) + name.size() + 1);
@@ -667,6 +681,13 @@ Error writeImportLibrary(StringRef ImportName, StringRef Path,
 
   auto addExports = [&](ArrayRef<COFFShortExport> Exp,
                         MachineTypes M) -> Error {
+    StringMap<std::string> RegularImports;
+    struct Deferred {
+      std::string Name;
+      ImportType ImportType;
+      const COFFShortExport *Export;
+    };
+    SmallVector<Deferred, 0> Renames;
     for (const COFFShortExport &E : Exp) {
       if (E.Private)
         continue;
@@ -690,12 +711,6 @@ Error writeImportLibrary(StringRef ImportName, StringRef Path,
         Name.swap(*ReplacedName);
       }
 
-      if (!E.ImportName.empty() && Name != E.ImportName) {
-        Members.push_back(OF.createWeakExternal(E.ImportName, Name, false, M));
-        Members.push_back(OF.createWeakExternal(E.ImportName, Name, true, M));
-        continue;
-      }
-
       ImportNameType NameType;
       std::string ExportName;
       if (E.Noname) {
@@ -703,6 +718,23 @@ Error writeImportLibrary(StringRef ImportName, StringRef Path,
       } else if (!E.ExportAs.empty()) {
         NameType = IMPORT_NAME_EXPORTAS;
         ExportName = E.ExportAs;
+      } else if (!E.ImportName.empty()) {
+        if (Machine == IMAGE_FILE_MACHINE_I386 &&
+            applyNameType(IMPORT_NAME_UNDECORATE, Name) == E.ImportName)
+          NameType = IMPORT_NAME_UNDECORATE;
+        else if (Machine == IMAGE_FILE_MACHINE_I386 &&
+                 applyNameType(IMPORT_NAME_NOPREFIX, Name) == E.ImportName)
+          NameType = IMPORT_NAME_NOPREFIX;
+        else if (Name == E.ImportName)
+          NameType = IMPORT_NAME;
+        else {
+          Deferred D;
+          D.Name = Name;
+          D.ImportType = ImportType;
+          D.Export = &E;
+          Renames.push_back(D);
+          continue;
+        }
       } else {
         NameType = getNameType(SymbolName, E.Name, M, MinGW);
       }
@@ -722,9 +754,25 @@ Error writeImportLibrary(StringRef ImportName, StringRef Path,
         }
       }
 
+      RegularImports[applyNameType(NameType, Name)] = Name;
       Members.push_back(OF.createShortImport(Name, E.Ordinal, ImportType,
                                              NameType, ExportName, M));
     }
+    for (const auto &D : Renames) {
+      auto It = RegularImports.find(D.Export->ImportName);
+      if (It != RegularImports.end()) {
+        // We have a regular import entry for a symbol with the name we
+        // want to reference; produce an alias pointing at that.
+        StringRef Symbol = It->second;
+        if (D.ImportType == IMPORT_CODE)
+          Members.push_back(OF.createWeakExternal(Symbol, D.Name, false, M));
+        Members.push_back(OF.createWeakExternal(Symbol, D.Name, true, M));
+      } else {
+        Members.push_back(OF.createShortImport(
+            D.Name, D.Export->Ordinal, D.ImportType, IMPORT_NAME_EXPORTAS,
+            D.Export->ImportName, M));
+      }
+    }
     return Error::success();
   };
 
diff --git a/llvm/lib/Object/COFFModuleDefinition.cpp b/llvm/lib/Object/COFFModuleDefinition.cpp
index 0c0bef1319e44..82c18539658e8 100644
--- a/llvm/lib/Object/COFFModuleDefinition.cpp
+++ b/llvm/lib/Object/COFFModuleDefinition.cpp
@@ -282,8 +282,6 @@ class Parser {
       if (Tok.K == EqualEqual) {
         read();
         E.ImportName = std::string(Tok.Value);
-        if (AddUnderscores && !isDecorated(E.ImportName, MingwDef))
-          E.ImportName = std::string("_").append(E.ImportName);
         continue;
       }
       // EXPORTAS must be at the end of export definition
diff --git a/llvm/test/tools/llvm-dlltool/coff-decorated.def b/llvm/test/tools/llvm-dlltool/coff-decorated.def
index fc81f23d09d6c..f5685fb1cf0c6 100644
--- a/llvm/test/tools/llvm-dlltool/coff-decorated.def
+++ b/llvm/test/tools/llvm-dlltool/coff-decorated.def
@@ -7,12 +7,16 @@ EXPORTS
 CdeclFunction
 StdcallFunction@4
 @FastcallFunction@4
-StdcallAlias@4==StdcallFunction@4
+StdcallAlias@4==StdcallFunction
 ??_7exception@@6B@
 StdcallExportName@4=StdcallInternalFunction@4
 OtherStdcallExportName@4=CdeclInternalFunction
 CdeclExportName=StdcallInternalFunction@4
 
+NoprefixStdcall@4 == NoprefixStdcall@4
+DecoratedStdcall@4 == _DecoratedStdcall@4
+UndecoratedStdcall@4 == UndecoratedStdcall
+
 ; CHECK:      Name type: noprefix
 ; CHECK-NEXT: Export name: CdeclFunction
 ; CHECK-NEXT: Symbol: __imp__CdeclFunction
@@ -43,3 +47,15 @@ CdeclExportName=StdcallInternalFunction@4
 ; CHECK-NEXT: Export name: CdeclExportName
 ; CHECK-NEXT: Symbol: __imp__CdeclExportName
 ; CHECK-NEXT: Symbol: _CdeclExportName
+; CHECK:      Name type: noprefix
+; CHECK-NEXT: Export name: NoprefixStdcall@4
+; CHECK-NEXT: Symbol: __imp__NoprefixStdcall@4
+; CHECK-NEXT: Symbol: _NoprefixStdcall@4
+; CHECK:      Name type: name
+; CHECK-NEXT: Export name: _DecoratedStdcall@4
+; CHECK-NEXT: Symbol: __imp__DecoratedStdcall@4
+; CHECK-NEXT: Symbol: _DecoratedStdcall@4
+; CHECK:      Name type: undecorate
+; CHECK-NEXT: Export name: UndecoratedStdcall
+; CHECK-NEXT: Symbol: __imp__UndecoratedStdcall@4
+; CHECK-NEXT: Symbol: _UndecoratedStdcall@4
diff --git a/llvm/test/tools/llvm-dlltool/coff-weak-exports.def b/llvm/test/tools/llvm-dlltool/coff-weak-exports.def
index dacc5f73530fd..b08040e29fa42 100644
--- a/llvm/test/tools/llvm-dlltool/coff-weak-exports.def
+++ b/llvm/test/tools/llvm-dlltool/coff-weak-exports.def
@@ -5,7 +5,11 @@
 
 LIBRARY test.dll
 EXPORTS
+AltTestFunction
+AltTestFunction2
+AltTestData
 TestFunction==AltTestFunction
+TestData DATA == AltTestData
 ; When creating an import library, the DLL internal function name of
 ; the implementation of a function isn't visible at all.
 ImpLibName = Implementation
@@ -16,18 +20,25 @@ ImpLibName2 = Implementation2 == AltTestFunction2
 ; matter for the import library
 ImpLibName3 = kernel32.Sleep
 
+; CHECK:      T AltTestFunction
+; CHECK-NEXT: T __imp_AltTestFunction
+; CHECK:      T AltTestFunction2
+; CHECK-NEXT: T __imp_AltTestFunction2
+; CHECK:      T ImpLibName
+; CHECK-NEXT: T __imp_ImpLibName
+; CHECK:      T ImpLibName3
+; CHECK-NEXT: T __imp_ImpLibName3
 ; CHECK:      U AltTestFunction
 ; CHECK-NEXT: W TestFunction
 ; CHECK:      U __imp_AltTestFunction
 ; CHECK-NEXT: W __imp_TestFunction
-; CHECK:      T ImpLibName
-; CHECK-NEXT: T __imp_ImpLibName
+; CHECK-NOT:  W TestData
+; CHECK:      U __imp_AltTestData
+; CHECK-NEXT: W __imp_TestData
 ; CHECK:      U AltTestFunction2
 ; CHECK-NEXT: W ImpLibName2
 ; CHECK:      U __imp_AltTestFunction2
 ; CHECK-NEXT: W __imp_ImpLibName2
-; CHECK:      T ImpLibName3
-; CHECK-NEXT: T __imp_ImpLibName3
 
 ; ARCH-NOT: unknown arch
 
diff --git a/llvm/test/tools/llvm-dlltool/renaming.def b/llvm/test/tools/llvm-dlltool/renaming.def
new file mode 100644
index 0000000000000..57fd472aa37cf
--- /dev/null
+++ b/llvm/test/tools/llvm-dlltool/renaming.def
@@ -0,0 +1,39 @@
+; RUN: llvm-dlltool -k -m i386 --input-def %s --output-lib %t.a
+; RUN: llvm-readobj %t.a | FileCheck %s
+; RUN: llvm-nm %t.a | FileCheck %s -check-prefix=CHECK-NM
+
+LIBRARY test.dll
+EXPORTS
+
+symbolname == actualimport
+
+dataname DATA == actualdata
+
+_wcstok == wcstok
+wcstok == wcstok_s
+
+; CHECK-NM-NOT: actualimport
+; CHECK-NM-NOT: actualdata
+
+; CHECK:      Type: code
+; CHECK-NEXT: Name type: export as
+; CHECK-NEXT: Export name: actualimport
+; CHECK-NEXT: Symbol: __imp__symbolname
+; CHECK-NEXT: Symbol: _symbolname
+
+; CHECK:      Type: data
+; CHECK-NEXT: Name type: export as
+; CHECK-NEXT: Export name: actualdata
+; CHECK-NEXT: Symbol: __imp__dataname
+
+; CHECK:      Type: code
+; CHECK-NEXT: Name type: export as
+; CHECK-NEXT: Export name: wcstok
+; CHECK-NEXT: Symbol: __imp___wcstok
+; CHECK-NEXT: Symbol: __wcstok
+
+; CHECK:      Type: code
+; CHECK-NEXT: Name type: export as
+; CHECK-NEXT: Export name: wcstok_s
+; CHECK-NEXT: Symbol: __imp__wcstok
+; CHECK-NEXT: Symbol: _wcstok

@mstorsjo
Copy link
Member Author

Thanks for the review! For this one, I'd like to bring up a couple implementation details for discussion still.

I'm deferring outputting the renamed entries to after procesing the whole list of symbols. This isn't entirely ideal - it changes the order of the members. I considered other ways around this too:

  • We could do a first run through the symbols, without emitting them, just gathering the mapping of regular imports that we'll produce. This would avoid deferring outputting of renamed symbols - but would need to split up the whole loop in a pretty messy way.
  • We could emit renamed symbols directly, if we know that the alias target exists already. This could keep the member order mostly intact, but would cause a bit more code duplication.

I chose this approach, which requires the least amount of churn in the code.

Finally - we have a block of arm64ec specific code here. I'm not really familiar enough with the arm64ec cases to feel confident about this - how would the cases of renamed imports (as they're used in mingw-w64) behave for an arm64ec build from the same def files, vs the arm64ec logic here?

@cjacek
Copy link
Contributor

cjacek commented Jul 11, 2024

Changing the order doesn't seem too risky for me.

ARM64EC is a bit more problematic as technically we should also emit additional _imp_aux* aliases and both mangled and demangled thunk symbols. However, on ARM64EC, we don't need to worry about the compatibility and we could just always use EXPORTAS instead. This is a preexisting problem and I think that your changes don't make it any harder to fix, I may take care of that.

@mstorsjo
Copy link
Member Author

Thanks!

FWIW, we (or I) should probably look into adding support for the EXPORTAS name type in ld.bfd - ld.bfd does support linking against short import libraries. In practice, these cases of renames are only used in the base mingw-w64-crt import libraries, and ld.bfd can't link against them as things stand anyway (it fails with the use of weak aliases for renames), so taking this into use here shouldn't make anything worse wrt compatibility there anyway.

Copy link

github-actions bot commented Jul 14, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

…entry

Normally, when doing renamed imports, we do this by providing a
weak alias, towards another regular import, for the symbol we
want to actually import. In a def file, this looks like this:

    regularfunc
    renamedfunc == regularfunc

However, if we want to link against a function in a DLL, where we
(intentionally) don't provide a regular import for that symbol
with the name in its DLL, doing the renamed import with a weak
alias doesn't work, as there's no symbol that the weak alias can
point towards.

We can't make up such an import either, as we may intentionally
not want to provide a regular import for that name.

This situation can either be resolved by using the "long" import
library format (as e.g. produced by GNU dlltool), or by using the
new short import library name type "export as".

This patch implements it by using the "export as" name type.

When producing a renamed import, defer emitting it until all regular
imports have been produced. If the renamed import refers to a
symbol that does exist as a regular import entry, produce a
weak alias, just as before. (This implementation also avoids needing
to know whether the symbol that the alias points towards actually
is prefixed or not, too.)

If the renamed import points at a symbol that isn't otherwise
available (or is available as a renamed symbol itself), generate
an "export as" import entry.

This name type is new, and is normally used in ARM64EC import
libraries, but can also be used for other architectures.
@mstorsjo mstorsjo merged commit f2d6d74 into llvm:main Jul 16, 2024
4 of 5 checks passed
@mstorsjo mstorsjo deleted the dlltool-aliases branch July 16, 2024 20:19
sgundapa pushed a commit to sgundapa/upstream_effort that referenced this pull request Jul 23, 2024
…entry (llvm#98229)

Normally, when doing renamed imports, we do this by providing a
weak alias, towards another regular import, for the symbol we
want to actually import. In a def file, this looks like this:

    regularfunc
    renamedfunc == regularfunc

However, if we want to link against a function in a DLL, where we
(intentionally) don't provide a regular import for that symbol
with the name in its DLL, doing the renamed import with a weak
alias doesn't work, as there's no symbol that the weak alias can
point towards.

We can't make up such an import either, as we may intentionally
not want to provide a regular import for that name.

This situation can either be resolved by using the "long" import
library format (as e.g. produced by GNU dlltool), or by using the
new short import library name type "export as".

This patch implements it by using the "export as" name type.

When producing a renamed import, defer emitting it until all regular
imports have been produced. If the renamed import refers to a
symbol that does exist as a regular import entry, produce a
weak alias, just as before. (This implementation also avoids needing
to know whether the symbol that the alias points towards actually
is prefixed or not, too.)

If the renamed import points at a symbol that isn't otherwise
available (or is available as a renamed symbol itself), generate
an "export as" import entry.

This name type is new, and is normally used in ARM64EC import
libraries, but can also be used for other architectures.
yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
…entry (#98229)

Summary:
Normally, when doing renamed imports, we do this by providing a
weak alias, towards another regular import, for the symbol we
want to actually import. In a def file, this looks like this:

    regularfunc
    renamedfunc == regularfunc

However, if we want to link against a function in a DLL, where we
(intentionally) don't provide a regular import for that symbol
with the name in its DLL, doing the renamed import with a weak
alias doesn't work, as there's no symbol that the weak alias can
point towards.

We can't make up such an import either, as we may intentionally
not want to provide a regular import for that name.

This situation can either be resolved by using the "long" import
library format (as e.g. produced by GNU dlltool), or by using the
new short import library name type "export as".

This patch implements it by using the "export as" name type.

When producing a renamed import, defer emitting it until all regular
imports have been produced. If the renamed import refers to a
symbol that does exist as a regular import entry, produce a
weak alias, just as before. (This implementation also avoids needing
to know whether the symbol that the alias points towards actually
is prefixed or not, too.)

If the renamed import points at a symbol that isn't otherwise
available (or is available as a renamed symbol itself), generate
an "export as" import entry.

This name type is new, and is normally used in ARM64EC import
libraries, but can also be used for other architectures.

Test Plan: 

Reviewers: 

Subscribers: 

Tasks: 

Tags: 


Differential Revision: https://phabricator.intern.facebook.com/D60251620
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants