RefactorGuide.md

Refactor guide

This is a step-by-step overview of how to refactor an architecture module.

It can also be used to add a new architecture module, as long as the architecture is supported by LLVM or a fork of it.

Please always contact us in the Auto-Sync tracking issue before working on a module. We can provide support and save you a lot of time.

Don't hesitate to ask any questions in our Telegram Community channel.

This applies especially if you feel stuck or struggle to understand where an issue comes from. The update process, although already simplified, is still relatively complex.

Refactoring

Note:

  • If we talk about C++ files in the steps below, we always refer to the files in the LLVM repo.

  • PrinterCapstone is the class defined in llvm-capstone/llvm/utils/TableGen/PrinterCapstone.cpp

  • Always attempt to make the translated C file behave as closely as possible to the original C++ file! This greatly helps debugging and ensures that Capstone behaves almost exactly like the original LLVM.

  • Prepare

    • Read CONTRIBUTING.md
    • Read docs/ARCHITECTURE.md
    • Read suite/auto-sync/README.md
    • Read suite/auto-sync/ARCHITECTURE.md
    • Read suite/auto-sync/intro.md
      • Delete all files in arch/<ARCH>/, except the ARCHModule.* and ARCHMapping.*.
      • cd suite/auto-sync/
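The cleanup step in the list above can be scripted. The following is a minimal sketch and not part of the auto-sync tooling: `clean_arch_dir` is a hypothetical helper, and it assumes the files to keep are named exactly `<ARCH>Module.*` and `<ARCH>Mapping.*`. It defaults to a dry run so the list can be reviewed before anything is deleted.

```python
from pathlib import Path


def clean_arch_dir(arch_dir, arch, dry_run=True):
    """Return the files in arch_dir that are not <arch>Module.* or <arch>Mapping.*.

    With dry_run=False the returned files are also deleted.
    Hypothetical helper for illustration; not part of ASUpdater.
    """
    keep_prefixes = (f"{arch}Module.", f"{arch}Mapping.")
    to_delete = []
    for path in sorted(Path(arch_dir).iterdir()):
        # Only plain files are considered; subdirectories are left alone.
        if path.is_file() and not path.name.startswith(keep_prefixes):
            to_delete.append(path)
    if not dry_run:
        for path in to_delete:
            path.unlink()
    return to_delete
```

Call it once with the default `dry_run=True` to inspect the result, then again with `dry_run=False` once the list looks right.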
  • Generate inc files

    • pip install -e .
    • Clone and build llvm-tblgen (see docs)
    • Quickly check the options of the updater: ASUpdater -h
      • Add the arch name in Target.py
      • In llvm-capstone, handle the arch in PrinterCapstone.cpp::decoderEmitterEmitFieldFromInstruction() (add the decoder function)
      • Generate: ASUpdater -s IncGen -a ARCH
        • Errors? Check if the error message tells you what to do. If no hint exists, ask us.
      • Check if inc files in build look good.
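A quick sanity check of the generated inc files can catch a failed run early. The sketch below is an illustration only: both functions are hypothetical helpers, the build directory is whatever path the generated files were written to on your machine, and an empty file is merely a heuristic for a failed generation. It does not replace reading the files.

```python
from pathlib import Path


def summarize_inc_files(build_dir):
    """Map each .inc file name found under build_dir to its size in bytes.

    If two files in different subdirectories share a name, the later one wins.
    """
    return {p.name: p.stat().st_size for p in sorted(Path(build_dir).rglob("*.inc"))}


def empty_inc_files(build_dir):
    """Return the names of .inc files that are empty (a likely sign of a failed run)."""
    return [name for name, size in summarize_inc_files(build_dir).items() if size == 0]
```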
  • Translation and Patching

    • Check for template functions in <ARCH>InstPrinter.cpp and <ARCH>Disassembler.cpp
    • Add a new config entry to arch_config.json by copying an existing one (see LoongArch for a minimal example).
      • Don't forget to add ARCHInstPrinter.cpp to the list of the AddCSDetail tests!
    • Add at least <ARCH>InstPrinter.cpp, <ARCH>InstPrinter.h and <ARCH>Disassembler.cpp to the translation list.
      • Tip: The variables used in there are defined in path_vars.json
    • Add architecture-specific includes in Patches/Includes.py. To start, copy the code from another architecture.
    • Prepare API header (<arch>.h) for patching:
      • Check the generated inc files. File names like <ARCH>GenCS<something>Enum.inc contain enumerations for the header. Those get patched into the main header file of the architecture.
      • Remove old values and add // generated content <...> begin comments for patching. Check out loongarch.h as an example.
    • Commit all changes so far.
    • The next step will write to the arch/ and include/capstone/<arch>.h header!
      • Run generation, translation and copy/patch the files: ASUpdater -a <ARCH> -w --copy-translated -s IncGen Translate PatchArchHeader
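To illustrate what the marker comments in the API header are for, here is a minimal sketch of marker-based patching. It is not the implementation used by ASUpdater. The `begin` marker text follows the comment format named in the steps above, the matching `end` marker is assumed by analogy, and `patch_generated_block` as well as the file name used in the usage note are hypothetical.

```python
def patch_generated_block(header_text, inc_name, new_body):
    """Replace the lines between the begin/end markers for inc_name.

    The marker lines themselves are kept; everything between them is
    replaced by new_body. Raises ValueError if a marker is missing.
    """
    begin = f"// generated content <{inc_name}> begin"
    end = f"// generated content <{inc_name}> end"  # assumed counterpart of "begin"
    lines = header_text.splitlines()
    stripped = [line.strip() for line in lines]
    if begin not in stripped or end not in stripped:
        raise ValueError(f"markers for {inc_name} not found")
    start = stripped.index(begin)
    stop = stripped.index(end, start + 1)  # the end marker must follow the begin marker
    patched = lines[: start + 1] + new_body.splitlines() + lines[stop:]
    return "\n".join(patched) + "\n"
```

For example, `patch_generated_block(header, "XGenCSRegEnum.inc", new_enum_values)` would swap the old enum values between the two markers for the freshly generated ones. This is why old values have to be removed and the markers added before the patching step runs.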
  • Clean up

    • Check: All necessary files

      • Arch header:
        • Invalid characters in enum identifiers? Replace the character in PrinterCapstone::normalizedMnemonic
      • In arch/<ARCH>
        • Missing identifiers/symbols? -> Check if they are somewhere in the generated files. If yes, include them and update Includes.py. If not, you have to find the LLVM source file where they are defined and add it to arch_config.json to translate it.
          • Or the code needs the SystemOperands.inc file. It can be generated by adding the arch to the list in inc_gen.json.
      • Note: When you start the next step, you likely don't want to generate, translate and copy files again, because your hand-made fixes would get overwritten. So ensure you no longer use the -w flag for the ASUpdater and that you have checked thoroughly that all necessary files got translated!
    • Commit to save changes so far.
    • Remove and fix C++ syntax

      • Remove all obviously irrelevant C++ code from the translated files (e.g. class initializers)
      • Double check non-obvious cases to see whether they are important. Remember: removing something might lead to bugs later!
        • If in doubt, ask us.
      • If you fix the same syntax over and over again, consider adding a Patch for the CppTranslator.
      • Common problems:
        • Missing namespace prefix: unsigned GR32Regs[] should be unsigned ARCH_GR32Regs[]. See namespace begin/end comments in the code.
      • TODO: Add more.
        • If in doubt, check the original C++ file in the LLVM repo.
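The missing namespace prefix is a typical candidate for automation when it shows up repeatedly. The sketch below is a plain text substitution meant only to illustrate the fix. It is not how CppTranslator patches are written, `add_namespace_prefix` is a hypothetical helper, and because it works on raw text it also rewrites matches inside comments and string literals.

```python
import re


def add_namespace_prefix(c_source, arch, identifiers):
    """Prefix each whole-word occurrence of the given identifiers with '<arch>_'."""
    if not identifiers:
        return c_source
    # \b keeps longer names (GR32RegsExtra) and already prefixed names
    # (ARCH_GR32Regs) untouched, because '_' and letters are word characters.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, identifiers)) + r")\b")
    return pattern.sub(lambda m: f"{arch}_{m.group(1)}", c_source)
```

Running it twice gives the same result as running it once, so it is safe to re-apply after further manual edits.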
  • Make it build

    • Add ARCHLinkage.h and the functions in ARCHInstPrinter.c and ARCHDisassembler.c.
    • Add essential code in ARCHMapping.c. Essential is everything not related to details.
    • If you are unsure how to connect Capstone and LLVM code, always check LoongArch first. If LoongArch doesn't handle this case, check Mips or SystemZ.
  • Run tests & Fixing bugs

    • Update regression MC tests: Map LLVM mattr and mcpu names to the CS identifiers if necessary. -> Edit the mcupdater.json config file.
    • Update tests: ASUpdater -s MCUpdate -a Arch -w
      • Run MC tests: cstest tests/MC/Arch
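The mapping from LLVM mattr and mcpu names to CS identifiers is, conceptually, a lookup table. The sketch below only illustrates that idea: it does not reproduce the schema of mcupdater.json (check the file itself for the real format), `map_llvm_options` is a hypothetical helper, and all names in the usage note are placeholders.

```python
def map_llvm_options(llvm_options, mapping):
    """Translate LLVM mattr/mcpu names to Capstone identifiers.

    Returns (mapped, unmapped). Names without an entry in mapping are
    collected separately so they can be reviewed instead of silently dropped.
    """
    mapped, unmapped = [], []
    for name in llvm_options:
        target = mapping.get(name)
        if target is None:
            unmapped.append(name)
        elif target not in mapped:
            # Several LLVM names may map to the same Capstone identifier.
            mapped.append(target)
    return mapped, unmapped
```

With a placeholder table such as `{"feature-a": "CS_MODE_EXAMPLE_A"}`, any LLVM name that ends up in the `unmapped` list points to an entry that may still be missing from the config.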
  • Add details

    • Essentially, copy the behavior of LoongArchMapping.c or SystemZMapping.c, but change the values.
    • Changes to the API (structs in arch.h) are only allowed if the API was wrong before. Otherwise, only extensions are allowed.
    • Don't forget to update the Python bindings.
    • Run detail tests to check results.
    • Run detail tests with coverage. ArchMapping.c should have close to 100% coverage.