Added Apple Silicon Mac support #164

Closed
wants to merge 27 commits into from
Commits
c7328c8
Fixed clang as assembly
cielavenir Nov 21, 2020
d6ec9e6
fixed clang as build
cielavenir Nov 21, 2020
e904c34
Fixed addressing assembly
cielavenir Nov 24, 2020
afa64d0
It should be fine to enable pmull always on Apple Silicon
cielavenir Nov 24, 2020
6b59dac
Fixed assembly (compared with objdump)
cielavenir Nov 25, 2020
75115c8
Merge branch 'fix_clang_as' into fix_mach
cielavenir Nov 25, 2020
d68f604
Fix typo, thanks @yuhaoth
cielavenir Dec 5, 2020
ecf1e81
Changed the conditional macro to __APPLE__
cielavenir Dec 6, 2020
84d2132
Rewritten dispatcher using sysctlbyname
cielavenir Dec 6, 2020
30b9639
Use __USER_LABEL_PREFIX__
cielavenir May 10, 2021
df8eb0a
Use __TEXT,__const as readonly section
cielavenir May 10, 2021
22211f8
Merge remote-tracking branch 'origin/master' into HEAD
cielavenir May 10, 2021
5eab6c1
Fixed erasure_code build on mach
cielavenir May 10, 2021
599615e
Fix indent
cielavenir May 10, 2021
cd25b1f
Merge commit '642ef36' into fix_mach
cielavenir Mar 6, 2022
2d7dd04
Merge remote-tracking branch 'origin/master' into fix_mach
cielavenir Mar 6, 2022
6875af9
Reworked on dispatcher
cielavenir Mar 6, 2022
e31c00f
fix func decl
cielavenir Mar 6, 2022
3646af7
use ASM_DEF_RODATA macro
cielavenir Mar 6, 2022
5bce07f
add comment
cielavenir Mar 9, 2022
b878e6d
Merge remote-tracking branch 'origin/master' into fix_mach
cielavenir Jul 20, 2022
acd48c0
fix ASM_DEF_RODATA include
cielavenir Jul 22, 2022
855112d
Merge remote-tracking branch 'ciel/fix_mach' into HEAD
cielavenir Jul 22, 2022
225b6bd
Fix q_fold_const load
cielavenir Jul 22, 2022
de6af92
Merge remote-tracking branch 'ciel/master' into fix_mach
cielavenir Oct 27, 2022
825d080
fix fold_constant decl
cielavenir Oct 27, 2022
33a7a42
fixed another fold_constant decl
cielavenir Oct 27, 2022
22 changes: 22 additions & 0 deletions crc/aarch64/crc_aarch64_dispatcher.c
@@ -34,6 +34,8 @@ DEFINE_INTERFACE_DISPATCHER(crc16_t10dif)
unsigned long auxval = getauxval(AT_HWCAP);
if (auxval & HWCAP_PMULL)
return PROVIDER_INFO(crc16_t10dif_pmull);
#elif defined(__aarch64__)
Contributor

@yuhaoth yuhaoth Dec 4, 2020

My suggestion is to add it like this:

    #ifndef __MACH__
        unsigned long auxval = getauxval(AT_HWCAP);
        if (auxval & HWCAP_PMULL)
            return PROVIDER_INFO(crc16_t10dif_pmull);
        return PROVIDER_BASIC(crc16_t10dif);
    #else
        return PROVIDER_INFO(crc16_t10dif_pmull);
    #endif

One more thing I need to confirm with you: if the transparent layer can be removed, I think this file should not be compiled on Apple Silicon Macs.
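For readers following the thread, the dispatch pattern being discussed can be sketched in a self-contained form. This is only an illustration under stated assumptions, not the library's code: `crc16_fn`, `crc16_basic`, `crc16_fast`, and `select_crc16` are hypothetical names, and `crc16_fast` merely forwards to the portable routine where a real build would call a PMULL assembly kernel. The Linux branch uses `getauxval(AT_HWCAP)` as in the diff; the Apple branch hard-codes the fast path, mirroring the assumption that every Apple aarch64 CPU has PMULL.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>   /* getauxval, AT_HWCAP */
#include <asm/hwcap.h>  /* HWCAP_PMULL */
#endif

/* Hypothetical function-pointer type for this sketch. */
typedef uint16_t (*crc16_fn)(uint16_t seed, const uint8_t *buf, size_t len);

/* Portable bitwise CRC-16/T10-DIF (polynomial 0x8BB7, MSB first). */
static uint16_t crc16_basic(uint16_t seed, const uint8_t *buf, size_t len)
{
    uint16_t crc = seed;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)buf[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000)
                crc = (uint16_t)((crc << 1) ^ 0x8BB7);
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/* Stand-in for an accelerated kernel (assumption: a real build would
 * call PMULL assembly here; this sketch just reuses the basic routine). */
static uint16_t crc16_fast(uint16_t seed, const uint8_t *buf, size_t len)
{
    return crc16_basic(seed, buf, len);
}

/* Pick an implementation: runtime check on Linux, compile-time on Apple. */
static crc16_fn select_crc16(void)
{
#if defined(__linux__) && defined(__aarch64__)
    unsigned long auxval = getauxval(AT_HWCAP);
    if (auxval & HWCAP_PMULL)
        return crc16_fast;
#elif defined(__APPLE__) && defined(__aarch64__)
    return crc16_fast;  /* assumes PMULL is always present */
#endif
    return crc16_basic;
}
```

A caller would resolve the pointer once, for example `crc16_fn f = select_crc16();`, and reuse it for every buffer.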

Contributor Author

@cielavenir cielavenir Dec 5, 2020

I may not be answering the right question, but removing __MACH__ leaves getauxval undefined.

Contributor Author

@cielavenir cielavenir Dec 6, 2020

I have rewritten the dispatchers.

Contributor Author

  • NEON is part of the aarch64 spec.
  • My very first assumption was that Apple would never sell an aarch64 CPU without pmull. In that case I don't need a dispatcher.

Under those conditions, removing 12575f5 works around the "aarch64_multibinary.h issue".

return PROVIDER_INFO(crc16_t10dif_pmull);
#endif
return PROVIDER_BASIC(crc16_t10dif);

@@ -45,6 +47,8 @@ DEFINE_INTERFACE_DISPATCHER(crc16_t10dif_copy)
unsigned long auxval = getauxval(AT_HWCAP);
if (auxval & HWCAP_PMULL)
return PROVIDER_INFO(crc16_t10dif_copy_pmull);
#elif defined(__aarch64__)
return PROVIDER_INFO(crc16_t10dif_copy_pmull);
#endif
return PROVIDER_BASIC(crc16_t10dif_copy);

@@ -57,6 +61,8 @@ DEFINE_INTERFACE_DISPATCHER(crc32_ieee)
if (auxval & HWCAP_PMULL) {
return PROVIDER_INFO(crc32_ieee_norm_pmull);
}
#elif defined(__aarch64__)
return PROVIDER_INFO(crc32_ieee_norm_pmull);
#endif
return PROVIDER_BASIC(crc32_ieee);

@@ -81,6 +87,8 @@ DEFINE_INTERFACE_DISPATCHER(crc32_iscsi)
if (auxval & HWCAP_PMULL) {
return PROVIDER_INFO(crc32_iscsi_refl_pmull);
}
#elif defined(__aarch64__)
return PROVIDER_INFO(crc32_iscsi_refl_pmull);
Contributor

@yuhaoth yuhaoth Dec 4, 2020

This function might not be the best choice. As far as I know, crc32_iscsi_crc_ext or crc32_iscsi_3crc_fold should be a better choice.

You can measure the real performance and pick the best one.
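One way to compare candidates is a small timing harness along the following lines. This is a rough sketch, not the project's benchmark: `crc32_fn`, `crc32c_bitwise`, and `seconds_per_call` are hypothetical names, the bitwise CRC32C routine is only a stand-in for the real candidates, and `clock()` measures CPU time rather than wall time. Each result is fed into the next call so the compiler cannot drop the loop.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical function-pointer type for the candidates under test. */
typedef uint32_t (*crc32_fn)(uint32_t seed, const uint8_t *buf, size_t len);

/* Stand-in candidate: bitwise reflected CRC32C (polynomial 0x82F63B78),
 * with the conventional pre- and post-inversion of the CRC value. */
static uint32_t crc32c_bitwise(uint32_t seed, const uint8_t *buf, size_t len)
{
    uint32_t crc = ~seed;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Average CPU seconds per call of fn over iters runs on the same buffer.
 * The accumulated value is written to *sink to keep the work observable. */
static double seconds_per_call(crc32_fn fn, const uint8_t *buf, size_t len,
                               int iters, uint32_t *sink)
{
    uint32_t acc = 0;
    clock_t start = clock();
    for (int i = 0; i < iters; i++)
        acc ^= fn(acc, buf, len);
    clock_t end = clock();
    *sink = acc;
    return (double)(end - start) / CLOCKS_PER_SEC / iters;
}
```

Running `seconds_per_call` once per candidate on the target machine, with the same buffer and iteration count, gives numbers that can be compared directly. Results from one chip may not carry over to another.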

Contributor Author

Well, I'm sorry.

Although I asked an acquaintance of mine to test the same binary on an ARM Mac (it worked, so "it does support ARM Mac"), my primary machine is an Intel Mac and my test environment is an iPad (jailbroken, with sshd enabled).

Also, at least the crc32 instruction caused SIGILL on my iPad (from libslz).

Worse, as far as I know, there is no runtime CPU feature detection API on Darwin. That's why I set the dispatcher to a middle-range choice...

Contributor Author

It seems the undocumented _get_cpu_capabilities can be used as a "runtime cpu feature detection API".

Now my concern is https://developer.apple.com/documentation/xcode/writing_arm64_code_for_apple_platforms, which says: "The platforms reserve register x18. Don’t use this register." Allowing the CRC32 instruction could lead into this code path...
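A later commit in this PR is titled "Rewritten dispatcher using sysctlbyname". A minimal sketch of that kind of runtime check follows. It is not the PR's code. The helper names `darwin_sysctl_flag`, `darwin_has_pmull`, and `darwin_has_crc32` are hypothetical. The keys `hw.optional.arm.FEAT_PMULL` and `hw.optional.armv8_crc32` are assumed to be present, which depends on the OS version. A key that is missing is treated as "feature absent". On non-Apple platforms the helper always returns 0.

```c
#include <stddef.h>
#ifdef __APPLE__
#include <sys/types.h>
#include <sys/sysctl.h>  /* sysctlbyname */
#endif

/* Returns 1 if the named sysctl key exists and holds a non-zero int,
 * 0 otherwise (including on platforms without sysctlbyname). */
static int darwin_sysctl_flag(const char *name)
{
#ifdef __APPLE__
    int value = 0;
    size_t len = sizeof(value);
    if (sysctlbyname(name, &value, &len, NULL, 0) != 0)
        return 0;  /* key missing or query failed: assume absent */
    return value != 0;
#else
    (void)name;
    return 0;
#endif
}

/* Assumed key names; availability depends on the OS release. */
static int darwin_has_pmull(void)
{
    return darwin_sysctl_flag("hw.optional.arm.FEAT_PMULL");
}

static int darwin_has_crc32(void)
{
    return darwin_sysctl_flag("hw.optional.armv8_crc32");
}
```

A dispatcher could call `darwin_has_crc32()` before returning a CRC-instruction provider and fall back to the pmull or basic provider otherwise, which avoids the SIGILL case described above.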

Contributor

The x18 problem should be fixed. I will raise an issue later.

It also looks like runtime CPU feature detection should be added for Apple. As far as I know, the M1 (Mac mini, arm64) is available now. I guess it supports more features.

Could you review aarch64_multibinary.h? I am not sure whether anything in it conflicts with the Apple spec.

#endif
return PROVIDER_BASIC(crc32_iscsi);

@@ -105,6 +113,8 @@ DEFINE_INTERFACE_DISPATCHER(crc32_gzip_refl)

if (auxval & HWCAP_PMULL)
return PROVIDER_INFO(crc32_gzip_refl_pmull);
#elif defined(__aarch64__)
Contributor

Same as the comment above: crc32_gzip_refl_crc_ext and crc32_gzip_refl_3crc_fold are better choices.

return PROVIDER_INFO(crc32_gzip_refl_pmull);
#endif
return PROVIDER_BASIC(crc32_gzip_refl);

@@ -117,6 +127,8 @@ DEFINE_INTERFACE_DISPATCHER(crc64_ecma_refl)

if (auxval & HWCAP_PMULL)
return PROVIDER_INFO(crc64_ecma_refl_pmull);
#elif defined(__aarch64__)
cielavenir marked this conversation as resolved.
return PROVIDER_INFO(crc64_ecma_refl_pmull);
#endif
return PROVIDER_BASIC(crc64_ecma_refl);

@@ -128,6 +140,8 @@ DEFINE_INTERFACE_DISPATCHER(crc64_ecma_norm)
unsigned long auxval = getauxval(AT_HWCAP);
if (auxval & HWCAP_PMULL)
return PROVIDER_INFO(crc64_ecma_norm_pmull);
#elif defined(__aarch64__)
return PROVIDER_INFO(crc64_ecma_norm_pmull);
#endif
return PROVIDER_BASIC(crc64_ecma_norm);

@@ -139,6 +153,8 @@ DEFINE_INTERFACE_DISPATCHER(crc64_iso_refl)
unsigned long auxval = getauxval(AT_HWCAP);
if (auxval & HWCAP_PMULL)
return PROVIDER_INFO(crc64_iso_refl_pmull);
#elif defined(__aarch64__)
return PROVIDER_INFO(crc64_iso_refl_pmull);
#endif
return PROVIDER_BASIC(crc64_iso_refl);

@@ -150,6 +166,8 @@ DEFINE_INTERFACE_DISPATCHER(crc64_iso_norm)
unsigned long auxval = getauxval(AT_HWCAP);
if (auxval & HWCAP_PMULL)
return PROVIDER_INFO(crc64_iso_norm_pmull);
#elif defined(__aarch64__)
return PROVIDER_INFO(crc64_iso_norm_pmull);
#endif
return PROVIDER_BASIC(crc64_iso_norm);

@@ -161,6 +179,8 @@ DEFINE_INTERFACE_DISPATCHER(crc64_jones_refl)
unsigned long auxval = getauxval(AT_HWCAP);
if (auxval & HWCAP_PMULL)
return PROVIDER_INFO(crc64_jones_refl_pmull);
#elif defined(__aarch64__)
return PROVIDER_INFO(crc64_jones_refl_pmull);
#endif
return PROVIDER_BASIC(crc64_jones_refl);

@@ -172,6 +192,8 @@ DEFINE_INTERFACE_DISPATCHER(crc64_jones_norm)
unsigned long auxval = getauxval(AT_HWCAP);
if (auxval & HWCAP_PMULL)
return PROVIDER_INFO(crc64_jones_norm_pmull);
#elif defined(__aarch64__)
return PROVIDER_INFO(crc64_jones_norm_pmull);
#endif
return PROVIDER_BASIC(crc64_jones_norm);
