-
Notifications
You must be signed in to change notification settings - Fork 1
/
xtimes.cortex_a72_rpi4b.txt
235 lines (223 loc) · 10.3 KB
/
xtimes.cortex_a72_rpi4b.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
[H[2J[3J
=================================================
Relative Execution Times of FXP operations
=================================================
Operating System: Ubuntu 22.04.4 LTS
Architecture: arm64
Model : Raspberry Pi 4 Model B Rev 1.4
System details:
Type sizes on this system (some might depend on compiler options):
char has a size of 1 bytes.
short has a size of 2 bytes.
int has a size of 4 bytes.
long has a size of 8 bytes.
long long has a size of 8 bytes.
float has a size of 4 bytes.
double has a size of 8 bytes.
long double has a size of 16 bytes.
Number of frac bits: 8
10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Using only ints:
add : 1.00
mul : 11.37
div : 19.41
lg2 : 38.93 (BKM-L, only ints)
ln : 165.38 (using lg2)
lg10 : 165.17 (using lg2)
pow2 : 102.81 (BKM-E, only ints)
exp : 109.07 (about 1.06x pow2, using pow2)
pow10 : 96.43 (about 0.94x pow2, using pow2)
sqrt_alt : 218.03 (about 2.12x pow2, using lg2 & pow2)
sqrt : 114.02 (about 0.52x sqrt_alt, using CORDIC)
powxy : 295.64 (about 2.88x pow2, using lg2 & pow2)
Using longs:
mul_l : 2.27
div_l : 1.95
lg2_l : 8.30 (about 0.21x lg2, using BKM-L and longs)
lg2_mul_l : 24.20 (about 0.62x lg2, using mult. and longs)
ln_l : 31.24 (about 0.80x lg2, using lg2_l)
lg10_l : 31.22 (about 0.80x lg2, using lg2_l)
pow2_l : 21.05 (about 0.20x pow2, using BKM-E and longs)
exp_l : 24.27 (about 0.24x pow2, using pow2_l)
pow10_l : 21.95 (about 0.21x pow2, using pow2_l)
sqrt_alt_l : 41.28 (about 0.19x sqrt_alt, using lg2_l & pow2_l)
sqrt_l : 17.18 (about 0.15x sqrt, using CORDIC and longs)
powxy_l : 58.88 (about 0.57x pow2, using lg2_l & pow2_l)
cossin_l : 11.68 (using CORDIC)
Number of frac bits: 12
10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Using only ints:
add : 1.00
mul : 11.76
div : 20.41
lg2 : 56.47 (BKM-L, only ints)
ln : 168.65 (using lg2)
lg10 : 168.38 (using lg2)
pow2 : 102.75 (BKM-E, only ints)
exp : 107.58 (about 1.05x pow2, using pow2)
pow10 : 95.70 (about 0.93x pow2, using pow2)
sqrt_alt : 228.75 (about 2.23x pow2, using lg2 & pow2)
sqrt : 122.35 (about 0.53x sqrt_alt, using CORDIC)
powxy : 300.13 (about 2.92x pow2, using lg2 & pow2)
Using longs:
mul_l : 2.24
div_l : 1.95
lg2_l : 11.29 (about 0.20x lg2, using BKM-L and longs)
lg2_mul_l : 36.67 (about 0.65x lg2, using mult. and longs)
ln_l : 31.78 (about 0.56x lg2, using lg2_l)
lg10_l : 31.76 (about 0.56x lg2, using lg2_l)
pow2_l : 20.91 (about 0.20x pow2, using BKM-E and longs)
exp_l : 24.05 (about 0.23x pow2, using pow2_l)
pow10_l : 21.84 (about 0.21x pow2, using pow2_l)
sqrt_alt_l : 43.21 (about 0.19x sqrt_alt, using lg2_l & pow2_l)
sqrt_l : 17.98 (about 0.15x sqrt, using CORDIC and longs)
powxy_l : 59.18 (about 0.58x pow2, using lg2_l & pow2_l)
cossin_l : 14.59 (using CORDIC)
Number of frac bits: 16
10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Using only ints:
add : 1.00
mul : 12.27
div : 21.83
lg2 : 74.91 (BKM-L, only ints)
ln : 169.79 (using lg2)
lg10 : 169.57 (using lg2)
pow2 : 94.08 (BKM-E, only ints)
exp : 98.74 (about 1.05x pow2, using pow2)
pow10 : 92.27 (about 0.98x pow2, using pow2)
sqrt_alt : 237.33 (about 2.52x pow2, using lg2 & pow2)
sqrt : 129.68 (about 0.55x sqrt_alt, using CORDIC)
powxy : 300.57 (about 3.19x pow2, using lg2 & pow2)
Using longs:
mul_l : 2.27
div_l : 1.99
lg2_l : 14.34 (about 0.19x lg2, using BKM-L and longs)
lg2_mul_l : 49.16 (about 0.66x lg2, using mult. and longs)
ln_l : 32.19 (about 0.43x lg2, using lg2_l)
lg10_l : 32.19 (about 0.43x lg2, using lg2_l)
pow2_l : 19.45 (about 0.21x pow2, using BKM-E and longs)
exp_l : 22.48 (about 0.24x pow2, using pow2_l)
pow10_l : 21.18 (about 0.23x pow2, using pow2_l)
sqrt_alt_l : 44.94 (about 0.19x sqrt_alt, using lg2_l & pow2_l)
sqrt_l : 18.63 (about 0.14x sqrt, using CORDIC and longs)
powxy_l : 59.50 (about 0.63x pow2, using lg2_l & pow2_l)
cossin_l : 17.49 (using CORDIC)
Number of frac bits: 20
10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Using only ints:
add : 1.00
mul : 12.97
div : 18.48
lg2 : 92.59 (BKM-L, only ints)
ln : 169.88 (using lg2)
lg10 : 169.59 (using lg2)
pow2 : 89.44 (BKM-E, only ints)
exp : 98.81 (about 1.10x pow2, using pow2)
pow10 : 93.31 (about 1.04x pow2, using pow2)
sqrt_alt : 244.60 (about 2.73x pow2, using lg2 & pow2)
sqrt : 153.21 (about 0.63x sqrt_alt, using CORDIC)
powxy : 300.81 (about 3.36x pow2, using lg2 & pow2)
Using longs:
mul_l : 2.28
div_l : 1.99
lg2_l : 17.29 (about 0.19x lg2, using BKM-L and longs)
lg2_mul_l : 61.69 (about 0.67x lg2, using mult. and longs)
ln_l : 32.16 (about 0.35x lg2, using lg2_l)
lg10_l : 32.13 (about 0.35x lg2, using lg2_l)
pow2_l : 18.49 (about 0.21x pow2, using BKM-E and longs)
exp_l : 22.43 (about 0.25x pow2, using pow2_l)
pow10_l : 21.39 (about 0.24x pow2, using pow2_l)
sqrt_alt_l : 46.13 (about 0.19x sqrt_alt, using lg2_l & pow2_l)
sqrt_l : 20.66 (about 0.13x sqrt, using CORDIC and longs)
powxy_l : 59.70 (about 0.67x pow2, using lg2_l & pow2_l)
cossin_l : 20.53 (using CORDIC)
Number of frac bits: 24
10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Using only ints:
add : 1.00
mul : 13.19
div : 17.16
lg2 : 110.95 (BKM-L, only ints)
ln : 170.03 (using lg2)
lg10 : 169.81 (using lg2)
pow2 : 86.75 (BKM-E, only ints)
exp : 96.30 (about 1.11x pow2, using pow2)
pow10 : 91.10 (about 1.05x pow2, using pow2)
sqrt_alt : 252.37 (about 2.91x pow2, using lg2 & pow2)
sqrt : 153.23 (about 0.61x sqrt_alt, using CORDIC)
powxy : 299.87 (about 3.46x pow2, using lg2 & pow2)
Using longs:
mul_l : 2.28
div_l : 1.98
lg2_l : 20.33 (about 0.18x lg2, using BKM-L and longs)
lg2_mul_l : 74.33 (about 0.67x lg2, using mult. and longs)
ln_l : 32.26 (about 0.29x lg2, using lg2_l)
lg10_l : 32.23 (about 0.29x lg2, using lg2_l)
pow2_l : 18.10 (about 0.21x pow2, using BKM-E and longs)
exp_l : 21.95 (about 0.25x pow2, using pow2_l)
pow10_l : 21.00 (about 0.24x pow2, using pow2_l)
sqrt_alt_l : 47.61 (about 0.19x sqrt_alt, using lg2_l & pow2_l)
sqrt_l : 20.59 (about 0.13x sqrt, using CORDIC and longs)
powxy_l : 59.60 (about 0.69x pow2, using lg2_l & pow2_l)
cossin_l : 23.47 (using CORDIC)
Number of frac bits: 28
10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Using only ints:
add : 1.00
mul : 13.43
div : 16.47
lg2 : 129.69 (BKM-L, only ints)
ln : 170.72 (using lg2)
lg10 : 170.42 (using lg2)
pow2 : 96.43 (BKM-E, only ints)
exp : 100.88 (about 1.05x pow2, using pow2)
pow10 : 92.64 (about 0.96x pow2, using pow2)
sqrt_alt : 260.04 (about 2.70x pow2, using lg2 & pow2)
sqrt : 161.11 (about 0.62x sqrt_alt, using CORDIC)
powxy : 302.47 (about 3.14x pow2, using lg2 & pow2)
Using longs:
mul_l : 2.29
div_l : 1.99
lg2_l : 23.39 (about 0.18x lg2, using BKM-L and longs)
lg2_mul_l : 87.60 (about 0.68x lg2, using mult. and longs)
ln_l : 32.36 (about 0.25x lg2, using lg2_l)
lg10_l : 32.34 (about 0.25x lg2, using lg2_l)
pow2_l : 19.84 (about 0.21x pow2, using BKM-E and longs)
exp_l : 22.90 (about 0.24x pow2, using pow2_l)
pow10_l : 21.23 (about 0.22x pow2, using pow2_l)
sqrt_alt_l : 49.12 (about 0.19x sqrt_alt, using lg2_l & pow2_l)
sqrt_l : 21.14 (about 0.13x sqrt, using CORDIC and longs)
powxy_l : 59.91 (about 0.62x pow2, using lg2_l & pow2_l)
cossin_l : 26.53 (using CORDIC)
=================================================
Relative Xtime averages for frac bit configurations {8, 12, 16, 20, 24, 28}
=================================================
Using only ints:
add : 1.00 ( 3.43x system's native addition of ints)
mul : 12.50 ( 43.26x system's native multiplication of ints)
div : 18.96 ( 48.34x system's native division of ints)
lg2 : 83.94 (BKM, only ints)
ln : 169.08 (about 2.01x lg2, using lg2)
lg10 : 168.82 (about 2.01x lg2, using lg2)
pow2 : 95.37 (BKM, only ints)
exp : 101.89 (about 1.07x pow2, using pow2)
pow10 : 93.57 (about 0.98x pow2, using pow2)
sqrt_alt : 240.19 (about 2.52x pow2, using lg2 & pow2)
sqrt : 138.95 (about 0.58x sqrt_alt, using CORDIC and longs)
powxy : 299.92 (about 3.14x pow2, using lg2 & pow2)
Using longs:
mul_l : 2.27 ( 7.86x system's native multiplication of ints)
div_l : 1.98 ( 5.04x system's native division of ints)
lg2_l : 15.83 (about 0.19x lg2, using BKM and longs)
lg2_mul_l : 55.62 (about 0.66x lg2, using mult. and longs)
ln_l : 32.00 (about 0.38x lg2, using lg2_l)
lg10_l : 31.98 (about 0.38x lg2, using lg2_l)
pow2_l : 19.64 (about 0.21x pow2, using BKM and longs)
exp_l : 23.01 (about 0.24x pow2, using pow2_l)
pow10_l : 21.43 (about 0.22x pow2, using pow2_l)
sqrt_alt_l : 45.38 (about 0.19x sqrt_alt, using lg2_l & pow2_l)
sqrt_l : 19.37 (about 0.14x sqrt, using CORDIC and longs)
powxy_l : 59.46 (about 0.62x pow2, using lg2_l & pow2_l)
cossin_l : 19.05 (using CORDIC)
=================================================
(Keep in mind: compiler optimization options used/not used can affect these measurements significantly.)