Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add evm disassembly support. #3595

Closed
wants to merge 2 commits into from
Closed

Add evm disassembly support. #3595

wants to merge 2 commits into from

Conversation

gogo2464
Copy link
Contributor

Your checklist for this pull request

  • I've read the guidelines for contributing to this repository
  • I made sure to follow the project's coding style
  • I've documented or updated the documentation of every function and struct this PR changes. If not so I've explained why.
  • I've added tests that prove my fix is effective or that my feature works (if possible)
  • I've updated the rizin book with the relevant information (if needed)

Detailed description

Test plan

At this moment, there is mostly only the disassembly plugin that is under work.
Based on dead code. I will remove the dead code.
I have not added enough testing yet. I need to add more before merging.
I will add the plugin commit diff in the rizin book about how to code plugin with meson.

Closing issues
...

@gogo2464
Copy link
Contributor Author

Please do not review dead code.

I have issues at this line because 2 lines are printed:
6061 =>

PUSH1 0x61
PUSH2

instead of only PUSH1 0x61

What could I do please?

Copy link
Member

@ret2libc ret2libc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you pass 6061, evm_op[0x60].type is GB_8BIT, thus gbOpLength(evm_op[0x60].type) returns 1, meaning that the disassembled instruction has length 1. However, given that PUSH1 has an argument specified in the next byte, it should have a length 2. That's why the 61 after the 60 is interpreted as its own instruction.

librz/asm/arch/evm/evmdis.c Outdated Show resolved Hide resolved
@gogo2464
Copy link
Contributor Author

The testing allows me to find a bug on CALLDATACOPY lol

@gogo2464
Copy link
Contributor Author

Using tsting I realized:

-PUSH3 0x010203
+PUSH3 0x123

What should I do in your opinion @ret2libc or what else ?

@gogo2464
Copy link
Contributor Author

I have made the testing from:

echo 000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f404142434445464748494a4b4c4d4e4f505152535455565758595a5b5c5d5e5f606162636465666768696a6b6c6d6e6f707172737475767778797a7b7c7d7e7f808182838485868788898a8b8c8d8e8f909192939495969798999a9b9c9d9e9fa0a1a2a3a4a5a6a7a8a9aaabacadaeafb0b1b2b3b4b5b6b7b8b9babbbcbdbebfc0c1c2c3c4c5c6c7c8c9cacbcccdcecfd0d1d2d3d4d5d6d7d8d9dadbdcdddedfe0e1e2e3e4e5e6e7e8e9eaebecedeeeff0f1f2f3f4f5f6f7f8f9fafbfcfdfeff | evmasm -d

librz/asm/arch/evm/evmdis.c Outdated Show resolved Hide resolved
librz/asm/arch/evm/evmdis.c Outdated Show resolved Hide resolved
@XVilka
Copy link
Member

XVilka commented Jun 23, 2023

@gogo2464 have you seen this? #3368

@gogo2464
Copy link
Contributor Author

@gogo2464 have you seen this? #3368

I have never ever seen this... this is terrible!!! I am going to check which version is better and where for both.

I think my disassembler is more tested and then more accurate than the old one. See my testing file. I also need to write my program for a school project. Merging it is not mandatory but could be a good plus.

Need your opinion.

@gogo2464
Copy link
Contributor Author

The analysis part of the other plan sounds good as well.

@gogo2464
Copy link
Contributor Author

I suggest to keep my disassemble plugin but keep his analysis plugin.

@gogo2464
Copy link
Contributor Author

@XVilka

@gogo2464
Copy link
Contributor Author

My code is very very well tested. There is no missing instruction .

librz/asm/arch/evm/evmasm.c Outdated Show resolved Hide resolved
Comment on lines 162 to 161
instruction_size = 5;
strncpy(out, buf+1, instruction_size-1);
rz_hex_bin2str(out, instruction_size-1, opcode);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if len is less than instruction_size ? this will create an out of bound read. you need to add checks to verify you have enough buffer before running rz_hex_bin2str and if there is not, you should return an invalid instruction.

@gogo2464
Copy link
Contributor Author

sounds good to me

@gogo2464 gogo2464 marked this pull request as ready for review July 19, 2023 12:34
@gogo2464
Copy link
Contributor Author

All is very very finely tested. May be ready for a merge.

All the instructions can be assembled/disassembled except PUSH32 that assembles with:

7f0102030405060708090102030405060708090102030405060708090102030421

instead of

7f0102030405060708090102030405060708090102030405060708090102030405

You may help me or you may merge directly.

librz/asm/arch/evm/evmasm.c Outdated Show resolved Hide resolved
@gogo2464
Copy link
Contributor Author

Could you merge before the 21 at 4pm london utc please? It could be a great plus for a school project. Any comment is welcome before this date.

Comment on lines +19 to +25
if (!strcmp("stop", buf)) {
opbuf[0] = 0x00;
} else if (!strcmp("add", buf)) {
opbuf[0] = 0x01;
} else if (!strcmp("sub", buf)) {
opbuf[0] = 0x03;
} else if (!strcmp("div", buf)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i strongly suggest to make a structure with these bytes, it is easier to read and you can loop easily.

Comment on lines +144 to +151
char out[100];
int number = atoi(buf+4);
switch (number) {
case 1:
len = 2;
opbuf[0] = 0x60;
break;
case 2:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can optimize this out by using a structure like suggested before which contains name, length, and the first byte.

Comment on lines +279 to +291
for (int j = 8, i = 0; i < len-1; j += 2, i ++) {
two_chars_str[0] = buf[j];
two_chars_str[1] = buf[j+1];
two_chars_str[2] = '\0';
opbuf[i+1] = strtol(two_chars_str, NULL, 16);
}
} else if (number > 9) {
for (int j = 9, i = 0; i < len-1; j += 2, i ++) {
two_chars_str[0] = buf[j];
two_chars_str[1] = buf[j+1];
two_chars_str[2] = '\0';
opbuf[i+1] = strtol(two_chars_str, NULL, 16);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are methods in rizin to convert hex to bytes. also you don't really need to use strtol for doing that.

#include <string.h>
#include <stdlib.h>

static int evmAsm(RzAsm *a, RzAsmOp *op, const char *buf) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You always assume that buf contains the correct information, but what happens if its length is less than expected? based on your code, many out of bound reads. you should always check for boundaries.

const unsigned char opcode[256];
switch (buf[0]) {
case 0x00:
rz_str_cpy(buf_asm, "STOP") break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need to use rz_str_cpy when you have to push data into RzStrBuf.
just do rz_strbuf_set for static strings and rz_strbuf_appendf for adding extra info to the string.

Comment on lines +219 to +220
instruction_size = 21;
rz_hex_bin2str(buf+1, instruction_size-1, opcode);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to always check if the buffer is big enough. you cannot just read it as is and assume it will always be.

test/db/asm/evm Outdated Show resolved Hide resolved
@@ -0,0 +1,359 @@
// SPDX-FileCopyrightText: 2023 gogo <[email protected]>
// SPDX-FileCopyrightText: 2012-2020 pancake <[email protected]>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove pancake. you wrote this, he didnt.

@@ -0,0 +1,374 @@
// SPDX-FileCopyrightText: 2023 gogo <[email protected]>
// SPDX-FileCopyrightText: 2013-2018 pancake <[email protected]>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove pancake, you wrote this, not him.

Copy link
Member

@wargio wargio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check always if the buffer is long enough. this code has many security vulnerabilities.

@gogo2464
Copy link
Contributor Author

I will review this week.

@gogo2464
Copy link
Contributor Author

sure let me fix some issues.

@XVilka
Copy link
Member

XVilka commented Jan 11, 2024

I am closing this because of no activity. Please open a new one if you decide to continue.

@XVilka XVilka closed this Jan 11, 2024
@gogo2464
Copy link
Contributor Author

@XVilka I understand. I was confused with the table. Can I see this month if I will continue and rebase it please?

@wargio
Copy link
Member

wargio commented Jan 12, 2024

yes sure

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants