Implement full pipeline #16

Closed
Jacarte opened this issue Sep 20, 2019 · 25 comments

@Jacarte
Collaborator

Jacarte commented Sep 20, 2019

Implement full pipeline tool

We have the inner pieces and the vulnerabilities to test it. We can start to create the full pipeline

wasm -> souper IR -> candidate 1 -> LLVM IR -> wasm -> run_test_suite(), validate(), ...
                  -> candidate 2   
                  -> ...
                  -> candidate n
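
A minimal Python sketch of the fan-out above, assuming each stage is supplied as a plain callable. The names `explore`, `to_wasm`, and `checks` are hypothetical and only illustrate the shape of the loop; they are not part of souper, binaryen, or LLVM.

```python
from typing import Callable, Iterable, List


def explore(candidates: Iterable[str],
            to_wasm: Callable[[str], bytes],
            checks: List[Callable[[bytes], bool]]) -> List[str]:
    """Return the candidates whose rebuilt wasm passes every check.

    candidates: Souper IR candidates (text), one per variant.
    to_wasm:    hypothetical stage: candidate -> LLVM IR -> wasm binary.
    checks:     hypothetical validators, e.g. a test-suite runner.
    """
    kept = []
    for cand in candidates:
        binary = to_wasm(cand)
        # A candidate survives only if every check accepts its binary.
        if all(check(binary) for check in checks):
            kept.append(cand)
    return kept
```

Keeping the stages as parameters lets each step (conversion, test suite, validation) be swapped out while the tools behind it are still being decided.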
@jianguda
Collaborator

  1. *.wat (WASM text format) → *.o (WASM binary file)
  2. *.o (WASM binary file) → *.bc (LLVM bitcode file)
  3. *.bc (LLVM bitcode file) → *.opt (souper candidates)
  4. *.opt (souper candidates) → *.ll (LLVM IR)
  5. *.ll (LLVM IR) → *.o (WASM binary file)

@Jacarte
Collaborator Author

Jacarte commented Sep 20, 2019

Perfect!

@Jacarte
Collaborator Author

Jacarte commented Sep 20, 2019

Hi @shrin18, we have point 2, right?

@jianguda
Collaborator

Hey @Jacarte
Maybe we could replace the first two steps with the following three steps:

  1. use wabt to support .wat -> .wasm -> .c, following this tutorial
  2. use clang to support .c -> .ll, following this link
  3. use llvm-as to support .ll -> .bc following this document

It would be much better if there were a way to go directly from .wat/.wasm to .ll/.bc.
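
The three-step alternative above could be scripted roughly as follows. This is a sketch under assumptions: it uses `wat2wasm` and `wasm2c` from wabt, `clang -S -emit-llvm`, and `llvm-as`; the function name `via_c_commands` and the file names are placeholders; and the C file produced by `wasm2c` may need an include path for wabt's `wasm-rt.h`, which is passed in as an optional, hypothetical parameter.

```python
from pathlib import Path
from typing import List, Optional


def via_c_commands(wat: str, wasm_rt_include: Optional[str] = None) -> List[List[str]]:
    """Build the commands for .wat -> .wasm -> .c -> .ll -> .bc."""
    stem = Path(wat).with_suffix("")
    wasm, c, ll, bc = (f"{stem}{ext}" for ext in (".wasm", ".c", ".ll", ".bc"))
    clang = ["clang", "-S", "-emit-llvm", c, "-o", ll]
    if wasm_rt_include:
        # Assumption: the generated C needs wabt's runtime header directory.
        clang[1:1] = ["-I", wasm_rt_include]
    return [
        ["wat2wasm", wat, "-o", wasm],   # step 1a: text -> binary
        ["wasm2c", wasm, "-o", c],       # step 1b: binary -> C
        clang,                           # step 2: C -> LLVM IR
        ["llvm-as", ll, "-o", bc],       # step 3: LLVM IR -> bitcode
    ]
```

Each command list can be passed to `subprocess.run(cmd, check=True)` in order.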

@shrin18
Contributor

shrin18 commented Sep 26, 2019

> Hi @shrin18, We have the point 2. right?

Hi @Jacarte, as I understand it, last week we jumped from step 1 to step 3. The .wast programs in the benchmarks can currently be compiled directly, so we can eliminate the LLVM part as discussed. Also, souper candidates come in two formats: a human-readable textual format (in the S-language of WASM) and a pure binary format.

wasm-opt --flatten --simplify-locals-nonesting --souperify input.wasm

This command seems to accept a wasm file as input, so I used wabt to convert wast to wasm. Does that seem logical? @Jacarte @jianguda

@jianguda
Collaborator

Hey @shrin18 Great!
Then our steps should be:

  1. .wat → .wasm (by wat2wasm from wabt)
    wat2wasm xxx.wat -o xxx.wasm

  2. .wasm → .opt (by wasm-opt from binaryen)
    wasm-opt --flatten --simplify-locals-nonesting --souperify xxx.wasm

  3. .opt → .ll (based on utils/souper2llvm.in from souper)
    scripts

  4. .ll → .bc (by llvm-as from llvm)
    llvm-as xxx.ll

  5. .bc → .opt (based on souper from souper)
    scripts

  6. .opt → .ll (same as step 3)


If we need an assembly file:
.ll → .s (by llc from llvm)

$ llc -march=wasm32 -filetype=asm xxx.ll

Or, if we need a binary file:
.ll → .o (by llc from llvm)

$ llc -march=wasm32 -filetype=obj xxx.ll
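
The tool-backed steps above (1, 2, 4, and the final llc call) could be wrapped in Python roughly as follows. This is only a sketch: the function names are hypothetical, the `run` parameter exists so the commands can be checked without the tools installed, and it assumes that `wasm-opt --souperify` prints the candidates to standard output. Steps 3 and 5 depend on the scripts mentioned above and are left out.

```python
import subprocess


def wat_to_wasm(wat_path, wasm_path, run=subprocess.run):
    """Step 1: .wat -> .wasm with wabt."""
    run(["wat2wasm", wat_path, "-o", wasm_path], check=True)


def souper_candidates(wasm_path, run=subprocess.run):
    """Step 2: .wasm -> souper candidates, returned as text."""
    result = run(
        ["wasm-opt", "--flatten", "--simplify-locals-nonesting",
         "--souperify", wasm_path],
        check=True, capture_output=True, text=True,
    )
    # Assumption: the candidates are written to stdout.
    return result.stdout


def assemble_ll(ll_path, run=subprocess.run):
    """Step 4: .ll -> .bc with llvm-as."""
    run(["llvm-as", ll_path], check=True)


def lower_ll(ll_path, filetype="obj", run=subprocess.run):
    """.ll -> .s ("asm") or .o ("obj") with llc for wasm32."""
    if filetype not in ("asm", "obj"):
        raise ValueError("filetype must be 'asm' or 'obj'")
    run(["llc", "-march=wasm32", f"-filetype={filetype}", ll_path], check=True)
```

The text returned by `souper_candidates` can be written to an .opt file before it is handed to the step 3 scripts.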

WDYT @Jacarte @shrin18

@jianguda
Collaborator

jianguda commented Oct 1, 2019

#19

@jianguda
Collaborator

jianguda commented Oct 1, 2019

The current solution for .opt → .ll is not very reliable. I think the ideal approach is to improve souper2llvm.py gradually, based on our wast files.

This is a related discussion.

@jianguda
Collaborator

jianguda commented Oct 8, 2019

Another feasible solution is to reuse some of the implementation inside wasmer; I opened an issue there.

@jianguda
Collaborator

jianguda commented Oct 8, 2019

Right now, step2 .wasm → .opt might have issues. Also, step3 .opt → .ll needs improvements.

My current plan is to study wasmer to directly support .wasm → .ll, then we only need to improve souper2llvm.py gradually.

@Jacarte
Collaborator Author

Jacarte commented Oct 8, 2019

> Right now, step2 .wasm → .opt might have issues. Also, step3 .opt → .ll needs improvements.
>
> My current plan is to study wasmer to directly support .wasm → .ll, then we only need to improve souper2llvm.py gradually.

What about improving wasm-opt in binaryen?

@jianguda
Collaborator

jianguda commented Oct 8, 2019

> > Right now, step2 .wasm → .opt might have issues. Also, step3 .opt → .ll needs improvements.
> > My current plan is to study wasmer to directly support .wasm → .ll, then we only need to improve souper2llvm.py gradually.
>
> What about improving wasm-opt in binaryen?

For wasm-opt, I am trying to track down this error: '%3' defined with type 'i1' but expected 'i32'.

@shrin18
Contributor

shrin18 commented Oct 10, 2019

We are trying to get valid souper candidates in .opt files. We take .wasm or .wat files as input, and the pipeline should produce valid candidates for each of them. However, the tool reports unexpected syntax errors for the .wasm programs, and the error differs from program to program. On the other hand, the LLVM steps do produce binary output, so we can narrow the problem down to wasm-opt.

@jianguda
Collaborator

jianguda commented Oct 10, 2019

For example, we have one file named "fib.opt":

; function: $0

; start LHS (in $0)
%0:i32 = var
%1 = slt %0, 2:i32
infer %1


; start LHS (in $0)
%0:i32 = var
%1 = sub %0, 2:i32
infer %1


; start LHS (in $0)
%0:i32 = var
%1 = sub %0, 1:i32
infer %1


; start LHS (in $0)
%0:i32 = var
%1:i32 = var
%2 = add %0, %1
infer %2

We run souper/build/souper2llvm fib.opt > fib.ll and get this error message:

  File "../../souper/build/souper2llvm", line 850, in <module>
    insts = parseInsts(lines)
  File "../../souper/build/souper2llvm", line 246, in parseInsts
    res.append(parseInst(line))
  File "../../souper/build/souper2llvm", line 135, in parseInst
    assert len(tmp) == 2, "wrong reg length %d, %s" % (len(tmp), tmp)
AssertionError: wrong reg length 1, ['%1']

In fact, each defined register needs valid width information as its type, for example %1:i32 = sub %0, 2:i32 rather than %1 = sub %0, 2:i32.

We could fix this by improving souper2llvm.in, but we are looking for a more proper way to do the conversion.
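
To see which instructions are affected, a small check like the one below can be run on a .opt file before handing it to souper2llvm. It is a sketch, not part of souper: the helper names are hypothetical, and it assumes that candidates are delimited by `; start LHS` comments as in fib.opt and that a typed definition looks like `%1:i32 = ...`.

```python
import re
from typing import List


def split_candidates(opt_text: str) -> List[List[str]]:
    """Split .opt text into candidates, one list of instructions each."""
    candidates: List[List[str]] = []
    current = None
    for raw in opt_text.splitlines():
        line = raw.strip()
        if line.startswith("; start LHS"):
            # Each "; start LHS" comment opens a new candidate.
            current = []
            candidates.append(current)
        elif line and not line.startswith(";") and current is not None:
            current.append(line)
    return candidates


# A definition with an explicit width, e.g. "%1:i32 = ..."
_TYPED_DEST = re.compile(r"^%\w+:i\d+\s*=")


def untyped_definitions(candidate: List[str]) -> List[str]:
    """Return the instructions that define a register without a width."""
    return [inst for inst in candidate
            if "=" in inst and not _TYPED_DEST.match(inst)]
```

On the fib.opt sample above, each of the four candidates has exactly one untyped definition (the `slt`, `sub`, `sub`, and `add` lines), which matches the `wrong reg length 1, ['%1']` assertion.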

@jianguda
Collaborator

WAST -> WASM -> Souper -> LLVM -> WASM

@jianguda
Collaborator

As discussed with @shrin18, for the souper2llvm step we have to improve the existing solutions (Python or C++) in souper ourselves.

I am working on the C++ plan in the KTH fork, based on the "Souper codegen to LLVM IR" PR.

@monperrus
Collaborator

monperrus commented Oct 28, 2019 via email

@jianguda
Collaborator

Progress note: we tried the C++ plan and found that it currently seems to support only the RHS, which is consistent with this statement. @shrin18 and I will discuss and decide our next step.

@monperrus
Collaborator

monperrus commented Oct 31, 2019 via email

@jianguda
Collaborator

jianguda commented Nov 7, 2019

Hey @Jacarte
Here is a sample that shows the correctness of most steps in our pipeline
(.ll->.bc->.cand.opt->.lhs.opt->.rhs.opt->.new.ll->.new.bc)

infer.ll was converted from infer.opt by using the souper2llvm.py script
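
A small helper can list the files this chain is expected to leave behind and report where a run stopped. This is a sketch only: the suffixes are taken from the chain above, while the helper names and the idea of checking for files on disk are assumptions.

```python
import os
from typing import Callable, List, Optional

# Suffixes in the order listed above; the tools between them are not modelled.
SUFFIXES = [".ll", ".bc", ".cand.opt", ".lhs.opt", ".rhs.opt", ".new.ll", ".new.bc"]


def artifact_chain(stem: str) -> List[str]:
    """Expected artifact names for one sample, e.g. stem='infer'."""
    return [stem + suffix for suffix in SUFFIXES]


def first_missing(stem: str,
                  exists: Callable[[str], bool] = os.path.exists) -> Optional[str]:
    """Return the first expected artifact that is absent, or None if all exist."""
    for path in artifact_chain(stem):
        if not exists(path):
            return path
    return None
```

For the sample above, `artifact_chain("infer")` starts at `infer.ll` and ends at `infer.new.bc`.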

@Jacarte
Collaborator Author

Jacarte commented Nov 7, 2019 via email

@Jacarte
Collaborator Author

Jacarte commented Nov 18, 2019

We patched the issue where the souperify pass was broken in binaryen (here is the PR). Now we can go directly from WASM to souper without using LLVM. This is better for our purposes because we avoid LLVM's initial optimizations. Preliminary results show even more candidates when going directly from WASM, which means more variations for our final goal.

@jianguda
Collaborator

We can reproduce the experiments shown in the Souper paper once we sync the latest commits from the upstream repo to KTH/souper. Check this issue for more info.

@bbaudry
Member

bbaudry commented Nov 18, 2019

that's a major milestone. well done!

@Jacarte
Collaborator Author

Jacarte commented Nov 22, 2019

We are closing this issue because we have a first implementation of the pipeline. We decided to improve the input to souper by going directly from WASM to Souper IR. The main idea is to extend swam, for two reasons: first, it is a well-tested parser; second, we have had no feedback on the binaryen PR with the Souper integration fix.

@Jacarte Jacarte closed this as completed Nov 22, 2019