-
-
Notifications
You must be signed in to change notification settings - Fork 312
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tail recursion modulo cons #5569
Conversation
@brian-carroll my attempt at writing down what is happening here. The current implementations of the wasm parts are just me trying to puzzle things together. I just don't have that good of a mental model of the details of wasm, so it is hard to implement truly new things. so our code in question is this roc function
Which we compile to this IR now
Some notes
So, 3 new things lowlevels
None of these do anything with RC (in fact, anything in a Then also a new Expr was added,
|
@@ -27,7 +27,7 @@ extern fn free(c_ptr: [*]align(Align) u8) callconv(.C) void; | |||
extern fn memcpy(dst: [*]u8, src: [*]u8, size: usize) callconv(.C) void; | |||
extern fn memset(dst: [*]u8, value: i32, size: usize) callconv(.C) void; | |||
|
|||
const DEBUG: bool = false; | |||
const DEBUG: bool = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't forget to revert this
// TODO perhaps we need the union_layout later as well? if so, create a new function/map to store it. | ||
environment.add_union_child(*structure, *binding, *tag_id, *index); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the pointer should belong as union_child
. We don't want it used in specialization.
crates/compiler/mono/src/borrow.rs
Outdated
PtrStore => arena.alloc_slice_copy(&[owned, owned]), | ||
PtrLoad => arena.alloc_slice_copy(&[owned]), | ||
PtrToStackValue => arena.alloc_slice_copy(&[owned]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As they are not rc'd. Why not mark them as irrelevant
?
crates/compiler/gen_wasm/src/lib.rs
Outdated
@@ -249,7 +249,7 @@ pub const DEBUG_SETTINGS: WasmDebugSettings = WasmDebugSettings { | |||
let_stmt_ir: false && cfg!(debug_assertions), | |||
instructions: false && cfg!(debug_assertions), | |||
storage_map: false && cfg!(debug_assertions), | |||
keep_test_binary: false && cfg!(debug_assertions), // see also ROC_WRITE_FINAL_WASM | |||
keep_test_binary: true && cfg!(debug_assertions), // see also ROC_WRITE_FINAL_WASM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't forget to revert
OK so my trusted debugging technique for Wasm code gen is to dump the disassembly to a .tmp file and annotate it line-by-line with:
This usually uncovers anything weird happening. And there are a couple of weird things below, marked with
|
Oh it's also helpful to set
|
My analysis so far is
I still don't feel like I've figured out every detail of this and I cannot point to an exact fix yet. Just sharing my analysis so far. |
re.
yes that is how the IR works, the extra symbol is not needed there. If it helps we could use one of the the overwriting of the frame pointer is definitely not by design. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@brian-carroll I got it to work, for that one example at least. I'm still not 100% sure about the implementation being correct in general though. I've left some inline questions.
fn expr_union_field_ptr_at_index( | ||
&mut self, | ||
structure: Symbol, | ||
tag_id: TagIdIntType, | ||
union_layout: &UnionLayout<'a>, | ||
index: u64, | ||
symbol: Symbol, | ||
storage: &StoredValue, | ||
) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should try and share code with expr_union_at_index
. may require some refactoring.
let symbol_local = match self.storage.ensure_value_has_local( | ||
&mut self.code_builder, | ||
symbol, | ||
storage.clone(), | ||
) { | ||
StoredValue::Local { local_id, .. } => local_id, | ||
_ => internal_error!("A heap pointer will always be an i32"), | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what I'm not sure about
- when is
ensure_value_has_local
needed - under what circumstances is it allowed to make assumptions about the storage. how can we be sure that
Local
is the only option here
_ => internal_error!("A heap pointer will always be an i32"), | ||
}; | ||
|
||
self.code_builder.set_local(symbol_local); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe there is some helper for this (ensure symbol, then write to it)?
let (ptr_local_id, offset) = match backend.storage.get(&ptr) { | ||
StoredValue::Local { local_id, .. } => (*local_id, 0), | ||
_ => internal_error!("A pointer will always be an i32"), | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this one is weird. This storage.get
gets the correct local (the hole
symbol, defined by the join point. But ensure_value_has_local
instead gives LocalId(3)
, which in practice corresponds with the frame pointer. I don't know why, but this works.
is the fact that the symbol comes from a join point relevant (because that is kind of brittle). again, should we consider options beside StoredValue::Local
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Early comments
when x1 is | ||
Val a -> | ||
when x2 is | ||
Val b -> Val (a + b) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do these changes make the program more amenable to TRMC, or is this for another reason?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I briefly thought so, but no. This is more in line with koka, so it is easier to compare our output to theirs.
queen != q && queen != q + diagonal && queen != q - diagonal && safe queen (diagonal + 1) t | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be nice if we could auto-desugar this to hit the tail-recursion analysis, rather than relying on an explicit if/else. Filed #5593
// we never consider pointers for refcounting. Ptr is not user-facing. The compiler | ||
// author must make sure that invariants are upheld |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add this to the doc comment on LayoutRepr::Ptr
?
crates/compiler/mono/src/layout.rs
Outdated
Boxed(InLayout<'a>), | ||
Ptr(InLayout<'a>), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add some comments regarding the distinction between Boxed and Ptr? Namely that, if i understand correctly, Ptr is an alloca. I was thinking it would be too hard to do earlier, but maybe we could even reuse Ptr for pass-by-reference later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ptr
is just some pointer (heap or stack) without any reference counting. The (lack of) RC is the defining feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incredible work! The logic looks great to me. Left various housekeeping comments.
let trmc_candidate_symbols = trmc_candidates(env.interner, proc); | ||
|
||
if !trmc_candidate_symbols.is_empty() { | ||
let new_proc = | ||
crate::tail_recursion::TrmcEnv::init(env, proc, trmc_candidate_symbols); | ||
*proc = new_proc; | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for the future: i wonder if we could combine the TRMC + tail recursion search to avoid possibly two walks of a function
Ptr(inner_layout) | Boxed(inner_layout) => { | ||
let inner_type = | ||
layout_spec_help(env, builder, interner, interner.get_repr(inner_layout))?; | ||
let cell_type = builder.add_heap_cell_type(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does morphic distinguish heap/stack alloca values, or are they equivalent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we model the alloca as if it were on the heap here. morphic does not really have a reference
operator, nor the concept of stack allocations. There are just stack values and everything else.
@@ -1165,7 +1165,12 @@ pub(crate) fn build_exp_expr<'a, 'ctx>( | |||
env.builder.position_at_end(check_if_null); | |||
|
|||
env.builder.build_conditional_branch( | |||
env.builder.build_is_null(tag_ptr, "is_tag_null"), | |||
// have llvm optimizations clean this up |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice. maybe we want to do it ourselves in the future given the latency of roc-pg earlier, but i hope llvm just entirely skips unreachable blocks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with optimization it does for sure.
it creates too many allocas, growing the stack frame
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🔥 🚀
still incomplete for wasm, but I want to see the benchmark statistics