
My two cents #1

Open · Boris-Barboris opened this issue Jan 17, 2018 · 7 comments
@Boris-Barboris

In continuation of our discussion here: vibe-d/vibe-core#48

baseline API with no dynamic per-request memory allocations

Well, it can't be completely avoided. I'd rather formulate it like this:

  • all the memory required to hold one HTTP request with all of its lookup optimization structures (header collection and all that stuff) and various contexts should be allocated efficiently and released after the response is built and control is returned to the framework code.
  • all strings (header values etc.) contained in the request should span the request buffer

per-request memory must be safely managed (scoped and .dup'ed as soon as the user tries to escape it, either explicitly or implicitly)

I'm strongly against implicit copying. Instead: non-copyable HTTPRequest and HTTPResponse structures, marked with scope, that implement the Stream interfaces themselves (no easily accessible socket leakage this way, so no refcounting is needed under the hood), passed as parameters to the user callback; all "dangerous" fields and properties return something like

/// Doc comment that warns the programmer about the lifetime of the memory behind m_val
struct RequestScoped(T)
{
    private T m_val;
    /// Doc comment that warns the programmer about the consequences
    ref T get() { return m_val; }
    /// A real version would recursively .dup the value into GC memory and return that copy
    T escape() { return m_val; }
}
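
As a usage sketch (the calling function and the way the wrapper is obtained are made up for illustration, not an existing API):

// Assume the framework handed us a wrapped header value.
void useHeaderValue(ref RequestScoped!string userAgent)
{
    // Borrowed view: a slice into the request buffer, valid only while the request lives.
    const(char)[] view = userAgent.get();

    // Explicit GC copy: safe to store in a global, send to another task, etc.
    string copy = userAgent.escape();
}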

You cannot force the user to write memory-safe code. D does not support this. No, DIP1000 + @safe is not an enforcement. When something cannot be enforced, it is better to provide convenient and non-encumbering conventions, documentation and examples.

no stream proxy objects or dynamic dispatch by default - all combinations of encryption/encoding/compression streams must be instantiated as static types, need to keep in mind how to minimize the amount of template bloat this produces.

encoding - transport
encoding - compression - transport
encoding - compression - encryption - transport
Looks rather modest; templates should not be that terrible.
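
A rough sketch of what such statically composed pipelines could look like (the stream types below are simplified stand-ins, not vibe.d's actual ones):

// Simplified stand-in stream types; each layer wraps the next one by value.
struct TCPConnection         { /* raw transport */ }
struct TLSStream(Transport)  { Transport next; /* encryption */ }
struct DeflateStream(Stream) { Stream next;    /* compression */ }
struct ChunkedStream(Stream) { Stream next;    /* transfer encoding */ }

// The three combinations above, all resolved at compile time:
alias Plain      = ChunkedStream!TCPConnection;
alias Compressed = ChunkedStream!(DeflateStream!TCPConnection);
alias Encrypted  = ChunkedStream!(DeflateStream!(TLSStream!TCPConnection));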

A drawback of this approach is that each API call would have to determine the static stream type all over again (e.g. ChunkedOutputStream(TLSStream(TCPConnection))).

Let's say dynamic dispatch between streams in the pipeline was collapsed, but that is where it ends. You need dynamic dispatch for HTTPInterchange.BodyReader/Writer one way or another. A class will allow a socket leak (again, only if you think the programmer is a baboon; seriously, there is no hope if they escape a variable of the heavily documented type scope RequestScoped!BodyWriter). Or you can emulate a vtable in a struct through function pointers, which will preserve non-copyability but will look questionable.

I also don't think a unified HTTPInterchange object is a good thing. Chained calls with staged callbacks:

  • Look ugly, IMO.
  • May perform worse: more indirection. On the freight trains of headers sent by modern browsers, readHeaders may create more problems than it solves. What should the user do there, write a giant switch? How is a switch over strings implemented by the compilers? Does it compare lengths first? Does it choose between an rb-tree and a hash table, or does it build a state machine? I think an optimized and fine-tuned pre-built header lookup collection is more user-friendly and will be faster.
  • Writing straight back to the response body without even looking at the request is perfectly viable behaviour. Why should I call three methods that do nothing to get it done? I have to write code that means nothing.
  • The body should already be read by the vibe.d parser well before any user callbacks are invoked. Body start and length were found, CRLFs were found, all that RFC 2616 stuff. All this validation and parsing should already be done before the client code runs. Client callbacks should not be run if the request does not conform to the HTTP standard. I don't see a benefit from forcing this specific order on the user.
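
To make the point about a pre-built header lookup collection concrete, here is a hypothetical sketch (not vibe.d code): the parser fills a small array of slices into the request buffer, and lookup is a case-insensitive scan, which for the 10-30 headers of a typical request is hard to beat.

import std.uni : sicmp;

struct HeaderEntry { const(char)[] name, value; } // both slices span the request buffer

struct HeaderCollection
{
    HeaderEntry[] entries; // filled once by the parser

    // Case-insensitive lookup; returns null if the header is absent.
    const(char)[] get(scope const(char)[] name) const
    {
        foreach (ref e; entries)
            if (sicmp(e.name, name) == 0)
                return e.value;
        return null;
    }
}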

I think the current form of the HTTPServerRequestHandlerS callback is very nice.

@s-ludwig
Member

In continuation of our discussion here: vibe-d/vibe-core#48

baseline API with no dynamic per-request memory allocations

Well, it can't be completely avoided. I'd rather formulate it like this:

  • all the memory required to hold one HTTP request with all of its lookup optimization structures (header collection and all that stuff) and various contexts should be allocated efficiently and released after the response is built and control is returned to the framework code.
  • all strings (header values etc.) contained in the request should span the request buffer

The idea is actually to have something that does not allocate anything apart from the TCP connection structure itself. No strings would be stored; they would instead be available as some kind of input range. It may be necessary to make small compromises in that regard, but that would be the goal.

But the high-level API would then indeed be built around the goals that you outlined (which is already the idea for the current implementation, but there are still a number of exceptions).

per-request memory must be safely managed (scoped and .dup'ed as soon as the user tries to escape it, either explicitly or implicitly)

I'm strongly against implicit copying. Instead: non-copyable HTTPRequest and HTTPResponse structures, marked with scope, that implement the Stream interfaces themselves (no easily accessible socket leakage this way, so no refcounting is needed under the hood), passed as parameters to the user callback; all "dangerous" fields and properties return something like

/// Doc comment that warns the programmer about the lifetime of the memory behind m_val
struct RequestScoped(T)
{
    private T m_val;
    /// Doc comment that warns the programmer about the consequences
    ref T get() { return m_val; }
    /// A real version would recursively .dup the value into GC memory and return that copy
    T escape() { return m_val; }
}

You cannot force the user to write memory-safe code. D does not support this. No, DIP1000 + @safe is not an enforcement. When something cannot be enforced, it is better to provide convenient and non-encumbering conventions, documentation and examples.

Enforcement is not the goal here; if you want, you can of course always override the safety measures. The real idea is just to make it (close to) impossible to make accidental mistakes that open up security holes. As long as we rely on the GC that is not so problematic, but the moment we start to do manual memory management, something like DIP1000, or an approximation thereof using API design facilities, is essential in that regard.

The main problem is that using scope+DIP1000, which definitely would be my preferred approach, will break user code, because scope will have to be added in all places that accept HTTP requests/responses. Using an approach where the data is elevated to reference-counted or GC-managed memory when it is still referenced outside of the scope of the request handler, on the other hand, would not result in this kind of breakage and would also keep the API cleaner due to the lack of explicit escape methods for all kinds of req/res attached data.

IMO, the tradeoff of this implicit approach is pretty good, in the sense that it makes it seamless to escape request data to a background task or for "permanent" storage, for example in an instance or global variable of some kind, while still keeping the risk of accidentally producing a performance hazard low (the request data actually has to outlive the handler; just copying it around will not do any harm).

The reference counting is thread-local, so while the performance implications need to be observed carefully, it's unlikely to have a big impact. As with the current issue of InterfaceProxy invoking RC, passing things by reference (or using move) will also eliminate the RC operations completely.
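
As a small illustration of that last point (using Phobos' std.typecons.RefCounted as a stand-in for whatever wrapper ends up being used):

import std.typecons : RefCounted;

struct Payload { int value; }

void takeByValue(RefCounted!Payload p) { }   // copying in touches the reference count
void takeByRef(ref RefCounted!Payload p) { } // no copy, no RC traffic at all

void example()
{
    auto p = RefCounted!Payload(42);
    takeByValue(p); // increments and later decrements the count
    takeByRef(p);   // leaves the count untouched
}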

If the automatic approach turns out to be harmless w.r.t. performance, I'd pick that, to avoid the rift that is otherwise created for all the existing code. Getting people to make such big upgrades is sometimes rather difficult; even now there are still many who, for example, use the 0.7.x vibe.d branch due to the mild breaking changes that 0.8.x introduced.

no stream proxy objects or dynamic dispatch by default - all combinations of encryption/encoding/compression streams must be instantiated as static types, need to keep in mind how to minimize the amount of template bloat this produces.

encoding - transport
encoding - compression - transport
encoding - compression - encryption - transport
Looks rather modest; templates should not be that terrible.

The thing to keep in mind is that if the templated stream is handed over to user code without a proxy of some sort with a fixed type, then the number of combinations may still multiply there. But apart from that I agree, these combinations will only be there once for the whole application, so like this it's nothing to worry about yet.

A drawback of this approach is that each API call would have to determine the static stream type all over again (e.g. ChunkedOutputStream(TLSStream(TCPConnection))).

Let's say dynamic dispatch between streams in the pipeline was collapsed, but that is where it ends. You need dynamic dispatch for HTTPInterchange.BodyReader/Writer one way or another. A class will allow a socket leak (again, only if you think the programmer is a baboon; seriously, there is no hope if they escape a variable of the heavily documented type scope RequestScoped!BodyWriter). Or you can emulate a vtable in a struct through function pointers, which will preserve non-copyability but will look questionable.

True, the high-level API will need this in all likelihood. A possibility for the low-level API would still be to either invert control or to define the request handler callbacks as template alias arguments to let them accept different types for those streams. This may be a good candidate for a quick benchmark experiment.
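
A quick sketch of that "template alias argument" variant, just to pin down what would be benchmarked (the stream type here is a dummy):

// Dummy body writer; the real one would be a concrete encoding/encryption stack.
struct BodyWriter(Transport)
{
    Transport conn;
    void write(const(char)[] data) { /* encode and forward to conn */ }
}

// The user callback is a template alias parameter, so it gets instantiated per
// concrete stream type and is called without any dynamic dispatch or proxy.
void serveRequest(alias handler, Transport)(Transport conn)
{
    auto writer = BodyWriter!Transport(conn);
    handler(writer);
}

// usage: serveRequest!((w) { w.write("Hello"); })(someTransport);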

(I really disagree with the "only dumb programmers make mistakes" philosophy - reality simply tells a different story, namely that such mistakes are made frequently in practice, and in the end it doesn't really matter how good the programmer was. Since the implications of making such a mistake can be disastrous for public services, I'm convinced that we should do everything that we can to mechanically make it as difficult as possible to make them.)

I also don't think a unified HTTPInterchange object is a good thing. Chained calls with staged callbacks:

  • Look ugly, IMO.
  • May perform worse: more indirection. On the freight trains of headers sent by modern browsers, readHeaders may create more problems than it solves. What should the user do there, write a giant switch? How is a switch over strings implemented by the compilers? Does it compare lengths first? Does it choose between an rb-tree and a hash table, or does it build a state machine? I think an optimized and fine-tuned pre-built header lookup collection is more user-friendly and will be faster.
  • Writing straight back to the response body without even looking at the request is perfectly viable behaviour. Why should I call three methods that do nothing to get it done? I have to write code that means nothing.
  • The body should already be read by the vibe.d parser well before any user callbacks are invoked. Body start and length were found, CRLFs were found, all that RFC 2616 stuff. All this validation and parsing should already be done before the client code runs. Client callbacks should not be run if the request does not conform to the HTTP standard. I don't see a benefit from forcing this specific order on the user.

I'm not really happy with the outlined draft either. The idea is just to enforce a proper order one way or another, because such mistakes surfaced multiple times in the project's history. But I'd value the general usability of the API higher than statically enforcing this, so if there is no nice solution to it in D, I don't mind letting this requirement go.

I think the current form of the HTTPServerRequestHandlerS callback is very nice.

Another possible approach could be to use the return value for the response, as some frameworks do (and as the vibe.web.rest module does in a higher-level form), but that also impacts the usability somewhat. But I'd still like to think through the different directions a bit more before committing to a final approach.
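
For comparison, the return-value style would look roughly like this (all names are placeholders):

struct Request  { /* parsed request data */ }
struct Response { int status; string body_; }

// The handler produces the response as its return value instead of writing
// through a response object passed in by the framework.
Response handleHello(scope ref Request req)
{
    return Response(200, "Hello");
}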

@Boris-Barboris
Author

Boris-Barboris commented Feb 14, 2018

The idea is actually to have something that does not allocate anything apart from the TCP connection structure itself.

This is pretty much impossible due to HTTP. No a priori known length, tons of ambiguity in the spec, streams. State is required, and its size is dynamic. State lives on the heap or on the fiber stack (which is on the heap as well). Allocation is unavoidable, but it should be efficient and non-excessive.

The main problem is that using scope+DIP1000, which definitely would be my preferred approach, will break user code, because scope will have to be added in all places that accept HTTP requests/responses.

Reference-counted variables have a reference-counted wrapper type. It will break user code just like the scope attribute or a RequestScoped wrapper will. You are either explicitly reference-counted, or you are in the GC realm. Or are you calling the GC implicitly?
The semantics of escaping a variable from request-local memory to the GC heap (via duplication) are up to you; it can be made pleasant and short, thus minimizing programmer frustration, especially since existing projects can simply keep using the old vibe-d.http.

The reference counting is thread-local, so while the performance implications need to be observed carefully, it's unlikely to have a big impact. As with the current issue of InterfaceProxy invoking RC, passing things by reference (or using move) will also eliminate the RC operations completely.

If it's thread-local, you just introduce a different front for mistakes while thinking that you are reducing them.

Also, think about this example:
A header value string which spans the request buffer (which will be freed after the request is completed) is escaped to god-knows-where (let's say some background worker). It will keep the whole request buffer pinned (and it is large). You can then decide to manage such strings at a finer granularity, increasing allocation pressure in the initial parse stage. Header values are now passed to client code as refcounted wrappers. You have just reinvented the GC.

And the GC will be faster; it ought to be. GC pauses for the whole process took a small fraction of the time that RC inside vibe-d took in that test we talked about.

The thing to keep in mind is that if the templated stream is handed over to user code without a proxy of some sort with a fixed type, then the number of combinations may still multiply there.

I agree, it should be proxied in the external API.

(I really disagree with the "only dumb programmers make mistakes" philosophy - reality simply tells a different story, namely that such mistakes are made frequently in practice, and in the end it doesn't really matter how good the programmer was. Since the implications of making such a mistake can be disastrous for public services, I'm convinced that we should do everything that we can to mechanically make it as difficult as possible to make them.)

All three approaches (GC, scope, RC) are practically safe and make it nontrivial to dereference an incorrect pointer. All three still allow you to shoot yourself in the foot in D. The last two require a user code rewrite; the first one is what is implemented currently.
Of the last two approaches, one is (almost) zero CPU cost, but to be safe it requires type system support that does not exist yet, plus more typing from the programmer (while simultaneously actually making them aware of the memory situation). The other is costly CPU-wise, semantically implicit, and requires that same absent type system support to be safe (you can simply escape a reference to the RC proxy and crash just as easily).

RC's problem domain is the situation where object lifetime is unknown. That is not the case for a lot of the state in HTTPRequest/Response. All stream cascades die when the request is served. It doesn't matter if you RC them or place them on the GC heap - no data should be read from the socket or written to it after the request is served. The lifetime is strictly defined and static. Guarding their memory with RC just creates the illusion that the object makes any sense after the request is served, which is a lie. The fact that their memory can still exist due to RC is a logical mistake on its own.

If the type system/language support is not to your liking, maybe a descriptor approach will satisfy you.
User code receives a number, for example a serial request id, which is then passed to an OS-like interface that accepts this number (or a pointer, as in PIMPL) as a descriptor. If the request has already been served, those functions will throw/assert. Every byte of memory that those functions escape will be put on the GC heap (you can go fancy and accept custom allocators here) and will not be subject to lifetime ambiguity. That would completely solve the problem of escaped streams.

Just for example, I dunno:

struct req_id
{
    package void* ptr;    // used internally by the framework
    const ulong id;  // additional safeguard to make sure the same address is not occupied by another request after some time
}

// returns a GC-allocated copy of the actual header value, which lives in a task-local buffer
string getRequestHeader(req_id req, scope string key) @safe;

bool setResponseHeader(req_id req, string key, string value) @safe;
size_t writeResponseBody(req_id req, scope ubyte[] data) @safe @nogc;

// finalizes the request; after this call, this req will always throw in any framework function
void finalizeResponse(req_id req) @safe;
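
A hypothetical handler against this descriptor interface, to show the intended usage:

import std.string : representation;

void handle(req_id req) @safe
{
    // GC copy of the header value: free to escape anywhere.
    string accept = getRequestHeader(req, "Accept");

    setResponseHeader(req, "Content-Type", "text/plain");
    writeResponseBody(req, "Hello".dup.representation);
    finalizeResponse(req);

    // From here on, any framework call with this req throws/asserts, so even an
    // escaped req_id cannot touch a connection that now serves another request.
}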

@s-ludwig
Member

This is pretty much impossible due to HTTP. No a priori known length, tons of ambiguity in the spec, streams. State is required, and its size is dynamic.

While stack-allocated memory is technically also dynamically allocated, in reality it just reuses the same fixed buffer for each run, so I'm excluding that from the list. Apart from that, only the lower-level stream types that are defined in terms of other libraries (TLS and zlib) have uncontrolled dynamic allocations. Everything else, at least at the basic HTTP protocol level, gets away with small fixed-size buffers. Only higher-level concepts may need to store dynamically sized data, which would then simply be transient in the low-level API.

Reference-counted variables have a reference-counted wrapper type. It will break user code just like the scope attribute or a RequestScoped wrapper will. You are either explicitly reference-counted, or you are in the GC realm. Or are you calling the GC implicitly?

Consider this:

void handleRequest(HTTPServerRequest req, HTTPServerResponse res) {
    res.writeBody("Hello");
}

With reference-counted structs, the code above could stay exactly as it is, while with scope the signature would have to be changed. Of course there are other aspects that would still break, but those are things that normally do not happen in user code. Others are not possible anyway, such as instantiating request/response objects directly.
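
For contrast, the scope variant of the same handler would need a changed signature, roughly like this (only checked when compiling with -preview=dip1000):

void handleRequest(scope HTTPServerRequest req, scope HTTPServerResponse res) @safe
{
    // Nothing escapes here, so the body can stay exactly the same.
    res.writeBody("Hello");
    // Code that stores req/res, or (suitably annotated) data obtained from them,
    // beyond the handler would then be rejected by the compiler.
}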

If it's thread-local, you just introduce a different front for mistakes while thinking that you are reducing them.

Almost everything in vibe.d is thread-local and it's part of the type system to disallow moving thread-local data between threads (even though Druntime and Phobos unfortunately provide some means to circumvent this, vibe.d enforces it as far as possible). This is mainly a question of performance: having to synchronize everything would simply yield unacceptable performance.

A header value string which spans the request buffer (which will be freed after the request is completed) is escaped to god-knows-where (let's say some background worker). It will keep the whole request buffer pinned (and it is large). You can then decide to manage such strings at a finer granularity, increasing allocation pressure in the initial parse stage.

You can simply use a pay-for-what-you-use approach here. The strings would initially reside in a single buffer. But when (and only when) a string is still referenced after the handler is left, it would be re-allocated individually, while the request buffer gets freed. I have to admit, though, that D's movable struct semantics make this less efficient than it could be, which could be a deciding factor.

Regarding the safety of the three approaches - the GC approach is in theory perfectly safe (apart from a few practical holes), and you are right that the other two approaches require additional type system support (DIP1000) to really be safe. But it simply is my assumption that this support will materialize over the course of this year, so the RC approach would be more backwards compatible and reasonably safe for now, while the scope approach would be preferable performance-wise, but actually depends on DIP1000. The question is how much of a difference this actually makes in practice, and there really is no way to know other than measuring it.

But the ID-based approach will most likely be slower than the RC approach because of the additional indirection and the need to manage IDs (whereas they came for free in the form of an already allocated pointer in the other approaches), and it still escapes memory just like the other approaches. The moment you receive a reference to the underlying data (e.g. from getRequestHeader), you are free to store it and dereference it without first querying the validity of the ID again.

@Boris-Barboris
Author

Boris-Barboris commented Feb 19, 2018

void handleRequest(HTTPServerRequest req, HTTPServerResponse res) {
    res.writeBody("Hello");
}

With reference-counted structs, the code above could stay exactly as it is, while with scope the signature would have to be changed. Of course there are other aspects that would still break, but those are things that normally do not happen in user code.

Escaping either the request or the response object is not normal; it is a rare case. Escaping a string from a header of 'req' is, however, pretty common. Do you plan to escape it as an RC string? Because if yes - it will break interfaces (it's not a string type anymore, at most it will be a string subtype), and it will have separate semantics for duplication and scope-checked usage. If not - then it is once again a mystery to me why you would wrap objects like streams, which rarely (never) escape, and not wrap the request body or headers, which commonly escape/outlive the request.

Almost everything in vibe.d is thread-local and it's part of the type system to disallow moving thread-local data between threads

Remember "the truth about shared"? It's still very applicable. =)

But when (and only when) a string is still referenced after the handler is left, it would be re-allocated individually, while the request buffer gets freed.

Does not sound possible with D structs, nor does it sound feasible. Did you mean that when the response is finalized, all structs that wrap the escaped object - on all stacks of all threads and fibers, and every instance of such a reference on the heap - get the updated reference? Sounds really bad. And if you make every such reference a class, you get mandatory double dereferencing on every use of every string. Such an object should have been GC-allocated from the start.

But the ID-based approach will most likely be slower than the RC approach because of the additional indirection and the need to manage IDs (whereas they came for free in the form of an already allocated pointer in the other approaches), and it still escapes memory just like the other approaches. The moment you receive a reference to the underlying data (e.g. from getRequestHeader), you are free to store it and dereference it without first querying the validity of the ID again.

My point was to always escape only GC-allocated copies, but I dislike it just as much as you do; it's bad as well, I agree.

@s-ludwig
Member

Escaping either the request or the response object is not normal; it is a rare case. Escaping a string from a header of 'req' is, however, pretty common.

Agreed. scope, or its RC-based approximation, needs to act recursively.

Do you plan to escape it as an RC string?

Yes, it would have to, and it would be implemented as a subtype (i.e. still not safe until DIP1000 is there). But you are right, this may be a critical flaw of this approach - either the conversion to string/const(char)[] is implicit, which then doesn't help much in terms of safety (pass it to a foo(string) and it's escaped silently), or it is explicit, in which case the likelihood of breaking user code becomes much higher, with potentially more code being broken than with scope parameters.
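
A tiny sketch of the implicit-conversion variant of that flaw (a hypothetical wrapper, not a proposed type):

struct RCString
{
    private string m_payload;  // a real version would hold refcounted memory
    string payload() const { return m_payload; }
    alias payload this;        // implicit conversion to string
}

void log(string msg) { /* stores msg somewhere long-lived */ }

void example(RCString header)
{
    log(header); // compiles silently: the payload escapes without any .dup or cast
}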

Remember "the truth about shared"? It's still very applicable. =)

Yes, shared is still a mess, and there are some different ideas of what it should be. Especially the conflation of shared with implicit atomic operations is IMO a foolish idea, but with regard to shared references the semantics are actually pretty clear and sound, except for syntax and object construction issues.

So the essential property is just that, as long as you don't (ab)use the holes in Thread, std.parallelism or std.concurrency, and don't cast away shared explicitly, moving thread-local objects between threads is effectively prohibited. So far that is the only important property of shared in this context, and fortunately it is highly unlikely that this will change, no matter in which direction shared develops or does not develop.

Does not sound possible with D structs, nor does it sound feasible.

It certainly is possible and feasible, but I tend to agree that it's probably not possible to actually make it perform better than GC-allocating in the first place. A trivial way to implement it would be to use an ID-based approach (IDs stay alive until all references are gone, but point to different memory after the handler leaves). Alternatively, keeping a doubly linked list of all references would work in C++, with the advantage that there are no additional dereference costs, while the reference copy cost increases. That doesn't work in D anyway, though.

So summing this up, the most promising approach so far appears to be to rely on scope/DIP1000 and define HTTPServerRequest/Response as structs that hold a pointer to the actual data inside, but only return references to it as scope. That way, the user code example above will still work, and ideally only the places where data is actually escaped will break and require an explicit .dup.
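
A minimal sketch of that direction, assuming -preview=dip1000 (names and layout are illustrative only):

struct RequestData { /* headers, body, etc., owned by per-request memory */ }

struct HTTPServerRequest
{
    private RequestData* m_data;

    // `return scope` ties the lifetime of the returned slice to `this`, so the
    // compiler rejects attempts to escape it; an explicit .dup is then required
    // to keep the data beyond the handler.
    const(char)[] header(scope const(char)[] name) return scope
    {
        // placeholder lookup into m_data
        return null;
    }
}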

@Boris-Barboris
Author

So summing this up, the most promising approach so far appears to be to rely on scope/DIP1000 and define HTTPServerRequest/Response as structs that hold a pointer to the actual data inside, but only return references to it as scope. That way, the user code example above will still work, and ideally only the places where data is actually escaped will break and require an explicit .dup.

Agreed.

@jacob-carlborg

I've been working on the approach, at least in my application code, of trying to make all allocations request-local and using DIP1000 to make sure nothing escapes. It sounds very good to me, at least in theory, if all allocations, both library and application, were request-local. If all the data of a request was allocated in a stack buffer or thread-local buffer in the form of a static array, used as a bump-the-pointer allocator, it should be very efficient. When freeing the memory, the pointer/counter just needs to be reset to its initial state. Using stdx.allocator would allow having a primary stack allocator and a fallback allocator using malloc, the GC or similar.
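
A sketch of that setup using std.experimental.allocator (stdx.allocator mirrors the same building blocks); the buffer size and the GC fallback are arbitrary choices here:

import std.experimental.allocator.building_blocks.region : InSituRegion;
import std.experimental.allocator.building_blocks.fallback_allocator : FallbackAllocator;
import std.experimental.allocator.gc_allocator : GCAllocator;

void handleOneRequest()
{
    // 16 KiB bump-the-pointer buffer on the stack; larger requests spill over to
    // the GC (Mallocator or anything else could serve as the fallback instead).
    FallbackAllocator!(InSituRegion!16_384, GCAllocator) alloc;

    auto scratch = alloc.allocate(1024);        // comes from the stack region
    auto big     = alloc.allocate(1024 * 1024); // too big for the region: GC fallback

    // "Freeing" the region part at the end of the request is just resetting its
    // internal pointer; the same buffer is reused for the next request.
}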

I think it would be very rare to escape any request data, and I would be fine with explicitly copying the data in the few cases where it is needed.

If significant performance improvements can be shown, I think people would be OK with breaking the API, i.e. needing to add scope.
