Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do we handle 304 responses correctly? #4026

Open
nigoroll opened this issue Dec 4, 2023 · 8 comments
Open

Do we handle 304 responses correctly? #4026

nigoroll opened this issue Dec 4, 2023 · 8 comments

Comments

@nigoroll
Copy link
Member

nigoroll commented Dec 4, 2023

I'm wondering if there aren't some potential problems unrelated to this patch that we have uncovered. The fact that Varnish will enter 304 revalidation logic even though we never asked for it (didn't send out any IMS/INM headers) is worrying to me. I wonder if we shouldn't keep track somehow of whether to expect a 304, and error out if we still got a 304. Could prevent a future gotcha at least.

Originally posted by @mbgrydeland in #4013 (comment)

@walid-git
Copy link
Member

We already error out when we get an unexpected 304 response from the backend :
https://github.com/varnishcache/varnish-cache/blob/master/bin/varnishd/cache/cache_fetch.c#L375-L378

	if (bo->stale_oc != NULL &&
	    ObjCheckFlag(bo->wrk, bo->stale_oc, OF_IMSCAND)) {
	    ....
	} else if (!bo->uncacheable) {
		/*
		 * Backend sent unallowed 304
		 */
		VSLb(bo->vsl, SLT_Error,
		    "304 response but not conditional fetch");
		bo->htc->doclose = SC_RX_BAD;
		vbf_cleanup(bo);
		return (-1);
	}

@mbgrydeland
Copy link
Contributor

I guess the discrepancy to investigate is that OF_IMSCAND doesn't track whether we sent IMS/INM on the fetch, only if we could've done that. And especially for zero-length objects, it is possible to not send IMS/INM, but still execute 304 handling if the backend responded with 304 anyways.

@dridi
Copy link
Member

dridi commented Dec 6, 2023

I'm of the opinion that if we can't see it this problem does not exist, so may I suggest starting with a new addition to vmod_debug?

.. NB: keep enum in sync with obj_attr.h
$Function INT obj_attr_size($Enum {LEN, VXID, FLAGS, GZIPBITS, LASTMODIFIED, VARY, HEADERS, ESIDATA, GZIPED, CHGCE, IMSCAND, ESIPROC})
$Restrict vcl_deliver, vcl_backend_refresh, vcl_backend_response

Give the size of an attribute or zero if it is missing for:

- ``obj`` in ``vcl_deliver``
- ``obj_stale`` in ``vcl_backend_refresh``
- ``beresp`` in ``vcl_backend_response``

This way it can also be used in boolean expressions. Once we have that we can add more test cases, or more check to existing test cases. And since we want to solve this before #3994 we would initially skip the obj_stale inspection.

Does it make sense?

@bsdphk
Copy link
Contributor

bsdphk commented Dec 18, 2023

My view is that the IMS header indicates to VCL that conditional fetch is available, but nothing prevents VCL from replacing it with If-None-Match or another "better" conditional, for whatever value of "better" the VCL writer prefers.

The only dubious case that the bereq has no conditionals and a buggy backend sends 304, something I do not see as a high priority for us.

I'm not even convinced that all conditionals, also private ones, are mandated to have "If-*" as a prefix, is that written down somewhere ?

@dridi
Copy link
Member

dridi commented Jul 8, 2024

I wen through what the RFCs have to say about 304 in general and caching, and my conclusion is that as of today we are doing things mostly correctly, and probably close to the best we can do with the current management.

One big difference between the way we handle revalidation and what the RFCs have to say is the single stale object we embark in a [cond]fetch task. We're supposed to update the headers of all the stored responses matching a 304, but we only do it with the single stale response. For practical reasons, we should keep doing it.

The RFCs talk about preconditions based on weak and strong validators, but says nothing about preconditions being expressed with header fields prefixed with if-:

Other preconditions, defined by other specifications as extension fields, might place conditions on all recipients, on the state of the target resource in general, or on a group of resources.

We support three preconditions, two of which raising OF_IMSCAND.

Regarding validators, only two are defined by the standard, and those are the two we currently use. last-modified is considered weak by default, but can be treated as strong by the application if the clock resolution is deemed good enough. etag is explicitly marked as weak, or strong by default, and we already have rules to weaken a strong entity tag when we fiddle with a response body.

As of today, with revalidation handled by the core code, I think it is fair game to accept 304s only when we expected them after setting up the preconditions.

Another thing to consider is the request method, and in order to refresh a stored response there needs to be a match there as well. Considering how all cache misses are turned into GET backend requests, we can leave this as an exercise to the VCL author to support revalidation in other contexts. I didn't look at the WebDAV spec, but it is the main well-known extension, plus the built-in VCL currently turns WebDAV traffic into pipe mode (caching WebDAV has always been an exercise left to the VCL author).

What we should consider for #3994 moving forward:

  • check current preconditions in vcl_backend_refresh
    • the built-in VCL "fails" (see below) a refresh in the absence of known preconditions
  • check stale_obj validity in vcl_backend_refresh
    • if the stale template object is no longer usable, "fail" the refresh (see below)
    • likewise, by default we only refresh 200 responses
  • when a refresh logically fails

Something like this for the built-in:

sub vcl_backend_refresh {
        call vcl_builtin_backend_refresh;
        return (merge);
}

sub vcl_builtin_backend_refresh {
        call vcl_refresh_valid;
        call vcl_refresh_status;
        call vcl_refresh_conditions;
}

sub vcl_refresh_valid {
        if (!obj_stale.is_valid) {
                call vcl_refresh_error;
        }
}

sub vcl_refresh_status {
        if (obj_stale.status != 200) {
                call vcl_refresh_error;
        }
}

sub vcl_refresh_conditions {
        if (!bereq.http.if-modified-since &&
            !bereq.http.if-none-match) {
                call vcl_refresh_error;
        }
}

sub vcl_refresh_error {
        unset bereq.http.if-modified-since;
        unset bereq.http.if-none-match;
        return (retry(fetch));
}

Anyway, as far as this ticket is concerned, I think we are fine with our current 304 handling and we can close it.

@dridi
Copy link
Member

dridi commented Jul 8, 2024

Slightly modified built-in to avoid side effects on bereq.retries that could disturb existing VCL:

sub vcl_backend_refresh {
        call vcl_builtin_backend_refresh;
        return (merge);
}

sub vcl_builtin_backend_refresh {
        call vcl_refresh_valid;
        call vcl_refresh_status;
        call vcl_refresh_conditions;
}

sub vcl_refresh_valid {
        if (obj_stale.retried) { # read-only, analogous to bereq.retries, but BOOL
                return (error);
        }
        if (!obj_stale.is_valid) {
                call vcl_refresh_retry;
        }
}

sub vcl_refresh_status {
        if (obj_stale.status != 200) {
                call vcl_refresh_retry;
        }
}

sub vcl_refresh_conditions {
        if (!bereq.http.if-modified-since &&
            !bereq.http.if-none-match) {
                return (error);
        }
}

sub vcl_refresh_retry {
        unset bereq.http.if-modified-since;
        unset bereq.http.if-none-match;
        # Same transition as return (fetch) from vcl_backend_fetch,
        # but turns into an error if obj_stale.retried already true.
        return (fetch);
}

@walid-git
Copy link
Member

sub vcl_refresh_conditions {
        if (!bereq.http.if-modified-since &&
            !bereq.http.if-none-match) {
                call vcl_refresh_error;
        }
}

I think this should rather be return (error); because if we got a 304 from the backend when we didn't ask for it, then there is no point in retrying the exact same request another time and expect a different response.

@dridi
Copy link
Member

dridi commented Jul 9, 2024

I agree, I will edit my previous comment accordingly.

edit: I also renamed vcl_refresh_error to vcl_refresh_retry

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants