-
Notifications
You must be signed in to change notification settings - Fork 428
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HTTP API metrics #1969
base: master
Are you sure you want to change the base?
HTTP API metrics #1969
Conversation
942845e
to
9b5a91f
Compare
5077.1 / Erlang 19.3 / small_tests / fa7b041 5077.6 / Erlang 19.3 / elasticsearch_and_cassandra_mnesia / fa7b041 5077.5 / Erlang 19.3 / ldap_mnesia / fa7b041 5077.3 / Erlang 19.3 / mysql_redis / fa7b041 5077.2 / Erlang 19.3 / internal_mnesia / fa7b041 5077.4 / Erlang 19.3 / odbc_mssql_mnesia / fa7b041 5077.8 / Erlang 20.0 / pgsql_mnesia / fa7b041 5077.9 / Erlang 21.0 / riak_mnesia / fa7b041 mod_global_distrib_SUITE:multi_connection:test_muc_conversation_history{error,{{assertEqual,[{module,mod_global_distrib_SUITE},
{line,418},
{expression,"exml_query : attr ( AlicePresence , << \"from\" >> )"},
{expected,<<"[email protected]/alicE4.776981">>},
{value,<<"[email protected]/eve5.71013">>}]},
[{mod_global_distrib_SUITE,'-test_muc_conversation_history/1-fun-1-',
3,
[{file,"mod_global_distrib_SUITE.erl"},
{line,418}]},
{mod_global_distrib_SUITE,'-test_muc_conversation_history/1-fun-5-',
3,
[{file,"mod_global_distrib_SUITE.erl"},
{line,417}]},
{escalus_story,story,4,
[{file,"/home/travis/build/esl/MongooseIM/big_tests/_build/default/lib/escalus/src/escalus_story.erl"},
{line,72}]},
{mod_global_distrib_SUITE,test_muc_conversation_history,1,
[{file,"mod_global_distrib_SUITE.erl"},
{line,390}]},
{test_server,ts_tc,3,[{file,"test_server.erl"},{line,1545}]},
{test_server,run_test_case_eval1,6,
[{file,"test_server.erl"},{line,1063}]},
{test_server,run_test_case_eval,9,
[{file,"test_server.erl"},{line,995}]}]}} 5077.8 / Erlang 20.0 / pgsql_mnesia / fa7b041 5077.9 / Erlang 21.0 / riak_mnesia / fa7b041 offline_SUITE:mod_offline_tests:max_offline_messages_reached{error,
{timeout_when_waiting_for_stanza,
[{escalus_client,wait_for_stanza,
[{client,<<"bOb8.11573@localhost/res1">>,escalus_tcp,<0.8734.1>,
[{event_manager,<0.8708.1>},
{server,<<"localhost">>},
{username,<<"bOb8.11573">>},
{resource,<<"res1">>}],
[{event_client,
[{event_manager,<0.8708.1>},
{server,<<"localhost">>},
{username,<<"bOb8.11573">>},
{resource,<<"res1">>}]},
{resource,<<"res1">>},
{username,<<"bOb8.11573">>},
{server,<<"localhost">>},
{host,<<"localhost">>},
{port,5222},
{auth,{escalus_auth,auth_plain}},
{wspath,undefined},
{username,<<"bOb8.11573">>},
{server,<<"localhost">>},
{password,<<"makrolika">>},
{stream_id,<<"3D08F955361D3E89">>}]},
5000],
[{file,
"/home/travis/build/esl/MongooseIM/big_tests/_build/default/lib/escalus/src/escalus_client.erl"},
{line,138}]},
{offline_SUITE,'-max_offline_messages_reached/1-fun-1-',6,
[{file,"offline_SUITE.erl"},{line,112}]},
{escalus_story,story,4,
[{file,
"/home/travis/build/esl/MongooseIM/big_tests/_build/default/lib/escalus/src/escalus_story.erl"},
{line,72}]},
{test_server,ts_tc,3,[{file,"test_server.erl"},{line,1545}]},
{test_server,run_test_case_eval1,6,
[{file,"test_server.erl"},{line,1063}]},... |
%% | ||
%% This middleware needs to run after `mongoose_cowboy_metrics_mw_before' and `cowboy_router' | ||
%% middleware. However, it does not need to run after `cowboy_hander' middleware because metrics are | ||
%% updated in `onresponse' callback which is called whenever the request is sent. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be whenever the response is sent
instead of request
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, thanks!
Method = get_req_method(Req), | ||
HandlerToPrefixMappings = proplists:get_value(handler_to_metric_prefix, Env, #{}), | ||
Prefix = maps:get(Handler, HandlerToPrefixMappings, [Handler]), | ||
mongoose_metrics:update(global, mongoose_cowboy_metrics:request_count_metric(Prefix, Method), 1), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to move the request
counter update to the before
middleware?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately we can't do that 😢 "Before" middleware doesn't know which handler will serve the request, because cowboy_router
has not been run yet.
On the other hand we can't merge the middleware, because in case of no matching routes cowboy_router terminates the middleware chain. In those cases we'd lose even the request count metric (currently we lose response count and latency).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, thanks for explanation :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@michalwski sorry, I was wrong. We can't report the metric until we know the handler, which means the reporting needs to occur after cowboy_router. However, if we're willing to not report the time which it takes for cowboy_router to execute (i.e. only measure the run time of the handler itself) then we could merge the middleware into one which could be placed after cowboy_router.
?assertNotEqual(0, proplists:get_value(value, DataPoints)). | ||
|
||
run_requests(Methods, Statuses) -> | ||
{ok, Conn} = gun:open("localhost", ranch:get_port(?LISTENER)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In small tests we already use fusco
when there is a need to send an HTTP request. It'd better to re-use what is already used, in my opinion. I do agree that there are better libs than fusco
. At some point it will be replaced by sth else (gun
or sth else).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😢
5078.1 / Erlang 19.3 / small_tests / b7bffef 5078.2 / Erlang 19.3 / internal_mnesia / b7bffef 5078.3 / Erlang 19.3 / mysql_redis / b7bffef 5078.4 / Erlang 19.3 / odbc_mssql_mnesia / b7bffef 5078.6 / Erlang 19.3 / elasticsearch_and_cassandra_mnesia / b7bffef 5078.5 / Erlang 19.3 / ldap_mnesia / b7bffef 5078.9 / Erlang 21.0 / riak_mnesia / b7bffef 5078.8 / Erlang 20.0 / pgsql_mnesia / b7bffef |
Codecov Report
@@ Coverage Diff @@
## master #1969 +/- ##
==========================================
- Coverage 74.91% 68.32% -6.59%
==========================================
Files 313 315 +2
Lines 28459 28468 +9
==========================================
- Hits 21319 19450 -1869
- Misses 7140 9018 +1878
Continue to review full report at Codecov.
|
5080.1 / Erlang 19.3 / small_tests / 7be9fc1 5080.5 / Erlang 19.3 / ldap_mnesia / 7be9fc1 5080.6 / Erlang 19.3 / elasticsearch_and_cassandra_mnesia / 7be9fc1 5080.3 / Erlang 19.3 / mysql_redis / 7be9fc1 5080.2 / Erlang 19.3 / internal_mnesia / 7be9fc1 5080.4 / Erlang 19.3 / odbc_mssql_mnesia / 7be9fc1 5080.8 / Erlang 20.0 / pgsql_mnesia / 7be9fc1 5080.9 / Erlang 21.0 / riak_mnesia / 7be9fc1 |
%% See the License for the specific language governing permissions and | ||
%% limitations under the License. | ||
%% | ||
%% @doc Cowboy middleware updating metrics related to request handling |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be even better to have all of this information in a markdown docs. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we document each MongooseIM module in markdown docs? Not only the ones which are configurable by users? I mean, I know that one can configure this middleware in ejabberd_cowboy's protocol opts, but it's rather aimed at MongooseIM's developers, and not operators.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The metrics names and meaning must be described in user docs. The module role description may remain here.
ejabberd_listener:start_listeners(), | ||
mongoose_metrics:init(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was there some kind of problem with old order?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, mongoose_metrics:init()
subscribes all reporters to all metrics which are running at the moment of its start. Since ejabberd_cowboy
is started as one of the listeners, it would define metrics after the reporters were subscribed, which means that no reporter reported HTTP API metrics.
when Acc :: #{module() => mongoose_cowboy_metrics:prefix()}. | ||
build_metric_prefixes(_BasePrefix, [], Acc) -> | ||
Acc; | ||
build_metric_prefixes(BasePrefix, [{_Host, _Path, Handler, _HandlerOpts, Opts} | Tail], Acc) -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if these metrics are that valuable with no path distinction. Various actions triggered by different paths (and their :parameters
) may have drastically different impact on latency stats (if they all apply e.g. to 200 responses to GETs).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree that in general case it won't work well. However, in our HTTP API handlers usually do the exact same thing given some HTTP method (i.e. it doesn't matter what was the path or parameters).
The other solution would be hand-crafted metrics, different for each handler (i.e. we explicitly report metrics from each HTTP module). However, we would not get any metrics in case the handler has crashed.
Maybe the metrics for each handler in a single ejabberd_cowboy's
scope should opt-in as well? This way we can turn them on only for endpoints which we know it makes sense to have these kinds of metrics for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I understand it's good enough until someone actually needs a more fine-grained names. :)
This way reporters will be automatically subscribed to all HTTP metrics.
b491188
to
e18e89c
Compare
5164.1 / Erlang 19.3 / small_tests / 741de3b 5164.3 / Erlang 19.3 / mysql_redis / 741de3b 5164.5 / Erlang 19.3 / ldap_mnesia / 741de3b 5164.6 / Erlang 19.3 / elasticsearch_and_cassandra_mnesia / 741de3b 5164.2 / Erlang 19.3 / internal_mnesia / 741de3b 5164.4 / Erlang 19.3 / odbc_mssql_mnesia / 741de3b offline_SUITE:mod_offline_tests:headline_message_is_not_stored{error,
{{assertion_failed,assert,is_presence,
{xmlel,<<"message">>,
[{<<"from">>,<<"alicE46.330326@localhost/res1">>},
{<<"to">>,<<"bOb46.330326@localhost/res1">>},
{<<"xml:lang">>,<<"en">>},
{<<"type">>,<<"headline">>}],
[{xmlel,<<"body">>,[],[{xmlcdata,<<"msgtxt">>}]}]},
"<message from='alicE46.330326@localhost/res1' to='bOb46.330326@localhost/res1' xml:lang='en' type='headline'><body>msgtxt</body></message>"},
[{escalus_new_assert,assert_true,2,
[{file,
"/home/travis/build/esl/MongooseIM/big_tests/_build/default/lib/escalus/src/escalus_new_assert.erl"},
{line,84}]},
{offline_SUITE,login_send_and_receive_presence,2,
[{file,"offline_SUITE.erl"},{line,171}]},
{offline_SUITE,'-headline_message_is_not_stored/1-fun-0-',3,
[{file,"offline_SUITE.erl"},{line,95}]},
{escalus_story,story,4,
[{file,
"/home/travis/build/esl/MongooseIM/big_tests/_build/default/lib/escalus/src/escalus_story.erl"},
{line,72}]},
{test_server,ts_tc,3,[{file,"test_server.erl"},{line,1529}]},
{test_server,run_test_case_eval1,6,
[{file,"test_server.erl"},{line,1045}]},
{test_server,run_test_case_eval,9,
[{file,"test_server.erl"},{line,977}]}]}} 5164.8 / Erlang 20.0 / pgsql_mnesia / 741de3b 5164.9 / Erlang 21.0 / riak_mnesia / 741de3b |
@@ -130,8 +130,9 @@ do_start_cowboy(Ref, Opts) -> | |||
TransportOpts = gen_mod:get_opt(transport_options, Opts, []), | |||
Modules = gen_mod:get_opt(modules, Opts), | |||
Dispatch = cowboy_router:compile(get_routes(Modules)), | |||
ProtocolOpts = [{env, [{dispatch, Dispatch}]} | | |||
gen_mod:get_opt(protocol_options, Opts, [])], | |||
{MetricsEnv, MetricsProtoOpts} = maybe_init_metrics(Opts), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm maybe we can move whole metrics logic to separate module ?
The goal of this PR is to introduce metrics to the HTTP API ("backend", "frontend", whatever). It consists of two parts:
mongoose_cowboy_metrics_mw_before
andmongoose_cowboy_metrics_mw_after
)ejabberd_cowboy
responsible for assembling names of metrics and defining them (currently metrics are opt-in)For each handler (or rather for each prefix) "types" of metrics are recorded:
..., Method, request, count]
..., Method, response, Class, count]
)..., Method, response, Class, latency]
)These metrics together cover 3 of 4 golden signals.
Metrics need to be enabled by providing
{metrics, true}
option toejabberd_cowboy
. All defined metrics are global. By default they are named like[http] ++ lists:flatten([HandlerBaseName]) ++ MetricTypeSuffix]
.The
HandlerBaseName
part is the handler's module name by default. The handler's definition tuple can now contain a fifth element, which is not passed to handler but is interpreted byejabberd_cowboy
. This element is a list of options, and currently the only supported one ismetrics
:false
, in which case the metrics for the handler won't be created{prefix, Prefix}
option, in which casePrefix
becomes aHandlerBaseName
There is a small issue with this design - if one configured the same handler multiple times under different paths, metrics for both these handlers will have the same name (which obviously might skew the results if the handlers are configured differently). In such cases when we have duplicated metric names we could:
I haven't decided which option to choose yet.
Oh, and recording of metrics can be disabled by executing
cowboy:set_env(Ref, record_metrics, false)
.TODOs:
ejabberd_cowboy
's new optionsejabberd_cowboy
with metrics