- Author(s): lidizheng
- Approver: ericgribkoff
- Status: Draft
- Implemented in: Python
- Last updated: 12-10-2018
- Discussion at: https://groups.google.com/forum/#!topic/grpc-io/8Ys6ba8gpLc
The current design of the gRPC Python servicer API does not make it easy to pair an RPC terminated with a non-OK status with additional information that cannot be expressed in the `grpc-status` and `grpc-message` fields. This proposal adds a rich `Status` class to gRPC Python that couples the failure's status code, error string, and arbitrary additional information about the failure (sent as trailing metadata) in a unified interface.
The gRPC spec defines two trailers that indicate the status of an RPC: `grpc-status` and `grpc-message`. However, a status code and a text message alone are insufficient to express more complicated situations, such as an RPC that failed a quota check for a specific policy, a stack trace from a crashed server, or a retry delay suggested by the server (see error details).
The primary use case for this feature is Google Cloud API, which needs to transport rich status from the server to the client.
So the gRPC team introduced a new internal trailing metadata entry, `grpc-status-details-bin`, to serve this purpose. It carries a serialized `google.rpc.status.Status` proto message and transmits it as binary metadata. Notably, the `google.rpc.status.Status` message contains its own `code` and `message` fields. However, since gRPC is a ProtoBuf-agnostic framework, it can enforce neither the content of the binary data set in the trailing metadata entry nor the consistency of the status code and status message with the additional status detail proto message. As a result, the three major gRPC implementations each behave slightly differently.
The C++ implementation tolerates anything placed in the `grpc-status-details-bin` entry; Java checks that the code of the proto status matches the status code of the RPC; Golang enforces that both the code and the message of the proto status match those of the RPC.
So, which paradigm should Python follow?
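The three behaviors described above can be summarized as three validation policies. The sketch below is illustrative only: `Policy`, its member names, and `is_consistent` are hypothetical names chosen for this example and do not exist in any gRPC implementation.

```python
from enum import Enum


class Policy(Enum):
    """Hypothetical labels for the three behaviors described above."""
    TOLERANT = "c++"          # accept anything in grpc-status-details-bin
    CODE_ONLY = "java"        # proto code must equal the RPC status code
    CODE_AND_MESSAGE = "go"   # proto code and message must both match


def is_consistent(policy, rpc_code, rpc_message, proto_code, proto_message):
    """Return True if the proto status is acceptable under the policy."""
    if policy is Policy.TOLERANT:
        return True
    if policy is Policy.CODE_ONLY:
        return rpc_code == proto_code
    # CODE_AND_MESSAGE: both fields have to agree.
    return rpc_code == proto_code and rpc_message == proto_message
```

Under this model, a proto status whose code matches the RPC but whose message differs passes the first two policies and fails the third.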
The status proto is well-defined and used in many frameworks. Here is the definition of `Status`; for the full version, see status.proto.
    message Status {
      int32 code = 1;
      string message = 2;
      repeated google.protobuf.Any details = 3;
    }
    # Client side
    stub = ...Stub(channel)
    try:
        resp = stub.ARpcCall(...)
    except grpc.RpcError as rpc_error:
        code = rpc_error.code()
        details = rpc_error.details()
        # Unable to get rich status

    # Server side
    def ...Servicer(...):
        def ARPCCall(request, context):
            context.set_code(...)
            context.set_details(...)
            # Unable to set rich status
This class is used to describe the status of an RPC.
The class is named `Status` because it is shorter and cleaner than `RpcStatus` or `ServerRpcStatus`. In the future, we might want to use it on the client side as well.
The metadata field of `Status` is named `trailing_metadata` instead of `metadata`. A `Status` should be the final status of an RPC, so it should help developers get information that is only accessible at the end of the RPC.
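To make the shape of such an object concrete, here is a minimal sketch of an immutable value carrying the three fields. It is only an assumption about how an implementation might look: `StatusCode` and `Status` are local stand-ins for `grpc.StatusCode` and the proposed `grpc.Status`, and `SimpleStatus` and the `retry-hint` metadata key are hypothetical names.

```python
import abc
import collections
import enum


class StatusCode(enum.Enum):
    """Stand-in for grpc.StatusCode (only two members shown)."""
    OK = 0
    INVALID_ARGUMENT = 3


class Status(abc.ABC):
    """Stand-in for the proposed abstract status interface."""


class SimpleStatus(
        collections.namedtuple("SimpleStatus",
                               ("code", "details", "trailing_metadata")),
        Status):
    """Hypothetical immutable implementation holding the three fields."""


status = SimpleStatus(
    code=StatusCode.INVALID_ARGUMENT,
    details="Can't recognize this argument",
    trailing_metadata=(("retry-hint", "none"),),  # hypothetical entry
)
```

A named tuple keeps the status immutable, which fits the idea that it represents the final state of an RPC.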
    class Status(six.with_metaclass(abc.ABCMeta)):
        """Describes the status of an RPC.

        This is an EXPERIMENTAL API.

        Attributes:
          code: A StatusCode object to be sent to the client.
          details: An ASCII-encodable string to be sent to the client upon
            termination of the RPC.
          trailing_metadata: The trailing :term:`metadata` in the RPC.
        """
Available options are listed in the Rationale section. Here is our final consensus as a team, but feel free to comment on other options.
During the discussion we came up with several criteria for the server-side API:
- Shouldn't pass `ServicerContext` around and let the extension package mutate it.
- Should minimize the changes to our main package.
- Should allow the status code/message validation at some point.
- Should reserve the extensibility for future updates.
- Should not confuse developers about the priority between the new API and old APIs.
- Should be Pythonic.
As a result, we decided to add a new API named `abort_with_status` that accepts the new interface `grpc.Status`. Unlike `set_code`/`set_details`, which can be called multiple times, the abort mechanism ensures that the code, details, and metadata of the status are set as the final status of `grpc.Call`.
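The difference between repeatable setters and a terminal abort can be illustrated with a toy model. This is a sketch under stated assumptions, not the gRPC runtime: `FakeContext`, `FakeStatus`, `_Abort`, and `run_handler` are hypothetical stand-ins, status codes are plain strings, and rejecting an OK code with `ValueError` is an assumption made for this example.

```python
import collections

FakeStatus = collections.namedtuple(
    "FakeStatus", ("code", "details", "trailing_metadata"))


class _Abort(Exception):
    """Hypothetical marker exception used to unwind the handler."""


class FakeContext:
    """Toy stand-in for ServicerContext; not the real gRPC class."""

    def __init__(self):
        self.code, self.details, self.trailing_metadata = "OK", "", ()

    def set_code(self, code):
        self.code = code  # may be overwritten by later calls

    def abort_with_status(self, status):
        if status.code == "OK":
            # Assumption for this sketch: an OK status is rejected.
            raise ValueError("abort_with_status requires a non-OK code")
        # Supersede anything set earlier, then unwind the handler.
        self.code = status.code
        self.details = status.details
        self.trailing_metadata = status.trailing_metadata
        raise _Abort()


def run_handler(handler, request):
    """Toy runtime: call the handler and return the final status fields."""
    context = FakeContext()
    try:
        handler(request, context)
    except _Abort:
        pass  # the final status is already recorded on the context
    return context.code, context.details, context.trailing_metadata


def handler(request, context):
    context.set_code("NOT_FOUND")
    context.abort_with_status(
        FakeStatus("INVALID_ARGUMENT", "bad request", (("k", "v"),)))
    context.set_code("INTERNAL")  # never reached


result = run_handler(handler, request=None)
```

In this model the earlier `set_code` call is overridden and the later one never runs, so the aborted status is the one the toy runtime reports.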
    def abort_with_status(self, status):
        """Raises an exception to terminate the RPC with a non-OK status.

        The status passed as argument will supersede any existing status code,
        status message and trailing metadata.

        This is an EXPERIMENTAL API.

        Args:
          status: A grpc.Status object. The status code in it must not be
            StatusCode.OK.

        Raises:
          Exception: An exception is always raised to signal the abortion of
            the RPC to the gRPC runtime.
        """
Besides the change to the status API, there should be a new gRPC Python extension package named `grpcio-status` that depends on ProtoBuf. The new package should provide convenience functions that help developers convert a ProtoBuf instance into a gRPC status. The usage should look like:
    # Client side
    from grpc_status import rpc_status

    stub = ...Stub(channel)
    try:
        resp = stub.AMethodHandler(...)
    except grpc.RpcError as rpc_error:
        rich_status = rpc_status.from_rpc_error(rpc_error)
        # `rich_status` here is a ProtoBuf instance of `google.rpc.status.Status` proto message

    # Server side
    from grpc_status import rpc_status
    from google.protobuf import any_pb2

    def ...Servicer(...):
        def ARPCCall(request, context):
            ...
            detail = any_pb2.Any()
            detail.Pack(
                rpc_status.error_details_pb2.DebugInfo(
                    stack_entries=traceback.format_stack(),
                    detail="Can't recognize this argument",
                )
            )
            rich_status = grpc_status.status_pb2.Status(
                code=grpc_status.code_pb2.INVALID_ARGUMENT,
                message='API call quota depleted',
                details=[detail]
            )
            context.abort_with_status(rpc_status.to_status(rich_status))
            # The method handler will abort
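How the rich status travels can be sketched without ProtoBuf: the serialized proto is placed in trailing metadata under the `grpc-status-details-bin` key, and the client looks that key up again. The sketch below is an assumption about the mechanics, not the package's code; `RichStatus` and `details_from_trailing_metadata` are hypothetical names, this `to_status` takes pre-serialized bytes instead of a proto instance, and the bytes are an opaque placeholder for the output of `SerializeToString()`.

```python
import collections

GRPC_DETAILS_METADATA_KEY = "grpc-status-details-bin"

# Hypothetical stand-in for an object implementing the proposed Status.
RichStatus = collections.namedtuple(
    "RichStatus", ("code", "details", "trailing_metadata"))


def to_status(code, message, serialized_proto):
    """Bundle code, message and serialized proto bytes into one status."""
    return RichStatus(
        code=code,
        details=message,
        trailing_metadata=((GRPC_DETAILS_METADATA_KEY, serialized_proto),),
    )


def details_from_trailing_metadata(trailing_metadata):
    """Return the serialized proto bytes, or None if the entry is absent."""
    for key, value in trailing_metadata:
        if key == GRPC_DETAILS_METADATA_KEY:
            return value
    return None


# Placeholder bytes standing in for a serialized status proto.
sent = to_status(3, "Can't recognize this argument", b"\x08\x03")
received = details_from_trailing_metadata(sent.trailing_metadata)
```

A real extension package would additionally parse the bytes back into a proto message and could compare its `code` and `message` with those of the RPC.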
There are six more alternative options for implementing this feature.
We expose a new public exception interface that developers can raise within their servicer method handlers. The exception itself contains information like the status code, the status message, and most importantly the rich status details. As for the extension package, we shall expose a function to help developers assemble that exception.
The current server-side abort mechanism relies on an exception as well. When a developer calls `context.abort(...)`, a designated exception is raised. The upstream function then catches that exception and aborts the RPC.
PS: A custom error handler mechanism is needed for gRPC Python but is not supported yet. The correct way to provide it is through a fully-featured server-side interceptor, as described in proposal L13. However, that never got implemented.
    # Usage Snippet
    import grpc_status
    from google.protobuf import any_pb2

    def ...Servicer(...):
        def AMethodHandler(request, context):
            ...
            detail = any_pb2.Any()
            detail.Pack(
                error_details_pb2.DebugInfo(
                    stack_entries=traceback.format_stack(),
                    detail="Can't recognize this argument",
                )
            )
            rich_status = grpc_status.status_pb2.Status(
                code=grpc_status.code_pb2.INVALID_ARGUMENT,
                message='API call quota depleted',
                details=[detail]
            )
            raise grpc_status.to_exception(rich_status)
- Follows the existing implementation mechanism (abort)
- Allows raising the exception from any layer of the stack, which also avoids passing the servicer context around
- Adds a new public interface
- Adds a new exception handling mechanism (@gnossen: it's magic!)
- No rich status for the successful call
In C++/Java/Golang, the status is implemented as a cohesive class that handles status-related information. Unfortunately, the current Python design doesn't support this. And as @ericgribkoff mentioned, the Python API naming mixes up the concepts of status message and status details: there is a `set_details` method on the servicer context, but it is used to set the status message. So we could presumably reorganize that information into a server-side status class as an alternative to the existing methods, and deprecate those single-value setters in the future.
Also, the new interface allows gRPC Python to validate that the code, message, and details provided by the developer match. Even with this validation, developers are still able to abuse the API by providing arbitrary status details.
    # Add a new interface for ExtraDetails
    class ExtraDetails(six.with_metaclass(abc.ABCMeta)):

        @abc.abstractmethod
        def code(self): pass

        @abc.abstractmethod
        def message(self): pass

        @abc.abstractmethod
        def details(self): pass

    # Usage Snippet
    import grpc_status
    from google.protobuf import any_pb2

    def ...Servicer(...):
        def AMethodHandler(request, context):
            ...
            detail = any_pb2.Any()
            detail.Pack(
                error_details_pb2.DebugInfo(
                    stack_entries=traceback.format_stack(),
                    detail="Can't recognize this argument",
                )
            )
            rich_status = grpc_status.assemble_status(
                code=grpc_status.code_pb2.INVALID_ARGUMENT,
                message='API call quota depleted',
                details=[detail],
            )
            context.set_extra_status(grpc_status.convert_to_extra_details(rich_status))
            ...
- Expose the setting of status to developers like Java/Go
- Status-related information managed in one place
- Clean, plain design if `set_code`/`set_details` are removed
- Have to add a brand new `Status` class
- Ambiguity of responsibility of the new status class
- Asymmetric with client side channel design
- Needs to educate developers about priority
In this design, the new extension package provides a single function, `abort_with_status`, that accepts the server context and a status proto instance. It sets the code, message, and trailing metadata inside the function. The downside is that developers have to pass a mutable instance to another package to change it. It also leaves a semantic ambiguity about the abort behavior when an exception occurs: whether the abort should continue or not.
    # Usage Snippet
    import grpc_status
    from google.protobuf import any_pb2

    def ...Servicer(...):
        def AMethodHandler(request, context):
            ...
            detail = any_pb2.Any()
            detail.Pack(
                error_details_pb2.DebugInfo(
                    stack_entries=traceback.format_stack(),
                    detail="Can't recognize this argument",
                )
            )
            rich_status = grpc_status.status_pb2.Status(
                code=grpc_status.code_pb2.INVALID_ARGUMENT,
                message='API call quota depleted',
                details=[detail]
            )
            grpc_status.abort_with_status(context, rich_status)
            # An exception will be raised. RPC call abort.
- Zero code change to the main package
- Passing server context around (@ericgribkoff: ugly!)
- Ambiguous abortion behavior
The new package handles the conversion from a ProtoBuf message instance to gRPC Python metadata, which is a two-element tuple. It is the most hands-off option: our main framework promises developers nothing about the usage of `grpc-status-details-bin`.
    # Usage Snippet
    import grpc_status
    from google.protobuf import any_pb2

    def ...Servicer(...):
        def AMethodHandler(request, context):
            ...
            detail = any_pb2.Any()
            detail.Pack(
                error_details_pb2.DebugInfo(
                    stack_entries=traceback.format_stack(),
                    detail="Can't recognize this argument",
                )
            )
            rich_status = grpc_status.status_pb2.Status(
                code=grpc_status.code_pb2.INVALID_ARGUMENT,
                message='API call quota depleted',
                details=[detail]
            )
            context.set_code(rich_status.code)
            context.set_details(rich_status.message)
            context.set_trailing_metadata(
                grpc_status.to_metadata(rich_status)
            )
- Zero code change to the main package
- Verbosity
- No code/message validation at all
- Confusing definition of message/details
Similar to option 4, this option goes one step further: the framework automatically sets the string to the metadata entry `grpc-status-details-bin`.
    ### Client side ###
    stub = ...Stub(channel)
    try:
        resp = stub.AMethodHandler(...)
    except grpc.RpcError as rpc_error:
        binary_status = rpc_error.binary_status()
        status = grpc_status.parse(binary_status)
        # Do stuff with the status

    ### Server side ###
    def ...Servicer(...):
        def AMethodHandler(request, context):
            ...
            detail = any_pb2.Any()
            detail.Pack(
                error_details_pb2.DebugInfo(
                    stack_entries=traceback.format_stack(),
                    detail="Can't recognize this argument",
                )
            )
            rich_status = grpc_status.status_pb2.Status(
                code=grpc_status.code_pb2.INVALID_ARGUMENT,
                message='API call quota depleted',
                details=[detail]
            )
            context.set_code(rich_status.code)
            context.set_details(rich_status.message)
            # ProtoBuf's Python API spells this method SerializeToString.
            context.set_binary_status(rich_status.SerializeToString())
            ...
- Similar to C++'s strictness: code/message/details are independent
- Promises developers the usage of the metadata entry `grpc-status-details-bin`
- Verbosity
- No code/message validation at all
- Confusing definition of message/details
The new API will be added to `ServicerContext` and named `set_status`. It accepts two positional arguments and a keyword argument, so setting the status code and status message is mandatory, but setting details is not. At the same time, the current `set_code`/`set_details` APIs will be labeled as "deprecated".
The function signature should be:
    def set_status(code, message, details=""): pass
The usage should look like:
    # Server side
    from grpc_status import rpc_status
    from google.protobuf import any_pb2

    def ...Servicer(...):
        def ARPCCall(request, context):
            ...
            detail = any_pb2.Any()
            detail.Pack(
                rpc_status.error_details_pb2.DebugInfo(
                    stack_entries=traceback.format_stack(),
                    detail="Can't recognize this argument",
                )
            )
            rich_status = grpc_status.status_pb2.Status(
                code=grpc_status.code_pb2.INVALID_ARGUMENT,
                message='API call quota depleted',
                details=[detail]
            )
            context.set_status(*rpc_status.convert(rich_status))
- Concise API call
- Avoid introducing new classes
- Recommending the use of splat is not Pythonic
- Need to educate developers about the priority of those three APIs
Although the grpcio package can't enforce the consistency of the status code and status message with the information embedded in the rich `Status` proto, there is still value in providing an additional package, one that can depend on protobuf and is able to verify that these elements are consistent, to help developers use the API in its intended fashion. It demonstrates our recommendation through the design.
One more reason is that, in the future, `google.rpc.status.Status` may not be the only status proto we accept. This package provides a straightforward mapping between different status proto messages.
gRPC Python uses the field name `details` to store data that will later be transmitted in the `grpc-message` header. But the `details` field in the proto `Status` is a list of error detail proto messages. Unfortunately, this discrepancy is deeply embedded in the gRPC Python API and unlikely to change.
Pull Requests:
- Part of new status API is implemented in grpc/grpc#17481
- The new extension package is implemented in grpc/grpc#17490
Issue: grpc/grpc#16366