
[discussion] make generated JsonCodec re-usable in other code #3070

Open
myazinn opened this issue Aug 30, 2024 · 24 comments
Labels
enhancement New feature or request

Comments

@myazinn
Contributor

myazinn commented Aug 30, 2024

Is your feature request related to a problem? Please describe.
Let's say I have code that looks like this:

final case class MyModel(
  @fieldName("dakar_rally") f1: Boolean,
  f2: Boolean
)

implicit val myModelSchema: Schema[MyModel] = DeriveSchema.gen
implicit val myModelHttpContentCodec: HttpContentCodec[MyModel] = HttpContentCodec.fromSchema

OpenAPIGen.fromEndpoints(Endpoint(Method.GET / "foo").out[MyModel])

For my model, the schema acts as the source of truth. The problem is that HTTP isn't the only place where I'd like to use the same codec for this model. For example, we send this data to a Kafka topic, but also want to provide an HTTP endpoint to fetch it manually if needed. So I would like to keep the codecs as consistent as possible to ensure they're the same everywhere.

Describe the solution you'd like
I'd like to be able to easily and safely "extract" the generated json codec from HttpContentCodec.

Describe alternatives you've considered
I can think of only three solutions:

  1. Generate my own json codec using the schema I already have. While it does work, I'd end up with two different codecs defined in two places, even though the root information (the Schema) is the same.
  2. Forcefully replace the json codec in the provided myModelHttpContentCodec with my own. Something like this:
implicit val myModelHttpContentCodec: HttpContentCodec[MyModel] = {
  val myOwnJsonHttpContentCodec =
    HttpContentCodec(
      ListMap(
        MediaType.application.`json` ->
          BinaryCodecWithSchema(
            myOwnJsonCodec,
            schema,
          ),
      ),
    )
  HttpContentCodec.fromSchema ++ myOwnJsonHttpContentCodec
}

That could also work, but I'd have to copy-paste some code.
  3. Extract it from HttpContentCodec directly, e.g.

val binaryCodec = myModelHttpContentCodec.lookup(MediaType.application.`json`).get.codec

While that involves the least amount of code, there are other issues. First, it's not really safe. While lookup is a public API, its internals feel like an implementation detail which I shouldn't rely on. Also, it provides a BinaryCodec from zio-schema, even though under the hood it uses a codec from zio-json. If I need to use zio-json somewhere, I'd have to write a wrapper from the zio-schema BinaryCodec to a zio-json codec. It's not a huge deal, but still.

Additional context
Perhaps it's a stupid question and I'm missing something / doing something wrong. Is there no other easy way to do it? That sounds like something that should be quite common and easy to do. I know that HttpContentCodec is supposed to be used not only with json, but I've got a feeling that most of the time it will actually be plain json. So making a special case for such a common task doesn't sound too bad to me.
I guess it could be fixed by adding something like an updated method to HttpContentCodec, so that it could be used like this:

implicit val myModelHttpContentCodec: HttpContentCodec[MyModel] =
  HttpContentCodec.fromSchema.updated(MediaType.application.`json`, myOwnJsonHttpContentCodec)

Though it's still not perfect, since I'll have no idea whether the codec was there in the first place and will therefore actually be used. And I'd still have to define myOwnJsonHttpContentCodec manually.

@myazinn myazinn added the enhancement New feature or request label Aug 30, 2024
@987Nabil
Contributor

I understand the goal of using the same codec in two places, and it is a good use case. The first thing I'd change about the approach is that I'd not try to get a zio-json codec, but just use zio-schema for the Kafka JSON en/decoding as well (that's also my goal in my day job).
Then this task already becomes easier, I think.
I personally would not like to return an explicit zio-json codec from the HttpContentCodec, since it reveals implementation details. Also, what is returned here would always be optional, since we don't know if there is a json media type registered in the ContentCodec.

While I think there needs to be some R&D to find a nice solution, the feature request is valid.
Maybe there is already a way in the code we have. Writing docs could also be a valid outcome.

@decoursin

I'm new to zio-http, and I would also very much prefer a much simpler solution for using zioJsonBinaryCodecs instead of schemaBasedBinaryCodecs.

Based on my research and practice, it seems zio-schema has more limitations than writing a BinaryCodec in zio-json, or at least writing a codec in zio-json is far easier when you bump up against those zio-schema limitations. That's why I wrote all my codecs in zio-json rather than zio-schema. At the very least, manual schema construction in zio-schema is sort of a pain in the ass, but it's much more pleasant and easier in zio-json.

However, using those zio-json codecs with zio-http's Endpoint server API is surprisingly difficult.

There is no HttpContentCodec.fromZioJson, only HttpContentCodec.fromSchema. Why? I understand the schema is necessary for the documentation, but I would think that I should still be able to have my case class outputs encoded using zio-json rather than zio-schema.

I admit I don't completely understand, so maybe I'm off base, but I don't think so. I am very thankful for zio, and zio-http definitely seems to be heading in a great direction, thank you ❤️.

@decoursin

I personally would not like to return an explicit zio-json codec from the http content codec, since it reveals implementation details.

By the way, I understand that for certain pristine projects that might be preferred, but in my experience what I usually want is to get something working as quickly as possible. Worrying about revealing json implementation details is usually only a nice-to-have at the end.

@987Nabil
Contributor

@decoursin There is no intention, and there never will be, to support non-schema-based codecs directly in zio-http's Endpoint API. The Endpoint API is schema based. But under the hood we use zio-json. I have used both and know both well, and I'd say zio-schema is more powerful in the simple use case. And you should not use manual schemas.
As you said, you should be doing as little as possible to make json work. Also, this seems to go in a different direction than what @myazinn wants.

But please, add more details and let me understand what your use case is. Then we can find a solution. I would assume, that there is either missing understanding or missing documentation for what you want to do.
And I'll be happy to help out with that.

@decoursin

@987Nabil Yeah, I didn't mean to hijack the thread, but it seemed to me like there are similarities, and it was labeled as a discussion anyway.

I understand that the zio-http Endpoint API is schema based for generating API documentation. However, the zio-http Endpoint API already supports non-schema-based codecs. @myazinn demonstrated this himself:

  val myOwnJsonHttpContentCodec =
    HttpContentCodec(
      ListMap(
        MediaType.application.`json` ->
          BinaryCodecWithSchema(
            myOwnJsonCodec,
            schema,
          ),
      ),
    )

The myOwnJsonCodec can be defined using JsonEncoder or JsonDecoder, like this:

case class SomeCaseClass(field: String)

implicit lazy val zioEncoderSomeCaseClass: JsonEncoder[SomeCaseClass] =
  DeriveJsonEncoder.gen[SomeCaseClass]
implicit lazy val zDecodeSomeCaseClass: JsonDecoder[SomeCaseClass] =
  DeriveJsonDecoder.gen[SomeCaseClass]

val myOwnJsonCodec = zio.schema.codec.JsonCodec.zioJsonBinaryCodec[SomeCaseClass]

When you do that in combination with .out[SomeCaseClass] (like in the following), zio-http will encode the json using myOwnJsonCodec (or rather zioEncoderSomeCaseClass) instead of the schema codec:

  val somethingRoute = Endpoint(Method.GET / Root / "something")
    .out[SomeCaseClass]
    .implement(_ => ZIO.succeed(SomeCaseClass("hi")))

This works the same way as in Tapir. In Tapir, the schema can be defined separately from the json encoding/decoding.

This would also solve @myazinn's problem, because then he would be able to seamlessly use his own json codec (as he said himself).

So it's really the same problem in my opinion, and I haven't hijacked the thread at all I would say.

@decoursin

To me it's bewildering that I can't seamlessly use zio-json encoding/decoding with zio-http. I can even do that with Tapir.

But I don't mean to disrespect the project, I love zio, I just think it would be a very crucial and important addition, to be able to more seamlessly support zio-json encoding/decoding.

@987Nabil
Contributor

@decoursin why do you want to derive the json codec in the first place? Why not just use the default schema based json codec?
Also, yes, using a non-schema-based codec works, but it is highly discouraged, since the docs would easily diverge from actual behavior.

@987Nabil
Contributor

To me it's bewildering that I can't seamlessly use zio-json encoding/decoding with zio-http. I can even do that with Tapir.

But I don't mean to disrespect the project, I love zio, I just think it would be a very crucial and important addition, to be able to more seamlessly support zio-json encoding/decoding.

Tapir is not using zio-schema. And I do not understand why you would like to use zio-json anyway. There should be no reason.

@987Nabil
Contributor

the goal should be, that there is no need for non schema based codecs in almost all cases

@decoursin

zio-json is turing complete. zio-schema is not turing complete. There is no limits what I can do with zio-json, that's not the case for zio-schema.

With zio-json I can combine or split different fields on a case class into one. With zio-schema that's not possible.

With zio-json I can dynamically determine how I want to parse or render my data. With zio-schema that's not possible. For example, in zio-json I can dynamically and easily determine using simple scala at runtime based on the data how to parse the data, what case class to use, maybe apply some transformations before inserting into the case classes; none of this is possible with zio-schema or it's extremely tedious and unwanted.
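To make the runtime-dispatch claim concrete, here is a minimal sketch using zio-json's ast module: inspect the raw JSON tree first, then pick a decoder. The EventV1/EventV2 case classes and the "version" discriminator are invented for illustration, not from any real API:

```scala
import zio.json._
import zio.json.ast.Json

final case class EventV1(name: String)               // hypothetical old shape
final case class EventV2(name: String, version: Int) // hypothetical new shape

object EventV1 { implicit val dec: JsonDecoder[EventV1] = DeriveJsonDecoder.gen[EventV1] }
object EventV2 { implicit val dec: JsonDecoder[EventV2] = DeriveJsonDecoder.gen[EventV2] }

// Decode to the untyped ast first, then decide at runtime which case class to use.
val eventDecoder: JsonDecoder[Either[EventV1, EventV2]] =
  JsonDecoder[Json].mapOrFail {
    case obj @ Json.Obj(fields) if fields.exists(_._1 == "version") =>
      obj.as[EventV2].map(Right(_))
    case other =>
      other.as[EventV1].map(Left(_))
  }
```

This kind of "look before you decode" dispatch is the sort of thing the comment is describing; whether it is in scope for a schema-based Endpoint API is the open question of this thread.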

I can do dynamic logging with zio-json; with zio-schema that's not possible.

zio-schema is like having a voucher at a store, rather than having cash (zio-json). I don't want a voucher, I want cash and flexibility.

zio-json is simple. zio-schema is simple when you're just using automatic schema generation, but the second you have to go outside of that, it's not fun, it's hard. zio-schema is like taking a small problem (creating json from a case class) and turning it into something far more complex so that it can try to support all of this.

zio-schema is fundamentally designed to describe the structure of data; zio-json is designed for parsing and rendering data. Those are different purposes.

zio-schema is not actually popular. I don't want to build my software on top of it, other than basic stuff like auto-generation of schemas and defining the simple structure of the data.

And you should not use manual schemas.

This is part of the problem. I need to use manual schemas, and manual schemas suck. I don't want to write the names of the fields in a "string", for example, or the names of the case classes.

I have to write my case classes to match 3rd party APIs that cause all sorts of discrepancies. I'm not going to make all these unnecessary transformations, and multiple case classes for one datatype just to parse json, if that were even possible with zio-schema. I need to not just parse it but also render it as well.

@decoursin

the goal should be, that there is no need for non schema based codecs in almost all cases

These goals are fairy-tale goals that don't respect real programmers' needs, or the convenience and reliability of working directly with the data interchange format (i.e. json). I don't want to depend on one library (zio-schema) to drive another library (zio-json) that then generates my json; I want to generate it directly from my json library. I don't want middlemen in between my middlemen, I just want to work right at the source.

@decoursin

Does zio-http want to be relevant? IDK, maybe the developers are happy just using it for themselves; that would be fine, it's their project, I'm not complaining. But if it wants to be relevant, then it has to make some practical changes IMO.

@987Nabil
Contributor

987Nabil commented Sep 18, 2024

zio-json is turing complete. zio-schema is not turing complete. There is no limits what I can do with zio-json, that's not the case for zio-schema.

This is not true. Turing completeness is an attribute of a programming language. Neither of these is a language, so neither is Turing complete.

With zio-json I can combine or split different fields on a case class into one. With zio-schema that's not possible.

Please make a concrete example. This is an abstract description. Show me the code.

With zio-json I can dynamically determine how I want to parse or render my data. With zio-schema that's not possible. For example, in zio-json I can dynamically and easily determine using simple scala at runtime based on the data how to parse the data, what case class to use, maybe apply some transformations before inserting into the case classes; none of this is possible with zio-schema or it's extremely tedious and unwanted.

Again, code please. And even if true, the Endpoint API has a clear and narrow purpose. Feel free to use the low-level API if you don't like the constraints.

I can do dynamic logging with zio-json; with zio-schema that's not possible.

Maybe. But that might still be out of scope for the Endpoint API.

zio-schema is like having a voucher at a store, rather than having cash (zio-json). I don't want a voucher, I want cash and flexibility.

No idea what this is supposed to mean.

zio-json is simple. zio-schema is simple when you're just using automatic schema generation, but the second you have to go outside of that, it's not fun, it's hard. zio-schema is like taking a small problem (creating json from a case class) and turning it into something far more complex so that it can try to support all of this.

Yes, schema has a lot of use cases

zio-schema is fundamentally designed to describe the structure of data; zio-json is designed for parsing and rendering data. Those are different purposes.

Correct. That's why with schema you mainly describe data that is already represented by a case class. Then you get, for free, codecs for many different formats that represent the same structure as the case class. If your case class does not fit the outbound structure, create a new one and map between the two. Or again, don't use the Endpoint API but the low-level API, where you can do what you want. The Endpoint API is opinionated by design.

zio-schema is not popular actually. I don't want to build my software on top of it other than basic stuff like auto generation of schemas and defining the simple structure of the data.

If you don't want to rely on zio-schema, the zio-http Endpoint API is not the right tool for you.

And you should not use manual schemas.

This is part of the problem. I need to use manual schemas, and manual schemas suck. I don't want to write the names of the fields in a "string", for example, or the names of the case classes.

Yes, manual schema generation sucks. Because it is a last resort and not the way to go.

I have to write my case classes to match 3rd party APIs that cause all sorts of discrepancies. I'm not going to make all these unnecessary transformations, and multiple case classes for one datatype just to parse json, if that were even possible with zio-schema. I need to not just parse it but also render it as well.

Ofc this transformation is possible.
The whole design of the Endpoint API is: give me a type that represents your outbound data. If you don't want that, don't use the Endpoint API. If you need a hammer, don't use an axe.

You also mentioned Tapir can do what you need. Tapir is great. Use Tapir 🙂

@987Nabil
Contributor

the goal should be, that there is no need for non schema based codecs in almost all cases

These goals are fairy-tale goals that don't respect real programmers' needs, or the convenience and reliability of working directly with the data interchange format (i.e. json). I don't want to depend on one library (zio-schema) to drive another library (zio-json) that then generates my json; I want to generate it directly from my json library. I don't want middlemen in between my middlemen, I just want to work right at the source.

Then don't use the Endpoint API

@987Nabil
Contributor

Does zio-http want to be relevant? IDK, maybe the developers are happy just using it for themselves; that would be fine, it's their project, I'm not complaining. But if it wants to be relevant, then it has to make some practical changes IMO.

You are entitled to your own opinion. No user of the Endpoint API has had such issues so far. Also, you did not provide me a clear code example of your use case. But it sounds to me like you just don't want to use the Endpoint API in its intended way.
You can ofc use it in an unintended way, but then please don't expect that we create APIs to increase your personal DX.
And if you have a common use case, Scala offers many ways to implement some code and integrate it with our API to fit your needs: extension methods/implicit defs, for example, or just a constructor method. No one is stopping you.

@jdegoes
Member

jdegoes commented Sep 18, 2024

I would like to better understand what's being requested.

At first glance, what is being requested is:

To extract the underlying, hidden zio-json codec from HttpContentCodec, so it can be used in other places.

Okay, before we make that possible, it would be nice to understand, "Why?"

I think I have collected some reasons:

  1. So we can have our "wire format" different than our "in-memory format".
  2. So we can reuse the zio-json codec in other places (e.g. Kafka).

Did I miss any reason?

Did I miss any specific feature being requested?

Once we have a complete list, it will be possible to come up with a solution that balances the tradeoffs involved.

@jdegoes
Member

jdegoes commented Sep 18, 2024

@decoursin If I understand you correctly, you want to be able to start with a zio-json codec, and then use that to produce an HttpContentCodec?

And, furthermore, you want to do this because zio-json codec is inherently more flexible (and easier to make flexible) than the JSON codec you get from zio-schema?

@decoursin

decoursin commented Sep 18, 2024

Hi @jdegoes, thank you very much for offering your time on this issue. I am very thankful for, and deeply impressed by, the community and libraries that you have built. I wouldn't want this issue to distract from my gratitude towards you and your community, including @987Nabil. I don't mean to be aggressive, just purposely constructive, and whether or not this issue gets resolved in a manner that would satisfy me, it wouldn't detract from my being sincerely thankful for everything else you've built (which I myself haven't yet materially profited from, just fyi, but hopefully I may in the future, because I know it's very powerful; I'm in the process of starting my own company with your software).

@decoursin If I understand you correctly, you want to be able to start with a zio-json codec, and then use that to produce an HttpContentCodec?

Yeah, pretty much, but it wouldn't have to be only like that. I could imagine other solutions, but I don't know the internals of zio-http, so it's difficult for me to offer concrete ones. For example, instead of .out[ :HttpContentCodec], perhaps something like .outJson[ :BinaryCodec[A], :Schema[A]].

And, furthermore, you want to do this because zio-json codec is inherently more flexible (and easier to make flexible) than the JSON codec you get from zio-schema?

Yeah, and more powerful actually, which I think is what you mean by flexible. It's also far more convenient and easier for writing non-standard stuff, plus the other reasons I tried to list above.

For example, I have an API I work with where I need to output OffsetDateTime as seconds from the Unix epoch, but internally we serialize and deserialize it as an ISO 8601 string. This creates a discrepancy; there are different ways to solve it of course, but the solution I like best is writing manual zio-json decoders and encoders.
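A minimal sketch of that kind of manual zio-json codec (assuming UTC on the decode side, since epoch seconds carry no offset; the implicit names are illustrative):

```scala
import java.time.{Instant, OffsetDateTime, ZoneOffset}
import zio.json._

// Wire format: Unix epoch seconds. In-memory format: OffsetDateTime.
implicit val epochSecondsEncoder: JsonEncoder[OffsetDateTime] =
  JsonEncoder[Long].contramap(_.toEpochSecond)

implicit val epochSecondsDecoder: JsonDecoder[OffsetDateTime] =
  JsonDecoder[Long].map(s => OffsetDateTime.ofInstant(Instant.ofEpochSecond(s), ZoneOffset.UTC))
```

Any case class deriving its codec in scope of these implicits would then render the field as a number rather than an ISO 8601 string.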

Another example: I'm using Stripe and constantly digesting their webhooks to keep a local copy of their data in perfect synchrony. They're constantly making changes to their data models, which causes parsing problems, and I want to be able to log these parsing problems immediately as they appear.

Stripe also does things where they use the same field for different types. For example, subscription.customer is expandable, so sometimes it's a string and sometimes it's an object. I really don't know if writing a manual constructor for that with zio-schema is even possible, and if so, it wouldn't be nice to have to do that for the 150 Stripe case classes that I have.
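A hedged sketch of how that string-or-object shape can be handled in zio-json with an orElse fallback (the Customer fields below are invented for illustration, not Stripe's actual model):

```scala
import zio.json._

sealed trait ExpandableCustomer
final case class CustomerId(id: String)              extends ExpandableCustomer
final case class Customer(id: String, email: String) extends ExpandableCustomer // invented shape

object Customer { implicit val dec: JsonDecoder[Customer] = DeriveJsonDecoder.gen[Customer] }

// Try the expanded object first; fall back to the bare id string.
implicit val expandableDecoder: JsonDecoder[ExpandableCustomer] =
  JsonDecoder[Customer].map(c => c: ExpandableCustomer)
    .orElse(JsonDecoder[String].map(id => CustomerId(id): ExpandableCustomer))
```

The same two-branch decoder then works for every expandable field, which is the kind of reuse the comment is asking for.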

zio-schema can't decode or encode from one field into many. For example, imagine something like this:

  case class HappyClass(message: String, state: State)

and you want to be able to generate multiple fields based on the state, not just one. You can't do that with zio-schema, I don't think. zio-schema only allows you to make changes to a single field; it can't make changes depending on multiple fields.
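A rough sketch of that fan-out with zio-json's ast (the State values and the extra output field names are made up for illustration):

```scala
import zio.json._
import zio.json.ast.Json

sealed trait State
case object Active   extends State
case object Disabled extends State

final case class HappyClass(message: String, state: State)

// One in-memory field (state) fans out into several JSON fields at encode time.
implicit val happyEncoder: JsonEncoder[HappyClass] =
  JsonEncoder[Json].contramap {
    case HappyClass(msg, Active) =>
      Json.Obj("message" -> Json.Str(msg), "active" -> Json.Bool(true))
    case HappyClass(msg, Disabled) =>
      Json.Obj("message" -> Json.Str(msg), "active" -> Json.Bool(false), "archived" -> Json.Bool(true))
  }
```

Encoding via the untyped Json tree sidesteps the one-field-to-one-field shape that derived codecs assume.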

To me, I can only imagine what other zio-schema limitations I might come across; since it's fundamentally limited in comparison with a real json parser, I haven't used it that much. I already came across this one, but it wouldn't be a problem at all if I had control over the encoding and decoding, and who knows what other problems like this will arise, leading to unnecessary config details or other makeshift solutions to cover the fundamental limitations of zio-schema.

All of these things and others come down to fundamental limitations of zio-schema, and because of those limitations I'm confident more practical problems and bugs would arise if I were to depend on it as my json encoder/decoder.

@myazinn
Contributor Author

myazinn commented Sep 18, 2024

@jdegoes for me the main reason was 2)
Let me share more context on what we were trying to do and which way we decided to go (that's a long story).

So right now we have services that produce to / consume from Kafka, and also provide some request-response features using a custom in-house tool. For all of that we use JSON as the data format and the jsoniter library for JSON codecs. Now we want to add an HTTP endpoint to work alongside the in-house tooling (and get rid of the in-house stuff eventually).

Here's the list of features that we were looking for in HTTP libraries

  1. Have a declarative endpoint API (as it's easier to maintain and read)
  2. Support OpenAPI generation from Endpoint description
  3. Support Endpoint generation from OpenAPI description
  4. Integrate well with what we already have

We tried zio-http first, but realised that in order for Endpoint -> OpenAPI generation to work properly, we need to define a zio-schema and then feed it to the zio-http Endpoint description. Which makes total sense, but now we have existing jsoniter codecs and new codecs for zio-http that are defined elsewhere. Since we must have consistent serialization regardless of where the data is going (Kafka / HTTP / custom stuff), it is important for us to have the same codec everywhere. And that means we'd have to define a zio-schema, feed it to zio-http, and then derive zio-json codecs from the zio-schema and replace the jsoniter codecs in all other places. Which is not the end of the world, but a bit painful to do, and I was hoping there was a better way. That's why I opened the ticket.

Regardless, a few days after I opened it, we realized that's not how we want to do things. We decided to stick to a schema-first approach. In that case, Endpoint -> OpenAPI generation is merely a helper tool to generate those nasty .yaml files automatically, and the Scala code is not the source of truth. So we just wanted to somehow get the .yaml description, store it somewhere, and then pass it to other teams / expose it as documentation in a separate place. Since we don't expect our endpoints to change dramatically over time, it's ok for us to manually fix the .yaml files if needed. Having said that, it seems we are going to use tapir for it :( The reasons are:

  1. tapir already integrates declarative Endpoint descriptions with jsoniter codecs, which is fine if you don't care about Endpoint -> OpenAPI generation. So consistent codecs everywhere without the need to change the existing code
  2. generated OpenAPI -> Endpoint code can be used for both server and client. It seems zio-http was supposed to support this as well, but we couldn't find a way to use an Endpoint description for client code. So basically only OpenAPI -> Server is supported (at least for now, from what we could find)
  3. OpenAPI schema generation works well enough for the initial setup, and then we can manually fix it if needed

With a bit of code we could integrate zio-http with jsoniter ourselves and just ignore the schema completely. The real show-stopper was that we couldn't find a way to make the OpenAPI -> Endpoint -> HTTP Client transition.

I don't want to turn this into "X is better than Y", and perhaps zio-http and tapir are just for different use-cases. I do value the fact that if you have a zio-schema description, you get both documentation and serialisation handled by zio-http in a consistent way. That's just not what we wanted in the end, and I wanted to share the context on why we decided to go with tapir. To me zio-http seems very cool, but for our case it's just not there yet. I hope that makes sense.
P.S. I second all the gratitude to you, @987Nabil and the other people who built all of this; I'm also just trying to give constructive feedback.

@987Nabil
Contributor

@myazinn I just realized that the docs are missing the client side of the Endpoint API, but it is in the examples. If you need help, let me know here or on Discord. I'll add docs asap.

@myazinn
Contributor Author

myazinn commented Sep 18, 2024

@987Nabil thank you! Somehow I missed it. That changes a lot for us 🙂

@decoursin

@myazinn would you use zio-http if you could use your jsoniter codecs? I'm just curious what's holding you back now

@myazinn
Contributor Author

myazinn commented Sep 26, 2024

@decoursin we haven't decided fully yet, so yes, probably so.
We are also considering moving all the generated stuff into library-like modules, and in that case the code-generation approach in zio-http looks better than generating .class files directly like tapir does. Though it'd be cool to have a plugin to generate this code automatically.

@decoursin

That's awesome @myazinn, thanks for sharing!


4 participants