Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Safer ExampleValue representation #154

Open
ghik opened this issue Apr 15, 2024 · 0 comments
Open

Safer ExampleValue representation #154

ghik opened this issue Apr 15, 2024 · 0 comments

Comments

@ghik
Copy link
Contributor

ghik commented Apr 15, 2024

Current Scala representation of OpenAPI example values is rather unprincipled:

sealed trait ExampleValue
case class ExampleSingleValue(value: Any) extends ExampleValue
case class ExampleMultipleValue(values: List[Any]) extends ExampleValue

The way these untyped Any values are constructed and interpreted is undocumented and unclear. The only way to actually know what they should be is to look into how they are serialized:

  implicit val encoderExampleSingleValue: Encoder[ExampleSingleValue] = {
    case ExampleSingleValue(value: String)     => parse(value).getOrElse(Json.fromString(value))
    case ExampleSingleValue(value: Int)        => Json.fromInt(value)
    case ExampleSingleValue(value: Long)       => Json.fromLong(value)
    case ExampleSingleValue(value: Float)      => Json.fromFloatOrString(value)
    case ExampleSingleValue(value: Double)     => Json.fromDoubleOrString(value)
    case ExampleSingleValue(value: Boolean)    => Json.fromBoolean(value)
    case ExampleSingleValue(value: BigDecimal) => Json.fromBigDecimal(value)
    case ExampleSingleValue(value: BigInt)     => Json.fromBigInt(value)
    case ExampleSingleValue(null)              => Json.Null
    case ExampleSingleValue(value)             => Json.fromString(value.toString)
  }

  implicit val encoderMultipleExampleValue: Encoder[ExampleMultipleValue] = { e =>
    Json.arr(e.values.map(v => encoderExampleSingleValue(ExampleSingleValue(v))): _*)
  }

  implicit val encoderExampleValue: Encoder[ExampleValue] = {
    case e: ExampleMultipleValue => encoderMultipleExampleValue.apply(e)
    case e: ExampleSingleValue   => encoderExampleSingleValue.apply(e)
  }

This implementation does at least two things which are presumably for "convenience" but are in fact completely undocumented and unexpected by the user:

  • A String is first attempted to be parsed as JSON, otherwise it is wrapped in a JSON string - this can lead to some tricky, unexpected behavior and nasty bugs if someone accidentally uses a literal string value which just happens to be a valid JSON (e.g. "[]").
  • Usage of .toString in the fallback case may leak data that the user never intended to be serialized - in the worst case this may cause nasty security issues

The decoder is also not without problems:

  implicit val exampleSingleValueDecoder: Decoder[ExampleSingleValue] =
    Decoder[Json].map(json => json.asString.map(ExampleSingleValue(_)).getOrElse(ExampleSingleValue(json)))

  implicit val exampleMultipleValueDecoder: Decoder[ExampleMultipleValue] =
    Decoder[List[Json]].map { json =>
      val listString = json.flatMap(_.asString)
      if (listString.nonEmpty) {
        ExampleMultipleValue(listString)
      } else ExampleMultipleValue(json)
    }

  implicit val exampleValueDecoder: Decoder[ExampleValue] =
    exampleMultipleValueDecoder.widen[ExampleValue].or(exampleSingleValueDecoder.widen[ExampleValue])
  • First of all, the decoder may return a value different than the one originally serialized, violating the "round-trip law" of serialization, i.e. decode(encode(something)) == something
  • In fact, exampleMultipleValueDecoder is completely useless, as exampleSingleValueDecoder never fails - therefore, an ExampleMultipleValue can never be deserialized
  • The decoder may return (wrap) a un-deserialized, Circe-specific Json value, which is quite bad because ExampleValue should be a model type independent of serialization layer

All these problems are very typical of code that takes shortcuts by using untyped representations and APIs like Any and .toString.

A more principled way to represent examples would be to use a serialization-library-agnostic JSON representation. This could be something as simple as a raw JSON string wrapper or possibly a minimalistic, ad-hoc JSON AST representation.

Solving this issue would likely require changes beyond sttp-apispec as tapir would most likely also be affected since it is the primary user of this library.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant