Quill Dotty Update

deusaquilus · February 26, 2020, 10:11pm

At this point, I would like to share some detail about Quill’s path forward. For the past several months I’ve been heavily focused on prototyping Quill in the Dotty macro system. With the initial POC being almost complete, I can confidently say that Dotty macros provide the needed functionality for Quill to move forward.

Here are a few architecture notes.

Whitebox macros are no longer needed because Dotty’s ‘inline’ construct provides the feature set we need. The ‘inline’ construct is also extensively used (and will be used more!) in the implementation of Dotty-Quill itself. This will allow us to have type-annotations on variables holding Quill Quotes (these will typically be ‘inline def’). Additionally, this will allow us to do many more operations on compile-time queries than we were able to do before. This includes the ability to pass compile-time queries into methods, which even includes recursive methods! This allows recursive construction of various query logics during compile-time as well as many other new things!
Quoted is now a case-class and Quotation is static. De-coupling it from a specific context removes the need for the annoying cake patterns that are needed today. Based on my conversations with @fwbrasil and @lihaoyi this will significantly increase Quill usability.
However, this requires that ‘lift’ sections delay summoning encoders until the run method. Consequently, the type-tags of the lifts need to be propagated through the Quoted blocks into the run call. For compile-time (i.e. inline) Queries this is doable since the Dotty tree can be made available to the run macro. For runtime Queries, however, Dotty currently does not support runtime-passable type-tags (@pshirshov and the ZIO folks have expressed that they intend to implement the Izumi LightTypeTag in Dotty which could be a solution to this issue). For now, the solution is to differentiate “lazy lifts”, which can be done from a context-less Quoted section, and “eager lifts”, which are done once a context is imported. “Lazy lifts” delay encoder summoning while “eager lifts” summon an encoder immediately. The semantics for the two will be equivalent and in most cases, a user will not need to know the difference. Once created, “lazy lifts” will be usable only with compile-time queries, while “eager lifts” will be usable everywhere.
The mechanism for lifting, encoding, and decoding is significantly changed. Instead of manually constructing an object-encoding lattice via ValueComputation, Dotty’s Typeclass-Derivation is used. This is a very powerful approach that will allow us to implement handling of enums and embedded types much more transparently than in the past. Something like Hibernate’s table-per-hierarchy strategy may now be possible with co-product embedded classes.
The parser will be implemented using Dotty’s quote matching. This mechanism is about as powerful as quasi-quoting but has stricter requirements about typing.
Despite these things, Quill will still largely be able to have the same API with the exception that compile-time queries will need to have an ‘inline’ keyword in their declaration.
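
To make the lazy/eager distinction above concrete, here is a minimal Scala 3 sketch. Everything in it (Encoder, EagerLift, LazyLift, eagerLift, lazyLift, runLazy) is a hypothetical stand-in written for illustration and is not Quill's actual API. It only assumes that an eager lift captures its encoder where the lift is written, while a lazy lift keeps the value alone and lets an inline run-like method summon the encoder later.

```scala
import scala.compiletime.summonInline

// Hypothetical stand-in for a database encoder typeclass.
trait Encoder[T] { def encode(value: T): String }

given intEncoder: Encoder[Int] = new Encoder[Int] {
  def encode(value: Int): String = value.toString
}
given stringEncoder: Encoder[String] = new Encoder[String] {
  def encode(value: String): String = s"'$value'"
}

// Eager lift: the encoder is resolved at the point where the lift is written,
// so a missing encoder is reported at that source position.
final case class EagerLift[T](value: T, encoder: Encoder[T]) {
  def encoded: String = encoder.encode(value)
}
def eagerLift[T](value: T)(using encoder: Encoder[T]): EagerLift[T] =
  EagerLift(value, encoder)

// Lazy lift: only the value is captured; no encoder is required yet.
final case class LazyLift[T](value: T)
def lazyLift[T](value: T): LazyLift[T] = LazyLift(value)

// Because this method is inline, T is concrete at each call site and
// summonInline resolves the encoder there, at the "run" side.
inline def runLazy[T](lift: LazyLift[T]): String =
  summonInline[Encoder[T]].encode(lift.value)
```

In this sketch, a missing encoder for a lazy lift would surface at the runLazy call site rather than where lazyLift was written, and runLazy only works because the static type T survives to that call.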

I will be working on the following things within the timeframe of a few months:

Parts of Quill in quill-core and quill-sql that are compatible with dotty will be moved out to quill-core-portable and quill-sql-portable. Originally I had planned to move out the handling of macros itself but Scala 2 macros are very deeply embedded into Quill and at this point, it would be impossible to move them out without causing major disruption to Quill’s external API (and hence the ecosystem). quill-sql-portable will only depend on quill-core-portable. This will allow Dotty-Quill to reuse the AST transformations.
Currently the Dotty-Quill design is being crystallized on the dotty_test project. Once this is complete, I will create a project called ‘Proto-Quill’ (in a new codebase) that will be the initial implementation of Quill-Dotty (if anyone has a better name, please let me know!). Quill features will be migrated into Proto-Quill one by one. Initially, implementation of the Parser and the Metas will be very primitive and they will be fleshed-out module by module.
As time passes, quill-core-portable and quill-sql-portable will need adjustments in order to make Quill more tractable in Dotty. Small changes will be made to Quill’s API (e.g. Encoders/Decoders need to be traits instead of Types). These will be introduced slowly in order to prevent disruption to Quill’s ecosystem.
Proto-Quill will be around for several years as it matures and as Scala 3 becomes productionalized. Eventually, Proto-Quill will be merged into the proper Quill codebase and become Quill 4. The Scala 2 Quill (i.e. Quill 3) will then be moved to a different repo and called Exo-Quill, which will eventually be deprecated.

Overall, I find the task ahead of us daunting but also inspiring. Scala 3 macros are substantially better behaved than Scala 2 macros and the Scala 3 Typeclass-Derivation mechanisms are really good. There are many, many, many things that we will be able to do with Quill in Scala 3 that are completely impossible today. Amongst these are the ability to re-use inline code between Quill transpiled expressions and regular Scala code, the ability to use Quill quotes inside of Typelevel code (e.g. Aux-Pattern returning Quoted blocks), recursive construction of Quill queries (since inlined blocks can be recursive), and many, many other things.
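
As a small illustration of the point that inlined blocks can be recursive, the following Scala 3 sketch expands a recursive inline method entirely at compile time. The orChain helper is hypothetical and builds a plain String rather than a Quill quotation; it only shows the language mechanism (an inline parameter plus inline if), not how Dotty-Quill itself constructs queries.

```scala
// Scala 3 only. `n` must be a compile-time constant at every call site,
// so each recursive call is unfolded by the compiler.
inline def orChain(column: String, inline n: Int): String =
  inline if n == 0 then "FALSE"
  else orChain(column, n - 1) + " OR " + column + " = " + n

// Unfolds to: "FALSE" + " OR id = 1" + " OR id = 2"
val condition: String = orChain("id", 2)
```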

lihaoyi · February 26, 2020, 10:29pm

Is this really necessary? I wonder if it’s possible to summon the encoders when constructing the quoted block, but only execute them in the run method. Summoning the encoder at the run callsite seems like it would make error messages tough, since the source location which requires the encoder may be far away from where run ends up being called

mdedetrich · February 26, 2020, 11:21pm

I don’t think you can have your cake and eat it too, this is the downside of removing the cake pattern which actually carried the context needed to materialize the quote into an actual SQL query at compile time to provide better error messages.

@deusaquilus At least personally, if it’s going to be a tradeoff of good error messages vs “simpler non cake pattern API” I would always go for the good error messages. By far the most frustrating thing about Quill, both for me and for my coworkers, is the highly confusing error messages.

It’s also arguable whether removing the context from the traits in the API is a better design; I actually prefer Quill’s design in Scala 2.

lihaoyi · February 27, 2020, 5:36am

I suppose to me the cake pattern itself isn’t a huge problem from a user point of view, but rather the fact that you have to pass the cake around because the database connection is buried within it is what causes problems. Passing around a cake full of types I am required to import results in all sorts of inference problems related to path dependent types, when really the types are all identical and the only thing different between the cake instances is a single reference to the current database connection

As long as the cake is static, and I can just import from it like a package, it wouldn’t cause me any grief personally. Ideally the “code generate SQL for database” would be in the cake, and the actual database connection would be in a trivial implicit parameter (without any path-dependent types or things I need to import) I can pass around separately.

Not sure whether that matches others’ experience with Quill

Krever · February 27, 2020, 7:49am

Isn’t it exactly what doobie-quill implements? I’m using it for all my new code and I’m very happy with it. Disentangling db connection from quill context is VERY beneficial.

lihaoyi · February 27, 2020, 8:22am

I don’t know, I’ve only used quill-postgres. Maybe I should start using doobie-quill? I don’t use doobie elsewhere which is why I went straight to quill-postgres

mdedetrich · February 27, 2020, 12:03pm

But that’s the point, it’s not really static because the SQL that gets generated depends on dynamic context which is what Quill calls a dialect, i.e. the SQL you generate for a Postgres dialect is different to the SQL you generate for the MySQL dialect.

This is also the same reason why I prefer this design, it’s actually more explicit (hence more clear) that you are dealing with different databases. If the SQL was the same for every SQL database then we wouldn’t have this issue (also take note that Quill even abstracts over non SQL databases, i.e. CQL for Cassandra).

In any case I haven’t had complications API wise because of this design, hence why I am wary of throwing out the baby with the bathwater. As mentioned before, the biggest issue I currently see with Quill is unhelpful error messages (this isn’t surprising since it’s macro based) and this design, from what I can see, will make things much worse.

LPTK · February 27, 2020, 12:48pm

It sounds like some people are conflating “the cake pattern” (using mixin composition and overriding) with “first-class modules” (using type members and path-dependent types to track distinct types statically). The former is discouraged, but the latter really is an important abstraction that enhances type safety.

However, this is not necessarily the best API for this particular use case. I don’t know much about Quill, but from what @mdedetrich describes, it seems to me that a more appropriate API design could be to:

represent the core differences in capabilities of the various SQL dialects using Scala abstractions; for instance, use a WindowFunctions trait to represent the capability of expressing window functions
generate query types that refer to the capabilities they require:
we’d have class Query[+T, -Capa <: SQLDialect],
possibly with a synonym for convenience: type SQLQuery[+T] = Query[T, SQLDialect];
a query using plain SQL with window functions would get type Query[T, WindowFunctions]
have the different database drivers provide different capabilities, for example the PostGres driver would be a SQLDriver[WindowFunctions with AutoIncrement with ...]. This way, we can make them handle only queries that do not require more capabilities.

And naturally surface differences in SQL syntax should be handled dynamically at the point where the SQL is actually generated, and should not require path-dependent types or type parameters.
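
A self-contained sketch of this capability-based design follows. The names Query, SQLDialect, WindowFunctions, AutoIncrement and SQLDriver come from the description above; the sql field, the run method and the sample values are assumptions added for illustration, and none of this is Quill's real API. The key point is that Query is contravariant in its capability parameter, so a query that needs fewer capabilities is accepted wherever one that needs more is expected.

```scala
object Capabilities {
  trait SQLDialect
  trait WindowFunctions extends SQLDialect
  trait AutoIncrement extends SQLDialect

  // T is the result type; Capa is the set of capabilities the query requires.
  final case class Query[+T, -Capa <: SQLDialect](sql: String)
  type SQLQuery[+T] = Query[T, SQLDialect]

  // A driver accepts any query whose requirements it can satisfy.
  final class SQLDriver[Capa <: SQLDialect](name: String) {
    def run[T](q: Query[T, Capa]): String = s"[$name] ${q.sql}"
  }

  val plain: SQLQuery[Int] = Query[Int, SQLDialect]("SELECT 1")
  val ranked: Query[Int, WindowFunctions] =
    Query[Int, WindowFunctions]("SELECT rank() OVER (ORDER BY age) FROM person")

  val postgres = new SQLDriver[WindowFunctions with AutoIncrement]("postgres")
  val basic = new SQLDriver[SQLDialect]("basic")

  val a: String = postgres.run(plain)  // plain SQL runs on any driver
  val b: String = postgres.run(ranked) // postgres provides WindowFunctions
  val c: String = basic.run(plain)
  // basic.run(ranked) would not compile: the basic driver lacks WindowFunctions.
}
```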

lihaoyi · February 27, 2020, 12:55pm

@LPTK what you describe is almost exactly how the Scalatags library is implemented: a mix of compile time types to distinguish the static capabilities of different backends, and runtime dynamism for the “boring” differences that do not require separate types. Traits, generics, convenience type aliases, all exactly as you stated. Even cross-backend code is allowed, if you adhere to the common-subset of the API between the two backends

Works great for Scalatags. It’s not immediately clear to me how such a design could apply to Quill (not familiar enough with the internals) but I think the broad approach is a good one

LPTK · February 27, 2020, 1:13pm

Good to know! The community might be converging towards idiomatic designs for Scala after all

mdedetrich · February 27, 2020, 5:10pm

To be clear, Quill does both. It uses path-dependent types with traits, i.e.

This is what Quill does currently in Scala 2. To be more precise, I don’t think current Quill uses the “cake pattern”, the cake pattern itself is an overloaded term and most people seem to equate “traits with undefined members” = “cake pattern” (which is not true).

In any case, you can view the current Quill codebase in the zio/zio-quill repository on GitHub. For example there is a PostgresDialect which is a trait that is mixed in (you can see the trait definition in quill-sql/src/main/scala/io/getquill/PostgresDialect.scala at commit ca23d3c75386073082f28a1d41dd12d4ba6721f5).

The core issue here is really not the cake pattern or traits or mixins. The real point (and annoyance) is that, since Quill is implemented using macros, having the Quill ctx passed along (which is usually a combination of various traits mixed in, i.e. PostgreSQLDialect with JDBCContext) is what gives the macro invocation at compile time access to all of the information it needs to generate the SQL. The annoying part that people are complaining about is having to pass this ctx everywhere, because if you use Quill’s DSL this ctx has to be passed in at this location (i.e. there is no global ctx singleton, it’s a value providing a path-dependent type). This however is also currently the reason why it can even provide nice error messages (i.e. right now Quill can tell you that it’s impossible to generate SQL for a piece of code because it’s not supported on MySQL, since it has access to this ctx). Furthermore, editors which use the Scalac compiler can immediately show you the generated SQL without even having to run your application (see the gif on the homepage at https://getquill.io/, this also works with vscode).

The difference with ScalaTags is that it doesn’t use macros, the HTML code which ScalaTags generates is done at runtime. If ScalaTags generated HTML code at compile time using macros (in the same way Quill does) you would have the exact same problem.

LPTK · February 27, 2020, 5:23pm

I don’t understand why you think that macros are the reason a ctx has to be passed around.

If the parameterized query approach was used (with a contravariant type indicating the required capabilities), I do not think passing ctx around would be necessary. It could be passed solely at the end, at the place where we want to generate the query at compile time (the sub-query ASTs would be retrieved either using the Scala 2 annotations-based trick, or using Dotty’s inline). Or maybe you are not telling us the full story.

mdedetrich · February 27, 2020, 5:32pm

Sure, but this is just putting the problem elsewhere: instead of passing a value containing the type information along, you are passing types as type parameters. In terms of ergonomics I don’t see how one is that much better than the other (if I understand you correctly). It’s a bit hard to explain in words, but you can try Quill out in Scastie. How it works currently is that the ctx is passed around, which contains a path-dependent type, and you then import ctx._ which provides the specific DSL for the SQL dialects that you have. Of course it’s possible to only provide ctx at the edge of the application; my point is not about what is or isn’t possible but rather how good the error messages are, i.e. the point here

Since currently the ctx is passed around in various places in your application its fairly easy to provide nice error messages. If you only have ctx at the edge of the application it makes it more difficult to provide nice error messages if the error is far away from that edge.

LPTK · February 27, 2020, 5:51pm

No, I think you will not be passing type parameters most of the time.

Your code may look like this:

object ModuleA {
  inline def query: Query[Int, SQLDialect] = ...
}
object ModuleB {
  inline def query: Query[String, PostgresDialect] = ...
}
object ModuleC {
  inline def query = quote {
    for (x <- ModuleA.query; y <- ModuleB.query if x > y.size)
    yield (x, y)
  }
}
object App {
  val q = ModuleC.query
  val ctx = new PostgresContext(...)
  ctx.run(q)
}

And the errors will be reported at the appropriate place, as long as you provide the expected types (which is strictly less bothersome than passing around a ctx).

mdedetrich · February 27, 2020, 6:49pm

Actually in your case you are, when you have

object ModuleA {
  inline def query: Query[Int, SQLDialect] = ...
}

The type parameters of Query, i.e. [Int, SQLDialect], are something you have to pass around (currently in query definitions you don’t need to explicitly provide the type); either that or your Quotes will need it (or both, can’t say for sure now). Unless I am mistaken, in your example you have to give the type for Query and/or Quote, otherwise it’s impossible to figure out what SQL dialect you are using just from the DSL since there isn’t a ctx.

Furthermore I think there would actually be 3 parameters, Quill has Dialect and NamingStrategy so it would look something like

object ModuleA {
  inline def query: Query[Int, SQLDialect, NamingStrategy] = ...
}
object ModuleB {
  inline def query: Query[String, PostgresDialect, NamingStrategy] = ...
}

So again I don’t think we are gaining that much, it’s already more boilerplatery. Of course it’s possible to abstract over the type parameters, but then again you can make the same argument that you can abstract over the ctx value right now without any real issue, i.e. in our applications currently we have

trait Quotes[N <: NamingStrategy] extends TableSchemas[N] {
   val ctx: JdbcContextBase[N]
   import ctx._

   def ageQuote = quote {
      query[Table].filter(_.age > 10)
   }
}

class Queries(val ctx: PostgresMonixJdbcContext[SnakeCase]) extends Quotes[SnakeCase] {
  import ctx._

  def age = ctx.run(ageQuote)
}

where TableSchemas is where you can provide schemaMeta for custom table/column mappings.

We don’t ever have to pass ctx around; we provide it once when we create a Queries class, which is our application “edge”. I mean honestly I don’t know what this fuss is about. Scala has plenty of abstractions to prevent you from “passing around a ctx” or anything else for that matter, so I don’t see any net gain here.

The above example has also been abstracted for different backends, you can simplify it further if you want (i.e. you don’t have to abstract over a NamingStrategy if you don’t need to, you can just hardcode it)
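
For readers who want to try this pattern without Quill on the classpath, here is a dependency-free sketch of the same shape. Context, PostgresContext, Quoted, quote and run are simplified stand-ins (an assumption, not Quill's real types); the point is only that Quoted is a path-dependent type of a ctx value, and that the ctx is supplied once at the application edge.

```scala
object CtxPattern {
  trait Context {
    // Nested in the trait, so the full type is ctx.Quoted (path-dependent).
    case class Quoted(sql: String)
    def quote(sql: String): Quoted = Quoted(sql)
    def run(q: Quoted): String
  }

  final class PostgresContext extends Context {
    def run(q: Quoted): String = s"postgres: ${q.sql}"
  }

  // Quotes are defined against an abstract ctx.
  trait Quotes {
    val ctx: Context
    import ctx._
    def ageQuote: Quoted = quote("SELECT * FROM person WHERE age > 10")
  }

  // The concrete ctx is provided once, here.
  final class Queries(val ctx: Context) extends Quotes {
    def age: String = ctx.run(ageQuote)
  }
}
```

A Quoted created through one ctx instance cannot be passed to the run method of a different instance, since the two ctx.Quoted types are distinct; that is the path-dependence being discussed in this thread.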

I guess my point is, I don’t really mind either approach if it doesn’t effect error message ergonomics, but if the newer design effects the type of error messages we get then I see this as a loss. We also have to now deal with 2 types of Encoders/Decoders (the lazy/strict ones which @deusaquilus mentioned earlier with dynamic queries), which has already complicated things compared to the current design, so if it’s a case of manually passing around ctx versus having to redefine liftings for dynamic vs static encodings I would definitely go for the former.

EDIT: Also I think by definition delaying the liftings of Decoder/Encoder’s until ctx.run can effect error message’s that are due to liftings because they have been lazily suspended (same reason why Shapeless can provide weird errors right now).

LPTK · February 27, 2020, 8:30pm

Ok, personally I don’t have a strong opinion on whether keeping a ctx around is too great a burden or not.

It’s just that in my experience, path-dependent types can be rather inflexible compared to a type-parametric approach, especially if the underlying types really are mostly the same; I just don’t know if that’s a problem in the context of Quill.

Note: I don’t agree at all with your characterization that having type parameters is as bad as passing a ctx value around.

[Int, SQLDialect] is something you have to pass around

No, it’s really not passed around in any sense. And if there’s a particular dialect you use often, you can make an alias for it. Making a type alias is much easier than injecting a value everywhere.

Also, I think you misunderstood the need for explicit types. You don’t need any explicit types. Having explicit types will just make a dialect violation be reported a little earlier (note that Query is contravariant, so you can upcast a query into requiring more capabilities). Also, I don’t think dialect violations will lead to complicated errors; it will just be of the form: expected Query[T, Blah]; got Query[T, Blah with Bleh].
In any case, it’s good practice to specify the types of public members. The ctx-passing approach is strictly more boilerplate, requiring you to lift things into classes.

PS: I’m using the name “Query” for my examples, but I think Quill uses “Quoted” here.

PPS: friendly tip: it’s “affect error messages”, not “effect error message’s” — sorry, this specific mistake really irritates me

tpolecat · February 27, 2020, 9:19pm

For doobie-quill Quill provides SQL generation, statement preparation, and resultset processing based on the existing JDBC support. Then doobie-quill packages this up as a ConnectionIO[A] or fs2.Stream[ConnectionIO, A] and then you compose those and execute using doobie’s machinery. It happened to align easily and there was not much to it in the end … cloc says 150 lines of code.

deusaquilus · February 28, 2020, 3:52pm

Hi Guys,
This discussion has been very constructive.

My original motivation was to make Quotation be database independent and maybe even Quill independent so that it could be used in other frameworks (e.g. Monadless). Error messages from compile-time (i.e. inline) queries are not an issue because the entire AST is available and I can just pass the Expr on which the error is happening into QuoteContext.error. Errors from runtime queries however would not only have bad messages, the messages would only be generated during runtime! For this reason, I now think that we should probably drop lazy-encoders entirely and make Quotation always context-specific. This could mean that we need to return to PDTs but I have an alternative which is…

If we completely extract sessioning out of Contexts, i.e. move all the execute___ methods out of Context and make run take an implicit QuillSession parameter… then we can probably make all database-specific contexts (e.g. PostgresJdbcContext) be static; we already do something like this with QuillSparkContext. Since the contexts contain the encoders, all encoding will be eager and error messages should be good. One caveat is that the Modular Context Pattern will still require PDTs, but that pattern is only needed for multi-db use cases, which are infrequent.

The only question with this approach is what do we do with Dialect and NamingStrategy? We could potentially put these things on QuillSession and then parametrize the run method with them (this requires changes to the Scala 2 macro code but it is manageable). The semantics of this API might look like the following:

import io.getquill.PostgresJdbcContext._

// Don't need a session yet!
val q = quote {
  query[Person].filter(p => p.name == "Joe")
}

implicit val session: QuillSession[PostgresDialect, Literal] = newSession[Literal](connection)
run(q)

The challenge of this approach is that it is a substantially different API from what we have currently. How could we provide users with a simple migration path?

Edit: Fix Quote/Query

lihaoyi · February 28, 2020, 4:15pm

This sounds like a reasonable tradeoff for me. I agree that multi-db use cases are probably uncommon enough (<1% of users?) that optimizing for the single-db case with static contexts makes sense, pulling session handling out of the context to allow them to be static.

What if we made Dialect and NamingStrategy something configured on the static context that a user instantiates? So the user would say object MyPostgresContext extends PostgresContext with SnakeCase and then use MyPostgresContext throughout their application. This would rule out “multi naming strategy” use cases, but I suspect similar to DBs most people would use a single naming strategy throughout their application.
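
Putting the two ideas together (a static context configured once with a naming strategy, and a session passed separately as an implicit), a dependency-free sketch might look like the following. All types here (NamingStrategy, SnakeCase, Quoted, Session, PostgresContext, MyPostgresContext) are simplified, hypothetical stand-ins rather than Quill's real API, and the string-based SQL generation is only a placeholder for what the macro would do.

```scala
object StaticContextSketch {
  trait NamingStrategy { def column(name: String): String }
  trait SnakeCase extends NamingStrategy {
    // firstName -> first_name
    def column(name: String): String =
      name.replaceAll("([a-z0-9])([A-Z])", "$1_$2").toLowerCase
  }

  // Not nested in a context instance, so there is no path-dependent type.
  final case class Quoted(table: String, field: String)

  // Carries only the connection; a plain String stands in for it here.
  final case class Session(connectionName: String)

  trait PostgresContext extends NamingStrategy {
    def select(table: String, field: String): Quoted = Quoted(table, field)
    def run(q: Quoted)(implicit session: Session): String =
      s"${session.connectionName}: SELECT ${column(q.field)} FROM ${column(q.table)}"
  }

  // Configured once, then imported like a package.
  object MyPostgresContext extends PostgresContext with SnakeCase

  def demo: String = {
    import MyPostgresContext._
    val q = select("Person", "firstName") // no session needed yet
    implicit val session: Session = Session("conn-1")
    run(q)
  }
}
```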

mdedetrich · February 28, 2020, 4:17pm

Is there a problem in actually keeping the current design (with all of the obvious improvements that Dotty’s inline provides)? I understand the annoyance of people having to pass around a ctx but honestly you can even provide a hardcoded global ctx in your own application currently if you only target a specific database dialect and driver (i.e. JDBC) as well as a single NamingStrategy.

I think the core issue here is actually a documentation one. These to me are non-issues, but we should probably document better code examples on how to structure the ctx/quotes for your “typical application”