Proposal to remove XML literals from the language

I think what @jducoeur meant is that, even though parsing XML and the XML library itself are both complex, a rewriting tool only needs to do more or less the following:

val literal = <hello> {foo()} </hello>

=>

val literal = xml"""<hello> ${foo()} </hello>"""

That’s a rewrite that looks like it shouldn’t be too hard to automate. The complexity is in (now) the compiler and (then) the string interpolator. That shouldn’t necessarily make the rewrite from literals to interpolated strings complex. Unless we are overlooking something, in which case it would be very valuable to have some examples of cases that cannot be trivially rewritten.
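
To make the target of that rewrite concrete, here is a minimal sketch of how an `xml` interpolator can be hooked onto `StringContext`. Everything in it is an assumption for illustration: the names `XmlInterpolatorSketch`, `XmlStringContext` and `escape` are made up, and it only produces a `String` with escaped arguments, whereas a real interpolator would parse the markup and build XML nodes.

```scala
object XmlInterpolatorSketch {
  // Hypothetical helper: escape the characters that would break the markup.
  // `&` is replaced first so the entities added afterwards are left alone.
  def escape(text: String): String =
    text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")

  // Hypothetical stand-in for a real XML interpolator. It only builds a
  // String; a real implementation would validate and parse the XML.
  implicit class XmlStringContext(private val sc: StringContext) {
    def xml(args: Any*): String = {
      val escaped = args.map(a => escape(a.toString))
      val sb = new StringBuilder(sc.parts.head)
      // Interleave: part0, arg0, part1, arg1, part2, ...
      sc.parts.tail.zip(escaped).foreach { case (part, arg) =>
        sb.append(arg).append(part)
      }
      sb.toString
    }
  }

  def foo(): String = "a < b"

  // The rewritten form from the example above.
  val literal: String = xml"""<hello> ${foo()} </hello>"""
}
```

The point is only that the call site stays almost identical to the literal; where the parsing happens (compiler or library) is invisible to the rewrite.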

After reading this thread I’m left wondering how many people actually use XML literals for something other than HTML and what is their opinion on having to switch to string interpolation.

As has been pointed out above, today’s HTML syntax (i.e. HTML5-based) is not XML, it is not a subset or superset, either. So if you want XML support in the language, do you really want XML or do you want something more or less suitable for encoding HTML?

Actually this is an unrelated question, IMO. There will be discussion threads for every change in the language.

Correct – a little more to it than that, mostly to preserve whitespace and keep the formatting clear in the case of multi-line tags, but that seems to be most of it.

XML is complex, sure – but it seems like migrating it should be entirely automatable. It doesn’t even look like one of the harder migration problems, unless I’m missing something…

Scalafix already has a RemoveXmlLiterals migration rule, contributed by @allanrenucci, that converts XML literals into interpolators. To run it on your codebase:

coursier launch ch.epfl.scala:scalafix-cli_2.12.4:0.5.10 --main scalafix.cli.Cli -- -r RemoveXmlLiterals path/to/scala-sources

An example diff from migrating lift/framework can be seen in lift-xml-interpolator.diff on GitHub. Running the rewrite on the 60k LOC Lift codebase took a few seconds on my laptop. It has also been validated that the Lift sources still compile after the rewrite when using the interpolator library densh/scala-xml-quote (a prototype of an XML string interpolator for Scala) developed by @allanrenucci.

Some caveats, if I recall correctly:

  • the interpolator library does not support patterns so the rewrite leaves XML patterns alone.
  • the rewrite unescapes all {{ into { which is incorrect behavior for some presumably rare corner-cases (I don’t remember the details). The strategy was to leave it to users to manually review the diff and validate which {{ should be left unchanged.
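
For context on the second caveat: in an XML literal, `{{` is how you write a literal `{`, while in an interpolated string a single `{` is already plain text and only `$` is special. The sketch below uses the standard `s` interpolator (not the XML one) just to show that escaping rule; the object and value names are made up, and it does not try to reproduce the corner case where unescaping is wrong.

```scala
object BraceEscapingSketch {
  val n = 2

  // A single brace is ordinary text in an interpolated string.
  val single: String = s"""<style>p { margin: ${n}px }</style>"""

  // `$$` is the escape for a literal dollar sign.
  val dollar: String = s"""<price>$$${n}</price>"""

  // Doubled braces are not collapsed by the interpolator, so a `{{` that
  // the rewrite leaves in place stays `{{` in the resulting string.
  val doubled: String = s"""<t>{{ ${n} }}</t>"""
}
```

So whether a given `{{` should become `{` depends on what the original literal meant, which is why reviewing the diff by hand is the suggested strategy.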

It is not required, but recommended, that your HTML is at the same time XML.

This is excellent news! And I’m amazed (once again) by scalafix :)

Good job!

Knowing that, and the fact that we have a strawman library to test against, the next step seems to be gathering metrics and feedback from available (for example, open source) code bases to assess the effort needed to turn the PoC into a good-enough replacement. And perhaps gain a year on the proposed migration plan outlined by @farmdawgnation :)

The current HTML standard defines both an HTML syntax and an XML syntax. They are generally incompatible and they support different subsets of the full standard. I have never encountered the XML syntax in practice; as far as I can tell, it is irrelevant.

You can use an XML DOM for HTML but it is overkill. If we only wanted plain HTML5 support in Scala, the DOM could be simpler (in particular, it doesn’t need to support namespaces). You also need different parsers and renderers for HTML.
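
As a rough illustration of how small such a DOM could be, here is a sketch with no namespaces and an HTML-specific renderer. All names (`HtmlDomSketch`, `Node`, `Element`, `Text`, `render`) are hypothetical, the list of void elements is deliberately partial, and this is not how scala-xml or any existing library models things.

```scala
object HtmlDomSketch {
  // A deliberately tiny DOM: no namespaces, no prefixes, no entity nodes.
  sealed trait Node
  final case class Text(value: String) extends Node
  final case class Element(
      name: String,
      attrs: Map[String, String] = Map.empty,
      children: List[Node] = Nil
  ) extends Node

  // Partial list of HTML void elements (assumption: enough for the example).
  private val voidElements = Set("br", "hr", "img", "input", "meta", "link")

  private def escape(s: String): String =
    s.replace("&", "&amp;")
      .replace("<", "&lt;")
      .replace(">", "&gt;")
      .replace("\"", "&quot;")

  // HTML rendering differs from XML: void elements get no closing tag.
  def render(node: Node): String = node match {
    case Text(v) => escape(v)
    case Element(name, attrs, children) =>
      val attrText =
        attrs.map { case (k, v) => " " + k + "=\"" + escape(v) + "\"" }.mkString
      if (voidElements(name)) s"<$name$attrText>"
      else s"<$name$attrText>${children.map(render).mkString}</$name>"
  }

  val page: Node =
    Element("p", Map("class" -> "note"), List(Text("a < b"), Element("br")))
}
```

Rendering `page` gives `<p class="note">a &lt; b<br></p>`, which is valid HTML syntax but not well-formed XML because of the unclosed `<br>`.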

Another option is the JSX approach: embed enough of the xml spec into ours to support a practical subset. It sounds like this could get tricky, but at least we have prior art (JSX) to inspire it.

Not only JSX.

TypeScript, which also supports XML literals, is becoming the lodestar among statically typed functional programming languages.

TypeScript even has an advanced type system and some type-level programming libraries. Scala / Scala.js is going to lose the war against deno / TypeScript if the ability to customize the language is gone, as Martin mentioned in another topic:

I am worried about the new goal of Scala 3, which may be very different from the reason why existing Scala users chose the language.

I did forget about Scala 2.14, so that may, indeed, help. I couldn’t say with certainty how much that shrinks our timeline, at least in part because it’s still an unknown quantity. Each step that breaks binary compatibility comes with its own set of migration timelines as everyone gets caught up.

It’ll take us a while to get up to Scala 2.13, for example, once that goes final, because we depend on other libraries that have to go through their own migrations. It’s just hard to predict how long it’ll take folks to catch up. We try to provide support for 18 months after a release, and we would like to avoid forcing a hard cut to a new version of Scala in client code if at all possible.

This is very good news, indeed.

I think we may use some of the double braces, so we’d have to evaluate whether or not the behavior is still correct, apart from just compiling.

This is a way off, but if this proposal moves forward, I’d love to take this out for a spin on the main Lift codebase once we have a formal XML interpolator in the language. I’m less inclined to spend the work validating a third-party interpolator because the implementation could change in meaningful ways before adoption into Scala proper, and I don’t want to risk adding further churn for users of Lift.
