Proposal to remove XML literals from the language

More broadly, I think there’s a tendency to feel that “Scala 3 is our chance to get in all the breaking changes so that we can avoid doing breaking changes too often.” I think that’s wrong. The need to have a strategy for evolving the language without leaving people behind should really be orthogonal to Dotty. Dotty/Scala 3 is primarily about changes to the underlying calculus. To the extent that any language change may affect that, it’s necessary to design that change in tandem with releasing Dotty. In my opinion, any change that without doubt does not affect the premise of Dotty should be pushed off to 3.1. We should get 3.0 released as soon as possible (but no sooner, of course), and once it’s released, work on getting 3.1 released as soon as possible.

A major side benefit of this approach is that it greatly mitigates the widespread fear that Dotty will be a Python-3-esque community split.

In that light I would agree with deferring XML deprecation. (Of course, then certain people will start saying it will never really happen. But that’s the lesser of the two evils. We should definitely prove them wrong, of course…)

Lift would be left out in the cold until we tracked down and changed all of our XML literals in one cycle.

How hard would that be with a re-write tool that modified everything for you? My understanding is that rewrite tools would be provided; there are lots of things changing with Scala 3 that need a migration tool and this would be one of them.

A rewrite tool may be a help, but it would be delusional to believe it will cover 100% of cases, especially with something as complex as XML literals.

And the problem is not only on the Liftweb framework side, but also in all the hundreds (thousands? Lift is OLD) of applications using Lift which will need to migrate. For example, Rudder uses it, and we have a fair amount (15k? More?) of lines of HTML / XML which rely on XML literals. We will need a good span of time to migrate all of that.

Not doing the deprecation/removal too quickly seems sensible to me.

Perhaps by the time Scala 3.0 lands, we’ll have a better story for compiler warning suppression. In that case, 3.0 could contain the deprecation, people still using XML literals could easily selectively disable those warnings, and then the removal would come in 3.1.

Is there a reason to believe this? Offhand, this seems like one of the more straightforwardly mechanical rewrites. I can’t say I know the subject intimately, but I don’t see any reason offhand why we shouldn’t expect to be able to completely rewrite XML literals to XML string interpolators…

XML is extremely complex. See for example the message by @adriaanm above. And there are tens of subtle cases because of the nature of the XML class hierarchy.

In any case, that tool does not exist yet. Once (if) it exists and proves that it can be performant, of course it will help. But I can’t believe it is a 3 month project - please prove me wrong on that.

I think that what @jducoeur meant is that, despite parsing xml being complex and the xml library being complex, a rewriting tool should do more or less the following:

val literal = <hello> {foo()} </hello>

=>

val literal = xml"""<hello> ${foo()} </hello>"""

That’s a rewrite that looks like it shouldn’t be too hard to automate. The complexity is in (now) the compiler and (then) the string interpolator. That shouldn’t necessarily make the rewrite from literals to interpolated strings complex. Unless we are overlooking something, in which case it would be very valuable to have some examples of cases that cannot be trivially rewritten.
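
To make the target form concrete: a custom interpolator is just a method added to StringContext, so xml"""...""" is ordinary Scala once the literal syntax is gone. Below is a minimal sketch under loud assumptions. Markup and XmlInterpolator are hypothetical names made up for this illustration, not the API of any real library, and the sketch only splices arguments in as text, where a real interpolator would parse the parts as XML.

```scala
object XmlInterpolatorSketch {
  // Hypothetical stand-in for a real XML node type (not part of any library).
  final case class Markup(source: String)

  // Adds an `xml` method to StringContext, so that xml"..." desugars to
  // StringContext(parts...).xml(args...).
  implicit class XmlInterpolator(sc: StringContext) {
    def xml(args: Any*): Markup =
      // Assumption: a real interpolator would parse and validate XML here.
      // This sketch only splices the arguments into the parts as plain text.
      Markup(sc.s(args: _*))
  }

  // Hypothetical embedded expression, mirroring foo() in the example above.
  def foo(): String = "world"

  // The rewritten form from the example above.
  val literal: Markup = xml"""<hello> ${foo()} </hello>"""

  def main(args: Array[String]): Unit =
    println(literal.source)
}
```

The point is only that the rewrite tool does not need to understand XML semantics itself; all of that complexity would live behind the xml method.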

After reading this thread I’m left wondering how many people actually use XML literals for something other than HTML and what is their opinion on having to switch to string interpolation.

As has been pointed out above, today’s HTML syntax (i.e. HTML5-based) is not XML, it is not a subset or superset, either. So if you want XML support in the language, do you really want XML or do you want something more or less suitable for encoding HTML?

Actually this is an unrelated question, IMO. There will be discussion threads for every change in the language.

Correct – a little more to it than that, mostly to preserve whitespace and keep the formatting clear in the case of multi-line tags, but that seems to be most of it.

XML is complex, sure – but it seems like migrating it should be entirely automatable. It doesn’t even look like one of the harder migration problems, unless I’m missing something…

Everything is a 3 month project given sufficient numbers of GSOC interns.

Scalafix already has a RemoveXmlLiterals migration rule, contributed by @allanrenucci, that converts XML literals into interpolators. To run it on your codebase:

coursier launch ch.epfl.scala:scalafix-cli_2.12.4:0.5.10 --main scalafix.cli.Cli -- -r RemoveXmlLiterals path/to/scala-sources

An example diff from migrating lift/framework can be seen here https://gist.github.com/olafurpg/3d71a100ce43c4222b311ed8c5ab67bc. Running the rewrite on the 60k loc Lift codebase took a few seconds on my laptop. It’s also been validated that the Lift sources still compile after the rewrite when using the interpolator library https://github.com/densh/scala-xml-quote developed by @allanrenucci.

Some caveats, if I recall correctly:

  • the interpolator library does not support patterns so the rewrite leaves XML patterns alone.
  • the rewrite unescapes all {{ into { which is incorrect behavior for some presumably rare corner-cases (I don’t remember the details). The strategy was to leave it to users to manually review the diff and validate which {{ should be left unchanged.
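
For anyone unfamiliar with the second caveat: inside XML literal text a doubled {{ stands for a single {, because a lone { starts an embedded Scala expression. A string interpolator has no such rule, so the rewrite drops the doubling. Here is a rough sketch of what a context-blind version of that step does. unescapeBraces is a hypothetical helper written for this post, not Scalafix's actual implementation, and treating }} the same way as {{ is my assumption based on the XML literal escaping rules.

```scala
object BraceUnescapeSketch {
  // Hypothetical helper: naive, text-level unescaping of doubled braces.
  // Assumption: "}}" is handled the same way as "{{".
  def unescapeBraces(text: String): String =
    text.replace("{{", "{").replace("}}", "}")

  // Text as it would be written inside an XML literal.
  val inLiteral: String = "body {{ margin: 0 }}"

  // The same text as it would be written inside an interpolated string.
  val inInterpolator: String = unescapeBraces(inLiteral)

  def main(args: Array[String]): Unit =
    println(inInterpolator)
}
```

A replacement like this cannot tell where a {{ sits in the source, which is presumably why reviewing the diff by hand was the suggested strategy.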

It is not required, but recommended, that your HTML is at the same time XML.

This is excellent news! And I’m amazed (one more time) by Scalafix :)

Good job!

Knowing that, and given that we have a strawman library to test against, the next step seems to be getting metrics and feedback from available (open source, for example) codebases to assess the effort needed to turn the PoC into a good-enough replacement. And perhaps gain one year on the proposed migration plan outlined by @farmdawgnation :)

The current HTML standard defines both an HTML syntax and an XML syntax. They are generally incompatible and they support different subsets of the full standard. I have never encountered the XML syntax in practice; as far as I can tell, it is irrelevant.

You can use an XML DOM for HTML but it is overkill. If we only wanted plain HTML5 support in Scala, the DOM could be simpler (in particular, it doesn’t need to support namespaces). You also need different parsers and renderers for HTML.

Another option is the JSX approach: embed enough of the XML spec into ours to support a practical subset. It sounds like this could get tricky, but at least we have prior art (JSX) to inspire it.

Not only JSX.

TypeScript, which also supports XML literals, is becoming the lodestar among statically typed functional programming languages.

TypeScript even has an advanced type system and some type-level programming libraries. Scala / Scala.js is going to lose the war against deno / TypeScript if the ability to customize the language is gone, as Martin mentioned in another topic:

I am worried about the new goal of Scala 3, which may be very different from the reason why existing Scala users chose the language.

I did forget about Scala 2.14, so that may, indeed, help. I couldn’t say with certainty how much that shrinks our timeline, at least in part because it’s still an unknown quantity. Each step that breaks binary compatibility comes with its own set of migration timelines as everyone gets caught up.

It’ll take us a while to get up to Scala 2.13, for example, once that goes final, because we depend on other libraries that have to go through their own migrations. It’s just hard to predict how long it’ll take folks to catch up. We try to provide support for 18 months after a release, and would like to avoid forcing a hard cut to a new version of Scala in client code if at all possible.

This is very good news, indeed.

I think we may use some of the double braces, so we’d have to evaluate whether or not the behavior is still correct, apart from just compiling.

This is a way off, but if this proposal moves forward, I’d love to take this out for a spin on the main Lift codebase once we have a formal XML interpolator in the language. I’m less inclined to spend the work validating a third-party interpolator because the implementation could change in meaningful ways before adoption into Scala proper, and I don’t want to risk adding further churn for users of Lift.
