Move scala.io and scala.sys.process to external modules

hepin1989 · November 11, 2018, 8:03am

I think the Scala core library itself should be more elegant, and there are many great libraries out there that do this kind of job very well nowadays.
So I want scala.io and scala.sys.process to be Scala modules instead of parts of the core library.

MarkCLewis · November 11, 2018, 2:49pm

I’m going to speak against this for two practical reasons, based on the assumption that these modules won’t be automatically included in scripts. Right now, Scala has a scripting environment that works nicely, but only has easy access to core libraries. Both scala.io and scala.sys.process are really useful for scripting, and it doesn’t make sense to me to require anyone who wants to write a script to have to do a lot more work to get this type of functionality. The appeal of the scripting environment is simplicity.
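For readers less familiar with the two packages, here is a minimal sketch of the kind of scripting task they cover today with nothing but the standard library. The object name `ScriptingDemo` and its helper names are made up for illustration, and the `echo` call assumes a Unix-like system where `echo` is available as an executable.

```scala
import java.nio.file.Files
import scala.io.Source
import scala.sys.process._

object ScriptingDemo {
  // Read all lines of a text file with scala.io.Source, closing the file afterwards.
  def readLines(path: String): List[String] = {
    val source = Source.fromFile(path)
    try source.getLines().toList
    finally source.close()
  }

  // Run an external command with scala.sys.process and return its trimmed stdout.
  // `!!` throws an exception if the command exits with a non-zero code.
  def run(command: Seq[String]): String = command.!!.trim

  def main(args: Array[String]): Unit = {
    // Create a throwaway file so the example does not depend on existing files.
    val tmp = Files.createTempFile("demo", ".txt")
    Files.write(tmp, "first\nsecond\n".getBytes("UTF-8"))
    println(readLines(tmp.toString))
    println(run(Seq("echo", "hello"))) // assumes a Unix-like `echo`
    Files.delete(tmp)
  }
}
```

In a script the body of `main` would simply be top-level statements; it is wrapped in an object here only so that it also compiles as an ordinary source file.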

The second reason is for education. I teach CS1 and CS2 using Scala. The CS1 students have often never programmed before and we use the Scala scripting environment for this. I think it works great. A lot of introductory teaching is moving toward Python these days in large part because they have a simple scripting environment. (I will note that this bothers me because I worry Python allows too many bad habits and doesn’t allow for the teaching of certain key concepts, but that’s a different topic.) Scala scripting is, IMO, a better, type-safe way of doing this. However, it only works well if it remains simple to get access to the key functionality that we want to teach. That definitely includes the file access and basic input that are part of scala.io. For the Scala community to grow, we need more developers with experience using Scala. The only way to really get that to happen at scale is to get colleges to use Scala for several courses, and while changes like this might make sense for developers, they are likely going to hurt the educational use case.

hepin1989 · November 11, 2018, 3:28pm

For scripting, amm is better and more powerful, right?
Moving them to a standalone module would also allow them to evolve quickly and separately. Back to the scripting environment: is it hard for a student to learn to add one line for a dependency? When they finally finish their education, the real world uses tools like sbt, Maven, or Gradle.

I think the main reason colleges are moving to Python is BigData, AI, and TensorFlow, not the friendly scripting environment.

I really hope the Scala core library can be small and elegant, so that it is easy to ship and grow. A lot of code that is not widely used in industry lives in the core library, and it should and could be in separate modules.

I understand your concern. For education, you could have them use a preconfigured online playground, which would not even require them to install anything.

MarkCLewis · November 11, 2018, 4:30pm

Let me repeat that this is for CS1. Most of these students have never programmed before. Every extra step you add is significant. Every extra piece of software you make them download is significant. I have them for eight semesters to get them up to using industry tools. It doesn’t happen in the first semester. Indeed, I have them using sbt (though not setting it up, just cloning a git repository) in CS2. However, what you are describing is taking away readLine and readInt. Those are things I want to be able to do very early on. The educational argument for Scala in CS1 was already damaged when they made it so the readX methods need an import.
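To make the import point concrete, here is a minimal sketch of a first interactive program using the `readLine` and `readInt` methods that live in `scala.io.StdIn`. The object name `Greeter` and the `greeting` helper are made up for illustration.

```scala
// Without this import, readLine and readInt are not in scope in current Scala versions.
import scala.io.StdIn.{readInt, readLine}

object Greeter {
  // Pure helper so the message can be checked without touching standard input.
  def greeting(name: String, age: Int): String =
    s"Hello, $name! Next year you will be ${age + 1}."

  def main(args: Array[String]): Unit = {
    println("What is your name?")
    val name = readLine() // reads one line of text from standard input
    println("How old are you?")
    val age = readInt()   // reads one line and parses it as an Int
    println(greeting(name, age))
  }
}
```

In a script, the import and the five lines inside `main` are the whole program, which is the point about every extra line being a large relative change for a beginner.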

If you think that CS0 and CS1 are being taught in Python because of BigData, AI, and TensorFlow then I think you are a bit out of touch with what those courses cover and what it is like to be a beginner. Courses on BigData and AI can certainly gravitate toward Python, but your first program is “Hello World”, not a highly functional chatbot. The beauty of starting with scripts is that Hello World is a single line of code and responding to simple user input isn’t much longer. Every extra step you throw in at that point is actually a very large percentage change in what students need to know.

My fear with the “playground” idea is that if it is too far removed from real tools, then you are spending time teaching students to use things that aren’t going to matter. Knowing a text editor like vi and how to run scripts on the command line is a tool they will be able to use their whole careers. The same is true for professional IDEs. However, that isn’t true for educational environments in general. For teaching Java people created tools like BlueJ and Greenfoot, in large part to get away from the fact that Hello World is multiple lines of code with lots of keywords that aren’t important for that particular task. The problem is that those tools have to be thrown out quickly because they don’t appear anywhere outside of academia, so students getting familiar with them is purely overhead.

lihaoyi · November 11, 2018, 5:08pm

“Right now, Scala has a scripting environment that works nicely, but only has easy access to core libraries.”

Ammonite makes access to any libraries in the ecosystem really easy; once installed you can run any script you want using whatever library you want, imported just as easily as any core library via import $ivy.

“That definitely includes the file access and basic input that are part of scala.io”

Ammonite comes bundled with OS-Lib, which is modelled on the Python sys/subprocess/os modules and contains all the same functionality laid out in all the same ways. If you think Python is good for simple scripting because of its IO libraries, here you have almost exactly the same thing in Scala.

“The appeal of the scripting environment is simplicity.”

Ammonite is a 1-line 1-file download and install on all major operating systems. Doesn’t get much simpler than that. If they can brew install scala or ... && sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 2EE0EA64E40A89B84B2DF73499E82A75642AC823 && ...-install SBT, they can brew install ammonite-repl just as easily

After that, it’s just amm to open a REPL or amm foo.sc to run a script. Just like Python.

“My fear with the ‘playground’ idea is that if it is too far removed from real tools, then you are spending time teaching students to use things that aren’t going to matter.”

Ammonite is used heavily in many professional environments, both the REPL and scripts. IntelliJ-Scala has first-class support for it built in. It is very much a real tool, and isn’t some toy playground. In fact, it is much more a real tool than Scala’s in-built scripting functionality, which is very much not a real tool: to a first approximation, nobody uses it to do work professionally.

Everything you’ve said so far is compatible with a more streamlined core library and using Ammonite to teach introductory classes via its scripts and REPL. Could be worth giving it a shot.

hepin1989 · November 11, 2018, 8:40pm

Sorry if I jumped to conclusions; I based them on what I see here in China, and things may be very different in other countries.

But I have to say, even in China, where our main language is not English, we get into programming during the first semester. That said, we do not use Scala for the introduction; we start with C99.
Something like Clojure, Racket, or C99 may serve better as a first programming language.

For the tools part, if the main purpose is working in Linux environments, then there should be a dedicated course teaching Linux, the shell, and common tools. But if the main purpose is teaching them what programming is, then I would never think Scala is the best choice. And if you want to teach them Scala, then I think it would be best to teach them the way they will really work once they finish their studies.

Never underestimate the students. They can do it, and they do it well. JetBrains also has special tools for education that may help: https://www.jetbrains.com/education/?fromMenu

Back to the request: moving scala.io and scala.sys.process out will not really add many barriers, and your students will enjoy it and thank you too if they learn something that really works in practice from their first day in college.

hepin1989 · November 11, 2018, 8:49pm

And you could really give Ammonite a try; it’s extremely simple, powerful, and elegant.
And I must confess that I know of no one using Scala scripts for real work.

joshlemer · November 12, 2018, 3:47pm

If people are unsatisfied with scala.io and scala.sys.process, rather than pulling these libraries out of core (since there is no Scala Platform for them to reside in), I’d rather that issues with these packages be addressed by fixing them (or at least describing the issues in tickets so others can fix them). I personally find that Scala’s standard library is already pretty anemic as it is, and have encountered multiple people at my work from different backgrounds (with no experience in Scala) who expressed either disdain for lacking features in the stdlib (upon hearing that something was in stdlib, getting comments like “you’re kidding, Scala actually includes something useful in the standard library for once?”), or explicitly told me that they chose not to adopt Scala for lacking features (specifically, “native” support for JSON).

Are there practical reasons to remove these packages? I don’t see io and sys being major burdens on maintainers of scala/scala.

MarkCLewis · November 12, 2018, 3:52pm

Thank you for the response Li. I have considered using Ammonite multiple times in the past and have just never pulled the trigger on it. It is still on my radar as a possible change. Even with Ammonite though, this does add a complexity that is hard to explain. It is nice that there are special imports to bring in packages like import $ivy, but that isn’t a standard import that will work in other tools, so it has to be explained at some point what it does and why it works there and not in other places. When a student is trying to understand what an if does, expecting them to grasp package management is a rather lofty goal.

MarkCLewis · November 12, 2018, 4:00pm

Instead of going into a long description here about why we use Scala in CS1 and CS2, I’ll just share links to blog posts I wrote about it back in 2013.

Note that I’m primarily comparing to Java in these because in 2013, Java was still very much the dominant introductory language. Python has gained a lot of ground in the last five years.

I would note that I think it is significant to teach things that are useful, but you can’t throw everything at the students at once. You have to ease them into this. As for why not use other languages like C99, Racket, and Clojure, the blog posts probably address this, but the quick answer is that I can’t cover all the topics I want to do in CS1 and CS2 in any of those three languages. Two are dynamically typed and two have no real concept of OO. That means they won’t scale to the concepts that I want to teach in CS2.

One key thing to remember about CS education is that it isn’t about teaching languages, it is about teaching concepts. The languages are the vehicles for teaching those concepts. One of the big advantages of Scala is that it is capable of expressing a broader range of concepts than many other languages, and every language switch you make in an educational space is basically giving up 2-4 weeks of time to get students up to speed on that new language. So you want to find a set of languages that isn’t too large, but which spans paradigms and allows you to cover all the concepts that you want in your curriculum.

sjrd · November 12, 2018, 5:11pm

Well, for one, they are not portable. They don’t work on Scala.js. Yet, since they are in the standard library, trying to use them will result in the code compiling, but then it won’t link (typically with obscure linking errors).

As far as I’m concerned, I think the standard library should only contain portable stuff.

lihaoyi · November 13, 2018, 1:34am

I think the reason they aren’t major burdens on maintainers is that these packages are literally unmaintained. It just takes 1-2 clicks through the git blame to see the last major changes took place >8 years ago, when paulp imported the code wholesale from SBT without review. Since then the code has been reformatted once or twice, and that’s about it.

The community has learned a lot about how to write “good” Scala code in the last 8+ years. If you assigned people to try and improve it without breaking compatibility (otherwise what’s the point vs. just throwing it out?) I think you’d quickly find the burden is impossibly large.

hepin1989 · November 13, 2018, 3:44am

refs: Modularization of Scala 2.13 standard library
refs: https://github.com/scala/scala/pull/5677

adriaanm · November 13, 2018, 1:07pm

Until the official Scala REPL has functionality to easily resolve ivy/maven artifacts (maybe even by default), I don’t want to remove functionality from the stdlib that’s typically used from the REPL. It would be too burdensome to have to pull in a separate module. Thus, this needs to remain in core and ship with the stdlib, which has strict compatibility requirements.

However, the code quality of these packages is lacking, and should be improved.

To reconcile these two, how about we relax our compatibility policy for these packages? We’ve been talking about adding Akka’s ApiMayChange annotation to the stdlib. Here’s their policy for APIs with this annotation: https://github.com/akka/akka/blob/master/akka-docs/src/main/paradox/common/may-change.md
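As a rough illustration of what such a marker could look like on the Scala side, here is a minimal sketch. The `ApiMayChange` class below is a local, hypothetical stand-in defined only for this example; it is not Akka’s actual annotation and not an existing stdlib API, and `ExperimentalIO.firstLine` is likewise invented.

```scala
import scala.annotation.StaticAnnotation

// Hypothetical marker annotation: it only documents intent and has no effect
// on compilation or on binary-compatibility checks by itself.
final class ApiMayChange(val note: String = "") extends StaticAnnotation

object ExperimentalIO {
  // Callers are warned, by convention only, that this signature may change
  // without the usual deprecation cycle.
  @ApiMayChange("signature may change in a minor release")
  def firstLine(text: String): Option[String] =
    text.split('\n').headOption.filter(_.nonEmpty)
}
```

An annotation like this is purely a convention: any relaxed compatibility policy would still have to be stated in documentation and respected by whatever tooling checks binary compatibility.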

yangbo · November 13, 2018, 2:43pm

We can improve scala.io by introducing scala.nio without breaking backward-compatibility.

hepin1989 · November 13, 2018, 6:14pm

Pulling them out would put them on the same competitive footing as the community’s libraries, and people could choose more wisely which one to use.

I want the core library to be more concise, not a supermall.

lihaoyi · November 14, 2018, 3:07am

Here’s another proposal that’s consistent with these requirements: what if we shipped scala.sys and scala.io with the REPL (scala-compiler.jar) rather than with the standard library? Then people who are using the REPL can continue using it, while it can live with the lower binary/source compatibility guarantees of scala-compiler (which makes sense if it’s intended for manual REPL usage) and evolve faster than the standard library.

som-snytt · November 14, 2018, 5:11am

That’s an interesting idea. Stuff good enough to bundle for REPL usage, but maybe not production quality for apps.

My deleted reply mentioned that XML lib is already not on the REPL class path. Adriaan joked that it was incentive for someone to contribute magic imports.

From the lack of uproar, I take it that people aren’t doing much XML.

scala 2.13.0-M5> <hi/>
                 ^
                 error: To compile XML syntax, the scala.xml package must be on the classpath.
                 Please see https://github.com/scala/scala-xml for details.

yangbo · November 14, 2018, 10:17am

Why not just soft-link scala to amm?