And you rarely do (in terms of actually committing the memory), even when using platform threads directly, as we all do today. If you set -Xss8m that doesn’t mean every new thread eagerly asks the operating system to physically allocate that much memory. The address space is reserved up front, but the memory is committed lazily, so you can set e.g. -Xss1g and it should work even with many threads, as long as you don’t create deep call stacks.
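If you want to check this yourself, here is a minimal sketch. It uses the `Thread` constructor that takes a `stackSize` argument instead of `-Xss`; the class name, thread count and stack size are made up for illustration, and the Javadoc says the JVM treats `stackSize` only as a hint. Each thread requests a 32 MB stack but keeps a shallow call stack, so you can watch the resident memory of the process stay far below threads × stack size.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

class LazyStackDemo {
    // Starts `threadCount` platform threads, each requesting `stackBytes` of stack,
    // and returns how many of them were alive at the same time.
    static int runThreads(int threadCount, long stackBytes) {
        CountDownLatch started = new CountDownLatch(threadCount);
        CountDownLatch release = new CountDownLatch(1);
        List<Thread> threads = new ArrayList<>();
        Runnable body = () -> {
            started.countDown();
            try {
                release.await(); // shallow call stack: only a few stack pages get touched
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        try {
            for (int i = 0; i < threadCount; i++) {
                // stackSize is a hint; the JVM may round it or ignore it entirely.
                Thread t = new Thread(null, body, "big-stack-" + i, stackBytes);
                t.start();
                threads.add(t);
            }
            started.await(); // every thread is now running and parked on `release`
            int alive = 0;
            for (Thread t : threads) {
                if (t.isAlive()) alive++;
            }
            release.countDown();
            for (Thread t : threads) {
                t.join();
            }
            return alive;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // 64 threads x 32 MB = 2 GB of requested stack, almost none of it committed.
        int alive = runThreads(64, 32L * 1024 * 1024);
        System.out.println("threads alive at once: " + alive);
    }
}
```

To see the effect, compare the process RSS (e.g. in `top`) with the requested total while the threads are parked; you would need to add a pause before `release.countDown()` for that.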
Who implements tracing, and where? Where’s the extra CPU cost? When running Java bytecode the JVM needs to create Java stack frames anyway. The JVM doesn’t care whether you run your code inside a Future or not, or whether you run it inside a virtual thread or directly on the underlying platform thread. If a call is not inlined, the JVM has to create a new stack frame. A stack trace is just a metadata dump of the current stack frames (it includes inlined calls too, but the data for inlined calls can be deterministically inferred from the stack frames of the outer calls).
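A small sketch of that point, using Java's `CompletableFuture` as a stand-in for whatever Future you have in mind (that substitution, and all the names below, are my assumptions). The same call chain is run once directly and once inside the future, and in both cases the trace is read off the frames that exist anyway.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

class StackTraceDemo {
    // Dumps the current thread's stack frames as a list of method names.
    static List<String> inner() {
        return Arrays.stream(Thread.currentThread().getStackTrace())
                .map(StackTraceElement::getMethodName)
                .collect(Collectors.toList());
    }

    static List<String> outer() {
        return inner();
    }

    // The call chain outer -> inner, invoked directly on the calling thread.
    static List<String> direct() {
        return outer();
    }

    // The same call chain, invoked inside a CompletableFuture on a pool thread.
    static List<String> viaFuture() {
        return CompletableFuture.supplyAsync(StackTraceDemo::outer).join();
    }

    public static void main(String[] args) {
        System.out.println("direct:     " + direct());
        System.out.println("via future: " + viaFuture());
    }
}
```

Both traces contain `inner` and `outer`; only the frames below them differ (the caller's frames in one case, the pool's worker frames in the other).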
I think I have to guess where the cost you’re mentioning comes from. Is it the cost of the GC scanning the GC roots? Have you measured that? People are running thousands of virtual threads at once and reporting massive throughput advantages, so most probably the cost of scanning GC roots is not dominating anything.
If you’d like to make really deep call stacks using tail calls and don’t want them to occupy a lot of memory, then tail call optimization is in the scope of OpenJDK: Loom (opt-in per method):
The goal of this Project is to explore and incubate Java VM features and APIs built on top of them for the implementation of lightweight user-mode threads (fibers), delimited continuations (of some form), and related features, such as explicit tail-call.
but then (most probably) you’ll lose some of the stack trace information. Anyway, who needs really deep stack traces? If the stack trace in your error logs is deeper than e.g. 1000 calls (in total, including suppressed exceptions, causes and so on), then you’re sabotaging yourself anyway.
In short, you’re talking about some extra costs, but you haven’t explained where they supposedly come from. Show at least some hypothetical numbers and do some calculations. Compare the costs of virtual threads to the costs of raw platform threads.
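As a starting point for such a comparison, here is a rough sketch that runs the same blocking tasks on virtual threads and on one platform thread per task. It assumes Java 21 or newer; the class name, task count and sleep time are arbitrary, and a serious measurement would use JMH and a realistic workload rather than `Thread.sleep`.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class ThreadCostSketch {
    // Runs `tasks` blocking tasks on the given executor and returns how many completed.
    static int runBlockingTasks(ExecutorService executor, int tasks, Duration sleep) {
        AtomicInteger done = new AtomicInteger();
        try (executor) { // close() waits until all submitted tasks have finished
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    try {
                        Thread.sleep(sleep); // stands in for blocking I/O
                        done.incrementAndGet();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
        }
        return done.get();
    }

    // Wall-clock time in milliseconds for one run; crude, with no warm-up.
    static long timeMillis(ExecutorService executor, int tasks) {
        long start = System.nanoTime();
        runBlockingTasks(executor, tasks, Duration.ofMillis(100));
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int tasks = 2_000; // arbitrary; raise it to see where platform threads start to hurt
        long virtualMs = timeMillis(Executors.newVirtualThreadPerTaskExecutor(), tasks);
        long platformMs = timeMillis(
                Executors.newThreadPerTaskExecutor(Thread.ofPlatform().factory()), tasks);
        System.out.println("virtual threads:  " + virtualMs + " ms");
        System.out.println("platform threads: " + platformMs + " ms");
    }
}
```

Numbers from a run like this, plus the memory footprint of the process in each case, would at least give us something concrete to discuss.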