I believe that these problems of sample size will be overcome. For instance, LLMs are pretty good at Estonian even though Estonian is far less prevalent than English. How do they do it? Bob West, a colleague at EPFL, dissected the operations of an LLM by instrumenting its internal layers and found that the LLM appears to essentially translate Estonian tokens into something resembling English tokens, do the generation in English, and then translate back. Not because of any explicit instruction; this behavior simply emerged from the training weights. Pretty amazing…
Looking further ahead, we can predict that code generation will become very cheap, but establishing trust that the code does the correct thing will remain expensive. So we will have to work on how one can trust LLM-generated code. Ultimately, it might mean that LLM-generated code needs to come with a proof that it behaves according to its specification. Of course, the proof should be generated automatically as well. But what is the specification? This is where the whole thing cycles back: we need high-level languages and rich type systems to express specifications. And to do that scalably we need good module abstractions. To get module abstractions that are more than just primitive access restrictions and syntactic sugar, we need good support for capabilities.
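To make that last point concrete, here is a minimal sketch of capability-passing in plain Scala 3. All the names in it (`Console`, `greet`, `shout`) are illustrative assumptions of mine, not taken from any particular library or proposal; the idea is just that a function's signature lists the capabilities it needs, so the type itself carries part of the specification.

```scala
// A minimal sketch of capability-passing in plain Scala 3.
// All names here (Console, greet, shout) are illustrative assumptions,
// not from any particular library or proposal.

// A capability: the only way to print in this design.
trait Console:
  def println(line: String): Unit

// A function that performs output must declare the capability it needs,
// so its signature already states part of its specification:
// `greet` may print, but it cannot, say, touch the file system.
def greet(name: String)(using c: Console): Unit =
  c.println(s"Hello, $name")

// A pure function needs no capability, and its signature says so.
def shout(name: String): String = name.toUpperCase

@main def demo(): Unit =
  // Supply a concrete capability only at the program's entry point.
  given Console = new Console:
    def println(line: String): Unit = Predef.println(line)
  greet(shout("world"))  // prints "Hello, WORLD"
```

Once effects are mediated by capability values like this, the type system can express and check specifications such as "this module performs no I/O", which is the kind of module abstraction that goes beyond primitive access restrictions and syntactic sugar.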