Infrastructure in '23

Image courtesy of DALLE-2

For the third year running, I set aside some time at the beginning of the year to share what I believe to be the most dynamic and important areas of innovation in infrastructure. If you share my interest in any one or more of these areas, I would love to hear from you.

The future of cloud is here, and it’s Javascript

While I have written previously about the rise of serverless computing, I was slow to appreciate the role Javascript would play in pushing it forward. Javascript is the only language that lives up to “write once, run anywhere.” It has the most vibrant ecosystem of any language on the planet, unmatched startup times, and is secure enough to run untrusted code on behalf of users without modification or special tooling. There is also a clear plurality of engineers who rely on it as their primary language. Thus, it’s hardly a coincidence that the emerging serverless compute players found success with Javascript developers. Their products present credible alternatives to AWS for hosting web apps, notably by building on V8, the same sandboxing technology that powers Google Chrome. Javascript engines like V8 do not require operating system virtual machines or containers to run, and offer unique performance and security advantages. They also run non-JS code via WebAssembly. Javascript’s dominance is furthered by the way it unifies front-end and back-end development with Typescript, and frameworks like NextJS and Remix. At the same time, runtimes like Bun and Deno are improving upon its usability and performance.

As Javascript continues to surge, so will demand for infrastructure that leverages its sandboxing technology to provide serverless, global compute. The cloud providers currently dominate the market for hosting web apps, but target a far broader developer audience. The resulting design trade-offs leave a widening gap between their offerings and those tailored to Javascript. While the footprint of web apps written in other languages remains vast, new platforms like Fly are challenging the major clouds with a similar, albeit container-based, approach for this audience. I believe the shift in developer preference toward these platforms will become more apparent this year, and continue to grab the attention of startups seeking to reimagine core layers of the application stack for the era of serverless compute.

Workflow systems

In infrastructure, “back-end” elicits thoughts of request-oriented web servers, backed by databases that insert, update, and retrieve data for the front-end. However, there is a second, equally valuable, half of the back-end that handles business-critical, longer-running tasks such as billing, reporting, and reconciliation. Ensuring the stability and scalability of how state is managed across these tasks becomes far more complex due to microservices, and the silos of state they represent. To tame this complexity, engineering teams are increasingly turning to workflow systems. For example, Uber’s ride-hailing “workflow” relies on services to match the passenger with a driver, calculate and maintain the wait time, track the fare, and process payments. This workflow might take hours or days to complete. If a failure occurs, operators need tools to detect and mitigate it, often by retrying the failed step or restarting at some checkpoint. Workflow systems promise to eliminate the need for developers to write the burdensome code that protects against such failures by providing such tools. This promise is indisputably appealing to engineers, and illustrated by the surge in enterprise adoption of open-source projects like Temporal and Conductor.
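To make the retry-and-checkpoint idea concrete, here is a minimal sketch in plain Python. It is not how Temporal or Conductor work internally; the function `run_workflow`, the step names, and the in-memory `checkpoint` dictionary are all hypothetical, and a real system would persist checkpoints durably rather than in process memory.

```python
from typing import Callable, Dict, List, Tuple

# A step is a (name, function) pair; the function mutates the state dict it receives.
Step = Tuple[str, Callable[[dict], None]]


def run_workflow(steps: List[Step], state: dict,
                 checkpoint: Dict[str, dict], max_attempts: int = 3) -> dict:
    """Run steps in order, skipping checkpointed steps and retrying failed ones."""
    for name, step in steps:
        if name in checkpoint:
            # This step completed in an earlier run: restore its state and move on.
            state = dict(checkpoint[name])
            continue
        for attempt in range(1, max_attempts + 1):
            try:
                working = dict(state)  # work on a copy so a failed attempt leaves no trace
                step(working)
                state = working
                checkpoint[name] = dict(state)  # record progress after each success
                break
            except Exception:
                if attempt == max_attempts:
                    raise  # out of retries: surface the failure to the operator
    return state


# Hypothetical ride-hailing steps; the payment step fails twice before succeeding.
calls = {"match": 0, "charge": 0}


def match_driver(state: dict) -> None:
    calls["match"] += 1
    state["driver"] = "driver-42"


def charge_rider(state: dict) -> None:
    calls["charge"] += 1
    if calls["charge"] < 3:
        raise RuntimeError("payment service unavailable")
    state["paid"] = True


steps = [("match_driver", match_driver), ("charge_rider", charge_rider)]
checkpoint: Dict[str, dict] = {}
result = run_workflow(steps, {"rider": "rider-7"}, checkpoint)

# Re-running with the same checkpoint skips every completed step.
resumed = run_workflow(steps, {"rider": "rider-7"}, checkpoint)
```

In this sketch the failed payment step is retried without re-running driver matching, and a second run resumes from the recorded checkpoints instead of starting over. That is the bookkeeping workflow systems aim to take off the developer's plate.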

However, the space seems too important for innovation to stop here. The emergence of workflow systems raises the question of how state management will evolve in this part of the back-end. Today, workflow systems connect to a database service that handles state. As workflow systems become widely adopted, there appears to be an opportunity to better customize databases and persistence layers to support their requirements. One potential outcome could be that workflow systems bundle state management into their cloud offerings to differentiate. It seems equally possible that database vendors will explore ways to tailor their offerings to better position for this increasingly strategic workload. I expect this to be an active design space this year, and am excited to see what comes of it.

The unbundling of OLAP

As businesses embrace frugality in response to the macroeconomic environment, the merits of centralizing data in a single analytics platform are being called into question. While BigQuery and Snowflake’s separation of storage and compute has revolutionized the industry, it can lead to unexpectedly high costs, and comes at the expense of lock-in to their custom storage format. Many are also realizing they don’t have the “big data” that warrants distributed compute to begin with. I believe these factors are contributing to the emergence of a new, unbundled OLAP architecture. In the unbundled OLAP architecture, data is stored directly in object storage like S3 or GCS. Indexing is handled by open-source formats like Hudi and Iceberg, which then structure and provide transactional guarantees over the data to be queried by a distributed query engine like Trino, or in-process with DuckDB. This allows for the right storage, indexing, and querying technologies to be applied to each use case on the basis of cost, performance, and operating requirements. I’ve found it easy to underestimate the power of “ease of use” in infrastructure, which is why I’m particularly excited by DuckDB’s in-process columnar analytics experience. At the same time, open-source projects like DataFusion, Polars, and Velox are making it possible to develop query engines for use cases that were previously considered “too niche” to build for. As the industry standardizes on Arrow for in-memory data representation, the challenge of how data is shared across these new platforms is solved. I expect this will lead to rapid innovation in analytical databases, by commoditizing the approach to query execution that was a major driver of Snowflake’s success.

The success of this architecture seems likely to chip away at the market share of cloud data warehouses. It is being championed by a growing ecosystem of startups, whose collective focus is to reduce the inevitable added complexity of unbundling. I am watching this ecosystem closely, and expect it to spawn multiple large data infrastructure businesses over time.

Foundation Models: a powerful new cloud primitive

Recent breakthroughs in the scaling of language and diffusion models have opened our minds to a new universe of AI applications. Historically, AI adoption has been held back by the technical expertise and proprietary data required to operationalize it. Language models upend this constraint. By using what is effectively “the internet” as a training corpus, models like GPT-3 are scaling to billions of parameters without diminishing returns in performance on general purpose tasks. The implications of this are evident in the Cambrian explosion of products that build on these pre-trained models to bring AI to their users. This illustrates the true power of “foundation models”: their ability to lower the barrier to adoption of AI to near-zero. Exactly how we will consume them is an open question. OpenAI models are proprietary, and only accessible via API. However, there are many cases in which leveraging open-source models via platforms like Hugging Face makes sense. As the performance gap narrows, and the infrastructure for self-hosting models matures, I expect the use of open-source models to become increasingly popular.

As we learn to wield foundation models in useful ways, questions about how to integrate them into software naturally emerge. Natural language “prompting” as a UX breeds new and interesting challenges for developers. The opportunity to build new infrastructure and tools that make it easier to build with language models is increasingly clear, and has become the single most active design space in infrastructure over the past year. Just as compute platforms tailored to web development emerged, so will those for AI developers. I believe the winning platform will offer strong Python support, a seamless experience between the user’s local environment and cloud, the ability to scale out quickly, serverless access to GPUs, and integration with existing data infrastructure and tools. Vector databases built for storage and retrieval of embeddings are another exciting area. OpenAI’s embeddings API makes it easy to build semantic search applications with proprietary data, which should drive demand for them. I also see tremendous, albeit rapidly evolving, opportunity in orchestration. Projects like LangChain and LlamaIndex help developers integrate the process of “prompting” language models into their applications. While multi-modal models with larger context windows may simplify things, I believe we are only scratching the surface of the intelligence and automation that language models will bring to every aspect of software development.
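For readers who have not worked with embeddings, the retrieval step at the heart of semantic search can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the three-dimensional vectors are written by hand, whereas in practice they would come from an embedding model and have hundreds or thousands of dimensions. The document names and the `top_k` helper are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: List[float], index: Dict[str, List[float]],
          k: int = 2) -> List[Tuple[str, float]]:
    """Score every stored vector against the query and return the k best matches."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hand-written toy "embeddings" (assumed values, not produced by any real model).
index = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.1],
    "api-rate-limits": [0.0, 0.2, 0.9],
}

# Pretend this vector is the embedding of a user question about refunds.
query_vector = [0.8, 0.2, 0.1]
results = top_k(query_vector, index, k=2)
```

This brute-force scan compares the query against every stored vector, which is fine for a handful of documents. The case for a dedicated vector database is largely about doing the same lookup over very large collections, typically by using approximate nearest-neighbor indexes instead of a full scan.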

— Bucky