
The state of microservices according to Temporal Technologies



For the end user, cloud-native services are supposed to simplify life and provide more agility, but for the developer, they can make life far more complex because of their distributed nature. Among the challenges is managing state, something that is second nature to database practitioners, but not necessarily app developers. That’s the challenge that Temporal Technologies has taken on, providing the state management behind the orchestration of microservices, picking up where service meshes like Istio leave off.

It’s understandable if you’ve never heard of this two-year-old company, as its sparse website almost makes the company look like it’s still in stealth. But Temporal has several dozen paying customers, among them Datadog, Netflix, Instacart, Qualtrics, Box and others. And if you dig deeply enough, you can actually find some real documentation. And just in case we forget to mention it, Temporal just secured a $103 million Series B round.

Specifically, Temporal pinpoints a narrow task: managing the state of microservices. Given that microservices typically fire up in highly distributed cloud environments, managing state is akin to choreographing transactions in a masterless or multimaster database. That’s a challenge that, for instance, Cassandra developers know quite well. In databases, it’s all about balancing transactional consistency with write availability. In the application, or microservices tier, it’s about availability, where the chain (in this case, compute nodes hosting specific microservices) will only be as strong as its weakest link.

Managing state, which commits transactions, is key to ensuring that results are valid and current, and to keeping the system, whether it is a database or an application, from crashing. For instance, when you withdraw cash from a bank ATM, state management is essential for ensuring that the transaction is completed only when the account has been debited.
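To make the ATM example concrete, here is a minimal sketch of a withdrawal tracked as explicit states. It is illustrative only: the `Withdrawal` class, its state names, and the `dispense_cash` callback are hypothetical and are not drawn from Temporal or any banking system. The point is simply that cash goes out only after the debit is recorded, and that a failed dispense is compensated with a refund.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class State(Enum):
    STARTED = auto()
    DEBITED = auto()
    COMPLETED = auto()
    REFUNDED = auto()


@dataclass
class Withdrawal:
    """Hypothetical ATM withdrawal that records every state transition."""
    balance: int
    amount: int
    state: State = State.STARTED
    history: list = field(default_factory=list)

    def _move(self, new_state):
        self.state = new_state
        self.history.append(new_state.name)

    def run(self, dispense_cash):
        if self.amount > self.balance:
            raise ValueError("insufficient funds")
        # Step 1: debit the account and record that it happened.
        self.balance -= self.amount
        self._move(State.DEBITED)
        # Step 2: hand over cash only after the debit is on record.
        try:
            dispense_cash(self.amount)
        except RuntimeError:
            # Compensate: the customer got no cash, so restore the balance.
            self.balance += self.amount
            self._move(State.REFUNDED)
            return self.state
        self._move(State.COMPLETED)
        return self.state
```

Because each transition is appended to `history`, anything inspecting the withdrawal after a failure can tell whether the account was debited, refunded, or left untouched.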

The need to manage state in distributed environments is critical because, with multiple moving parts, there’s a decent likelihood that one of them will misfire. And so anything running on the Internet or in the cloud requires engineering for failure, involving failover and workarounds so the outage of a single node won’t crash the whole application or service.
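As a rough illustration of what engineering for failure looks like in application code, the sketch below retries a request against one node with a short backoff and then fails over to the next. The function name `call_with_failover`, its parameters, and the idea of modeling each node as a plain callable are assumptions made for this example, not part of any particular product.

```python
import time


def call_with_failover(nodes, request, attempts_per_node=2, base_delay=0.01):
    """Try each node in order, retrying with exponential backoff.

    `nodes` is a list of callables standing in for service replicas
    (a hypothetical simplification). Returns the first successful
    response, or raises RuntimeError if every node fails.
    """
    last_error = None
    for node in nodes:
        for attempt in range(attempts_per_node):
            try:
                return node(request)
            except ConnectionError as err:
                last_error = err
                # Back off before the next attempt: delay, 2x delay, ...
                time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError("all nodes failed") from last_error
```

This handles only the connectivity side of the problem. It says nothing about whether the work behind the request actually completed, which is the gap that state management is meant to fill.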

In the database world, state engines were typically built in; if you launch a database, you don’t have to write your own state engine. In the AppDev world, that’s not the case; developers typically had to write their own.

For microservices, organizations would typically have to write their own state machines in addition to application code. For Temporal customer Checkr, a service that provides online employee background checks, a typical workflow often involves a series of 50 to 60 automated and manual steps (each of them a microservice) retrieving data from a wide variety of external sources. There were lots of Kafka queues to juggle, data to write to multiple target databases, and logic to write for merging the results. With Temporal Server, they could focus on the app rather than the state engine.
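To give a sense of the kind of state engine teams end up writing by hand, here is a deliberately small sketch. It runs a list of named steps, saves a checkpoint to a JSON file after each one, and skips already-completed steps when restarted after a crash. The `run_workflow` function and the checkpoint file layout are hypothetical, and this is not how Temporal itself is implemented; a production version would also need to handle concurrency, timeouts and retries.

```python
import json
import os


def run_workflow(steps, checkpoint_path):
    """Run (name, fn) steps in order, resuming from a checkpoint file.

    Each fn receives the dict of results so far and must return a
    JSON-serializable value. Completed steps are never re-executed.
    """
    results = {}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as fh:
            results = json.load(fh)

    for name, fn in steps:
        if name in results:
            continue  # finished before an earlier crash; skip it
        results[name] = fn(results)
        # Write to a temp file, then rename, so a crash mid-write
        # cannot leave a half-written checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as fh:
            json.dump(results, fh)
        os.replace(tmp_path, checkpoint_path)
    return results
```

Even this toy version shows why the bookkeeping gets burdensome at 50 or 60 steps: every step needs a durable record, and the recovery path has to be as well tested as the happy path.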

Temporal characterizes its solution as “the open source platform for orchestrating highly reliable, mission-critical applications at scale.” For microservices, at first glance that sounds a lot like what service meshes do. But service meshes operate at the infrastructure level, making connections and ensuring failover if nodes go down. By contrast, Temporal focuses on the application level, and more specifically, on checking whether the code or logic in the microservice executed, and if not, managing workarounds to deal with cascading dependencies.

The problem that Temporal solves with microservices is nothing new. As noted above, in the AppDev world, state engines have to be written as external code, or bundled as part of some framework. That’s exactly the problem that Internet applications also had to resolve because the web was stateless, and that’s what led to dedicated middleware, or appservers, to handle state for web applications, where popular languages like Java carried their own mechanisms for managing it.

With Temporal, history is repeating itself in the microservices tier. Its technology, a state management server, comes from a five-year-old open source project that was the outgrowth of work developed at Uber. It’s built around Temporal Server, a microservice orchestration platform that sits between compute servers and executable source code.

That prompts the obvious question: if microservices are distributed in nature, executing in distributed computing environments, won’t a central orchestration server defeat the purpose by introducing a single point of failure? The answer is a new “experimental” multi-cluster asynchronous replication feature that should provide the necessary failover capabilities. When it comes to transactional guarantees for microservices, the future is still a work in progress.


