Where Do Microservices Go After Entering Deep Water?

Author | Luo Guangming

In 2022, several interesting things happened around microservices. First, Elon Musk, shortly after officially taking charge of Twitter, publicly criticized Twitter's development team. He apologized for how slow Twitter was in many countries and said the app was slow because it had to make more than 1,000 poorly batched RPCs just to render the home timeline. Musk added that part of that day's work would be to shut down bloated "microservices", and that less than 20% of them were actually needed for Twitter to work. Second, Jason Warner, former CTO of GitHub, said on social media that he was convinced one of the biggest architectural mistakes of the past ten years had been going all-in on microservices. Anyone who has built a large distributed system, he said, knows that such systems do not really work that way, yet people have had to get used to it. So, is the microservice architecture a mistake, or are microservices obsolete?

Does the business need microservices?

The origin of microservices can be traced back to the Micro-Web-Service proposed by Dr. Peter Rodgers in 2005, which designs applications as fine-grained Web services. In 2014, Martin Fowler and James Lewis formally proposed the concept of microservices for the first time. They defined the microservice architecture as an approach to building a single application as a group of small services, where each service runs in its own process and communicates with other services through lightweight mechanisms (RPC, HTTP). These microservices are built around business capabilities and can be written in different programming languages.

The birth of microservices is no accident. The emergence and development of CI/CD and Infrastructure as Code gradually simplified infrastructure management and made feature iteration faster and more efficient, which in turn promoted the further development and adoption of microservices.

However, the microservice architecture is not a silver bullet. It brings benefits to the business, but it also brings side effects. Adopting microservices at the right stage, choosing a reasonable split granularity, and selecting appropriate microservice technology are the core factors that determine whether microservices succeed or fail. Criticizing the very existence of microservices and clamoring for a return to the monolithic era while ignoring these three points is untenable. A careful reading of Jason Warner's views shows that what he advocates is an orderly, step-by-step split from monolith to microservices, with the number of microservices matched to the size of the organization. Warner encourages enterprises to choose according to their own situation rather than follow blindly. This is essentially consistent with the view above.

The right stage to adopt the microservice architecture

It has been observed that almost all successful microservice architectures started out as a large monolith. With a new product in a new field, it is hard to understand the business clearly at the beginning. Usually the business only becomes clear after some time, and the system is then gradually transformed into a microservice architecture. In addition, when hardware and staff are limited, adopting microservices is relatively risky and many of the advantages do not show. When the technology is poorly chosen, the performance disadvantages of microservices become even more prominent.

Before splitting into microservices, the company should also make sure that its internal infrastructure and shared basic services are ready. Service faults can be located quickly through monitoring components and products; services can be deployed and managed automatically through tooling; development costs can be reduced through a one-stop development framework; service availability can be verified and improved through gray release and swim lanes; resources can be requested and released quickly through a resource scheduling platform; and applications can be scaled out quickly through elastic scaling services.

Reasonably determine the split granularity of the microservice architecture

The split granularity of a microservice architecture should be reasonable. The "micro" in microservices does not mean that smaller is better, nor that more services are better; the goal is a reasonable middle ground. If the granularity is too fine, the disadvantages of microservices are magnified, and compared with the original monolith they really do outweigh the advantages. If the granularity is too coarse, the disadvantages of the monolith are not eliminated and the advantages of microservices do not show, which is equally unreasonable. Deciding the granularity is a very complicated matter. For newcomers or outsiders who lack a deep understanding of the business and the team's organizational structure, judging a reasonable granularity is practically impossible.

As the business develops, the organizational structure changes, and the team's developers become more skilled, the right granularity may change at any time. This is a process of continuous evolution, and there is no absolute right or wrong. The design of a microservice architecture and the splitting of microservices should conform to Conway's law: the architecture needs to stay consistent with the structure of the organization that produces it, and to adjust dynamically as that organization changes.

Of course, being familiar with the business and the team's organizational structure is not enough; a reasonable splitting strategy is also required. For example: give priority to adopting microservices for relatively independent new businesses, to extracting general-purpose services, to extracting services with clear boundaries, and to extracting services that can stand on their own. Finally, we recommend prioritizing the extraction of core services. Because microservices are costly to operate and maintain, not everything needs to be split. In addition, the business may change over time, whereas core services are relatively stable and do not require major adjustments.

Proper Microservice Technology Selection

After deciding to use microservices, the selection of microservice-related technologies is crucial. A microservice framework encapsulates and abstracts the capabilities commonly needed in distributed scenarios, such as load balancing, service registration and discovery, fault tolerance, and basic remote communication, so that developers can quickly build high-quality services. Therefore, before adopting microservices, the team should choose a development language and then select and trial a corresponding microservice framework.

So how should a microservice framework be chosen? We recommend evaluating it from four perspectives: scalability, ease of use, feature richness, and performance.

The first is scalability. If an open source framework is tightly coupled to its original internal capabilities, supports only a single scenario, and cannot be extended, it will be hard to adopt in an enterprise's customized scenarios.
The second is ease of use. Business developers do not want to deal with the framework's low-level details, so it must be simple enough to use. Framework developers who build on top of it need to add custom support. If the extension points the framework provides are too broad, extending it is costly; if they are too narrow, the framework is limiting.
The third is feature richness. Although a framework can be customized through its extension points, not every developer has the time for custom development. If the framework itself ships implementations for its various extension points, developers only need to combine them according to their infrastructure to run it in their own environment.
These three points are the indicators to focus on when selecting a microservice framework in the early stage of a microservice transformation. However, as service scale and resource consumption grow, performance becomes an issue that cannot be ignored. In the long run, performance must be considered when choosing a framework; otherwise the team will face either the huge cost of replacing the framework or being forced to customize, optimize, and maintain it themselves.

In general, microservices are not a silver bullet, but monoliths also have many defects. Microservices are a modern, cloud-native architecture that evolved naturally from the monolith, and they are the general trend. The business simply needs to adopt microservices at the right stage, choose a reasonable split granularity, and select appropriate technology. Cases where the pace was too fast, the split was unreasonable, or the technology was poorly chosen, and the architecture had to be rolled back to a monolith, are isolated, but everyone should stay alert to them and treat them as a warning.

Microservices have been popular for nearly ten years, and many technological innovations and open source projects have been born around them. Quite a few companies have completed the internal rollout of microservices. Take ByteDance as an example: the number and scale of its microservices have grown rapidly in recent years. In 2018, the number of online microservices was about 7,000 to 8,000. By the end of 2021, it had exceeded 100,000. It is quite rare for the number of microservices within one enterprise to reach this scale, and it also means that microservices at ByteDance have entered deep water. So what is the next step?

Combining the practices of the ByteDance service framework team with industry trends, we believe the future development of microservices will mainly focus on security, stability, cost optimization, and the standardization of microservice governance.

Microservice Security

In a microservice architecture, the number of microservices grows rapidly as the business is decomposed and expands. Because microservices are distributed across different servers, they often expose a larger and more diverse attack surface than a monolithic implementation of the same platform, which makes it both more important and more difficult to find and fix vulnerabilities quickly. Therefore, each microservice needs to authenticate and authorize user actions and to establish the identity and permission level of the current user. At the same time, the system as a whole may need to provide certain services externally, such as third-party login authorization. If every microservice had to implement its own user management, this would increase not only the development workload but also the probability of errors. Unified authentication and authorization, together with support for enabling mTLS on demand, are therefore particularly important.

Service authentication is an access-control capability that verifies the legitimacy of the requester's identity. A service gets strict identity authentication by turning on the global switch for a specific method or for a wildcard method. To meet compliance requirements, ByteDance has carried out dedicated governance of data access across business lines. A prerequisite for this governance is fully rolling out zero-trust service identity (ZTI), so that the trusted identity of the data requester can be established and fine-grained access control can be built on top of it. This satisfies user privacy compliance requirements, secures microservice interfaces, and prevents mistakes such as accidentally deleting a database.

Strict authorization is an access-control capability based on an allow list plus a deny list. A service gets strict access control by authorizing, for a specific method or a wildcard method, a specific cluster or a wildcard cluster of an upstream service, in combination with turning on the global switch.
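The allow-list and deny-list model can be sketched in a few lines of Go. This is only an illustration under stated assumptions: the type names, the "*" wildcard syntax, and the rule that a deny match takes precedence over an allow match are all hypothetical and are not taken from ByteMesh.

```go
package main

import "fmt"

// Rule is a hypothetical authorization rule. Cluster and Method may be "*".
type Rule struct {
	Upstream string // name of the calling service
	Cluster  string // calling cluster, or "*" for any cluster
	Method   string // target method, or "*" for any method
}

// Policy holds the global switch and both lists for one service.
type Policy struct {
	StrictEnabled bool
	Allow         []Rule
	Deny          []Rule
}

func wildcard(pattern, value string) bool {
	return pattern == "*" || pattern == value
}

func (r Rule) matches(upstream, cluster, method string) bool {
	return r.Upstream == upstream && wildcard(r.Cluster, cluster) && wildcard(r.Method, method)
}

// Allowed applies the deny list first (assumed precedence). With the global
// switch on, a call must also match an allow rule to pass.
func (p Policy) Allowed(upstream, cluster, method string) bool {
	for _, r := range p.Deny {
		if r.matches(upstream, cluster, method) {
			return false
		}
	}
	if !p.StrictEnabled {
		return true
	}
	for _, r := range p.Allow {
		if r.matches(upstream, cluster, method) {
			return true
		}
	}
	return false
}

func main() {
	p := Policy{
		StrictEnabled: true,
		Allow:         []Rule{{Upstream: "order", Cluster: "*", Method: "GetUser"}},
		Deny:          []Rule{{Upstream: "order", Cluster: "canary", Method: "*"}},
	}
	fmt.Println(p.Allowed("order", "default", "GetUser"))   // allowed by the allow list
	fmt.Println(p.Allowed("order", "canary", "GetUser"))    // blocked by the deny list
	fmt.Println(p.Allowed("billing", "default", "GetUser")) // not listed, blocked in strict mode
}
```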

Some RPC frameworks also offer strict authorization, but the implementations are not aligned across frameworks, and it is increasingly difficult for them to meet business configuration requirements. ByteDance implements a unified strict authorization capability on top of its service mesh, ByteMesh, which is possible because service mesh technology has been fully rolled out within ByteDance.

Microservice Stability

The stability of online services is crucial to Internet applications. Unstable services give users a bad experience and can even cause direct economic losses to the enterprise. An important indicator of service stability is the SLO (Service-Level Objective), which expresses the target level of service availability. For example, if the success-rate SLO of a service interface is set to 99.99%, then the time during which the success rate falls below that baseline within a week must stay under X hours; if it exceeds X hours, the service's stability is not up to standard. Typically, standard governance capabilities such as timeouts, retries, fault tolerance, and rate limiting cover most scenarios and improve the stability of microservices. ByteDance is heavily invested in microservices and runs them at very large scale. To support fine-grained service governance, it developed a series of solutions such as single-instance governance and dynamic overload protection to further improve stability.

Microservices and large-scale distributed deployment add an operational burden, and it is sometimes hard for a business team to tell whether a problem lies in the infrastructure or in the business logic itself. The most common problem of this kind, and one that business teams usually should not have to care about, is that degraded service capability on a very small number of instances (one or a few) causes the service's SLA to jitter. We refer to these collectively as the single-instance problem. Its causes are complex, mixing physical-level and business-level factors, and it is almost impossible to eradicate completely.

To reduce the impact of single-instance problems on the business and improve the SLA of core services, the ByteDance service framework team built a solution for single-instance jitter based on the dynamic load balancing capability of the service mesh ByteMesh. The solution collects RPC metrics (latency, error rate, and so on) and dynamically updates the weights of server instances to achieve a more even service load. This centrally controlled solution has been verified at scale in Douyin e-commerce and other businesses and has effectively improved the efficiency of fault identification. Taking one specific service as an example, when a single-instance failure occurs, the instance is downgraded or removed through service discovery within 1 minute under the default configuration, so the time the service spends below its SLO is kept within 1 minute. The solution also provides a dashboard of removed instances, which makes single-instance problems easier to troubleshoot.
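The idea of deriving instance weights from RPC metrics can be sketched as follows. The formula here (scale a base weight by mean latency over instance latency and by the success rate, then clamp) is an assumption made for illustration and is not ByteMesh's actual algorithm; all names and constants are hypothetical.

```go
package main

import "fmt"

// InstanceStats holds recent RPC metrics for one server instance.
type InstanceStats struct {
	Addr      string
	LatencyMs float64 // recent average latency in milliseconds
	ErrorRate float64 // fraction of failed RPCs, between 0 and 1
}

const (
	baseWeight = 100.0
	minWeight  = 1.0 // keep a trickle of traffic so recovery can be observed
	maxWeight  = 200.0
)

// Reweight lowers the weight of instances that are slower than the group
// mean or that return errors, and raises the weight of faster ones.
func Reweight(stats []InstanceStats) map[string]float64 {
	weights := make(map[string]float64, len(stats))
	if len(stats) == 0 {
		return weights
	}
	var sum float64
	for _, s := range stats {
		sum += s.LatencyMs
	}
	mean := sum / float64(len(stats))
	for _, s := range stats {
		w := baseWeight
		if s.LatencyMs > 0 {
			w *= mean / s.LatencyMs
		}
		w *= 1 - s.ErrorRate
		if w < minWeight {
			w = minWeight
		}
		if w > maxWeight {
			w = maxWeight
		}
		weights[s.Addr] = w
	}
	return weights
}

func main() {
	stats := []InstanceStats{
		{Addr: "10.0.0.1:8888", LatencyMs: 10, ErrorRate: 0},
		{Addr: "10.0.0.2:8888", LatencyMs: 10, ErrorRate: 0},
		{Addr: "10.0.0.3:8888", LatencyMs: 100, ErrorRate: 0.5}, // the faulty instance
	}
	weights := Reweight(stats)
	for _, s := range stats {
		fmt.Printf("%s weight=%.0f\n", s.Addr, weights[s.Addr])
	}
}
```

A load balancer that picks instances in proportion to these weights sends far less traffic to the faulty instance without removing it outright.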

Microservice Cost Optimization

In the cloud-native era, as microservices keep being split and grow at scale, services inevitably become too fine-grained, and the extra latency and resource consumption become more and more prominent. There are also many anti-microservice voices in the industry calling for a return to the monolith. However, the monolith is not a silver bullet either. Returning to it is like drinking poison to quench thirst, and it does not fit the direction in which businesses are developing. ByteDance has therefore explored many paths for microservice cost optimization, including but not limited to merged deployment, JSON serialization optimization, and development framework overhead optimization. In addition, as the company pushes cost optimization, the service framework team has been working more closely with business teams and other infrastructure teams to find further performance optimization opportunities.

Merged deployment

ByteDance has explored performance optimization on top of the microservice architecture in depth, including several merged deployment solutions. Other companies in the industry have also explored merged deployment, but their solutions have a large impact on existing systems in terms of compilation, deployment, monitoring, service governance, and service isolation, and do not fit well with what is already in place. Deploying services together also makes them affect one another, which lowers collaboration efficiency. In response, the infrastructure team proposed a new merged deployment solution. Its main idea is to combine container affinity scheduling, traffic scheduling calculation, and more efficient local communication, so that network communication that originally crossed machines becomes inter-process calls on the same machine. This integrates with the existing system while reducing the performance loss caused by long microservice call chains. At QCon 2021, the infrastructure service framework team presented "ByteDance Microservice Consolidation and Deployment Practice", which received widespread attention from other companies in the industry.
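To make the "local communication" part concrete, here is a minimal Go sketch of how a client might choose a transport once affinity scheduling has placed caller and callee on the same machine. The use of Unix domain sockets, the Endpoint type, and the PickTransport function are assumptions for illustration; the article does not say which local communication mechanism ByteDance uses.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// Endpoint describes a callee instance. SocketPath is only set when the
// instance also listens on a local Unix domain socket (hypothetical field).
type Endpoint struct {
	Host       string
	Port       int
	SocketPath string
}

// PickTransport returns arguments suitable for net.Dial. It prefers a Unix
// domain socket when the callee runs on the caller's own host, and falls
// back to TCP for callees on other machines.
func PickTransport(localHost string, e Endpoint) (network, address string) {
	if e.Host == localHost && e.SocketPath != "" {
		return "unix", e.SocketPath
	}
	return "tcp", net.JoinHostPort(e.Host, strconv.Itoa(e.Port))
}

func main() {
	callee := Endpoint{Host: "10.0.0.5", Port: 8888, SocketPath: "/tmp/callee.sock"}

	network, addr := PickTransport("10.0.0.5", callee) // caller on the same host
	fmt.Println(network, addr)

	network, addr = PickTransport("10.0.0.9", callee) // caller on another host
	fmt.Println(network, addr)
}
```

The scheduling side then has the job of making the same-host case as frequent as possible for the busiest caller and callee pairs.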

Taking an IO-intensive test service as an example, after adopting merged deployment its CPU usage dropped by 30% to 40%, its latency became more stable, and the fluctuation problem disappeared. Merged deployment is currently in use by multiple business teams at ByteDance. After further optimization of performance and strategy, and with global decision-making control, it is expected to be rolled out at large scale across the company.

JSON serialization optimization

With its concise syntax and flexible, self-describing format, JSON is widely used across Internet services. However, because JSON is essentially a text protocol and has no mandatory schema like Protobuf does, encoding and decoding are often very inefficient. Combined with poor selection and use of JSON libraries by some business developers, this can cause service performance to deteriorate sharply. ByteDance ran into these problems too. According to profiling data from the company's top 50 services by CPU usage, JSON encoding and decoding accounted for close to 10% of overall overhead, and for individual businesses the share even exceeded 40%. Improving the performance of the JSON library was therefore very important. ByteDance developed a set of high-performance JSON codec libraries, sonic-go for Go services and sonic-cpp for C/C++ services, both of which are open source under the ByteDance GitHub organization.

In designing sonic-go, the team borrowed optimization ideas from other fields and languages (not limited to JSON) and applied them at each processing stage. There are three core technologies: JIT, lazy-load, and SIMD. Beyond these, sonic-go contains many detailed optimizations, such as using RCU instead of sync.Map to speed up loading of the codec cache and using a memory pool to reduce memory allocation for the encode buffer. Since its release in July 2021, sonic-go has been adopted by many businesses such as Douyin and Toutiao, saving ByteDance hundreds of thousands of CPU cores. sonic-go v2 is currently being designed and developed and is expected to bring further performance improvements.
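The memory-pool idea mentioned above can be illustrated with Go's standard sync.Pool. This sketch uses the standard library's encoding/json as a stand-in and does not reproduce sonic-go's internals; EncodePooled is a hypothetical name.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"sync"
)

// bufPool recycles encode buffers so that each call does not have to grow
// a fresh buffer from zero.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// EncodePooled marshals v to JSON using a pooled scratch buffer.
// json.Encoder appends a trailing newline to its output.
func EncodePooled(v any) ([]byte, error) {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()
	defer bufPool.Put(buf)

	if err := json.NewEncoder(buf).Encode(v); err != nil {
		return nil, err
	}
	// Copy the result out, because the buffer goes back to the pool and
	// will be overwritten by a later caller.
	out := make([]byte, buf.Len())
	copy(out, buf.Bytes())
	return out, nil
}

func main() {
	out, err := EncodePooled(map[string]int{"a": 1})
	if err != nil {
		panic(err)
	}
	fmt.Print(string(out))
}
```

The final copy still allocates once per call, but at the exact output size; the repeated reallocation while the buffer grows is what the pool avoids.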

sonic-cpp combines the design strengths of rapidjson, yyjson, and simdjson and optimizes further on that basis. Its implementation makes full use of key techniques such as vectorized (SIMD) instructions, an optimized memory layout, and on-demand parsing, so that serialization, deserialization, and CRUD operations achieve very high performance. Since sonic-cpp was launched internally at ByteDance, it has saved hundreds of thousands of CPU cores for core businesses such as Douyin and Toutiao.

Both sonic-go and sonic-cpp are compatible with the interfaces of common JSON libraries, so the cost of switching is very low and existing services can be migrated quickly. The larger the business, the greater the benefit of migrating to sonic-go and sonic-cpp.

Development framework overhead optimization

For a business, choosing a high-performance programming language is very important, but the choice is often constrained by the R&D team's technology stack and history. Language migration is also costly and needs strong motivation to push through. For start-ups, or for new business lines in companies of a certain size, Golang is one of the best choices in the cloud-native era. The language is not hard to master, it fits well with the cloud-native ecosystem, and it has high performance.

In 2014, Golang was introduced at ByteDance to quickly solve the high-concurrency problem faced by the long-connection push business. The technical team then launched the Kite and Ginex (a Gin-based extension) frameworks, which greatly promoted the use of Golang within the company. However, because many of their dependencies were outdated when they were released (Thrift, for example, was only at v0.9.2 at the time), Kite and Ginex had many significant problems. For these reasons, the service framework team officially started rebuilding Kite, ByteDance's in-house RPC framework, and named the new framework Kitex. This was a bottom-up refactoring and upgrade designed around the demands of performance and scalability. Similar design ideas and underlying modules were also applied to Hertz, ByteDance's in-house Golang HTTP framework, which likewise offers high performance and high scalability. Since then, Kitex and Hertz have been adopted at large scale and continue to be iterated and optimized around performance and scalability. In 2022, Kitex launched optimizations dedicated to serialization and Thrift, further reducing the overhead of data copying and IO operations.

Once Golang has been chosen, it is also crucial to choose an appropriate Golang microservice framework. As mentioned earlier, the most suitable framework should offer high scalability, good usability, high performance, and sufficiently rich built-in features. CloudWeGo is such a collection of open source microservice frameworks and middleware, and it includes Kitex and Hertz, the ByteDance open source projects mentioned above. It is a recommended microservice framework in the Golang ecosystem and is also included in the CNCF Landscape. Besides its large-scale use in ByteDance's internal businesses, since being open-sourced in 2021 CloudWeGo has been adopted by dozens of companies, including Semir, Huaxing Securities, Tanwan Games, Heduo Technology, and Mutong Technology. Their CloudWeGo-based deployments range from dozens to thousands of microservices, and all have seen gains in performance and stability after going online.

Microservice Governance Standardization

Microservices are inseparable from supporting governance capabilities, such as service observability, full-link stress testing and gray release, registration and discovery, and the configuration center. These capabilities are implemented through service frameworks, SDKs, Java agents, and service meshes. As these technologies developed, the industry ended up with a hundred flowers blooming: many different development languages, frameworks, and architectures emerged, which brought enterprises heavy maintenance burdens and difficulty in technology selection. Interoperation between different frameworks also suffers from various kinds of loss and high complexity. Different microservice development frameworks and toolchains understand and implement the service governance system differently, which hinders the accumulation and long-term development of microservice technology. End users have to make difficult choices among different infrastructures and tools to solve the problems they encounter when implementing a microservice architecture, which increases enterprises' adoption costs.

To solve this problem, two microservice governance standardization efforts were launched in 2022. Both target multi-language, multi-framework, and heterogeneous infrastructure, cover key areas such as traffic governance, service fault tolerance, service metadata governance, and security governance, and provide a series of governance capabilities and standards, ecosystem adaptations, and best practices.

In March 2022, the NextArch Foundation officially announced the establishment of its microservice SIG. Technical experts from Tencent, ByteDance, Qiniuyun, Kuaishou, BIGO, TAL, and BlueFocus became the first members. The group is committed to promoting the sustainable development of microservice technology and its open source ecosystem. Drawing on its members' experience and pain points in distributed and microservice production practice, it will output standardized microservice solutions for multi-language, multi-framework, and heterogeneous infrastructure across different industries and application scenarios. It will also work with open source communities such as PolarisMesh, TARS, go-zero, GoFrame, CloudWeGo, and Spring Cloud Tencent to provide out-of-the-box reference implementations, lowering the barrier to adoption for microservice users.

In April 2022, the microservice governance specification project OpenSergo was officially open-sourced. OpenSergo is an open, general-purpose microservice governance project that covers microservices and their related upstream and downstream components. From the microservice perspective, it covers key areas such as traffic governance, service fault tolerance, service metadata governance, and security governance, and it provides a series of governance capabilities and standards, ecosystem adaptations, and best practices, with support for Java, Go, Rust, and other language ecosystems. The project was jointly initiated by enterprises such as Alibaba, bilibili, China Mobile, and SphereEx and by communities such as Kratos, CloudWeGo, ShardingSphere, Database Mesh, Spring Cloud Alibaba, and Apache Dubbo, which jointly lead the construction of microservice governance standards and the evolution of capabilities. OpenSergo's most distinctive feature is that it defines service governance rules with a single unified set of configuration, DSL, and protocols, targets multi-language heterogeneous architectures, and covers both microservice frameworks and their associated upstream and downstream components.

On the whole, microservice governance standards are still at an early stage. From defining the standards, to implementing the control plane, to implementing multi-language SDKs and governance functions for Java, Go, C++, Rust, and others, and then to integrating and landing them in the various microservice ecosystems, a great deal of evolution work remains, and further iteration and improvement are needed.

Summary and Outlook

Microservices are not a silver bullet, but they are not obsolete either. They are not a free lunch, and implementing them brings real challenges; nor are they a menace to be avoided. Their continuous, rapid development has made microservices part of the infrastructure of cloud computing, alongside compute, storage, networking, databases, and security. Microservices simply face different challenges at each stage of development. Before cloud native became widespread, microservice developers focused on the architecture, iteration, delivery, and operation and maintenance of microservices. As cloud-native technology matures, microservices are becoming cloud-native as well. Developers and architects now care more about how to use the strengths of the cloud to simplify the governance and operation of microservices, and focus more on business delivery efficiency. The selection of and migration to high-performance service frameworks, and the standardization of microservice governance, are also directions that need continued exploration as microservices enter deep water.

With the development of cloud native and microservices, the service mesh emerged as the times required. We usually refer to an architecture with the service mesh at its core as the cloud-native microservice architecture. It has four characteristics: computing resources are elastic; basic microservice capabilities are native to the platform; the service mesh schedules traffic in a unified way; and the governance and upgrade problems of multi-language RPC are addressed. In addition, a service mesh makes it easier to implement security control and stability governance for microservices. However, the service mesh is not a silver bullet either and cannot solve every problem. Under the cloud-native microservice architecture, the sidecar increases the complexity of the system and of operations. Some community implementations perform poorly and add significant latency to microservice communication. The problem of multi-language SDKs for components still exists and remains serious. Dependencies such as gateways still have to be integrated explicitly.

To keep pushing general capabilities down into the infrastructure and to reuse the basic capabilities of the service mesh, the cloud-native microservice architecture is gradually evolving into a multiple-runtime microservices architecture (Multiple-Runtime Microservices). The multi-runtime architecture pushes multi-language SDKs further down into the runtime and provides safe, flexible, and controllable changes. Of course, it also has limitations: the business still needs to be modified before it can be onboarded, and competition for resources between the runtime and the business causes some more complicated problems. To solve these problems, we hope to achieve more standardized, platform-based development and operation capabilities for the service mesh, to standardize the definitions of sidecar and runtime, and to make the operations platform more standardized and easier to use.

The road ahead is long, and we will keep searching high and low.

About the author:

Luo Guangming is an architect on the ByteDance service framework team. He previously worked on cloud native, microservices, and open source at Ericsson and Baidu, and then joined ByteDance, where he is responsible for the open source work of microservice projects such as CloudWeGo. He has long focused on cutting-edge technologies, architecture evolution, and standardization in the cloud-native and microservice fields.
