For well-known reasons, Russian developers have faced unfair treatment in the open-source community. Even so, Yandex, a large Russian technology company, recently open-sourced YTsaurus on GitHub. YTsaurus is a platform for storing and processing big data, and most Yandex services use it.
Maxim Babenko, head of distributed computing at Yandex, said:
Yandex has been developing YTsaurus (known internally as YT) since 2010. Because there was no single solution on the market that could meet all our requirements, we decided to start building our own big data ecosystem. Now YTsaurus is one of the key technologies of Yandex's internal infrastructure.
According to the official statement, YTsaurus is suitable for a wide range of tasks, from data analysis to training complex models with billions of parameters. For example, Yandex Search uses it to build a search index, and self-driving cars use it to process massive amounts of data and improve algorithms. YTsaurus also manages Yandex's supercomputers, distributing the load so that computing power is used as efficiently as possible.
The advantages of the YTsaurus platform include:
- Multi-tenant ecosystem
- Reliability and stability
- Rich functionality
- CHYT powered by ClickHouse
- SPYT powered by Apache Spark
Alexey Bashkeev, head of Yandex Cloud, said:
YTsaurus has proven itself inside Yandex, and now we are making it available to all developers. Large companies that process large amounts of data on thousands of servers with ever-increasing loads will reap the greatest benefits. We believe that open-sourcing YTsaurus will take it to a new level of development, as has happened with our other products.
The source code and documentation for YTsaurus are available on GitHub, and the project is licensed under the Apache 2.0 license. The project address is as follows: link.