US-based open-source data orchestration company Alluxio has raised US$50 million in a Series C round of funding, capital the company will use to fuel the growth of its global operations and continue building out the capabilities of its data orchestration software for managing large-scale distributed data workloads.
The company is particularly looking to expand its Asia-Pacific presence and just opened an office in Beijing, China.
With the additional capital, Alluxio will “enlarge our bandwidth in research and development to expand product capabilities, as well as increase our go-to-market capacity in different regions,” said founder and CEO Haoyuan Li in an interview with CRN US.
Alluxio also announced the availability of version 2.7 of its Data Orchestration Platform with improved I/O performance for machine learning and support for open table formats such as Apache Hudi and Iceberg.
Alluxio’s software, a virtual distributed file system that separates compute from storage, provides a way to unify access to data scattered across widely distributed hybrid-cloud and multi-cloud environments, making all data appear local no matter where it’s stored. That provides a way to link business analytics and data-driven applications to distributed data sources and makes management of distributed data more efficient.
Alluxio said the $50 million funding round brings its total funding to more than $70 million. The round was led by “a leading global investment firm” that Alluxio isn’t identifying, with participation from existing investors including Seven Seas Partners, Volcanic Ventures and a16z.
Alluxio is finding success in industries with distributed analytical and AI workloads that involve huge volumes of distributed data, particularly in financial services, telecommunications and media, pharmaceuticals and genomics.
“The more digitised an industry is, the higher rate of adoption we are seeing,” Li said. “The deeper they are into their digitisation journey, the more value they will find in our software.”
Alluxio said the new 2.7 release, available in both community and enterprise editions, provides a five-fold improvement in I/O efficiency for machine learning training “at significantly lower cost” by parallelising the data loading, data pre-processing and training pipelines.
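To illustrate the general idea of overlapping those three stages, here is a minimal Python sketch. It is not Alluxio's implementation: the function names (`stage`, `run_pipeline`), the `depth` parameter and the use of threads with bounded queues are all assumptions made for illustration. Each stage runs concurrently, so a batch can be loaded while the previous one is being pre-processed and the one before that is being trained on.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the stream


def stage(fn, inbox, outbox):
    """Apply fn to each item taken from inbox and forward the result to outbox."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # propagate shutdown to the next stage
            return
        outbox.put(fn(item))


def run_pipeline(batch_ids, load, preprocess, train_step, depth=4):
    """Run load -> preprocess -> train_step with the stages overlapping.

    `depth` (hypothetical parameter) bounds how many batches may wait
    between stages, so a fast loader cannot exhaust memory.
    """
    ids = queue.Queue()
    raw = queue.Queue(maxsize=depth)
    ready = queue.Queue(maxsize=depth)

    for batch_id in batch_ids:
        ids.put(batch_id)
    ids.put(SENTINEL)

    workers = [
        threading.Thread(target=stage, args=(load, ids, raw)),
        threading.Thread(target=stage, args=(preprocess, raw, ready)),
    ]
    for w in workers:
        w.start()

    # Training consumes batches in the calling thread as they become ready.
    results = []
    while True:
        batch = ready.get()
        if batch is SENTINEL:
            break
        results.append(train_step(batch))

    for w in workers:
        w.join()
    return results
```

With one worker per stage and FIFO queues, batches reach the training step in their original order. In a real training job the load step would be I/O-bound, which is where overlapping it with compute pays off.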
The new edition offers enhanced performance insights and support for open table formats like Apache Hudi and Iceberg, allowing the system to scale up access to data lakes for faster Presto- and Spark-based analytics.
The software also now supports a native Container Storage Interface (CSI) driver for Kubernetes and a Kubernetes operator for machine learning. And a new capability called Shadow Cache makes it easier to balance high performance and cost by dynamically measuring the impact of cache size on response times.
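The article does not describe how Shadow Cache works internally, but the underlying idea of measuring the effect of cache size without actually resizing the cache can be sketched as follows. This example is an assumption-laden illustration, not Alluxio's code: the class `ShadowLRU` and the function `estimate_hit_ratios` are hypothetical names, an LRU eviction policy is assumed, and hit ratio is used as a stand-in for response time. Each shadow tracks only keys, never data, so several candidate sizes can be evaluated cheaply against the same request stream.

```python
from collections import OrderedDict


class ShadowLRU:
    """Tracks keys only (no data) to estimate the hit ratio of an
    LRU cache holding at most `capacity` entries."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.keys = OrderedDict()
        self.hits = 0
        self.requests = 0

    def access(self, key):
        self.requests += 1
        if key in self.keys:
            self.hits += 1
            self.keys.move_to_end(key)  # mark as most recently used
        else:
            self.keys[key] = None
            if len(self.keys) > self.capacity:
                self.keys.popitem(last=False)  # evict least recently used

    def hit_ratio(self):
        return self.hits / self.requests if self.requests else 0.0


def estimate_hit_ratios(trace, capacities):
    """Replay one access trace against a shadow cache per candidate size."""
    shadows = {c: ShadowLRU(c) for c in capacities}
    for key in trace:
        for shadow in shadows.values():
            shadow.access(key)
    return {c: s.hit_ratio() for c, s in shadows.items()}
```

For a trace that cycles through three files twice, a two-entry cache never hits while a three-entry cache hits on the whole second pass, which is the kind of size-versus-benefit signal an operator would weigh against cost.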
“Alluxio 2.7 further strengthens Alluxio’s position as a key component for AI, machine learning and deep learning in the cloud,” Li said. “With the age of growing datasets and increased computing power from CPUs and GPUs, machine learning and deep learning have become popular techniques for AI. This rise of these techniques advances the state-of-the-art for AI, but also exposes some challenges for the access to data and storage systems.”