ScaleGenAI reduces the compute cost of training and inference for generative AI workloads by 3x-6x via its universal compute layer technology.
Current Status
We launched the product in March 2024. We have 32 production customers with a total ARR of USD 1.4mn. We expect to reach 200 customers by the end of 2024 with an ARR of USD 8mn, and 1,500 customers by the end of 2025 with an ARR of USD 60mn.
Problem or Opportunity
Demand for AI compute is expected to grow to 1,000x its current level. However, this capacity is available to only a select few companies. ScaleGenAI's universal AI compute layer makes it available to every AI developer while reducing the cost of AI compute by close to 90%.
Solution (product or service)
Islands of AI compute are spread across different clouds and different GPU and accelerator SKUs, and AI developers are currently locked into choosing one such cloud and architecture. ScaleGenAI aggregates accelerator capacity from over 200 datacenters and more than 50 GPU/accelerator types, and abstracts away the complexity behind an OpenAI- and HuggingFace-compatible API. Through a simple CLI, an AI developer can train and deploy AI workloads across an inventory of more than 1.3mn accelerators at a 90% lower cost than being locked into a single cloud and a single accelerator architecture.
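To illustrate what OpenAI compatibility means in practice: existing client code can typically be repointed at a compatible endpoint by changing only the base URL and API key. The sketch below uses the standard openai Python SDK; the endpoint URL and model name are hypothetical placeholders, not documented ScaleGenAI values.

    # Minimal sketch of calling an OpenAI-compatible endpoint with the
    # standard openai Python SDK. The base_url and model name are
    # hypothetical placeholders, not documented ScaleGenAI values.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.scalegen.example/v1",  # hypothetical endpoint
        api_key="YOUR_SCALEGEN_API_KEY",             # placeholder credential
    )

    response = client.chat.completions.create(
        model="llama-3-8b-instruct",  # illustrative model name
        messages=[{"role": "user", "content": "Hello from a portable client."}],
    )
    print(response.choices[0].message.content)

Because the client is unchanged apart from its endpoint configuration, the same code could in principle run against any cloud or accelerator behind the compatibility layer.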
Business model
ScaleGenAI charges for its software on a pay-as-you-go basis, as a percentage of the customer's total AI compute cost. In addition, we buy computing capacity in bulk from a variety of service providers and resell it to our customers.
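As a purely hypothetical illustration (the actual take rate is not stated here): on a workload consuming USD 100,000 of compute, a 10% software fee would yield USD 10,000 in revenue, in addition to any margin earned on resold capacity.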