
Work stream 1: Geo-distributed Cloud for AI agents

  • Leaders: @Yona Cao @C.C. Fan 

  • Objective: This work stream focuses on optimizing the performance and scalability of cloud infrastructure across multiple geographical locations. It aims to address challenges related to latency, data sovereignty, and efficiency in distributed computing environments.

  • Approach: The team will develop strategies for seamless data synchronization and application performance across dispersed networks, ensuring robust, secure, and compliant operations globally.

Introduction to vivgrid (private beta)

vivgrid.com is a geo-distributed public edge cloud that elevates AI applications with global AI inference infrastructure. Vivgrid helps developers deploy their AI apps closer to users for lightning-fast performance. With vivgrid, developers gain higher efficiency at lower cost through its global infrastructure, giving users the seamless experience they deserve.

Features

  • Globally located AI inference infrastructure

    • Developers can deploy their AI applications with a single click, and Vivgrid will deploy the application across multiple nodes worldwide, automatically selecting the compute node closest to the user based on their location.

  • AI API bridge

    • For commonly used AI API services such as the OpenAI API, Vivgrid will provide locally accessible API entry points, ensuring the fastest network route to the API for users and thereby improving the response speed of end-user requests (a minimal sketch follows this list).

  • Function calling platform

    • With the general function calling service, developers only need to configure function calling once on the Vivgrid platform to correctly invoke the function calling capabilities of different large language models.

  • Easy prompt management

    • Enable non-developers to easily update system prompts directly from the dashboard, applying changes instantly without coding. Say goodbye to constant developer-side code changes and streamline your workflow with our intuitive prompt management feature.
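
The sketch below shows what the AI API bridge item above could look like from a developer's point of view: an OpenAI-compatible client is simply pointed at a locally accessible entry point. The base_url and API key shown are placeholder assumptions for illustration, not confirmed vivgrid values.

```python
# Minimal sketch, assuming an OpenAI-compatible entry point exposed by the
# API bridge. "https://api.vivgrid.com/v1" is a hypothetical placeholder URL.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.vivgrid.com/v1",  # hypothetical regional entry point
    api_key="YOUR_API_KEY",
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello from the nearest edge node!"}],
)
print(resp.choices[0].message.content)
```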

Architecture

  • AI inference edge node: AI infrastructure is distributed across five major regions globally

To ensure low-latency AI inference for users in different countries, vivgrid will offer AI infrastructure across 25 cities in these five major regions, as shown in this picture. Developers only need to deploy their AI applications once, and users around the world can access the AI node nearest to them. This edge cloud architecture ensures that the TTFT (Time to First Token) for AI requests globally averages below 200 ms. Vivgrid will also provide optimized network routes and data transfer methods between these nodes.
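
To make the TTFT target concrete, the sketch below measures time to first token with a streaming request against an OpenAI-compatible endpoint. The base_url is an assumption for illustration only; the averages cited above come from the architecture, not from this snippet.

```python
# Rough TTFT measurement sketch: time from sending the request to receiving
# the first streamed content chunk. The base_url is a hypothetical placeholder.
import time
from openai import OpenAI

client = OpenAI(base_url="https://api.vivgrid.com/v1", api_key="YOUR_API_KEY")

start = time.perf_counter()
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "ping"}],
    stream=True,
)
for chunk in stream:
    # The first chunk carrying content marks the time to first token.
    if chunk.choices and chunk.choices[0].delta.content:
        print(f"TTFT: {(time.perf_counter() - start) * 1000:.0f} ms")
        break
```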

  • LLM API bridge: API accelerator

For most AI agent development, testing LLM API services is an essential step. Some AI applications are built directly on large-model APIs, while others repeatedly compare their own models against services like OpenAI’s API. A key challenge during development is therefore how to test the performance of different API services from a single codebase. Vivgrid’s API service allows developers to write code once and seamlessly call different APIs, with Vivgrid handling the compatibility issues between them.

After the application is released, another critical concern is how to make the API respond faster, reducing users' wait times. Since most API services are still deployed in a single data center, users outside the United States often encounter network issues when accessing these APIs. Vivgrid offers API acceleration services to ensure that users across the world can quickly access LLM APIs hosted in U.S. data centers.
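
The pattern below illustrates the "write code once, call different APIs" idea, under the assumption that each backend exposes an OpenAI-compatible interface. The backend names, endpoint URLs, and model names are illustrative; in practice the compatibility handling would live in Vivgrid's bridge rather than in application code.

```python
# Sketch of testing different API services from a single codebase by swapping
# only configuration. All endpoint URLs and model names here are assumptions.
from openai import OpenAI

BACKENDS = {
    "openai-direct": {"base_url": "https://api.openai.com/v1", "model": "gpt-4o-mini"},
    "vivgrid-accelerated": {"base_url": "https://api.vivgrid.com/v1", "model": "gpt-4o-mini"},
}

def ask(backend: str, prompt: str) -> str:
    cfg = BACKENDS[backend]
    client = OpenAI(base_url=cfg["base_url"], api_key="YOUR_API_KEY")
    resp = client.chat.completions.create(
        model=cfg["model"],
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

# The same request code runs against each backend for a side-by-side comparison.
for name in BACKENDS:
    print(name, "->", ask(name, "Summarize edge computing in one sentence."))
```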



  • Function calling platform: Serverless cloud seamlessly integrated with function calling

For a production-ready application, there may be many questions an LLM cannot answer or problems it cannot solve on its own, which requires third-party APIs or the developer’s own data. Function calling greatly extends the capability boundaries of large models. However, using function calling typically requires a complex, per-model setup to successfully complete a call. Vivgrid offers a simplified function calling platform: developers follow the documentation, complete a simple configuration, and can then develop and test function calling directly on the platform. For example, if real-time weather information is needed, a serverless function that calls a third-party weather API can be deployed on Vivgrid. With serverless function calling, AI agents can access external APIs such as a weather API, read important private data, and integrate with existing systems.
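
For reference, this is the kind of per-model setup the platform intends to simplify: the standard OpenAI-style function calling flow written out by hand. The get_weather tool, its schema, and its stubbed implementation are hypothetical examples, not part of vivgrid's API.

```python
# The manual function calling flow that a one-time platform configuration aims
# to replace. Tool name, schema, and weather data here are hypothetical.
import json
from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def get_weather(city: str) -> str:
    # A real deployment would call a third-party weather API; stubbed here.
    return json.dumps({"city": city, "temp_c": 21, "condition": "sunny"})

messages = [{"role": "user", "content": "What's the weather in Tokyo?"}]
resp = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=tools)
msg = resp.choices[0].message

if msg.tool_calls:
    call = msg.tool_calls[0]
    args = json.loads(call.function.arguments)
    messages.append(msg)
    messages.append({"role": "tool", "tool_call_id": call.id, "content": get_weather(**args)})
    final = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=tools)
    print(final.choices[0].message.content)
```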



Timeline

Phase 1: Provide an OpenAI API acceleration gateway, allowing developers to easily start using the globally accelerated OpenAI API on Vivgrid with simple configuration. Additionally, offer easy-to-use best practices for streaming requests and responses with the OpenAI API.
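
One possible streaming best practice is sketched below: forward each response chunk to the user as it arrives rather than waiting for the full completion. The gateway base_url is a placeholder assumption.

```python
# Streaming sketch: print tokens as they arrive to minimize perceived latency.
# The base_url is a hypothetical placeholder for the acceleration gateway.
from openai import OpenAI

client = OpenAI(base_url="https://api.vivgrid.com/v1", api_key="YOUR_API_KEY")

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a haiku about edge computing."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content if chunk.choices else None
    if delta:
        print(delta, end="", flush=True)
print()
```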

Phase 2: Provide a serverless cloud seamlessly integrated with function calling. Developers can deploy various serverless functions on the Vivgrid platform to enhance the capabilities of AI agents. For example, if real-time weather information is needed, a serverless function that calls a third-party weather API can be deployed on Vivgrid. With serverless function calling, AI agents can access external APIs such as a weather API, read important private data, and integrate with existing systems.
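
As a sketch of what such a serverless function might contain, the handler below calls a third-party weather API and returns JSON that the agent can pass back to the model. The handler signature and weather endpoint are assumptions for illustration, not vivgrid's actual serverless interface.

```python
# Hypothetical serverless function body: look up current weather and return a
# compact JSON string. Endpoint and handler signature are assumptions.
import json
import urllib.parse
import urllib.request

WEATHER_API = "https://api.example-weather.com/v1/current"  # placeholder endpoint

def handler(arguments: dict) -> str:
    """Receives the function-call arguments produced by the LLM."""
    city = arguments.get("city", "San Francisco")
    url = f"{WEATHER_API}?city={urllib.parse.quote(city)}"
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    return json.dumps({"city": city, "weather": data})
```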