InfiniEdge AI Overview

Problem Statement

@Sujata Tibrewala 

Edge computing can level the playing field for small technology developers, enabling them to build products for markets that large traditional providers avoid because of limited volumes. We would therefore like to see more local players enter this market and serve the needs of local communities, opening up opportunities for them. Some government initiatives and public-private partnerships already exist in this area; see this FCC report and the EU's XGain project. XGain fosters sustainable, balanced, and inclusive development of rural, coastal, and urban areas by giving relevant stakeholders (such as municipalities, policymakers, farmers, foresters, and their associations) access to a comprehensive inventory of smart XG, last-mile connectivity, and edge computing solutions, and of related assessment methods.

Also, as ever larger volumes of data are generated, it makes sense to process that data at or close to where it is created, which reduces the cost and energy consumption of moving it to central clouds. Processing data on site or at the edge also helps address data privacy and security concerns.
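As a rough illustration of why processing near the source reduces data movement, the sketch below summarizes a window of raw sensor readings on the edge node so that only the summary needs to be forwarded. The function name `summarize_window` and the sample values are hypothetical and are not part of InfiniEdge AI.

```python
from statistics import mean


def summarize_window(readings):
    """Reduce a window of raw readings to a small summary record."""
    if not readings:
        raise ValueError("readings must not be empty")
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": mean(readings),
    }


# Hypothetical temperature samples collected on an edge node.
raw = [21.0, 21.5, 22.0, 21.5]

# Only this four-field summary would be sent upstream, not every sample.
summary = summarize_window(raw)
print(summary)
```

The saving grows with the window size: the summary stays at four fields no matter how many raw samples it covers.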

Both points above imply that edge computing needs to be ubiquitous, which in turn means it needs to be lightweight, efficient, and green. This is unlike a traditional centralized data center, which usually has massive power and cooling needs that only grow with the increased computing demands of AI training and inference. Both training and inference therefore need to move to the edge to protect data, maintain its privacy, and keep compute requirements manageable and sustainable.

Use Cases

@Haruhisa Fukano  @Jeff Brower @Moshe Shadmon 

1. Manufacturing Company

@Caleb @Victor Lu

The system architecture for a manufacturing company:

The Butler project addresses the digitization challenges faced by small and medium-sized enterprises (SMEs) lacking resources for transformation. The goal is to create an accessible app that equips SMEs with tools to enhance operations and competitiveness in today's digital era. 

Project Objectives 
The Butler aims to provide SMEs with a comprehensive solution to: 

1) Visualize Production Data: The Butler will offer user-friendly data visualization to help SMEs monitor and improve production processes efficiently. 

2) Simplify Knowledge Access: The app will streamline access to operation manuals and guidelines, facilitating swift troubleshooting and minimizing downtime. 

3) Navigate Standards and Regulations: The Butler will assist SMEs in understanding and adhering to industry standards and regulations, ensuring product quality and compliance. 

4) Market Insights: The app will offer market research insights, enabling SMEs to make informed decisions and develop effective marketing strategies.

5) Employee Training and Testing: The Butler will provide training modules and testing features to enhance employee skills and knowledge. 

//Pick one to work on...

Benefits for SMEs: The project offers SMEs numerous advantages: 

1) Affordability: The Butler provides cost-effective digital solutions tailored to SMEs' budget constraints. 

2) Ease of Use: The user-friendly interface ensures simple adoption and utilization of the app's features. 

3) Efficiency: By centralizing resources, The Butler reduces time spent on information retrieval and training, boosting productivity. 

4) Competitiveness: Access to insights, standards, and market data empowers SMEs to stay competitive and informed. 

The Butler is a crucial step toward SME digital transformation. By providing tailored tools, the project empowers SMEs to embrace digitalization, improve efficiency, and succeed in a digital business landscape. This initiative envisions a future where SMEs can leverage digital solutions to foster growth and competitiveness.

Robotics Use Cases

Akraino SSES Robotics Blueprint

For more information, contact @Haruhisa Fukano or @Jeff Brower.

Factory Floor, First Responders

SLM (small language model) for edge speech recognition, AI+Data Meeting @ ByteDance, Sep 2024

2. Low-latency AI inference on Edge Cloud

@Yona Cao 

Empower edge AI – Challenges at the Edge

  • Computational Power: Edge devices often have limited CPU and GPU capacity. This limits the complexity of the algorithms that can be run efficiently on these devices.

  • Memory and Storage: Edge devices typically have less memory and storage capacity, which restricts the size of the models that can be deployed and the amount of data that can be processed locally.

  • Network Connectivity: While not always a limitation, inconsistent or low-bandwidth connectivity can hinder the ability of edge devices to communicate with centralized clouds or other edge devices, affecting capabilities like model updates, data syncing, and real-time analytics.

  • Security and Privacy: Implementing robust security measures is more challenging at the edge due to device limitations and the distributed nature of deployment, increasing vulnerability to attacks.

  • Latency: For some applications, even the small delays involved in processing data locally (as opposed to the potentially larger delays from cloud processing) can be a significant limitation, particularly in real-time applications.
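To make the memory constraint listed above concrete, the sketch below estimates the size of a model's weights and checks it against a device's memory. It counts weights only (parameters multiplied by bytes per parameter) and ignores activations, caches, and runtime overhead. The `headroom` factor, the parameter count, and the device size are illustrative assumptions.

```python
GIB = 1024 ** 3


def weights_size_bytes(num_params, bytes_per_param):
    """Memory needed for the weights alone."""
    return num_params * bytes_per_param


def fits_in_memory(num_params, bytes_per_param, device_memory_bytes, headroom=0.8):
    """True if the weights fit within an assumed usable fraction of device memory."""
    needed = weights_size_bytes(num_params, bytes_per_param)
    return needed <= device_memory_bytes * headroom


params = 1_000_000_000  # hypothetical 1B-parameter model
device = 4 * GIB        # hypothetical edge device with 4 GiB of memory

fits_fp32 = fits_in_memory(params, 4, device)  # 32-bit floats, 4 bytes each
fits_int8 = fits_in_memory(params, 1, device)  # 8-bit quantized, 1 byte each
print(fits_fp32, fits_int8)
```

Under these assumptions the 32-bit model does not fit while the 8-bit version does, which is one reason quantized or smaller models are often considered for edge deployment.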

Edge Apps

@Yona Cao 

1. Realtime-Translate App on Geo-distributed Edge Inference Cloud

2. Stylenow.ai with geo-distributed API gateway

 

 

Architectures

 

AI Applications On The Edge

The provided architecture diagram for the InfiniEdge AI project showcases a comprehensive framework designed to deploy and manage AI applications at the edge, ensuring flexibility, scalability, and efficient resource utilization. The architecture is structured into three main layers: the AI Elastic Framework, Shifu, and the Terminal/EdgeNode layer.

At the core of this architecture is Shifu, chosen as a short-term solution due to the developers’ familiarity with it. Shifu, positioned on the IaaS layer, is a Kubernetes-native, production-grade, protocol & vendor-agnostic IoT gateway. Shifu serves as a middleware layer, bridging applications and IoT devices by abstracting device data and functionalities into REST APIs. This configuration supports flexible deployment options, allowing Shifu to be deployed at the edge for device twins or in the cloud based on user needs. Typically, the control plane for Shifu is deployed in the cloud. It's important to note that Shifu depends on Kubernetes for its operations.

YoMo is an AI elastic framework for AI agents. Positioned on the PaaS layer, it helps AI agents build geo-distributed edge node infrastructure and an AI API gateway, bringing low-latency performance to every AI agent. YoMo is an open-source Large Language Model (LLM) Function Calling Framework designed specifically for building geo-distributed AI applications. Unlike Shifu, YoMo does not rely on Kubernetes, which provides more flexibility in certain deployment scenarios. AI agent developers can choose YoMo (near edge) or Shifu (far edge) based on the factors listed below.

  • Scalability (number of users, number of devices, etc.)

  • Type of application (gaming, video streaming, web content, social media, etc.)

  • Latency requirement

  • Throughput 

  • Entity that manages the application (enterprise, telco, cloud service provider, etc.)

  • Security constraints

Looking ahead, depending on demand, alternative solutions such as EdgeX and Fledge may be considered to either replace or supplement Shifu. The architecture also incorporates a scalable AI Elastic Framework, which includes an AI API gateway, various Edge AI applications, and components for smart elastic computing and intelligent time-sharing scheduling.

Generative AI models and large language models (LLMs) run inference on edge nodes managed by the Kubernetes framework. These edge nodes encompass various terminal devices such as cameras, smart terminals, and sensors, which interact with the AI framework. Data from devices connected to Shifu is stored in a user-selected data store, which can range from time-series databases and SQL to object storage or message queues (MQ).

Two common use cases illustrate how this architecture can be applied in real-world scenarios.

The first case involves the management of edge IoT devices, specifically dynamic message signs (DMS) on highways. These signs are essential tools for the police department to convey important messages to drivers, such as traffic updates, warnings, and other critical information. Each DMS operates as an IoT device with a unique IP address, allowing for individual control and management. To efficiently manage these devices, our architecture incorporates Shifu, a platform that enables dedicated Pods for each DMS. This setup ensures that each sign can be independently controlled and updated in real-time, providing a robust and flexible solution for dynamic message dissemination.
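The one-controller-per-sign idea can be sketched in plain Python as a registry keyed by IP address. This is only a conceptual stand-in: it does not use Shifu or Kubernetes, and `SignController`, `SignRegistry`, and the IP addresses are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class SignController:
    """Stands in for the dedicated per-sign Pod described above (hypothetical)."""
    ip_address: str
    message: str = ""

    def display(self, text):
        self.message = text


class SignRegistry:
    """Keeps one independent controller per dynamic message sign."""

    def __init__(self):
        self._signs = {}

    def register(self, ip_address):
        if ip_address in self._signs:
            raise ValueError(f"sign {ip_address} is already registered")
        self._signs[ip_address] = SignController(ip_address)
        return self._signs[ip_address]

    def update(self, ip_address, text):
        # Only the addressed sign changes; all other signs are untouched.
        self._signs[ip_address].display(text)

    def message_for(self, ip_address):
        return self._signs[ip_address].message


registry = SignRegistry()
registry.register("10.0.0.11")
registry.register("10.0.0.12")
registry.update("10.0.0.11", "ACCIDENT AHEAD")
print(registry.message_for("10.0.0.11"))
```

Keeping each sign behind its own controller means a failed or slow update to one sign does not block updates to the others.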

The second case focuses on handheld devices used by police officers. These devices are crucial for field officers who need to subscribe to different channels of information, such as emergency alerts, situational updates, and other relevant data streams. In this scenario, we integrate YoMo into our architecture. By utilizing YoMo, police officers can subscribe to a variety of information channels and receive real-time updates directly on their handheld devices. This capability enhances the officers' situational awareness and response capabilities, as they can access and process information pertinent to their duties without delay.
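The channel-subscription pattern described above can be sketched as a small in-process publish/subscribe broker. This does not use the YoMo API; `ChannelBroker`, the channel names, and the messages are hypothetical, and a real deployment would deliver messages over the network rather than through local callbacks.

```python
from collections import defaultdict


class ChannelBroker:
    """Minimal in-process publish/subscribe broker (illustrative only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, channel, callback):
        self._subscribers[channel].append(callback)

    def publish(self, channel, message):
        # .get avoids creating an entry for a channel nobody subscribed to.
        callbacks = self._subscribers.get(channel, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)  # number of subscribers reached


broker = ChannelBroker()
inbox_a = []  # handheld device of officer A
inbox_b = []  # handheld device of officer B

broker.subscribe("emergency-alerts", inbox_a.append)
broker.subscribe("emergency-alerts", inbox_b.append)
broker.subscribe("situational-updates", inbox_b.append)

delivered = broker.publish("emergency-alerts", "Road closed at exit 12")
broker.publish("situational-updates", "Unit 7 on scene")
print(delivered, inbox_a, inbox_b)
```

Each device receives only the channels it subscribed to, so officer A sees the alert while officer B sees both the alert and the situational update.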

In summary, these two cases demonstrate the versatility and effectiveness of edge computing in managing IoT devices and providing real-time information to field personnel. By employing Shifu for dynamic message signs and YoMo for handheld devices, we create a comprehensive and responsive system that meets the specific needs of these real-world applications. This approach not only improves the efficiency and reliability of information dissemination but also enhances the overall operational capabilities of the police department.

 

For manufacturing, retail, and IoT use cases:

Q1: What data is being generated?

@Caleb  @Victor Lu 

The data being generated in the context of the Butler project includes:

1. Production Data: Data related to the monitoring and improvement of production processes in SMEs. This may include metrics such as production output, efficiency, machine performance, and downtime.

2. Knowledge Access Data: Information related to operation manuals, guidelines, and troubleshooting processes. This could involve data on frequently accessed documents, common troubleshooting steps, and usage patterns.

3. Standards and Regulations Data: Information on industry standards and regulations that SMEs need to adhere to. This might include compliance checklists, audit results, and regulatory updates.

4. Market Insights Data: Market research data that helps SMEs make informed decisions. This can encompass market trends, consumer behavior, competitor analysis, and sales data.

5. Employee Training and Testing Data: Data generated from training modules and testing features aimed at enhancing employee skills and knowledge. This includes training progress, test scores, and skill assessments.

Additionally, the project also mentions the proliferation of voluminous data generated at the edge. This includes:

- Operational Data: Data generated from the daily operations of SMEs, such as inventory levels, sales transactions, and customer interactions.
- Edge Computing Data: Data processed at or close to where it is created to reduce costs, energy consumption, and address data privacy and security concerns. This encompasses all types of data mentioned above, processed locally rather than being sent to centralized data centers.

 

Q2: For model inference, what data is needed?

@Jeff Brower @Haruhisa Fukano @Feimin Yuan 

For manufacturing, retail, and IoT use cases, a variety of data may be needed, and required models may be multimodal. Some examples:

  1. Video behavior recognition. Use case examples include factory floor, warehouse, and retail surveillance

  2. Speech recognition. Use case examples include factory hands-free assembly line, factory floor and warehouse equipment (e.g. forklifts, cranes), first responders (e.g. automated vehicles)

  3. IoT data gathered and communicated between edge nodes. Use case examples include environmental conditions, fire risk, structural integrity, water quality, waste water virus detection, and many others

  4. Spectral measurements. Use case examples include RAN (smart radios)

Q3: Can the model be generated by FedML?

@Tina Tsou 

InfiniEdge AI is designed to support federated learning, which means that models can be generated using various frameworks that support federated learning, including FedML.

FedML is an open-source library that facilitates the implementation of federated learning, making it possible to train AI models on edge devices while keeping data decentralized and secure. Given InfiniEdge AI's focus on distributed edge clouds, utilizing FedML would be compatible and beneficial.
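To show what keeping data decentralized means in practice, here is a minimal sketch of the aggregation step commonly used in federated learning: each edge device trains locally and shares only its model weights, which a coordinator combines using a sample-weighted average. This is generic Python and does not use the FedML API; `federated_average` and the example numbers are hypothetical.

```python
def federated_average(client_updates):
    """Combine client weights into global weights.

    client_updates is a list of (weights, num_samples) pairs, where weights
    is a list of floats. Clients with more samples count proportionally more.
    """
    total_samples = sum(num_samples for _, num_samples in client_updates)
    if total_samples == 0:
        raise ValueError("at least one client must report samples")

    dim = len(client_updates[0][0])
    averaged = [0.0] * dim
    for weights, num_samples in client_updates:
        if len(weights) != dim:
            raise ValueError("all clients must share the same model shape")
        for i, w in enumerate(weights):
            averaged[i] += w * num_samples / total_samples
    return averaged


# Two hypothetical edge devices; raw training data never leaves either one.
updates = [
    ([1.0, 2.0], 100),  # device A trained on 100 local samples
    ([3.0, 4.0], 300),  # device B trained on 300 local samples
]
global_weights = federated_average(updates)
print(global_weights)
```

Only the weight vectors and sample counts cross the network, which is what makes the approach attractive when the underlying data is sensitive.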

 

AI Agent Platform

 

The AI Agent Platform is an open-source initiative designed to provide a comprehensive, modular, and scalable framework for developing AI-driven agents across multiple industries. The platform aims to accelerate innovation by offering ready-to-use AI solutions that enhance efficiency, personalization, and user experience across different domains such as supply chain management, sales, customer service, travel, real estate, healthcare, education, finance, retail, automotive, entertainment, and more.

 

Key Components

  1. Agent SDK: A comprehensive SDK that allows developers to easily create and customize AI agents. It includes libraries for natural language understanding, emotion detection, behavior modeling, and more, enabling the rapid development of intelligent solutions across various industries.

  2. AI Agent Marketplace: A marketplace for sharing and discovering AI agents, extensions, and tools. The marketplace encourages collaboration and the exchange of ideas within the developer community.

  3. Microservices Architecture: The platform is built on a microservices architecture, ensuring flexibility, scalability, and modular development. It leverages Kubernetes for deployment, an API Gateway for integration, and supports a variety of machine learning models tailored to specific agent behaviors.

  4. Open Source Governance: The AI Agent Platform operates under an open-source governance model, encouraging contributions from a global developer community. It features a transparent roadmap and collaborative development, promoting wide adoption and adaptation across industries.

 

 

Following the key components of the AI Agent Platform, the diagram further elaborates on the specific modules that power the platform’s architecture, providing insight into how various elements contribute to the overall functionality and flexibility.

 

The Agent Build Platform serves as a low-code solution, designed to streamline the creation and management of AI agents with minimal coding effort. This low-code framework allows developers to easily orchestrate and publish agents, using predefined templates, workflows, and plugins to accelerate development. Additionally, the platform is highly extensible, supporting integrations and custom implementations from third-party providers. This ensures a versatile and collaborative environment where different industries and developers can contribute and customize agents to suit specific business needs.

 

A robust Data & Knowledge management framework underpins the agents’ ability to operate intelligently. With capabilities such as Knowledge Base Management, Knowledge Import, and Pre-processing, the platform allows for the ingestion, processing, and retrieval of information at scale. Advanced features like Retrieval Augmentation ensure that agents can efficiently access relevant information, even in complex scenarios, making the platform suitable for industries that rely heavily on real-time decision-making and knowledge retrieval.
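The retrieval augmentation idea can be sketched as two steps: find the knowledge base entries most relevant to a question, then place them in the prompt given to the model. The sketch below scores relevance by simple keyword overlap as a stand-in for whatever retrieval method the platform actually uses. The function names and the sample knowledge base are hypothetical.

```python
def tokenize(text):
    """Lowercase a text and split it into a set of words."""
    return set(text.lower().split())


def retrieve(query, documents, top_k=1):
    """Return up to top_k documents sharing the most words with the query."""
    query_terms = tokenize(query)
    scored = []
    for doc in documents:
        overlap = len(query_terms & tokenize(doc))
        if overlap:
            scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(query, documents):
    """Prepend the retrieved context to the question."""
    context = retrieve(query, documents)
    return "Context:\n" + "\n".join(context) + "\n\nQuestion: " + query


# Hypothetical entries imported from operation manuals.
knowledge_base = [
    "Reset the conveyor motor by holding the green button for five seconds",
    "Replace the coolant filter every 500 operating hours",
]
question = "how do I reset the conveyor motor"
prompt = build_prompt(question, knowledge_base)
print(prompt)
```

Keyword overlap is used here only to keep the example dependency-free; the structure (retrieve, then augment the prompt) is the part that carries over.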

 

The Agent Frameworks section highlights examples of various frameworks the platform aims to support, such as AGILE, AutoGen, OpenAI Swarm, and AgentZero. These frameworks provide diverse capabilities, ranging from agile development environments to collaborative multi-agent systems, allowing users to choose the best-suited framework for their specific needs.

 

The SPEAR infrastructure forms the backbone of the platform, providing essential services like cluster management, traffic routing, and agent runtime environments. SPEAR ensures that agents can scale effectively across distributed systems while maintaining security and performance. Components like Sandbox & Security and Image Management ensure that agents operate safely in their environments, while the API Gateway facilitates easy integration with external applications.

 

On the backend, the platform leverages a mix of MAAS (Model as a Service), Open Source Models, and Proprietary Models to provide flexible and high-performance AI solutions. This layer supports the use of both pre-built and custom models, ensuring that organizations can leverage the best available technologies for their specific needs. Network Acceleration and optimization further enhance the platform’s performance, making it suitable for industries requiring low-latency and high-throughput operations.

 

Finally, OPEA (Open Platform for Enterprise AI), an Intel-driven initiative, plays a pivotal role in ensuring that the AI Agent Platform remains an open and adaptable solution for enterprises. By integrating with the OPEA project, the platform promotes wide adoption and allows enterprises to customize the framework for their unique AI-driven use cases, aligning with the open-source governance model.

 

This modular and scalable design, combined with a commitment to open-source development, positions the AI Agent Platform as a powerful tool for developers and enterprises alike, offering cutting-edge AI capabilities across a diverse range of industries.

 

Gaming Industry

AI NPC Agent: Generates personalized non-player characters (NPCs) for intelligent interaction, enhancing game immersion. This agent is designed with a certain level of intelligence, enabling it to interact seamlessly with players. It can receive and act upon players’ commands, such as picking up items, completing small tasks, or assisting in specific in-game objectives. By dynamically responding to player inputs, the agent adds depth and realism to gameplay, creating a more engaging and immersive experience.

 

Game Forum Comment Agent: Facilitates intelligent interactions within game forums by generating context-aware comments and suggestions. This agent can analyze ongoing discussions, understand the context of posts, and contribute meaningful replies, such as tips, strategies, or clarifications about game mechanics. It helps foster engagement within the gaming community, ensuring discussions remain active and relevant while providing players with valuable insights and support.

AI Agent Marketplace

// TODO @Noe

Coze AI Agent Marketplace

Minimum Viable Products

Minimum Viable Product (MVP) For InfiniEdge AI

@Wilson Wang and @C.C. Fan will draft the MVP.

 

@Moshe Shadmon, @Victor Lu, and @Caleb will meet to walk through the thought process for the selected manufacturing use case and what data the model needs.

 

 
