InfiniEdge AI Overview
Problem Statement
@Sujata Tibrewala
Edge computing can create a level playing field for small tech developers, enabling them to build products for markets that traditional big providers avoid due to limited volumes. We would therefore like to see more and more local players enter this market and serve the needs of local communities, opening up opportunities for them. There are already government and public-private partnerships in this area; see this FCC report and the EU's XGain project, which fosters sustainable, balanced, and inclusive development of rural, coastal, and urban areas by giving relevant stakeholders (such as municipalities, policymakers, farmers, foresters, and their associations) access to a comprehensive inventory of smart XG, last-mile connectivity, and edge computing solutions, along with related assessment methods.
Also, with the proliferation of voluminous data being generated, it is logical to process it at or close to where it is created, reducing the cost and energy consumption of moving it to central clouds. Processing data on site, at the edge, also addresses data privacy and security concerns.
Both points above mean that edge computing needs to be ubiquitous, and therefore lightweight, efficient, and green. This is unlike a traditional centralized data center, which usually has massive power and cooling needs that only grow with the increased computing demands of AI training and inference. Both training and inference therefore need to move to the edge to protect data and maintain its privacy, and to keep compute requirements manageable and sustainable.
Use Cases
@Haruhisa Fukano @Jeff Brower @Moshe Shadmon
1. Manufacturing Company
@Caleb @Victor Lu
The system architecture for a manufacturing company:
The Butler project addresses the digitization challenges faced by small and medium-sized enterprises (SMEs) that lack the resources for digital transformation. The goal is to create an accessible app that equips SMEs with tools to enhance operations and competitiveness in today's digital era.
Project Objectives
The Butler aims to provide SMEs with a comprehensive solution to:
1) Visualize Production Data: The Butler will offer user-friendly data visualization to help SMEs monitor and improve production processes efficiently.
2) Simplify Knowledge Access: The app will streamline access to operation manuals and guidelines, facilitating swift troubleshooting and minimizing downtime.
3) Navigate Standards and Regulations: The Butler will assist SMEs in understanding and adhering to industry standards and regulations, ensuring product quality and compliance.
4) Market Insights: The app will offer market research insights, enabling SMEs to make informed decisions and develop effective marketing strategies.
5) Employee Training and Testing: The Butler will provide training modules and testing features to enhance employee skills and knowledge.
//Pick one to work on...
Benefits for SMEs: The project offers SMEs numerous advantages:
1) Affordability: The Butler provides cost-effective digital solutions tailored to SMEs' budget constraints.
2) Ease of Use: The user-friendly interface ensures simple adoption and utilization of the app's features.
3) Efficiency: By centralizing resources, The Butler reduces time spent on information retrieval and training, boosting productivity.
4) Competitiveness: Access to insights, standards, and market data empowers SMEs to stay competitive and informed.
The Butler is a crucial step toward SME digital transformation. By providing tailored tools, the project empowers SMEs to embrace digitalization, improve efficiency, and succeed in a digital business landscape. This initiative envisions a future where SMEs can leverage digital solutions to foster growth and competitiveness.
Robotics Use Cases
Akraino SSES Robotics Blueprint
for more info contact @Haruhisa Fukano or @Jeff Brower
Factory Floor, First Responders
SLM (small language model) for edge speech recognition, AI+Data Meeting @ ByteDance, Sep 2024
2. Low-latency AI inference on Edge Cloud
@Yona Cao
Empower edge AI – Challenges at the Edge
Computational Power: Edge devices often have limited CPU and GPU. This limits the complexity of the algorithms that can be run efficiently on these devices.
Memory and Storage: Edge devices typically have less memory and storage capacity, which restricts the size of the models that can be deployed and the amount of data that can be processed locally.
Network Connectivity: While not always a limitation, inconsistent or low-bandwidth connectivity can hinder the ability of edge devices to communicate with centralized clouds or other edge devices, affecting capabilities like model updates, data syncing, and real-time analytics.
Security and Privacy: Implementing robust security measures is more challenging at the edge due to device limitations and the distributed nature of deployment, increasing vulnerability to attacks.
Latency: For some applications, even the small delays involved in processing data locally (as opposed to the potentially larger delays of cloud processing) can be a significant limitation, particularly in real-time applications.
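To make the memory and storage constraint concrete, a model's in-memory footprint can be estimated as parameters × bytes per parameter. The sketch below uses an illustrative 1-billion-parameter model (not a figure from this project) to show why quantization is a common response on memory-constrained edge hardware.

```python
# Rough in-memory footprint of a model's weights: parameters * bits / 8.
# The 1B-parameter count and bit widths below are illustrative
# assumptions, not figures from this project.
def model_bytes(n_params: int, bits_per_param: int) -> int:
    """Approximate bytes needed to hold the weights alone."""
    return n_params * bits_per_param // 8

ONE_B = 1_000_000_000
fp16_gb = model_bytes(ONE_B, 16) / 1e9  # ~2.0 GB
int4_gb = model_bytes(ONE_B, 4) / 1e9   # ~0.5 GB
print(f"fp16: {fp16_gb:.1f} GB, int4: {int4_gb:.1f} GB")
```

A device with, say, 4 GB of RAM could hold the quantized weights with room left for the runtime, but not comfortably the fp16 ones alongside other workloads.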
Edge Apps
@Yona Cao
1. Realtime-Translate App on Geo-distributed Edge Inference Cloud
2. Stylenow.ai with geo-distributed API gateway
Architectures
AI Applications On The Edge
The provided architecture diagram for the InfiniEdge AI project showcases a comprehensive framework designed to deploy and manage AI applications at the edge, ensuring flexibility, scalability, and efficient resource utilization. The architecture is structured into three main layers: the AI Elastic Framework, Shifu, and the Terminal/EdgeNode layer.
At the core of this architecture is Shifu, chosen as a short-term solution due to the developers’ familiarity with it. Shifu, positioned on the IaaS layer, is a Kubernetes-native, production-grade, protocol & vendor-agnostic IoT gateway. Shifu serves as a middleware layer, bridging applications and IoT devices by abstracting device data and functionalities into REST APIs. This configuration supports flexible deployment options, allowing Shifu to be deployed at the edge for device twins or in the cloud based on user needs. Typically, the control plane for Shifu is deployed in the cloud. It's important to note that Shifu depends on Kubernetes for its operations.
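As a concrete illustration of that REST abstraction, a deviceShifu typically exposes each device instruction as an HTTP path on an in-cluster service. The service name and instruction name in the sketch below are assumptions for illustration, not part of this project's configuration.

```python
# Hypothetical sketch: reading a device value through a deviceShifu
# HTTP endpoint. The service name "deviceshifu-thermometer" and the
# instruction "read_value" are assumed for illustration.
import urllib.request

SHIFU_BASE = "http://deviceshifu-thermometer.deviceshifu.svc.cluster.local"

def instruction_url(base: str, instruction: str) -> str:
    """deviceShifu maps each device instruction to a REST path."""
    return f"{base}/{instruction}"

def read_value(base: str = SHIFU_BASE) -> str:
    # Resolves only from inside the cluster where the service exists.
    with urllib.request.urlopen(instruction_url(base, "read_value")) as resp:
        return resp.read().decode()

print(instruction_url(SHIFU_BASE, "read_value"))
```

The point of the abstraction is that the application sees only a plain HTTP call; the protocol-specific device driver lives behind the deviceShifu Pod.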
YoMo is an elastic AI framework for AI agents. Positioned on the PaaS layer, it helps build geo-distributed edge-node infrastructure and an AI API gateway, bringing low-latency performance to every AI agent. YoMo is an open-source Large Language Model (LLM) Function Calling framework designed specifically for building geo-distributed AI applications. Unlike Shifu, YoMo does not rely on Kubernetes, providing more flexibility in certain deployment scenarios. AI agent developers can choose YoMo (near edge) or Shifu (far edge) based on the factors shown below.
Scalability (number of users, number of devices, etc.)
Type of application (gaming, video streaming, web content, social media, etc.)
Latency requirements
Throughput
Entity that manages the application (enterprise, telco, cloud service provider, etc.)
Security constraints
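In a near-edge deployment, an AI agent typically reaches the geo-distributed gateway over plain HTTP. The sketch below assumes an OpenAI-compatible chat-completions endpoint on the gateway; the gateway URL and model name are placeholders, and this is an illustration of the calling pattern, not YoMo's documented API.

```python
# Hypothetical client for a geo-distributed AI API gateway, assuming an
# OpenAI-compatible /v1/chat/completions endpoint. The URL and model
# name are placeholders.
import json
import urllib.request

def build_chat_request(base_url: str, prompt: str, model: str):
    """Return (url, payload) for an OpenAI-style chat completion call."""
    url = f"{base_url}/v1/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return url, payload

def chat(base_url: str, prompt: str, model: str = "example-slm") -> dict:
    url, payload = build_chat_request(base_url, prompt, model)
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:  # nearest edge node answers
        return json.load(resp)

print(build_chat_request("http://gateway.example", "hello", "example-slm")[0])
```

Because the gateway sits in front of geo-distributed nodes, the client code is identical wherever the request lands; latency depends on which edge node serves it.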
Looking ahead, depending on demand, alternative solutions such as EdgeX and Fledge may be considered to either replace or supplement Shifu. The architecture also incorporates a scalable AI Elastic Framework, which includes an AI API gateway, various Edge AI applications, and components for smart elastic computing and intelligent time-sharing scheduling.
Generative AI models and large language models (LLMs) run inference on the edge nodes managed by the Kubernetes framework. These edge nodes encompass various terminal devices like cameras, smart terminals, and sensors, which interact with the AI framework. Data from devices connected to Shifu is stored in a user-selected data store, which can range from time-series databases and SQL to object storage or message queues (MQ).
Two common use cases can help illustrate its practical implementation through detailed architectures in real-world scenarios.
The first case involves the management of edge IoT devices, specifically dynamic message signs (DMS) on highways. These signs are essential tools for the police department to convey important messages to drivers, such as traffic updates, warnings, and other critical information. Each DMS operates as an IoT device with a unique IP address, allowing for individual control and management. To efficiently manage these devices, our architecture incorporates Shifu, a platform that enables dedicated Pods for each DMS. This setup ensures that each sign can be independently controlled and updated in real-time, providing a robust and flexible solution for dynamic message dissemination.
The second case focuses on handheld devices used by police officers. These devices are crucial for field officers who need to subscribe to different channels of information, such as emergency alerts, situational updates, and other relevant data streams. In this scenario, we integrate YoMo into our architecture. By utilizing YoMo, police officers can subscribe to a variety of information channels, ensuring they receive timely and real-time updates directly on their handheld devices. This capability enhances the officers' situational awareness and response capabilities, as they can access and process information pertinent to their duties without delay.
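The channel-subscription pattern described above can be sketched with a minimal in-memory publish/subscribe broker. This illustrates the pattern only; it is not YoMo's API, and the channel names are invented.

```python
# Minimal in-memory publish/subscribe broker illustrating the
# channel-subscription pattern used for handheld devices.
# Channel names below are illustrative.
from collections import defaultdict
from typing import Callable

class Broker:
    def __init__(self) -> None:
        self._subs: dict[str, list[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Callable[[str], None]) -> None:
        self._subs[channel].append(handler)

    def publish(self, channel: str, message: str) -> None:
        # Deliver to every handler subscribed to this channel.
        for handler in self._subs[channel]:
            handler(message)

broker = Broker()
received: list[str] = []
broker.subscribe("emergency-alerts", received.append)
broker.publish("emergency-alerts", "road closure on I-80")
broker.publish("weather", "clear skies")  # no subscriber, dropped
print(received)  # ['road closure on I-80']
```

In the real deployment the broker role is played by the geo-distributed infrastructure, so an officer's device receives only the channels it subscribed to, with delivery from the nearest node.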
In summary, these two cases demonstrate the versatility and effectiveness of edge computing in managing IoT devices and providing real-time information to field personnel. By employing Shifu for dynamic message signs and YoMo for handheld devices, we create a comprehensive and responsive system that meets the specific needs of these real-world applications. This approach not only improves the efficiency and reliability of information dissemination but also enhances the overall operational capabilities of the police department.
For manufacturing, retail, and IoT use cases:
Q1. What is the data being generated?
@Caleb @Victor Lu
The data being generated in the context of the Butler project includes:
1. Production Data: Data related to the monitoring and improvement of production processes in SMEs. This may include metrics such as production output, efficiency, machine performance, and downtime.
2. Knowledge Access Data: Information related to operation manuals, guidelines, and troubleshooting processes. This could involve data on frequently accessed documents, common troubleshooting steps, and usage patterns.
3. Standards and Regulations Data: Information on industry standards and regulations that SMEs need to adhere to. This might include compliance checklists, audit results, and regulatory updates.
4. Market Insights Data: Market research data that helps SMEs make informed decisions. This can encompass market trends, consumer behavior, competitor analysis, and sales data.
5. Employee Training and Testing Data: Data generated from training modules and testing features aimed at enhancing employee skills and knowledge. This includes training progress, test scores, and skill assessments.
Additionally, the project also mentions the proliferation of voluminous data generated at the edge. This includes:
- Operational Data: Data generated from the daily operations of SMEs, such as inventory levels, sales transactions, and customer interactions.
- Edge Computing Data: Data processed at or close to where it is created to reduce costs, energy consumption, and address data privacy and security concerns. This encompasses all types of data mentioned above, processed locally rather than being sent to centralized data centers.
Q2: For model inference, what data is needed?
@Jeff Brower @Haruhisa Fukano @Feimin Yuan
For manufacturing, retail, and IoT use cases, a variety of data may be needed, and required models may be multimodal. Some examples:
Video behavior recognition. Use case examples include factory floor, warehouse, and retail surveillance
Speech recognition. Use case examples include factory hands-free assembly line, factory floor and warehouse equipment (e.g. forklifts, cranes), first responders (e.g. automated vehicles)
IoT data gathered and communicated between edge nodes. Use case examples include environmental conditions, fire risk, structural integrity, water quality, waste water virus detection, and many others
Spectral measurements. Use case examples include RAN (smart radios)
Q3: Can the model be generated by FedML?
@Tina Tsou
InfiniEdge AI is designed to support federated learning, which means that models can be generated using various frameworks that support federated learning, including FedML.
FedML is an open-source library that facilitates the implementation of federated learning, making it possible to train AI models on edge devices while keeping data decentralized and secure. Given InfiniEdge AI's focus on distributed edge clouds, utilizing FedML would be compatible and beneficial.
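To illustrate the core aggregation step that a federated learning framework such as FedML automates, the sketch below implements plain federated averaging (FedAvg) over per-client weight vectors. It is a didactic sketch of the algorithm, not FedML's API.

```python
# Didactic FedAvg: merge per-client model weights as a weighted average,
# where each client's contribution is proportional to its local data
# size. Weights are plain lists of floats for illustration.
def fedavg(client_weights: list[list[float]],
           client_sizes: list[int]) -> list[float]:
    """Return the size-weighted average of the clients' weight vectors."""
    total = sum(client_sizes)
    merged = [0.0] * len(client_weights[0])
    for weights, size in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            merged[i] += w * size / total
    return merged

# Two clients with unequal data sizes (1 vs. 3 samples):
print(fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))  # [2.5, 3.5]
```

The raw training data never leaves the clients; only the weight vectors are exchanged, which is what makes the approach attractive for the privacy goals stated above.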
AI Agent Platform
The AI Agent Platform is an open-source initiative designed to provide a comprehensive, modular, and scalable framework for developing AI-driven agents across multiple industries. The platform aims to accelerate innovation by offering ready-to-use AI solutions that enhance efficiency, personalization, and user experience across different domains such as supply chain management, sales, customer service, travel, real estate, healthcare, education, finance, retail, automotive, entertainment, and more.
Key Components
Agent SDK A comprehensive SDK that allows developers to easily create and customize AI agents. It includes libraries for natural language understanding, emotion detection, behavior modeling, and more, enabling the rapid development of intelligent solutions across various industries.
AI Agent Marketplace A marketplace for sharing and discovering AI agents, extensions, and tools. The marketplace encourages collaboration and the exchange of ideas within the developer community.
Microservices Architecture The platform is built on a microservices architecture, ensuring flexibility, scalability, and modular development. It leverages Kubernetes for deployment, an API Gateway for integration, and supports a variety of machine learning models tailored to specific agent behaviors.
Open Source Governance The AI Agent Platform operates under an open-source governance model, encouraging contributions from a global developer community. It features a transparent roadmap and collaborative development, promoting wide adoption and adaptation across industries.
Following the key components of the AI Agent Platform, the diagram further elaborates on the specific modules that power the platform’s architecture, providing insight into how various elements contribute to the overall functionality and flexibility.
The Agent Build Platform serves as a low-code solution, designed to streamline the creation and management of AI agents with minimal coding effort. This low-code framework allows developers to easily orchestrate and publish agents, using predefined templates, workflows, and plugins to accelerate development. Additionally, the platform is highly extensible, supporting integrations and custom implementations from third-party providers. This ensures a versatile and collaborative environment where different industries and developers can contribute and customize agents to suit specific business needs.
A robust Data & Knowledge management framework underpins the agents’ ability to operate intelligently. With capabilities such as Knowledge Base Management, Knowledge Import, and Pre-processing, the platform allows for the ingestion, processing, and retrieval of information at scale. Advanced features like Retrieval Augmentation ensure that agents can efficiently access relevant information, even in complex scenarios, making the platform suitable for industries that rely heavily on real-time decision-making and knowledge retrieval.
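The retrieval step behind Retrieval Augmentation can be sketched as ranking stored snippets by cosine similarity to a query embedding. The 2-d vectors below are toys standing in for real model embeddings.

```python
# Toy retrieval step: rank stored (text, embedding) pairs by cosine
# similarity to a query embedding. Real systems use learned,
# high-dimensional embeddings; these 2-d vectors are illustrative.
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec: list[float], store, k: int = 1) -> list[str]:
    """store: list of (text, embedding); return the top-k texts."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]

store = [
    ("forklift safety manual", [1.0, 0.1]),
    ("holiday sales report", [0.1, 1.0]),
]
print(retrieve([0.9, 0.2], store))  # ['forklift safety manual']
```

The retrieved snippets are then prepended to the agent's prompt, which is the "augmentation" half of the technique.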
The Agent Frameworks section highlights examples of various frameworks the platform aims to support, such as AGILE, AutoGen, OpenAI Swarm, and AgentZero. These frameworks provide diverse capabilities, ranging from agile development environments to collaborative multi-agent systems, allowing users to choose the best-suited framework for their specific needs.
The SPEAR infrastructure forms the backbone of the platform, providing essential services like cluster management, traffic routing, and agent runtime environments. SPEAR ensures that agents can scale effectively across distributed systems while maintaining security and performance. Components like Sandbox & Security and Image Management ensure that agents operate safely in their environments, while the API Gateway facilitates easy integration with external applications.
On the backend, the platform leverages a mix of MAAS (Model as a Service), Open Source Models, and Proprietary Models to provide flexible and high-performance AI solutions. This layer supports the use of both pre-built and custom models, ensuring that organizations can leverage the best available technologies for their specific needs. Network Acceleration and optimization further enhance the platform’s performance, making it suitable for industries requiring low-latency and high-throughput operations.
Finally, OPEA (Open Platform for Enterprise AI), an Intel-driven initiative, plays a pivotal role in ensuring that the AI Agent Platform remains an open and adaptable solution for enterprises. By integrating with the OPEA project, the platform promotes wide adoption and allows enterprises to customize the framework for their unique AI-driven use cases, aligning with the open-source governance model.
This modular and scalable design, combined with a commitment to open-source development, positions the AI Agent Platform as a powerful tool for developers and enterprises alike, offering cutting-edge AI capabilities across a diverse range of industries.
AI Agent Marketplace
// TODO @Noe
https://lf-edge.atlassian.net/wiki/x/C4AkBw
Minimum Viable Products
Minimum Viable Product (MVP) For InfiniEdge AI
@Wilson Wang and @C.C. Fan will draft the MVP.
@Moshe Shadmon @Victor Lu @Caleb will meet to go through the thought process for the selected manufacturing use case and what data the model needs.