Thanks to Zhenyang@Upshot, Fran@Giza, Ashely@Neuronets, Matt@Valence, and Dylan@Pond for their feedback.
This study explores which AI fields matter most to developers, and which may become the next breakout opportunities at the intersection of Web3 and AI.
Before sharing our new research, we are delighted to announce our participation in RedPill's first financing round, totaling $5 million. We are excited to grow together with RedPill!
TL;DR
With the integration of Web3 and AI a hot topic in the crypto industry, AI infrastructure in the crypto world is flourishing. Yet actual usage of AI, and of applications built with it, remains scarce, and the homogeneity of AI infrastructure is becoming apparent. Our recent participation in RedPill's first financing round prompted some deeper reflections.
The main toolkit for building AI dApps includes decentralized OpenAI access, GPU networks, inference networks, and agent networks.
GPU networks are more popular now than in the BTC mining era because: the AI market is larger and rising quickly and steadily; AI supports millions of applications every day; AI requires a diversity of GPU models and server locations; the technology is more mature than before; and the customer base is broader.
Inference networks and agent networks share similar infrastructure but differ in focus. Inference networks are mainly used by experienced developers to deploy their own models, and running non-LLM models does not necessarily require a GPU. Agent networks focus on LLMs: developers do not bring their own models; instead the emphasis is on prompt engineering and on linking different agents together. Agent networks always require high-performance GPUs.
AI infrastructure projects promise huge potential and continue to roll out new features.
Most crypto-native projects are still at the testnet stage, with poor stability, complex configuration, and limited functionality, and they still need time to prove their security and privacy.
Assuming AI dApps become a major trend, many areas remain undeveloped, such as monitoring, RAG-related infrastructure, Web3-native models, built-in crypto-native APIs, decentralized data agents, and evaluation networks.
Vertical integration is a significant trend: infrastructure projects are attempting to provide one-stop services that simplify the work of AI dApp developers.
The future will be hybrid: some inference will happen on the front end and some on-chain, trading off cost against verifiability.
Source: IOSG
Introduction
The combination of Web3 and AI is one of the most eye-catching topics in crypto. Talented developers are building AI infrastructure for the crypto world, dedicated to bringing intelligence into smart contracts. Building an AI dApp is an extremely complex task: developers must deal with data, models, compute, operations, deployment, and blockchain integration. To address these needs, Web3 founders have developed many early solutions, such as GPU networks, community data labeling, community-trained models, verifiable AI inference and training, and agent stores.
Yet against this backdrop of flourishing infrastructure, actual usage of AI, and of applications built with it, remains scarce. Developers looking for AI dApp tutorials find few that involve crypto-native AI infrastructure; most only show how to call the OpenAI API from the front end.
Source: IOSG Ventures
Current applications have not fully exploited the decentralization and verifiability of blockchains, but this will soon change. Most crypto-focused AI infrastructure has now launched testnets and plans to go live within the next six months.
This study details the main tools available in crypto AI infrastructure. Let's get ready to welcome the GPT-3.5 moment of the crypto world!
1. RedPill: Decentralized Access to OpenAI
RedPill, mentioned above, is a very good entry point.
OpenAI offers several world-class models, such as GPT-4-vision, GPT-4-turbo, and GPT-4o, making it the preferred choice for building advanced AI dApps.
Developers can integrate the OpenAI API into a dApp via an oracle or through front-end calls.
RedPill aggregates OpenAI API access from different contributors under one interface, providing fast, economical, and verifiable AI services to users worldwide and democratizing access to top AI models. RedPill's routing algorithm directs a developer's request to a single contributor. API requests execute through its distribution network, bypassing potential restrictions from OpenAI and addressing common problems crypto developers face, such as:
TPM (tokens per minute) limits: new accounts have low token quotas, which cannot meet the needs of popular, AI-dependent dApps.
Access restrictions: some models restrict access for new accounts or from certain countries.
By keeping the same request code and simply swapping the hostname, developers can access OpenAI models at low cost, with high scalability and without these restrictions.
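For illustration, here is a minimal sketch with the OpenAI Python SDK, which already supports overriding the hostname via `base_url`; the gateway URL below is a placeholder, not RedPill's confirmed endpoint:

```python
# Minimal sketch: same OpenAI client code, different hostname.
# The base_url is a placeholder for RedPill's actual gateway endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.redpill.example/v1",  # hypothetical RedPill gateway
    api_key="YOUR_REDPILL_KEY",                 # issued by the router, not OpenAI
)

resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize this block's activity."}],
)
print(resp.choices[0].message.content)
```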
2. GPU Network
Beyond using OpenAI's API, many developers choose to self-host models. They can rely on decentralized GPU networks such as io.net, Aethir, and Akash to build GPU clusters and deploy and run powerful in-house or open-source models themselves.
Such decentralized GPU networks harness the compute of individuals or small data centers to offer flexible configurations, more server-location choices, and lower costs, letting developers run AI experiments on a limited budget. Owing to their decentralized nature, however, these networks still have limitations in functionality, availability, and data privacy.
In the past few months, the demand for GPUs has been booming, surpassing the previous BTC mining frenzy. The reasons for this phenomenon include:
GPU networks now serve AI developers, a group that is not only large but also more loyal and unaffected by crypto price fluctuations.
Compared with mining-specific hardware, decentralized GPU networks offer more models and specifications, better matching diverse requirements: large models need GPUs with more VRAM, while smaller tasks have cheaper options. They can also serve end users from nearby locations, with lower latency.
The technology has matured: GPU networks now rely on high-speed blockchains such as Solana for settlement, on Docker virtualization, and on Ray compute clusters (see the sketch after this list).
In terms of investment returns, the AI market is expanding, with more opportunities for new applications and models; the expected return on an H100 is 60-70%, whereas BTC mining is more complex, with limited output and winner-take-all rewards.
BTC mining companies such as Iris Energy, Core Scientific, and Bitdeer have also started to support GPU networks, offer AI services, and actively purchase AI-grade GPUs such as the H100.
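As a hint of what the Ray layer mentioned in the list above looks like in practice, here is a minimal sketch of fanning inference out across GPU workers; the cluster address and the `load_model` helper are assumptions:

```python
# Minimal Ray sketch: distribute inference tasks across GPU workers.
# load_model() is a hypothetical helper; "auto" joins an existing cluster,
# e.g. one provisioned by a decentralized GPU network.
import ray

ray.init(address="auto")

@ray.remote(num_gpus=1)   # each task reserves one GPU on some node
def infer(batch):
    model = load_model("my-open-source-model")  # hypothetical loader
    return model(batch)

batches = [list(range(4)), list(range(4, 8))]   # toy input batches
results = ray.get([infer.remote(b) for b in batches])
```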
Recommendation: For Web2 developers who are not strict about SLAs, io.net offers a simple, user-friendly experience and is a cost-effective choice.
3. Inference Network
This is the core of on-chain AI infrastructure, and it will support billions of AI inference operations in the future. Many AI Layer 1s and Layer 2s give developers the ability to invoke AI inference natively on-chain. Market leaders include Ritual, Valence, and Fetch.ai.
These networks differ in the following aspects:
Performance (latency, computation time)
Supported Models
Verifiability
Price (on-chain consumption cost, inference cost)
Development Experience
3.1 Goal
Ideally, developers can easily access custom AI inference services anywhere, with any form of proof, and with almost no friction in the integration process.
Inference networks provide all the basic support developers need: on-demand proof generation and validation, inference computation, relaying and verifying inference data, Web2 and Web3 interfaces, one-click model deployment, system monitoring, cross-chain operations, synchronous integration, and scheduled execution.
Source: IOSG Ventures
With these features, developers can seamlessly integrate inference services into their existing smart contracts. For example, a DeFi trading bot might use machine learning models to find buying and selling opportunities for specific trading pairs and execute the corresponding strategies on the underlying trading platform.
In the fully ideal state, all of the infrastructure is cloud-hosted. Developers simply upload their trading-strategy model in a common format such as a torch model, and the inference network stores and serves it for both Web2 and Web3 queries.
Once model deployment is complete, developers can invoke inference directly through a Web3 API or from a smart contract. The inference network keeps executing these trading strategies and feeds results back to the underlying smart contract. If the community funds a developer manages are substantial, the inference results must also be verified. On receiving a result, the smart contract executes trades accordingly.
Source: IOSG Ventures
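Here is a minimal sketch of that ideal flow, assuming a hypothetical HTTP deployment API; the endpoint paths and response schema are illustrative, since each inference network defines its own:

```python
# Sketch: package a trading-strategy model and hand it to an inference
# network. The upload endpoint and request/response schema are hypothetical.
import torch
import torch.nn as nn
import requests

class StrategyModel(nn.Module):          # toy stand-in for a real strategy
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
    def forward(self, features):         # returns a buy/sell score in [-1, 1]
        return torch.tanh(self.net(features))

scripted = torch.jit.script(StrategyModel())   # portable TorchScript artifact
scripted.save("strategy.pt")

# Hypothetical deployment call; the endpoint name is an assumption.
with open("strategy.pt", "rb") as f:
    resp = requests.post("https://inference.example/models",
                         files={"model": f},
                         data={"name": "pair-strategy-v1"})
model_id = resp.json()["model_id"]       # assumed response schema

# Later, any Web2 client can request an inference by id (assumed schema).
pred = requests.post(f"https://inference.example/models/{model_id}/infer",
                     json={"inputs": [[0.1] * 8]}).json()
```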
3.1.1 Asynchronous and Synchronous
In theory, executing inference asynchronously yields better performance, but it can be awkward from a development-experience standpoint.
In asynchronous mode, a developer first submits the task to the inference network's smart contract. When the inference task completes, the inference network's contract returns the result. This programming model splits the logic into two parts: the inference call and the inference-result handler.
Source: IOSG Ventures
The situation gets worse if the developer has nested inference calls and a lot of control logic.
Source: IOSG Ventures
The asynchronous programming model is hard to integrate with existing smart contracts. It requires developers to write a lot of extra code and to manage error handling and dependencies themselves.
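To make the split concrete, here is a minimal Python sketch with web3.py, assuming a hypothetical inference contract that exposes a `requestInference` method and an `InferenceCompleted` event; real networks define their own interfaces:

```python
# Sketch of the asynchronous pattern: the dApp logic splits into a request
# transaction and a separate result handler. Contract address, ABI, and
# event names are all hypothetical.
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("https://rpc.example"))
INFERENCE_ABI = [...]                     # assumed ABI for the two members below
inference = w3.eth.contract(address="0x...", abi=INFERENCE_ABI)  # placeholder

# Part 1: submit the inference task to the inference network's contract.
tx = inference.functions.requestInference(b"encoded-input").transact(
    {"from": w3.eth.accounts[0]}
)
w3.eth.wait_for_transaction_receipt(tx)

# Part 2: handle the result when the network calls back, entirely decoupled
# from the request above. (Filter kwargs vary across web3.py versions.)
def on_result(event):
    output = event["args"]["output"]
    ...  # trade, mint, etc.

event_filter = inference.events.InferenceCompleted.create_filter(fromBlock="latest")
for ev in event_filter.get_new_entries():
    on_result(ev)
```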
Synchronous programming, by contrast, is more intuitive for developers, but it introduces problems with response times and blockchain design. For example, if the input is fast-changing data such as a block timestamp or a price, the data may already be stale by the time inference completes, which in specific scenarios can force the smart contract execution to roll back. Imagine trading on an outdated price.
Source: IOSG Ventures
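A small sketch of the freshness guard such a scenario calls for; the oracle client, model interface, and 12-second window are assumptions:

```python
# Sketch: guarding a synchronous flow against stale inputs.
import time

MAX_STALENESS = 12  # seconds a quote may age before we refuse to act (assumed)

def trade_on_fresh_price(price_feed, model):
    quote = price_feed.latest()            # hypothetical oracle client
    decision = model.infer(quote.price)    # slow synchronous inference call
    if time.time() - quote.timestamp > MAX_STALENESS:
        raise RuntimeError("price went stale during inference; roll back")
    return decision
```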
Most AI infrastructure adopts asynchronous processing, but Valence is trying to address these issues.
3.2 Current Situation
In fact, many new inference networks are still in the testing phase, the Ritual network among them. According to their public documentation, these networks' functionality is still limited (features such as verification and proofs are not yet live). Rather than cloud infrastructure supporting on-chain AI compute, they currently offer a framework for self-hosted AI compute that relays results on-chain.
Below is an architecture for running AIGC NFTs: a diffusion model generates the artwork and uploads it to Arweave, and the inference network then uses the Arweave address to mint the NFT on-chain.
Source: IOSG Ventures
This process is very complex, and developers must deploy and maintain most of the infrastructure themselves: a Ritual node with custom service logic, a Stable Diffusion node, and the NFT smart contract.
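For intuition, here is a compressed sketch of that pipeline in Python. The diffusers call is real; the Arweave upload helper and the mint method are placeholders for the network-specific pieces a developer would run themselves:

```python
# Sketch of the AIGC-NFT flow: generate with a diffusion model, upload to
# Arweave, mint with the returned address. upload_to_arweave() and the
# contract's mint() method are hypothetical stand-ins.
import io
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
image = pipe("an on-chain generative artwork").images[0]

buf = io.BytesIO()
image.save(buf, format="PNG")

arweave_tx_id = upload_to_arweave(buf.getvalue())   # hypothetical helper

# Mint the NFT on-chain with the Arweave URI (assumed contract method).
nft_contract.functions.mint(f"ar://{arweave_tx_id}").transact({"from": minter})
```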
Recommendation: Integrating and deploying custom models on inference networks is currently quite complex, and most networks do not yet support verification. Putting the AI on the front end is the simpler option for developers. If you need verifiability, ZKML provider Giza is a good choice.
4. Agent Network
Agent networks let users easily customize agents. Such a network consists of entities or smart contracts that perform tasks autonomously, interact with each other, and interact with blockchain networks, all without direct human intervention. It is mainly oriented toward LLMs. For example, it could offer a GPT chatbot with a deep understanding of Ethereum. The tooling for such chatbots is currently limited, so developers cannot yet build complex applications on top of them.
Source: IOSG Ventures
In the future, however, agent networks will give agents more tools: not just knowledge, but the ability to call external APIs and perform specific tasks. Developers will be able to chain multiple agents into workflows. For example, writing a Solidity smart contract might involve several specialized agents: a protocol design agent, a Solidity development agent, a code security review agent, and a Solidity deployment agent.
Source: IOSG Ventures
We coordinate these agents' cooperation with prompts and scenarios.
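A minimal sketch of such chaining, using the OpenAI client for illustration; an agent-network SDK would replace the `run_agent` call, and the role prompts are illustrative:

```python
# Sketch: chain specialized agents with plain prompts. Each "agent" here is
# just a role-prompted LLM call; real agent networks add routing and tools.
from openai import OpenAI

client = OpenAI()

def run_agent(role: str, task: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "system", "content": role},
                  {"role": "user", "content": task}],
    )
    return resp.choices[0].message.content

spec   = run_agent("You are a protocol design agent.", "Design a vesting contract.")
code   = run_agent("You are a Solidity development agent.", f"Implement:\n{spec}")
review = run_agent("You are a code security review agent.", f"Audit:\n{code}")
# A deployment agent would take `code` plus `review` and submit the bytecode.
```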
Examples of agent networks include Flock.ai, Myshell, and Theoriq.
Recommendation: Most current agents have fairly limited functionality. For specific use cases, Web2 agents still serve better and come with mature orchestration tools such as LangChain and LlamaIndex.
5. The Difference Between Agent Networks and Inference Networks
Agent networks focus on LLMs and provide tooling, such as LangChain integrations, for composing many agents. In most cases developers do not build machine learning models themselves; the agent network has simplified model development and deployment, so they simply wire together the necessary agents and tools. In many cases end users interact with these agents directly.
Inference networks are the infrastructure underneath agent networks, giving developers lower-level access. Normally, end users do not use inference networks directly. Developers deploy their own models, not limited to LLMs, and access them through off-chain or on-chain endpoints.
Agent networks and inference networks are not entirely separate products. We are beginning to see vertically integrated products that offer both agent and inference capabilities, since the two rely on similar infrastructure.
6. New Opportunities
In addition to model inference, training, and agent networks, there are many new areas worth exploring in Web3:
Datasets: How do we turn blockchain data into ML-ready datasets? Machine learning developers need more specific, specialized data. Giza, for example, offers high-quality DeFi datasets specifically for machine learning training. Ideal datasets should go beyond simple tabular data to include graph data describing interactions in the blockchain world (see the sketch after this list); we are currently lacking here. Projects such as Bagel and Sahara are tackling this by rewarding individuals for creating new datasets while promising to protect the privacy of personal data.
Model storage: Some models are large, and how to store, distribute, and version-control them matters for the performance and cost of on-chain machine learning. Pioneers such as FIL, AR, and 0g have made progress here.
Model training: Distributed and verifiable model training is a challenge. Gensyn, Bittensor, Flock, and Allora have made significant progress.
Monitoring: Since model inference happens both on-chain and off-chain, new infrastructure is needed to help Web3 developers track model usage and promptly spot problems and biases. With proper monitoring tools, Web3 machine learning developers can adjust in time and continuously improve model accuracy.
RAG infrastructure: Distributed RAG needs a brand-new infrastructure environment, with heavy demands on storage, embedding computation, and vector databases, while keeping data private and secure. This differs greatly from today's Web3 AI infrastructure, which mostly relies on third parties, such as Firstbatch and Bagel, to perform RAG.
Models tailored for Web3: Not all models suit Web3 scenarios; in many cases a model must be retrained for applications such as price prediction or recommendation. As AI infrastructure prospers, we expect more Web3-native models serving AI applications. For example, Pond is developing a blockchain GNN for scenarios such as price prediction, recommendation, fraud detection, and anti-money-laundering.
Evaluation networks: Evaluating agents without human feedback is not easy. As agent-creation tools spread, countless agents will hit the market, and a system will be needed to showcase their capabilities and help users decide which agent performs best in a given situation. Neuronets, for example, is a player in this field.
Consensus mechanisms: PoS is not necessarily the best fit for AI tasks; computational complexity, hard verification, and non-determinism are its main challenges here. Bittensor has created a new intelligent consensus mechanism that rewards nodes that contribute to the network's machine learning models and outputs.
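Returning to the Datasets item above, here is a minimal sketch of turning raw transfer events into graph data with networkx; the transfer tuples and feature choices are illustrative:

```python
# Sketch: build a graph dataset from on-chain transfers, the kind of
# interaction data a blockchain GNN could consume. A production pipeline
# would stream events from an indexer or node instead.
import networkx as nx

# Assumed shape: (from_addr, to_addr, value) tuples from an indexer.
transfers = [
    ("0xaaa", "0xbbb", 1.5),
    ("0xbbb", "0xccc", 0.7),
    ("0xaaa", "0xccc", 2.0),
]

g = nx.DiGraph()
for src, dst, value in transfers:
    # Aggregate edge weights so repeated interactions strengthen the edge.
    w = g.get_edge_data(src, dst, {"weight": 0.0})["weight"]
    g.add_edge(src, dst, weight=w + value)

# Node-level features usable by a GNN (e.g., for fraud detection).
features = {
    n: {
        "out_deg": g.out_degree(n),
        "in_flow": sum(d["weight"] for _, _, d in g.in_edges(n, data=True)),
    }
    for n in g.nodes
}
```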
7. Future Prospects
We are observing a trend toward vertical integration. By building a foundational compute layer, a network can support a range of machine learning tasks: training, inference, and agent-network services alike. This model aims to offer Web3 machine learning developers a comprehensive one-stop solution.
Today, on-chain inference is costly and slow, but it offers excellent verifiability and integrates seamlessly with backend systems such as smart contracts. I believe the future lies in hybrid applications: some inference will happen on the front end or off-chain, while critical, decisive inference happens on-chain. This pattern is already in use on mobile devices, which exploit their native capabilities to run small models locally and hand more complex tasks to the cloud and its larger LLMs.
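As a closing illustration, a minimal sketch of such hybrid routing; the local model and the escalation thresholds are illustrative assumptions:

```python
# Sketch: try a small local model first, escalate hard or high-stakes
# queries to a larger cloud LLM. local_small_model() is hypothetical.
from openai import OpenAI

cloud = OpenAI()

def answer(query: str, high_stakes: bool) -> str:
    if not high_stakes and len(query) < 200:   # assumed cheapness heuristic
        return local_small_model(query)        # hypothetical on-device model
    resp = cloud.chat.completions.create(      # escalate to the larger LLM
        model="gpt-4o",
        messages=[{"role": "user", "content": query}],
    )
    return resp.choices[0].message.content
```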