AI needs Web3

Author: Catrina Wang · Compilation: SevenUp DAO · Source: Coin Time


Until recently, startups led technological innovation thanks to their speed, agility, entrepreneurial culture, and freedom from organizational inertia. In the rapidly growing AI era, however, things have changed. So far, groundbreaking AI products have been dominated by big tech giants: Microsoft-backed OpenAI, Nvidia, Google, and even Meta.

So what went wrong? Why did the "Goliaths" beat the "Davids" this time around? While startups can write great code, they often cannot compete with the big tech giants due to several challenges:

  1. Computational costs remain extremely high;
  2. AI suffers from a "reverse salient": the lack of necessary regulatory safeguards hinders innovation because of concerns and uncertainty about its social impact;
  3. AI is a black box;
  4. The widening data moat of the incumbents (the big tech companies) creates barriers for emerging competitors.

So, how does this relate to blockchain technology, and where does it intersect with artificial intelligence? Although not a panacea, in Web3, **DePIN (Decentralized Physical Infrastructure Networks) can improve AI technology by solving the above challenges.** In this article, I will explain how the technology behind DePIN can enhance artificial intelligence along four dimensions:

  1. Reducing infrastructure costs;
  2. Verifying the identity and humanity of creators;
  3. Injecting democracy and transparency into AI;
  4. Installing incentive mechanisms for data contribution.

In the context of this article:

  1. "Web3" is defined as the next generation of Internet, blockchain technology is an important part of it, and also includes other existing technologies;
  2. "Blockchain" means decentralized and distributed ledger technology;
  3. "Cryptocurrency" refers to the use of tokens as an incentive and decentralization mechanism.

First, reduce infrastructure costs (computing and storage)

The importance of infrastructure affordability (in the context of AI, the cost of the hardware that computes, delivers, and stores data) is highlighted in Carlota Perez's "Technological Revolutions" framework. The framework proposes that every technological breakthrough has two phases:

1) The installation phase is characterized by heavy VC investment, infrastructure build-out, and a "push" go-to-market (GTM) approach, as the new technology's value proposition to customers is not yet clear.

2) The deployment phase is characterized by a rapid increase in infrastructure supply, which lowers the barrier to entry for new entrants, and by a "pull" GTM approach, indicating that customers want more of the still-scarce product and that there is strong product-market fit.

Since ChatGPT already has clear product-market fit and huge customer demand, one might think that AI has entered the deployment phase. **However, one thing is still missing: an excess supply of infrastructure that makes it cheap enough for price-sensitive startups to build and experiment with.**

1. The problem

The current market dynamics in the physical infrastructure space are largely vertically integrated oligopolies, in which companies like AWS, GCP, Azure, Nvidia, Cloudflare, and Akamai enjoy high profit margins. For example, AWS has an estimated 61% gross margin on commodity computing hardware.

  • Compute is prohibitively expensive for new AI entrants, especially for LLMs.
  • ChatGPT's training cost is estimated at about $4 million, and its hardware inference cost at about $700,000 per day (roughly $255 million a year).
  • BLOOM's second version is expected to cost $10 million to train and retrain.
  • If ChatGPT were deployed into Google Search, it would cost Google $36 billion in revenue, a huge shift of profit from the software platform (Google) to the hardware provider (Nvidia).

2. The solution

DePIN networks (such as Filecoin, Bacalhau, Render Network, and ExaBits) can achieve infrastructure cost savings of 75%-90% or more through the three levers below. Filecoin has been the pioneer since 2014, focused on aggregating Internet-scale hardware for decentralized data storage, while Bacalhau, Render Network, and ExaBits are coordination layers that match demand with CPU/GPU supply. (Disclaimer: the author is a former employee of Protocol Labs and a consultant to ExaBits.)

1) Pushing up the supply curve to create a more competitive market

DePIN democratizes hardware onboarding by enabling anyone with hardware to become a service provider. It creates competition for the incumbents by creating a market in which anyone can join the network as a "miner," offering their CPU/GPU or storage power in exchange for financial rewards. While companies like AWS undoubtedly enjoy a 17-year head start in user experience, operational excellence, and vertical integration, DePIN unlocks a new customer base that was previously priced out by centralized providers. Just as eBay does not compete directly with Bloomingdale's but instead offers more affordable alternatives for similar needs, DePIN networks will not replace centralized providers; they aim to serve a more price-conscious user base, as the matching sketch below illustrates.
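To make the coordination-layer idea concrete, here is a minimal, hypothetical Python sketch of how such a marketplace might match compute jobs to the cheapest permissionless supply. The function and field names are illustrative and are not any real network's API:

```python
import heapq

def match_jobs(asks, jobs):
    """Match compute jobs to the cheapest available providers.

    asks: list of (price_per_hour, provider_id) offers from permissionless
          "miners"; jobs: list of (job_id, max_price) compute requests.
    Purely illustrative: real DePIN networks add reputation, proofs of
    work done, and slashing for misbehavior.
    """
    heapq.heapify(asks)  # min-heap: cheapest offer always at the front
    matches = []
    for job_id, max_price in jobs:
        if asks and asks[0][0] <= max_price:
            price, provider = heapq.heappop(asks)
            matches.append((job_id, provider, price))
    return matches

# More suppliers entering the market pushes the clearing price down:
offers = [(2.40, "centralized-cloud"), (1.10, "minerA"), (0.95, "minerB")]
print(match_jobs(offers, [("train-llm", 1.50)]))
# [('train-llm', 'minerB', 0.95)]
```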

2) Balancing the market's economics through cryptoeconomic design

DePIN creates a subsidy mechanism that induces hardware suppliers to participate in the network, thereby reducing costs for end users. To understand how this works, let's compare the costs and revenues of storage providers on AWS and on Filecoin.

A. DePIN networks can reduce costs for customers by creating a competitive market that introduces Bertrand-style competition, driving prices down. In contrast, AWS EC2 needs a mid-50% profit margin and a 31% gross margin to stay afloat.

B. DePIN networks can offer greater upside by issuing token/block rewards as a new source of income. In the context of Filecoin, hosting more real data means a storage provider earns more block rewards (tokens), so providers are incentivized to attract more customers and win more deals to maximize revenue. The token structures of several emerging compute DePIN networks have not yet been published, but they will likely follow a similar pattern. Examples of these networks include:

  • Bacalhau: a coordination layer that brings computation to where data is stored, avoiding the movement of large amounts of data;
  • ExaBits: a decentralized computing network for artificial intelligence and compute-intensive applications.

A toy model of this incentive structure follows below.
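As an illustration of how block rewards change a provider's economics, here is a toy Python model. The formula and parameters are invented for illustration and are not Filecoin's actual reward math:

```python
def provider_revenue(deal_revenue, data_hosted_tib, network_data_tib,
                     epoch_block_reward):
    """Toy DePIN storage-provider income model: direct deal payments plus
    a block-reward subsidy proportional to the provider's share of all
    verified data on the network. Illustrative only."""
    reward_share = data_hosted_tib / network_data_tib
    return deal_revenue + epoch_block_reward * reward_share

# Hosting more real data earns more of the token subsidy, so a provider
# can undercut centralized prices on deals and still come out ahead:
print(provider_revenue(deal_revenue=1_000, data_hosted_tib=50,
                       network_data_tib=10_000, epoch_block_reward=200_000))
# 2000.0  (1,000 from deals + 1,000 in block rewards)
```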

3) Reducing overhead costs

The benefits of DePIN networks such as Bacalhau and ExaBits, and of IPFS/content-addressed storage, include:

A. Creating availability from latent data: Because of the high bandwidth cost of transferring large datasets, a large amount of data goes untapped. For example, sports stadiums generate large amounts of event data that currently goes unused. DePIN projects unlock such latent data by processing it on-site and transmitting only the meaningful outputs.

B. Reducing the operational costs of data ingestion, such as data entry, transmission, and import/export, by processing data locally.

C. Minimizing the manual overhead of sharing sensitive data: For example, if hospitals A and B need to combine their sensitive patient data for analysis, they can use Bacalhau to coordinate GPU power to process that data locally, rather than going through cumbersome administrative procedures to exchange PII (Personally Identifiable Information) with the counterparty.

D. Eliminating the need to recompute underlying datasets: IPFS/content-addressed storage has built-in properties for deduplicating data, tracking its lineage, and verifying it (see the sketch at the end of this section). There is further reading available on the features and cost benefits that IPFS brings.

3. Summary

AI needs DePIN for affordable infrastructure, and the current market is monopolized by vertically integrated oligopolies. DePIN networks such as Filecoin, Bacalhau, Render Network, and ExaBits can deliver 75%-90%+ cost savings by democratizing access for hardware suppliers and introducing competition, by balancing market economics through cryptoeconomic design, and by reducing overhead costs.
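To illustrate point D (content addressing and deduplication), here is a minimal Python sketch: the address of a piece of data is the hash of its bytes, so identical content is stored exactly once and any tampering changes the address. This is a simplification of what IPFS does with CIDs:

```python
import hashlib

class ContentStore:
    """Tiny content-addressed store: key = SHA-256 of the bytes.
    A simplification of IPFS content addressing (real CIDs also
    encode codec and hash metadata)."""
    def __init__(self):
        self.blocks = {}

    def put(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        self.blocks[address] = data  # identical data -> same key: free dedup
        return address

    def get(self, address: str) -> bytes:
        data = self.blocks[address]
        assert hashlib.sha256(data).hexdigest() == address  # self-verifying
        return data

store = ContentStore()
a = store.put(b"stadium event feed, frame 1")
b = store.put(b"stadium event feed, frame 1")  # duplicate upload
assert a == b and len(store.blocks) == 1       # stored only once
```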

Second, verify creatorship and humanity

1. The problem

According to a recent survey, 50% of AI scientists believe there is at least a 10% chance that AI will lead to the destruction of humanity. That is a sobering thought. AI is already causing societal disruption, and we currently lack the regulatory or technological assurance structures needed to contain it (the "reverse salient" mentioned earlier).

Unfortunately, the societal impact of AI goes far beyond fake podcast debates and images:

  1. The 2024 presidential election cycle may be the first in which deepfake, AI-generated campaign content is hard to distinguish from the real thing.
  2. A video of Senator Elizabeth Warren was edited to make it look like she said Republicans shouldn't be allowed to vote (since debunked).
  3. A cloned voice faked Biden criticizing trans women.
  4. A group of artists filed a class-action lawsuit against Midjourney and Stability AI, alleging that the unauthorized use of their work to train image-generating AI violated their trademarks and threatened their livelihoods.
  5. A deepfake AI-generated track titled "Heart on My Sleeve," featuring the voices of The Weeknd and Drake, gained traction before being pulled from streaming services. The copyright controversy around it is a harbinger of the complications that arise when a new technology enters the mainstream without the necessary rules in place. In other words, it is a reverse-salient problem.

What if we could protect against this in Web3, using cryptographic proofs?

2. The solution

1) Prove creatorship and humanity through cryptographic provenance on the blockchain

This is where we can leverage blockchain technology: as a distributed ledger of immutable records, it makes it possible to verify the authenticity of digital content by checking its cryptographic proofs.

2) Digital signatures prove the creator's identity and humanity

To prevent deepfakes, a cryptographic proof can be generated using a digital signature unique to the content's original creator. The signature is created with a private key known only to the creator and is verifiable with a public key available to everyone. By attaching this signature to the content, it is possible to prove that the content was created by the original creator, whether human or AI, and whether authorized or unauthorized changes were made to it.
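As a concrete sketch, the widely used Python `cryptography` library can sign and verify content with Ed25519 keys. The flow below is a minimal illustration of the signing scheme described above, not any specific platform's implementation:

```python
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

# The creator generates a key pair; the private key never leaves them.
private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()

content = b"original podcast audio bytes ..."
signature = private_key.sign(content)  # published alongside the content

# Anyone can verify provenance with the creator's public key.
try:
    public_key.verify(signature, content)
    print("authentic: signed by the key holder")
except InvalidSignature:
    print("tampered, or not from this creator")

# A single changed byte (e.g., a deepfake edit) breaks verification:
try:
    public_key.verify(signature, content + b"!")
except InvalidSignature:
    print("modification detected")
```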

3) Use IPFS and Merkle trees to prove authenticity

IPFS is a decentralized protocol that uses content addressing and Merkle trees to reference large datasets. To prove a change to a file's contents, a Merkle proof is generated: a list of hashes that establishes a particular data block's place in the Merkle tree. Every time the content changes, a new hash is generated and the Merkle tree is updated, providing proof of the modification.
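Here is a minimal Python sketch of a Merkle tree with inclusion proofs, illustrating the mechanism described above (the pairing and padding conventions are simplified relative to IPFS's actual DAG structure):

```python
import hashlib

def h(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def merkle_root(leaves):
    """Hash leaves pairwise up to a single root hash."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate last node on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_proof(leaves, index):
    """Collect the sibling hashes on the path from one leaf to the root."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))  # (hash, is_left)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof

def verify_inclusion(leaf, proof, root):
    """Recompute the root from one leaf and its proof."""
    node = h(leaf)
    for sibling, is_left in proof:
        node = h(sibling + node) if is_left else h(node + sibling)
    return node == root

blocks = [b"block0", b"block1", b"block2", b"block3"]
root = merkle_root(blocks)
proof = merkle_proof(blocks, 2)
assert verify_inclusion(b"block2", proof, root)      # unmodified: passes
assert not verify_inclusion(b"forged", proof, root)  # any change: fails
```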

Such cryptographic solutions face an incentive problem: catching deepfake generators does not carry a financial reward commensurate with the negative social externalities it reduces. The responsibility will likely fall on the major media distribution platforms, such as Twitter, Meta, and Google, which are already flagging such content. **So why do we need blockchain?** The answer is that cryptographic signatures and proofs of authenticity are more efficient, verifiable, and deterministic. Today, deepfake detection relies largely on machine-learning approaches (such as Meta's Deepfake Detection Challenge, Google's Asymmetric Numeral Systems (ANS), and C2PA) that identify patterns and anomalies in visual content; these are sometimes inaccurate and are falling behind increasingly sophisticated deepfakes. Human moderators are then often required to assess authenticity, which is inefficient and expensive.

Imagine a world in which every piece of content carries a cryptographic signature, so that anyone can verifiably prove the origin of a creation and flag manipulation or forgery: a brave new world.

3. Summary

AI poses major threats to society, with deepfakes and the unauthorized use of content chief among them. Web3 technologies, such as digital signatures proving creator identity and humanity, and IPFS with Merkle trees proving authenticity, can safeguard AI by verifying the authenticity of digital content and detecting unauthorized changes.

Third, inject democracy into AI

1. The problem

Today, artificial intelligence is a black box built from proprietary data and proprietary algorithms. The closed nature of the big tech companies makes "AI democracy" impossible, i.e., a state in which every developer and even every user can contribute algorithms and data to an LLM and receive a share of the model's future profits (as discussed in this paper).

AI democracy = visibility (the ability to see the data and algorithms fed into the model) + contribution (the ability to contribute data or algorithms to the model).

2. The solution

AI democracy aims to make generative AI models accessible to, relevant to, and owned by everyone. The comparison below contrasts what is possible today with what blockchain technology will make possible in Web3.

1) Today

A. For consumers: no visibility into the data and algorithms fed into the model, and no channel for contributing to them.

B. For developers: little repeatability, since there is no traceability of the ETL performed on the data; 80% of data scientists' time is wasted on low-level data cleansing because there is no way to verify shared data outputs.

2) What blockchain will make possible

A. For consumers: users can provide feedback (e.g., fine-grained feedback on bias, content moderation, and output) as input for continuous fine-tuning.

B. For developers:

  • A decentralized data-curation layer: crowdsource tedious, time-consuming data-preparation work such as data labeling;
  • Visibility into algorithms, plus the ability to combine and fine-tune them, with verifiability and lineage (i.e., a tamper-proof history of all past changes);
  • Data sovereignty (enabled by content addressing/IPFS) and algorithm sovereignty (for example, Urbit enables peer-to-peer composability and portability of data and algorithms);
  • Faster innovation driven by novel LLMs emerging as open-source variants of foundational models;
  • Reproducible training-data outputs, via the blockchain's immutable record of past ETL operations and queries (e.g., Kamu; see the sketch at the end of this section).

It might be argued that Web2's open-source platforms are a middle ground, but they remain far from optimal, for the reasons discussed in this article.

3. Summary

The closed nature of the big technology companies has made "AI democracy" impossible: every developer or user should be able to contribute algorithms and data to an LLM and receive a share of the model's future profits. AI should be accessible to, relevant to, and owned by everyone. Blockchain networks will enable users to provide feedback and contribute data toward model monetization, and will give developers visibility and the ability to compose and fine-tune algorithms with verifiable lineage. Web3 innovations such as content addressing/IPFS and Urbit will enable data and algorithm sovereignty. Reproducible training-data outputs from past ETL operations and queries will also become possible through the blockchain's immutable records.
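To illustrate the reproducibility point, here is a minimal Python sketch of a hash-chained log of ETL operations: each entry commits to its predecessor, so the full history is tamper-evident and a pipeline can be replayed deterministically. This is a simplified stand-in for the immutable provenance records the article attributes to systems like Kamu, with invented field names:

```python
import hashlib
import json

def entry_hash(entry: dict) -> str:
    body = {k: v for k, v in entry.items() if k != "hash"}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

def record_step(log: list, operation: str, params: dict) -> None:
    """Append one ETL operation; each entry commits to the previous one."""
    entry = {
        "operation": operation,
        "params": params,
        "prev_hash": log[-1]["hash"] if log else "0" * 64,
    }
    entry["hash"] = entry_hash(entry)
    log.append(entry)

def verify_log(log: list) -> bool:
    """Recompute every hash; any edit to history breaks the chain."""
    prev = "0" * 64
    for entry in log:
        if entry["prev_hash"] != prev or entry["hash"] != entry_hash(entry):
            return False
        prev = entry["hash"]
    return True

log = []
record_step(log, "ingest", {"source": "s3://raw-events"})  # hypothetical path
record_step(log, "drop_nulls", {"columns": ["user_id"]})
record_step(log, "train_test_split", {"ratio": 0.8})
assert verify_log(log)
log[1]["params"]["columns"] = ["email"]  # tamper with history...
assert not verify_log(log)               # ...and verification fails
```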

Fourth, install incentive mechanisms for data contribution

1. The problem

Today, the most valuable consumer data sits behind the proprietary moats of the big tech platforms, and the giants have little incentive to share it with outside parties.

So, why not get this data directly from its originators, the users? Why not make data a public good by contributing our own data and open-sourcing it to talented data scientists?

In short, no incentive or coordination mechanism exists to make this happen. Maintaining data and performing ETL (extract, transform, load) carries significant overhead. In fact, data storage alone is projected to be a $777 billion industry by 2030, and that is before counting the cost of compute. Why would anyone take on data-plumbing work and costs with nothing in return?

For example, OpenAI started out open-source and non-profit, but struggled to fund itself and in 2019 had to take a capital injection from Microsoft, closing its algorithms to the public. OpenAI is expected to generate $1 billion in revenue in 2024.

2. The solution

Web3 introduces a new mechanism called the dataDAO, which facilitates the redistribution of revenue from AI model owners to data contributors, creating an incentive layer for crowdsourced data contribution. A minimal sketch of such a payout rule follows below.
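As a concrete illustration, here is a deliberately simplified Python sketch of a dataDAO payout rule that splits model revenue pro-rata by contribution weight. Real designs may also weight by data quality, staking, or governance votes; all names here are hypothetical:

```python
def distribute_revenue(revenue: float, contributions: dict) -> dict:
    """Split model revenue pro-rata by each contributor's data weight.
    A toy dataDAO payout rule, not any specific protocol's mechanism."""
    total = sum(contributions.values())
    return {member: revenue * weight / total
            for member, weight in contributions.items()}

# Contributors are paid in proportion to the data they supplied:
payouts = distribute_revenue(
    1_000_000,                              # model revenue for the period
    {"alice": 60, "bob": 30, "carol": 10},  # contribution weights
)
print(payouts)  # {'alice': 600000.0, 'bob': 300000.0, 'carol': 100000.0}
```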

Conclusion

In conclusion, DePIN is an exciting new category that provides the alternative hardware fuel for a renaissance of Web3 and AI innovation. While big tech companies dominate the AI industry, emerging players competing with the help of blockchain technology have the potential to change that:

  • DePIN networks lower the cost barrier to computing;
  • The verifiable, decentralized nature of the blockchain makes truly open AI possible;
  • Innovative mechanisms such as dataDAOs incentivize data contribution;
  • The immutable, tamper-proof properties of the blockchain provide proof of creatorship, addressing concerns about AI's negative societal impact.
