Provenance aware Data Network for Massive Scale Data Science

Pando is a Provenance aware Data Network for Massive Scale Data Science that follows Data Mesh principles: domain-oriented decentralized data ownership and architecture, data as a product, self-serve data infrastructure as a platform, and federated computational governance.

Join the Web3 reputation projects using the Pando network


Best Features

Pando takes streams of data, aggregates them into CAR archives, backs them up to Filecoin, and provides a lightweight query interface that lets consumers read specific points of metadata with higher assurance of their global consistency and veracity.

Data Availability

Keep included Data consistently available.

Data Integrity

Trustworthy Data from Data Provider to Data Consumer.

Data Immutability

Discourage historical revisionism.

IPLD Compatible

Compatible with the data model of the content-addressable web and with hash-linked data structures.

Aggregation Computing

Data cube computation over Pando datasets.

Provenance aware

Data provenance (also referred to as “data lineage”) is metadata paired with records that describes the data's origin, the changes made to it, and the details that support confidence in its validity.

Huge Merkle Forest

Pando Legend

Pando is, in fact, the name of one of the largest known organisms on Earth, a clonal colony of quaking aspen. Above ground, Pando appears to be a grove of individual trees, like any other grove. But underground everything is connected by a single, vast root system. It is one tree. A one-tree forest.

IPLD is short for InterPlanetary Linked Data and is at the heart of IPFS. Files and data structures are linked to each other using Merkle links.

Pando is a Provenance aware Data Network backed by the IPLD data model. Interoperability between systems can also be maintained in a huge Merkle forest, where each tree is a separate Merkle tree.
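To make the idea of Merkle links concrete, here is a minimal sketch of a content-addressed store in which a parent node refers to its children by hash. It is an illustration only: the put and get helpers, the JSON encoding, and the plain SHA-256 hex keys are assumptions made for this example, and they are not the CID or codec formats that IPLD or Pando actually use.

```python
import hashlib
import json


def put(store, node):
    """Store a JSON-serializable node under the SHA-256 of its canonical encoding."""
    encoded = json.dumps(node, sort_keys=True, separators=(",", ":")).encode("utf-8")
    key = hashlib.sha256(encoded).hexdigest()
    store[key] = encoded
    return key


def get(store, key):
    """Load a node and check that its content still matches its address."""
    encoded = store[key]
    if hashlib.sha256(encoded).hexdigest() != key:
        raise ValueError("content does not match its address")
    return json.loads(encoded)


# Hypothetical example data: two leaves and a root that links to them by hash.
store = {}
leaf_a = put(store, {"metric": "uptime", "value": 0.99})
leaf_b = put(store, {"metric": "latency_ms", "value": 42})
root = put(store, {"links": [leaf_a, leaf_b]})
```

Because the root's address is derived from the addresses of its leaves, changing any leaf produces a different root, which is the property that makes hash-linked structures tamper-evident.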


The Timeline

About the features and timeline for upgrading the Pando Network.

Q3 2022

Storage layer
Self-serve Integration SDK
Community engagement

Storage layer

Q4 2022

DID & Authentication Layer
Prolly Tree Indexing layer

Indexing layer

Q1 2023

SQL Query Layer
Self-serve Cube Computation

Query Layer

Q2 2023

Scalability & Sharding
Verifiable Query


Q3 2023

Incentive mechanism
Data DAO


Q4 2023

V1 Launch

Web3 Data Infra faces some new challenges

Why Pando?

To give real service you must add something which cannot be bought or measured with money, and that is sincerity and integrity.


Web3 developers expect Off-Chain data protocols

Evolution of Blockchain Components to Off-Chain Models

“In terms of maturity, … the ecosystem of off-chain storage solutions is not yet where it needs to be to build out some of the more advanced use cases some developers might want…”


Web3 developers expect Querying and managing data by SQL

SQL is the King

“SQL is a special-purpose programming language designed for managing data in a relational database, and is used by a huge number of apps and organizations.”


Linked Data lies at the heart of Web3

IPLD Database

“InterPlanetary Linked Data (IPLD) is the data layer for content-addressed systems and the Web 3.0. It is a suite of technologies for representing and traversing hash-linked data.”


Paradigm shift in How we manage Big Data in Web3

Data Mesh

“domain-oriented decentralized data ownership and architecture, data as a product, self-serve data infrastructure as a platform, and federated computational governance.”

Build apps with Linked data

Provenance aware Data Network

In the context of scientific workflows, provenance usually means the lineage and processing history of a data product, and the record of the processes that led to it. Provenance captures workflow design and execution history. It helps in tracking workflow inputs, outputs, processes, and data intersection points, so that experiments can be verified, replayed, and, when possible, reproduced in a precise manner. Provenance also enables comparison between different workflow versions, smart re-runs, and failure recovery.
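As a rough illustration of lineage tracking, the sketch below records each workflow step with its inputs and output, then walks backwards from a data product to find the steps that contributed to it. The Step and ProvenanceLog names, and the file names in the example, are hypothetical and are not part of any Pando API. The sketch assumes steps are recorded in execution order.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    name: str
    inputs: list
    output: str


@dataclass
class ProvenanceLog:
    steps: list = field(default_factory=list)

    def record(self, name, inputs, output):
        """Append one executed step; steps are assumed to arrive in execution order."""
        self.steps.append(Step(name, list(inputs), output))

    def lineage(self, product):
        """Return the names of all steps that contributed to product, in execution order."""
        needed = {product}
        used = []
        for step in reversed(self.steps):
            if step.output in needed:
                used.append(step.name)
                needed.update(step.inputs)
        return list(reversed(used))


# Hypothetical workflow: two steps feed the report, one step is unrelated.
log = ProvenanceLog()
log.record("ingest", ["raw.csv"], "clean.parquet")
log.record("unrelated", ["other.csv"], "other.parquet")
log.record("aggregate", ["clean.parquet"], "report.json")
```

Calling log.lineage("report.json") yields the ingest and aggregate steps and skips the unrelated one, which is the information needed to replay only the part of a workflow that produced a given result.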


Frequently Asked Questions

Frequently asked questions about Pando. Find out the easiest way to build data pipelines on the decentralized web using Pando, and how it differs from other services.

Pando was originally designed as the Reputation Datastore, a sidechain data store that serves the RepSys:
- Producers can input their data onto Pando.
- Consumers can read data from Pando to inform dealmaking decisions on the Filecoin network.
- All data entries must be cryptographically signed via public keys to verify authenticity.

Pando has since been generalized into a Web3 Data Network, which aims to be the data availability layer of the blockchain.

The quickest way to get Pando up and running on your machine is by installing Pando with terminal commands. For installing and initializing Pando from the command line, check out the command-line quick start guide.
Awesome Pando is a good starting point to see the wide variety of projects that are using Pando today.

There are a lot of ways you can contribute to Pando, whether you're interested in helping with either of the core implementations, applications like reputation service, writing or editing documentation, doing UX, or whatever you enjoy working on. Get all the details on where to get started here.

Filecoin and IPFS are two separate, complementary protocols, both created by Protocol Labs. IPFS allows peers to store, request, and transfer verifiable data with each other, while Filecoin is designed to provide a system of persistent data storage. Under Filecoin's incentive structure, clients pay to store data at specific levels of redundancy and availability, and storage providers earn payments and rewards by continuously storing data and cryptographically proving it.

In short: IPFS addresses and moves content, while Filecoin is an incentive layer to persist data.

Some data, such as reputation metrics, doesn’t make sense to embed directly in the Filecoin chain, for a few reasons. It is produced by independent entities, so the data itself does not need to meet the same ‘consensus’ bar that we would expect of a global chain. Likewise, reputation and measurements may be partly subjective. It is also expected that the data will be diverse, and experimentation is a good thing.

However, there are nice properties of having this sort of metadata ecosystem more tightly linked to the chain that seem desirable to encourage, and this leads to the goals for Pando:
- Keep included metadata consistently available
- Provide light-weight, unbiased access to metadata
- Discourage historical revisionism.
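One common way to make historical revisionism detectable is to chain entries by hash, so that editing an old entry invalidates everything after it. The sketch below shows that general technique under stated assumptions: the append and verify helpers, the all-zero genesis value, and the example payloads are made up for illustration and do not describe Pando's actual on-disk or on-chain format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for "no previous entry"


def entry_hash(prev_hash, payload):
    """Hash an entry together with the hash of the entry before it."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True).encode("utf-8")
    return hashlib.sha256(body).hexdigest()


def append(log, payload):
    """Add a payload to the log, linking it to the current tail."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev_hash, "payload": payload, "hash": entry_hash(prev_hash, payload)})


def verify(log):
    """Return True only if every entry matches its hash and links to its predecessor."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(entry["prev"], entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical metadata entries from two providers.
log = []
append(log, {"provider": "p1", "score": 0.9})
append(log, {"provider": "p2", "score": 0.7})
```

If someone later rewrites the score in the first entry, verify(log) returns False unless every later hash is recomputed too, and a consumer holding the latest hash would notice that as well.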

In short: Pando makes it easier for users to store and retrieve data using Structured Query Language (SQL).


Recent News & Blogs

Find the latest blog posts and news from Pando. See related science and technology articles, photos, slideshows, and videos.


By Admin

Jan 25, 2025

Introducing the Pando Network



By Admin

Jan 25, 2025

How does Pando work?



By Admin

Jan 25, 2025

How to run a Pando node?




For Developers and Contributors

Jump into the source code on GitHub and join the Slack channel #pando-builders-wg.

How Can We Help?

Our experts will provide you with more information about our solutions and expertise. Tell us your problem and we will get back to you ASAP.

Contact Us

[email protected]

Send us a Message


We will never spam or share your information.