LibP2P HTTP Tunnel PoC
What is this?
This is a PoC which demonstrates tunneling HTTP requests through LibP2P. In this particular case, you are making a Chaingraph API request to a Chaingraph server I have running on a modest SBC (CM3588 8GB RAM) that is sitting in my home.
The intent of this particular PoC is just to test things and see how feasible this approach might be in future. There are still many things I need to do and tweak to improve robustness, but if it works well (and isn't just a pipe-dream), much of the below should be possible.
Why would we want this?
There are a few reasons:
- Paying cloud services to host BCH infrastructure can be expensive. In Chaingraph's case, it requires a Cloud Server with A LOT of storage (1TB+ if we include the BCHN node). This might allow us to host these expensive services from home while still allowing access to them from Web-based (or backend) services.
- This removes a lot of dependence upon DNS (it is still used for bootstrapping with some servers). LibP2P nodes are not identified by their physical address (i.e. IP address), but by their Public Key. This means that we could move our service to a different device or location and, provided it is set up with the same key, other nodes would still be able to find and connect to it.
So, in short: a) it allows us to host BCH infrastructure on cheap devices from home, b) it eliminates most of the dependency upon DNS, and c) it can give us more redundancy and a greater degree of censorship resistance.
Is this applicable to only Chaingraph?
No. Currently, this PoC is only set up to work with Chaingraph. But, theoretically, we should be able to tunnel any HTTP service (and, in future, other protocols).
To give a few examples of where I'd like to head with this:
- Self-hosted @ HOME Fulcrum Nodes (that can work seamlessly with the Electrum-Cash TS/JS Library)
- Self-hosted @ HOME Oracle Relays (e.g. like Oracles.Cash)
- Self-hosted @ HOME LLMs and other Compute-Heavy AI
- Self-hosted @ HOME (and Decentralized) Backend Services (e.g. Settlement Services)
Is your current implementation Production-ready?
No. This is a PoC and I've tried to keep it very simple so that I can debug issues more easily. There are many robustness and efficiency improvements still required:
- We can speed up peer discovery by first trying Delegated Routing and falling back to DHT to find the Peer's Multiaddr (IP/Ports).
- The HTTP requests are not streamed: The proxy waits for all data to be fetched from the HTTP endpoint before forwarding it. This is slow and will not work for some use-cases (e.g. SSE streams).
- Various parameters probably need a lot of tweaking to make things more robust (appropriate connection limits, retries, etc).
- Currently only supports JSON payloads
- Add WebSocket support (so that we can support Fulcrum)
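The discovery fallback mentioned in the first point could be sketched roughly as below. This is only an illustration: `findPeerWithFallback`, `delegatedFind`, and `dhtFind` are hypothetical stand-ins for the respective routers' lookup calls, not the real libp2p API.

```javascript
// Try Delegated Routing first (fast, HTTP-based), then fall back to a
// DHT walk if it fails or takes too long.
async function findPeerWithFallback(peerId, delegatedFind, dhtFind, timeoutMs = 3000) {
  // Kick off the delegated lookup; swallow its late failures so a
  // rejection after we've already fallen back doesn't go unhandled.
  const delegated = delegatedFind(peerId);
  delegated.catch(() => {});
  try {
    // Race the delegated lookup against a timeout so a slow HTTP
    // endpoint can't stall discovery.
    return await Promise.race([
      delegated,
      new Promise((_, reject) =>
        setTimeout(() => reject(new Error('delegated routing timed out')), timeoutMs)
      ),
    ]);
  } catch {
    // Delegated routing failed or timed out: fall back to the DHT.
    return dhtFind(peerId);
  }
}
```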
Will this be hard for frontend services to use?
I don't think so. The current API I have can be invoked like this:
```js
// Create a P2P node and start it.
const p2pNode = await P2PNodeBrowser.create();
await p2pNode.start();

// Create a HTTP Client connecting to my Node's Public Key and use the service
// named "chaingraph" (which I've configured to proxy requests to https://chaingraph:8080).
const chaingraphClient = new P2PHTTPClient(p2pNode, '16Uiu2HAm46R8AgtQDzMfCpDnZuUHEXCRwLySwnzfmzRaThhRPmiW', 'chaingraph');

// Make the API request.
const response = await chaingraphClient.post('/v1/graphql', {
  query: {
    // ... your chaingraph query here
  },
}, {
  headers: {
    "accept": "*/*",
    "accept-language": "en-GB,en;q=0.8",
    "content-type": "application/json",
  }
});

// Log the response to the console.
console.log(response);
```
In future, I plan on trying to make the Client match JS's built-in `fetch` interface. That way, it should be easy to plug into existing apps or libraries by allowing a custom `fetch` implementation to be passed in.
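As a rough illustration of that direction, a fetch-style shim could wrap the existing client like the sketch below. Everything here is an assumption about the final design: `makeP2PFetch` is a hypothetical name, only POST with JSON bodies is covered, and the client's `post(path, body, { headers })` shape is taken from the example above.

```javascript
// Wrap a P2PHTTPClient-style object in a fetch-compatible function so
// it can be passed to libraries that accept a custom `fetch`.
function makeP2PFetch(client) {
  return async function p2pFetch(url, init = {}) {
    // Parse the path/query out of the URL; the base is a dummy since
    // routing happens via the peer, not a hostname.
    const { pathname, search } = new URL(url, 'http://p2p.invalid');
    // Assumes a JSON string body, as in the PoC examples.
    const body = init.body ? JSON.parse(init.body) : undefined;
    const result = await client.post(pathname + search, body, { headers: init.headers ?? {} });
    // Re-wrap the proxied result as a standard Response so existing
    // consumers can call res.json() etc. unchanged.
    return new Response(JSON.stringify(result), {
      status: 200,
      headers: { 'content-type': 'application/json' },
    });
  };
}
```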
Additionally, I also plan on adding a Cluster class that can specify several nodes and apply quorum policies. For example:
```js
// Configure cluster.
const chaingraphCluster = new P2PHTTPCluster([
  // Default Node. Note that we would also support just plain HTTP for those
  // configurations where we might only want to use LibP2P nodes as fallbacks.
  new HTTPClient('https://demo.chaingraph.com'),
  // LibP2P Node 1
  new P2PHTTPClient(p2pNode, '16Uiu2HAm46R8AgtQDzMfCpDnZuUHEXCRwLySwnzfmzRaThhRPmiW', 'chaingraph'),
  // LibP2P Node 2
  new P2PHTTPClient(p2pNode, '16Uiu2HAm46R8AgtQDzMfCpDnZuUHEXCRwLySwnzfmzRaTABCDEFG', 'chaingraph'),
]);

// Make a request.
const response = await chaingraphCluster.post('/v1/graphql', {
  query,
}, {
  headers: {
    "accept": "*/*",
    "accept-language": "en-GB,en;q=0.8",
    "content-type": "application/json",
  }
},
// Use a policy such that the first response received will be used.
// NOTE: Other policies could be specified, such as 2 of X responses must match.
FIRST_RECEIVED
);
```
For those familiar with the `electrum-cash` library, the above is like an HTTP version of the `ElectrumCluster` (but it will support specifying the policy on a per-request basis, as a global policy would not work for many cases).
Other questions
Are nodes anonymous?
No. The IP Address of the nodes will be exposed. You could probably run a node behind Tor (pretty sure I've read of people doing this with IPFS), but I suspect it'd be slow.
Where is the code?
It's a mess right now, with lots of different test code scattered around. I'm a bit embarrassed by it. I'd also like to get some robustness improvements in based on people testing this page. I'm going to try to get it cleaned up and release what I have at some point next week.
Will it be difficult to setup the HTTP Proxy to run services from home?
Using something like Docker Compose, it should be pretty easy. The intent is that you would just be able to add a docker compose service like this:
```yaml
libp2p-http-proxy:
  image: developerscash/libp2p-proxy:dev
  ports:
    - "45555:45555"
  environment:
    # List of Service IDs to BaseURLs that can be proxied.
    SERVICES: |
      {
        "oracles.cash": "https://oracles.generalprotocols.com",
        "chaingraph": "http://hasura:8080"
      }
```
Are there DoS/Attack Vectors?
Yes, see here. One future avenue of research might be seeing if we can tie LibP2P's ConnectionGater to BCH (e.g. through the use of PayToMiner txs, PoW, etc).
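To illustrate where a ConnectionGater could hook in, here's a minimal sketch using a plain allowlist. The `deny*` hook name follows js-libp2p's ConnectionGater interface, but `makeAllowlistGater` is a hypothetical helper, and how the allowlist would actually get populated (PayToMiner txs, PoW, etc.) is pure speculation here.

```javascript
// Build a ConnectionGater-shaped object that only admits peers on an
// allowlist. Returning true from a deny* hook rejects the connection.
function makeAllowlistGater(allowedPeerIds) {
  const allowed = new Set(allowedPeerIds);
  return {
    // Reject peers once their identity is known (post-encryption),
    // unless they are on the allowlist.
    denyInboundEncryptedConnection: async (peerId) => !allowed.has(peerId.toString()),
  };
}
```

The resulting object would then be passed as the `connectionGater` option when creating the libp2p node.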
Do I need to configure my router/firewall, etc?
It will definitely help if you open a port specifically for the proxy. However, LibP2P has a few cool hole-punching techniques it can utilize. If you're into that kind of thing, you might want to read about LibP2P's DCUtR protocol which can negotiate a connection upgrade via other nodes on the LibP2P network. It also supports using "Circuit Relays" as a fallback in case it cannot hole-punch (I haven't looked into how to configure that well yet - but plan to pending testing of the above).
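For reference, enabling DCUtR and the circuit-relay fallback in js-libp2p looks roughly like the fragment below. The package names are the current @libp2p ones, but exact options vary by version, so treat this as a starting point rather than a recipe.

```javascript
import { createLibp2p } from 'libp2p'
import { circuitRelayTransport } from '@libp2p/circuit-relay-v2'
import { dcutr } from '@libp2p/dcutr'

const node = await createLibp2p({
  transports: [
    // Allows dialing/listening via relay nodes when a direct
    // connection cannot be established.
    circuitRelayTransport(),
  ],
  services: {
    // Direct Connection Upgrade through Relay: attempts to upgrade a
    // relayed connection to a direct one via hole-punching.
    dcutr: dcutr(),
  },
})
```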
Would including this library on a frontend bloat up my Site/App's bundle size?
Not as much as I thought it would. With what I have on this page, I think it adds around 750KB. Longer term, my hope is that we might see LibP2P integrated natively into Brave Browser like IPFS was (but I'm not sure that's really feasible yet - it's possible that nodes might need very specific custom configs for a given site/app). I should be able to get some more specific numbers once I set it up to build as a JS bundle that can be served from a CDN.
Can this be used for other things?
Yes. LibP2P is very flexible and built upon the concept of "protocols". What I have written above is a very simple HTTPProxy protocol, but a custom RPC for P2P communications could easily be written too.
Can LibP2P work browser-to-browser?
I think, technically, it can now (when I last tested, around mid-2023, it couldn't). I'll be doing some testing of this in the upcoming weeks. Just as a precaution for others: mobile devices are very pedantic about connections. In most cases, all connections will be terminated when a tab becomes non-visible (to save battery). So you have to be a bit cautious about this kind of design (particularly if your UX has steps where the user might navigate away - e.g. invoking a `bitcoincash:` URL). I'll try to write more about this (and mitigations) once I get around to testing it.
How did you get Chaingraph running on your SBC?
I'll document that elsewhere in case others would like to do it. The short of it is that I needed to make a small patch to the codebase. During initial sync, Chaingraph performs a single INSERT query to Postgres containing every transaction in a block. This becomes troublesome for giant blocks (we have some with 100K+ txs). By breaking these into many smaller queries (~1MB each), I was able to massively reduce the RAM required and sync BCHN, Postgres, Chaingraph, and Hasura all on an SBC with only 8GB of RAM.
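The batching idea itself is simple. Here's an illustrative sketch (not the actual Chaingraph patch) of splitting a block's transactions into chunks bounded by serialized size, so each INSERT stays small:

```javascript
// Split items into chunks whose combined serialized size stays under a
// byte budget (~1MB by default). An oversized single item still gets
// its own chunk rather than being dropped.
function chunkBySize(items, maxBytes = 1_000_000) {
  const chunks = [];
  let current = [];
  let currentBytes = 0;
  for (const item of items) {
    const size = Buffer.byteLength(JSON.stringify(item));
    // Start a new chunk if adding this item would exceed the budget.
    if (current.length > 0 && currentBytes + size > maxBytes) {
      chunks.push(current);
      current = [];
      currentBytes = 0;
    }
    current.push(item);
    currentBytes += size;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

Each chunk would then be issued as its own INSERT, trading a little throughput for a much smaller peak memory footprint.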