Tuesday, June 29, 2021

Building Permanent and Censorship-Resistant Blog with Ethereum ENS and IPFS



The internet nowadays is ephemeral. Layers upon layers of trusted 3rd parties are necessary to distribute content online. It requires ongoing maintenance and is susceptible to censorship or hostile takedowns. In this blog post, I describe the steps I took to host my blog in a trustless, permanent, and censorship-resistant way using the IPFS network and Ethereum blockchain.

We’ll be covering many topics, including NFTs, Ethereum smart contracts, and ENS domains, but you don’t need to be familiar with any of those. This post aims to help you configure your own trustless and decentralized website even if you don’t have any blockchain-related experience.

Disclaimer: The information provided in this blog post is for educational purposes only. Please do your own research on the potential consequences of trying to circumvent censorship in your jurisdiction.

How to make your online content disappear?

Ever since I started blogging, my goal has been to publish on infrastructure that I own. Publishing on platforms like Medium is giving away your work for free. Don’t do it. If you trust corporations to host your content permanently, you might want to read up on the story of Yahoo Groups and the desperate community attempts to salvage them.

This blog is a static website generated with Jekyll hosted on an AWS EC2 instance with an NGINX server and distributed via Cloudflare CDN. Cloudflare is also a registrar of my domain.

The setup of this blog is about as self-owned as it can get with the standard web toolkit. However, it would be a vast overstatement to say that I’m in control of it.

This blog has a bus factor of one. If I fail to fund or renew my credit cards, all my blog posts will disappear within a month or two. The best I could hope for is that someone would eventually buy back this domain to harvest the backlinks for SEO.

Recently I’ve started exploring the ecosystem of cryptocurrencies and blockchains. It turns out there’s more to it than gambling and overpriced kittens. Read on if you want to learn how the slowly emerging Web 3.0 can help make your online presence more independent and permanent.

What’s the IPFS all about?

IPFS is an alternative protocol to HTTP for serving static assets. The main principle that distinguishes it from HTTP is that files are addressed by the cryptographic hash of their content (the so-called CID, or Content Identifier) rather than by their location. A file with the text Hello world! will always be addressed as ipfs://QmXgBq2xJKMqVo8jZdziyudNmnbiwjbpAycy5RbfDBoJRM regardless of whether a node in the USA or India serves it.

You can use the IPFS protocol directly if your browser supports it (Brave or Opera). Otherwise, you can access it using one of the many HTTP gateways available, e.g., the one by Cloudflare:

‘Hello world!’ on IPFS

An additional bonus is that since the file’s address is derived from its content, you can always verify that it has not been tampered with. After installing the ipfs CLI you can check that our Hello world! is indeed legit by running the following command:

curl https://cloudflare-ipfs.com/ipfs/QmXgBq2xJKMqVo8jZdziyudNmnbiwjbpAycy5RbfDBoJRM | ipfs add

You should get the following output:

added QmXgBq2xJKMqVo8jZdziyudNmnbiwjbpAycy5RbfDBoJRM
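The principle behind this check can be sketched without IPFS at all. The snippet below is only an illustration: `content_address` and `verify_content` are hypothetical helper names, and a plain SHA-256 digest stands in for a real CID (an actual CID also encodes chunking and multihash metadata, so the values will not match what `ipfs add` prints).

```shell
# Toy content address: the SHA-256 digest of the bytes read from stdin.
# Assumption: this only illustrates the principle; it is NOT a real IPFS CID.
content_address() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum | cut -d' ' -f1
  else
    shasum -a 256 | cut -d' ' -f1   # fallback for systems without sha256sum
  fi
}

# Succeeds only if the bytes on stdin hash to the expected address ($1).
verify_content() {
  [ "$(content_address)" = "$1" ]
}
```

For example, `printf 'Hello world!' | verify_content "$expected"` succeeds no matter where the bytes came from, and fails as soon as a single character differs.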

Another powerful side-effect is that contrary to hosting files by their location, content-based addressing means that you can easily achieve the ultimate redundancy. If you host assets on AWS S3 and your account is suspended, they will no longer be accessible. IPFS assets can be downloaded as long as there’s at least one node hosting them. We’ll discuss how it can help the permanence of your website later in this article.

Common IPFS misconceptions

This blog post is by no means trying to be a comprehensive introduction to IPFS. But I’d like to quickly address common misconceptions.

  • IPFS is not free storage in the cloud. No one will permanently host your files for free. Some nodes might cache them temporarily, but unless you use a commercial so-called “pinning service” or run a 24/7 node yourself, your files will eventually disappear from the network.
  • IPFS is censorship-resistant in the sense that files can be accessed as long as at least one copy remains in the peer-to-peer network. In 2017, the Catalan government used IPFS to bypass Spain’s censorship attempts. Also, an uncensorable version of Wikipedia was created after the Turkish government banned it. However, IPFS is not anonymous. All the nodes publicly advertise their IPs and which files they are hosting. It means that distributing illegal content might have consequences.
  • Once you upload the file to IPFS, it cannot be removed as long as at least one node is hosting it. You cannot force other nodes to remove your file.
  • Unless configured otherwise, your node will temporarily cache all the content you’ve accessed and distribute it to the rest of the network. However, files that you’ve not accessed will never automatically be hosted by your node.

Can IPFS be used to host a static website?

Short answer, YES. Long answer below, but spoiler alert: it’s not yet straightforward.

When I started migrating this website to IPFS, my approach was to do it as close to the metal as possible. A few services can help host your website on IPFS. But these are additional trusted 3rd parties that I want to avoid, following the rule that “Any trusted 3rd party is a security hole.”

To upload content to the IPFS network, you need to start by installing the IPFS CLI. On macOS, it’s as simple as typing:

Please refer to the official documentation for information on how to install it on other systems.

Now you have to start the node process:
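With the standard CLI the node is started with `ipfs daemon`. The sketch below wraps that call with the same `api` file check that the deploy script later in this post relies on. The helper names are hypothetical, and the check is only a heuristic: a stale `api` file can remain after a crash.

```shell
# Succeeds if an IPFS daemon appears to be running for the current repo.
# Assumption: the daemon keeps an "api" file in its repo directory
# ($IPFS_PATH, or ~/.ipfs by default) while it is running.
ipfs_daemon_running() {
  [ -f "${IPFS_PATH:-$HOME/.ipfs}/api" ]
}

# Hypothetical helper: start the daemon in the background only when needed.
ensure_ipfs_daemon() {
  if ipfs_daemon_running; then
    echo "IPFS daemon already running"
  else
    ipfs daemon &   # keeps running until the process is stopped
  fi
}
```

Calling `ensure_ipfs_daemon` at the top of a deploy script avoids starting a second node by accident.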

The rest of this tutorial assumes that you have a folder with the contents of your static website. I’m using Jekyll, but the following process will work the same regardless of the toolkit you use to generate it.

Let’s start with generating the static contents of our website:

JEKYLL_ENV=production jekyll build --destination _blog_ipfs/

Next, you have to make all the internal links relative. We’ll be using the IPFS CID of the main folder rather than hashes of individual files for linking. This means that, e.g., our CSS link must look like this:

  <link rel="stylesheet" type="text/css" href="assets/styles.css">

Instead of:

  <link rel="stylesheet" type="text/css" href="/assets/styles.css">

Notice the additional / character


Otherwise, it would fail to resolve the IPFS path correctly. There’s a great tool that automates this process: the all-relative npm package. Let’s use it:

npm install -g all-relative
cd _blog_ipfs
all-relative
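To make the rewrite concrete, here is a rough sketch of the transformation for a single top-level page. It is not how all-relative is implemented; `relativize_root_links` is a hypothetical helper, and it assumes the HTML file sits in the site root (nested pages would need `../` prefixes, which this sketch does not attempt).

```shell
# Rewrites root-absolute href/src attributes (href="/x") into relative ones
# (href="x"). Protocol-relative URLs such as src="//cdn..." are left alone.
# Assumption: the page being rewritten lives at the top level of the site.
relativize_root_links() {
  sed -E 's#(href|src)="/([^/"])#\1="\2#g'
}
```

Usage would look like `relativize_root_links < index.html > index.relative.html`.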

Jekyll generates files with an html extension. On IPFS, there is no NGINX server to translate the /about path to the about.html file. We need to strip this extension from all the files other than index.html. In the _blog_ipfs folder, run these commands:


for file in *.html; do
  mv -- "$file" "${file%%.html}"
done
mv index index.html
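The loop above only touches files in the top-level folder. If your generator also emits pages in subfolders, a recursive variant might look like the sketch below. `strip_html_extensions` is a hypothetical helper; it assumes no page shares its name with a sibling directory and that file names contain no newlines.

```shell
# Renames every *.html file under the given directory to its extensionless
# name, leaving all index.html files untouched.
strip_html_extensions() {
  dir=$1
  find "$dir" -type f -name '*.html' ! -name 'index.html' |
    while IFS= read -r file; do
      mv -- "$file" "${file%.html}"   # "${file%.html}" drops the suffix
    done
}
```

Running `strip_html_extensions _blog_ipfs` from the parent folder would replace the loop and the final `mv index index.html` step.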

Now we’re ready to upload our static website to the IPFS network. In the parent folder, run the following command:

if [ -f ~/.ipfs/api ]; then
  export NEW_CID=$(ipfs add -r --cid-version 1 _blog_ipfs | tail -1 | cut -d' ' -f2)
else
  echo "IPFS daemon not running";
fi

It checks if the local IPFS node is running, uploads the whole folder, and saves its hash in the $NEW_CID variable.
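The `tail | cut` part works because `ipfs add -r` prints one `added <cid> <path>` line per entry and lists the wrapping folder last. A small sketch of that parsing step, with `extract_root_cid` as a hypothetical helper name:

```shell
# Reads `ipfs add -r` output ("added <cid> <path>" per line) from stdin and
# prints the CID from the last line, which belongs to the wrapping folder.
extract_root_cid() {
  tail -n 1 | cut -d' ' -f2
}
```

You would use it as `ipfs add -r --cid-version 1 _blog_ipfs | extract_root_cid`. The CLI also has a `-Q` (`--quieter`) flag that prints only the final hash, which may let you skip the parsing entirely.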

At this point, you should already be able to access your blog both locally and via a public gateway. Assuming your CID was bafybeiczjr4lqpxj4fypqbnnryjhrpjxgy4ae22c72vxl6bcuwpahbhphm you can use the following links:

http://localhost:8080/ipfs/bafybeiczjr4lqpxj4fypqbnnryjhrpjxgy4ae22c72vxl6bcuwpahbhphm

https://cloudflare-ipfs.com/ipfs/bafybeiczjr4lqpxj4fypqbnnryjhrpjxgy4ae22c72vxl6bcuwpahbhphm/
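Gateway links follow a predictable pattern, so they can be generated from the CID. The helpers below are a sketch with hypothetical names; the subdomain form assumes the gateway supports subdomain addressing (dweb.link is used that way later in this post) and a CIDv1 such as the one produced above.

```shell
# Path-style gateway URL: https://<gateway>/ipfs/<cid>/
gateway_path_url() {
  printf 'https://%s/ipfs/%s/\n' "$1" "$2"
}

# Subdomain-style gateway URL: https://<cid>.ipfs.<gateway>/
# Assumption: the gateway host supports subdomain addressing.
gateway_subdomain_url() {
  printf 'https://%s.ipfs.%s/\n' "$2" "$1"
}
```

For example, `gateway_path_url cloudflare-ipfs.com "$NEW_CID"` prints the second link shown above.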

How to permanently host IPFS files?

Our website is now reachable in the public IPFS network. But, we’ve only uploaded our files using the local node. It means that when our computer goes offline, the website might no longer be reachable after it’s cleared from other nodes’ cache.

Let’s fix it by using the previously mentioned “pinning” services. Currently, I’m uploading each new release of my blog to both Pinata Cloud and Infura. Since files are addressed by the hash of their content instead of location, uploading to multiple providers is possible.

Both services offer an HTTP API. After obtaining your credentials, you can use Pinata like this:

curl -X POST "https://api.pinata.cloud/pinning/pinByHash" \
-H "pinata_api_key: $PINATA_API_KEY" \
-H "pinata_secret_api_key: $PINATA_SECRET_API_KEY" \
-H "Content-Type: application/json" \
-d "{ \"hashToPin\":\"$NEW_CID\", \"pinataMetadata\": { \"name\":\"blog_release\" }}"

and Infura:

curl -X POST "https://ipfs.infura.io:5001/api/v0/pin/add?arg=$NEW_CID" \
-u "$INFURA_IPFS_PROJECT_ID:$INFURA_IPFS_PROJECT_SECRET"
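The escaped JSON in the Pinata call is easy to get wrong. One way to keep it readable is to build the body in a small function, sketched below. `pinata_pin_payload` is a hypothetical helper; it reuses the field names from the request above and assumes neither argument contains quotes or backslashes, since no JSON escaping is performed.

```shell
# Prints the JSON body for the pinByHash request shown above.
# $1 = CID to pin, $2 = human-readable name stored as metadata.
# Assumption: both values are plain text that needs no JSON escaping.
pinata_pin_payload() {
  printf '{"hashToPin":"%s","pinataMetadata":{"name":"%s"}}' "$1" "$2"
}
```

The curl call can then pass `-d "$(pinata_pin_payload "$NEW_CID" blog_release)"` instead of the hand-escaped string.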

I’m not completely satisfied with this setup. Both services currently depend on my credit cards. I’d love to find a service that allows me to prepay for the storage period up front. Eternum used to work exactly like that, but it is no longer onboarding new clients.

There’s a lot of discussion about Filecoin that’s designed to store files in a trustless way. However, I could not find a production-ready service that supports hosting IPFS folder files and is backed by Filecoin protocol. If you know something reliable, please let me know in the comments.

Two providers instead of a single EC2 instance is still a considerable improvement compared to my previous infrastructure. This ecosystem is evolving rapidly. I’m looking forward to revisiting this part of my setup in a couple of months.

How to advertise your IPFS content CID?

You can now reach your audience, but those URLs are just ugly and impossible to memorize. Let’s see how we can improve it.

A standard way to address IPFS files using a DNS system is to use the so-called DNSLink. It is a TXT DNS record that maps the website URL to its corresponding IPFS CID. I’m mirroring the contents of this blog on the ipfs subdomain:

ipfs.pawelurbanek.com

More details on why I did not migrate the root domain will be provided later.

You can check out the current DNSLink entry for this blog by running the following command:

dig TXT _dnslink.ipfs.pawelurbanek.com

You should see a similar output:

_dnslink.ipfs.pawelurbanek.com. 300 IN  TXT "dnslink=/ipfs/bafybeihyzo3q6jw4strg7ydxcx4wsrfksct5izg2i575wcruxtdl7bwwey"
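If you want to use the published CID in a script, for example to check that a deployment went live, the record can be parsed with a one-liner. This is a sketch; `dnslink_cid` is a hypothetical helper that expects `dig` output on stdin.

```shell
# Extracts the CID from a DNSLink TXT record ("dnslink=/ipfs/<cid>").
# Prints nothing when the input contains no such record.
dnslink_cid() {
  sed -n 's#.*dnslink=/ipfs/\([A-Za-z0-9]*\).*#\1#p'
}
```

For instance, `dig +short TXT _dnslink.ipfs.pawelurbanek.com | dnslink_cid` would print just the CID.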

You can also browse the same content using native IPFS protocol:

ipfs://bafybeihyzo3q6jw4strg7ydxcx4wsrfksct5izg2i575wcruxtdl7bwwey/

I use the following cURL call during the deployment process to update the TXT DNS entry with a new CID value in Cloudflare:

curl -X PUT "https://api.cloudflare.com/client/v4/zones/$CLOUDFLARE_ZONE_ID/dns_records/$IPFS_DNS_ID" \
-H "X-Auth-Email: $CLOUDFLARE_EMAIL" \
-H "X-Auth-Key: $CLOUDFLARE_API_KEY" \
-H "Content-Type: application/json" \
-d "{ \"type\":\"TXT\", \"name\":\"_dnslink.ipfs\", \"ttl\":1, \"content\":\"dnslink=/ipfs/$NEW_CID\" }"

Make sure to always enable DNSSEC to at least partially mitigate the threat of tampering with DNS query results.

We still depend on our domain and its registrar as a trusted 3rd party. It means that this solution is nowhere close to the promised censorship-resistant and trustless setup.

Let’s see how Ethereum’s Name Service can help us improve that.

Ethereum ENS as a decentralized replacement for ICANN domain registrar

ENS is a simple way to share your cryptocurrency wallet addresses and other data like social media accounts. Mine is pawelurbanek.eth, just in case one of the readers decides to throw some ETH or a CryptoKitties donation my way.

Information about the domain ownership and all the corresponding metadata persists in the Ethereum blockchain. In addition to the cost of the domain ($5/year), you have to pay the gas fees. You can claim domain ownership for as long as you see fit. In contrast, the Cloudflare domain registrar supports a maximum of 10 years. Being able to purchase your domain for an unrestricted period is an excellent step towards the permanence of your online presence.

Purchasing Ethereum ENS domain for 100 years

$500 for 100 years is a pretty decent deal


There’s a special CONTENT entry that lets you specify a CID hash of an IPFS file that should be assigned to your domain. It will display the file, in our case a previously built website, to the visitors. No more cumbersome hash addresses!

pawelurbanek.eth.link

The link suffix is necessary because eth is not a top-level domain supported by standard browsers. In theory, Brave and Opera have added native support for it, but I’ve found it to be randomly lagging and sometimes completely broken. On Firefox and Chrome, you can add support for eth domains via a Metamask extension.

Please be aware that ENS also uses DNS protocol under the hood. You can inspect what TXT entries are added for your domain by running this command:

dig TXT pawelurbanek.eth.link

An honorable mention here is the Unstoppable Domains project, which works similarly to ENS. It offers the cooler-sounding crypto domain extension.

Both projects distribute domains in the form of NFTs (non-fungible tokens). It means that ownership is confirmable on the Ethereum blockchain and cannot be tampered with. Since NFT is a unified standard, you can see all your domains by logging into the popular NFT marketplace OpenSea.

My NFT domains

I'm not very creative with my domain names.


ENS blockchain-based addressing eliminates a trusted 3rd party, i.e., your ICANN dependent domain registrar. One downside is that each data update costs money, ~$1.5 at the time of writing. Also, there’s a slight inconvenience for visitors who must install the extension or use the link suffix.

When it comes to censorship, your subdomain in the *.eth.link namespace could be censored by government-level actors. But the CID of your IPFS website can always be retrieved straight from the blockchain or via the web3 JavaScript API:


web3.eth.ens.getContent('pawelurbanek.eth').then(function (result) {
    console.log(result);
});
// "ipfs://bafybeihyzo3q6jw4strg7ydxcx4wsrfksct5izg2i575wcruxtdl7bwwey"

Ethereum Smart Contract as the trustless source of truth

But, can you completely trust the Ethereum ENS system? I think it all depends on your level of paranoia and trust issues. In theory, you must have access to private keys to update domain-related data. But in practice…

Go Jack go

According to the ENS docs:

“keyholders can replace the contracts that govern issuing and managing domains (on .eth or any other top-level domain), giving them ultimate control over the structure of the ENS system and the names registered in it.”

“Over time, we plan to reduce and decentralise human control over the system”

It means that currently, ENS is not 100% trustless. If you want to distribute your content CID in a way that’s entirely under your control and impossible to censor, you could use a custom Ethereum smart contract. Smart contracts are immutable programs deployed to the Ethereum blockchain. You could use one as a medium for advertising the most recent CID of the website to your audience. UX would be terrible. Potential users would have to read the raw contract state on Etherscan or directly from the blockchain using their own full node.

But, I could not think of a better way to publicly distribute information in a way that’s 100% independent of any 3rd party. Feedback appreciated.

Check out a sample code for the Solidity smart contract that could serve this purpose:

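// SPDX-License-Identifier: UNLICENSED
// The compiler version below is an assumption; pick one that supports `immutable`.
pragma solidity ^0.8.0;

contract CIDStorage {
    // Address that deployed the contract; only it may publish a new CID.
    address public immutable owner = msg.sender;
    // Timestamp of the block that contained the last update.
    uint256 public updatedAt = block.timestamp;
    // CID of the most recent website release.
    string public currentCID;

    function setNewCID(string memory _newCID) external {
        require(msg.sender == owner, "Access denied!");
        currentCID = _newCID;
        updatedAt = block.timestamp;
    }
}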

I’ve deployed this sample contract to the Ropsten test network. You can read its state even if you don’t have a Metamask extension configured.

Etherscan displaying the internal state of smart contract

Reading the state of a smart contract with Etherscan

A full deployment script

Here’s a full script that I currently use to build and deploy my Jekyll website.

if [ ! -f ~/.ipfs/api ]; then
  echo "IPFS daemon not running"; exit 1;
fi

JEKYLL_ENV=production jekyll build --destination _blog_ipfs/
cd _blog_ipfs
all-relative
for file in *.html; do
  mv -- "$file" "${file%%.html}"
done
mv index index.html
cd ..
export NEW_CID=$(ipfs add -r --cid-version 1 _blog_ipfs | tail -1 | cut -d' ' -f2)
echo "New release CID:"
echo "$NEW_CID"
echo "$NEW_CID" > latest_ipfs_release.txt

# Request the new release via a public gateway to speed up propagation
curl "https://cloudflare-ipfs.com/ipfs/$NEW_CID/" > /dev/null

curl -X PUT "https://api.cloudflare.com/client/v4/zones/$CLOUDFLARE_ZONE_ID/dns_records/$IPFS_DNS_ID" \
-H "X-Auth-Email: $CLOUDFLARE_EMAIL" \
-H "X-Auth-Key: $CLOUDFLARE_API_KEY" \
-H "Content-Type: application/json" \
-d "{ \"type\":\"TXT\", \"name\":\"_dnslink.ipfs\", \"ttl\":1, \"content\":\"dnslink=/ipfs/$NEW_CID\" }"

curl -X POST "https://api.pinata.cloud/pinning/pinByHash" \
-H "pinata_api_key: $PINATA_API_KEY" \
-H "pinata_secret_api_key: $PINATA_SECRET_API_KEY" \
-H "Content-Type: application/json" \
-d "{ \"hashToPin\":\"$NEW_CID\", \"pinataMetadata\": { \"name\":\"blog_release\" }}"

sleep 45

curl -X POST "https://ipfs.infura.io:5001/api/v0/pin/add?arg=$NEW_CID" \
-u "$INFURA_IPFS_PROJECT_ID:$INFURA_IPFS_PROJECT_SECRET"

Remember to start your local IPFS daemon before executing it


This script does not automatically update the CID hash in the ENS domain. It costs ~$1.5 per deployment, so I only do it manually for major releases. I use a cURL call to Cloudflare IPFS gateway to speed up the new upload propagation. sleep 45 is necessary to prevent timeouts when trying to pin a new CID in Infura before it propagates in the network.
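A fixed `sleep 45` is a guess at how long propagation takes. An alternative, sketched below, is to retry the flaky step a few times with a pause in between. `retry` is a hypothetical helper and not part of the script above, and the attempt count and delay you pick are assumptions to tune for your own setup.

```shell
# Runs a command until it succeeds, up to a maximum number of attempts.
# $1 = max attempts, $2 = seconds to wait between attempts, rest = command.
# Returns non-zero if every attempt failed.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}
```

For example, `retry 5 15 curl -f -X POST "https://ipfs.infura.io:5001/api/v0/pin/add?arg=$NEW_CID" -u "$INFURA_IPFS_PROJECT_ID:$INFURA_IPFS_PROJECT_SECRET"` would try the Infura pin up to five times, 15 seconds apart; `-f` makes curl report HTTP errors as failures so the retry actually triggers.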

Caveats of using IPFS for a static domain

As previously mentioned, I’ve decided not to migrate my root domain to IPFS. There are a few disadvantages of the described setup. Let’s cover them one by one:

1.

My commenting system Commento expects a single predefined domain. IPFS website can be browsed via any gateway or natively, so it wouldn’t work correctly. Right now, commenting is possible only if you access my blog via its root domain.

2.

It’s not possible to add client-side caching headers for IPFS websites configured with DNSLink. Instead, they use an ETag for caching. It’s not a disaster, but it generates unnecessary additional requests answered with 304 on each page refresh. Websites accessed via “ugly” CID URLs use the correct client-side caching, but at the cost of a suboptimal browsing UX.

Unnecessary static asset requests on IPFS website

Redundant requests caused by ETag-based caching.


You can compare caching related headers used for “nice” vs. “ugly” URLs by running the following commands:

curl -I https://bafybeiczjr4lqpxj4fypqbnnryjhrpjxgy4ae22c72vxl6bcuwpahbhphm.ipfs.dweb.link/smart-contract-development
curl -I https://ipfs.pawelurbanek.com/smart-contract-development

You’ll notice that the latter is missing the cache-control header. It’s not a bug but rather a limitation of this setup because adding this header could result in stale cache issues for new releases.

3.

Unfortunately, I’ve noticed that both crypto and eth.link domains fail randomly. From time to time, my website would not load, and a cURL call returned the following DNS resolution error:

curl -I https://pawelurbanek.eth.link/
curl: (6) Could not resolve host: pawelurbanek.eth.link

It usually fixed itself after a few tries. DNSLink with a custom domain was working more reliably than the eth.link and crypto extensions.

Most of my traffic comes from Google search results. I’m afraid that the degraded reliability of this website could hurt its SEO rating. All the posts mirrored to IPFS advertise their root domain copies as the canonical version using the rel="canonical" tag. I suspect that the main culprit of the problems I’ve noticed is the so-called IPNS (InterPlanetary Name System) that’s currently known to be imperfect. I’m sure that this situation will improve over time. But, right now, I’d not risk moving a domain that depends on organic SEO traffic to IPFS.

4.

This blog has been around for a while, and I use a few custom NGINX redirect rules to salvage my old backlinks and smoothly handle no longer supported Google AMP pages. IPFS does not allow for such a fine-tuned control. This problem would probably not exist for new websites or the ones with simpler SEO-related requirements.

Summary

My website might not be the perfect candidate to use IPFS as its primary hosting platform. But, I believe that for some use cases, none of the issues mentioned would outweigh the benefits of running an uncensorable and decentralized website.

This post turned out much longer than I had anticipated. Congrats on making it to the end! As I’ve mentioned, these are my first steps in the Web 3.0 ecosystem, so I may have mixed something up. Please let me know in the comments if you’ve noticed any errors.

I want to emphasize that the setup described has nothing to do with anonymity. Your website might be uncensorable from the outside, but it’s possible to track your public IP address based on your IPFS node or blockchain operations.

The vision of the future internet that’s decentralized and controlled by users rather than a few IT giants is pretty exciting. The current Web 3.0 ecosystem offers some truly unique features but is still in a pretty rough state. I believe that now might be a perfect time to get involved as an early adopter.



from Hacker News https://ift.tt/2SDdGXc
