How does paying for data retrieval work with IPFS? #1191

Open
tysonzero opened this issue Oct 3, 2020 · 19 comments

@tysonzero

tysonzero commented Oct 3, 2020

I was under the impression that filecoin allowed you to pay to make sure someone was pinning content on IPFS. However after looking at the site it looks like you have to pay to retrieve a file off of a miner. These two things seem at odds with one another.

Is it possible to use filecoin to make sure a file persists on IPFS, without having to worry too much about how many people are requesting the file? It seems like IPFS's caching facilities should limit load on the miner itself for heavily requested files.

@yiannisbot
Collaborator

yiannisbot commented Oct 6, 2020

@tysonzero: there are different things mixed in your questions above, so let me try to clarify.

IPFS is (at this point) detached from Filecoin. They are two different networks. Filecoin uses many of the principles and concepts of IPFS, but the two systems do not interoperate "out of the box". There is more integration on the way, but it is not correct to assume that, for example, when you add a file on IPFS Filecoin will pick it up and store it in a storage miner.

Filecoin operates two markets: i) the storage market, where a user makes an agreement with a storage miner for the latter to store their files, and ii) the retrieval market, where a user makes a deal with a retrieval miner to retrieve the file. Storage and retrieval miners are two different roles, although a single host or physical entity can run both. In both cases the user has to pay: to store the file in the first market, to retrieve it in the second. You can read more about the Filecoin Markets in the related spec section that gives all the details. The Filecoin Docs also have a lot of useful information that might be helpful.

IPFS pinning services are different and not related to storage miners. See some discussion and pointers to pinning services in the related section of IPFS Docs. I don't have up to date information on whether some pinning services will expand their focus to become storage/retrieval miners on the Filecoin network, but in all cases, as mentioned above, pinning on IPFS does not translate to storing on Filecoin.

So, on your comments.

I was under the impression that filecoin allowed you to pay to make sure someone was pinning content on IPFS.

I think you're referring to the IPFS pinning services, not filecoin.

However after looking at the site it looks like you have to pay to retrieve a file off of a miner.

Yes, this is the retrieval market.

These two things seem at odds with one another.

Note that you don't have to pay to retrieve a file that has been pinned on the IPFS network.

Is it possible to use filecoin to make sure a file persists on IPFS

Nope, if you use Filecoin, then the file will persist on the Filecoin network, not the IPFS network. At this point, these are two different networks.

without having to worry too much about how many people are requesting the file

Can you let me know what use case you have in mind? It might make things easier to put in context. When a file is pinned on IPFS, anyone can retrieve it for free. It doesn't matter how popular the file is. If you store something with a storage miner, then others will have to pay to retrieve it.
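
To make the difference in who pays concrete, here is a minimal sketch of the two models. The function names, units and prices are hypothetical and only illustrate the shape of the costs; they are not real Filecoin or pinning-service pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Costs:
    publisher: float   # paid by whoever pins or stores the file
    per_reader: float  # paid by each person who downloads it


def pinned_on_ipfs(size_gib: float, months: float, pin_price: float) -> Costs:
    # Pinning: the publisher pays by size and duration; downloads are free.
    return Costs(publisher=size_gib * months * pin_price, per_reader=0.0)


def stored_on_filecoin(size_gib: float, months: float,
                       storage_price: float, retrieval_price: float) -> Costs:
    # Storage deal plus retrieval deals: the publisher pays for storage,
    # and every reader pays for their own retrieval.
    return Costs(publisher=size_gib * months * storage_price,
                 per_reader=size_gib * retrieval_price)
```

With these assumptions, a popular file costs the publisher the same in both models, but only the second one charges each reader.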

It seems like IPFS's caching facilities should limit load on the miner itself for heavily requested files.

Indeed, load-balancing will be useful for popular files and miners (in the Filecoin network) will have to consider that in the future. Whether it will be done using IPFS techniques or not, this is a different issue.

@tysonzero
Author

Ah I see.

To be honest the main thing I am looking for is a decentralized way to pay people to pin files on the IPFS network.

We'll be storing and IPFS-pinning the files we care about on our local dev machines anyway, and referencing them using their IPFS hash, so there is a lot less benefit for us in storing them on an entirely separate network.

Thank you for the information.

@yiannisbot
Collaborator

Ok, I see. Just for information: the Filecoin network also uses the IPFS CID and hash concept, so that's not a show-stopper for using Filecoin. What you want to do from then on will determine whether you want to use IPFS or Filecoin. For Files & Data, see this section of the spec, which provides all the details: https://spec.filecoin.io/#systems__filecoin_files.

@tysonzero
Author

Are there any plans for an "IPFS retrieval market" where instead of paying for one download of the file right now, you pay an amount for the file to be available on IPFS for a given amount of time? I would like all my files to be available via IPFS gateways and the IPFS protocol as a whole.

@yiannisbot
Collaborator

IIUC, what you describe is what pinning services (which already exist) do: you store something on the service's storage (commonly realized as a public gateway) and pay according to the size of the file.

The retrieval itself, i.e., someone downloading the file is free.

I would suggest you check the pinning services functionality in the docs I linked to above.

@tysonzero
Author

Fully decentralized IPFS pinning services already exist? I was under the impression it was all via centralized vendors.

@yiannisbot
Collaborator

Oh, ok, I see. No, there are no decentralised pinning services AFAIK. But again, this is different to an "IPFS retrieval market" IMO.

I suggest that you reach out to the discuss.ipfs.io forum, as the community might have more input. From what I understand, your issue is more of an IPFS issue than a Filecoin one.

@tysonzero
Author

Can you elaborate on the differences you see? It's definitely possible though, I'm new to most of this stuff. I'm just looking for a decentralized way to pay for things to stay available in IPFS.

Yeah it is more or less an IPFS issue. The main reason why I ask here is it seems hard to solve without a cryptocurrency and the proof of spacetime stuff done by Filecoin.

@agnelvishal

@tysonzero You can try Textile.io Powergate. It allows storing files in IPFS (called hot storage) and in Filecoin (called cold storage).

@d10r

d10r commented Oct 23, 2020

I can fully relate to the confusion expressed here by @tysonzero . I too was under the impression that Filecoin would be an incentive overlay to IPFS. In fact the Filecoin paper has this sentence in the abstract:

Filecoin works as an incentive layer on top of IPFS [1], which can provide storage infrastructure for any data.

Did we misinterpret what this sentence was supposed to mean or did the team change strategy along the way?
Is there a place where one can read about the motivations for making Filecoin a network distinct from IPFS?

In our org we have been using IPFS for various PoCs (e.g. this) since 2017.
So far we have always used our own servers for pinning - on the assumption that one day we could add an option to pay for storage in order to have guaranteed data availability - independent of how our servers are doing. I consider centralized pinning services more of a workaround than a solution to that problem - nothing I'd want to rely on in the long term. I always assumed that Filecoin would be the long-term solution.
Now however I wonder how Filecoin can help. In theory we could run a Filecoin node instead of an IPFS node and keep user data available there on a voluntary basis - just like we're doing with IPFS now. But is that feasible in practice? Is it possible to run a Filecoin storage node which is not also a full-blown miner? Because we definitely can't afford the HW requirements as specified for a Filecoin miner for a voluntary service.
My guess is that in order to achieve what we want with Filecoin, we need to operate on both networks - keep the content on a self-hosted IPFS node for voluntary storage and replicate it over to Filecoin once a user wants to pay for guaranteed availability. Is that the best course of action or am I missing something?

@yiannisbot
Collaborator

@tysonzero @d10r just to clarify one important thing: I'm stating above that the IPFS and Filecoin networks do not interoperate natively, meaning that running ipfs add <file> won't add the file to Filecoin automatically. However, there are services developed by the community, as @agnelvishal is pointing out above.

Please see Filecoin backed pinning services, where by using Filecoin Pinning Service providers you can have IPFS services backed by Filecoin persistence. I hope this can help clarify some of the design decisions you're looking to make.

See also Textile's Powergate and buckets.

@tysonzero
Author

tysonzero commented Oct 27, 2020

To be clear I definitely did not expect ipfs add <file> to add it to filecoin. IMO incentive systems and blockchains and such should be considered out of scope for ipfs.

Now I did expect filecoin add <file> to add it to ipfs.

I want to put ipfs links in tons of places I'd previously put http links, from codebase dependencies to eventually src attributes (once we have ipfs in the browser). I'm realistically not going to use filecoin links in those places, as end users would then need to pay to download the file, which is a huge step down from what they are used to.

Unfortunately without an ipfs-backed filecoin (or equivalent) it's likely not in our company's best interest to use ipfs or filecoin at all. It would be a lot more expensive for us to run an ipfs node on aws (our current host) than it is to just use s3. We could use a third party centralized pinning service, but we'd likely only feel comfortable if we used multiple of them as well as local pinning, so it'd require more money and effort than just continuing to use s3.

I'm sure for some types of businesses the current filecoin model works well, but it just does not seem to work well for very public-facing data that people expect to be able to download for free (with companies like ours happy to eat those costs directly or indirectly).

This also seems to go against the idea of addressing data based on its content rather than its provenance, if a ton of data will only be on filecoin, and a ton of other data will only be on IPFS. You now need to either check both places, or store that provenance information alongside.

Someone posted this issue to Reddit in case you are curious about the discussions going on there.

@aschmahmann

@tysonzero I get your point and your pain here around wishing you could ipfs get bafyabc and have the data just get pulled from Filecoin.

I think part of what you're missing here is the difference between how things work today and how things should work in the (hopefully not too distant) future.

This topic is a bit big, so I apologize for the long post. Happy to discuss more about this on discuss.ipfs.io, as this likely isn't a Filecoin issue.

High level thoughts on IPFS - Filecoin interoperability

As @momack2 mentioned in that reddit thread, Filecoin's retrieval market is under very active development within the ecosystem. I'd really highly recommend watching the Filecoin Liftoff videos on retrieval markets and IPFS + Filecoin (includes some thoughts by yours truly so I'm happy to answer any questions on discuss.ipfs.io or the IPFS IRC/Matrix channel).

The TLDR is that IPFS isn't currently and isn't planning on being tied to Filecoin. Nonetheless, the power of content addressing (and self-certified data in general) means that I do not have to care where data is coming from; I can just get it. While using go/js-ipfs as a library is quite pluggable (although go-ipfs has some rough edges), extending the go-ipfs binary is a bit of a rough experience that we'd like to make better.

IPFS Extensibility + Filecoin

Three of IPFS' major jobs are:

  • Storing data (let's roughly call this Pinning)
  • Finding who has data (Content Routing)
  • Downloading the data (Fetching)

Since some IPFS users may want to leverage Filecoin for pinning (i.e. storage deals), content routing (i.e. retrieval markets) and downloading data (i.e. retrieval deals), IPFS implementations such as go-ipfs should make it easier to plug into each of the above systems.

Once it's pluggable, you can reasonably have some light Filecoin client running on your machine that handles all the extensibility required by IPFS. Again, this is already doable today; it's just not the best UX.
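
A minimal sketch of what "pluggable" could look like for the three jobs above. None of these names come from go-ipfs or Lotus; `Pinner`, `ContentRouter`, `Fetcher`, `Node` and `InMemoryBackend` are hypothetical and only show how a node could delegate each job to a swappable component.

```python
from typing import Dict, List, Protocol


class Pinner(Protocol):
    def pin(self, cid: str, data: bytes) -> None: ...


class ContentRouter(Protocol):
    def find_providers(self, cid: str) -> List[str]: ...


class Fetcher(Protocol):
    def fetch(self, cid: str, provider: str) -> bytes: ...


class InMemoryBackend:
    """Toy backend that plays all three roles at once."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.blocks: Dict[str, bytes] = {}

    def pin(self, cid: str, data: bytes) -> None:
        self.blocks[cid] = data

    def find_providers(self, cid: str) -> List[str]:
        return [self.name] if cid in self.blocks else []

    def fetch(self, cid: str, provider: str) -> bytes:
        return self.blocks[cid]


class Node:
    """Delegates each job to whichever component was plugged in."""

    def __init__(self, pinner: Pinner, router: ContentRouter, fetcher: Fetcher) -> None:
        self.pinner, self.router, self.fetcher = pinner, router, fetcher

    def add(self, cid: str, data: bytes) -> None:
        self.pinner.pin(cid, data)

    def get(self, cid: str) -> bytes:
        providers = self.router.find_providers(cid)
        if not providers:
            raise LookupError(f"no provider found for {cid}")
        return self.fetcher.fetch(cid, providers[0])
```

In this sketch a Filecoin-backed component would simply be another class with the same methods, passed to `Node` in place of the in-memory one.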

Pinning + Filecoin

Fully decentralized IPFS pinning services already exist? I was under the impression it was all via centralized vendors.

This is interesting the answer is almost yes, all it needs is go-ipfs v0.8.0 and a bit of love, perhaps from someone like you 😄.

As part of go-ipfs v0.8.0 we're adding support for a pinning service API which essentially will allow go-ipfs to ask any service to store data for you in a uniform way. While some of the first implementers of the server side of the protocol will be centralized pinning services, there's no reason IPFS Cluster, a Lotus node, or any other service couldn't implement something here.

Content Routing + Filecoin

This is essentially retrieval markets and, as mentioned above, is an area of active development. One thing that's still not there is making it easy for someone to plug a custom content routing solution into go-ipfs. There are actually a number of ways to do this (some listed above), but it's an area for improvement.

Fetching + Filecoin

Let's start with clearing some things up, because this seems to be where there's some misunderstanding.

Now I did expect filecoin add <file> to add it to ipfs.

Why? How is the IPFS node supposed to pay the Filecoin retrieval miner for the bandwidth to get the data?

IPFS would need to have some extensibility point that would allow it to pay a retrieval miner. How much should it pay for the data and how is IPFS supposed to deal with this?

Even if you wanted to foot the bill and you could give your users special vouchers that only let them get the data they want for free, how is IPFS supposed to know how to handle this?

By the way, as @yiannisbot mentioned, there are already some solutions that can help you with paying for your users' download bandwidth. If you work with a pinning service that has a system like Powergate which couples IPFS + Filecoin, then you can pay one or more centralized services to do the "serve up my data to users at no cost to them" thing while having the data backed up on Filecoin as well. One nice thing about this is that it doesn't require your users to have any Filecoin client installed at all, and if the user doesn't have IPFS installed locally they'll still be able to fetch links via a public gateway.

Adding extensibility to go-ipfs here seems like a totally reasonable thing to do (and again can already be done, but it's not fun). However, the UX may be tricky. For example, you could set up some defaults in your Filecoin light client that say "pay no more than X per byte", which gives up some control but allows the IPFS experience to just work since the payments can all be hidden. Alternatively, you could allow IPFS to pass arbitrary metadata flags through to underlying services (e.g. ipfs get bafyabc --meta={pay_max: 10}) but then some of the UX around just going to ipfs://bafyabc becomes worse.
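
A minimal sketch of the "pay no more than X per byte" default, assuming a light client that receives price quotes from retrieval miners. `RetrievalQuote` and `pick_quote` are hypothetical names, and prices are plain integers in some smallest currency unit; no real Filecoin API is used.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RetrievalQuote:
    provider: str
    price_per_byte: int  # hypothetical smallest currency unit
    size_bytes: int

    def total(self) -> int:
        return self.price_per_byte * self.size_bytes


def pick_quote(quotes: List[RetrievalQuote],
               max_price_per_byte: int) -> Optional[RetrievalQuote]:
    # Drop every quote above the configured cap, then take the cheapest.
    affordable = [q for q in quotes if q.price_per_byte <= max_price_per_byte]
    return min(affordable, key=lambda q: q.price_per_byte, default=None)
```

Returning `None` when nothing fits the cap leaves the caller to decide whether to fail, ask the user, or retry with a higher limit, which is exactly where the UX questions above come in.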

Summary

Wow, you made it (or you skipped to the end and I can't really blame you)!

As you can hopefully tell adding extensibility is important to the IPFS ecosystem. People are regularly building custom components that suit their use cases (e.g. custom routers, datastores, pin managers, provider record publishers, fetch clients, etc.) and Filecoin's launch is an excellent opportunity to work on making more extensibility points and making them easier to use.

It'll take some time to build these things and grow the ecosystem around them and some of the UX questions are tough. If you're interested perhaps start a topic on discuss.ipfs.io (or Matrix/IRC) to engage members of the community in what you'd like to see and perhaps hack together a prototype.

@tysonzero
Author

tysonzero commented Oct 28, 2020

I won't go too in depth but just wanted to quickly address a few points.

The TLDR is that IPFS isn't currently and isn't planning on being tied to Filecoin

I think I was fairly clear on that. I do not want IPFS to be tied to or depend on Filecoin. I agree that being hardwired to one particular choice of incentive system would be bad.

However I was hoping that Filecoin would depend on / build on top of IPFS. I was hoping that various different incentive systems would all be built on top of IPFS.

This is interesting the answer is almost yes [decentralized paying for pinning]

I don't see how that's possible. It seems like you fundamentally need a cryptocurrency to do that. So if filecoin is not doing it then who is?

Why? How is the IPFS node supposed to pay the Filecoin retrieval miner for the bandwidth to get the data?

Up until the day I made this github thread, I was under the assumption that filecoin had the same pricing model as most centralized pinning services, you pay (extra) for storage but not for bandwidth.

... stuff about extending go-ipfs ...

I was not intending to change ipfs itself at all. I do not think that ipfs clients should have any concept of payment or know about filecoin. I was hoping that filecoin nodes would be ipfs nodes, just like how existing pinning services work.


A large amount of advertising and discussion I have seen around filecoin has referred to it as "an incentive system for IPFS". I have now learned that this is not the case, as it is not built on top of IPFS. This is what I am rather disappointed in as it makes it unviable for our business. I also feel that I am not alone here, as almost everyone on that reddit thread was equally surprised by this revelation.

@aschmahmann

aschmahmann commented Oct 29, 2020

had the same pricing model as most centralized pinning services, you pay (extra) for storage but not for bandwidth.

My understanding is that Filecoin has planned to have storage + retrieval markets for quite some time (at least as far back as the 2017 paper). Perhaps systems and smart contracts like you propose will emerge over time, but I suspect they'll have to deal with some tricky questions around what happens if a peer with data no longer wants to send it because bandwidth happens to cost more than they were expecting when they made the deal, or more people are requesting the data than they expected.

Note: if you're willing to work under a model where you use a reputation system to track storage miner behaviors you can currently look to store with peers that have some system like Powergate running that basically runs Filecoin and IPFS together. The reputation then allows you to work around some of the tricky problems with incentivizing peers to upload content even while bandwidth is more costly than they expected.

[decentralized paying for pinning] ...
I don't see how that's possible. It seems like you fundamentally need a cryptocurrency to do that. So if filecoin is not doing it then who is?
... I do not think that ipfs clients should have any concept of payment or know about filecoin

As an example, I could run a local Filecoin light client, called foo, that supports the pinning service API. I could then run something like ipfs pin remote add --service=foo QmFile and my foo client will then pay some Filecoin miner to store the data. IPFS doesn't have to know anything about payments, it just has to know that there is some extensible component out there that will pin an IPLD graph when asked to.
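
A minimal sketch of the request such a foo client would receive, assuming it implements the pinning service API as an HTTP endpoint that accepts POST /pins with a JSON body containing the CID and a bearer token. The endpoint URL, token and `build_pin_request` helper are hypothetical; the request is only built here, not sent.

```python
import json
import urllib.request


def build_pin_request(endpoint: str, token: str, cid: str,
                      name: str = "") -> urllib.request.Request:
    # Body of a "pin this CID" request; the name is optional.
    body = {"cid": cid}
    if name:
        body["name"] = name
    return urllib.request.Request(
        url=endpoint.rstrip("/") + "/pins",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical local service; how foo pays a miner is hidden behind it.
request = build_pin_request("http://127.0.0.1:5000/", "secret", "QmFile", "demo")
```

Note that nothing in the request mentions payment: how foo turns it into a storage deal is entirely its own business.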

Sure, you could do this directly from the Filecoin light client instead of IPFS, but when thinking about other behaviors like retrieval via ipfs get it might be nice to do things like check the network of people who give data away for free and fall back on asking an external API to find you the CID (e.g. a local Filecoin light client that utilizes the retrieval market). This allows users to work with IPFS itself and integrate other systems (Filecoin, centralized pinning services, custom content resolution services, etc.) as they need.
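
A minimal sketch of that "free first, then fall back" behaviour. The sources are plain functions that return the data or `None`; the names `get_with_fallback`, `free_network` and `paid_retrieval` are hypothetical stand-ins for the public IPFS network and a local Filecoin light client.

```python
from typing import Callable, Iterable, Optional, Tuple

# A source takes a CID and returns the bytes, or None if it cannot serve it.
Source = Callable[[str], Optional[bytes]]


def get_with_fallback(cid: str,
                      sources: Iterable[Tuple[str, Source]]) -> Tuple[str, bytes]:
    # Try each source in order and report which one answered.
    for name, source in sources:
        data = source(cid)
        if data is not None:
            return name, data
    raise LookupError(f"no source could provide {cid}")


# Toy stand-ins: dict lookups instead of real network calls.
free_network = {"bafyfree": b"hello"}.get
paid_retrieval = {"bafyfree": b"hello", "bafypaid": b"world"}.get

sources = [("free", free_network), ("paid", paid_retrieval)]
```

Because the free source is listed first, `get_with_fallback("bafyfree", sources)` never touches the paid one, while `get_with_fallback("bafypaid", sources)` falls through to it.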

@tysonzero
Author

My understanding is that Filecoin has planned to have storage + retrieval markets for quite some time (at least as far back as the 2017 paper).

Then I have to say I'm a tad annoyed that all the discussion and promotion I've seen in the wild have referred to filecoin as "an incentive/persistence layer for ipfs", because that really isn't true without some way to require miners to expose data on ipfs.

but I suspect they'll have to deal with some tricky questions around what happens if a peer with data no longer wants to send it because bandwidth happens to cost more than they were expecting when they made the deal, or more people are requesting the data than they expected.

This doesn't seem fundamentally different than if a peer agreed to store data, but no longer wants to store it because of some unforeseen event. Particularly given how much ipfs caches heavily requested data. Centralized pinning services pretty much all seem to go this route without issue. I agree that it will absolutely affect storage markets and make storage more expensive to build a buffer for this, but I don't think it's all that fundamental a change. Just like now, you'll be punished if you decide to stop serving certain content.

Note: if you're willing to work under a model where you use a reputation system to track storage miner behaviors you can currently look to store with peers that have some system like Powergate running that basically runs Filecoin and IPFS together.

Ok, this seems a lot more promising. If something like filecoin add <foo> --ipfs is supported by filecoin, then I know I personally would be perfectly happy with that. I would expect to pay more for such a request, but I would also expect various people to be able to request the file over ipfs as much as they want without any expectation of further payment.

Sure, you could do this directly from the Filecoin light client instead of IPFS, but when thinking about other behaviors like retrieval via ipfs get it might be nice to do things like check the network of people who give data away for free and fall back on asking an external API to find you the CID

For our specific case I'm totally fine with a thoroughly bifurcated workflow. Our developers would be using filecoin and have plenty of funds available to pull from. Our end users will be using ipfs (via gateway or js-ipfs or what have you) and will likely have no idea what filecoin (or ipfs) are and will definitely not have funds available to pull from.

@bonedaddy

A large amount of advertising and discussion I have seen around filecoin has referred to it as "an incentive system for IPFS". I have now learned that this is not the case, as it is not built on top of IPFS. This is what I am rather disappointed in as it makes it unviable for our business. I also feel that I am not alone here, as almost everyone on that reddit thread was equally surprised by this revelation.

💯

My understanding is that Filecoin has planned to have storage + retrieval markets for quite some time (at least as far back as the 2017 paper). Perhaps systems and smart contracts like you propose will emerge over time, but I suspect they'll have to deal with some tricky questions around what happens if a peer with data no longer wants to send it because bandwidth happens to cost more than they were expecting when they made the deal, or more people are requesting the data than they expected.

If this is the case, which it appears to be, it seems a bit disingenuous that for the last 3 years Filecoin has been advertised as an incentivization layer for IPFS, when it is 1000% not the case. As @tysonzero pointed out, literally everyone in that reddit thread was surprised by this fact. Perhaps it's time to start adjusting the marketing around Filecoin, as you can't really be doing a good job marketing when the most advertised selling point of Filecoin is entirely false.

@nghianguyen119

Fully decentralized IPFS pinning services already exist? I was under the impression it was all via centralized vendors.

You can look into Crust Network; I use it for my website, socbay.io. Crust is truly an incentive layer for IPFS.

@totedati

totedati commented Jun 4, 2021

Now I did expect filecoin add <file> to add it to ipfs.

Well, at my level of understanding, we can think of the filecoin layer as a "decentralized cloud" backup of the ipfs layer.

When you want files pinned on ipfs to persist for a month, a year and so on, and want to pay money for that guaranteed persistence, you can use filecoin to save your pins and to retrieve your pins, i.e. mirroring from the filecoin layer to the ipfs layer.

But these save and retrieve operations, the seal and unseal transactions in the filecoin layer, are not something that will happen in seconds! You will have filecoin miners who will do that in ... hours! It depends on how big the chunk of data to be processed is, the market and availability of filecoin miners ready to process your "bid to market" order, the network speed and so on. That is not what you want when you request a "persistent" CID URI in your web3 app or web3 blog page, so this cloud or on-premises filecoin backup always needs a hot cache already mirrored in a running ipfs service!

Making granular "filecoin add file" transactions for every tiny file looks like overkill for the filecoin layer today. That's why transactions in the filecoin market are made in the gigabyte and terabyte range of sectors sealed and unsealed per transaction. Something like the Amazon S3 Glacier service.

So, you can't use filecoin layer alone for this web3 interplanetary filesystem! Damn!

Filecoin is an incentive layer on top of the ipfs layer, but as a long-term backup incentive for ipfs-pinned content intended for archiving and long-term availability, not as a monetized hot cache of ipfs! Can filecoin in time become competitive enough to replace the Amazon S3 Glacier service? We'll see!
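
A minimal sketch of that hot cache plus cold backup arrangement, assuming the hot layer stands for a running ipfs service and the cold layer for a filecoin deal. `TieredStore` is a hypothetical name and both layers are plain dictionaries; real sealing and unsealing are not modelled.

```python
from typing import Dict


class TieredStore:
    def __init__(self) -> None:
        self.hot: Dict[str, bytes] = {}   # fast layer, e.g. an ipfs node
        self.cold: Dict[str, bytes] = {}  # slow archive, e.g. a filecoin deal
        self.cold_reads = 0               # counts the slow, costly path

    def put(self, cid: str, data: bytes) -> None:
        # Write to both layers so the archive always has a copy.
        self.hot[cid] = data
        self.cold[cid] = data

    def evict(self, cid: str) -> None:
        # The hot layer may drop data at any time; the archive keeps it.
        self.hot.pop(cid, None)

    def get(self, cid: str) -> bytes:
        if cid in self.hot:
            return self.hot[cid]
        # Slow path: restore from the archive and re-populate the cache.
        self.cold_reads += 1
        data = self.cold[cid]
        self.hot[cid] = data
        return data
```

Readers only pay the slow path once after an eviction; every later `get` is served from the hot layer again.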
