I've had extremely bad luck running go-ipfs at any scale. GC is braindead (it literally deletes the whole cache on a timer), networking is slow and drops tons of packets (apparently too much UDP?), and by default it stores each object 2 or 3 times. I'm sure it'll work fine for people using http://dweb.app, and go-ipfs will probably work okay for super casual browsing, but as soon as someone tries to download any substantial IPFS dataset, expect to run into lots of resource limits.
Yup, I had a (tiny, spinning rust) home server that was slowed to nearly a halt. SSH logins would take minutes even when limiting IPFS to 2GiB of the 16GiB of RAM. Stopped go-ipfs and it was instantly snappy again.
My impression of the IPFS project is that the goals are excellent and the core protocol is quite good, but they rewrite the higher-level layers far too frequently (for example, they have deprecated UnixFS, which seems to be the most-used format, and they keep switching between JSON, Protocol Buffers, and CBOR), and go-ipfs seems to be a pretty garbage codebase.
Any chance that, by limiting RAM usage, you forced your application to heavily swap, clogging the disk and making your machine slow?
I have run a public gateway on 2GB of RAM. Later 4GB because it was subject to very heavy abuse, but it was perfectly possible. Perhaps it is a matter of knowing how to configure things and how to not inflict self-pain with wrong settings.
Yes, there is definitely a chance, but it was way worse when I gave it more RAM or unlimited RAM. At least this way the machine was operational most of the time. I don't think it was swapping, but since the limit I applied also affected the page cache, it was likely reading its data from disk a lot more often than it would have if it could own the whole page cache. But this is basically the same effect as swapping.
Maybe there is a Goldilocks value I could find, but I didn't really need IPFS running that much so I just removed it.
It does not delete the whole cache on a timer. It will delete blocks that are not pinned periodically or when it reaches a high-water mark. It does not store each object 2 or 3 times. First, it doesn't refer to anything as an object but rather blocks and a block is only stored once. It will only be replicated if you're running a cluster in which case replication is the point.
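For what it's worth, here's a rough sketch of what that GC amounts to conceptually (the types and names below are hypothetical stand-ins, not go-ipfs's actual API): compute the set of blocks reachable from the pins, then delete everything else. The periodic/high-water-mark trigger you describe is, if I remember the config correctly, driven by the Datastore.GCPeriod and Datastore.StorageGCWatermark settings.

```go
// Package ipfsgc is a conceptual sketch of unpinned-block GC, in the
// spirit of what go-ipfs does. Blockstore and CID are made-up stand-ins.
package ipfsgc

type CID string

type Blockstore interface {
	AllKeys() []CID     // every block stored locally
	Delete(c CID) error // remove a block from disk
}

// collectGarbage deletes every block that is not reachable from the pin
// set. "pinned" is assumed to already include everything under
// recursively pinned roots.
func collectGarbage(bs Blockstore, pinned map[CID]bool) error {
	for _, c := range bs.AllKeys() {
		if pinned[c] {
			continue // pinned (directly or via a recursive pin): keep it
		}
		if err := bs.Delete(c); err != nil {
			return err
		}
	}
	return nil
}
```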
I don't really know anything about the Golang GC, but I would not be surprised if the process of scanning for unpinned blocks results in a lot of memory accesses. If too many useful things get evicted from the cache during that process, then I can see why GP is saying it deletes the whole cache.
Why would you ever start anything with, "I don't really know anything about the Golang GC but..." IPFS GC is separate from the Golang GC. IPFS GC deletes blocks on the local node that are not pinned. I'm not sure what you mean by "too many useful things". If it's not pinned it's evicted.
Golang has only one tunable GC parameter (GOGC) by design, so it may not offer enough control for certain workloads, but I learned that keeping a ballast allocation in RAM fixes the too-frequent GC sweeps.
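For reference, the ballast trick just keeps a large, never-touched allocation live so the heap size that GOGC's percentage applies to is bigger and collections fire less often. A minimal sketch (runApp is a hypothetical placeholder for the real workload):

```go
package main

import "runtime"

func main() {
	// Allocate a 1 GiB "ballast". It is never written to, so the pages
	// are typically not backed by physical memory, but it inflates the
	// live heap that GOGC's percentage is applied to, so GC cycles run
	// far less frequently.
	ballast := make([]byte, 1<<30)

	runApp() // the real application work goes here (hypothetical)

	// Keep the ballast reachable for the lifetime of the process so the
	// GC cannot collect it.
	runtime.KeepAlive(ballast)
}

func runApp() {}
```

(Go 1.19's GOMEMLIMIT addresses much of the same problem without needing a ballast.)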
If you have any specific complaints or experiences you'd like to share, I would be interested in hearing about them, but "it's a bit of a trash fire" is unhelpful.
It was slow and buggy when it first released, which was understandable, so I waited a few years and tried again recently now that it's popular, and it's still an impractical proof of concept.
Yes, that has roughly been my experience. The application throughput offered by IPFS is quite low while the packet throughput is very high. I was seeing 5-15% packet loss over my internet connection while running IPFS. I'm not sure whether a bandwidth limit would even help, or whether it's related to the number of connections.
There are different profiles you can select from. You might have the server profile enabled. Power save probably consumes the least, and you can opt out of sharing altogether. But the entire point is p2p.
I used to be very gung ho on IPFS, until I learned that the content ID does not depend solely on the content of the file. When one puts a file into the system one can choose different hashing algorithms, which will obviously cause the content ID to be different. However, even when using the same algorithm, the content ID will change depending on how the file is chunked.
I would expect any sane system to consistently produce the same hash/content ID for a file. I can see that if the system is moving from SHA2 to SHA3, content could end up stored twice. I don't know whether they have changed things so that a consistent content ID will be produced or not.
The content ID is not the hash of the content, it is the hash of the root of the Merkle DAG that carries the content.
Doing it like that has many advantages, like being able to verify hashes as small blocks are downloaded and not after downloading a huge file. Being able to de-duplicate data, being able to represent files, folders and any type of linked content-addressed data structure.
As long as your content is under 4MiB you can opt out of all this and have a content ID that is exactly the hash of the content.
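To make that concrete, here's a small sketch using the go-cid and go-multihash libraries: for a single raw block, the CID is just the sha2-256 multihash of the bytes wrapped in a CIDv1 with the "raw" codec, so the same bytes always yield the same CID. (This should roughly match what `ipfs add --raw-leaves --cid-version=1` produces for content small enough to fit in one block, but treat that equivalence as my assumption.)

```go
package main

import (
	"fmt"

	cid "github.com/ipfs/go-cid"
	mh "github.com/multiformats/go-multihash"
)

func main() {
	data := []byte("hello ipfs")

	// Hash the raw bytes with sha2-256 (-1 = default digest length).
	digest, err := mh.Sum(data, mh.SHA2_256, -1)
	if err != nil {
		panic(err)
	}

	// Wrap the multihash in a CIDv1 with the "raw" codec: no chunking,
	// no DAG wrapper, so the CID depends only on the bytes themselves.
	c := cid.NewCidV1(cid.Raw, digest)
	fmt.Println(c) // deterministic for this input
}
```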
As I just replied to "cle", there are some disadvantages to doing it the way that it is, because one can't predict what content ID will be produced. Perhaps having the hash of the entire contents of the file point to what is currently the content ID would solve this issue. To me, IPFS does not seem useful unless this issue is solved. Also, multiple hashes (from different algorithms) of the file could point to the content ID/Merkle DAG; so if both SHA2 and SHA3 were used and one of them had a security issue, then just use the one that is OK.
Not sure that I follow what you are asking. I would expect that if sha2-256 is used, then the content ID would be the same. However, depending on how the content is chunked, the content ID will change. Two disadvantages that I see:
1. If new packages are produced for an open-source release, could I check whether there is a copy available via IPFS? No, because one can't predict how it would be chunked (see the toy sketch after this list). So one would have to download the package and then derive a content ID, and one can only tell whether it is available if the same chunking algorithm was used.
2. if I want to push a package or other binary, can I figure out if it is already available via IPFS? No, one can't.
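A toy illustration of why (this is a deliberately simplified Merkle construction, not the real UnixFS/DAG-PB encoding or the real IPFS chunkers): the root hash is computed over the chunk hashes, so the same bytes split at different boundaries produce different roots.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// merkleRoot splits data into fixed-size chunks, hashes each chunk, and
// then hashes the concatenated chunk hashes. A toy stand-in for IPFS's
// DAG building, but it shows the same property: the root depends on the
// chunking, not just the bytes.
func merkleRoot(data []byte, chunkSize int) [32]byte {
	var leaves []byte
	for i := 0; i < len(data); i += chunkSize {
		end := i + chunkSize
		if end > len(data) {
			end = len(data)
		}
		h := sha256.Sum256(data[i:end])
		leaves = append(leaves, h[:]...)
	}
	return sha256.Sum256(leaves)
}

func main() {
	data := make([]byte, 1<<20) // 1 MiB of zeros, standing in for a file

	a := merkleRoot(data, 256*1024)
	b := merkleRoot(data, 512*1024)
	fmt.Printf("256 KiB chunks: %x\n", a)
	fmt.Printf("512 KiB chunks: %x\n", b) // different root, same bytes
}
```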
How do you handle conflicts where two events occur concurrently? Who wins? I know timestamps are not reliable, but I want last-write-wins behaviour and seamless merging. The paper leaves data-layer conflict resolution to the reader. It does suggest sorting by CID. I added a timestamp field for conflict resolution.
After reading the "Merkle-DAGs meet CRDTs" whitepaper, I took a stab at implementing a MerkleClock. It's incomplete; I still need to maintain the partial order of "occurs before".
In https://github.com/ipfs/go-ds-crdt, every node in the Merkle DAG has a "Priority" field. When adding a new head, this is set to (maximum of the priorities of the children)+1.
Thus, this priority represents the current depth (or height) of the DAG at each node. It is sort of a timestamp, and you could use an actual timestamp, or whatever helps you sort. In the case of concurrent writes, the write with the highest priority wins. If we have concurrent writes of the same priority, then things are sorted by CID.
The idea here is that in general, a node that is lagging behind or not syncing would have a dag with less depth, therefore its writes would have less priority when they conflict with writes from others that have built deeper DAGs. But this is after all an implementation choice, and the fact that a DAG is deeper does not mean that the last write on a key happened "later".
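A minimal sketch of that comparison rule, assuming each write carries the priority (DAG height) described above plus the CID of the node that introduced it. The field and function names are made up, but the rule is the one described: higher priority wins, and ties are broken by comparing CIDs.

```go
package crdt

// Write is a hypothetical stand-in for a value set on a key by some node,
// carrying the DAG height ("priority") at which it was added and the CID
// of the DAG node that introduced it.
type Write struct {
	Priority uint64 // depth/height of the Merkle DAG at this node
	CID      string // content ID, used as a deterministic tiebreaker
	Value    []byte
}

// wins reports whether write a beats write b for the same key: the write
// from the deeper DAG wins, and concurrent writes at the same depth are
// ordered deterministically by comparing CIDs.
func wins(a, b Write) bool {
	if a.Priority != b.Priority {
		return a.Priority > b.Priority
	}
	return a.CID > b.CID
}
```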
I thought of using user indexes ordered by connection order, or user last-active time, but that only works if you're not worried about the security implications of wall clocks and time skew.
The Hyperhyperspace project has a "previous" field on its CRDT operations and can issue undo events to reverse operations.
I suspect you could have a time-validator service: a node that issues revocations if timestamps are in the future. It wouldn't be on the read or write path; it's just there for that validation.
It's a little sad that Firefox isn't the first mobile browser to receive and experiment with new tech like IPFS. I do wonder whether they solved the privacy issues with IPFS before they put it into Brave.
IPFS is probably the best contender for Web3 right now and I hope it'll see more use before the crypto bros take over the term completely
They could ship IPFS/Dat/I2P/Tor natively in Firefox right now, without any requirement to run external software, but they choose not to. Instead, we get limited support for IPFS from a desktop-only add-on that simply interfaces with an IPFS service already running on the host machine.
Take it a step further: Firefox could allow websites to open sockets and toss arbitrary packets around, and chooses not to. If that capability were available, then Javascript could be harnessed to support all sorts of protocols and services. They could even give Javascript access to network access-point monitoring and connectivity management.
Imagine then a single-page app you could share as an attachment through $messageService, with all the stuff built in to create real ad-hoc networks at large gatherings that provide data resiliency against the dropping of nodes. You could have the cellular network shut down, protestors arrested, their phones taken, and the data they gathered would still be retained so long as any node managed to exit the area or the network itself expanded beyond the area of contention.
You have it backwards: stuff like WebSockets is built by design to be incompatible with existing implementations. This is because Javascript code is untrusted/untrustworthy, and we have already had a plethora of attacks due to foreign JS doing nasty things with what little it had. Here are a couple of examples:
> Web extensions should allow you to do normal sockets
Not since 2017 or whenever it was that Firefox dropped XUL extensions and replaced them with WebExtensions. The legacy XUL extensions could do much, much more and there was correspondingly much, much more malware in browser extensions.
The problem with that is that regular people (not super-techies) have a much better chance of understanding the implications of agreeing to microphone and webcam use than something called "socket access" - or any other more friendly term that tries to explain what's going on, because it's such a long way away from the level of abstraction that they are likely to understand.
Also not knowing if disabling it will break the page, something even technically inclined people can't know ahead of time. It's not like push notifications where you would have to try hard to build pages that could break without the feature enabled. I could easily see people abusing this to serve pages over alternate protocols and making people expect to need to click "allow."
That makes sense. Google wants users to be easily identified and tracked; otherwise their primary revenue model, surveillance capitalism, would be under threat.
>They could ship IPFS/Dat/I2P/Tor natively in Firefox right now
A bit of a tangent, but I really cannot stress enough that if you're using Tor to be private/anonymous, you should never use anything other than the official Tor Browser; otherwise you will stand out like a sore thumb.
This still requires external software to operate, and isn't available on mobile. It's effectively dead in the water because it isn't available to use by default, without additional configuration.
I'd argue this is worse than doing nothing. This gave Firefox the ability to say they care, and yet not deliver something meaningful.
I'm working on a collaborative photogrammetry solution (think async/distributed 3d mapping from overlapping pictures) that shares data via IPFS.
Flattering myself heavily, I believe this sort of public-data-consuming application fits IPFS like nothing else.
Your collaborative photogrammetry can be combined with the open and free species-identification API, my custom OpenStreetMap data extensions, and KartaView/OpenStreetCam/OpenStreetView to get more photogrammetry location integration and more free, crowdsourced open data to add to the photogrammetry. A demo of Seadragon/Photosynth [1] inspired me to work on this.
With pleasure, drop me a mail and I'll get back to you next week (last three letters of my username here @ rest of my username dot artificial intelligence).
I haven't put anything online yet though, sorry!
Firefox had its time; it is basically on life support, 80% dependent on Google's money, and almost always last to support features like this. Brave seems to have made Firefox obsolete.
As for IPFS, the crypto bros already seem to be winning at taking over that term and melding it into a layer of Web3, just like they did with 'crypto'; it's too late now and that ship has long sailed.
Perhaps they are winning because they keep building stuff like this [0], and existing companies are already jumping on board with the term? [1][2]
Brave is using IPFS for file storage, but once the content address (CID) is known, anyone can access the file you're looking for. So it remains to be seen how they will leverage IPFS to create scarcity of digital items for their merchandise store; that is a step backward and not what IPFS's goals were. A huge number of books are currently on IPFS through libgen, and Sci-Hub is going to IPFS eventually. Web3 is just a step back from the greatness that the internet could be, with "decentralized" oracles (3 mining pools control Eth) and centralized front-facing websites simply verifying some hash of something.
Again, people use these meme terms without understanding. If I pin a file on IPFS, share the link and decide to delete it tomorrow because I don't like it anymore, that file is then unavailable and anyone trying to retrieve it gets an error. That's because IPFS isn't file storage. It's not Web3. It's not a blockchain. It's decentralized but not everything that's decentralized is an archive.
Web 3.0 has been used for a long time to mean any P2P/distributed/... approach, not just blockchain, even if the blockchain people try to completely take over the term sometimes.
Web3 has also been used to describe web pages designed for easy parsing.
Reader view, tools for the visually impaired, and browser automation are actually useful and commonly used, so that definition wins the title for me.
There are certainly useful distributed web tools (e.g. email, TOR, IRC, Matrix, self-hosting, bittorrent), but they're the opposite of recent trends towards monopoly.
The distributed meaning is absolutely poisoned by blockchain at this point.
Even blockchains are less monopolistic than web 2.0 (Google, Facebook, Amazon, etc). At least blockchains are powered by (largely) independent users, instead of a single corporate entity. But I still prefer further decentralized technologies like email or bittorrent
Web3 has a storage layer, a messaging layer, and an execution layer. Most popular Web3 apps use Ethereum for execution, IPFS for storage, and some custom websocket garbage for messaging, but there are many viable Web3 stacks out there that people are using.
What actually defines web3 software? Is sending emails with .exe attachments considered web3?
Like, if we compare this to RESTful servers, there's no set definition but nearly everyone agrees it's verbs and paths over a hierarchical API sending JSON back and forth over HTTP[S].
It seems like most people can't agree on anything except using Ethereum as a backbone.
So calling something web3 doesn't seem to do a good job of describing it, the way calling something REST does, or the way a description like yours above does.
This is a common misperception of IPFS. IPFS does not push any content onto your node or force you to host random content. It will only host content you have added or requested to be added to your node.
Why should I (or anyone else) care about IPFS? It's not like I have trouble storing and retrieving data currently. Moreover, every anecdote in this topic seems to be from people who found IPFS unusable in reality.
I don't understand why that would be more useful to me than my current storage and backup system.
I already don't store needless duplicates, and when I decide to store something, I don't have to go back to the source after I have stored it. That's the whole point.
How is IPFS helpful?
Has IPFS made anything happen that couldn't have happened without it?
What are its biggest accomplishments, outside of its own spread?
I've been wondering this as well. In IPFS, isn't -everything- cross-domain by definition? Each publication of a document would have a new identity, and I'm not sure how you'd shoehorn that into something like CORS. You'd probably need something interactive, akin to OCSP vs CRLs.