
I work on some BitTorrent software and while it's a really cool protocol, it isn't designed to sequentially stream data. Some clients support streaming, but the act of prioritizing sequential chunks of data rather than chunks that are most likely to be unavailable in the future is bad behavior for the collective group of peers.

I haven't personally given much thought to solving the problem of streaming, but I am surprised that the WebTorrent FAQ doesn't mention why they didn't take this opportunity to design a protocol that has more suitable trade-offs than BitTorrent. I'm getting mixed messaging; is their goal to connect the BitTorrent network with WebRTC or enable high quality P2P streaming via WebRTC?




Hi, creator of WebTorrent here.

> [BitTorrent] isn't designed to sequentially stream data

We’re working on improving the algorithm to switch back to a rarest-first strategy when there is not a high-priority need for specific pieces. In other words, when sufficient video is buffered, there’s no need to deviate from the normal piece selection algorithm.
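To make the "sequential while the buffer is short, rarest-first otherwise" idea concrete, here is a minimal Python sketch. It is not WebTorrent's actual code; the function name `pick_piece`, its parameters, and the notion of a fixed `buffer_target` window are all assumptions made up for illustration.

```python
from collections import Counter


def pick_piece(num_pieces, have, peer_bitfields, playhead, buffer_target):
    """Return the index of the next piece to request, or None if nothing is available.

    have            set of piece indices we already hold
    peer_bitfields  list of sets, one per connected peer
    playhead        index of the piece playback is currently at
    buffer_target   number of pieces from the playhead we want buffered (hypothetical knob)
    """
    # Count how many connected peers advertise each piece.
    availability = Counter(i for bits in peer_bitfields for i in bits)
    wanted = [i for i in range(num_pieces) if i not in have and availability[i] > 0]
    if not wanted:
        return None

    # High-priority window: pieces playback will need soonest, fetched in order.
    window = [i for i in wanted if playhead <= i < playhead + buffer_target]
    if window:
        return min(window)

    # Buffer is full: fall back to rarest-first, lowest index breaks ties.
    return min(wanted, key=lambda i: (availability[i], i))


peers = [set(range(8)), {0, 1, 2, 3}, {0, 1, 2, 3, 6}]
print(pick_piece(8, {0, 1}, peers, playhead=0, buffer_target=3))     # 2 (fills the buffer)
print(pick_piece(8, {0, 1, 2}, peers, playhead=0, buffer_target=3))  # 4 (rarest, held by one peer)
```

A real client would also track in-flight requests and move the playhead as the video plays; both are left out here to keep the sketch short.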

But the fact is that with the speed of today’s internet connections, the user is going to finish fully downloading the torrent in a fraction of the time it takes to view it, so they will still spend more time seeding than downloading.

In practice, the only time that the rarest-first algorithm is important is on poorly-seeded torrents, or in the first few hours of a torrent being published when the ratio of seeders to leechers is really bad. I plan to keep improving the piece selection algorithm so that WebTorrent can be a good citizen.

Also: you should note that not all WebTorrent users stream sequentially. That's just one option for downloading the data.

Also: It's noteworthy that BitTorrent Inc.'s official torrent client (as well as the largest player by marketshare), uTorrent, offers sequential downloading, as well as selective file downloading. And the BitTorrent network remains very healthy.

> why they didn't take this opportunity to design a protocol that has more suitable trade-offs than BitTorrent

BitTorrent is the most successful, most widely-deployed P2P protocol in existence. It works really well. My goal with WebTorrent was to bring BitTorrent to the web in a way that interoperates with the existing torrent network.

Re-inventing the protocol would have made WebTorrent fundamentally incompatible with existing clients and prevented adoption. The way we've done it is better. The wire protocol is exactly the same, but there's now a new way to connect to peers: WebRTC, in addition to the existing TCP and uTP.

Also, re-inventing the protocol is a huge rabbit hole. There was already a lot of risk when I started the project -- will WebRTC get adopted by all the browser vendors? Will data channel stabilize and be performant? Is JavaScript fast enough to re-package MP4 videos on-the-fly for streaming playback with the MediaSource API? My thinking was: Why add inventing a new wire protocol and several algorithms to the table?

Thanks for your thoughtful comment. Hope you'll give WebTorrent and our new desktop app, WebTorrent Desktop a try!


Great work. You brought your product to market! Don't mull over what you didn't do.


> Also: It's noteworthy that BitTorrent Inc.'s official torrent client (as well as the largest player by marketshare), uTorrent, offers sequential downloading

It's worth mentioning that the option is in a hidden menu.


> We’re working on improving the algorithm

That sounds like you prioritized implementing streaming first over being a good citizen.

> Also: It's noteworthy that BitTorrent Inc.'s official torrent client (as well as the largest player by marketshare), uTorrent, offers sequential downloading

To my knowledge that is only available if the swarm condition allows and is not purely sequential. But that is second-hand knowledge, so I may be wrong. But either way, the default is rarest-first.


> That sounds like you prioritized implementing streaming first over being a good citizen.

I read that as "it sounds like you prioritized getting a working proof-of-concept first over working out the long-term details".


But you don't need streaming for a bittorrent-over-webrtc PoC.

And those "long term details" are implemented by all bittorrent clients, so they're hardly something novel that needs figuring out.


There's a difference between an industry-wide proof of concept and a personal one. If I were making a text editor, I would begin with a proof of concept showing that I could build the features I wanted in the way I wanted to build them -- it wouldn't help much to say "emacs has that feature so that's fine".

These long term details are implemented by all established bittorrent clients. I would bet that version 0.1alpha of many of them did not implement them, but was rather in a state of "holy moly this works! I should go show HN".


WebTorrent is 2 years old and seems to have several active contributors. Do you really think the "0.1 prototype" argument applies here?

Not to mention we're not talking about some optional, nice-to-have feature here, we're talking about a core aspect of bittorrent which gives it robustness.

Also, you forgot to address my other argument.


Yep! I didn't address your other argument because I accept it and there's nothing about it I disagree with. I don't disagree with any of what you just said, either.

I just wanted to point out that "streaming over web-torrents" is the feature being demo'd here, which means that (a) it's a new feature (I assume?) to this project / these developers, and (b) it's clearly something they feel is a nice-to-have feature, because they not only chose to spend time making it, but also announced to HN when they had a working PoC. If people never posted something to HN until they were "100% complete", I think this place would be a lot less interesting than it is.


WebTorrent supports both sequential and rarest-first downloads. Just pass the right option to the webtorrent library. The client.add() method takes a 'strategy' option that can be set to 'sequential' or 'rarest'.

Feel free to open a GitHub issue if you have suggestions for how we can do better.


So in principle the library supports it, but instant.io always passes sequential, i.e. defaults to disabling rarest-first, correct?


What kind of trade-offs?

Genuinely interested, I know the basics of the torrent protocol, and I don't understand why the torrent protocol wouldn't work for streaming... I mean, you would just need to request the packets in order instead of randomly.

It would be less efficient, sure, but it would work.


> but it would work.

Downloaders are only incentivized to give back data until they are done. Seeders are not really incentivized at all, so they can go away at any moment.

So if everyone downloads sequentially and seeders go away, then you can end up getting stuck in a situation where everyone has the beginning of a file but nobody has the last parts.

When you're streaming, the protocol is not robust.

With random order you have a robust protocol that just degrades in throughput if people are selfish.

And with a webpage-based service, being selfish is as easy as closing a tab.
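A toy Python simulation of that failure mode, for anyone who wants to play with the numbers. It is a sketch under strong assumptions: every leecher has fetched exactly `pieces_each` pieces when the only seeder disappears, and uniform random order stands in for rarest-first. The function name and parameters are hypothetical.

```python
import random


def surviving_pieces(num_pieces, num_peers, pieces_each, sequential, seed=0):
    """Count the distinct pieces left in the swarm once the only seeder leaves."""
    rng = random.Random(seed)
    held = set()
    for _ in range(num_peers):
        if sequential:
            got = range(pieces_each)  # every peer holds the same prefix of the file
        else:
            got = rng.sample(range(num_pieces), pieces_each)  # random stand-in for rarest-first
        held.update(got)
    return len(held)


seq = surviving_pieces(100, 20, 30, sequential=True)
rnd = surviving_pieces(100, 20, 30, sequential=False)
print(seq, rnd)  # seq is always 30: pieces 30..99 are gone for good
```

With sequential order the swarm never holds more than `pieces_each` distinct pieces no matter how many peers join, so nobody can finish. With random order the union can only be as large or larger, and with 20 peers it usually covers nearly the whole file.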


You can choose to not seed the second you begin downloading files with most torrent clients (at least that I've used), so the argument that torrenting ensures people share while their download is underway is weak.

It's proof that seeders seed for the sake of seeding for the entire system to work.


Exactly. In practice, modern torrent swarms have such an over-abundance of seeders that there is ample bandwidth for everyone. Consequently, the famed BitTorrent tit-for-tat algorithm as well as the rarest-first piece selection strategy become less important.

Update: It's also noteworthy that BitTorrent Inc.'s official torrent client (as well as the largest player by marketshare), uTorrent, offers sequential downloading, as well as selective file downloading. And the BitTorrent network remains very healthy.


None of that addresses what I have said. I'm arguing about robustness in the absence of seeders.

Seeders are not abundant in all swarms. And nothing in the protocol guarantees their existence, and thus they do not contribute to intrinsic robustness.

Also, torrent clients that run in the background and consume few resources are hardly comparable to things that run in a browser tab; user behavior will differ.


And for popular torrents, torrents with a lot of seeders who have at least okay bandwidth, it shouldn't matter. If a file can be downloaded in less time than it takes to use / consume / watch / listen to that file then there's space for innovation.


Am I missing something or is this thing streaming?


Weird, a friend of mine used to stream with utorrent years ago without problems. Maybe one pause of a few seconds for a whole TV episode.

The problem is not the protocol but the lack of features like subtitles. And torrent contents not being standard. All this can be easily scripted to cover most cases.


The problems are not for the individual user, but for the rest of the group. Downloading blocks in sequential order guarantees that the last blocks will be rarer on average, which is bad for collective performance.


While this is strictly speaking true, considering today's fast bandwidth, I don't think this is that big an issue. I am using WebTorrent's desktop torrent client, which has a streaming mode, and it usually goes like this: download a magnet, start watching (in streaming mode), the download of the file finishes within 5 minutes, and I keep seeding the full file for the rest of the duration of the movie. Usually, my ratios are bigger than with a non-streaming torrent client. Kudos to feross for his great work!
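A back-of-the-envelope version of that in Python. The file size, link speed, and runtime are made-up illustrative numbers (MB meaning 10^6 bytes), and the sketch assumes a constant download speed and a viewer who keeps the client open for the whole runtime.

```python
def seeding_share(file_mb, link_mbps, runtime_min):
    """Return (download time in seconds, fraction of the runtime spent holding the full file)."""
    download_s = file_mb * 8 / link_mbps  # megabytes -> megabits, then divide by Mbit/s
    runtime_s = runtime_min * 60
    return download_s, max(0.0, 1 - download_s / runtime_s)


download_s, share = seeding_share(file_mb=1500, link_mbps=40, runtime_min=120)
print(download_s, round(share, 3))  # 300.0 0.958
```

So a 1.5 GB movie on a 40 Mbit/s link is complete after 5 minutes, and the viewer holds the full file for roughly 96% of a two-hour runtime.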


It doesn't have to be that bad. Users are likely to have more bandwidth available than is needed to stream the file (if they didn't, streaming would never work).

Unlike simple HTTP streaming, the clients can use this spare bandwidth to download some blocks from the end of the file, even when streaming from near the start. So a sensible torrent streamer can still ensure that later blocks are not too rare.
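One way to sketch that split in Python: give playback only the share of requests it needs to keep up with the video bitrate, and spend the rest elsewhere in the file. The function name, the per-round request model, and the example rates are all assumptions for illustration, not how any particular client does it.

```python
import math


def split_requests(download_kbps, bitrate_kbps, requests_per_round):
    """Return (requests for upcoming playback pieces, requests free for the rest of the file)."""
    if download_kbps <= bitrate_kbps:
        return requests_per_round, 0  # no spare bandwidth, everything goes to playback
    # Playback needs bitrate/download of the link; round up so the buffer never starves.
    playback = math.ceil(requests_per_round * bitrate_kbps / download_kbps)
    return playback, requests_per_round - playback


print(split_requests(20000, 5000, 8))  # (2, 6)
print(split_requests(4000, 5000, 8))   # (8, 0)
```

On a link four times faster than the video bitrate, six out of every eight requests are free to go to late, rare, or random pieces.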


I take your point. It's certainly not the end of the world, but it is sub-optimal.


The increase in seeds from a more ergonomic protocol offsets the downside. And if the population holds mostly the early blocks while most of the population also wants the early blocks, there's plenty of room for leech-to-leech sharing.


Guys guys hear me out ... the solution is to use "middle in" (not to be confused with middle out from Pied Piper).

If you download from front and back simultaneously only the middle blocks would be scarce and you still get at least half your original streaming rate. And no weirdo will stop watching midway through a movie.


That doesn't solve the problem, it just moves it around. A reasonable compromise would be to use any "spare" bandwidth to fetch random blocks.


And unfortunately, that's not what happens. Clients instead linearly download an hour's worth of content.

That being said, though, do streaming clients make much of a contribution to the amount of seeded data?


If somebody was a genius, they'd write a bittorrent content delivery software that arbitrages those pauses with preloaded ads.


And then someone would come along a few days later with a client that doesn't display those ads.


You'd still have to wait for the first block of data (piece in BitTorrent lingo) to complete.


I'd rather have a buffering symbol than an ad.


Nice one! Plus you would not have to download the ad.


No!


The BitX program seems to handle subtitles outside of the container and protocol (some webservice I guess).


Most of the webtorrent streaming demos I've seen have a webseed fallback, so it's not really that important unless you're doing something completely distributed.


BitTorrent streams just dandy if you request/prioritize packets in order rather than random order. Most clients already have this capability.


It works fine and is even supported in the official BitTorrent clients nowadays but it's correct that BitTorrent wasn't designed to stream and streaming was controversial in early BitTorrent.

Clients are meant to prioritize the rarest pieces in the swarm first, with some randomization thrown in. Downloading sequentially is bad for the swarm's health, but as it turns out it doesn't seem to be bad enough that the protocol can't handle it, at least as long as there's a mix of streaming and classic clients, or streaming clients use a combination of classic and streaming behavior.
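For readers who haven't seen it spelled out, rarest-first with random tie-breaking looks roughly like this in Python. It is a simplified sketch, not any client's real implementation: the function name is made up, availability is counted only over currently connected peers, and real clients add extra rules (random first piece, endgame mode) that are omitted here.

```python
import random
from collections import Counter


def rarest_first(have, peer_bitfields, rng=random):
    """Pick a missing piece held by the fewest connected peers; break ties at random."""
    availability = Counter(i for bits in peer_bitfields for i in bits)
    candidates = [i for i in availability if i not in have]
    if not candidates:
        return None  # nothing left that any peer can give us
    rarest = min(availability[i] for i in candidates)
    return rng.choice([i for i in candidates if availability[i] == rarest])


peers = [{0, 1, 2}, {0, 1}, {0, 3}]
print(rarest_first({0}, peers))  # 2 or 3, each held by a single peer
```

The random tie-break matters: without it, every peer that sees the same availability counts would ask for the same piece at the same time.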


But it's sometimes very bad for the network. There are a few papers out there showing that if too many people stream rather than download in random order, performance can suffer.


Is this necessarily true if all of the connected peers are also streaming? In my understanding, Bittorrent optimizes for the most in-demand chunks, and in a streaming context those would be the ones at the beginning of the video.


It's actually the other way around. BitTorrent clients should prioritize the rarest pieces; there isn't really a notion of what the most in-demand pieces are.

If you were to combine streaming with a BitTorrent protocol that prioritized the most in-demand pieces you would probably not be able to watch most videos to the end without pauses to buffer or maybe even at all.

By prioritizing the rarest pieces, even if all seeds leave the swarm, there's a much better chance that the remaining peers combined still have all the pieces and the torrent can still be finished.



