> And don’t tell me about Bytes crate—it should not be a separate crate
I'd be interested to hear the author's reasoning behind this, if it does what they want then why not use it? It's small and well written, so I don't think vetting it should be a problem.
The rest of the article seems quite sensible, that comment just strikes me as a little odd.
There is a real change in mentality you have to go through if you transition from a fairly strict C/C++/even Java background to trying out Rust. In the former languages, adding dependencies rapidly becomes a painful experience, whereas Rust does much better dependency management and automatic building than even Python (where you need a requirements file or something similar to go pull down all the deps).
With Rust, you really should just use crates. The std is meant to be limited to just the most used code and that which should not change for the sake of keeping the ecosystem stable.
Off topic, but it shouldn't be better than "even Python", because Python has a really, really broken dependency system. Far more so than Java, which has Maven/Gradle which are both infinitely better than the pip/virtualenv disaster.
People complain about things like shading in Maven being complicated. What they might not realize is that pip doesn't even try to address conflicting dependencies; it will just silently give you the wrong version! You ask for A==0.2 and it will give you A==0.1 if another dependency asked for A==0.1 first. And it won't even warn you, even though it's straight up broken behavior. Virtualenv makes packaging annoying since it's almost vendoring but not quite. To totally understand the packaging system forces you into the world of eggs, wheels, distutils, conflicting versions, condas, etc.
Sorry for tangent, just thought it was funny you would hold Python up as a standard of dependency excellence when it's probably the worst overall ecosystem of major languages.
Yeah, as someone who's done a deep dive into Python dependency management (mostly because virtualenv has some insane ideas about how to make shell tools) that bit made me laugh.
There are a lot of great things about Python, but dependency management is right there under the GIL on the list of things that are very painful.
My language development went from C -> C++ -> Java -> Python. So when I got there and figured out pip was a thing (or easy_install back in the day) it was a major innovation at the time.
Additionally, for anyone coming from almost all compiled native languages, a native environment like Rust with better dependency management than Python (which, as you say, and in retrospect, is pretty broken) is a bit of a mind screw.
The Python devs I worked with always envied me for npm. I asked them if they don't have something similar with pip, but seems like npm is a whole different level.
Cargo should even be better than npm.
On the other hand I always asked myself if they couldn't simply use Nix?
Haven't used Rust yet, is it kind of similar to Go where any file can simply import and then Go knows how to fetch and build when needed? And then when you remove the calls (e.g. during a refactor) the compiler forces you to remove the imports too, stopping the infinite bloat caused by "no one really knows if we still need this" that can be common in other languages.
I really liked that "dependency as part of the language and build tool".
Of course, Go still had its own dependency issues (versioning, availability, ...) that they now work on, but that was a step in a direction I quite liked.
> is it kind of similar to Go where any file can simply import
You declare your crate dependencies in a Cargo.toml file. Rust does proper versioning of dependencies so having a separate manifest is desirable. Within your code you declare the existence of a crate via `extern crate` and then you can use it wherever.
> the compiler force you to remove the imports too
Do you know if there are plans for rustfmt to auto-import like gofmt does? In the sense that if a crate is available (in the .toml file) and you reference it, rustfmt will automatically insert the required "import" and "use".
If the compiler gives you an error with a suggestion "You probably need to add `extern crate foo;`" (and it already does in some cases), RLS (i.e., your editor plugin) will be able to automatically add it for you (soon™).
There's a separate tool called rustfix in development that can apply the suggestions given by the compiler, so it could theoretically prompt for these.
It's kind of the reverse. You declare versioned dependencies in the toml file, and (soon) Cargo will add them to rustc so they just show up for your source files. Right now you declare them twice, once in the toml and once in the crate root, i.e., you add `url = "1.5"` in the toml, and `extern crate url;` in src/main.rs. There is an RFC coming that will make the `extern crate` part optional.
A current problem with "should just use crates" is that cargo only works with source code dependencies, not binary libraries.
So each new crate added to your project ends up increasing the overall build time, which gets exponentially bad if you happen to add a dependency on a crate that happens to have a big dependency list.
I have a very basic word counting application with a GUI written in Gtk-rs, a fresh build straight out of "git clone" takes a few minutes, mainly thanks to Pango.
I'm not saying it isn't a problem, I just wanted to know if I was correct in thinking that it is only a problem the first time you compile each crate (or rather each version of each crate, I think).
Yes, it is a problem when you do a clean build, or when you have common crates across projects, because cargo doesn't have a concept of a build cache.
You can try to work around it by setting all target directories to the same one via target-dir in your .cargo/config file, but there is no guarantee that the crate won't get rebuilt.
As a developer at a large software company, every dependency that is not part of the language runtime itself is a pain because legal paperwork and evaluation has to be done for each individual component before I can use it / ship it.
NOTE: I am not referring to copying I would be doing, but to the crates that have little dependencies and that should consider including a copy of their little dependency instead, via whatever method is appropriate for licensing.
This makes languages like Python, Go, Perl, etc. preferable over languages/projects like node.js, rust, etc. because (at least I) don't generally end up with tens/hundreds of dependencies since the standard library is rich enough for most work.
Additionally, I personally despise having many small dependencies because it means I have more to think about and manage. Instead of just being able to think about the version of the compiler/standard library I'm using, I have to consider every individual crate.
I'm well aware of why rust chose to make certain tradeoffs with crates, but having numerous pieces of what many of us consider "basic functionality" as a third-party dependency is frustrating. Languages with richer standard libraries have spoiled us all.
I really disagree with that quote. Copying is how you get bugs sticking around in software for all time (for example, doing binary searches in a way that avoids overflow is surprisingly tricky, and the endless copying of naive binary search code is why this bug is so difficult to eradicate). Honestly, that quote is just an excuse to avoid the hard work of making the language ecosystem handle dependencies properly.
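To make the binary search point concrete, here is a minimal sketch of the overflow in question. The function names are made up for illustration, and I'm using `u32` bounds so the overflow is easy to trigger; this isn't taken from any particular library.

```rust
/// Naive midpoint, `(lo + hi) / 2`.
/// Returns `None` when `lo + hi` does not fit in a `u32`, which is exactly
/// the case the commonly copied version gets wrong.
pub fn naive_midpoint(lo: u32, hi: u32) -> Option<u32> {
    lo.checked_add(hi).map(|sum| sum / 2)
}

/// Overflow-safe midpoint. Requires `lo <= hi`, so `hi - lo` cannot
/// underflow and the result never exceeds `hi`.
pub fn safe_midpoint(lo: u32, hi: u32) -> u32 {
    lo + (hi - lo) / 2
}

/// Binary search over the half-open range `[lo, hi)` for the smallest value
/// where `pred` is true. Assumes `pred` is monotone (false..., then true...).
/// Returns `hi` if `pred` is false for the whole range.
pub fn first_true(mut lo: u32, mut hi: u32, pred: impl Fn(u32) -> bool) -> u32 {
    while lo < hi {
        let mid = safe_midpoint(lo, hi);
        if pred(mid) {
            hi = mid; // answer is at `mid` or to its left
        } else {
            lo = mid + 1; // `mid < hi`, so this cannot overflow
        }
    }
    lo
}
```

The only difference between the two midpoint functions is one line, which is why the naive form keeps getting copied around unnoticed.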
> As a developer at a large software company, every dependency that is not part of the language runtime itself is a pain because legal paperwork and evaluation has to be done for each individual component before I can use it / ship it.
How is copying better than dependencies in this regard? You presumably need legal signoff either way.
> Additionally, I personally despise have many, small dependencies because it means I have more to think about and manage. Instead of just being able to think about the version of the compiler/standard library I'm using, I have to consider every individual crate.
This is what the "Rust platform" is designed to address. It is nice to be able to refer to a specific version of the Rust platform, but that doesn't mean you have to give up on the massive ergonomic benefit of the Cargo ecosystem relative to copying and pasting code.
I believe they mean, each time a source code is downloaded and used that contains a LICENSE file, a legal review must occur. So if it's bundled into one licensed work "Rust With Lots Of Crates Bundled", then that's one form to fill out, but if it's "Rust" and then "Download And Use Crates", that's one form per addon to fill out.
I think this is where reality meets theory. In reality, the developers are probably just taking the code as if they had written it, and the people that may know, such as immediate supervisors, don't care to point it out for the same reason the developers are stealing it, it's much easier than the alternative. The code vetting team is just left in the dark.
Employees take shortcuts around bureaucracy all the time. Sometimes (often?) that bureaucracy is for legal reasons.
I'm not going to endorse copying over package managers on the grounds that copying makes it easy to get away with violating big companies' legal procedures on the use of third-party code.
I wasn't endorsing, just providing an explanation of why while in theory copying and package inclusion are the same from a license standpoint, they likely often aren't in reality. That doesn't mean it's a good thing.
Your argument is the correct moral and legal one. Unfortunately that doesn't always matter. For another example, see the cognitive dissonance many express regarding ad blocking (not to come down entirely on one side of that issue, it's complicated).
> and the people that may know, such as immediate supervisors, don't care to point it out for the same reason the developers are stealing it, it's much easier than the alternative.
If your company gets acquired one day, the code will probably be audited during the due diligence process. If licence violations related to copy-pasting are found, your team will be asked to remove the infringing code and your supervisor may be fired. This happened to my team in the first company I worked for (not the firing part though): we had a lot of code which was just copy-pasted from lodash and the audit found it.
Yes, anytime source code is retrieved that isn't an existing, approved version, legal review of some sort must occur. This includes even referencing it despite what the other poster mistakenly believed I was implying.
I think the argument is where that line is. Maybe copy/pasting binary search code is too much, but do you need a dependency for left pad? There's a line somewhere.
Left pad was a problem for a number of reasons, none of which apply to Cargo (cargo yank never breaks code, by design, while the npm equivalent did). It's not relevant at all.
Sure it is. We're talking about what should and shouldn't be a dependency. The acute problem with left pad was npm's design, but the cultural problem (if you consider it a problem) was that anything depended upon something so small in the first place.
The circumstances that led to the left-pad fiasco were because of Javascript's uniquely anemic standard library (at least until very recently). Rust's stdlib is not small in the same way that Javascript's historic stdlib was. Rust's stdlib is narrow, yet deep: a relatively small number of modules that themselves provide a very large number of operations and convenience functions. Rust dependency graphs can get pretty big, but in practice they're nowhere near as big as the dependency graphs you'll see in big Node apps because the stdlib is so much more fleshed out. That order of magnitude difference is crucial; one might call it "microdependencies versus minidependencies".
Rust has the capability to, and I believe the developers have expressed they are amenable to, internalizing crates that become the best solutions for a problem.
Would you rather a flawed, or later deemed incomplete internal solution be implemented and then the language is forced to support it in perpetuity, or would you rather one or more solutions get tried and the best implementation and syntax eventually accepted into core?
EcmaScript can do the same, and finally has[1], but it moves so slowly and has so many competing interests that it seems to take forever for that to happen.
Oh God no, haha. I started out in Python and I'm pretty sure all of us have fallen out of love with batteries included.
You can see what happens when you go too far the other way though. C has effectively no basic data structures like strings, lists, hash tables, etc., so anything you interface with has its own idea on how to handle that stuff. Library X might return an array of Things that's NULL terminated. Library Y might return an array of Thangs and a size_t output param. Or like you pointed out in JS, its standard library is full of holes so you get tiny projects that attempt to plug them, or larger projects that try to make JS into a specific kind of language (Underscore), or full on programming languages that transpile to it.
I just don't think the problem is definitively solved though. Personally I think Bytes should be in Rust's stdlib. I think bit and byte manipulation is a fundamental part of a language and there should be a standard way of doing it, especially if there are things like TCP/UDP and hash tables in there. I understand the arguments against; I really like the design of Rust's stdlib, but I feel like there's room for disagreement. That's all I'm saying :)
> How is copying better than dependencies in this regard? You presumably need legal signoff either way.
I wasn't referring to copying that I would do myself, but copying that other crates would do.
That is, if a crate only needs a little bit from a little dependency, then copying it into their crate can make everyone else's life easier (obviously taking licensing into consideration when doing so).
In short, the context here was the bytes crate, which is fairly tiny. If rust is going to insist on not including the bytes crate, or a copy of it, in the standard library, then I would hope others that consume it would consider embedding a snapshot of it into their own crate for their own, private use so that I don't have to worry about it.
I'm well aware there's a fine line here, hence my reference to the Go proverb.
The short version is that a component distributed with an embedded copy of its dependencies means a single legal review since it's a snapshot in time of a particular version of that component and its dependencies.
A component that instead references its dependencies and that have their own release schedule/versions, etc. requires a legal review for that component and each of its dependencies.
This has been true at multiple employers I've worked for, so seems unlikely to be a consideration unique to my current employer.
Again, that's what the Rust Platform is for. It's a better solution than copying code, because it doesn't throw away all of the benefits of Cargo just to make some legal policies at some large companies a little easier.
This is where I actually prefer Go's "vendor" approach to dependencies. It would be great if rust / cargo eventually had the same and more authors adopted it or simply copied their little dependencies instead of having external dependencies on them.
Something like this proposed command, except for crate maintenance instead of distribution:
I sincerely hope that people never start copying code into their packages. I see virtually no upsides, except for making it easier to dodge bureaucratic hurdles at some big companies, and a huge number of downsides (basically forgoing all the benefits of Cargo).
Does it do more than using relative paths in a Cargo.toml would do?
I think this thread is about copying and pasting code versus using a small library in the Go case, which might be a philosophical difference with Rust.
It might help to point out that vendored crates are compiled from source making the required review process referenced by that poster just as possible with server crates.
"Dependency" doesn't imply "third-party library." There are plenty of crates that are maintained by the Rust organization itself. You could think of them as a "non-standard library."
(This isn't uncommon; it's also true in, say, Elixir: there are a few useful Hex packages owned by the elixir-lang GitHub org itself. And I believe it's true in Haskell as well.)
It usually does for legal review purposes, in my experience. If those things aren't part of the "standard distribution", they have to be evaluated separately. Especially if they have a different release schedule.
Hmm. I guess this might justify the Erlang/OTP approach: shipping a "platform" or "distribution" release that contains your core packages/stdlib—along with a bunch of other, seemingly "extraneous" packages that you also take responsibility for—bundled together as your language's SDK.
Unlike a huge stdlib, a "distro"-style SDK is still factored into packages (in Erlang terms, "applications"), that can be included or excluded from any given release of your project. But it's all released monolithically, and comes as one big package. Probably helps a lot with getting legal sign-off for using the relevant packages. I wonder if that's why they (still) do it?
I personally agree with this philosophy - more the "use the standard library" than "copy stackoverflow".
I can't count the number of times we've had problems with the requests library, either because of the huge tree of dependencies (both explicit and implicit) that requests has, or because of some of the assumptions made by requests.
On the other hand, when a bit more time is taken (yes, this means a few lines of boilerplate) and the code uses urllib2, it rarely has to be touched again.
Some C & C++ programmers are averse to third-party libraries (more so than programmers in other languages, in my experience). It's a valid position, but if you really value that then perhaps Rust is not for you.
I can't speak for the author of course, but probably their argument is that a systems language should be able to directly manipulate bits and bytes without outside dependencies. I don't know that I agree. A reasonable counter example is that you need library support to allocate memory in C. The argument is that it's a feature to not require C implementations to include dynamic memory allocation because not all projects allow it. My point being that what "should" be in a language usually depends on what you're using it for, and for a general purpose language keeping that very small is at least a consistent design.
I think that the point was more that heap allocation is not a standard language primitive in C. And indeed, it would be ludicrous to require dynamic allocation support from freestanding implementations (no standard library because typically your platform doesn't even have an OS).
For what it's worth, C++ can have freestanding implementations and it provides dynamic allocation support at a syntactic level. Freestanding applications, however, need to provide an "operator new" function (with the appropriate memory allocation code) if they want to use it.
Bytes is basically just C++'s `std::string` (kind of; don't bite my head off, people who've memorized the C++17 standard). It's an Arc-backed array [1]. This suggests it should be a fairly fundamental abstraction.
Edit: ^^^My C++ is wrong sorry :(
Really, I disagree with its purpose. In its immutable, non-thread-safe form you can create the same structure by just borrowing a value. This ofc requires making your peace with the borrow checker and `Cow<'a, T>` copy-on-write types.
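For anyone who hasn't used `Cow`, a rough sketch of the borrow-first pattern looks like this. The helper name and the carriage-return example are invented for illustration; the point is only that the no-change path hands back the caller's bytes without allocating.

```rust
use std::borrow::Cow;

/// Hypothetical helper: drops every `\r` byte from `input`.
/// If there is nothing to drop, the input is returned borrowed (no copy);
/// only the modified case allocates a new `Vec<u8>`.
pub fn strip_carriage_returns(input: &[u8]) -> Cow<'_, [u8]> {
    if input.contains(&b'\r') {
        // Copy-on-write path: build an owned buffer without the `\r` bytes.
        Cow::Owned(input.iter().copied().filter(|&b| b != b'\r').collect())
    } else {
        // Fast path: just borrow the caller's data.
        Cow::Borrowed(input)
    }
}
```

The catch is the lifetime on the returned `Cow`: it is tied to `input`, so the result cannot outlive the buffer it borrows from.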
By and large, Bytes' advertised purpose is network code. And for networking code it's really only _super_ useful if you're using a packet ring in Linux, as jemalloc will not return regularly re-allocated buffers.
Really this is all performance theater. How you manage/architect your socket reads/writes will have an order of magnitude larger effect than what abstraction you _store_ those bytes in once read.
Bytes is intended for use with the tokio ecosystem, where you often can't use references because borrowing across a yield point in a future would be a lifetime error.
Then you have to do a deep clone every time. Arc<Vec<u8>> is also not sufficient, because Bytes lets you share a reference count among slices at different offsets into the buffer.
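To illustrate what "share a reference count among slices" means, here is a toy sketch using only std. `SharedSlice` is a made-up type and is not how the bytes crate is actually implemented; it just shows the idea of one `Arc` buffer plus a per-view offset range.

```rust
use std::sync::Arc;

/// Toy stand-in for the idea behind `Bytes` (NOT the crate's real layout):
/// every view holds the same reference-counted buffer plus its own range.
#[derive(Clone, Debug)]
pub struct SharedSlice {
    buf: Arc<[u8]>,
    start: usize,
    end: usize,
}

impl SharedSlice {
    /// Takes ownership of `data`; the bytes are moved into one shared buffer.
    pub fn new(data: Vec<u8>) -> Self {
        let end = data.len();
        SharedSlice { buf: Arc::from(data), start: 0, end }
    }

    /// Sub-view `[start, end)` relative to this view. Only the reference
    /// count is bumped; no bytes are copied. Panics if out of bounds.
    pub fn slice(&self, start: usize, end: usize) -> Self {
        assert!(start <= end && end <= self.len(), "slice out of bounds");
        SharedSlice {
            buf: Arc::clone(&self.buf),
            start: self.start + start,
            end: self.start + end,
        }
    }

    /// Number of bytes visible through this view.
    pub fn len(&self) -> usize {
        self.end - self.start
    }

    /// The bytes visible through this view.
    pub fn as_bytes(&self) -> &[u8] {
        &self.buf[self.start..self.end]
    }

    /// How many views currently share the underlying buffer.
    pub fn ref_count(&self) -> usize {
        Arc::strong_count(&self.buf)
    }
}
```

Because each view owns its `Arc` handle, it carries no borrowed lifetime and can be moved into a future, which a plain `&[u8]` sub-slice of an `Arc<Vec<u8>>` cannot.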
In pre C++11 it was quite typical for std::string to be implemented with COW semantics.
Since the C++11 standard it is no longer permitted, though that is not necessarily reflected in all stdlib implementations.