I maintain a program written in Python that is faster than the program written in C that it replaces. The C version can execute far more operations per second, but its algorithm amounts to enumerating 2^N alternatives when you could enumerate N alternatives instead.
Certainly my version would be even faster if I implemented it in C, but the gains of going from exponential to linear completely dominate the language difference.
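To put rough numbers on it (a back-of-envelope illustration, not the real program): even if you grant C a 100x per-operation advantage, the exponential algorithm loses badly at modest N.

    # Back-of-envelope illustration (not the real program): grant the C
    # version a 100x per-operation speed advantage, then compare step counts.
    N = 40
    c_steps = 2**N / 100       # exponential algorithm, discounted 100x for C
    py_steps = N               # linear algorithm in Python
    print(c_steps / py_steps)  # ~2.7e8: the linear version still wins big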
You can take that one step further. What kind of signal does “I can afford to go to University and not worry about credentials” send? I’d argue that’s realistic only for people who are willing to admit that they belong to a leisure class. In the US at least, we like to flatter the leisure class with the pretense that they worked hard to get there.
Aside from how slow and user-hostile it is compared to a text editor, my biggest complaint about VS Code is the load it puts on the login node. You get 40 people each running multiple VS Code servers and it brings the poor computer to its knees.
Every job on an HPC cluster should have a memory and CPU limit. Nearly every job should have a time limit as well. I/O throttling is a much trickier problem.
I wound up giving users on a jump host a script that would submit an sbatch job that ran sshd as the user on a random high port and stored the port number in the job's output. The output was available over NFS, so the script parsed the port number and displayed the connection info to the user.
The user could then run a vscode server over ssh within the bounds of CPU/memory/time limits.
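Roughly, the shape of it (a sketch from memory; the names, resource limits, and per-user host key path below are placeholders, not the original implementation):

    import os, re, subprocess, time

    # Submit a batch job that starts a per-user sshd on a random high
    # port and logs the node/port into its NFS-visible output file.
    out_pattern = os.path.expanduser("~/sshd-%j.out")  # %j = Slurm job id
    wrapped = (
        'PORT=$(( (RANDOM % 20000) + 40000 ));'        # bash assumed for RANDOM
        ' echo "NODE=$(hostname) PORT=$PORT";'
        ' exec /usr/sbin/sshd -D -p "$PORT" -h "$HOME/.ssh/host_key"'
    )
    job_id = subprocess.run(
        ["sbatch", "--parsable", f"--output={out_pattern}",
         "--mem=4G", "--cpus-per-task=2", "--time=8:00:00",
         "--wrap", wrapped],
        check=True, capture_output=True, text=True,
    ).stdout.strip().split(";")[0]

    # Poll the shared output file until the job has started and logged
    # its node and port, then show the user how to connect.
    out_path = out_pattern.replace("%j", job_id)
    match = None
    while match is None:
        time.sleep(2)
        if os.path.exists(out_path):
            match = re.search(r"NODE=(\S+) PORT=(\d+)", open(out_path).read())
    node, port = match.groups()
    print(f"Connect with: ssh -p {port} {os.environ['USER']}@{node}")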
Kitty is great; I want to see it succeed in pushing terminal emulators forward into the current millennium. However, I can’t use kitty at work, and I absolutely live inside of tmux. The server is where all the action is, and when I get disconnected, I want to be able to pick things up exactly where I left them. Window layout, the state of each shell and text editor, what’s in the copy buffer, scrollback, everything. I can’t give that up unless I have a suitable replacement on Windows. Until then I will continue to use tmux at work and kitty at home.
How do you do something like tmux sessions (not just windows) in kitty? How does switching sessions and windows work? Can I connect to the same session in multiple windows? Or can I manage separate windows with different sets of sessions?
My complaint about JSON is that it’s not minimal enough. The receiver always has to validate anyway, so what has syntax typing done for us? Different implementations of JSON disagree about what constitutes a valid value. For instance, is
    {"x": NaN}
valid JSON? How about 9007199254740993? Or -.053? If so, will that text round trip through your JSON library without loss of precision? Is that desirable if it does?
Basically I think formats with syntax-typed primitives always run into this problem: even if the encoder and decoder agree with each other about what the values are, the receiver still has to decide whether it can use the result. This, after all, is the main benefit of a library like Pydantic. But if we’re doing all this work to make sure the object is correct, we already know what the value types are supposed to be on the receiving end, so why are we making a needlessly complex decoder guess for us?
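To illustrate the receiver-side check (a minimal sketch assuming Pydantic v2; the model is made up):

    from pydantic import BaseModel, ValidationError

    # The receiver's schema, not the wire syntax, decides what a valid
    # value is. Illustrative model, not from any real codebase.
    class Point(BaseModel):
        x: float
        y: float

    print(Point.model_validate_json('{"x": 1.5, "y": 2}'))  # x=1.5 y=2.0
    try:
        Point.model_validate_json('{"x": "not a number", "y": 2}')
    except ValidationError as err:
        print("rejected by schema:", err.error_count(), "error")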
NaN is not a valid value in JSON. Neither are 0123 or .123 (there must always be at least one digit before the decimal marker, but extraneous leading zeroes are disallowed).
JSON was originally parsed in JavaScript with eval(), which let many things through that aren't JSON, but that doesn't make JSON itself more complex.
That’s my point, though! I’ve run into popular JSON libraries that will emit all of those! 9007199254740993 is problematic because it’s not representable as a 64 bit float. Python’s JSON library is happy to write it, even though you need an int to represent it, and JSON doesn’t have ints.
Edit: I didn’t see my thought all the way through here. Syntax typing invites this kind of nonconformity, because different programming languages mean different things by “number,” “string,” “date,” or even “null.” They will bend the format to match their own semantics, resulting in incompatibility.
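Both problems are easy to reproduce with the standard library (CPython, default json settings):

    import json

    # 1. A bare NaN token, which RFC 8259 does not allow, but which
    #    Python's encoder emits by default (allow_nan=True):
    print(json.dumps(float("nan")))  # NaN

    # 2. An integer above 2**53, which a binary64-only decoder cannot
    #    represent exactly:
    n = 9007199254740993             # 2**53 + 1
    print(json.dumps(n))             # 9007199254740993
    print(float(n))                  # 9007199254740992.0 (last digit lost)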
> 9007199254740993 is problematic because it’s not representable as a 64 bit float. Python’s JSON library is happy to write it, even though you need an int to represent it
JSON numbers have unlimited range in terms of the format standard, but implementations are explicitly permitted to set limits on the range and precision they generate and handle, and users are warned that:
> [...] Since software that implements IEEE 754 binary64 (double precision) numbers is generally available and widely used, good interoperability can be achieved by implementations that expect no more precision or range than these provide, in the sense that implementations will approximate JSON numbers within the expected precision.
Also, you don't need an int to represent it: a wide enough int will represent it, but so will unlimited-precision decimals, wide enough binary floats (of the standard formats, IEEE 754 binary128 works), etc.
RFC 8259 is a good read and I wish more people would make the effort. I really don’t mean to bash JSON here. It was a great idea and it continues to be a great idea, especially if you are using JavaScript. However, the passage you quote illustrates the same shortcoming I’m complaining about: RFC 8259 basically says “valid primitive types in JSON are the valid primitive types in your programming language,” but this results in implementations like Python’s json library emitting invalid tokens like bare NaN, which can cause decoders to choke.
I think what JSON gets right is that it gives us a universal way of expressing structure: arrays and objects map onto basic notions of sequence and association that are useful in many contexts and can be represented in a variety of ways by programming languages. My ideal data interchange format would stop there and let the user decide what to do with the value text after the structure has been decoded.
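A rough sketch of the idea, using Python's json hooks to stand in for that hypothetical structure-only format (the hooks are real; the format is not):

    import json
    from decimal import Decimal

    # Decode only the structure; keep every number as its literal text
    # so the receiver decides how to interpret it.
    raw = '{"id": 9007199254740993, "price": -0.053}'
    doc = json.loads(raw, parse_int=str, parse_float=str)
    print(doc)  # {'id': '9007199254740993', 'price': '-0.053'}

    # The receiver chooses the representation its schema calls for:
    print(int(doc["id"]), Decimal(doc["price"]))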
Before your edit, I was going to object to your premise because it seems like a format could get worse just by more implementations being made.
After your edit, I see that it's rather that syntax-typed formats are prone to this form of implementation divergence.
I don't think this is limited to syntax-typed formats, though. For example, TNetstrings[1] have type tags, but "#" is an integer. The specification requires that integers fit into 63 bits (since the reference encoder will refuse to encode a Python long), but implementations in C tend to allow 64 bits, and implementations in other languages allow bignums. It does explicitly allow "nan", "inf", and "-inf", FWIW.
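For reference, a toy encoder for the integer case (my own sketch, not the reference implementation):

    # Toy tnetstring-style integer encoder: the payload is "LENGTH:DATA"
    # followed by a one-byte type tag, "#" for integers.
    def dump_int(n: int) -> bytes:
        data = str(n).encode("ascii")
        return b"%d:%s#" % (len(data), data)

    print(dump_int(12345))  # b'5:12345#'
    print(dump_int(2**63))  # encodes fine; whether a decoder accepts it
                            # is exactly the divergence described above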
Exactly right! As somebody who’s spent a great deal of time with the discrete Fourier transform, I thought, “This article reads like it was written specifically for me.” I/Q modulation is new to me though.
What bothers me is that their notebooks were fine when they first came to market. I still have their old, filled notebooks, with great bindings and paper and spotless ink handling, including with fountain pens.
They gradually reduced their quality, and created a “higher, more expensive tier” to offer their previous quality.
Leuchtturm 1917 is a world apart when compared to today’s Moleskine.
Platinum Carbon Black is a wonderful ink. It seems to work very well in cheap fountain pens with flow issues. It’s highly resistant to coffee spills and looks ok on mediocre paper. Only drawback is cleaning it up; it cleans up like used motor oil.
I just discovered the formulas Excel library for Python, and I have about two dozen ideas for things that you could do with it. It turns out they already implemented the first dozen in the library itself, but the possibilities that come from extracting the computational graph out of a workbook are huge.
At the top of my list is taking unstructured workbooks and adding a sheet to tabulate inputs and outputs. A lot of spreadsheet models are not set up for bulk evaluation or retaining old results, and the peculiar arrangement of cells constitutes an ad hoc user interface for the model. I want to take that seriously as a programming environment for subject matter experts and give them an extra tool for working with their models. As programmers, we give spreadsheet authors far too little credit for intelligence and skill, and often blame them for the faults of their platform. This is unproductive, and we can do better.
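As a hedged sketch of the kind of thing I mean (the file name and cell references below are invented for illustration):

    import formulas

    # Compile the workbook's computational graph once...
    xl_model = formulas.ExcelModel().loads("model.xlsx").finish()

    # ...then evaluate it in bulk with overridden inputs, reading chosen
    # outputs back out. SHEET1!A1 and SHEET1!C1 are hypothetical cells.
    for x in (1.0, 2.0, 3.0):
        solution = xl_model.calculate(
            inputs={"'[model.xlsx]SHEET1'!A1": x},
            outputs=["'[model.xlsx]SHEET1'!C1"],
        )
        print(x, solution["'[model.xlsx]SHEET1'!C1"])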