This is definitely specific to the V8 JavaScript engine. I was able to put 40 million keys in a Map in Firefox/SpiderMonkey, and it was pretty fast, taking only a few seconds.
The pitfall was that, because I did it in the Developer Tools console, CPU and memory usage would shoot through the roof whenever the autocompleter tried to print a string representation of the map as a helpful hint.
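For anyone who wants to repeat the experiment, here's a minimal sketch. N is kept at one million so it finishes instantly anywhere; bump it toward 40 million to reproduce (keeping in mind that on V8 a Map throws a RangeError once it hits the 2^24 entry limit this whole thread is about):

```javascript
// Rough sketch of the experiment: bulk-insert string keys into a
// Map and time it. N is kept small here; on V8, raising it past
// 2^24 (16,777,216) makes Map.set throw a RangeError.
const N = 1_000_000; // bump toward 40_000_000 to reproduce
const m = new Map();
const start = Date.now();
for (let i = 0; i < N; i++) {
  m.set(`key${i}`, i);
}
console.log(`${m.size} entries in ${Date.now() - start}ms`);
```

Just don't paste it into the DevTools console with the map name half-typed, for the reason above.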
If you like playing with ridiculous amounts of hardware, definitely try out TPU VMs with TFRC: https://blog.gpt4.org/jaxtpu
You can create up to 5 TPU v3-8's in europe-west4-a. Each core is roughly equivalent to a V100, so it's sort of like having 5 p3.16xlarge's. (The TPU has 96 CPUs and 330GB of system RAM, whereas a p3.16xlarge has 64 CPUs and 488GB of system RAM according to https://aws.amazon.com/ec2/instance-types/p3/).
The best part IMO is that you don't have to figure out how to link the cores together. JAX does that for you automatically. You get 100GigE from TPU to TPU too, which you can also leverage: https://twitter.com/theshawwn/status/1406171487988498433
Sorry. As an AI outsider, I wasn't sure how to publish to arxiv, so I used the first PDF hosting site I could find. I've mirrored it to https://battle.shawwn.com/swarm-training-v01a.pdf and edited my original comment with that instead. Thanks for pointing that out.
While lol-worthy in some ways (especially that error message - wow!): 16 million is a very "human-scale" number. I'm (somewhat) surprised that's the limit, and that I haven't heard about it before.
I've run up against this building a gigantic LZ dictionary for full-text search[0], and I got around it[1] by using an npm module called big-associative that provides drop-in big-map and big-set primitives.
Even in that code, though, I think there was some error or inefficiency that I had to patch to make it usable. But in the end I was able to get around the 16 million entry limit with good efficiency.
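I haven't looked at big-associative's internals, but the general shape of the workaround can be sketched like this: shard entries across several underlying Maps so that no single Map approaches V8's 2^24 cap (the shard count and hash function here are arbitrary illustrative choices, not the library's actual ones):

```javascript
// Hypothetical sketch of a "big map": spread entries across many
// underlying Maps so no single Map nears V8's 2^24 entry cap.
class BigMap {
  constructor(shardCount = 16) {
    this.shards = Array.from({ length: shardCount }, () => new Map());
  }
  _shard(key) {
    // Cheap string hash; any stable hash would do here.
    const s = String(key);
    let h = 0;
    for (let i = 0; i < s.length; i++) h = (h * 31 + s.charCodeAt(i)) >>> 0;
    return this.shards[h % this.shards.length];
  }
  set(key, value) { this._shard(key).set(key, value); return this; }
  get(key) { return this._shard(key).get(key); }
  has(key) { return this._shard(key).has(key); }
  get size() { return this.shards.reduce((n, m) => n + m.size, 0); }
}

const bm = new BigMap();
bm.set('hello', 1).set('world', 2);
console.log(bm.get('hello'), bm.size); // → 1 2
```

A real drop-in replacement would also need delete, iteration, and so on, but the sharding idea is the core of it.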
Author of the blog post here. Interestingly, the actual thing I was developing (not mentioned in the post) was for text searching too... the resulting library I made is here: https://github.com/GMOD/ixixx-js/
Cool, man. I just forked it! Happy to take a look inside... figure out how it works, use the best stuff for mine... kidding. I ran into issues using LZ too, but I was also using a trie for the index structure (or was on the way to one).
Let me take a look and double-check that. It might have been a prior library, or maybe I did modify big-associative. I'll investigate and be back here later.
edit: So I looked, and I think I didn't modify big-associative... it could have been a library I used before I found that one.
Same here. Sometimes I wish the web was simpler and the history API was not a thing. Or barring that, that the browser would pop up a permissions prompt when a page attempted to access the history API.
The back button in Chromium type browsers should ignore history entries added before a page receives a user gesture [0]. That did not seem to happen here!
You can try it yourself by loading a YouTube video via pasting into the URL box and letting it autoplay a few videos. The back button should take you to the new tab page. If you click anywhere on the page then those autoplay entries will not be skipped by the back button.
Several years ago, I wanted to make a huge lookup table in Node.js during a research project, but the program stopped making progress after a while.
This incident was detailed in my Quora answer:
https://www.quora.com/What-would-you-need-64GB-of-RAM-for/an...
I incorrectly concluded that Node.js has a limitation on how much memory it can use. After reading this article, I now understand that the real reason was that I was setting too many properties on a single Object, exceeding V8's limit.
0_0, 0__0, 0x_0, 0_ and 0_.0 are legal in Rust. 0._0 is the only one of those that’s illegal (“`{integer}` is a primitive type and therefore doesn't have fields”).
012345 is the old way of doing octal notation in Python, which has its roots in C. I'm guessing they want to prevent ambiguity by making it so that it's an error for "012345" to be a decimal literal.
Not only that: legacy octal notation is actually valid in non-strict-mode JavaScript (which is still the default unless you opt in with "use strict"). So it would be backwards-incompatible outside of strict mode, and very confusing if changing from non-strict to strict mode changed the semantics without giving an error/warning.
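For illustration, this is easy to check in any console. In sloppy mode a legacy literal like 012 silently means octal 10, while in strict mode the same token is a SyntaxError; the modern 0o prefix is unambiguous and works in both modes:

```javascript
// The 0o prefix is the unambiguous, strict-mode-safe octal form.
// (In sloppy mode, the legacy literal 012 would also evaluate to
// 10; in strict mode it's a SyntaxError, hence the ambiguity.)
console.log(0o12);  // 10
console.log(0o755); // 493  (the classic file-permission example)
```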
Yes, it gives an error -- if you want decimals, you can't use leading 0's:
Python 3.6.3 (v3.6.3:2c5fed8, Oct 3 2017, 18:11:49) [MSC v.1900 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.2.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: 01
File "<ipython-input-1-3351d58b1d3b>", line 1
01
^
SyntaxError: invalid token
In [2]: 0o1
Out[2]: 1
In [3]: 0o10
Out[3]: 8
Since 2019 in the leading browsers (Firefox and Chromium); 2020 if you use something else like Edge or Samsung Internet; 2021 for Chrome or Firefox on mobile... and of course you can forget about MSIE or Opera Mobile support as usual.
You can also write JavaScript numbers with E notation (i.e. multiplied by 10 to the power of x), which is basically what I do with all numbers above 1,000 or below 0.001 now. It has the rare benefit, for big / small numbers, of being both more readable and more concise.
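A couple of throwaway examples of what I mean (nothing exotic here, just the literal syntax):

```javascript
// E notation: mantissa times ten to the power of the exponent.
console.log(1e6);    // 1000000
console.log(2.5e-3); // 0.0025
// These are the same values, so comparisons are exact:
console.log(1e6 === 1000000);   // true
// Numeric separators are the other readability aid:
console.log(1_000_000 === 1e6); // true
```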
By default, Node has a relatively low limit on the amount of memory it will use (I believe 1GB). As you get close to this, it thrashes as it tries to garbage collect more and more frequently.
You can increase the limit by running `node --max-old-space-size=8192`. But that doesn't seem to fix this specific issue, either with the Map or the Object!
Ran into something similar years ago on the BlackBerry, where if you created more than 2^15 objects, performance could drop by something like 90%. There was no error, but the performance hit was bad enough to make the app unusable.
Hmm, one key for even every possible Unicode code point (including surrogates, noncharacters and reserved ones, all the way from U+0000 to U+10FFFF) makes 1,114,112 keys. That’d be a big keyboard with rather poor random access characteristics. Allow 20×20mm for each key (seems a fairly typical sort of size) and that’s almost 446m². I dunno what you’ll use your 16,777,216 keys for (I agree, it ought to be enough for anyone), but as a full-sized keyboard it’s going to have a surface area of at least two thirds of a hectare. You could build a giant hamster ball a bit over 46m in diameter and lay the keys out on the inside, travelling the world as you spin the ball in order to find the right key. (The Unicode 13 hamster wheel keyboard might be more reasonable, its 143,859 keys requiring a diameter of less than five metres. Perhaps more manageable, but some keys might be harder to access as you roll through cities since the ball is no longer big enough to almost ignore many houses.)
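The arithmetic checks out, if anyone wants to verify it (assuming the same 20×20mm keys as above; the sphere surface area formula A = π·d² gives the diameters):

```javascript
// Checking the keyboard arithmetic: area per key, total area,
// then the diameter of a sphere with that inner surface (A = π·d²).
const keyArea = 0.02 * 0.02; // 20mm × 20mm per key, in m²
const diameter = (area) => Math.sqrt(area / Math.PI);

const unicodeArea = 1_114_112 * keyArea;
console.log(unicodeArea.toFixed(1), 'm²');           // ~445.6 m²

const mapLimitArea = 16_777_216 * keyArea;           // ~6711 m², ~2/3 hectare
console.log(diameter(mapLimitArea).toFixed(1), 'm'); // ~46.2 m

const unicode13Area = 143_859 * keyArea;
console.log(diameter(unicode13Area).toFixed(2), 'm'); // ~4.28 m
```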
I’d suggest smaller intervals, like a cent instead of a semitone, but the numbers are so absurdly large that dividing it by a hundred doesn’t really help anything. You’d need a good deal more than 64-bit floating point numbers for any pertinent calculations, 10³⁰⁸ is pretty tiny; and the theoretical string tensions involved would be quite something.
Perhaps an organ is the right answer instead, with one key per pipe rather than the usual multiplexing. The Wanamaker Organ makes a good start with 28,750 pipes across 484 ranks (though it multiplexes them across a paltry six manuals plus pedalboard), but you’re going to need to come up with a lot more sounds, and the speed of sound is low enough that it’s not going to work very well done acoustically. (Hmm… y’know, there’s actual scope for an art piece here, an organ but producing light instead of sound, like lidar is to sonar. I think you’d struggle to produce anything the human eye would appreciate, but it’s a fun concept.) Or perhaps it should be splitting each manual into one for each combination of sounds it controls. That might well go over the 2²⁴, I’m not thinking about the numbers too exactly.
Ah yes, the gamma-ray burst, one of my favourite phenomena: giving off as much energy as our dear sun radiates in her whole lifetime.
Well, maybe we use 10^(-absurdlyLargeNumber) as our base note then, just to stay safe. Hm, how many years would we have to wait for the lowest notes to go through one cycle?
It was actually a case of me going off to do something else and then returning to this tab, reading first of all the comment I had responded to; having forgotten the context, I immediately thought of keyboard keys. The concept amused me, so I ran with it, and it ended up somewhere a bit different from where I expected!
I feel like if you are working with genome levels of data maybe you shouldn't be using JS to do the work. JS is a great tool in some cases but not really a scientific one. I guess you use whatever you know.
The problem, of course, is when you've worked on the assumption that it will do the job (hey, it was perfectly plausible!) and after six months' work you run into a showstopper like this.
There's no limit for array size in C++, or for any core structure, other than the size types, which naturally scale up as we get newer architectures (obviously 2^64 for most people at the moment).
I think C++ pointers being 64 bit will stick with us forever, unless some fundamental paradigm shift happens in what C++ is used for. I also think the size of short, int, long long, float and double will not change anymore, ever. The reasons we had for changes in the past will not happen anymore from now on, and the cost of changes only increases over time. We might see the sizes (and IEEE-754 floats) be officially standardized at some point, like twos complement is standardized now.
I mean, yes, size_t will stay at 64 bits for at least a while; 2^64 bytes is 16 exabytes. Once that is insufficient, it will slowly be switched to 128 bits. Some systems already use 128 bits for storage-related arithmetic.
The C++ standard may not impose a limit, but any given C++ implementation will. Likewise, the limit in the OP here isn't part of the Javascript standard, just a Chrome implementation detail.
If we're saying "Pfffff" at things, let's take a moment to appreciate that by the magic of not actually storing them Rust gets to store all of nothing in no space.
Rust's arrays of something are limited to at most std::isize::MAX bytes in total. But by choosing to store nothing instead of something, we can go larger, all the way to std::usize::MAX elements, with e.g. [(); std::usize::MAX], and that'll actually work at runtime, even on a modest computer, since you don't need space for std::usize::MAX of anything.
If usize::MAX of a zero size type isn't enough for you, might I interest you in an Iterator that contains not merely usize::MAX but infinity of any Default type of your choosing? And if you don't need that many, you can always just take() fewer of them no charge.
It COULD be written to be infinitely scalable in size (up to allowed memory) but that comes with a speed cost.
I remember something about NSArray switching its layout (and speed characteristics) after adding certain number of items, but I can't find a reference now.