An error message if you put more than 2^24 items in a JS Map object

nayuki · on Aug 27, 2021

Definitely specific to the V8 JavaScript engine. I was able to put 40 million keys in a map in Firefox/SpiderMonkey. It was pretty fast and only took a few seconds.

The pitfall was that because I did it in the Developer Tools console, the CPU and memory usage would shoot through the roof from the autocompleter if I wrote any code that would cause a string representation of the map to be printed as a helpful hint.

sillysaurusx · on Aug 27, 2021

A year or two ago, I helped the Cloud TPU team fix an obscure error when creating more than 200 TPUs.

Apparently no one had ever tried to create more than 200 TPUs before (https://battle.shawwn.com/swarm-training-v01a.pdf). I was pretty proud of that.

(As a bonus, I actually wanted to use all 200 TPUs. It wasn't merely a stress test. Big data indeed!)

ZeroCool2u · on Aug 27, 2021

Here I was having fun with my p3.16xlarge at work today, that's awesome!

sillysaurusx · on Aug 27, 2021

If you like playing with ridiculous amounts of hardware, definitely try out TPU VMs with TFRC: https://blog.gpt4.org/jaxtpu

You can create up to 5 TPU v3-8's in europe-west4-a. Each core is roughly equivalent to a V100, so it's sort of like having 5 p3.16xlarge's. (The TPU has 96 CPUs and 330GB of system RAM, whereas a p3.16xlarge has 64 CPUs and 488GB of system RAM according to https://aws.amazon.com/ec2/instance-types/p3/).

The best part IMO is that you don't have to figure out how to link the cores together. JAX does that for you automatically. You get 100GigE from TPU to TPU too, which you can also leverage: https://twitter.com/theshawwn/status/1406171487988498433

google234123 · on Aug 28, 2021

The program is for researchers, not for play...

antonvs · on Aug 28, 2021

Are you so sure there's a difference?

ms4720 · on Aug 28, 2021

One is a strict subset of the other

google234123 · on Aug 29, 2021

Really? A “strict subset”, come on…

CamperBob2 · on Aug 27, 2021

Wow, that docdroid site is toxic waste. Don't click on anything but the back button if you follow that link.

sillysaurusx · on Aug 27, 2021

Sorry. As an AI outsider, I wasn't sure how to publish to arxiv, so I used the first PDF hosting site I could find. I've mirrored it to https://battle.shawwn.com/swarm-training-v01a.pdf and edited my original comment with that instead. Thanks for pointing that out.

Groxx · on Aug 27, 2021

While lol-worthy in some ways (especially that error message - wow!): 16 million is a very "human-scale" number. I'm (somewhat) surprised that's the limit, and that I haven't heard about it before.

graderjs · on Aug 28, 2021

I've run up against this while building a gigantic LZ dictionary for full text search[0], and I got around it[1] by using an npm module called big-associative, which provides drop-in big map and big set primitives.

But even in that code, I think there was some error or inefficiency that I had to customize to make it usable. In the end I was able to get around the 16 million limit on entries and still get good efficiency.

[0]: https://github.com/i5ik/futzz

[1]: https://github.com/i5ik/futzz/blob/master/src/futzz.js#L3
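For anyone wondering what this kind of workaround looks like, here is a minimal sketch of the general idea: spread entries over several native Maps so that no single Map reaches the per-Map cap. This is not big-associative's actual code or API. `ShardedMap` and `maxPerShard` are hypothetical names, and the default shard size is just an arbitrary value below 2^24.

```javascript
// Hypothetical helper (not the big-associative API): keeps entries in
// several native Maps so that no single Map outgrows the engine's cap.
class ShardedMap {
  constructor(maxPerShard = 2 ** 23) {
    this.maxPerShard = maxPerShard; // assumed to be below the engine limit
    this.shards = [new Map()];
  }

  // Return the shard that already holds `key`, or undefined.
  findShard(key) {
    return this.shards.find((shard) => shard.has(key));
  }

  has(key) {
    return this.findShard(key) !== undefined;
  }

  get(key) {
    const shard = this.findShard(key);
    return shard ? shard.get(key) : undefined;
  }

  set(key, value) {
    let shard = this.findShard(key);
    if (!shard) {
      // New key: use the last shard, opening a fresh one when it is full.
      shard = this.shards[this.shards.length - 1];
      if (shard.size >= this.maxPerShard) {
        shard = new Map();
        this.shards.push(shard);
      }
    }
    shard.set(key, value);
    return this;
  }

  delete(key) {
    const shard = this.findShard(key);
    return shard ? shard.delete(key) : false;
  }

  get size() {
    return this.shards.reduce((total, shard) => total + shard.size, 0);
  }
}
```

Lookups scan the shards one by one, so the cost grows with the number of shards. Picking the shard from a hash of the key would avoid that, but it needs a hash function that works for arbitrary keys.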

tonto · on Aug 28, 2021

author of blogpost here. interestingly the actual thing (not mentioned in blog post) that I was developing was for text searching too...resulting library I made is here https://github.com/GMOD/ixixx-js/

graderjs · on Aug 28, 2021

Cool, man. I just forked it! Happy to take a look inside...figure out how it works; use the best stuff for mine...kidding. But I ran into issues with using LZ, and was also using a trie for the index structure (or was on the way to).

sillysaurusx · on Aug 28, 2021

Do you have any details on your customizations? That’s interesting.

graderjs · on Aug 28, 2021

Let me take a look and double check that. It might have been a prior library, or maybe I did modify big-associative. I'll investigate. Be back here later

edit: So I looked and I think I didn't modify big-associative...could have been a library I used before I found that one.

Here's big-associative's repo: https://github.com/samchon/big-associative

peterbmarks · on Aug 28, 2021

Is it just me or does this site prevent the browser back button from working? I hate that!

catlifeonmars · on Aug 28, 2021

Same here. Sometimes I wish the web was simpler and the history API was not a thing. Or barring that, that the browser would pop up a permissions prompt when a page attempted to access the history API.

account42 · on Sept 1, 2021

Speaking of, there should probably also be a permission prompt when the site wants to use a 1GB hash map.

hyferg · on Aug 28, 2021

The back button in Chromium type browsers should ignore history entries added before a page receives a user gesture [0]. That did not seem to happen here!

You can try it yourself by loading a YouTube video via pasting into the URL box and letting it autoplay a few videos. The back button should take you to the new tab page. If you click anywhere on the page then those autoplay entries will not be skipped by the back button.

[0] https://bugs.chromium.org/p/chromium/issues/detail?id=907167

zinekeller · on Aug 28, 2021

It's tumblr-powered, so ¯\_(ツ)_/¯

yoursunny · on Aug 27, 2021

Several years ago, I wanted to make a huge lookup table in Node.js during a research project, but the program stopped making progress after a while. This incident was detailed in my Quora answer: https://www.quora.com/What-would-you-need-64GB-of-RAM-for/an...

I incorrectly concluded that Node.js has a limitation on how much memory it could use. After reading this article, I now understand that the real reason is that I'm setting too many properties on an Object, exceeding V8's limitation.

roberttod · on Aug 27, 2021

TIL You can write javascript numbers with underscore separators https://v8.dev/features/numeric-separators

codetrotter · on Aug 27, 2021

Same in Python 3 and Rust too.

https://www.python.org/dev/peps/pep-0515/

https://doc.rust-lang.org/rust-by-example/primitives/literal...

banana_giraffe · on Aug 27, 2021

And C# since 7.0

https://github.com/dotnet/csharplang/blob/main/proposals/csh...

First time I saw it was in C#, I love it personally

kevin_thibedeau · on Aug 28, 2021

It is lifted from Ada. A language ahead of its time.

pa7ch · on Aug 28, 2021

And Go since 1.13 https://golang.org/doc/go1.13#language

de6u99er · on Aug 27, 2021

And Java since Java 7, which was released in 2011.

nayuki · on Aug 27, 2021

Note that different languages have slightly different syntax rules.

0_0: Legal in Java, legal in Python, illegal in JavaScript.

0__0: Legal in Java, illegal in Python, illegal in JavaScript.

0x_0: Illegal in Java, legal in Python, illegal in JavaScript.

0_, 0_.0, 0._0: Illegal in all three.

It wouldn't surprise me if C# and C++(14) have differing edge cases too.
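The JavaScript column can be checked from a console with a small sketch like the one below. `parses` is a hypothetical helper name. It relies on the `Function` constructor throwing a `SyntaxError` when its body does not parse, so the literal is only compiled and never evaluated.

```javascript
// Hypothetical helper: true if this engine's parser accepts `literal`.
// new Function only compiles the body; nothing is executed here.
function parses(literal) {
  try {
    new Function(`return ${literal};`);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err; // anything else is unexpected
  }
}

const literals = ["1_000", "0x0_0", "0_0", "0__0", "0x_0", "0_", "0_.0", "0._0"];
for (const literal of literals) {
  console.log(literal.padEnd(6), parses(literal) ? "legal" : "illegal");
}
```

The first two literals are included only as known-good controls, since a separator between two digits is accepted.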

chrismorgan · on Aug 27, 2021

0_0, 0__0, 0x_0, 0_ and 0_.0 are legal in Rust. 0._0 is the only one of those that’s illegal (“`{integer}` is a primitive type and therefore doesn't have fields”).

Zababa · on Aug 28, 2021

To clarify the special case in Rust (0._0), it's due to the definition of identifiers: https://doc.bccnsoft.com/docs/rust-1.36.0-docs-html/referenc.... _0 is an identifier, as it's an underscore followed by one alphanumeric character. Thus, 0._0 means "the field _0 of 0". Since it's an identifier starting with _, you won't get any warnings if you don't use it. Small example: https://play.rust-lang.org/?version=stable&mode=debug&editio...

austinjp · on Aug 27, 2021

scns · on Aug 28, 2021

Interesting, I saw it first in Kotlin.

omegalulw · on Aug 27, 2021

Interestingly, why do they specify base 10 integers as: nonzerodigit (["_"] digit)* | "0" (["_"] "0")*

Isn't digit (["_"] "0")* simpler?

Jasper_ · on Aug 27, 2021

012345 is the old way of doing octal notation in Python, which has its roots in C. I'm guessing they want to prevent ambiguity by making it so that it's an error for "012345" to be a decimal literal.

tdeck · on Aug 27, 2021

I always thought this was one of the worst notation conventions in modern languages.

Gibbon1 · on Aug 28, 2021

I've felt for 35 years they could do everyone a favor and break everything that depends on a leading zero to represent octal.

nicoburns · on Aug 27, 2021

Not only that, octal notation is actually valid in non strict-mode versions of JavaScript (which is still the default unless you opt-in with "use strict"). So it would be backwards incompatible outside of strict mode, and very confusing if changing from non-strict to strict-mode changed the semantics without giving an error/warning.
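A quick way to see the difference between the two modes is sketched below. It assumes only that the `Function` constructor compiles its body in sloppy mode unless the body itself starts with a "use strict" directive, which lets both modes be tried from the same script.

```javascript
// Sloppy mode: a leading zero makes this a legacy octal literal.
const sloppy = new Function("return 010;");
console.log(sloppy()); // 8

// Strict mode: the same literal is rejected when the body is compiled.
let strictError = null;
try {
  new Function('"use strict"; return 010;');
} catch (err) {
  strictError = err;
}
console.log(strictError instanceof SyntaxError); // true

// The explicit 0o prefix is accepted in both modes.
console.log(0o10); // 8
```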

dr_zoidberg · on Aug 27, 2021

Yes, it gives an error -- if you want decimals, you can't use leading 0's:

    Python 3.6.3 (v3.6.3:2c5fed8, Oct  3 2017, 18:11:49) [MSC v.1900 64 bit (AMD64)]
    Type 'copyright', 'credits' or 'license' for more information
    IPython 7.2.0 -- An enhanced Interactive Python. Type '?' for help.

    In [1]: 01
      File "<ipython-input-1-3351d58b1d3b>", line 1
        01
         ^
    SyntaxError: invalid token


    In [2]: 0o1
    Out[2]: 1

    In [3]: 0o10
    Out[3]: 8

thewakalix · on Aug 28, 2021

And Haskell with the NumericUnderscores GHC extension.

https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/nume...

lucb1e · on Aug 28, 2021

Since 2019 in the leading browsers (Firefox and Chromium); 2020 if you use something else like Edge or Samsung Internet; 2021 for Chrome or Firefox on mobile... and of course you can forget about MSIE or Opera Mobile support as usual.

https://caniuse.com/mdn-javascript_grammar_numeric_separator...

sarreph · on Aug 27, 2021

You can also write JavaScript numbers with E notation (i.e. multiplied by 10 to the power of x), which is basically what I do with all numbers above 1,000 or below 0.001 now. It has the rare benefit, for big / small numbers, of being both more readable and more concise.

Example:

1e3 === 1000;

1e-3 === 0.001;

45.2e6 === 45200000;

jraph · on Aug 28, 2021

D has had this for quite some time, C++ now too (with quote characters instead of underscores)

magoghm · on Aug 28, 2021

And also in Ruby.

petepete · on Aug 28, 2021

I always thought Perl was the originator of this style, which would explain why it's been in Ruby forever.

I wonder where it started if it existed before Perl.

pjmlp · on Aug 28, 2021

Ada and ALGOL dialects.

danvk · on Aug 28, 2021

By default, Node has a relatively low limit on the amount of memory it will use (I believe 1GB). As you get close to this, it thrashes as it tries to garbage collect more and more frequently.

You can increase the limit by running `node --max-old-space-size=8192`. But that doesn't seem to fix this specific issue, either with the Map or the Object!
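To see what limit a given Node process is actually running with, a sketch like this can help. It uses the built-in `v8` module's `getHeapStatistics()`. The script name `heap-limit.js` is made up for the example, and the default value varies by Node version and machine, so no particular number is assumed here.

```javascript
// Try: node heap-limit.js
// and: node --max-old-space-size=8192 heap-limit.js
const v8 = require("v8");

// heap_size_limit is reported in bytes.
const limitBytes = v8.getHeapStatistics().heap_size_limit;
console.log(`V8 heap size limit: ${(limitBytes / 2 ** 20).toFixed(0)} MiB`);
```

Raising this number gives the process more room overall, but as noted above it does not appear to lift the cap on entries in a single Map or Object.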

Mountain_Skies · on Aug 27, 2021

Ran into something similar years ago on the Blackberry where if you created more than 2^15 objects, performance could drop by something like 90%. There was no error but the performance hit was bad enough to make the app unusable.

rbanffy · on Aug 27, 2021

16,777,216 keys ought to be enough for anyone.

chrismorgan · on Aug 27, 2021

Hmm, one key for even every possible Unicode code point (including surrogates, noncharacters and reserved ones, all the way from U+0000 to U+10FFFF) makes 1,114,112 keys. That’d be a big keyboard with rather poor random access characteristics. Allow 20×20mm for each key (seems a fairly typical sort of size) and that’s almost 446m². I dunno what you’ll use your 16,777,216 keys for (I agree, it ought to be enough for anyone), but as a full-sized keyboard it’s going to have a surface area of at least two thirds of a hectare. You could build a giant hamster ball a bit over 46m in diameter and lay the keys out on the inside, travelling the world as you spin the ball in order to find the right key. (The Unicode 13 hamster wheel keyboard might be more reasonable, its 143,859 keys requiring a diameter of less than five metres. Perhaps more manageable, but some keys might be harder to access as you roll through cities since the ball is no longer big enough to almost ignore many houses.)
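The arithmetic can be reproduced with a few lines. This is only a sketch under the comment's own assumptions: 20×20 mm per key, and keys laid out on the inside of a sphere with surface area π·d². The function and constant names are made up for the example.

```javascript
const KEY_AREA_M2 = 0.02 * 0.02; // one 20 mm x 20 mm key, in square metres

// Total area of a keyboard with `keys` keys, in square metres.
function keyboardArea(keys) {
  return keys * KEY_AREA_M2;
}

// Diameter of a sphere whose surface (pi * d^2) has room for `keys` keys.
function ballDiameter(keys) {
  return Math.sqrt(keyboardArea(keys) / Math.PI);
}

const cases = {
  "All code points": 0x110000, // U+0000 through U+10FFFF
  "Map limit (2^24)": 2 ** 24,
  "Unicode 13": 143859,
};
for (const [label, keys] of Object.entries(cases)) {
  const area = keyboardArea(keys).toFixed(1);
  const diameter = ballDiameter(keys).toFixed(1);
  console.log(`${label}: ${area} m^2, ball diameter ${diameter} m`);
}
```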

quickthrower2 · on Aug 28, 2021

I wonder how many octaves and what frequency you’d get to on a piano with this many keys?

This is 1,398,101.333333333 octaves (at 12 keys per octave), so I guess even starting at 1 Hz the top frequency is 2^1398101 Hz.

Gamma rays are about 10^22 Hz, so the frequency is probably so high as to blow up the universe or something.
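A rough check of those numbers, assuming 12 keys per octave and a 1 Hz bottom note as above. The variable names are made up for the example. The top frequency is far beyond what a 64-bit float can hold, so the sketch counts its decimal digits with a logarithm instead.

```javascript
const KEYS = 2 ** 24;
const wholeOctaves = Math.floor(KEYS / 12); // 12 semitones per octave

// Doubling 1 Hz once per octave overflows a double long before the top key.
const topHz = 2 ** wholeOctaves;
console.log(wholeOctaves, topHz); // 1398101 Infinity

// Number of decimal digits in 2^wholeOctaves.
const decimalDigits = Math.floor(wholeOctaves * Math.log10(2)) + 1;
console.log(decimalDigits);
```

The digit count comes out at roughly 420,000, against the 23 digits of a gamma-ray frequency around 10^22 Hz.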

chrismorgan · on Aug 28, 2021

I’d suggest smaller intervals, like a cent instead of a semitone, but the numbers are so absurdly large that dividing it by a hundred doesn’t really help anything. You’d need a good deal more than 64-bit floating point numbers for any pertinent calculations, 10³⁰⁸ is pretty tiny; and the theoretical string tensions involved would be quite something.

Perhaps an organ is the right answer instead, with one key per pipe rather than the usual multiplexing. The Wanamaker Organ makes a good start with 28,750 pipes across 484 ranks (though it multiplexes them across a paltry six manuals plus pedalboard), but you’re going to need to come up with a lot more sounds, and the speed of sound is low enough that it’s not going to work very well done acoustically. (Hmm… y’know, there’s actual scope for an art piece here, an organ but producing light instead of sound, like lidar is to sonar. I think you’d struggle to produce anything the human eye would appreciate, but it’s a fun concept.) Or perhaps it should be splitting each manual into one for each combination of sounds it controls. That might well go over the 2²⁴, I’m not thinking about the numbers too exactly.

scns · on Aug 28, 2021

Ah yes, the gamma ray, one of my favourite phenomena. Giving off as much energy as our dear sun radiates in her lifetime.

Well, maybe we use 10e-absurdlyLargeNumber as our base note then, just to stay safe. Hm, how many years will we have to wait until the lowest note has gone through one cycle?

scns · on Aug 28, 2021

The most elaborate joke I've read on HN so far, well done! Thank you and the others who rolled with it.

chrismorgan · on Aug 28, 2021

It was actually a case of me going and doing something else and then returning to this tab and reading first of all the comment that I responded to; yet having forgotten the context, I immediately thought of keyboard keys; and the concept amused me so I ran with it and it ended up somewhere a bit different from where I expected!

scns · on Aug 29, 2021

What a lucky accident.

rbanffy · on Aug 28, 2021

Thank you. I’ll be here all week.

austinjp · on Aug 27, 2021

:)

https://skeptics.stackexchange.com/questions/2863/did-bill-g...

raphaelj · on Aug 27, 2021

It's easy to imagine an app reaching that number of keys. Think games, scientific simulations or GIS apps.

fridif · on Aug 27, 2021

What if I'm trying to model every atom in a multi-cellular being?

Someone · on Aug 27, 2021

That question doesn’t show understanding of the relative sizes of atoms and cells.

https://www.thoughtco.com/how-many-atoms-in-human-cell-60388...:

“According to an estimate made by engineers at Washington University, there are around 10¹⁴ atoms in a typical human cell”

⇒ it isn’t necessary to use “multi-cellular” there by a wide, wide margin.

fridif · on Aug 27, 2021

It would only be 800TB to store every atom in a cell as a long?

Samsung is making 512 GB RAM sticks.
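For what it's worth, the 800TB figure follows from the 10¹⁴ atoms-per-cell estimate quoted above at one 8-byte long per atom. A small sketch of the arithmetic, with made-up constant names and decimal units (1 TB = 10¹² bytes, 512 GB = 512×10⁹ bytes) as assumptions:

```javascript
const ATOMS_PER_CELL = 1e14; // estimate quoted earlier in the thread
const BYTES_PER_LONG = 8;    // one 64-bit integer per atom

const totalBytes = ATOMS_PER_CELL * BYTES_PER_LONG;
const terabytes = totalBytes / 1e12;
const sticksNeeded = Math.ceil(totalBytes / 512e9); // 512 GB modules

console.log(`${terabytes} TB, or ${sticksNeeded} sticks of 512 GB`);
```

That is one number per atom for a single cell, before storing positions, bonds or anything else a model would need.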

peanut_worm · on Aug 27, 2021

I think you’d have a hard time modeling a single celled organism

kaba0 · on Aug 27, 2021

Hell, we can’t even simulate a single more complex protein.

H8crilA · on Aug 27, 2021

The difference between multi cell and single cell is not the hard part.

l-lousy · on Aug 27, 2021

Don’t use JS

munk-a · on Aug 27, 2021

Indeed - you'll want to upgrade to Basic.

kook_throwaway · on Aug 27, 2021

PHP is webscale.

biglost · on Aug 28, 2021

MongoDB! Great video, by the way. Link: http://www.mongodb-is-web-scale.com/

scns · on Aug 28, 2021

Firefox warns me of a potential security risk on that site.

(edit) on proceeding it redirects to a site that won't be found.

retbull · on Aug 27, 2021

I feel like if you are working with genome levels of data maybe you shouldn't be using JS to do the work. JS is a great tool in some cases but not really a scientific one. I guess you use whatever you know.

H8crilA · on Aug 27, 2021

(Not being sarcastic at all) I remember the times when bringing JavaScript anywhere near data analysis would be considered a cute joke.

brundolf · on Aug 27, 2021

Use what you're comfortable with if it'll do the job. In this case turns out it won't, but it was perfectly plausible it might've.

scbrg · on Aug 28, 2021

The problem, of course, is when you've worked on the assumption that it will do the job (hey, it was perfectly plausible!) and after six months' work you run into a showstopper like this.

albertgoeswoof · on Aug 27, 2021

[flagged]

dang · on Aug 28, 2021

Please don't do this here.

jdeaton · on Aug 27, 2021

The actual mistake here was trying to do bioinformatics in javascript.

deepsun · on Aug 28, 2021

Why? This is Hacker News. Because we can.

mastrsushi · on Aug 27, 2021

I always read these with my arms crossed like "pffffff yeah obviously"

H8crilA · on Aug 27, 2021

There's no limit for array size in C++, or for any core structure. Other than the size types, which naturally scale up as we get newer architectures (obviously 2^64 for most people at the moment).

stkdump · on Aug 28, 2021

I think C++ pointers being 64 bit will stick with us forever, unless some fundamental paradigm shift happens in what C++ is used for. I also think the size of short, int, long long, float and double will not change anymore, ever. The reasons we had for changes in the past will not happen anymore from now on, and the cost of changes only increases over time. We might see the sizes (and IEEE-754 floats) be officially standardized at some point, like twos complement is standardized now.

H8crilA · on Aug 29, 2021

I mean, yes, size_t will stay at 64 bits for at least a while, 2^64 bytes is 16 exabytes. Once this is insufficient then it will be slowly switched to 128 bits. Some systems already use 128 bits for storage-related arithmetic.
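The 16 exabyte figure is easy to confirm with BigInt, since 2^64 is too large for a JavaScript number to represent exactly. In this sketch the "exabyte" is the binary one (2^60 bytes, strictly an exbibyte), and the variable names are made up for the example.

```javascript
const addressableBytes = 2n ** 64n; // bytes reachable with a 64-bit size_t
const bytesPerExbibyte = 2n ** 60n;

console.log(addressableBytes);                    // 18446744073709551616n
console.log(addressableBytes / bytesPerExbibyte); // 16n

// For comparison: the largest integer a plain number holds exactly is 2^53 - 1.
console.log(Number.MAX_SAFE_INTEGER);
```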

kibwen · on Aug 28, 2021

The C++ standard may not impose a limit, but any given C++ implementation will. Likewise, the limit in the OP here isn't part of the Javascript standard, just a Chrome implementation detail.

lionkor · on Aug 28, 2021

that would be..? std::numeric_limits<size_t>::max()?

kibwen · on Aug 28, 2021

Yes, size_t should be the limiting factor.

tialaramex · on Aug 28, 2021

If we're saying "Pfffff" at things, let's take a moment to appreciate that by the magic of not actually storing them Rust gets to store all of nothing in no space.

Rust's arrays of something can only have up to std::isize::MAX elements. But by choosing to store nothing instead of something, we can go larger to std::usize::MAX with e.g. [(); std::usize::MAX] and that'll actually work, at runtime, even on a modest computer since you don't need space for std::usize::MAX of anything.

If usize::MAX of a zero size type isn't enough for you, might I interest you in an Iterator that contains not merely usize::MAX but infinity of any Default type of your choosing? And if you don't need that many, you can always just take() fewer of them no charge.

antonvs · on Aug 28, 2021

Right there with you

_dh54 · on Aug 27, 2021

If the user has the memory there should be no hard limit. Oh wait… there’s the pesky issue of memory overcommit…

xsmasher · on Aug 27, 2021

It COULD be written to be infinitely scalable in size (up to allowed memory) but that comes with a speed cost.

I remember something about NSArray switching its layout (and speed characteristics) after adding a certain number of items, but I can't find a reference now.

kccqzy · on Aug 28, 2021

https://ridiculousfish.com/blog/posts/array.html