Honestly the title doesn't do it justice. For a plotting tool it's not an impossible task, but making a webpage this responsive is incredibly hard, so well done there.
Kudos for directing folks to other projects if uPlot isn't fast enough for them. Every developer should take the time to write out appropriate and inappropriate use-cases on the readme.
Hi, I'm very interested in this. I've been using dask/plotly for a side project but it's just too slow. If you know dask, do you think you could discuss which features of dask your project has and doesn't have?
I believe so. An optimization of Datashader is that it _doesn't_ plot all the points, but rather optimizes which points actually influence pixel values.
This is awesome. I wrote some software to pull readings off my weather station and stream them to the browser. The plotting library that I use is pretty awful. Besides being slow, it's constantly hosing up the numbers on the y-axis. I want to give uPlot a shot.
Hopefully the upvotes and being on the front page are enough, but if not, I just want to say props, this is amazing. Already had it saved in my bookmarks, glad to see it pop up again :)
The 150ms benchmark in the title is really selling this short; the performance is very impressive. The 150ms seems to refer to the time it takes to initialize the graph, and with a hot cache that is more like 50ms for me here. Redrawing seems much, much faster.
I have done some visualization with WebGL because I couldn't get it fast enough with just drawing lines. That was a while ago so I'm not sure about the details, but even a simple prototype just drawing a few tens of thousands to a hundred thousand lines using canvas was slower than this for me (subjectively).
I'll have to look at the code later, but I'm curious where this library is getting its performance from. Before I saw this I would have said you can't do this at this speed without using a previously downsampled version of the data points; now I'm not so sure.
I'm still looking at the performance tab in the browser dev tools and how impressively empty the main Javascript usage plot is.
It looks like the mouse lines and the selection highlight are just partially-transparent divs stacked on top of the canvas that get moved around, so nothing actually gets redrawn unless the date range changes (which is a pretty clever approach!).
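To illustrate the observed technique (this is not uPlot's actual source; chartWrap and the styling are made up for the sketch), the crosshair can be a positioned div overlaying the canvas that gets moved with a transform on mousemove, so the chart itself never repaints while the cursor moves:

    const vLine = document.createElement("div");
    vLine.style.cssText =
      "position:absolute;top:0;bottom:0;width:1px;background:rgba(0,0,0,.3);pointer-events:none;";
    chartWrap.appendChild(vLine); // chartWrap: the positioned container around the canvas (assumed)

    chartWrap.addEventListener("mousemove", (e) => {
      const x = e.clientX - chartWrap.getBoundingClientRect().left;
      vLine.style.transform = `translateX(${x}px)`; // composited move, no canvas redraw
    });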
but really tricky to align correctly across different screen pixel densities, where rounding errors creep in. it's still not pixel-perfect, but i decided it was good enough.
That's something I expected; you really don't want to trigger redraws on mouse over. What surprised me was that I couldn't tell the times where I zoomed in or out from the JS flame chart. Usually that is really obvious, but in this case zooming was so fast that you could hardly see it. And this graph has around 194k data points (388k in the source data, I assume that's x and y). I'm not entirely sure about the number of points here, I'm taking that from the JSON delivered to the site.
The selection highlight is unusual, I admit. That's something I'd just skip if I were going for high performance.
Very nice project, but I just want to randomly point out that 3 years of hourly data is really not as much as it may sound. In fact, it's 3 times less than 24 hours of per-second data, which is a pretty common scale for all sorts of real-time monitoring tools we often use. These fine-sliced stats really pile up quickly...
> but I just want to randomly point out that 3 years of hourly data is really not as much as it may sound.
you're right, it isn't. and yet many js charting libs struggle even with this.
on my i5 thinkpad with integrated gpu uPlot can render 600 series x 8,000 datapoints in ~2000ms [1]. and it finishes this job in ~7s on my 2015 sony z5 compact phone. so there's that :)
but really, pumping 4.8M datapoints into a browser is probably not a great idea. you're gonna want to aggregate on the server at some point. just sayin'.
Just want to ask, because I love talking about render performance: have you tried doing this using an offscreen canvas? That should allow you to move a lot of the work to a worker, so you avoid blocking rendering with JS. It probably won't speed up the total time to finished render, but I assume it will lock the page for a shorter period of time?
If the rendering is this controlled (and sparse), blocking the thread with it isn't really a concern. And with this data size, serialization/deserialization between the main and worker thread would probably become nontrivial.
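For reference, a bare-bones sketch of the OffscreenCanvas idea discussed above (hypothetical worker file name; assumes the data is already in a typed array, so its buffer can be transferred rather than copied):

    // main thread: hand the canvas and the raw data buffer to a worker
    const canvas = document.querySelector("canvas");
    const offscreen = canvas.transferControlToOffscreen();
    const worker = new Worker("plot-worker.js");
    worker.postMessage({ canvas: offscreen, buf: data.buffer }, [offscreen, data.buffer]);

    // plot-worker.js: drawing happens off the main thread
    self.onmessage = ({ data: { canvas, buf } }) => {
      const ctx = canvas.getContext("2d");
      const ys = new Float64Array(buf);
      // ...path construction and ctx.stroke() would go here, without blocking the page
    };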
Super cool! It would be great if you could provide some insight into how you built this and the kind of tricks you had to use to make this possible. Looking forward to a blog post in the future :)
- there is no per-datapoint memory allocation beyond whatever is necessary for the browser to construct the canvas Path2D object. this keeps the memory pressure low and the GC nearly silent.
- the number of draw commands is reduced by accumulating the min/max data values per pixel (see the sketch after this list)
- uPlot does not generate axis ticks by walking the data. it only uses min/max of the x and y ranges and finds the divisions from that.
- there is no mass-creation of Date objects, data is kept in timestamp format except when hovered.
- the date/time formatting is done by pre-compiling templates and not re-parsing them all the time.
- the cursor interaction uses a binary search over the x array
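For illustration, here is a simplified sketch of two of the tricks listed above (not uPlot's actual code; the helper names xToPx/yToPx are made up, and connections between adjacent pixel columns are omitted for brevity):

    // 1. per-pixel min/max accumulation: when several datapoints land in the same
    //    x pixel, a single vertical segment from min to max is enough to draw.
    function buildPath(xs, ys, xToPx, yToPx) {
      const path = new Path2D();
      let px = xToPx(xs[0]);
      let min = ys[0], max = ys[0];
      for (let i = 1; i < xs.length; i++) {
        const p = xToPx(xs[i]);
        if (p === px) {                      // same column: just widen the min/max range
          if (ys[i] < min) min = ys[i];
          if (ys[i] > max) max = ys[i];
        } else {                             // new column: emit one segment for the previous one
          path.moveTo(px, yToPx(min));
          path.lineTo(px, yToPx(max));
          px = p;
          min = max = ys[i];
        }
      }
      path.moveTo(px, yToPx(min));
      path.lineTo(px, yToPx(max));
      return path;                           // drawn with a single ctx.stroke(path)
    }

    // 2. cursor hover: binary search over the sorted x array instead of scanning it
    function closestIdx(xs, x) {
      let lo = 0, hi = xs.length - 1;
      while (lo < hi) {                      // lower-bound search
        const mid = (lo + hi) >> 1;
        if (xs[mid] < x) lo = mid + 1;
        else hi = mid;
      }
      // pick whichever neighbor is actually nearer
      return lo > 0 && x - xs[lo - 1] < xs[lo] - x ? lo - 1 : lo;
    }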
it's highly dependent on what's actually drawn. i regressed it by accident and didn't notice a huge difference until i randomly opened the stress test (which is also densely packed, so probably a worst case for AA) [1]. it went up by a factor of 2-4. can't remember exactly.
Let this be a lesson that JavaScript is not slow. The DOM can be slow, and downloading large bundles can be slow, and any code can be made slow if you try hard enough to write it poorly. But JavaScript the language is not slow in 2020.
When we talk (or rant) about the performance of JS-based apps - running on the web or in Electron or otherwise - it's really important that the conversation focus on the factors that actually matter.
No, JS is slow(ish). The GC pressure alone will kill you. This is fast despite being in JS by simply doing as little as possible (while still meeting the requirements on the end result, which is where the genius lies).
The fastest code is no code. No code is fast even with pathologically slow languages/runtimes.
It's no slower than Python. Probably faster unless you're staying within a native library like NumPy.
But that wasn't my main point: my main point is that people love to rant about "JavaScript apps" while maintaining (willful?) ignorance about what the actual factors are that manifest as the slowness they experience. Sometimes it's poor usage of the DOM. Usually it's ads. It's almost never the unavoidable overhead of JS cycles.
> people love to rant about "JavaScript apps" while maintaining (willful?) ignorance about what the actual factors are that manifest as the slowness they experience.
Are you sure you got this right?
Just because JS is fast[0] doesn't mean we cannot complain about Javascript apps that should have been plain web pages?
It is clearly possible to create advanced Javascript apps that are enjoyable even to people like me.
For most pages I use as a consumer, I don't see much value added by the frontend code: I see the value of autocomplete, drag-and-drop etc., but I would much prefer if my CPU stopped chewing after web pages were loaded and rendered.
[0]: yes, I agree - in most cases the Javascript language or engine is not the cause of the performance problems on the web
Whether or not a given site should be client-side rendered is a valid discussion
Ad bloat is a valid discussion
Lazy development practices in modern sites/apps are a valid discussion
But in my experience, these as well as other less-legitimate issues all tend to get lumped under the banner of "JavaScript stuff == slow", without any nuance.
Can't say for sure, but the way I read them, most complaints about Javascript performance are really about the results of the abuse of Javascript applications everywhere.
I guess Javascript just happens to be the common thing between a number of them, and people would be just as annoyed if the multi-megabyte, cpu-hogging, data-stealing monstrosities were hand-crafted in assembly ;-)
And: If someone complained about Javascript being slow for scientific calculations I'd probably call them out on it ;-)
> but I would much prefer if my CPU stopped chewing after web pages were loaded and rendered.
this happens when it's programmed with performance as an afterthought, which is sadly how things work when the priority is to just ship more features and continually ingest the dependencies with the most features, with little vetting.
For me as a user it seems to be: loading google analytics, loading fonts that seem to be improperly cached, loading chat windows, assistance widgets and spies, and dynamically adding things after page load that cause reflows.
It's among the fastest scripting languages. Only LuaJIT turned out to be faster (than Node.js) in my benchmarks. Node is on par with Dart most of the time and the latter is statically typed.
Question: in the demo you load papaparse but the data is delivered as JSON?
Would you mind sharing what your backend looks like for the project where you're using this, and how do you extract all of this data at once from storage?
> Question: in the demo you load papaparse but the data is delivered as JSON?
good catch, leftovers from original PoC i made locally. the exported data was still in CSV back then. i should clean that up!
> Would you mind sharing what your backend looks like for the project where you're using this, and how do you extract all of this data at once from storage?
i just did a manual export from Analytics. how you get the data out is up to you, but a time-series or columnar database would probably be a good start :)
That looks amazing, nice job! I was recently trying to create a web dashboard with several graphs, each with a few hundred data points. None of the libraries felt "snappy" on a mobile device in terms of load time or responsiveness. I ended up abandoning the project because of that, but if only I had known about this library at the time...
The only thing I wish µPlot had is support for tooltips rather than numbers in the legend, but that's a small price to pay for this level of performance.
I didn't even read the title before clicking and was blown away by the speed. Honestly thought something had acted up, since you're not used to seeing things render this fast.
interestingly, focusing on performance in web dev (and actually delivering on it) really presents new UI challenges, because users are not used to near instantaneous feedback in web-apps. you have to start adding complexity like css transitions, and UI delays just to reduce user confusion.
I'm somewhat confused. Is 150ms considered low? That's less than 7fps.
3 * 365 * 24 = 26,280. Emitting 26k points is like child's play for any program, are we really so far down in bloat that this is considered impressive? Because this is not impressive. A 16MHz AVR could execute 91 instructions per sample in those 150ms, more than enough for plotting that data.
A 3GHz machine could execute 17,123 instructions per sample, and it would probably need fewer instructions than the AVR would.
FWIW, the majority of this 150ms is bootup time and includes initializing the DOM & canvas, JITing the js, downloading & parsing the dataset, and actually running the code, which includes data downsampling & gap detection/clipping.
try to get this perf on the web, and then you can re-assess your statement. if it was easy, then every other js charting lib would not be struggling to do it, right?
finally, you cannot extrapolate from the 26k/150ms number. uPlot can draw 4.8M points in ~2000ms on an i5 with integrated gpu (after bootup amortization).
I was slightly disappointed when I zoomed in, as I expected 3 years of data in 150ms increments (600 million measurements), but that's probably not possible
Assuming there is some culling, I've found myself somewhat obsessed with methods of dataset culling. If there's a single outlier, for example, some methods would skip it when it should really be highlighted.
https://knowledge.ni.com/KnowledgeArticleDetails?id=kA00Z000... describes a good algorithm to use to keep peaks. At most you need 4x the chart width in data points. In practice, you can easily reduce that. I've implemented the algorithm before and it can be done in linear time.
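Not the exact algorithm from the linked article, but a sketch of the general idea: keep both the min and the max of every bucket, so isolated spikes survive the downsampling.

    function minMaxDownsample(ys, buckets) {
      const out = [];
      const per = Math.ceil(ys.length / buckets);
      for (let i = 0; i < ys.length; i += per) {
        let min = ys[i], max = ys[i];
        for (let j = i + 1; j < Math.min(i + per, ys.length); j++) {
          if (ys[j] < min) min = ys[j];
          if (ys[j] > max) max = ys[j];
        }
        out.push(min, max);  // 2 points per bucket; ordering by index would preserve shape better
      }
      return out;
    }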
Slick! I too felt that many libraries were not slick/fast enough, but I eventually ended up using react-stockcharts (in canvas mode) for my use case and was quite satisfied with its performance.
Could you compare your performance against this one if possible?
Minor suggestion: not really a bug, but inconvenient UX. When I start a selection and then move my mouse outside the chart, it stops selecting. If you added the mouse-move listeners to the document/body, it would keep working when the mouse leaves the chart.
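A minimal sketch of the suggested fix (chartEl and updateSelection are placeholders, not uPlot's API): register the move/up listeners on the document during the drag, so the selection keeps tracking even when the pointer leaves the chart element.

    chartEl.addEventListener("mousedown", () => {
      const onMove = (e) => updateSelection(e.clientX); // assumed selection-update helper
      const onUp = () => {
        document.removeEventListener("mousemove", onMove);
        document.removeEventListener("mouseup", onUp);
      };
      document.addEventListener("mousemove", onMove);
      document.addEventListener("mouseup", onUp);
    });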
For a moment I thought you were plotting 3 years of data collected every 150ms - wow! What you have here is still very nice. I had to work with displaying/charting interval data in the past, and having it load quickly is great, especially when toggling new dataset intervals.
Would it be possible to allow the cursor to still update during a horizontal scroll? When I move the mouse left and right it does that, but it doesn't move when I traverse the page by scrolling horizontally.
Either way, just a bit of feedback on a beautiful page
I work with analytic tools to watch production systems and they boil my data down to a pathetic number of datapoints prior to rendering, when you know computers — even in a browser — are vastly more capable.
I don't think systems like this have ever been really limited to the amount of data they can _present_. A bigger limiting factor is the volume of data they can ingest, at what cardinality (ideally arbitrary), and how quickly they can get it back to you, transformed in some way (e.g. "i want to see p95").
That said, things like SignalFx or Datadog are certainly slow at the presentation layer, too ;)
I find it curious that the volume of users is similar for a given day of the month regardless of year. For instance, the first of the month is not always the same day of the week.
Eyeballing the data, there doesn't seem to be a strong weekday effect. Not sure if this is real data (human activity data usually show weekday effects).
If there was a very pronounced weekday effect, a quick hack is to use a 364-day instead of a full-year offset for comparison (364 = 52 × 7, so the days of the week stay aligned).
For a dataset like this, Highcharts would be slower for sure since it will create DOM SVG components for each of the points. <canvas> (which this is built with) is much quicker at presenting tons and tons of data points like this.
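A rough illustration of that difference (neither library's actual code; xs/ys, xToPx/yToPx, svgEl and ctx are assumed): an SVG renderer ends up with one DOM node per datapoint, while a canvas renderer batches everything into a single path and one stroke call.

    // SVG-style: ~26k elements for this dataset, each with layout/paint/memory overhead
    for (let i = 0; i < xs.length; i++) {
      const pt = document.createElementNS("http://www.w3.org/2000/svg", "circle");
      pt.setAttribute("cx", xToPx(xs[i]));
      pt.setAttribute("cy", yToPx(ys[i]));
      pt.setAttribute("r", 2);
      svgEl.appendChild(pt);
    }

    // canvas-style: one path, one draw call
    ctx.beginPath();
    ctx.moveTo(xToPx(xs[0]), yToPx(ys[0]));
    for (let i = 1; i < xs.length; i++)
      ctx.lineTo(xToPx(xs[i]), yToPx(ys[i]));
    ctx.stroke();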
additionally, it's hard to overstate how impressive modern javascript JITs are, as well as GPUs.
every time i visit a web page that downloads 2MB of js and spins up my CPU fan, i feel ashamed that this is the industry (webdev) where i make my living.
at least with uPlot i show that all this waste is not necessary.
The problem with large websites is that there's a dozen layers of abstractions leading to large and inefficient code. But when it comes to just raw speed (iterating over a huge array of numbers in a tight loop, like drawing this chart), then it's blazing fast especially with JIT. That's specifically where a lot of people underestimate JS. I've seen a lot of people do things in the backend when they can just send all the data and do all the filtering/sorting locally leading to much faster UX.
Interesting read indeed.
Too bad this was posted to Hackernoon, had a hard time reading it due to the header bar constantly dropping down into the text I was reading. So frustrating!
If you wanted to render these on a regular basis, WebGL is fairly straightforward, and works really well for this simple sort of rendering. You could do this with one polygon and a small fragment shader (treat the data as a texture, and use SDF to draw the line and fills), or use the actual geometry (render as a triangle strip; and separate the line if you want to do more interesting stuff in your fragment shader).
I probably wouldn't even generate an SDF; instead, I'd read the data directly from a 1D texture and fill the anti-aliased line directly in the fragment shader.
Yeah, I meant more in the abstract sense, rather than "create a two-dimensional texture of the distance field", which as you point out would be unnecessary.
You could probably sample all the 1D textures in one pass, and draw all the lines and fills there. One additional nice side effect of this is that you can easily have sequences at different resolutions.
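As a very rough sketch of that idea (untested; the uniform names are made up, and the texture upload, geometry, and blending setup are omitted): the series is uploaded as a 1-pixel-tall texture with one texel per column, and the fragment shader compares each fragment's y against the sampled value to draw an anti-aliased line plus a translucent fill below it.

    const fragSrc = `
      precision mediump float;
      uniform sampler2D uData;    // one normalized y value per x column
      uniform vec2 uResolution;   // canvas size in pixels
      uniform float uHalfWidth;   // line half-width in pixels
      void main() {
        vec2 uv = gl_FragCoord.xy / uResolution;
        float y = texture2D(uData, vec2(uv.x, 0.5)).r;  // sampled series value (0..1)
        float d = abs(uv.y - y) * uResolution.y;        // vertical pixel distance to the line
        float line = 1.0 - smoothstep(uHalfWidth - 1.0, uHalfWidth + 1.0, d);
        float fill = step(uv.y, y) * 0.25;              // translucent area fill below the line
        gl_FragColor = vec4(0.2, 0.5, 0.9, max(line, fill));
      }`;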
A small part of the problem is drawing the trend charts. So I decided to make uPlot [1] to see what was really possible.
[1] https://github.com/leeoniya/uPlot