Introduction to D3 (observablehq.com)
599 points by kickout on March 3, 2020 | 92 comments



Technically, D3 is a JavaScript library, but in reality it is much more than that. In this article, it is called a "visualization grammar". I have heard it called the "jQuery of diagramming", a "declarative DSL for data visualization", and so on.

Since 2011, there have been numerous cycles of best-of-breed JavaScript SPA frameworks and libraries (everything from jQuery, through AngularJS, to React), but in data visualization, D3 seems to hold its position very well.

What are the things that D3 did right to get the acclaim, and why has it kept it for so long?

D3 is approaching 10 years since its initial release. Has it stood the test of time, and will it keep its power for the next 10 years?


> What are the things that D3 did right to get the acclaim, and why has it kept it for so long?

Once upon a time, there was an infovis library for Java called Prefuse. Then the developers abandoned that library for a Flash infovis library called Flare. A JS library called Protovis was inspired by Flare, and then Protovis itself was replaced with D3.

I've used 3 of those libraries (never wrote any Flash code). The only one I would actually recommend people use is D3--it's literally the first (honestly, only) infovis toolkit I've used that doesn't feel like pulling teeth to get anything done.

What D3 does differently from its lineage is that it is data-centric. Each datapoint is an object, corresponding to a DOM node, and the layouts in D3 don't actually draw anything; they just set properties on the object--you're responsible for actually converting those into SVG attributes for display.
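
A minimal sketch of that contract (my example, assuming D3 v5.8+ for .join and an existing <svg> on the page): the tree layout only computes x/y on each node object, and you turn those into SVG attributes yourself.

  import * as d3 from "d3";

  const data = { name: "root", children: [{ name: "a" }, { name: "b" }] };

  // The layout mutates the node objects (sets node.x, node.y); it draws nothing.
  const root = d3.hierarchy(data);
  d3.tree().size([400, 300])(root);

  // Converting those properties into SVG attributes is your job.
  d3.select("svg")
    .selectAll("circle")
    .data(root.descendants())
    .join("circle")
    .attr("cx", d => d.x)
    .attr("cy", d => d.y)
    .attr("r", 4);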

It sounds like this would be a nasty headache, but this way is actually much better than the traditional way of handling things (where the visualization is essentially treated as a display widget in a normal GUI toolkit). Any time you want to progress beyond a basic "display a graph with static data," you start to need a lot more fine-grained control over display elements. Want to link multiple displays of the same dataset, so that clicking on a datapoint in one view highlights it in all the others? That's very hard in Prefuse, but quite easy in D3.


Conversely, you often see D3 recommended to people who just want to draw a bar chart. That's a big learning curve and a steep cognitive overhead if you just want to do something fairly conventional.

Luckily there are now plenty of d3 wrappers to cover common cases.


If you have experience, which three would you recommend for business-as-usual presentations?



They kept it really low level and didn't impose many opinions on how to join the dots. There are functions to manipulate data, functions to make scales, and functions to render SVG (plus DOM bindings). These decisions make individual charts quite verbose, but the trade-off is that you can visualise anything you want. Just look on npm at how many charting libraries use d3 as a base.
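
A sketch of those separate pieces composed into a line chart (illustrative data and sizes, assuming an existing <svg>):

  import * as d3 from "d3";

  const data = [1, 3, 2, 5, 4];

  // Scales: functions mapping data space to pixel space.
  const x = d3.scaleLinear().domain([0, data.length - 1]).range([0, 400]);
  const y = d3.scaleLinear().domain([0, d3.max(data)]).range([300, 0]);

  // Shape generator: turns the data into an SVG path string.
  const line = d3.line().x((d, i) => x(i)).y(d => y(d));

  // DOM binding: the only step that touches the page.
  d3.select("svg")
    .append("path")
    .attr("d", line(data))
    .attr("fill", "none")
    .attr("stroke", "steelblue");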


This is true. We decided to use D3 directly to render charts at my job, and it was honestly kind of a terrible decision. Ideally you would use a library that uses D3 to render charts.


Same here, too much work just for simple charts


What made d3 a terrible decision?


My guess is that d3 is too low-level to make reuse simple, and that using a higher-level abstraction would make the code more maintainable.


> My guess is that d3 is too low-level to make reuse simple, and that using a higher-level abstraction would make the code more maintainable.

In what context? It's not that hard to collect some scales and axes into a JS object, mix it with judicious use of CSS, et voilà.

I understand that d3 is low-level and verbose, but I can't say that I have any regrets over deploying it in production.
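
Something like this sketch of the pattern (makeChartParts is my hypothetical name, not a d3 API):

  import * as d3 from "d3";

  // Hypothetical helper: bundle scales and axes so charts can share them.
  function makeChartParts({ width = 400, height = 300, data }) {
    const x = d3.scaleLinear().domain(d3.extent(data, d => d.x)).range([0, width]);
    const y = d3.scaleLinear().domain(d3.extent(data, d => d.y)).range([height, 0]);
    return { x, y, xAxis: d3.axisBottom(x), yAxis: d3.axisLeft(y) };
  }

  const parts = makeChartParts({ data: [{ x: 0, y: 1 }, { x: 5, y: 9 }] });
  d3.select("svg").append("g")
    .attr("transform", "translate(0,300)")
    .call(parts.xAxis);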


It's too low level. For reference, I just re-implemented the entire thing in React: just React, with SVG elements directly, not using D3 at all. And it was roughly the same level of effort.

edit: saying that D3 was a terrible choice is not saying that D3 itself is terrible. D3 is amazing. But before choosing it, you should know exactly what it is, and whether it helps in your specific use case.


I'm not sure how you'd implement scales, color theory, etc. in React (a rendering library). There are a lot of useful parts of d3 which have nothing to do with rendering to the DOM. In fact, React and D3 complement each other quite well (maybe that is what you were saying?).

D3 is slightly more verbose than the one-liners, but with that I get complete control over the scales and appearance of my chart. In addition, it's trivial to combine multiple charts in a single diagram.


Out of interest, how did the performance compare between the React+SVG and D3 implementations? (and what sort of data size / structure were you working with?).

> D3 is amazing. But before choosing it, you should know exactly what it is, and whether it helps in your specific use case.

I agree. D3 is really great, but not always the most appropriate solution.


That's the idea when you're using D3 with React: React handles the DOM updates, D3 handles data wrangling and math. You said you reimplemented your charts in SVG directly – did you also reimplement scales, chart axis generation, d3-collections, d3-time, etc?
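
For illustration, a minimal sketch of that split (assumes a React + JSX build; the component is hypothetical):

  import * as d3 from "d3";
  import React from "react";

  // Hypothetical component: D3 computes the scales and the path string,
  // React owns the DOM; no D3 selections anywhere.
  function LineChart({ data, width = 400, height = 300 }) {
    const x = d3.scaleLinear().domain([0, data.length - 1]).range([0, width]);
    const y = d3.scaleLinear().domain([0, d3.max(data)]).range([height, 0]);
    const path = d3.line().x((d, i) => x(i)).y(d => y(d))(data);

    return (
      <svg width={width} height={height}>
        <path d={path} fill="none" stroke="steelblue" />
      </svg>
    );
  }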


Each D3 project is an entire application of its own.

Adding the "simple pie chart" to your view is not simple at all. It needs independent UX planning, system architecture planning, and multiple sprints to implement.

All the example code and prebuilt charts are outdated due to using old versions of D3 or other dependencies, and essentially have to be rebuilt from scratch.


> All the example code and prebuilt charts are outdated due to using old versions of D3 or other dependencies, and essentially have to be rebuilt from scratch

As part of moving them over to Observable (which I'm not a fan of), the examples are being updated to use the v5 API.


I had the same experience. It started out great when we just wanted a simple line chart, but as we needed more and more features, it turned into creating tons of abstractions and writing a whole charting library...


> What are the things that D3 did right to get the acclaim, and why has it kept it for so long?

d3 (especially now that they've broken the bits and bobs into discrete packages) takes a very unixy philosophy: do one thing and do it well. d3 lets you iterate over your data and get useful items back (e.g. scales that you can use to create geometric shapes for a chart). It's intimidating for some but offers a ton of flexibility you don't see with other charting libraries.
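
To make the modularity concrete, a sketch using two of the real standalone packages (d3-scale and d3-array), with made-up data:

  import { scaleLinear } from "d3-scale";
  import { extent } from "d3-array";

  const values = [4, 8, 15, 16, 23, 42];
  const x = scaleLinear().domain(extent(values)).range([0, 960]);
  console.log(x(16)); // a data value mapped to a pixel position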

Mike Bostock also put a lot of effort into making sure that d3 lets you create beautiful charts.

Every other charting library I've seen takes a very different approach, often enabling you to create ugly, inflexible charts.


In addition to breaking it up into discrete packages, the devs are also OK with improving things even if it includes breaking changes.


In my mind, D3 is a DSL for ingesting structured data and outgesting DOM. While it has certainly been a few years since I touched it last, what I remember is that it basically didn't touch anything in the data or the DOM; it just made the piping between the two super easy. I think it was that approach - not introducing new idioms for existing things, but instead introducing the pipeline approach with a very simple syntax - that made it very amenable to learning quickly and rapidly iterating on an idea, tweaking it until you get just the thing you want.
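
A sketch of that piping in full (assumes D3 v5.8+ for .join and an existing <ul> on the page):

  import * as d3 from "d3";

  // Select targets, bind data, reconcile elements, pipe each datum into the DOM.
  d3.select("ul")
    .selectAll("li")
    .data(["a", "b", "c"])
    .join("li")
    .text(d => d);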


> What are the things that D3 did right to get the acclaim, and why has it kept it for so long?

As a dev, what D3 did right for me: it kept the code stable, provided clear sample code for most use cases, maintained precise documentation (even for older versions, as markdown files on GitHub), and has almost no external dependencies (so my local .js+.html files from 3 years ago still work without needing any broken npm install or jspm install). The internet (thanks, folks) also made lots of runnable examples available via bl.ocks and JSFiddle.

Moving from v3 to v4 was not as painful as I thought it would be.

Once I got over the core concepts of selections and enter/update/exit cycles, it has been a breeze and a pleasure to work with.

Thanks, Mike, and everyone else who's made reasonable and sensible choices for D3!


Vega calls itself a "visualisation grammar" and I think it is much more worthy of that title. I just wish it were efficient. Sadly, it represents every data point as a JavaScript object. In other words, it uses array-of-structs style rather than struct-of-arrays, which is an obviously bad choice in JavaScript if you want efficiency.
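
A sketch of the distinction in plain JS (made-up points):

  // Array-of-structs: one object per point (what Vega and most JS APIs use).
  const aos = [{ x: 1, y: 10 }, { x: 2, y: 20 }];

  // Struct-of-arrays: one contiguous typed array per column.
  const soa = {
    x: Float64Array.from([1, 2]),
    y: Float64Array.from([10, 20]),
  };

  // Column-wise work scans one flat buffer instead of chasing object pointers.
  let maxY = -Infinity;
  for (const v of soa.y) maxY = Math.max(maxY, v);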


Interesting what you say about struct-of-arrays; I have been following this topic closely recently. Do you know of any well-known JS code implemented using struct-of-arrays, or have some interesting resources about it? I am worried that a SoA codebase is difficult to maintain and hard to express in a simple API.

I have been reading two resources about it, https://ourmachinery.com/post/data-structures-part-1-bulk-da... and https://stackoverflow.com/questions/39799874/why-does-javasc....


Here is a notebook on the subject. I’d also check out Apache Arrow.

https://observablehq.com/@mbostock/manipulating-flat-arrays


Thank you for that. For the first time I finally understood what Apache Arrow is - until today, I didn't realize it's just a way to do SoA in different languages + a lot of buzzwords.


No, sadly most JavaScript APIs seem to use array-of-structs. It is definitely easier to write the code that way. I suspect most graphing libraries start out like that, and then by the time they try to plot a heat map with a million points it is too late to fix.

Traditionally, plotting tools like MATLAB and Matplotlib work on arrays of data.


It's in a different domain, but you might be interested in Unity's (the game engine's) data-oriented stack. Basically it serializes properties attached to an entity, and you then write systems that map against entities with certain combinations of properties attached. Seems like this could be amenable to JavaScript as well.


D3 does the Right Thing with respect to dataviz.

That means: it's conceptually overloaded for doing quick-and-dirty conventional charts, where you just want to plug in a few parameters and have it "just work" - but excellent once you need to customize and make it work with your specific requirements.

That it has the concepts, and a clear notion of them, is the critical difference. Most libraries, most of the time, don't add new concepts, they just have a premade black box of features and functions. Sometimes you want a premade black box, but often you want to open up the box shortly afterwards, and that creates the inevitable trend towards either remaking it as your own box, or being one of hundreds of people who gradually grow it into a monstrosity that does everything.

But a library that is concept-focused doesn't have to get that much bigger: it's just another kind of interface, like a programming language or an operating system, and that puts it on a more sustainable track.


I think of it as the assembly of data viz.


Personally I think its force simulation and geo/projection libraries are the things that have made it stand out. It seems to be the de facto choice for any mapping or simulation visualisations. Its DOM selection mechanics are neither here nor there in comparison.
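
For reference, a minimal d3-force sketch (made-up nodes; rendering omitted):

  import * as d3 from "d3";

  const nodes = [{ id: "a" }, { id: "b" }, { id: "c" }];
  const links = [{ source: "a", target: "b" }, { source: "b", target: "c" }];

  // The simulation mutates x/y on each node object every tick;
  // drawing those positions is, as usual with D3, up to you.
  d3.forceSimulation(nodes)
    .force("link", d3.forceLink(links).id(d => d.id))
    .force("charge", d3.forceManyBody())
    .force("center", d3.forceCenter(200, 150))
    .on("tick", () => { /* redraw using node.x, node.y */ });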


Having done a lot of geospatial web projects, I would, hands down, use OpenLayers as my go-to over D3. Not that there is anything wrong with D3 - it is a great library - but man, it's a pain to maintain and extend a D3 mapping solution.

I just ported a D3 solution to OpenLayers for a client and they were over the moon with the speed at which they could implement new features. Mobile touch was the killer for them and made them finally bite the bullet and port; getting a D3 mapping solution to behave correctly on both desktop and mobile requires a lot of wiring.


"D3 seems to hold its position very well"

Just look how long Processing held its position. People are still using it with p5.js on the web, but I think it compares poorly to D3.


Declarative data binding?


Used D3 for a grad data viz course project a year or two ago. Extremely powerful, but unless you are doing custom visualizations or are excellent at JavaScript and visualization, it's a bit overkill. It's much easier to do monthly reporting or even one-off stuff in Tableau, Power BI, or the like. Tableau or Power BI I could pass to other analysts, and without experience they could figure it out. If I sent them my d3 code they would cry.

Again, if doing advanced visualizations for a large newspaper or a commercial presentation, it's extremely customizable and makes some really beautiful charts.


Using a non-proprietary platform also has the distinct advantage of supporting whatever APIs you want to use and not having to deal with extracts of data. Having to wait for one of these platforms to support JSON from a web URL is really quite silly when you could write the query quickly (in Python or whatever) and pass the values to your view.

Or, you know, use PyGal or another server-side charting library: http://www.pygal.org/en/stable/


> Or, you know, use PyGal or another server-side charting library: http://www.pygal.org/en/stable/

Is there much of a market for server-side rendering? I've been putzing around with a d3-inspired charting library in Rust, mostly as a brain teaser.


I very much prefer to generate SVG charts on the server. Not only is this more sensible when you have a huge number of data points, but it even works with JS turned off (unless you need interactivity).
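
One way to do it in Node (a sketch assuming the d3 and jsdom packages; D3 just needs a document to select against):

  import * as d3 from "d3";
  import { JSDOM } from "jsdom";

  const dom = new JSDOM("<body></body>");
  const body = d3.select(dom.window.document.body);

  const svg = body.append("svg").attr("width", 400).attr("height", 300);
  const x = d3.scaleLinear().domain([0, 9]).range([0, 400]);

  svg.selectAll("circle")
    .data(d3.range(10))
    .join("circle")
    .attr("cx", d => x(d))
    .attr("cy", 150)
    .attr("r", 4);

  console.log(body.html()); // finished SVG markup, no client-side JS needed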


Not disagreeing – I do this often too – but the downside can be having a very large dataset which in turn generates a very large response, which might be more efficiently sent as data and constructed client-side. Less importantly, Google's Lighthouse tests for a certain number of DOM elements, which a complex chart can easily exceed.


Yeah my server-side use cases are for times when I'd want to use a graph outside of a browser (e.g. PDF reports, printing, email distribution). If you're targeting a browser IMO it only makes sense to move away from the browser and/or JS if you're trying to create a static, raster image.


Completely agree with this statement. Wish it were implemented similarly to Tableau or something for more universal appeal - or, alternatively, that it had enough easy default applications, with the ability to go much more custom if you needed/wanted to.


Aren't Tableau and PowerBI both expensive for anything other than the most trivial visualization scenarios?


Anecdotally, I've worked with a lot of companies who are already on the Microsoft stack and either get heavy discounts for Power BI or already have it included in their Enterprise licenses. (There's also another class of companies who refuse to look at any competing product if Microsoft have an offering...)

My recommendation to the technical decision makers in those companies who are either confined to Power BI or just feel a lot safer working in that environment (instead of investing heavily in development time to recreate the same thing) is to embrace the Power BI service out of the box as much as possible, and then if they really need to, create custom D3 Visuals for their reports (yes, this is possible with Power BI, I was quite delighted when I discovered this): https://docs.microsoft.com/en-us/power-bi/developer/visuals/....


There are tons of different d3 modules (40+). As someone who uses d3 extensively, but rarely uses its selection and data binding functionality, I put together a birds-eye view of the different modules. There is tons of great functionality that is usually skipped over in favor of the DOM-manipulation methods, like managing colors, dates, data munging, etc.

https://wattenberger.com/blog/d3
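
A taste of that non-DOM functionality (a sketch with made-up data; d3.group needs d3-array v2+ or the D3 v6 bundle):

  import * as d3 from "d3";

  const rows = [
    { city: "Oslo", date: "2020-01-01", value: 3 },
    { city: "Oslo", date: "2020-01-02", value: 5 },
    { city: "Rome", date: "2020-01-01", value: 9 },
  ];

  const parse = d3.timeParse("%Y-%m-%d");      // dates (d3-time-format)
  const byCity = d3.group(rows, d => d.city);  // data munging (d3-array)
  const mid = d3.interpolateViridis(0.5);      // colors (d3-scale-chromatic)
  console.log(byCity.get("Oslo").length, parse(rows[0].date), mid);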


That is an excellent set of articles for learning d3!

I love how it gives an overview of all the d3 modules, then explains them in groups by related functionality. I just started exploring, and will study it over time.

Thank you for sharing your knowledge, the articles are really well-done and high quality. I'm guessing the visualization of d3 modules is done in d3 itself. Beautiful in concept and presentation.


Incredible resource


I switched from D3 to echarts[0] and never looked back. Still powerful and customizable but comes with a much easier API to reason about.

[0] https://github.com/apache/incubator-echarts


Same here. I still haven’t found anything I couldn’t do with echarts. D3 is always highly upvoted on HN, though it’s overkill in 99% of use cases in my opinion.


I really enjoyed d3 for the most impressive data viz products. However, I always found the overhead of getting d3 going and piping data to the right places to be burdensome for most applications. My sense was that you would use d3 as a final polishing step for a data viz project/product, or if you wanted to make a data product with complicated requirements/display needs. I would be curious to hear if anyone uses it for data exploration, or if the overhead has been lowered over the last couple of years (it's been 3 years since I've used it).


Link to the actual course that mentions all the details: http://vis.csail.mit.edu/classes/6.894/


Maybe I'm in the minority, but whenever I worked with D3 to do anything beyond the simple bar charts this shows, I ended up having to manipulate the SVG DOM manually, which quite frankly sucks.


This is my experience too. Once you start wanting to do anything that is not just out of the box, you need to know the ins and outs of SVG, at which point you may as well just make your own SVGs in JS. Which is what I've ended up doing.


While I'm sure the authors poured in a lot of effort, I found this tutorial difficult to follow despite the neat "notebook"-style webpage.

This one was remarkably more intuitive and clear, and I breezed through it:

https://alignedleft.com/tutorials/d3


As much as I try to like ObservableHQ, this "updates happen above the code, sometimes a bit far away, so if you scroll too much, you don't see them" behavior is one thing that makes it less intuitive than the Jupyter Notebook style.

It took me some time to realize that running cells actually changes something in the country list.


It's a feature, not a bug. ObservableHQ notebooks are reactive - there's a DAG underneath. It's not a Jupyter-style execution log.


> there's a DAG underneath

Would you or someone care to elaborate? Do we mean the 'linking' between 'nodes' (cells) as an interpreter/compiler would do over a file/object structure?

Thus I assume, opening a world of options e.g. for vectorizing performance, type checking and all?

If so, we have in one such notebook a true slice of "visual" IDE the kind of which Microsoft could only ever dream about! (so far) ;-)

[Side-related note: I'm amazed at the emergence of the notebook paradigm over the last 10 years, accelerating for at least 2-4 now. See how they do it at Netflix. There's a case to be made that the notebook paradigm could really bridge wide open the "programming rift" between nerds and, well, everybody else, at least in skill-driven professional contexts. There's a short way from here to a slew of clever domain-driven script languages plugging straight into BI.]


> Would you or someone care to elaborate? Do we mean the 'linking' between 'nodes' (cells) as an interpreter/compiler would do over a file/object structure?

A DAG is a Directed Acyclic Graph. The most familiar example would be dependency graphs between packages. Or the dependency graph of your code (as an interpreter/compiler would look at it). Or any dependency graph in general.

So the way reactive programming works - whether in React, ObservableHQ, or Excel - is this: you have computation units (cells, pure functions) which have dependencies and are themselves depended upon. This forms your calculation graph, which you evaluate by starting at the nodes without dependencies and computing one node after another in topological order[0].

The main optimization this permits is reducing the number of calculations: since dependencies are accounted for and navigable, whenever a node X changes, only nodes that depend on it need to be recomputed (and their dependants, recursively).

"vectorizing performance, type checking and all" are not related to this concept. Reactive programming deals just with the dependency graph and (re)computing the right amount of nodes in the right order. Contrast that with a typical REPL model (or Jupyter model), where you execute cells one after another in the order you wrote them, and they mutate the global state of the application.

RE your side note: yes, the notebook thing is a curious phenomenon, especially in a worse-is-better way (why did it have to be first Python, and now JavaScript?!). It's much older than that, though - you could trace its origin through things like Mathcad (essentially a buggy Jupyter requiring lots of clicking, but one which produced convincing-looking math papers and could do proper symbolic calculations out of the box), back to the early Lisp era (you don't have to type things into a Lisp REPL; if you type them in a file and annotate with comments as you go, you get a half-baked plaintext Jupyter).

--

[0] - https://en.wikipedia.org/wiki/Topological_sorting - i.e. you turn a graph into a sequence sorted so that the dependencies come before the things that depend on them.


Ah, I see now, thank you very much for the detailed explanation.

Having used Excel for years as a barebones "logical framework" of sorts (before I knew better, in my teens, to solve various optimization problems in games notably like "best in slot" or "best resource distribution"), I've internalized a deep intuition for reactive programming. I had never realized this was an actual paradigm!

On optimization / O(n): models tend to (d)evolve into highly recursive 'traps' with this approach, in my experience. I learned the value of e.g. indexes, isolating concerns, and generally larger but flatter surfaces.

RE notebooks: I had no idea there was such a history of that. It's interesting that the approach only became somewhat popular recently.


You're welcome!

Fun thing I recently discovered about Excel: there's a button in it, Formulas tab > Formula Auditing > Trace Dependents (and its counterpart, Trace Precedents), which makes Excel start drawing arrows between cells, letting you explore the underlying calculation DAG.

Could you tell me more about those 'recursive' traps?

RE notebooks, personally I blame it on a combination of a) Python taking the scientist community by storm (perhaps thanks to SciPy), where the prior popular scientific toolkits were proprietary, b) the popularization of lightweight markup languages (like Markdown), and c) the popularization of the browser as a runtime.

There is a history of scientists using org-mode for computational notebooks and publishing purposes, ticking both a) (a powerful, open toolkit supporting not only Python but just about anything) and b) (a very good markup language, org-mode), but this ties potential collaborators to Emacs, so it had no chance to popularize. I don't know the relative timeline of org-mode code evaluation vs. IPython/Jupyter, so I can't say whether this qualifies as prior art.


> Could you tell me more about those 'recursive' traps?

Well, these feel like 'traps' insofar as you suddenly fall into a crawl where things were fine just a step before. It's really a hands-on engineering kind of situation. You can't feel it much with small datasets. So I'm sure you know the kind.

I really tried to write something worth reading but I'm afraid, after about a page at it, these are just the ramblings of a young mind before learning to program, etc.

Here's the gist:

- I discovered circular dependencies, breaking the DAG.

- Off-by-one errors on base cases.

- O(n^x) without realizing it, which hurts later.

It's just that now I know much more expensive words and concepts to describe or solve these 'traps'. ;-)

RE notebooks, I think Python is the language of choice of science and data for various reasons, which made it a no-brainer for IPython/Jupyter, whose primary purpose was clearly datavis afaict. You can plug in community kernels[1] for just about any language, though I'm not sure how much it integrates with tooling (I know Julia is popular for math in Jupyter).

Despite having a terminal opened 24/7, I never actually tried Emacs and org mode and I feel I missed a whole space in that regard...

Notebooks in their current form are certainly popular, but I hear too many good features in other paradigms that leave room for improvement (or yet another strong solution).

[1]: https://github.com/jupyter/jupyter/wiki/Jupyter-kernels


Thanks for elaborating.

Yes, the reactive paradigm definitely includes extra challenges - the DAG that's actually being executed is usually implicit for the person reading the code, so as it grows large, it may cause surprises and generally be hard to follow. If you've ever worked with C/C++, you've seen this in action as the recompilation problem - you change one innocuous header file, and suddenly half of your project needs to be rebuilt (the #include directives in your project files are what form the dependency DAG).

I wouldn't worry too much about circular dependencies. Reactive systems usually need to know the dependencies of each component to build a DAG for execution (whether you explicitly declare them or they get read from your code), so at this point cycles can be detected. You have to be clever to cause an infinite loop here. There are ways around the apparent occasional need for circular dependencies (this is the same problem as circular dependencies in software architecture in general, and same solutions apply).

(Though to be honest, I wish for a computation system that would work with cyclic graphs. Some "circular dependencies" are feedback loops, and I don't see a reason why a scientific-computation-oriented system couldn't try to compute a fixpoint, or let you view the looped execution over time.)

> O(n^x) without realizing, which hurts later.

O(n^x)? Not sure. A bunch of reactive "cells" in a DAG is no different than calling each of them one after another in the right order; if you get a sudden >= O(n^2) out of this, it just means some of your cells are doing dumb things. Note that cells don't get re-executed just because you referred to them a couple of times. If you have:

  Cell 1: x = strlen(someString); //O(n)
  Cell 2: for(i = 0 ; i < N ; ++i) { doSomethingConstantTimeWith(x); } // O(n)
You don't reexecute Cell 1 multiple times, so the overall complexity is O(n), not O(n^2).

> I missed a whole space in that regard...

You missed a bit, alright, but I'm not sure if I should recommend you go and investigate, given that it can be a time suck (albeit a very rewarding one) :). But if you're willing to risk it, be sure to read some propaganda material on how Org Mode is the best thing since sliced bread (it is), and if you've never seen Lisp before, be sure to check it out eventually.


Having a mental picture of the DAG of any execution is the sort of spatial intuition that we're generally good at; I agree, it's the "implicit" graph we tend to build by association. I've very little experience with C/C++ (intro level at best), but from the Go angle I can see how managing that dependency graph is required to avoid huge compilation times.

> I wish for a computation system that would work with cyclic graphs.

It is baffling to me that we don't have such a paradigm available. I don't know much about academic CS, but I'm fairly sure there's one among a gazillion formal languages that describes circular spaces.

Intuitively, I'd think it would have interesting applications for the programming (modeling, computation, reasoning) of oscillatory phenomena, notably.

I totally agree with you in 'practice', though I've tested literally none of it, even on paper. The base paradigm, in a best-effort thinking-aloud, is that any statement execution is a loop in itself, which fundamentally gives objects a 'thickness' in time, a time dimension; thus some φ or θ property (angular-whatever you want to measure, some periodicity expressed in a common clock).

Based on this, circularity is not a problem but a feature, and this would define some Fourier transform of a "program", a system of elementary executions — its periodicity in time, how "big" the loop is.

I don't know, it's really interesting to think about such a paradigm of representation, of programming 'models' and 'problems', behaviors.

About O(n^x), I guess I was trying to be as general as possible. Indeed, that was exactly "dumb things"! Retrospectively, I'd argue it's possibly by going everywhere, including into the dumb, that you really get a "feel" for a particular problem/solution space. Like flawed DAGs ;-)

When you naively translate ideas into computations (like a recipe to game something optimally), it may end up looking more like:

    # a bunch of discrete values, 
    # may be n-dim with indexes, table lookups...
    Column 1: x = [1, 2, 3,..., xn] 
    Column 2: y = [10, 20, 30,..., yn]
    Column 3: z = [(x+y), 2*(x+y), 3*(x+y),..., zn]

    # programming horror
    Columns 4, 5, 6...: 
    for i in x: 
      for j in y: 
        for k in z: 
          {inefficientImplementationOf f(i,j,k)}
          # 200-char highly redundant Excel formula

    # games have "levels" (for all objects potentially)
    # levels change rules: recursion down to L1 to compute
    Sheet 2: # "level 2", new indexes x, y, z, w...
               # calls L1 every single cell
In effect that last block (new sheets) creates new 'real' dimensions (with weird metrics) over the first 2D arrangement (sheet 1). Just a very not smooth surface, actually not even fully coherent in many cases (lots of exceptions).

And then you don't optimize, because you'd rather copy numbers (monkey brain that can't make educated guesses) than find the actual functions (which must be stupidly simple, because games can't perform complex computations, but are admittedly made to be hard to retro-engineer). Basically, Excel as a numerical emulator for some game space, some system to be gamified (empirical optimization based on axiomatic rules).

I sure have fond memories of trying to crack these problems. High success rate (like physics, it's the real world, so you approximate all that needs to be). I was one of those guys making turn-key "calculators", e.g. for items or progression in games like WoW, tools to solve complexity. The most interesting were social tools, e.g. for players to "fairly" distribute some resource (positive 'reward' or negative 'work') based on some KPI — how ethics and values translate into numerical models is quite the challenging but satisfying problem, I find.

About Lisp, I assume you mean the programming language? That's indeed probably #1 on my list of "different things" to try. I've read about people who literally grew up in Lisp, in the 1980s IIRC, and how that changed their perspective, actually much beyond mere programming. I've probably read the wiki page and a few articles over the years. But right now I've committed to doing a Lisp trip (training + a small personal project) this year — yours was the straw that broke the procrastination's back.

(To be honest, I have a weird history with programming: I started before age 10 with BASIC, but I'm only now taking it up professionally (career change), some 30 years later. Go figure. Life.)

Thank you for elaborating and all the good advice / perspective.


It's not just a Jupyter Notebook But In JS, though. Dynamically interacting with the DOM is perhaps the core feature of Observable, while Jupyter/IPython very much embraces the strict imperative style (ipywidgets etc. being more a nice hack than anything).

While I do think there could be UX improvements, I don't think this is a flaw in Observable.


I think that in JS there is much more potential for interaction (as the frontend IS JavaScript; for Python and other languages, it is that language + JavaScript). Yes, ipywidgets are hacks (as are many other interactive things).

So I am surprised that there is no standalone REPL in that style. The closest things I've found are ObservableHQ and RunKit (the latter is much more Jupyter-like, but still, AFAIK, runs code on the backend).


Always viewed this as a good feature, not a bug


When we launched https://VisualSitemaps.com, we decided to use D3 since it rendered the site-mapping dataviz[1] really well and even allowed for real-time manipulation (try dragging and dropping the nodes in the demo below).

However, it does suffer performance degradation once you go beyond 3000 nodes of data. So we are now in the process of rebuilding our mapper in Canvas+WebGL via Pixi.js.

[1] https://app.visualsitemaps.com/share/7b4fd8556b102ed739cc308...


Cool library! I like it when libraries don't have too many configuration options set by default.

e.g. https://github.com/danielgindi/Charts has so many default options set for a graph that 80% of the code is doing something like

  chart.option.isEnabled = false


Any other libraries that are as low-level as d3.js? Is D3 still being used heavily in production, in people's experience?


D3 is still a useful library for visualization even if you replace the DOM manipulation with React, Vue, Svelte, etc. I think the trend is to use D3 as a library but use a more modern framework to manage the DOM.

I have a browser extension (treeverse.app) where the entire UI is D3. I probably wouldn't write it that way today, but the codebase dates back to ~2014 and it's held up fine.


I also mostly use d3 for its powerful helper functions nowadays: scales and axes, colors, svg path generation, geographic projections, array functions like d3.group (formerly d3.nest), csv/tsv parsing etc.
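
For example, csv parsing plus a projection, with no DOM involved (a sketch; data made up):

  import * as d3 from "d3";

  const table = d3.csvParse(
    "city,lon,lat\nOslo,10.75,59.91\nRome,12.50,41.90",
    d => ({ city: d.city, coords: [+d.lon, +d.lat] })  // d3-dsv row converter
  );

  const projection = d3.geoMercator().scale(100).translate([480, 250]);  // d3-geo
  for (const row of table) {
    console.log(row.city, projection(row.coords)); // [x, y] in pixels
  }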

I tend to leave DOM manipulation to Vue.js unless I need to animate transitions between states. I find it easier to reason about binding data to an html/svg template with Vue than to mess with d3 data joins and enter/update/exit pattern.

d3 is and stays an impressive piece of work.


Do you find there is any performance difference using Vue.js to do DOM manipulation rather than the D3 methods?


I haven't run benchmarks but I also haven't noticed a difference in performance.

With Vue.js, I use computed properties + v-for in html template.

With D3, I use the enter/update/exit pattern with d3.selectAll(...).data(...).join(...).

Both seem to only modify the needed DOM elements. Vue.js uses a virtual DOM while D3 attaches a __data__ property to DOM elements.


Thanks, that's helpful to know about how you implemented it in Vue vs D3, and that there wasn't a noticeable performance difference for your use case.


I used to use D3.js as the default go-to for data. Now I start with Vue.js, and use D3.js only inside some components when I need animated transitions between states.

D3.js, as someone else already noted, is "jQuery for data". And yes, it has its good parts, but (the same as for jQuery) the problem is with:

- mixing static HTML/SVG with generated markup

- no modularization (every modularization is custom)

For the latter, while it may not be a big deal for small projects, for bigger ones it sucks.

For a big story: I created a game in D3.js. Modularization was... well, the reason why I rewrote it a few years later in Vue.js.


Am I right in thinking that you are using Vue.js to create/delete/update the DOM elements? If so, how does performance typically compare to using the usual D3 enter()/exit() etc methods for that?


I used to do performance-sensitive things in D3.js (e.g. many moving dots).

For projects with Vue.js, the performance was not my concern.

In any case, if needed, you can use D3.js rendering within a Vue component.


Thanks, that makes sense. I'm currently doing D3.js rendering within a Vue component. I suspect that my code will be cleaner if I switch to the Vue.js constructs for managing the DOM, rather than D3's enter()/exit()/update(), so even if the performance is ultimately the same, there should still be a maintainability win.

It sounds like it's worth me doing a Vue implementation and running some benchmarks.


Used in production? Yes.

Alternatives at a low level? Hard to say - it really does allow you to do a lot at a very low level if you want BUT...

- Vega Lite / Vega

- HighCharts (paid)

- ChartJS

- Raphael (unsure if this is still used as much)

- Leaflet / Turf for GIS visualizations

- Server-side? PyGal? http://www.pygal.org/en/stable/

If you don't want to go so low-level there are a huge number of D3 abstractions that allow you to pick your chart and work with the data. Britecharts from EventBrite is one example of an actively maintained abstraction. http://eventbrite.github.io/britecharts/


We end up using some of the d3 helpers, like axes & scales, but go straight to chart libraries + custom DOM/SVG/WebGL for the rest, and a normal React framework for interactivity. The d3 functional helpers are still real gems and IMO under-appreciated. I think it was rewritten a few years ago to help expose those better (e.g., for tree shaking).

As soon as you go to React & friends, most of the d3 model does become redundant, and Yet Another Thing to burden the team with. There's still a temptation to reuse beautiful d3 gallery items posted by others, but generally they're annoying to patch up, and then you must maintain them. Better to use dedicated, maintained libs if that's what you want. There have been attempts to do standard D3 chart libs, but afaict it's a graveyard of abandonware, and I approach most efforts here with strong caution. (The trend for the last few years has been for each bigco to release its own NIH framework and then abandon it as the authors leave, and there's little incentive for D3 design consultants to maintain old projects.)


What "straight chart libraries" do you use? I'd be interested in digging into that further.

I agree that React/Redux state can take the place of a lot of the D3 models. That said, it's nice that D3 handles entries/updates and changes, so you can point it at Redux and it "just works", with all the transitions etc. handled as your app state changes.


At this point... I think we only have Susie Lu (Netflix)'s annotator, but there's a good chance of it being removed. For example, we had to do our own timeline to get decently fast zoom rebinning on modern screen widths, so we abandoned a JS widget there, and histograms ended up being so custom that we'd get little from a library.

The shortlist for GIS is deck.gl. Graphistry gets a lot of requests here, so we helped the deck.gl folks on some Arrow/GPU stuff and had promising experiments as part of it, but (if I remember right) we hit memory issues and something else. We'll be evaluating that for GIS next time it makes sense.

+1 to others on HighCharts team!


Thanks - that makes total sense for a company that does this for a living, especially on GL for more scalable rendering. I'll also be checking out your product.


Vega is low level, but built on top of D3.

https://vega.github.io/vega/about/vega-and-d3/

Vega-Lite is what I use as my day-to-day web graphing driver (it exports to SVG, which is awesome...)


ECharts has been very nice, it's free and open source https://echarts.apache.org/en/index.html


A bonus is that it uses almost identical APIs to HighCharts. We switched from HighCharts to ECharts at my work and we've been happy with it.


Thanks for this. Never seen HighCharts. Looks nice (pending $$$)


We use Highcharts at work, and have for years. It's a really good library, definitely worth the money. We had a recent adventure with building new charts in Victory, but performance was not good and we came back to Highcharts and removed all our Victory code.

We're still happy with that decision :)


HighCharts is very nice if you don't want to invest the time into honing your d3 skills.


I started using it for a project I was working on at my last job, a year or two ago. I got a few hits from recruiters interested in it once I added it to my resume.

I liked it quite a bit. It's pretty fast and decently easy to use. Plus, all the examples Mike Bostock has thrown around online really helped for giving me examples of how to do various visualizations I was interested in.


How low-level? I make GoJS, a powerful diagramming library (the focus is more on diagrams and interactivity than data). It's low-level in that you can make a lot of different things with it, but higher-level in that it has things like an undo manager.

https://gojs.net/latest/index.html


Vega-Lite comes to mind.


If you get rid of the `width = 940` line, the charts become responsive; `width` is a preset ObservableHQ variable.

There's more work to do, but it's a start.



