The `gvpr` tool in graphviz deserves more recognition.
Graphviz, at its core, is a simple graph description language (DOT), which allows progressive enhancement, and a bunch of auto-layout tools where some double as renderers.
The double-duty of the layout/renderers masks GraphViz's true power, which is that you can create a pipeline modifying your graph source (confusingly referred to as '.dot' files), progressively adding attributes.
This is where gvpr comes in. While most are familiar with GV's all-in-one layout+rendering tools like `dot` or `neato`, gvpr is a generic Awk-like programming tool that iterates over nodes and edges. With it, you can implement any layout algorithm you like (or not!), to specify part or all of the attributes of your graph (like positioning and style), to then be rendered using `dot` to the format of your choice.
When we started Graphviz, we vaguely sensed the potential for graph databases or environments, like neo4j, but we didn't work much in that area. It would take more resources, or intensity, than what we had. Anyway, gvpr is a modest attempt at a sort of graph query language.
Neo4J is interesting. It does something important, namely rationalizing the mess of graph representations and algorithms. It's not hard to understand why that is valuable, but Neo4J sells it as being "efficient for linked structures" (as if relational databases don't have foreign keys and indices) and "intuitive" (have you ever tried to read the code for graph matching in Cymbal? I rest my case.) They have investments of hundreds of millions? Worth billions? They contributed something valuable to the tech economy.
It's surprising the conventional relational database products don't pay much attention to this opportunity, but they have gone vertical, e.g. credit card processing, because in the end it's more profitable to solve big end customer problems.
I'm actually disappointed that `gvpr` isn't mentioned at all on GraphViz's main page, because I think it's its most powerful tool. All I could find are manpages, and not even on GraphViz's own website!
I don't know of any example, but I immediately thought of a use case: to use it as a sort of CSS language where you take a graph with only 'semantic' custom attributes and use the GVPR language to transform these into shape and style attributes that are recognized by Graphviz, so that you don't have to repeat all the style information for each node and edge. That's already pretty exciting to me!
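To make the "CSS for graphs" idea concrete, here is a minimal sketch in Python rather than gvpr. The semantic `kind` values, the `STYLESHEET` table, and the `render_dot` function are all hypothetical names made up for illustration; only the emitted attributes (`shape`, `style`, `fillcolor`) are standard Graphviz node attributes.

```python
# Hypothetical "stylesheet": maps a semantic kind to Graphviz node attributes.
STYLESHEET = {
    "service": {"shape": "box", "style": "filled", "fillcolor": "lightblue"},
    "database": {"shape": "cylinder", "style": "filled", "fillcolor": "lightyellow"},
}


def render_dot(nodes, edges):
    """Return DOT text with each node's semantic kind expanded into style attributes.

    `nodes` maps node name -> kind; `edges` is a list of (src, dst) pairs.
    """
    lines = ["digraph G {"]
    for name, kind in nodes.items():
        attrs = STYLESHEET.get(kind, {})  # unknown kinds get no styling
        attr_text = ", ".join(f'{key}="{value}"' for key, value in attrs.items())
        lines.append(f'  "{name}" [{attr_text}];' if attr_text else f'  "{name}";')
    for src, dst in edges:
        lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_dot({"api": "service", "users": "database"}, [("api", "users")]))
```

The style information lives in one table, so changing how every "database" node looks is a one-line edit instead of a search-and-replace over the graph source.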
I've used GraphViz a number of times and highly recommend it as a standard tool on your belt. Having a stand-alone executable that can export to SVG is great.
The most complex thing I've done with it [1]: a tool (MIT-license) that builds diagrams of the data and addressing pipeline for a DSP processor, and lets one 'scrub through' the assembler code frame by frame and see the values propagate through the blocks.
Also PlantUML [2] uses it for most diagrams.
Getting layout and positioning the way you want can be tricky but is usually achievable with patience and hidden objects.
My graphviz-fu got so much better when I started using invisible objects for the grid-like layouts. At that point you're just using graphviz for positioning/spacing/styling instead of creating the overall topology.
You're handing graphviz a list of relationships. Sometimes you know the ideal way to visualize this is a tree, or something with right angles, but graphviz isn't laying it out that neatly. So you create extra nodes, and edges to those nodes, so that graphviz considers it a "balanced" tree or grid. Then you use styling to turn those nodes invisible (showing just the nodes/edges you want). Once I have this correct I usually contain it within a "cluster", and then link that cluster to the larger diagram.
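A rough sketch of the invisible-node trick, written as a Python script that emits DOT. The `grid_dot` function and the `pad_r_c` placeholder names are hypothetical; `style="invis"`, `rank=same`, and `cluster_`-prefixed subgraphs are ordinary Graphviz features. Whether `dot` turns this into a tidy grid depends on the rest of the graph, so treat it as a starting point for tinkering.

```python
def grid_dot(rows):
    """Emit DOT that nudges `rows` (lists of node names) into a grid.

    A None entry becomes an invisible placeholder node, so every row has the
    same number of cells and the layout sees a "balanced" structure.
    """
    lines = ["digraph G {", "  subgraph cluster_grid {", '    color="white";']
    names = []
    for r, row in enumerate(rows):
        row_names = []
        for c, cell in enumerate(row):
            if cell is None:
                name = f"pad_{r}_{c}"
                lines.append(f'    "{name}" [style="invis"];')
            else:
                name = cell
                lines.append(f'    "{name}";')
            row_names.append(name)
        names.append(row_names)
        # Ask dot to keep the whole row on one rank.
        members = "; ".join(f'"{n}"' for n in row_names)
        lines.append(f"    {{ rank=same; {members}; }}")
    # Invisible edges between vertically adjacent cells hold the columns together.
    for upper, lower in zip(names, names[1:]):
        for a, b in zip(upper, lower):
            lines.append(f'    "{a}" -> "{b}" [style="invis"];')
    lines.append("  }")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(grid_dot([["a", "b"], ["c", None]]))
```

Visible edges between the real nodes can then be added on top; the invisible scaffolding only influences placement.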
If you want a set of nodes to stay close together in the graph, put them in a box with white lines and they will move as one from then on. This can be done with a cluster subgraph.
My single biggest issue with Graphviz is that the quality of arrow end-points is very poor. They don't terminate cleanly at the destination. This is visible in the 2nd link you've posted.
I'm amazed that this hasn't been addressed. It's been there for years.
It makes the end products impractical for distribution in a professional setting. It's OK if you're the only consumer ... not great for other cases.
I remember this issue being raised, I think on the graphviz Discourse, and the graphviz developers responding to it as if it was a hard to solve limitation of graphviz for some reason. But I can't for the life of me remember the details.
"Very poor" seems harsh, but, yes, it hurts us, too. (That, and text with slightly-off baselines.)
The relevant code starts around https://gitlab.com/graphviz/graphviz/-/blob/main/lib/common/... If this isn't right, maybe somebody can figure out why. Perhaps the loop stops sometimes on the wrong side of the boundary? Anyway, it's equally possible that when the endpoint coord is handed off to a lower level driver, the arrowhead mitering is wrong. There is no question there was once upon a time explicit code to try to cope with this problem, at least in the native PSgen, but I can't find it now. Maybe it wore away as waves of open source development washed over it, along with static tables for a bunch of "standard" PS fonts.
I think I found at some point that the arrows could be made correct -- at least for the SVG rendering in Firefox -- by removing the stroke (stroke-width="0").
Ostensibly each arrow style graphic would need an explicitly-defined "attach point" at whatever is deemed the "tip", so that the tip can be oriented perfectly in relation to the vertex and edge of the GraphViz graph.
If nothing like that exists, I can see how it would be a very troublesome refactor.
I never had this problem. Arrow style is configurable if I remember correctly. You can configure (extend) also overlapping in boxes so the arrows never enter into the box area.
I'm going from memory and could be wrong in any case.
I once created a dataset of X-hooked-up-with-Y for my circle of friends. About 50 nodes and a few hundred connections (after I asked around a bit for additions).
It became a wedding present for one rather central "node" with the title "drosophila neural regulation network" or something like it. No names, just lines and circles. It's still the centerpiece in his home office.
1) is there an option to make the graph horizontal instead of vertical?
2) can you make the svg option separate, instead of hidden behind the PNG button (with CTRL key)?
3) how did you do the handwriting style?
That said, compared to the visual clarity of most of the visualizations you share, the graphs that come out of Graphviz are pretty hard to read. If you zoom in and pan around you can easily get disoriented in some patch of nothing but spaghetti lines. This is definitely an area where some mbostock magic would be of huge benefit.
Graphviz is an amazing tool, it has just suffered from being cloistered to technologists whose jobs aren't to solve the kinds of problems it is specifically adapted for. I use graphs in security and privacy quite a bit, and even built the tech for a hopeful security platform using a graph back end. They yield the fastest path through complex problems, and I use them to do in a couple of hours what typically would take client staff months.
They're really compelling but oddly unpopular.
Imo, they're useful for one-off pattern discovery, and they're most valuable for finding single or a few exceptions and outliers in normalized data, and you need to be in an environment where there are asymmetric returns on finding those. Surveillance seems to be their default use case, with open ended scientific research a close second. This comes up with graph based recommendation engines, which are essentially a surveillance/marketing product based on preferences. Most businesses aren't based on discovery of anything other than customers for a transaction they already have.
These aren't problems engineers typically solve, which are more about scaling and optimizing, they're more marketing and sales problems, where you're looking for exceptions and opportunities. (security and privacy are the complementary antithesis of these) A graph based product (imo) is ideal for product marketing analysts optimizing for customer preferences and discovery.
From a product perspective, graphs are analogous to ML, where you'd use a clustering algorithm on loosely structured data to yield categories, comparisons, and implied relationships, whereas a graph yields the same thing over the structured normalized data you can feed into it once you have imagined an ontology to fit it.
That sounds really interesting! Can you give a few more details how you use Graphviz? How does it give such a great advantage "to do in a couple of hours what typically would take client staff months."
Did I understand it correctly that you use it to discover patterns? Are these patterns discovered by just using the layout engines? Aren't other tools, e.g. networkx in Python or cytoscape in javascript, easier to use interactively in a REPL? What is the typical workflow (maybe plot, find interesting pattern, change query/data in a loop)?
I'm really interested in how Graphviz can be so great. I am currently working with the other mentioned tools for visualization purposes.
Really simply, I typically use Neo4j, but if I have flat homogeneous data, I just use Graphviz because the dot markup format lends itself to parsing easily in awk command lines. The times I have used networkx were when I needed a graph abstraction layer to reason about another graph query, so networkx wasn't used as a persistent graph store, but more as an intermediate data structure for orchestrating multiple service and api calls, like a low rent graphql. I'm a crap developer, but the graphs were what I needed to piece the logic together coherently.
One example of clients taking months is mapping counterparties to agreements. Let's say you have inherited a division that has a file share full of contracts and you want to understand the line of business. You get the counterparties out of the contracts and find all the paths for obligations between entities within the division and their counterparties. The graphviz/dot layout gives you a map of all those parties in a single slide and shows clusters. The alternatives are a 3 lb. document with a paragraph for each of them, which would have cost a massive amount of consulting time, or interviewing several people to get their narrative understanding of how the business worked. The graph provides an objective map instead.
You could just use D3js, but for me the dot markup was faster on the command line than structuring json.
The idea is if you can formulate a conceptual, narrative ontology of an organization, you can create a grammar of things and relationships, and then you can plug data (contract counterparties) into that model and form a fairly complete map.
Another recent use case was enterprise vulnerability scan data over a very full /16 address block, allocated across multiple divisions under different management hierarchies with thousands of hosts. By linking the host ownership data to projects and an org chart with the types of vulnerabilities, I could demonstrate in a couple of slides what the highest impact patching strategy would be. Again, graphviz for sketching up the ontology, then Neo to do the lifting.
On a much simpler scale that was more graphviz/dot oriented, I did some work for a startup where I worked with the executive team who had acquired a codebase and talent, and created an ontology of their pipeline customers, their stated needs, implied product features, platform dependencies, our service interfaces, their code bases, and demonstrated the flow of how work on the code bases flowed through to impact revenue. This ultimately got represented as a Sankey diagram, but it was graphviz/dot I used to sketch up the initial ontology.
Have you tried gephi? It's not exactly an alternative to graphviz (eg. you can't cluster nodes) but it handles much larger graphs and has a bit more flexibility in layout. It has plugins for both dot and neo4j input.
Wow that looks cool. It reminds me of what Orange.app was for regular data viz, this Gephi is for graphs.
It's pretty notable that the coolest data viz is for discovery, whereas most managers just need to know whether the line has gone up or down. I'm thinking there may be a fundamentally different cognitive orientation to whether one is hunting for opportunity or managing a resource.
I found the learning curve to be quite steep with few examples of graphs that looked well-designed. By default, large graphs look like crap because engineers designed it, not designers (it takes both!).
However, that was back in the late 90's. Now the internet has many examples of better looking layouts, but it is still disappointing that they don't look ... designed?
But integration and automation are great. Like GNUPlot, which I've used for decades because it is so easy to automate once you learn how to make plots look better than the default.
GStreamer uses DOT output by default. But any reasonably sized pipeline is almost impossible to read without excessive zooming. But it gets the job done.
Autogenerated database documentation is often pretty hit and miss but tbls[1] does a pretty good job in that space. Especially when you comment on your tables, fields, views, functions etc (which is a good habit anyway!) the output is quite useful
Another graphing tool is Pikchr (https://pikchr.org) from the creator of SQLite; actually the SQL-statement diagrams in the SQLite docs were (re)done using Pikchr. It's a kind of extension of the PIC language.
I'm not very proficient with graphing, so can't compare it to Graphviz, but a few examples I tried were relatively easy.
Awesome! Since switching from Mac to Linux I've been really missing Monodraw[1] for generating diagrams suitable for code comments. Emacs artist-mode is nice but I struggled to get the hang of it, will definitely be trying this!
It's already open sourced on Github [0] (there is a link on bottom right of the page).
Note that the page is simply a wrapper of Graph::Easy [1], so nothing technically interesting to see in the repo - it just passes the HTML input to a command-line tool and prints the result.
You can either use Graph::Easy directly on the command-line or you can use python to make an HTTP request to my page (example is shown in the README of [0]).
The main downside for me is that sometimes it gets the positioning wrong, and you can see how it can be easily fixed, but it's hard to convince graphviz to actually do so. Basically I'd love a tool where I can do 10% of positioning manually and let the rest be constraint based like in graphviz.
It used to have a built in tool to move nodes it has placed. All I can find scanning the docs right now is the -n flag to neato, which will honor existing pos fields.
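For the "position 10% by hand" wish, one avenue worth trying is the `pos` attribute with a trailing `!`, which neato documents as pinning a node at its given position while the remaining nodes are still placed automatically. Below is a minimal Python sketch that emits such a file; the `pinned_dot` function and the coordinates are made up for illustration, and the units `pos` is read in differ between plain `neato` and `neato -n`, so check the docs before relying on exact values.

```python
def pinned_dot(edges, fixed_positions):
    """Emit an undirected DOT graph where some nodes carry a pinned `pos`.

    `fixed_positions` maps node name -> (x, y); all other nodes are left for
    the layout engine to place.
    """
    lines = ["graph G {"]
    for name, (x, y) in fixed_positions.items():
        # The trailing "!" asks neato to keep the node at this position.
        lines.append(f'  "{name}" [pos="{x},{y}!"];')
    for a, b in edges:
        lines.append(f'  "{a}" -- "{b}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(pinned_dot([("a", "b"), ("b", "c")], {"a": (0, 0), "c": (3, 0)}))
```

Piping the output through `neato -Tsvg` should leave `a` and `c` where they were put and fit `b` around them.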
Graphviz has saved my bacon so many times throughout my career. Not just for visualising constraints and dependencies in planning and scheduling (construction, not computing) but in so many other areas.
Best example was for fabrication of a large subsea structure. Often, these critical assets have material traceability requirements on-par with space exploration, producing thousands of pages of certificates and documentation along the way.
I.e. you need to prove and assure the provenance of every piece of material sitting 1 km down on the ocean floor back to the assay report of the soup in the ladle at the forge when it was formed, and everything in between: material certificates, design specs, third party testing, mother plate to material stock to final cut part identification, and all the third-party witnessing and assurance along the way. Easy for small parts but cumbersome for 60tn structures with thousands of parts and welds.
When the requirement is 100% QA/QC coverage, linking IDs to certificates and other documents in each part of the chain meant we could easily visualise things and look for rogue elements like child parts purporting to be from two separate mother plates, or destructive testing coverage for all materials.
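The "child part with two mother plates" check is easy to sketch once the traceability links exist as edges. This is an illustrative Python sketch only: the (plate, part) pair format, the IDs, and both function names are assumptions, not the tooling used on the project.

```python
from collections import defaultdict


def parts_with_multiple_parents(cut_from):
    """Return {part: sorted plates} for parts linked to more than one mother plate.

    `cut_from` is an iterable of (plate_id, part_id) pairs.
    """
    parents = defaultdict(set)
    for plate, part in cut_from:
        parents[part].add(plate)
    return {part: sorted(plates) for part, plates in parents.items() if len(plates) > 1}


def traceability_dot(cut_from):
    """Emit DOT with the rogue parts filled red so they stand out in the drawing."""
    links = list(cut_from)
    rogue = parts_with_multiple_parents(links)
    lines = ["digraph trace {"]
    for part in sorted(rogue):
        lines.append(f'  "{part}" [style="filled", fillcolor="red"];')
    for plate, part in links:
        lines.append(f'  "{plate}" -> "{part}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(traceability_dot([("P1", "a"), ("P2", "a"), ("P1", "b")]))
```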
Not how it's usually done but it was a useful tool for communicating to other not so technical stakeholders and helping some on the project sleep easier. :-)
I use graphviz probably weekly and it's actually become a bit of a productivity hack. If there's something I don't want to do, I can often convince myself to just DRAW it instead, usually with graphviz. Once drawn, the actual task is easier, both because the barrier of starting is overcome, and because the drawing is useful in the task.
For example, I had to parse DWARF debug info recently to scrape type information, something I dreaded doing. Instead of diving into the DWARF specs, I set out to adapt one of the pyelftools [1] examples to produce a .dot file and graphed it with dot, producing [2].
Now looking at the picture, it's nearly obvious how functions and structs and types are stored. The rest is trivia (How do I access this attribute? How do I iterate over a DIE's children?).
Here's a simple alias so that anything that writes to /tmp/tmp.dot can be viewed with a single command:
alias graph='dot -Tpng /tmp/tmp.dot -o /tmp/tmp.png && open /tmp/tmp.png'
GraphViz is great, like Markdown. I recommend https://edotor.net for online use:
1. No set-up, no accounts/registration.
2. Free, no ads.
3. Simple, pleasant UI with variable/keyword autocompletion and even multi-cursor find/replace!
4. Your graph is saved in the URL so sharing the link lets others play with a copy of your graph. I usually link-shorten the URLs since they're long:
The biggest thing that makes me keep going back to graphviz despite there being nicer libraries available is that because it uses a really simple markup language instead of code or complex objects, I can just put a handful of print/write statements into any program I want to analyse and pipe it to graphviz to give me a nice graphical overview of what's going where.
I've used it to visualise/debug everything from web scrapers in Python to directory scanning in Bash to proper graph algorithms in Java. Every time with no library installation or anything, just simple prints to a stream/file and perhaps a system() call at the end if I'm running it often enough.
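A minimal sketch of what "just a few prints" can look like, using a toy crawler in Python. The `crawl` function and the `site` dictionary are invented for illustration; the only Graphviz-specific part is the text being printed.

```python
import sys


def crawl(page, links, seen, out=sys.stdout):
    """Toy depth-first crawl that prints one DOT edge per link it follows."""
    seen.add(page)
    for target in links.get(page, []):
        # The only "instrumentation": one line of DOT per followed link.
        print(f'  "{page}" -> "{target}";', file=out)
        if target not in seen:
            crawl(target, links, seen, out)


if __name__ == "__main__":
    site = {"index": ["about", "blog"], "blog": ["index", "post-1"]}
    print("digraph crawl {")
    crawl("index", site, set())
    print("}")
```

Saved as, say, `crawl.py`, it can be piped straight into the layout tool: `python crawl.py | dot -Tsvg -o crawl.svg`.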
My text editor, KeenWrite[0] renders Graphviz diagrams in near real-time. KeenWrite extends the functionality a little by allowing the use of variables inside the diagrams. In the example diagram[1], the 350 value stems from a variable and its value is shown near the top.
Regarding Graphviz itself, I wonder why is there no special layout logic for planar graphs? They can be recognized and embedded on the plane in linear time without intersecting edges, so it would be very nice if some of the Graphviz tools actually did that.
I've never been that happy with the output from any of the different graphviz engines. I'd like to see a UI graph editing tool which can import and export .dot format, but also allows manually moving nodes (which I guess wouldn't get exported unless arbitrary metadata is supported), and ofc exports to svg. Maybe there's already something similar I haven't stumbled across yet?
I guess my use case is mainly swimlanes and places where nodes don't have to have the same rank.
That is a different point than "basically abandoned" :-) I do use X11 programs on Unix and MacOS (so far XQuartz works well enough). And probably Microsoft Windows though I don't use it.
Odds are slim that its future owner will have the skills to work with such a resource. But I did leave a couple printouts on the boat before leaving it!
I maintain an open-source project [1] that uses graphs to model data. I wanted to make my project as accessible as possible, so Graphviz was perfect since it's dead-simple to install and use on all major OS platforms.
I've used Graphviz for many years and my main gripe with it is the lack of customization. Yes, there's "some" customization, but there's a point (not too far from the initial style) where, if you don't like the output, you're out of luck and might as well redraw it using draw.io or some other WYSIWYG editor.
Graph viz is admirably powerful, even here. The main techniques seem to be invisible objects and custom code, in case that helps. The cassowary solver group in Australia (I'm blanking on the university, Monash?) goes further via custom constraint logic, and I believe iOS layouts were deeply inspired, and maybe even had one of their students.
Graph technologies are tricky to build up -- you can go deep in many areas like this. Afaict, AT&T/Bell Research used to be the main developers of graphviz, and when that corporate lab thinned out, so did graph viz development. A similar story happened with gephi devs when they left their universities (was popular in social data), and cytoscape (popular in bio) is still largely at that fragile state.
We are historically big contributors to OSS at Graphistry, but saw the history of funding issues in this space, and how each tool reached a frustratingly low ceiling due to it. We decided to not start as open core, and instead made a free GPU tier + release individual pieces as OSS (Apache Arrow's JS tier, etc) + initially charge for GPU code, until we could build + grow more commercially targeted parts. Thankfully, we are getting to the point of enough sustainable revenue that we are pushing much more to our OSS libs now, and one of our OSS pushes for 2022 will be moving custom (GPU-accelerated) traditional layouts & AI layouts to the OSS lib pygraphistry :) A request we often get for pygraphistry and our file uploader UI is dotfile (and graphml/gexf) conversion/ingest for then viewing+editing in our scalable interactive graph renderer, so we have been thinking of adding a pygraphistry example just to demonstrate the OSS flow!
How come the graph edges are "--" and the digraph edges are "->", but not "<-"?
That means it's not possible to flip graph/digraph to see how it looks with/without arrows, but naively it doesn't seem that the difference makes any difference except making that difficult, if the connections must always be written left to right. (I've not used more advanced features, so it could be that it does more, or could be that it was built expecting people to know what they want first, which would all be fine)
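One workaround that stays inside a digraph is the `dir` edge attribute: `dir=none` draws an edge without arrowheads, and `dir=back` puts the arrowhead at the tail, which is the closest thing to a "<-". The Python sketch below emits both variants from the same edge list; the `edges_dot` function and its `arrows` flag are hypothetical helpers, not part of Graphviz.

```python
def edges_dot(edges, arrows=True):
    """Emit a digraph from (src, dst, reverse) triples.

    With arrows=False a default edge attribute hides every arrowhead, so the
    same source can be drawn with or without arrows by flipping one flag.
    """
    lines = ["digraph G {"]
    if not arrows:
        lines.append("  edge [dir=none];")
    for src, dst, reverse in edges:
        # dir=back draws the arrowhead at the tail; skip it when arrows are off
        # so the per-edge attribute does not override the dir=none default.
        attr = " [dir=back]" if (reverse and arrows) else ""
        lines.append(f'  "{src}" -> "{dst}"{attr};')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [("a", "b", False), ("b", "c", True)]
    print(edges_dot(sample, arrows=True))
    print(edges_dot(sample, arrows=False))
```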
FRA, the Swedish government agency for radio communications posts some riddles every year around Christmas. With one you got a couple of phone call logs and were asked questions about who was who in a terrorist group.
Plotted the calls with graphviz and the questions were instantly trivial to solve. Great tool with many applications.
Many moons ago I created all graphs in the LPIC2 exam prep book [1] using graphviz. The book is open sourced now, and all dot files are preserved in the accompanying repo [2].
One thing I learned creating these graphs was that with some tinkering you have quite a lot of control on how the result will look, without littering the semantic in the source file with a lot of markup.
I choose graphviz because the dot notation can easily be checked into source control. It is also easy to create reproducible results using a makefile.
My team uses a succinct graphviz dot file to note the dependencies between different git repositories. This is then used by the build system to figure out what needs to be rebuilt. There are lots of other ways to skin this cat, this is just the method we chose, and it's oddly pleasing.
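A sketch of how a build script might consume such a file, in Python. Everything here is an assumption for illustration: the regex only handles simple one-edge-per-statement lines (no chains like `a -> b -> c`, no attributes, no subgraphs), an edge `a -> b` is taken to mean "a depends on b", and the repository names are made up. A real DOT parser would be the safer choice for anything less regular.

```python
import re
from collections import defaultdict, deque

# Matches simple statements such as:  app -> lib;   or   "app" -> "lib";
EDGE = re.compile(r'"?([\w./-]+)"?\s*->\s*"?([\w./-]+)"?')


def parse_edges(dot_text):
    """Extract (src, dst) pairs from simple one-edge-per-statement DOT text."""
    return EDGE.findall(dot_text)


def needs_rebuild(dot_text, changed):
    """Return every repo that depends, directly or transitively, on `changed`."""
    dependents = defaultdict(set)
    for src, dst in parse_edges(dot_text):
        dependents[dst].add(src)
    result, queue = set(), deque([changed])
    while queue:
        for repo in dependents[queue.popleft()]:
            if repo not in result:
                result.add(repo)
                queue.append(repo)
    return result


if __name__ == "__main__":
    deps = "digraph deps {\n  app -> lib;\n  lib -> core;\n  tool -> core;\n}"
    print(sorted(needs_rebuild(deps, "core")))
```

The same file still renders with `dot`, so the picture and the build logic cannot drift apart.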
GraphVizio[1] "an addin for Microsoft Visio which allows you to layout Visio diagrams using the renowned Graphviz algorithms developed by AT&T research laboratories." - https://www.calvert.ch/graphvizio/
I used Graphviz years ago to generate diagrams for NLP structures, and other visualizations. It was the only thing that could do what I needed. I ended up moving to Processing/P5 to create "spring physics" interactive graphs, then wrote my own eigenvector embedding visualization that could project into 3D space and had a moveable camera so you could fly around with WASD and mouse. Cool stuff! Graph visualization and 3d graphics was a fun rabbit hole
Do you happen to have any open source code for this that I could look at? I am looking to do some interactive graph viz and would love to see if there is something I could reference.
Basically any sketch that has "3d" or "aspekt" in the title is on that track. But you can also see from the thumbnail many of them are 3d world related. The navigation of that site is pretty tricky...but you can find the source code by:
- Click on a sketch
- Click on the 'abacus' control menu to the right
- Click on the files tab in that
Source code in Java should be available there.
Please remember it requires a version of Processing in Java to work. I worked on this in 2011/2012 so you may need to figure out which Processing version that is.
Also, here's a youtube video of flying around one of my sketches:
Used it as a helper in my first project at Snowflake (role hierarchy layout) https://github.com/Snowflake-Labs/sfgrantreport and won a nice hacking award for it. Then the product saw it and folded what I made into the main UI.
Huge props to the graphviz for making it easy to draw my ideas from whiteboard into svg/pdf/png!
Related, does anyone know if there are any good tools for learning things about a graph (not necessarily a graphviz graph, but that'd be fine!)? Things like... take a graph from me and identify all the "islands" that don't connect to each other? Or to find the shortest 'path' from point A to point B in a graph?
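Both questions come down to breadth-first search, so a small standard-library Python sketch may be enough for modest graphs. The function names and the edge-list format are assumptions for illustration; edges are treated as undirected and unweighted. For graphs already in DOT, Graphviz also ships a `ccomps` filter for connected components, and networkx offers equivalent functions.

```python
from collections import defaultdict, deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (a, b) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def islands(edges):
    """Return the connected components ("islands") as a list of sets."""
    adj = build_adjacency(edges)
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:
            for nxt in adj[queue.popleft()]:
                if nxt not in component:
                    component.add(nxt)
                    queue.append(nxt)
        seen |= component
        components.append(component)
    return components


def shortest_path(edges, source, target):
    """Return a fewest-hops path as a list of nodes, or None if unreachable."""
    adj = build_adjacency(edges)
    previous, queue = {source: None}, deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:  # walk the predecessor links back to source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in adj[node]:
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None
```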
Graphviz is one of my favorite tools. I use it for most of my diagraming https://github.com/dylanowen/mdbook-graphviz and I use it as debugging output when I'm working through graph problems.
Likewise! I have used it to easily visualize dependency graphs at multiple employers. The pictures can get a bit messy when they're very large, but they were amazing for detecting incorrect/missing dependencies.
Did anyone see the new pathing algorithm for connecting graph nodes that made some rounds on twitter? It looked way better than a normal force-directed graph. I can’t remember what it was exactly, and have lost the reference, but it was a side effect of solving what looked like a traffic problem.
Graphviz is one of my favourite tools, but I use it very rarely. I noticed a mention of support for UTF-8 and some great documentation layout improvements, but what are the other big features that have been implemented in the last few years?
Most of the work has been on CI/CD, Windows, rationalizing and modernizing the code, and Zheng, Pawar and Goodman contributed a stochastic gradient descent solver, described here, https://arxiv.org/abs/1710.04626 with benefits pointed out by Börsig, Brandes and Pasztor https://arxiv.org/abs/2008.10376 but it's not a default solver, you'd have to ask for it. Great people joined (took over?) the project and have done an amazing job of triage on open bugs, and some were even fixed. We rewrote parts of the network simplex solver to eliminate some of the dumbfounding "trouble in init_rank" errors. We have a long wish list, and are hoping for funding to do some of this work.
It's great to see it still going. Used it for Puppet as well as dpkg output (we made a full dependency list of our multimedia ingest application, i.e. lots of custom ffmpeg and libboost etc). That thing was huge
I've long marveled at one of my ex-colleagues' graphviz-fu. He'd whip out a new graphviz online doc and explain an architecture through it. Helped a ton during the pandemic as we were all remote.
I used this to generate relationship diagrams between companies, adding in projects and who is working on what; the final image was a map of an ecosystem of that set of businesses.
I used to generate most of my diagrams in PlantUML[0], but recently found that my note-taking app, Joplin[1], has Mermaid built in.
PlantUML has the benefit that you can include it in your builds, e.g. parse any .plantuml files in `make doc` or in your CI.
But I like Mermaid more: it's easier to read, less quirky, and easy to integrate into markdown (or note-taking apps) without the need for extra build steps.
I've had good success with cytoscapejs as well, integrating with react so the graph will animate layout on transition. I use elk, which is apparently superior to sugiyama (the layered algorithm graphviz uses for dag layouts in dot, as opposed to a force-directed one).
I think dagre was ok but oversold the quality of the algos it used a bit. In many cases it can be better to use a SVG file generated from Graphviz directly instead.
Edit: I don't know of a really good visual tool as I've found text editing .dot files directly with a live visualization pretty nice. I use graphviz pretty regularly.