In the table/dataframe -> hypergraph transform @ https://github.com/graphistry/pygraphistry , we do `hypergraph(multicolumn_table, direct=True | False)['graph'].plot()`, which renders a hypergraph as a regular graph. The `direct` flag lets you pick which encoding. Saves a lot of wrangling!
Consider exploring some logs of customer activity or security events. A hyperedge becomes either:
- a node of a bipartite graph. Ex: each log event becomes a node connecting the various entity nodes it mentions. Event <> IPs, accounts, countries, ...
- .. or a bunch of pairwise entity<>entity edges. Ex: connect each IP<>account<>country directly, and label each edge with the hyperedge event id it came from.
In both cases, you can now directly leverage a lot of traditional graph thinking, and in our case, GPU acceleration & visualization.
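The two encodings above can be sketched in plain Python (no pygraphistry needed) over a couple of hypothetical log rows; the event ids, column names, and values here are made up for illustration:

```python
from itertools import combinations

# Hypothetical log events: each row mentions several entities (one hyperedge per row).
events = [
    {"event_id": "e1", "ip": "10.0.0.1", "account": "alice", "country": "US"},
    {"event_id": "e2", "ip": "10.0.0.1", "account": "bob",   "country": "DE"},
]

# Encoding 1 (bipartite): each event becomes its own node,
# linked to every entity node it mentions.
bipartite_edges = [
    (row["event_id"], f"{col}:{val}")
    for row in events
    for col, val in row.items()
    if col != "event_id"
]

# Encoding 2 (direct): pairwise entity<>entity edges,
# each labeled with the hyperedge event id it came from.
direct_edges = [
    (f"{ca}:{va}", f"{cb}:{vb}", row["event_id"])
    for row in events
    for (ca, va), (cb, vb) in combinations(
        [(c, v) for c, v in row.items() if c != "event_id"], 2
    )
]
```

With 3 entities per event, each hyperedge yields 3 bipartite edges or C(3,2) = 3 pairwise edges; either list drops straight into an ordinary graph tool.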
Other systems might render hyperedges as, say, circles encompassing their nodes, but that gets tricky at even small/medium scales, so I haven't seen that approach become widely popular.
I increasingly just directly equate 'logs' with 'property hypergraphs' and skip the relational step. Funny enough, a lot of our enterprise+gov work is undoing the weird tabular & taxonomy optimizations of SQL engines that leak into the user experience when we're trying to get simple 360 views for users. It's cool an org built out 10,000 tables, but..... :) Same thing when we're doing graph neural networks: logs => hypergraphs (encoded as bipartite or regular property graphs) => event embeddings / classifications / predictions.
> I increasingly just directly equate 'logs' with 'property hypergraphs' and skip the relational step. Funny enough, a lot of our enterprise+gov work is undoing the weird tabular optimizations of SQL engines to get 360 views of data back to users. It's cool someone has 10,000 tables, but..... :)
The more I program, the more I realize that all these data-structures are conceptually the same if you squint enough.
Graphs are matrices (edge == 1, non-edge == 0). Matrices are graphs (see sparse matrices, like COO, where it's really obvious). Computer code is graphs, databases are hypergraphs, circuits are hypergraphs. Etc. etc. Everything seems to convert into everything else.
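The graph <-> matrix equivalence is literal for COO: an edge list and a COO sparse matrix are the same triples. A toy round trip (the edge list here is made up):

```python
# An edge list IS a COO sparse matrix: each edge (i, j) is a stored 1 at row i, col j.
edges = [(0, 1), (1, 2), (2, 0)]

# Graph -> COO triples (row, col, value)
coo = [(i, j, 1) for i, j in edges]

# COO -> dense adjacency matrix (edge == 1, non-edge == 0)
n = 3
dense = [[0] * n for _ in range(n)]
for i, j, v in coo:
    dense[i][j] = v

# Dense matrix -> graph again: the nonzero entries are exactly the edges.
recovered = [(i, j) for i in range(n) for j in range(n) if dense[i][j]]
```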
The study of NP-completeness is largely the study of graph (or hypergraph) complexity. Chordal graphs and bipartite graphs seem to admit polynomial-time algorithms for many problems that are NP-complete on general graphs (assuming P =/= NP).
--------
There are questions about which forms are easier to think about... both for the human and the computer. At least for the equivalent forms (relational databases vs. hypergraph representations seem to be equivalent, though I'm not 100% sure).
When one form is "less powerful" than another, you often get significant improvements in speed. Lists are less powerful than trees, but almost all list algorithms run way faster.