I'm not sure if you're correct or wrong? But its a strange statement to make.
Like, any graph can have a "minimum spanning tree" representation for example. But the minimum-spanning tree has some degree of information loss compared to the original.
Similarly: a graph / tree could have a Lexicographic ordering (ie: a list representation), but this list-representation lost a lot of information in the translation process.
---------
It just turns out that a lot of algorithms could work on the minimum-spanning tree (or the lexicographic list) of the graph.
So instead of working on the general graph of a problem, you simplify it down into a simpler structure, and THEN work on the simplified structure.
In the table/dataframe -> hypergraph dataframe transform @ https://github.com/graphistry/pygraphistry , we do `hypergraph(multicolumn_table, direct=True | False)['graph'].plot()` , which renders hypergraphs as a regular graph. That lets you pick. Saves a lot of wrangling!
Consider exploring some logs of customer activity or security events. A hyperedge becomes either:
- a node of a bipartite graph. Ex: each log event becomes a node connecting the various entity nodes it mentions. Event <> IPs, accounts, countries, ...
- .. or a bunch of pairwise entity<>entity edges. Ex: connect each IP<>account<>country directly, and label each edge with the hyperedge event id it came from.
In both cases, you can now directly leverage a lot of traditional graph thinking, and in our case, GPU acceleration & visualization.
Other systems might render hyperedges as say circles encomposing their nodes, but that's trickier at even small/medium scales, so I haven't seen much widely popular
I increasingly just directly equate 'logs' with 'property hypergraphs' and skip the relational step. Funny enough, a lot of our enterprise+gov work is undoing the weird tabular & taxonomy optimizations of SQL engines that leak into the user experience when we're to get simple 360 views for users. It's cool an org built out 10,000 tables, but..... :) Same thing when we're doing graph neural networks: logs => hypergraphs (encoded as bipartite or regular property graphs) => event embeddings / classifications / predictions .
> I increasingly just directly equate 'logs' with 'property hypergraphs' and skip the relational step. Funny enough, a lot of our enterprise+gov work is undoing the weird tabular optimizations of SQL engines to get 360 views of data back to users. It's cool someone has 10,000 tables, but..... :)
The more I program, the more I realize that all these data-structures are conceptually the same if you squint enough.
Graphs are matrices (Edge == 1. Non-edge == 0). Matrices are graphs (see sparse matricies, like COO and its really obvious). Computer Code is graphs, Databases are Hypergraphs. Circuits are hypergraphs. Etc. etc. Everything seems to convert into each other.
The study of NP-completeness is the study of graph (or hypergraph) complexity. Chordal graphs and Bipartite Graphs seem to have significantly more algorithms in P-complexity rather than NP-complexity. (Assuming P =/= NP)
--------
There are questions about which forms are easier to think about... both for the human and the computer. At least, for the equivalent forms (Relational Databases vs Hypergraph representations seem to be equivalent, though I'm not 100% sure)
When one form is "less powerful" than another form, you often get significant improvements in speed. Lists are less powerful than trees, but almost all list algorithms operate way faster.
"Hypergraphs can be viewed as incidence structures. In particular, there is a bipartite 'incidence graph' or 'Levi graph' corresponding to every hypergraph, and conversely, most, but not all, bipartite graphs can be regarded as incidence graphs of hypergraphs."
A hypergraph over vertices V can be viewed as a subset of the power set of V, i.e. E ⊆ 2^V. Which can be represented by a bipartite graph over vertices (V ∪ E), where v ∈ V and e ∈ E are connected iff V ∈ E.
Representing a hypergraph as a bipartite graph doesn't lose any information. The entire structure is still there. It will take you two pointer indirections instead of one to traverse an edge, but otherwise it's exactly the same - all the connections are still in the right places.
There must be something being lost in translation here, because bipartite graphs cannot represent all graphs, let alone hypergraphs.
* A -> B
* B -> C
* C -> A
This is a simple, 3-node cyclical graph that fails to be bipartite. All graphs are hypergraphs, so this is also an example of a hypergraph that fails to be bipartite.
EDIT: Funny enough, if we remove C->A and replace it with "C -> D" and "D->A" it'd be bipartite, with A/C as one side and B/D being the other side.
-------
We can convert this cyclic graph into other representations. For example:
* A-> B
* B-> C
Now we have a minimum spanning tree, which also happens to be a list [A, B, C]. This was accomplished by removing an edge (the C->A edge).
----
There are a great many operations that "see" a graph in other forms. For example, the following list describes the graph:
1. A->B
2. B->C
3. C->A
This "list representation" of the graph is certainly... a list... and it captures all of the information of the graph. But its also not really useful to do graph operations in this form. It may take a long time to find a node's edges for example.
I'm assuming you're saying that there's some kind of "bipartite form" that can accurately describe all hypergraphs. But while that's technically true, that's also not really what people mean.
I'm not saying that all hypergraphs - or all graphs - literally are bipartite graphs, I'm saying that you can construct a bipartite graph that represents the same structure as the hypergraph.
Given the hypergraph:
A -> (B, C)
you can convert it to a bipartite graph by introducing an extra vertex of the second type to represent the hyper-edge:
A -> e
e -> B
e -> C
You can do the same thing starting from a "regular" graph, although it's kind of silly for most purposes; for the basic cycle, you'd have:
A -> e1
e1 -> B
B -> e2
e2 -> C
C -> e3
e3 -> A
I'd say that's a much more valid representation of the graph than the spanning tree, which is lossy (since you can't tell from [A, B, C] whether or not there's an edge C->A).
You have a vertex set {A, B, C} and an edge set {A->B,B->C,C->A}. I'll label the edges D,E,F for convenience.
We define an incidence structure between the vertices and edges, by making red vertices to represent original vertices, and blue vertices to represent original edges. A directed edge A->B is represented by the path A->D->B in our incidence structure. In total, our incidence structure has six nodes {A,B,C,D,E,F} and six edges:
A->D
B->E
C->F
D->B
E->C
F->A
Said differently, we have subdivided every edge of the original graph into two edges in the incidence structure. That doubles the lengths of all cycles, so we don't see any odd cycles; also we have an explicit bipartition into red and blue.
All that said... directed hypergraphs are kinda wild.
The idea is that if you have a bipartite graph and the sets V and W are the vertex sets for the bipartition, then V is the hypergraph's vertices and W is the hypergraph's hyperedges. A hyperedge w in W is incident to whatever it's adjacent to according to the bipartite graph.
Like, any graph can have a "minimum spanning tree" representation for example. But the minimum-spanning tree has some degree of information loss compared to the original.
Similarly: a graph / tree could have a Lexicographic ordering (ie: a list representation), but this list-representation lost a lot of information in the translation process.
---------
It just turns out that a lot of algorithms could work on the minimum-spanning tree (or the lexicographic list) of the graph.
So instead of working on the general graph of a problem, you simplify it down into a simpler structure, and THEN work on the simplified structure.