- omnisci SQL. in contrast, rapids.ai opens up the layers below (cudf, arrow, dask, ...), which enables cooperating solns on top (blazingsql, cugraph, custreamz, prefect, ...) that are faster + easier for their domains, w/ fallback to general dataframes/sql.
-- omnisci is governed by a VC-backed co, while rapids.ai is governed by nvidia (who wants to sell hw, not sw) plus more OSS partners
Omnisci did good engineering, so it does have strengths. ex: its geospatial visual analytics make it worth considering as an esri alternative, as it is more polished than manually stitching together cuspatial + blazingsql + leaflet etc. Likewise, it is commercially polished for hostile enterprise environments (procurement, ...).
re:scale, see the rapids tpcx-bb numbers ('big data'), I think on 10TB datasets. they show scale + cost-effectiveness wins vs others. less obvious: it runs out-of-core, so it can do TBs even on one GPU, and the full tpcx-bb needed the above versatility, in places where sql alone is a kludge.
re:graph vs table, if you load just points and no edges, the node table is just a regular table that you can do regular tabular data analysis + viz on. ex: load in samples scored by some ml model (x/y plot w/ lots of data columns for each point), then connect nearest neighbors to turn it into an interactive graph. we are doing more and more of this in practice, it's fun :)
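
a rough sketch of the "connect nearest neighbors" step, in plain python so it runs anywhere. the node table, its column names, and the `knn_edges` helper are all made up for illustration; this is a brute-force stand-in, not what rapids or any particular tool does internally, and on real data you'd want a gpu/approximate knn instead.

```python
import math

def knn_edges(points, k=2):
    """Return (src, dst, dist) edges linking each point to its k nearest neighbors."""
    edges = []
    for i, (xi, yi) in enumerate(points):
        # Brute-force distances to every other point: O(n^2) overall, fine for a sketch.
        dists = [
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(points)
            if j != i
        ]
        dists.sort()
        for d, j in dists[:k]:
            edges.append((i, j, d))
    return edges

# Hypothetical node table: one row per sample, model-scored x/y plus extra columns.
nodes = [
    {"id": 0, "x": 0.0, "y": 0.0, "label": "a"},
    {"id": 1, "x": 1.0, "y": 0.0, "label": "a"},
    {"id": 2, "x": 0.0, "y": 2.0, "label": "b"},
    {"id": 3, "x": 10.0, "y": 10.0, "label": "b"},
]

# The edge table is derived from the node table; the nodes themselves stay a plain table.
edges = knn_edges([(n["x"], n["y"]) for n in nodes], k=1)
```

the node table stays a regular table the whole time; the edge list is just a second table of (src, dst, dist) rows that a graph viz can join back against it.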