I want to be able to trigger datasets to be rebuilt automatically when their dependencies change, which as I understand it is a large part of Pachyderm's value proposition. What's unclear to me is how to integrate Pachyderm into the larger data ecosystem.

My users expect data to be available through a Hive metastore or the AWS Glue Data Catalog, and to be able to query it with AWS Athena, Snowflake (as external tables), and other off-the-shelf tools. I also need to be able to leverage Apache Iceberg (or Delta Lake, Hudi, etc.) to incrementally update datasets that are costly to rebuild from scratch.

It doesn't seem that Pachyderm can do any of these things, but maybe I'm just missing how it would do them? I would love to have a scheduler that is only responsible for triggering datasets to update when their dependencies change, but Pachyderm seems to be built around a closed ecosystem, which makes it incompatible with tools outside that ecosystem.
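For concreteness, this is roughly the kind of incremental update I would want a dependency-aware scheduler to trigger. It's just a minimal sketch using PyIceberg against a Glue-backed catalog; the catalog name, table name, and compute_metric are made up for illustration, not something Pachyderm gives you today.

```python
# Hypothetical sketch of the update step a scheduler would invoke when an
# upstream dependency changes. Assumes an Iceberg table registered in the
# AWS Glue Data Catalog (names are made up), so Athena or Snowflake external
# tables can see the new data as soon as the commit lands.
import pyarrow as pa
from pyiceberg.catalog import load_catalog


def compute_metric(day: str) -> float:
    # Placeholder for the actual (expensive) transform over the changed inputs.
    return 0.0


def refresh_partition(changed_date: str) -> None:
    # Load a Glue-backed Iceberg catalog via PyIceberg.
    catalog = load_catalog("glue_catalog", type="glue")
    table = catalog.load_table("analytics.daily_metrics")  # made-up table name

    # Recompute only the rows derived from the changed dependency,
    # instead of rebuilding the whole dataset from scratch.
    new_rows = pa.table({
        "metric_date": [changed_date],
        "value": [compute_metric(changed_date)],
    })

    # Swap out just that day's data: delete the stale rows, append the new ones.
    table.delete(delete_filter=f"metric_date = '{changed_date}'")
    table.append(new_rows)
```

The scheduler I'm imagining would only be responsible for calling something like refresh_partition with the right arguments whenever an upstream dataset commits new data; everything else (catalog, storage, query engines) stays in the existing ecosystem.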