
Thanks! And thanks for sharing the pointer - I think I've seen `mltrace` at some point in the past. The tool has some similarities, but seems different from `mandala` on a philosophical level - `mltrace` seems more opinionated and domain-specific to ML. `mandala`'s goal is to make persistence logic a more generic part of Python semantics, and there's much more emphasis on composing `@op`s freely in whatever complex ways you want. Python functions are great and everyone knows what they do - let's give them more superpowers!
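
To make the "superpowers" point concrete, here's a minimal sketch of what composing `@op`s looks like (treat the exact import paths and constructor arguments here as approximate):

    # minimal sketch; exact import paths / constructor arguments are assumptions
    from mandala.imports import op, Storage

    @op  # an ordinary Python function, given persistence "superpowers"
    def increment(x: int) -> int:
        return x + 1

    @op
    def add(x: int, y: int) -> int:
        return x + y

    storage = Storage()  # in-memory by default; can be pointed at a file

    with storage:  # calls inside the block are memoized and tracked
        a = increment(20)
        b = add(a, a)  # @ops compose freely, like plain functions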

If this is something you're interested in, I can probably give a more detailed comparison if I find the time.


Thanks Rachit! Great running into you after all these years!

Being aware of types is certainly a must in a more performance-critical implementation; this project is not at this stage though, opting for simplicity and genericity instead. I've found this best for maintenance until the core stabilizes; plus, it's not a major pain point in my ML projects yet.

Regarding incremental computation: the main idea is simple. If you call a function on some inputs, and this function was called on the same inputs (modulo hashing) in the past, and that past call used a set of dependencies that currently have the same code as before (or compatible code, if manually marked so by the user), then the past call's outputs are reused. A key question here: will there be at most one such past call? Yes, if functions invoke their dependencies deterministically.
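
A toy sketch of this lookup logic - not the actual implementation, just an illustration of the idea (the cache layout and helper names are made up for the example):

    import hashlib
    import inspect

    import joblib  # used to hash arbitrary input objects

    # (op_name, input_hash) -> {"dep_code": {name: code_hash}, "outputs": ...}
    call_cache = {}

    def code_hash(fn):
        # hash of the function's current source code
        return hashlib.sha256(inspect.getsource(fn).encode()).hexdigest()

    def lookup_past_call(op, inputs, current_deps):
        """Reuse a past call iff its recorded dependencies still match the codebase."""
        key = (op.__name__, joblib.hash(inputs))
        past = call_cache.get(key)
        if past is not None and all(
            code_hash(current_deps[name]) == h
            for name, h in past["dep_code"].items()
        ):
            return past["outputs"]  # hit: same inputs, same dependency code
        return None  # miss: recompute, then record a new call under `key`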

You can probably add some tools to automatically delete old versions and everything that depends on them, but this is definitely a decision the user must make (e.g., you might want to be able to time-travel back to look at old results). I'm happy to answer more nuanced questions about the incremental computation implementation in `mandala` if you have any!


Thanks for sharing! This is a great project. It is quite close to the memoization part of `mandala` and I'll add it to the related work in the README. I think the similarities are:

- using `joblib` to hash arbitrary objects (a good fit for ML, which involves a lot of numpy arrays - something joblib is optimized for)

- how composition of decorated functions is emphasized - I think that's very important

- wrapping outputs of memoized functions in special objects: this encourages composition, and also makes it possible to run pipelines "lazily" by retracing memoized calls without actually loading large objects in memory

- versioning: in a past version of `mandala`, I used the solution you provided (which is now subsumed by the versioning system, but it's still quite helpful)

The differences:

- w.r.t. memoization, in `mandala` you can represent data structures in a way that's transparent to the system. E.g., you can have a memoized function return a list of things, and each thing will have an independent storage address and be usable by downstream memoized functions. Most importantly, this is tracked by the provenance system (though I'm not sure - maybe this is also possible in `provenance`?); see the sketch after this list.

- one big finding (for me) while doing this project is that memoization on its own is not sufficient to manage a complex project; you need some declarative way to understand what has been computed. This is what all the `ComputationFrame` stuff is about.

- finally, the versioning system: as you mention in the `provenance` docs, it's almost impossible to figure out what a Python function depends on, but `mandala` bites this bullet in a restricted sense; you can read about it here: https://amakelov.github.io/blog/deps/
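
Here's roughly what the data-structure transparency from the first difference looks like (as above, treat exact names/APIs as approximate):

    # sketch of element-level tracking for a returned list; exact APIs are assumptions
    from mandala.imports import op, Storage

    @op
    def make_splits(n: int) -> list:
        return [list(range(i, i + 3)) for i in range(n)]

    @op
    def score_split(split: list) -> float:
        return sum(split) / len(split)  # stand-in for real work

    storage = Storage()
    with storage:
        splits = make_splits(3)
        # each element of the returned list gets its own storage address,
        # so downstream @ops consume elements individually and the
        # provenance of each score is tracked separately
        scores = [score_split(s) for s in splits]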

Re:Unison - yes definitely; it is mentioned in the related work on github! A major difference is that Unison hashes the AST of functions; `mandala` is not that smart (currently) and hashes the source code.


Great question! The versioning system does something essentially equivalent to what you describe. It currently works as follows:

- When a call to an `@op` is executed, the system keeps a stack of the @track-decorated functions that get called along the way (you could add some magic - not implemented currently - via import-time decoration to automatically track all functions in your project; I've opted against this by default to keep the system more explicit).

- The "version" of the call is a hash of the collection of hashes of the code of the `@op` itself and the code of the dependencies that were accessed

- Note that this is much more reliable than static analysis, and much more performant/foolproof than using `sys.settrace`; see this blog post for discussion: https://amakelov.github.io/blog/deps/

- When a new call is made, it is first checked against the saved calls. The inputs are hashed, and if the system finds a previous call on these hashes where all the dependencies had the same (or compatible) code as the current codebase, this call is re-used. *Assuming deterministic dependencies*, there can be at most 1 such call, so everything is well-defined. I don't think this is an unrealistic assumption, though you should keep it in mind - it is pretty fundamental to how this system works. Otherwise, disambiguating versions based on the state of the code alone is impossible.

- When a dependency changes, you're alerted which `@op`s' versions are affected, and given a choice to either ignore this change or not (in the latter case, only calls that actually used this dependency will be recomputed).
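
Putting the above together, a sketch of how this looks from the user's side (the import paths and the `deps_path` argument should be treated as approximate):

    # sketch only; exact names/arguments are assumptions
    from mandala.imports import op, track, Storage

    @track  # a dependency: its source is hashed into the version of any @op that calls it
    def preprocess(x: float) -> float:
        return x / 2.0

    @op
    def score(x: float) -> float:
        return preprocess(x) + 1.0

    storage = Storage(deps_path="__main__")  # turn on dependency tracking
    with storage:
        y = score(10.0)  # version = hash of score's code + preprocess's code
    # editing preprocess() later marks score's version as affected; you can
    # ignore the change, or let only the calls that used preprocess recompute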

The versioning system is mostly a separate module (it can't be imported independently of the rest yet, but making that possible should be pretty doable). I'd love to hear more about your use case. It may not be too difficult to export just versioning as its own thing - though from what you describe, it should also have some component of memoization? As in, you seem to be interested in using this information to invalidate/keep things in the cache?


Forgot to mention: yes, the dependency tracking is transitive, i.e. if your @op calls a @track-decorated function, which in turn calls another @track-decorated function, then both dependencies will show up, etc.
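
A small sketch of the transitive case (same caveats about exact names):

    from mandala.imports import op, track, Storage

    @track
    def normalize(x: float) -> float:   # called only indirectly, still tracked
        return x / 10.0

    @track
    def featurize(x: float) -> float:   # calls another tracked function
        return normalize(x) + 1.0

    @op
    def predict(x: float) -> float:
        return featurize(x) * 2.0       # depends on featurize AND normalize

    with Storage(deps_path="__main__"):
        predict(5.0)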


Thanks! Yes, I think a caching solution like this is great for notebooks, because it makes it very cheap to re-run the whole thing (as long as you've reasonably organized long-running computations into `@op`s), overcoming the notorious state problem.

Graphviz is indeed a lifesaver, and you can similarly think of `mandala` as a "build system for Python objects" (with all the cool and uncool things that come with that; serializing arbitrary Python objects with strong guarantees is hard: https://amakelov.github.io/mandala/tutorials/gotchas/).

I have no experience with Rust, but I'd be curious to hear about the difficulties that came up. I'd expect to see some overlap with Python!


What I can remember immediately:

1. Imports are more complex than in Python, because a module can be just a block in code, not necessarily a separate file/folder. E.g. `pub mod my_module { <code with functions and constants> }` is a module inside another module, so you don't need a folder with an `__init__.py` inside to have inner modules.

Also, `use something` may mean an external crate.

`use super::something` means importing from one level up in the module tree, but that's not necessarily a folder.

2. I can parse what types my functions require in their signatures, or what types structs require in their members (but I must have already resolved where those names really point). There's also type inference -- i.e. you don't need to explicitly write the type of every variable, it's deduced from what is assigned; for example, `let my_var = some_func(...)` will make `my_var` have the return type of `some_func`. So now I must also keep track of all functions and what they return.

And then, there are generics:

    let my_var: Vec<MyType> = vec![...];
Vec is generic, and in this case it has MyType inside. Well, this may be enough to just register `MyType` on my deps list. But then I may do some more calls:

    let my_other_var = my_var.pop().unwrap().my_method();
Here, `pop()` returns `Option<MyType>`, unwrap returns `MyType`, and then in the end, my_method may return whatever, and I essentially need something like a compiler to figure out what it returns.

This starts to look like a little compiler or a language server.


Oh also totally missed the Borges mention the first time - I'm a big fan of his stories!


Thanks! Indeed, despite the fact that the main goal is to track ML experiments, the approach taken in `mandala` has a lot in common with e.g. time-travel debugging (https://en.wikipedia.org/wiki/Time_travel_debugging).

In reality, there are many similarities between experiment tracking, debugging, and high-level computation graphs - they're all different ways of getting a handle on what a program did when it was run.


This blog post gives an overview of the core dependency tracking logic: https://amakelov.github.io/blog/deps/


thank you!


Ah, yes, the notorious state problem in notebooks. In your project, do you find the dependencies statically or dynamically?


Statically - basically just parsing the code into an AST and then walking through the tree to collect information about variable usage and definitions.
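
For the curious, a minimal sketch of that idea using Python's `ast` module (not the project's actual code):

    import ast

    source = "y = f(x) + 1\nz = y * 2"
    defined, used = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defined.add(node.id)   # assignment targets
            else:
                used.add(node.id)      # reads, including called names like f

    print(defined)  # {'y', 'z'}
    print(used)     # {'f', 'x', 'y'}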


Great question - personally, I mostly use it from notebooks, and I think it's a great fit for that. Bundling experiment tracking with incremental computation makes a lot of sense in a notebook (or any other interactive) environment, because it solves the problem of state: if all your computations are end-to-end memoized, re-running the entire notebook is cheap (I routinely do this "retracing" to just get to some results I want to look at).

That being said, nothing prevents you from running this in a script too, and there are benefits to doing this as well. If your script mostly composes `@op` calls (possibly with some light control flow logic as needed), you get resumability "for free" at the level of individual `@op` calls after a crash. However, the workflow you're describing may require some features that aren't implemented (yet) if your runs write to different storages. `mandala` makes it easy to get a "holistic" picture of a single `Storage` and compare things there; comparing across storages will be more awkward. But it shouldn't be too hard to write a function that merges storages (and it's a very natural thing to do, as they're basically big tables).
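
To illustrate the resumability point, a sketch of a script built from `@op` calls (the `db_path` argument is a guess at the constructor; treat it as approximate):

    # sketch; exact Storage constructor arguments are assumptions
    from mandala.imports import op, Storage

    @op
    def run_trial(seed: int) -> float:
        return float(seed) ** 0.5  # stand-in for a long-running experiment

    storage = Storage(db_path="results.db")  # persistent, survives crashes
    with storage:
        for seed in range(100):
            run_trial(seed)  # on re-run after a crash, completed seeds are
                             # retraced from storage instead of recomputed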


Thanks for the in-depth explanation! I’ll keep an eye on it :)

Fine-grained crash recovery does sound like a great application as well.

