Having separate module interface and module body files, as in Ada or Modula-2/3,...

masklinn · on Aug 29, 2021

> Having a terse little file where I can scan the interface of a module, rather than having to scroll through the implementation and see which declarations have `public` in front of them, is a great way to quickly refresh your mental model of a module's API.

That information is trivial to extract, there is no reason to force the developer to maintain it and keep it in sync.
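As a rough illustration of that kind of extraction, here is a minimal sketch in Python using the standard `ast` module. Python has no `public` keyword, so the sketch assumes the common convention that names starting with an underscore are private. The `public_interface` helper and the `EXAMPLE` source are hypothetical, made up for this illustration.

```python
import ast


def public_interface(source: str) -> list[str]:
    """Return one line per public top-level function or class in `source`."""
    lines = []
    for node in ast.parse(source).body:
        # Assumption: a leading underscore marks a private name.
        if isinstance(node, ast.FunctionDef) and not node.name.startswith("_"):
            returns = f" -> {ast.unparse(node.returns)}" if node.returns else ""
            lines.append(f"def {node.name}({ast.unparse(node.args)}){returns}")
        elif isinstance(node, ast.ClassDef) and not node.name.startswith("_"):
            lines.append(f"class {node.name}")
    return lines


# Hypothetical module source used only for this example.
EXAMPLE = '''
def area(width: float, height: float) -> float:
    return width * height

def _helper(x):
    return x

class Shape:
    pass
'''

print("\n".join(public_interface(EXAMPLE)))
```

This only looks at top-level definitions and needs Python 3.9 or later for `ast.unparse`; a real tool would also have to decide what to do with methods, constants, and re-exports.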

> And it allows you to have both interface docstrings and implementation docstrings, which a documentation generator could use to compile an API guide for clients and a developer's guide for people working on the internals.

You could call these "docstring" and "comment".

occoder · on Aug 29, 2021

>> Having a terse little file where I can scan the interface of a module, rather than having to scroll through the implementation and see which declarations have `public` in front of them, is a great way to quickly refresh your mental model of a module's API.

> That information is trivial to extract, there is no reason to force the developer to maintain it and keep it in sync.

That's true, in theory. In reality though, when is that information extracted? What do you use to extract it? Where is it saved once extracted? How easy is it to review it? Do you need to use an IDE to do that?

Any less than satisfactory answer to these questions will make this worse in practice than having the developer maintain it.

bogeholm · on Aug 29, 2021

> That's true, in theory. In reality though, when is that information extracted? What do you use to extract it? Where is it saved once extracted? How easy is it to review it? Do you need to use an IDE to do that?

You could use your build system to extract it to a website (e.g. https://docs.rs/)

gavinray · on Aug 29, 2021

> That's true, in theory. In reality though, when is that information extracted? What do you use to extract it? Where is it saved once extracted? How easy is it to review it? Do you need to use an IDE to do that?

This is part of the GNU Ada toolchain.

I don't write Ada, but I have looked into it. I strongly dislike having to write an entire separate type/interface file that repeats the type definitions from the implementation.

This information exists -- a tool should be able to extract the signatures and spit out the interface file automatically (i.e. a header for C/C++)!

In Ada, this tool is called "gnatchop"

https://learn.adacore.com/courses/GNAT_Toolchain_Intro/chapt...

Instead of maintaining separate ".ads" and ".adb" files (like ".h" and ".c"), you write a single ".ada" file containing both units and feed it to "gnatchop"; it splits that file into the two files for you and you're ready to compile.

  gnatchop example.ada # (example.adb + example.ads created)
  gprbuild p_main # (builds from example.adb)

Another neat thing Ada can do is interop with C++! It has C interop, but can ALSO support C++.

https://gcc.gnu.org/onlinedocs/gnat_ugn/Interfacing-with-C_0...

https://docs.adacore.com/gnat_rm-docs/html/gnat_rm/gnat_rm/i...

It can take C/C++ headers and auto-generate the interop code you need as well

  $ g++ -c -fdump-ada-spec -C /usr/include/time.h
  $ gcc -c *.ads

https://gcc.gnu.org/onlinedocs/gnat_ugn/Running-the-Binding-...

https://gcc.gnu.org/onlinedocs/gnat_ugn/Generating-Bindings-...

foerbert · on Aug 29, 2021

I can understand your hesitancy, but I think in practice I strongly disagree.

A big part of why I like Ada so much is the fact it lets me hold such a strong mental model of the program. I can specify quite a bit about how it should all work, and the compiler holds me to it.

Most or all of that sort of information is tied up in the .ads file. If I want to refer to the model, I can check the .ads file, even if my project doesn't compile yet. Everything I need to know is there, from the very first line of code.

Most importantly, if I'm working in the .ads file, I'm changing the model. Changes here are Important. If I unknowingly make a change here, I've lost my understanding of the model. I really don't want that to be possible.

Meanwhile the .adb is more the implementation. If I'm changing the .adb, I'm just altering the details, but the overall model stays the same. Maybe what I'm doing in the .adb tells me I really do need to change the .ads because the model has a problem, but that doesn't mean I should just go make the easiest little change to the model that makes the .adb work.

Frankly, I think that extra little bit of friction in having two files that need to be in sync makes it easier to write better programs. Something as huge as changing the model should have something that helps cue me in that I'm doing something Big.

turbohz · on Aug 29, 2021

The compiler should be able to provide it

Jtsummers · on Aug 29, 2021

You don't want to extract it. The intention is that the specification (the thing you code against, you do write those down, right?) is separate from implementation so that you can provide multiple implementations.

If you rely on extracting the interface from the implementation then you have to have another mechanism to compare two implementations to see if they provide the same interface. That's kind of an insane way to do things from the Ada perspective. You've made things harder for yourself and less certain for the users of that interface.
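To make "another mechanism to compare two implementations" concrete, here is a minimal sketch in Python that compares the public method signatures of two classes with the standard `inspect` module. The `interface_of` helper and the three stack classes are hypothetical, and treating underscore-prefixed names as private is an assumption of the sketch.

```python
import inspect


def interface_of(cls) -> dict[str, str]:
    """Map each public method name of `cls` to its signature as text."""
    return {
        name: str(inspect.signature(member))
        for name, member in inspect.getmembers(cls, inspect.isfunction)
        if not name.startswith("_")  # assumption: underscore means private
    }


class ListStack:
    def __init__(self):
        self._items = []

    def push(self, item: int) -> None:
        self._items.append(item)

    def pop(self) -> int:
        return self._items.pop()


class LinkedStack:
    def __init__(self):
        self._head = None

    def push(self, item: int) -> None:
        self._head = (item, self._head)

    def pop(self) -> int:
        item, self._head = self._head
        return item


class DriftedStack:
    def push(self, item: int, priority: int) -> None:  # signature has drifted
        pass

    def pop(self) -> int:
        return 0


print(interface_of(ListStack) == interface_of(LinkedStack))   # same interface
print(interface_of(ListStack) == interface_of(DriftedStack))  # push differs
```

Nothing forces anyone to run a check like this, which is the point being made above: with a separate specification the comparison is the compiler's job rather than an extra step.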

Put the public bits into a package specification file so that anyone can know as the user or the implementer what is expected. Swap out implementations as needed and have high confidence that (short of logic errors in the implementation) it will at least provide the same interface because, well, it wouldn't compile otherwise.
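For readers who don't know Ada, a loose analogy in Python is an abstract base class playing the role of the specification, with interchangeable implementations behind it. This is only a sketch: the `Queue`, `ListQueue`, `DequeQueue`, and `drain` names are hypothetical, and Python checks this at instantiation time and by method name only, which is much weaker than Ada rejecting a mismatched body at compile time.

```python
from abc import ABC, abstractmethod
from collections import deque


class Queue(ABC):
    """Plays the role of the specification: what clients may rely on."""

    @abstractmethod
    def enqueue(self, item: int) -> None: ...

    @abstractmethod
    def dequeue(self) -> int: ...


class ListQueue(Queue):
    def __init__(self):
        self._items = []

    def enqueue(self, item: int) -> None:
        self._items.append(item)

    def dequeue(self) -> int:
        return self._items.pop(0)


class DequeQueue(Queue):
    def __init__(self):
        self._items = deque()

    def enqueue(self, item: int) -> None:
        self._items.append(item)

    def dequeue(self) -> int:
        return self._items.popleft()


class Incomplete(Queue):
    """Forgets dequeue, so it cannot be instantiated."""

    def enqueue(self, item: int) -> None:
        pass


def drain(queue: Queue, items: list[int]) -> list[int]:
    """Client code written only against the specification."""
    for item in items:
        queue.enqueue(item)
    return [queue.dequeue() for _ in items]


# Either implementation can be swapped in without touching the client.
print(drain(ListQueue(), [1, 2, 3]) == drain(DequeQueue(), [1, 2, 3]))
```

Trying `Incomplete()` raises `TypeError`, the closest Python gets to "it wouldn't compile otherwise".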

Also, the specification files are a bit like C or C++ headers. You can write a program predicated on their correctness without actually needing an implementation to verify against. The *.ads files tell you "These functions, procedures, and types exist. I promise, so you can go about your business even though an implementation may not be available yet."

nerdponx · on Aug 29, 2021

I have never used Ada. But I really like the idea of having implementation documentation separate from interface documentation, and treating it like real documentation with its own searchable/linkable HTML or PDF reference document.

A good IDE can then fold/collapse the in-code documentation, and the programmer can have it up in a separate window along side.

This could be an interesting model for literate programming. Instead of it being "linear", like reading a novel, it would be like reading a translation of an ancient text, with the original source material on one side of the page and both the translation and detailed reference notes on the other side. And of course there would be a hypertext component to the documentation, which would allow you to build a "table of contents" and jump around the codebase.

masklinn · on Aug 29, 2021

> This could be an interesting model for literate programming. Instead of it being "linear", like reading a novel

It's unclear what you mean by "linear" here. Surely your "translation of an ancient text" is a linear read following the "ancient text" it translates even if it has forwards and backwards references?

Knuth's original conception of literate programming was non-linear in terms of code: you'd write some text, write some bits of code, possibly add a reference to another snippet, write some more text, write some more bits of code, and tangle then stitches the source back together by following references.
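The stitching step can be sketched in a few lines of Python. This is not Knuth's tangle, only an illustration of the idea: the `<<name>>` reference syntax is borrowed from noweb as an assumption, the `tangle` function and the `chunks` dictionary are hypothetical, and there is no protection against chunks that reference each other in a cycle.

```python
import re

# A line consisting only of an (optionally indented) chunk reference.
CHUNK_REF = re.compile(r"^(\s*)<<(.+?)>>\s*$")


def tangle(chunks: dict[str, str], root: str) -> str:
    """Expand chunk references recursively, starting from `root`."""
    out = []
    for line in chunks[root].splitlines():
        match = CHUNK_REF.match(line)
        if match:
            indent, name = match.groups()
            # Splice in the referenced chunk, keeping the reference's indentation.
            out.extend(indent + sub for sub in tangle(chunks, name).splitlines())
        else:
            out.append(line)
    return "\n".join(out)


# Chunks in the order the prose introduced them, not the order the code runs.
chunks = {
    "divide by the count": "return total / len(values)",
    "sum the values": "total = sum(values)",
    "main": "def mean(values):\n    <<sum the values>>\n    <<divide by the count>>",
}

program = tangle(chunks, "main")
print(program)
```

The chunks can be defined in whatever order suits the explanation; only the references decide where each one ends up in the generated source.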

More "modern" literate programming is non-linear in terms of narrative, making the "comments" / "docstrings" the main content but then having the code execute "normally" ignoring said comments.

Jeremy Ashkenas's tools (e.g. underscore, backbone, …) are all written and published in that style even though Javascript is hardly conducive to it, and shown in exactly the layout you seem to be talking about, with the original source material on one side and the translation and detailed notes on the other. That is what Ashkenas called "annotated source": https://backbonejs.org/docs/backbone.html, https://underscorejs.org/docs/underscore-esm.html.
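The core of that side-by-side layout is pairing each run of comments with the code that follows it. Here is a minimal sketch of that pairing in Python, assuming `#` line comments; it is not how Ashkenas's tooling is actually implemented, and the `sections` function and `SOURCE` text are hypothetical.

```python
def sections(source: str) -> list[tuple[str, str]]:
    """Pair each run of `#` comment lines with the code that follows it."""
    result = []
    docs, code = [], []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            if code:
                # A comment after code closes the current section.
                result.append(("\n".join(docs), "\n".join(code)))
                docs, code = [], []
            docs.append(stripped.lstrip("#").strip())
        elif stripped:
            code.append(line)  # blank lines are dropped for brevity
    if docs or code:
        result.append(("\n".join(docs), "\n".join(code)))
    return result


# Hypothetical source file used only for this example.
SOURCE = """\
# Add two numbers.
def add(a, b):
    return a + b

# Negate a number.
def neg(a):
    return -a
"""

for doc, code in sections(SOURCE):
    print(f"{doc!r:25} | {code!r}")
```

Each pair would become one row of the page, with the comment rendered as prose on one side and the code on the other. The sketch is naive about `#` characters at the start of a line inside multi-line strings.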

Recent revisions of underscore have been modularised and show individual segments you can look between instead: https://underscorejs.org/docs/modules/index-all.html maybe that's what you're thinking of when you talk about it being non-linear?

It's missing some of the bits e.g. the symbols themselves are not hyperlinked and there is no glossary, but because in the modularized version each function is the sole export of its module it's easy to jump between functions. Not that I'm convinced this makes for a great experience as it requires keeping a lot in memory, but there you go.

nerdponx · on Aug 29, 2021

I meant "linear" as in you start at point A and read until point B. That is, pieces of information are presented and organized as a sequence of one item after another. I am envisioning a system where the programmer has code in one window and the explanation of the code in another. Like a book with text on the left and annotations on the right.

masklinn · on Aug 29, 2021

> I meant "linear" as in you start at point A and read until point B. That is, pieces of information are presented and organized as a sequence of one item after another.

That is the definition of the word, it’s not actually helpful in understanding what you’re thinking about.

> I am envisioning a system where the programmer has code in one window and the explanation of the code in another. Like a book with text on the left and annotations on the right.

So… literally what I posted.

MaxBarraclough · on Aug 29, 2021

> Having a terse little file where I can scan the interface of a module, rather than having to scroll through the implementation and see which declarations have `public` in front of them, is a great way to quickly refresh your mental model of a module's API.

There's also another structured approach to the interface/implementation distinction: leave it up to the IDE to offer an interface explorer. This is the approach used by, say, Java.

I'm not sure that either approach is outright better than the other; it's a trade-off.

zetalyrae · on Aug 29, 2021

I tend to prefer things being in the code itself as opposed to being added dynamically by the IDE. For example, I think type annotations should be in the code rather than shown by the IDE.

This is because it lets me read code in extra-IDE settings: browsing GitHub, or in a patch file, or in a book. Or I can write code on paper.

Another benefit is that you can design a program entirely by writing the module interface files, and typechecking them against each other without an actual module body file.

Then, as you start actually implementing the program, you can implement one module at a time, typechecking it against the module interfaces of its dependencies, without said dependencies having any actual code in them. So you can write the actual implementation in whatever order makes sense.

badsectoracula · on Aug 29, 2021

Free Pascal has units (aka modules) with the interface and implementation in the same file but in separate sections, e.g.

    unit Foo;
    interface
    
    type Weekday = (Mon, Tue, Wed, Thu, Fri);
    
    procedure DoThisAt(Day: Weekday);
    
    implementation
    
    procedure DoThisAt(Day: Weekday);
    begin
      // stuff
    end;
    
    end.

This helps keep things together and up to date (the Lazarus IDE can automatically sync the implementation section with the interface section, no need to type stuff twice manually) and you can still scan the interface section to see its API without bothering the implementation section (but it is still just a scroll away if you want).

(FWIW this is an old feature taken from Turbo Pascal which itself took it from UCSD Pascal)

joppy · on Aug 29, 2021

In C++ you can’t really even separate them if you want to define templates, because (unless I am mistaken) template instantiation can only be done at compile time rather than link time. It’s sad to not be able to cleanly separate the interface from the implementation.

zetalyrae · on Aug 29, 2021

Yes, I think the module interface file should be for the user rather than the compiler.

The compiler would parse both the module interface and module body files, merge them and check for consistency, and produce both an object code file for the code in that module and a binary module interface that contains an efficiently serialized form of the interface, the bodies of generic functions, and maybe the table of monomorphic instances for separate compilation.

Then the build system makes sure to import the relevant binary interface files when building a project.

pjmlp · on Aug 29, 2021

Just like C++20 modules allow.

You only need to export the public parts of the templates.

bboreham · on Aug 29, 2021

If my memory isn’t failing me, the Sun C++ compiler I used in 1994 did template expansion and compilation at link-time. However it was rather annoying in use, having to wait a long time to get errors arising from instantiation.

pjmlp · on Aug 29, 2021

You can now when using C++20 modules.

ojeda · on Aug 29, 2021

"Interface files" that are intended as a form of documentation should be generated automatically instead.

As for interface vs. implementation docs, nothing precludes having both in a single file either.

OneWingedShark · on Sept 1, 2021

> Having separate module interface and module body files, as in Ada or Modula-2/3, is a great idea that sadly a lot of people are burnt out on because C and C++ do this in a very unprincipled way.

> Having a terse little file where I can scan the interface of a module, rather than having to scroll through the implementation and see which declarations have `public` in front of them, is a great way to quickly refresh your mental model of a module's API.

Then there's the other way of doing it: imagine a language with a database/browser (e.g. Smalltalk), where the implementation is just linked and can be accessed in the same window or in another one.

This sort of system could also have documentation-comments attached to the interface (e.g. for usage), and the implementation (e.g. for maintenance logging, rationale, etc).

mojuba · on Aug 29, 2021

Coming from Pascal/Delphi background I too find the structured separation useful (unlike the unstructured one in C/C++), though obviously it has a cost of typing the declarations twice.

Then for more modern languages where there's no separation, some IDEs can auto-generate the interface declarations along with the associated comments. E.g. Xcode does it for Swift and it's kind of OK.

badsectoracula · on Aug 29, 2021

> though obviously it has a cost of typing the declarations twice

In Lazarus you can have the IDE do the syncing for you (Ctrl+Shift+C). Modern Delphi might have something similar (if not exactly the same thing).

nerdponx · on Aug 29, 2021

Doesn't OCaml have something like this too?

kqr · on Aug 29, 2021

F# does, so I assume OCaml does too.

"Wait F# does?"

Yeah, but you won't have seen it because almost nobody uses it to the point where some tooling even has trouble understanding what it is. :(

yawaramin · on Aug 29, 2021

Indeed, and that was inspired by Modula-2 actually: https://dev.to/yawaramin/ocaml-interface-files-hero-or-menac...