Are modules without imports “considered harmful”?

jraph · on Feb 8, 2022

It seems to me one of the worst problems explicit imports help solve is forward compatibility of code with respect to the modules and the standard library. And this problem seems worth solving.

If you require user code to explicitly import functions from modules, you can safely add a function to a module or to the standard library without risking to break someone's code.

Importing the module is not sufficient: really, the functions (or any other imported objects) themselves need to be named. (or, the function needs to be qualified with the name of the module; actually, Python seems to do a reasonable job at this)

As a text editor user, it also helps me discover where such or such function is defined. The alternatives are:

- using an IDE that will consume a lot of CPU cycles and memory to resolve the functions

- grepping, with the risk of finding another function with the same name so I would also need to check if the type signature matches.

I'll gladly spend the time to write my import statement to solve all these issues. It does not bother me. It's not where I spend my time when I program.

It's okay if IDEs hide those import statements or produce them automatically (and it does not mean it's the wrong approach). The point is: code should not break (compilation or runtime) if a module or the standard library evolves and add stuff that could have otherwise clashed. The work being done automatically and the result being hidden by default does not mean it can be done the same way in the future and therefore should be avoided; that's because the environment can change, so it has to be done at the time the code is written, manually or automatically.

phoe-krk · on Feb 8, 2022

Common Lisp hacker here, with a real-world case of this:

> It seems to me one of the worst problems explicit imports help solve is forward compatibility of code with respect to the modules and the standard library. And this problem seems worth solving.

This is exactly the problem in a lot of legacy Common Lisp code in which packages :USE other packages. :USE means that all symbols from package A become accessible in package B.

It also means that a widely :USEd package cannot add a new symbol without possibly breaking a lot of code that depends on it. In the optimistic case, nothing will happen; in the realistic case, you will get package conflicts if the symbol that is newly exported from A is also exported by another package C that is :USEd; in the worst case, the :USEing package successfully grabs that symbol overwrites its meaning with something else, which means that the functions or variables or classes or whatever else named by that symbol suddenly become something else and everyone else that :USEs package A suddenly needs to deal with an overwritten definition, and chaos ensues.

One CL solution is to never :USE other people's packages and explicitly import single symbols; another solution is to use local nicknames which means that it's possible to "import org.random.foo as f" and refer to symbols via F:BAR.

kazinator · on Feb 9, 2022

> USE means that all symbols from package A become accessible in package B.

All exported symbols from package A become accessible in package B.

Also, symbols that package A itself uses from another package are not visible by using package A.

You can create an intermediate package to isolate yourself:

   (defpackage foo-2019
     (:use foo)
     (:export this that other thing))

Now you can ":use foo-2019" and get only those four symbols from foo. The :export actually imports those symbols into foo-2019, and then marks them for export. The :use foo makes all of foo's exported symbols symbols visible in foo-2019, and this allows the :export clause to refer to them.

phoe-krk · on Feb 9, 2022

Right, sorry - should have written, "all symbols exported from package A become accessilble in package B".

alaaalawi · on Feb 9, 2022

Also check Oberon the language and its module system. and was used to create Oberon the system. all procedures/functions are called like this: M.F (Module name - it can be abbreviated but not removed, and F procedure/functions.)

thaumasiotes · on Feb 9, 2022

> Importing the module is not sufficient: really, the functions (or any other imported objects) themselves need to be named. (or, the function needs to be qualified with the name of the module; actually, Python seems to do a reasonable job at this)

Well, in Python you can do it well or poorly at your own discretion.

Good:

    from module import fun
    ...
    fun()

    import module as m
    ...
    m.fun()

Bad:

    from module import *
    ...
    fun()

jraph · on Feb 9, 2022

Indeed, import * should be avoided.

travisd · on Feb 8, 2022

Interestingly, I suspect Facebook employs the largest auto-loaded codebase in the world and I suspect this is made easier by the fact that everything is controlled under one roof. There is some namespacing (à la PHP namespaces) for standard library stuff, but most naming just takes the form <TeamName><NormalClassName> to avoid any conflicts.

em-bee · on Feb 9, 2022

as i wrote in my other comment here: https://news.ycombinator.com/item?id=30268565

if modules are always addressed by their full path then adding new functions to a module is not a problem.

jraph · on Feb 9, 2022

Sure. However, this makes for verbose and wide code, which is not necessarily desirable. It may (arguably) reduce readability. Your brain has to parse and check that each full path is the same, which increase the effort you need to make to properly read the code.

I guess it is a matter of taste.

em-bee · on Feb 9, 2022

well, i guess i haven't seen much code where modules were used so frequently that the verbosity became an issue.

in the few cases where you have one or two verbose function calls that are called many times you can always assign them to some short variable and solve the problem that way.

i don't see the path checking issue. module paths form an easily recognizable pattern that doesn't increase my effort to read the code. you'd have to have some very similar named modules for that to be a problem.

mjw1007 · on Feb 8, 2022

It's worth thinking about.

Nowadays many people are letting their IDE do 90% of import statement maintenance, and then folding the lines out of the way so they don't have to bother reading them.

That suggests to me that having explicit imports at the top of every file isn't the best representation of the information they contain.

As far as naming is concerned, maybe it would be better to share a description of the project's naming rules in some larger scope, and let individual files override that.

In languages where imports are doing work for dependency resolution as well as naming, maybe it would be more readable to have this information in one place than scattered around many files.

armchairhacker · on Feb 8, 2022

I honestly think having the IDE resolve imports and then fold them is a good solution.

The users don’t need to know the specific imports 90% of the time, but it really helps the compiler. So it’s a sort of metadata for the code, something we don’t really have. Usually, the same code is seen by both developer and compiler.

(It may also help incoming developers, but personally i think it’s negligible. Now they just have module+identifier instead of identifier. The real help comes from “view symbol documentation” “jump to source” which is there regardless).

fiddlerwoaroof · on Feb 8, 2022

I find passing dependencies via constructor arguments make codebases a lot easier to navigate: it encourages a pattern where you have a single file that wires up the entire system and you can look at that file to get a sense of which modules are talking to which other modules.

Gilad Bracha had some interesting posts that talk about this pattern: https://gbracha.blogspot.com/2009/06/ban-on-imports.html?m=1

colanderman · on Feb 8, 2022

I find that pattern makes navigation harder, particularly in untyped languages: because the particular implementation is now no longer statically resolvable, now I have to walk up a call tree to find which implementation (or even, which type, in an untyped language) corresponds to each of those constructor arguments, rather than just looking at the top of the file.

fiddlerwoaroof · on Feb 8, 2022

I think this indicates bad modularity in the codebase and/or badly specified interface contracts. I’ve found that this makes finding bugs easier because I can start in the wiring file and then drill down to the only code that handles stuff relevant to the bug: by giving one place that lays out the high-level structure of a codebase, that place becomes like the “table of contents” for the rest of the code.

This particular benefit is almost completely destroyed by DI frameworks that use annotations and other magic to automatically wire together a codebase. So, I just don’t use those.

EdwardDiego · on Feb 8, 2022

Constructor args are the simplest form of dependency injection :)

And when I'm using a DI framework, the only injection point I'm happy with.

But, having worked on a large codebase that used constructors with no DI framework, it makes adding a new dependency used in multiple bottom nodes of the graph very painful.

wonnage · on Feb 8, 2022

It’s super painful to figure out where this magically injected dependency came from. Now you have to look at every place that the object was constructed. And god forbid those constructors also use injected dependencies.

I hate Guice.

fiddlerwoaroof · on Feb 8, 2022

The solution here is not to use something like Guice and to only call the constructor once.

kerblang · on Feb 8, 2022

The folding would be mostly unnecessary if we wildcarded (e.g. `import foo.bar.*`) instead of explicitly spelling every single thing out, but many dev teams forbid that... and then want the IDE to compensate for it.

My point actually being that if folks won't accept wildcards to begin with, I can't imagine importlessness gaining any traction, although perhaps we'll bifurcate as happened with the extremes of checked-exceptions-everywhere vs. never-exceptions.

8note · on Feb 8, 2022

Import problems tend to get caught at code review time

> are you sure you want to use com.do.not.use.this.is.a.copy.collection ?

chenglou · on Feb 8, 2022

I help maintain ReScript (https://rescript-lang.org) and we've been rolling without an import statement for years now (basically OCaml's module system). The default is to just write `MyModule.doThis` at callsites. Sometime you do wildcard open (`open MyModule`) within the right scope. Sometime you do it at the top level for convenience (e.g. stdlib), but folks try to be judicious. And yes, you can alias a module path, e.g. `module Student = School.Class.Student`. Worth noting: the reason why fully qualified `MyModule.doThis` can be the default, is that we _usually_ dictate filenames to be unique within a project, so there's no verbose path issue nor refactoring issues (and yes this works fine... Facebook does it too).

Static analysis-wise (which is what most of the blog post's about), things are basically the same. The tradeoffs are mostly just in terms of readability. I used to be ambivalent about this myself, but looking at the growing body of GitHub TypeScript code with mountains of imports automated by VSCode, imo we've landed on a decent enough design space.

mutatio · on Feb 8, 2022

You can do this in Rust, the imports in Rust being synonymous with your aliasing of a module path.

I guess it depends on the depth of your paths, but I'm unsure what it solves, it just moves the verbosity to call sites.

codeptualize · on Feb 9, 2022

Not familiar with Rust, but I really like the ReScript (OCaml) module system.

It just gets out of the way. Verbosity is easy to prevent with "open" statements. The nice thing about those is that you can put them in a local scope where you use them (which is generally recommended).

It's just a lot more flexible and a lot less hassle.

_0w8t · on Feb 8, 2022

Just using modules names as prefixes everywhere instead of importing individual symbols works when modules have short names. Which in turn requires that the modules should be relatively fat so relatively few of those should exist allowing for short and descriptive names.

But in a language like JS where the module cannot span multiple files one quickly runs out of good short names. Then importing individual symbols or at least aliasing the module becomes a necessity.

em-bee · on Feb 9, 2022

1. ambiguities

what is wrong with using the full module name? i find it makes the code more readable.

2. bad search & code completion

when i don't know what function i need, then i need to search the whole library anyways. if i address modules by their full path then code completion will only need to complete on the top level namespace (the list of modules) and then drill down. a clever IDE can track what i already used and offer that first.

3. compiling more than necessary

the compiler only needs to track what functions are actually being called and only compile those or the modules they are in.

4. dependencies are not obvious

an ide can track which modules are used if i use the full path to address them.

without an IDE i'll have to read the code. ok. the long module names stand out, but granted, grepping for import is easier.

the pike programming language works without import. modules are always called by their full path, so the parser and compiler can find and resolve them and compile those that are needed.

pike does have an import statement. it loads all the symbols in a module into the local namespace. it is the equivalent to pythons "from foo import *"

i am not sure about this but my understanding is that otherwise only the symbols that are actually used are being loaded. import is a mere convenience for the coder but potentially causes more memory to be used and may make local namespace searches slower so it's use is not recommended. (not that this would be significant unless you import hundreds of modules)

WalterBright · on Feb 8, 2022

I don't miss #include. Not for a second. Not even a fleeting thought.

I even added imports and modules to D's embedded ImportC C compiler.

skocznymroczny · on Feb 9, 2022

D still has imports. Even if you want to use something 'common' like writefln you need to drop import std.stdio; somewhere. Luckily D allows to put imports inline so it's not that bad.

withinboredom · on Feb 8, 2022

PHP via composer has autoloading that "just works." I've never had any issues figuring out which thing is being used.

ehnto · on Feb 8, 2022

Namespaces and PSR are the reason I suspect. For anyone not aware, the PHP community has a set of standards that outline some behaviours around modularity that aren't stricly language structures, just best practices. The defactor package manager, composer, also follows these standards, and life is sweet.

Namespaces is pretty much what this article ends up landing on also, if I understand. There is no material difference between `\App\MyClass::myFunc()` or putting at the top of your file, `use App\MyClass` then latter, `myFunc()`

Autoloading: https://www.php-fig.org/psr/psr-4/

notpachet · on Feb 8, 2022

Etsy has (or had when I was still working there) a cool directory-namespace based custom autoloader for PHP (named, not inexplicably, EtsyLoader). When the PHP runtime detected that the script was trying to use an identifier not in scope, the loader would try to find a matching file by splitting the identifier on underscores into different folders, with the last segment being the target file. So Listings_Migration_Checker would get resolved to listings/migration/checker.php.

Way more deterministic and predictable than Rails autoloading, which always seems to devolve into a crazy mess. And you knew exactly where a constant was located on disk just by eyeballing the name.

ehnto · on Feb 9, 2022

An eCommerce framework called Magento did a similar thing, it made for some fantastically long class names, but I was never left wondering where something lived and autocomplete was never confused. They did this with underscores because namespaces didn't exist back then, but namespace based autoloading is essentially the same thing. PHP passes the autoloader function the full namespaced string `\Some\Namespaced\Class` and you do your directory traversal based on that inside the autoload function.

PHP makes it pretty easy to have multiple custom autoloaders too, which I've used to help bring legacy projects into modern times before. Even inside older frameworks with no awareness of classes, you can register an autoloader and start using the classes and namespaces in your module/plugin. When PHP runtime detects an unknown class, it will go through every autoloader function registered with spl_autoload_register.

metadat · on Feb 8, 2022

This sounds really cool!

Unfortunate and a bit bummed out that nothing relevant currently turns up in google:

https://www.google.com/search?q=EtsyLoader+php+github

svachalek · on Feb 8, 2022

I think there are plenty of real-world cases that can be studied here: Ruby comes to mind, and the early days of JavaScript. Even though most of it isn't stateful, this has a lot of the same issues as global variables. Personal answer here is a solid "yes". I don't think any of the ideas in the article help much.

bradrn · on Feb 8, 2022

I know of at least one language which actually uses this approach in its module system; namely, Io [http://iolanguage.com/]. In Io, if you refer to an undefined name, all Io files in the current directory are searched for that name. The file containing that name is then automatically imported. I’ve never used Io so I don’t know how it works out in practice, but I suspect that the small community helps, in that backwards compatibility becomes less important.

tehjoker · on Feb 8, 2022

This seems crazy to me that anyone would want to use wildcards in anything but a one off script. The lack of discoverability and potential for crazy-making clashes is omnipresent.

codeptualize · on Feb 8, 2022

This sounds like the OCaml module system.

co_dh · on Feb 9, 2022

The unix command is a module system without import, and is very convenient.

It is only possible when 1) you have very simple data structure. line of text in this case. 2) each command has a lot of options, thus reduces the number of commands.

madsbuch · on Feb 8, 2022

I got a bit confused here. It appears the author talks about explicit imports vs. implicit imports.

Ofcause it would be alright to not import anything in a module meaning that it would have no external dependencies.

The_rationalist · on Feb 8, 2022

In Kotlin top level declarations can be accessed seamlessly without imports. This is an agreable trade-off.

barchar · on Feb 8, 2022

reminds me of the venerable "makeheaders" tool, which works very well, however the tooling issues are very real.