> Stop trying to justify your refactoring with the "public but internal" argument. If the language spec says it's public, it's public. Your intentions have nothing to do with it.
This is so wrong. APIs are for people, not tools, so intent is primary. When tools are not expressive enough to capture and enforce intent, you document it, but it's still primary. Someone using a "public" API that clearly says "for internal use only" is no different from someone who uses workarounds like reflection or direct memory access, and there is no obligation to keep things working for them.
> there is no obligation to keep things working for them.
You opened with the correct observation that APIs are mostly for people. Saying there is no obligation here contradicts the expected social norms. And even more importantly, intent does not tightly correspond with reality, and what can happen tends to happen. The actual code actually existing always has the final say. If you intend to have the best outcome for everyone involved, conform to the unalterable realities as much as possible - if the interface should be public, make it public. If the interface should be private, make it private.
I specifically said "when tools are not expressive enough to capture and enforce intent".
Suppose you're writing a library in Python. Everything in it is public. Even the dunder class members are, because it's just name mangling, and the language spec even documents what exactly it does!
Now, is anyone going to seriously claim that every single identifier in every Python library is part of its public API, and any change that affects it is a breaking change? Because that's certainly not the "expected social norm".
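To make that concrete, here is a minimal sketch (class and attribute names made up) of what the documented mangling actually gives you:

    class Widget:
        def __init__(self):
            self.__secret = 42      # stored as _Widget__secret by name mangling

    w = Widget()
    # w.__secret would raise AttributeError here (mangling only happens inside
    # the class body), but the mangled name is right there for anyone to use:
    print(w._Widget__secret)        # 42
    w._Widget__secret = 43          # and to mutate

Nothing about that needs reflection or any other workaround; it's ordinary attribute access.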
Granted, Python is a somewhat extreme example. But in practice, this also comes up in languages like Java and C#, when dependencies are more intricate than what the access control system in those languages can fully express.
And then there are backdoors:
> What can happen tends to happen. The actual code actually existing always has the final say.
You can use reflection to access any private field of any object in Java. There's actual existing code doing that in practice, too. Does it have the final say, and does it mean that the internal representation of every Java class in every shipped Java library has to be immutable, so as to not break API clients?
The language I'm using doesn't let me express the public/private divide I wish to make correctly (e.g. "private" implementation functions for a public C macro.)
The API is 100% intended for internal use only, but someone insists on ignoring that and consuming the private API anyway. Instead of forcing them to write their own headers, which silently break at runtime when function signatures change under calling conventions that don't check those signatures, I allow them to include headers with a few keywords like "private", "internal", "do_not_use", or "i_am_voiding_my_semver_warranty" in the path, perhaps only after they set some similarly scary #defines, so it's at least a build failure.
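The same opt-in gate can be sketched outside of C as well; here is a rough, hypothetical Python analogue (module and flag names invented), where the scary #define becomes an environment flag that turns silent misuse into an immediate import failure:

    # mylib/_private.py : hypothetical internal module guarded by an opt-in flag.
    import os

    if os.environ.get("MYLIB_I_AM_VOIDING_MY_SEMVER_WARRANTY") != "1":
        raise ImportError(
            "mylib._private is internal; set MYLIB_I_AM_VOIDING_MY_SEMVER_WARRANTY=1 "
            "to use it at your own risk"
        )

    def unstable_helper(x):
        # Implementation detail: signature and behaviour may change in any release.
        return x * 2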
C++ and C# have the same kind of problem: except for the iffy friend declaration in C++, there is no way in the language to denote that some method is not meant for use in other modules. C# has the internal scope for each assembly, but this breaks in combination with unit tests placed in separate testing assemblies.
Generally, proper unit testing is at odds with strict scope restrictions in the tested code. I guess we need more allowances for unit testing at the language level to fix that. E.g. allow testing code to be marked as such and let it ignore that certain things are declared private, but in turn only allow it to run in a testing context, not in regular builds, to prevent abuse.
This only covers a small part of the problem. Things that should be private and require separate testing are still forced to be more visible than they are supposed to be.
This simply cannot work in many cases. It is quite unrealistic to completely test complex logic that is hidden behind a narrow interface. You are hit with the full combinatorial complexity of what is behind that interface, even if it might consist of independent parts internally. If you can test these parts independently, the number of required tests is a fraction of what a black-box approach requires.
Another situation is checking numerical code for correctness and accuracy. There it is extremely advantageous to have testable small functions that map to individual mathematical expressions. But these are again implementation details that need to be hidden behind interfaces.
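As a made-up illustration of that kind of split: each internal helper maps to one small expression and can be pinned against a reference value directly, which is much harder to do through the combined public routine alone:

    import math

    # Internal helpers (implementation details, not part of the public API),
    # each mapping to one small mathematical expression.
    def _sinc(x):
        return 1.0 if x == 0.0 else math.sin(x) / x

    def _gaussian(x, sigma):
        return math.exp(-0.5 * (x / sigma) ** 2)

    # Public entry point built from the helpers.
    def windowed_sample(x, sigma):
        return _sinc(x) * _gaussian(x, sigma)

    # Tests can check each helper's accuracy independently ...
    assert abs(_sinc(1e-8) - 1.0) < 1e-12
    assert abs(_gaussian(0.0, 2.0) - 1.0) < 1e-12
    # ... instead of trying to infer both behaviours through the combined function.
    assert abs(windowed_sample(0.0, 2.0) - 1.0) < 1e-12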
Your first statement is exactly what I've arrived at. It's just not avoidable in general.
I have to clarify that I'm not fixated on C#. Sure, you could create a helper assembly in .NET that is a mess of essentially disembodied functions for computing every slightly more complex expression that happens to be in your program. But this breaks OOD.
In C/C++ you can't do quite the same. The best you could do there is break OOD and try to hide these global functions by using private headers (which are ugly in their own ways).
> C# has the internal scope for each assembly, but this breaks in combination with unit tests placed in separate testing assemblies.
This is a limitation imposed by the IDE, not the language. There's nothing stopping you from compiling the code and the tests into the same dll that you unit test. Likewise there is no need to separate the code from the tests (apart from putting them in different files); they can simply be excluded from release builds.
Source layout structure does not have to be a 1 to 1 mapping of the output structure.
Not keeping tests separate from the tested code can lead to chaos in the long term. It provides an incentive to blur the lines in inappropriate ways, e.g. by adding helper code for tests to the code under test etc.
In my experience the more things are public the better. Very often a quick workaround turns into a monster bodge because some method is marked strict private instead of protected or public.
So I usually make most stuff public as such, but put internals in a namespace/scope that makes it clear that these are implementation details. Relying on implementation details always carries the risk of breaking when upgrading.
This allows for a lot of flexibility when needed, while also not polluting the "truly public" API.
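For what it's worth, the Python flavour of that pattern usually just leans on naming; everything below is a made-up sketch of the idea:

    # mylib.py : hypothetical sketch. Leading underscores mark the
    # implementation-detail "namespace"; everything is technically reachable,
    # but the boundary is obvious at every call site.

    def _parse_header(raw):
        # Implementation detail: format may change between releases.
        return dict(item.split("=", 1) for item in raw.split(";") if item)

    def load(raw):
        """Part of the truly public API; kept stable across releases."""
        return _parse_header(raw)

    # Consumers who need the escape hatch can still call mylib._parse_header,
    # accepting that it may break on upgrade.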
Making an API public means that people can do whatever they want with it. If you are not sure whether you want to allow the API in the future, it should not be public. People will always look to do the laziest thing possible, which might mean hooking into your "public internal API". Then you will never be able to change it, and you will have to maintain it forever.
They can do whatever they want with it, but you have no obligation to maintain or support it if it’s not a documented public API, in my opinion.
It’s a bit like a house on a corner with a big front yard. People may cut through the grass to save time but you can’t blame the homeowner when he finally puts up a fence.
It’s unlikely that allowing the public to short-cut across your yard would create a prescriptive easement, but there are certain circumstances where allowing a party or the public to cross a parcel of land for a sufficiently long period of time does prevent the property owner from erecting a fence.
Although it’s not a hard-and-fast rule, in Ontario certain property owners have paths open to the public most of the time, but close them for at least one day of the year, often Christmas or New Year’s Day. The intent is to break any claim of a continuous year or more of access, which in turn prevents an easement from being asserted by any party.
Although it’s far from as simple as “if you allow the public access to your land for one continuous year, you allow it forever,” a certain folklore around this has arisen, and thus the practice.