Hacker News
Interactive GCC (igcc) is a read-eval-print loop (REPL) for C/C++ (github.com/alexandru-dinu)
170 points by pr337h4m on Sept 28, 2023 | 69 comments


This should perhaps have "(2009)" in the title; it is a fork of http://www.artificialworlds.net/wiki/IGCC/IGCC . Andy Balaam worked on the original IGCC in 2009 and 2012. Alexandru Dinu worked on this fork from 2018 to 2021. The fork seems to be mostly minor modernization plus added color output.

It works by compiling and re-running the whole session each time: if the program printed `n` bytes on the previous run and `n+m` bytes on this run, only the newest `m` bytes are shown to the user.


How does it handle side effects? If I open a file in append mode, write some bytes, then close the file does it continue to grow while I work on later things?


It keeps growing, sending all the bytes again for each eval.


Not well.


This should not be called a REPL, but a RACRDL (Read-Append-Compile-Run-Diff Loop).


Similar to Cling[1] from ROOT.

[1] https://github.com/root-project/cling


The odd part is that this is not just for fun. For many physicists when I was at CERN, a C++ REPL was a commonly used tool to interactively debug analyses, to such a degree that many never compiled their code. Back then, I believe, it was some custom implementation included in ROOT (https://root.cern/). I even went out of my way to write C++ code compatible with it just so it could run in that custom C++ REPL implementation; otherwise some colleagues weren't interested in collaborating at all.


As a CERN alumnus I can attest the same for ATLAS TDAQ folks, already back in 2003.


Yes, before CLING there was CINT!


Cling has been largely upstreamed as `clang-repl`. All distributions of clang+tools have it now, and unlike Cling it supports the latest Clang features. However, Cling still has some special features for GPGPU that `clang-repl` doesn't have.


Interesting to see that I had a C++ REPL on my system, installed with clang by default, all along.

From the documentation, it seems this is primarily aimed at incremental compilation on demand via Compiler As A Service.

Indeed, for an end user, calling this a REPL seems to be a stretch. There's no auto-completion and no special commands for showing docs or checking the type of things; you even have to manually #include <iostream> to print the results of simple expressions.

So I guess read-eval-loop would be more fitting :)


I use it on occasion.

Invoking it with rlwrap makes it a little nicer by giving it readline support.

I think there is a lot of low-hanging fruit for someone to make an IPython-inspired repl around it.


Nice! TIL.

Exciting to see this upstream - something I didn't get around to with ccons, which was one of the early prototypes of this built on top of clang (2009). Back then clang's C++ support was still being worked on, so ccons targeted C. Later, I collaborated a bit with the Cling folks, who I think took some inspiration from some of my approaches (and likely from some of the clang/llvm changes that I landed upstream for supporting functionality / bug fixes). Exciting to see this in the tree now!

https://raw.githubusercontent.com/asvitkine/ccons/master/doc...


Great to hear. I had to build Cling myself some years before the pandemic for a classroom, and each and every time it was a race with A/V software, which started flagging everything once enough people ran a "new" executable.


I have a vague sense that in the era of ML/AI hype and the complex Python/C++ stack that makes it all possible, Cling and the ROOT project should somehow be more visible? Is the project missing some VC driven marketing?


People who hype things use Python for coordination and most gruntwork is done by the GPU or C++ libraries anyhow so C++ doesn't help that much in the big picture. (And hype-people probably often aren't technical enough for C++)


I think there is a causality chain that goes something like this: hype people get attracted to where they see popularity, popularity requires exciting usability and results without much investment and such usability requires smart management of complexity (as there is no free lunch).

In our context Python became popular because it has smartly hidden the complexity via C++ libraries. But while that approach is powerful it is not very flexible (you need to work within the rails of these libraries).

People have tried inventing a new paradigm (Julia) that combines usability with performance, but it has not struck a chord (yet?). The latest effort being hyped is to double down on Python semantics and build a performant C-like superset (Mojo).

The interesting question regarding the ROOT project is whether you can stick close to the C++ universe yet build powerful tools that hide complexity. It's not trivial, e.g. would it automatically make good use of heterogeneous CPUs/GPUs?

But it's amusing to think that after CERN taught the world how to share data with the WWW project, it might teach us next how to actually work with that data :-)


Julia's main focus is scientific computing, and they aren't doing that badly:

https://juliahub.com/case-studies

There have been other interactive environments for C++ that predate ROOT, but they were too resource intensive and the market killed them.

Namely, Energize C++, which came out from Lucid as they pivoted away from Lisp Machines, applying the same kind of technology to C++, based on XEmacs.

https://www.youtube.com/watch?v=pQQTScuApWk

https://dreamsongs.com/Cadillac.html

And the last version of Visual Age for C++, based on the Smalltalk development experience.

http://www.edm2.com/index.php/VisualAge_C%2B%2B_4.0_Review

https://books.google.de/books?id=ZwHxz0UaB54C&pg=PA206&redir...


Lucid was never developing for Lisp Machines. Their product was a portable Common Lisp system for UNIX machines.

Later they developed a C++ development environment (C++ compiler, IDE with XEmacs and code infos in an object database).


Sure, it is a common error I do referring to all those CL environments as "Lisp Machines".


Lucid CL ran on top of a UNIX/C machine. A Lisp Machine is a whole machine itself.


Being pedantic, I already mentioned that is a common way I wrongly refer to them, hence the quotes.


> People have tried inventing a new paradigm (julia) that combines usability with performance but it has not struck a chord

For any non-scientific purpose, Julia is not attractive, so I doubt it will gain much more traction.

> whether you can stick close to the C++ universe yet build powerful tools that hide complexity. It's not trivial, e.g. would it automatically make good use of heterogeneous CPUs/GPUs

That's where we are going. Technically, being generic over CPU/GPU (for a defined set of operations) is not hard, it just takes work (and leaders in this domain are interested in keeping things proprietary), but that's already work in progress.

In general terms, copying Python's usability (or 90% of it) is also simple, and other languages are heading that way. Rust or Nim can come very close (once you have a good set of libraries and get past the boilerplate).


And then there's cppyy, also with ROOT heritage... https://cppyy.readthedocs.io/en/latest/history.html



I used to use Cling on macOS back in 2018 or so. It was really cool for the time.


Xeus-cling is a Jupyter Kernel for C/C++: https://github.com/jupyter-xeus/xeus-cling#a-c-notebook

With the xeus-cling Jupyter Kernel for C/C++, variable redefinitions in subsequent notebook input cells do not raise a compiler warning or error.

There's JsRoot, which may already work with JupyterLite in WASM in a browser tab?

There's a ROOT kernel for Jupyter, too: https://github.com/root-project/root/tree/master/bindings/ju...

Instead of the ROOT Jupyter Kernel, you can just call into ROOT from Python with PyRoot (from a notebook that specifies e.g. ipykernel, xeus-python, or pyodide Jupyter kernels).

"ROOT has its Jupyter Kernel!" (2015) https://root.cern/blog/root-has-its-jupyter-kernel/

IDK if there are Apache Arrow bindings for ROOT, though there certainly are for C/C++, Python, and other languages.

You must install jupyter_console to use Jupyter kernels from the CLI like IPython with ipykernel.

In addition to IPython/Jupyter Notebook, JupyterLab, vscode, and vscode.dev+devpod:

awesome-cpp#debug: https://github.com/fffaraz/awesome-cpp#debug

"Debugging a Mixed Python and C Language Stack" (2023) https://news.ycombinator.com/item?id=35710350

ROOT: https://en.wikipedia.org/wiki/ROOT

"Root: CERN's scientific data analysis framework for C++" (2019) because if PyRoot to C++ https://news.ycombinator.com/item?id=20691614 :

`conda install -c conda-forge -y root jupyterlab jupyter_console xeus-cling jupyterlite`

SymPy's lambdify() doesn't support ROOT but does support many other ML and NN frameworks; From "Stem formulas" (2023) https://news.ycombinator.com/item?id=36839748 :

> sympy.utilities.lambdify.lambdify() https://github.com/sympy/sympy/blob/a76b02fcd3a8b7f79b3a88df... :

>> """Convert a SymPy expression into a function that allows for fast numeric evaluation [e.g. the CPython math module, mpmath, NumPy, SciPy, CuPy, JAX, TensorFlow, SymPy, numexpr,]


> JsRoot in JupyterLab

"jsroot and JupyterLab" https://github.com/root-project/jsroot/issues/166

JupyterLite docs > Create a custom kernel: https://jupyterlite.readthedocs.io/en/stable/howto/extension...

`jupyter lite` builds a set of packages into WebAssembly (WASM) with emscripten and empack.

JupyterLite docs > Configuring the pyodide kernel > Adding wheels https://jupyterlite.readthedocs.io/en/stable/howto/index.htm...

Vscode.dev also supports the pyodide kernel.

Does `pip install root` work in JupyterLite (in the pyodide Python kernel)? Probably not because root is not a plain python package.

- [ ] Create emscripten-forge recipes for JsRoot and root, so that root is usable with the pyodide kernel supported by JupyterLite and pyodide

emscripten-forge recipes are compiled, packaged, and hosted.

emscripten-forge/recipes//recipes/recipes_emscripten/picomamba/recipe.yaml: https://github.com/emscripten-forge/recipes/blob/main/recipe...


The real deal, or is it yet another one of those Potemkin REPLs that append code to a temporary file, compile it and diff standard output?


Writes to a temporary file and compiles it: https://github.com/alexandru-dinu/igcc/blob/84f68c7056d0d996...


The ethics discussion in the neighboring threads aside, this approach has some nasty surprises, and I think it's borderline deceptive that they aren't mentioned more prominently.

It's not that rare that my REPL sessions e.g. in Node are

> perform expensive computation

... result ...

> transform result

... transformed result ...

With this approach, the REPL slows down quadratically with session length, and any expensive command at the beginning is executed over and over again.

Also, better don't do any file I/O and then continue the session unless you're prepared for the consequences.


If it's re-executing commands, I agree, it is very deceptive. I had no idea until I read this thread, and this whole time I had been confused about how they were doing this. How does this even work when you have non-idempotent operations?


It doesn't.


Compiling diffs - but if the user experience is REPL-ish and it can help some people learn the _basics_ of the language, what's wrong with that?

(Wrong as in "murdering puppies" wrong, not as in "here are the downsides of X" which I believe exist for every X.)


> what's wrong with that?

Why nothing at all, of course. A REPL need not be more than a way to test and explore syntax, functions, and logical structures.

> the user experience is REPL-ish and it can help some people learn the _basics_ of the language

PREPLISH exists for Perl ^_^

https://github.com/viviparous/preplish


The inefficiency of that process increases electricity usage, the carbon footprint, and therefore contributes to climate change. Which, eventually, murders puppies.

Good enough for your arbitrary standard? I would have thought the fact that all preceding side effects will be re-executed each time a new line is written is bad enough, but whatever.


>The inefficiency of that process increases electricity usage, the carbon footprint, and therefore contributes to climate change. Which, eventually, murders puppies.

You'll waste more electricity trying to solve it correctly than it will ever waste.

The real problem to solve would be: how to make GCC, Clang, etc. fast. Or how to make C++'s syntax easy and fast to work with.

Or how to convince people to switch to Rust? ;)


Penny wise, pound foolish. By this logic computer games are a genocide in which millions participate every second


The true question these types of discussions evoke for me is:

Are we able to live ethically?

(Which leads to "What is ethics?", which eventually leads to "Ethics is nonsensical".)


When my dad saw a queue of tourist cars on a weekend around half of the ring of the capital city he said that if this really causes global warming, this must be banned immediately.


Well, if you ban tourism, you're on an efficient track of slowing global warming.


Banning Python would also be huge for reducing our carbon emissions.


You see, logical thinking doesn't bring us far in ethics, because it leads to "The biggest thing we can do to reduce our carbon emission is to cease".


> Compile errors can be tolerated until the code works

Huh, that's not really great. You kind of want (1) the REPL to be syntax-aware, and not feed loops into the compiler until they're syntactically complete (with a multi-line editor) and (2) to reject input that causes the overall program to fail to compile.


Perhaps noise and wasted compile attempts could be reduced by having a simple counting mechanism for {}, (), and []. This wouldn't be correct if the program uses macros with unbalanced characters, but that's rare, so I suppose it would suffice to allow the user to turn it off.

So a compile wouldn't be attempted if there are obvious unbalanced braces.


Coming to C++ from python had me looking for something like this for a long time, but I've come to embrace my little single-file "replit.sln" which has proven to be much more flexible anyway.


gdb has always been my go-to repl for C/C++


Yes, I remember when implementing an AA tree in C++, I could call my AA_tree class's helper function on demand to output the tree state to a dot file, and watch the tree diagram update live. Pretty amazing.


I'll give an unpaid plug to https://www.softintegration.com/ for their Ch product, which is beyond excellent (and has existed for a very long time.) Clang-Repl/Cling is, however, nice to see, especially as it is open source.


Is it possible to integrate this into a notebook environment? This would be amazing for teaching C/C++.



> using namespace std;

Good god.


I'm not a C++ programmer, could you explain why this is a bad idea?


If it's in a header file, it imports the namespace into everything else. People make a big deal about it even though most of the time it isn't one. If it's isolated to a compilation unit, it's fine.


> People make a big deal about it even though most of the time it isn't.

For your personal project maybe it isn't. On shared or reusable code it's quite problematic.


It's problematic if it is in a header file that is going to be reused, not if it is in its own compilation unit.


It's less problematic in that case, but still problematic. It's less apparent during initial authorship, because you often have all the names that might collide already in your head at that point. It's during later maintenance when you lack context (like when you/your coworkers haven't seen, or don't recall, the code) that the problems become more apparent. You're basically digging holes behind you as you're walking along, and even though you might not trip over them, someone else walking on that path eventually will.


Not really. If someone is using vector<type> or unordered_map, it's unlikely it's anything else, and if you are using an IDE, you can check easily. You are basically implying that anything short of full names is a problem, which is just not true.


> Not really. If someone is using vector<type> or unordered_map it's unlikely its anything else

There are far less distinctive names than those in namespace std. Have you seen remove(), erase(), move(), search(), apply(), hash, format(), etc.?

Not to mention std::operator overloads like == and < that you suddenly drag into overload resolution needlessly, which can bring their own fun into the mix.

> if you are using an IDE, you can check easily

Lots of people don't. And even those who do, don't immediately have the symbol available to go to its definition. It's quite normal to have to wait O(minutes) for semantic analysis to become ready.

> You are basically implying that anything short of full names are a problem which is just not true.

No, very much not so. If you use it on something that's unlikely to collide (like std::cerr), `using std::foo;` is generally fine outside of headers. `using namespace std;` is the one that's problematic.


> Lots of people don't.

If they did they wouldn't have these problems of wondering where functions are defined.

Who are these people not using an IDE when working professionally with a team of people? That's a much bigger red flag than putting using namespace std; inside a compilation unit.

If there is ambiguity, then you can just add a std:: in front of a function. This is all within a single compilation unit anyway.


> Who are these people not using an IDE when working professionally with a team of people? That's a much bigger red flag than putting using namespace std; inside a compilation unit.

You seem to have completely ignored what I wrote after "even those who do..."? Like I explained, you're not immune to these issues just because you use an IDE.

And to answer your question, it's lots of people at companies whose C++ talent you would (or should) appreciate. And many people use both IDEs and text editors, depending on lots of factors, like the size of the task.

> If there is ambiguity, then you can just add a std:: in front of a function. This is all within a single compilation unit anyway.

Just because it's in a single TU that doesn't mean it won't be somebody else's problem.

First, someone who modifies a header you include will now potentially break your code simply via a name collision. You're making it harder for them to change their code without breaking yours. And you're probably not their only consumer.

Second, you're now doing ADL lookups instead of regular lookups. This can add a ridiculous amount of noise to error messages for widely used identifiers, which will make life much harder for everyone.

Third, not everything results in an ambiguity. The moment someone introduces an overload that happens to be a better match for your lookup, this can silently cause your code to misbehave. It might not be the most common problem but it's sure as heck one of the most painful when you or your teammates eventually get bitten.


> First, someone who modifies a header you include will now potentially break your code simply via a name collision.

That would mean that they are writing their own global functions that collide with the standard library, which is a pretty big mistake in itself.

Part of knowing the standard library is knowing not to make some function called end() in the global namespace. This really isn't a big deal. It's one of those groupthink ideas that permeate, and lots of these jihads in programming have been totally wrong. This one, I think, is just blown out of proportion. There is a lot of simplicity in not having huge long lines for every type. Part of this can be done with auto, part with aliasing, but an isolated namespace declaration isn't the end of the world.


> Part of knowing the standard library is knowing not to make some function called end() in the global namespace.

For end(), sure. But I listed a bunch of names that are way more likely to collide: remove(), erase(), move(), search(), apply(), hash, format(). The fact that you have to pick the most implausible name to make your argument should be enough evidence that it's a strawman.

> That would mean that they are writing their own global functions that collide with the standard library which is a pretty big mistake itself.

No. It just means you're in the same namespace, or a sub-namespace of theirs.

You're also ignoring ADL effects and silent overload resolution changes, which I mentioned already.

> It's one of those group think ideas that permeates and lots of these jihads in programming have been totally wrong.

You've made factual mistakes in a bunch of your arguments -- most recently your misconception that such collisions only happen with the global namespace. Calling this "group think" and "totally wrong" when your arguments rest on incorrect assumptions is not really warranted.

> There is a lot of simplicity in not having huge long lines for every type.

You're arguing as if we don't understand that... despite the fact that I evidently did, as I explained earlier how you could achieve that simplicity by doing "using std::foo;" instead of "using namespace std;", while avoiding the vast majority of the downsides I'm pointing out.

OTOH, what you don't seem to be considering (and which I've been trying to point out) is the various types of friction you're introducing for your coworkers and future maintainers.

> an isolated namespace declaration isn't the end of the world.

"Not end of the world" is a... low bar. Just because a practice isn't the end of the world, that doesn't mean avoiding it isn't a better idea. I've pointed out several reasons why: unintended ADL lookups, ridiculous error message noise, silent misbehavior, name collisions, difficulty of changing dependent code without breakages, etc. And I've pointed out how you can still achieve your goal while mitigating these considerably.

I don't have anything else to add.


> The fact that you have to pick the most implausible names

Actually I just picked one as an example.

> Calling this "group think" and "totally wrong"

I didn't actually say this was 'totally wrong'

> what you don't seem to be considering (and which I've been trying to point out) is the various types of friction you're introducing for your coworkers and future maintainers.

I do and it's not that bad. I didn't even say that I do this, just that it isn't the problem that some people think it is. You are having a meltdown over nothing.

> You've made factual mistakes

Nope

> most recently your misconception that such collisions only happen with the global namespace

I didn't actually say that.

> I don't have anything else to add.

That's for the best, because this really isn't worth getting upset about.


How else would you know that “cout” is std::cout and not some other cout? /s


Are you sure every name in std is as unlikely to collide as cout? What about format(), apply(), ...?


If a name collides, nothing prevents you from explicitly qualifying it.


When it silently changes the behavior you have no idea there was a collision.


Hey Bjarne Stroustrup himself uses it in his "C++ Language Book", so it must be fine. /s



