Ask HN: I'd like to practice coding GUI from scratch. Any recommendations?
154 points by scott01 on March 20, 2022 | 91 comments
Hi folks!

As the title says, I want to practice coding a GUI framework, probably in C++.

I've read about retained and immediate modes, have a rough understanding of how an event loop works, and overall my goal is to practice architecture and optimization (especially cache-friendliness), less so graphics and typography rendering, though I understand it's unavoidable to implement a rendering pipeline (I plan to start off with the AGG or Skia graphics libraries).

The dummy app itself will be less about forms and more about data representation, e.g. an audio editor or a node-based system, where data should be updated and visualized in real time, and everything should feel responsive.

Could you please give some research directions? Maybe case studies, best practices, interesting software, deeper reading on related architectures, or personal stories? I'm struggling to find information on this that isn't focused on a particular framework.




I'd recommend two things. The first is Brad Myers' Software Structures for User Interfaces course at Carnegie Mellon University, which focuses on the guts of how graphical user interfaces work.

https://www.cs.cmu.edu/~bam/uicourse/05631fall2021/

The second is Dan Olsen's book Developing User Interfaces, which has all of the details of how GUIs work, from graphics to interactor trees to events to dispatching. For some reason, it's absurdly expensive on Amazon right now.

Both Dan Olsen and Brad Myers were early pioneers in GUIs and GUI tools, so you'd be learning from the masters.


> The first is Brad Myers' Software Structures for User Interfaces course at Carnegie Mellon University, which focuses on the guts of how graphical user interfaces work.

The videos are locked behind a login. Does one need to be a student?


Brad Myers has a public YouTube channel, here's a playlist of all those videos (his channel, his playlist): https://www.youtube.com/playlist?list=PL3856C8FlIWfr_tX8CMUh...


Looks like there is a more recent book by Dan Olsen, "Building Interactive Systems: Principles for Human-Computer Interaction", which seems to be available at a reasonable price and similar in scope; maybe that's an option as well.


That's just awesome info, thanks for this! Added to reading list.


What a fantastic resource. Thank you.


> For some reason, it's absurdly expensive on Amazon right now.

Amazon bots have noticed that your comment is on HN, so a lot of people might buy it; that is why they jacked up the price :)


The cached version Google has (from 6th Feb) has the same price.


Clearly amazon bots used some AI to predict they’d make that comment months in advance, and jacked up the price (:


> I have heard what AWS does for this. I won't talk about it, at all, other to say: I assure you, one day we will all pay the price for their hubris. Quite possibly All of us, Everywhere. And they know it.


I have done lots of UI work on different platforms. I have used Motif and X Windows, Win32, MFC, Gtk, Qt, Cocoa, simple web interfaces, DirectX, OpenGL, and Metal.

I love Dear Imgui: https://github.com/ocornut/imgui

It is the simplest thing in the world. If you are starting out and are going to do 3D graphics anyway, you don't need state (you just redraw the screen 60 times per second).

Everything else is vastly more sophisticated. I also loved Qt, but the policies got a little cumbersome, so I went native.

The big problem is that with state there is a lot of complexity involved that is dependent on a platform, and once you pick one it is hard to change.

The big advantage of 3D graphics and Dear ImGui, or any other open source software, is that it works anywhere and you are not as dependent on a single company.



I actually started tinkering with Dear ImGui, very cool library! From your experience, does this approach scale for complex apps with a lot of data updates, with dozens of time series visualized, edited, etc., or am I overthinking?


It does, but you have to do a few extra things beyond what you'd do for the typical stuff like in-game editors and debug tools.

- Data updates are the best part -- there's no synchronization needed with an IMGUI approach because the "model" and "view" are the same thing.

- Adjust the event loop so it only redraws on changes (e.g. mouse movement) rather than every frame (unless you're drawing every frame anyway, as in a game).

- IMGUIs can have trouble with very large lists -- imagine a tree view with 20k objects: you're going to process each one of them every time through the GUI, even though most of them are outside the visible area. But it's hard to draw only the "visible" ones, since you don't know the future (e.g. imagine iterating a list where some items are hidden). Usually this isn't a big deal to work around, just keep it in mind.

- The biggest problem I've run into with the IMGUI style is layout. Since you process widgets as you go, you can't adjust to future things. For simple layouts like stacked property sheets this is fine, but once you get beyond that it can get complicated. I'm playing with some ideas now where I use constraint-based layout to come up with "reference boxes" up front that the IMGUI can then use for layout, but it's still a pretty open problem.
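A minimal sketch of the middle two points, with made-up names (this is not the Dear ImGui API): a frame that only runs when input has marked the UI dirty, and that clips a large fixed-row-height list to the visible index range so the off-screen rows are never touched.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Dirty-flag redraw plus list clipping for an immediate-mode pass.
// All names here (UIState, draw_list_frame) are illustrative only.

struct UIState {
    bool dirty        = true;  // set by input events, cleared after a redraw
    int  scroll_row   = 0;     // index of the first visible row
    int  rows_visible = 10;    // how many rows fit in the viewport
};

// One frame over a large list; returns how many rows were actually processed.
int draw_list_frame(UIState& ui, const std::vector<std::string>& items) {
    if (!ui.dirty) return 0;  // redraw-on-change: skip the frame entirely
    int first = ui.scroll_row;
    int last  = std::min<int>(first + ui.rows_visible, (int)items.size());
    for (int i = first; i < last; ++i) {
        // "draw" the row; a real backend would emit vertices here
        std::printf("row %d: %s\n", i, items[i].c_str());
    }
    ui.dirty = false;
    return last - first;
}
```

This only works because fixed-height rows make the visible range computable up front; with variable heights or hidden items you hit exactly the "don't know the future" problem described above.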


When listed elements (including trees) are very large, there's a reasonable compromise pattern used throughout UI: pagination. We've already found that infinite scroll isn't desirable for productive UX because it eliminates any kind of landmarks, and tree views are the same way. So my current belief on what to do is to default to expanding breadth-first to some limit, then bias node expansion toward places where the user has clicked, including some history so that the UI can present multiple leaf nodes simultaneously. You can imagine this as having a cursor for each explored branch and viewing the area around the cursor, with "18,000 more items..." at the edges of the view.

Re: the layout problem, it seems to be a factor in a lot of real-world applications (buildings, hardware design, etc.), and to the extent that it's "solved" by retained mode, the solution is based on moving the processing order around so that you deal with a different set of constraint edge cases manually. As such I think it really is just an unsolved class of issues, which wasn't tackled before because it involved a degree of algorithmic complexity and resource use that wasn't on the table in the 1980s when the GUI was first widely adopted.

So I think it's fine to explore having a whole algorithmic path and data structures dedicated to building the bounding targets for both collision and rendering - I have done similar when I've poked around at doing my own framework. That reflects the actual complexity of the solution, and it becomes especially apparent how deep you could go once you allow your targets to be arbitrary shapes with 2D transforms.


Thanks for the insights! A related question, have you tried implementing “latent updates” of different IMGUI widgets based on whether their data was updated or not? I was kinda thinking how to avoid redrawing everything in a non-forms UI, like a node editor where each node is a graph, for example.


Yes. The maintainer regularly posts screenshots of apps using dear imgui in the wild, and they are as complex as you can imagine. https://twitter.com/ocornut?s=21


Do you think GTK is still good? It's on my list to learn to make a Linux app for Pinephone.


A lot of folks have given you the advice to learn one or more existing frameworks (Dear ImGui, Qt, Cocoa, etc). This is good advice, but I don't think anyone has yet articulated why it's good advice. I'll try ...

Good UI frameworks are composable, and progressively expose their API users (developers) to their constituent parts, which can include systems such as drawing/rendering, animation, compositing, event handling, text editing, view hierarchy, navigation, accessibility, and so forth. As you build more complex applications with a UI framework, you naturally find yourself customizing default behaviors or implementing new controls or functionality on top of the existing capabilities. You start to see how the framework itself is built just by using it.

Once you reach expert level with a UI framework, you will have a conceptual understanding of how high level features of the framework are implemented in terms of the lower level functionality. As an expert, you will feel confident that you could implement a new type of control, or reimplement existing functionality such that your version is a perfect peer to the built-in functionality from the library author on all important axes (developer API, performance, user experience, etc). In some cases this might be a lot of work, but at least you should know generally how to go about it if you had to.

See if you can get to expert level with at least 2 different styles of UI frameworks. At that point, you will be capable of building your own.


Lotsa good wisdom here

> Good UI frameworks are composable, and progressively expose their API users (developers) to their constituent parts, which can include systems such as drawing/rendering, animation, compositing, event handling, text editing, view hierarchy, navigation, accessibility, and so forth.

I’m curious what would be good examples and bad examples for you here?

My first real UI kit was with VisualWorks Smalltalk many years ago. For me, this idea of composability and exposure was really strong there. It wasn't always the greatest code, but all the source was exposed in the class library, so you could just use the SelectionInList, or you could dig into it, put breakpoints in it, really take it apart and learn how it all worked at whatever level of abstraction you wanted to wade into.

UIKit/Cocoa didn't provide that level of exposure because it's a closed-source binary, but there was a time when the documentation was pretty good. And much of that still persists today. And much of it was honed by many years of NeXTSTEP development, so there's a certain consistency to much (but not all) of it. So it hasn't been as good, but it's been decent.

Then there’s been Android. This has been the worst. Early to market. Continuously evolved. Chasing the latest trend in UIs. Historically sparse documentation, and when you do find stuff through searching, good luck figuring out the relevancy. So this has been the worst for me.

I’d be curious which of the toolkits you’ve advanced in have been strong in your rubric, and which less so?


I was deliberately vague in my post about which to choose. There's surely something worth learning from every major UI toolkit in common use today. I'd say all the bad ones I've used have something in common: they were all written by people who never grokked a good UI toolkit before setting off and writing their own. But learning from them what not to do is useful too!

I personally have learned the most from Cocoa, but systems like React and Dear ImGui have definitely shown me new ways to think recently.

In terms of contrasting what's strong vs weak, look at how Cocoa changed from AppKit to UIKit. They kept a lot of the same ideas, such as run loops, target/action, the view hierarchy, and the responder chain, but did away with some things that were redundant (NSCell) or poorly suited to producing fluid UIs (timer-based animations). Put another way, they improved composability (everything is a view, as opposed to some things being views, others cells, and others still windows), and they introduced a new low-level system in Core Animation that provides compositing and animation and made it a fundamental building block.

---

An unpopular opinion I hold is that a good API design is more important than source code. Like, pick any method on UIView: strong Cocoa programmers could write a workable implementation given just the method signature and the documentation if they had to. The design of the API and how it all fits together is the real magic. Once you get that, the implementation follows. ... on the other hand, Microsoft actually tried to do that once, and their implementation was bad: https://github.com/microsoft/WinObjC/tree/develop/Frameworks...


I agree with Android. I've been trying to get into it for the last 2 weeks now and it's such a mess and it feels really outdated and the flagship IDE leaves much to be desired. Very frustrating experience.


> At that point, you will be capable of building your own.

But almost certainly should not do so.


Why not? Doesn’t it depend on what your goals are?


If you want to start from scratch, I'd recommend coding along with Casey on https://handmadehero.org. He develops an entire game, starting with double-buffered pixel graphics and then moving on to sprites with movement functions. That should give you plenty of background to be able to show a sound waveform (I'll offer one warning though: being able to edit sound in a way that still sounds good is much harder).


Well, he pretty much inspired me to go into this direction, such a great series! I particularly enjoy episodes on memory management, though, admittedly, they're a bit over my current skill level.


A long time ago I wrote a CAD system that ran on DOS and generated G-code for CNC machines. One of its features was showing the tool's path, and it cemented my love for trigonometry. You seem to have found something you love too - kudos for chasing it!


The problem with creating GUI libraries almost from scratch is platform fragmentation: a cross-platform GUI library has to abstract away the Windows Win32 API, which is one of the most stable and complete GUI libraries; macOS Objective-C Cocoa; Unix/Linux X11 (the X Window System), which lacks Win32-style buttons and other higher-level widgets, so it is better to use GTK as a backend instead of targeting X11 directly, which may even be replaced with Wayland on major Linux distributions; a Wayland backend directly; the Linux framebuffer for embedded Linux systems; or OpenGL for an immediate-mode GUI like ImGui. Another trouble is that GTK is not as stable as Win32 and has many breaking changes on every release, which may require anything depending on it to be modified.

The following design patterns are widely used with graphical user interfaces: the observer pattern; model-view-controller; model-view-presenter; two-way data binding; property binding; the command pattern; and the composite pattern, for representing a collection of objects as a single object.
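As a concrete anchor for the first of those, here's a minimal observer pattern in plain C++ (the names TextModel and subscribe are illustrative, not from any particular toolkit):

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Observer pattern: a model notifies every registered view when it changes.
class TextModel {
public:
    using Observer = std::function<void(const std::string&)>;

    void subscribe(Observer obs) { observers_.push_back(std::move(obs)); }

    void set_text(std::string t) {
        text_ = std::move(t);
        for (auto& obs : observers_) obs(text_);  // push the new state to every view
    }

private:
    std::string text_;
    std::vector<Observer> observers_;
};
```

MVC, two-way binding, and property binding are elaborations of this core: something writes to the model, and everything bound to it gets told to repaint.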

For understanding the event loop, it may be much easier to implement an xterm/VT100 keyboard-driven terminal user interface (TUI), since that does not require dealing with too many backends.
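The core of such a TUI event loop really is small. A sketch, with input faked from a string so it stays self-contained (a real version would read raw terminal input and parse VT100 escape sequences):

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Skeleton of a keyboard-driven event loop: fetch the next key,
// look up a handler, repeat until a handler asks to quit.
struct App {
    bool running = true;
    int  cursor  = 0;
};

void run_event_loop(App& app, const std::string& keys,
                    const std::map<char, std::function<void(App&)>>& handlers) {
    for (char key : keys) {  // stand-in for blocking reads from the terminal
        if (!app.running) break;
        auto it = handlers.find(key);
        if (it != handlers.end()) it->second(app);  // unbound keys are simply ignored
    }
}
```

The dispatch-table shape is the same one a GUI event loop has; the difference is mostly that GUI events carry more data (coordinates, modifiers, targets) and arrive from a windowing backend instead of a keyboard.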


Forget cross-platform. There are no great cross-platform toolkits.

Pick a platform and make the best application you can on that platform.


The greatest cross-platform toolkit is Qt, which abstracts away all platforms, including mobile and embedded systems. Qt also works on top of the framebuffer on embedded Linux and on other RTOSes. The disadvantage of Qt is that it is a C++ library lacking a C ABI or C API (extern "C") that would allow Qt to be called from languages other than C++. That is why GTK has far more bindings than Qt, although they are unstable due to GNOME's breaking changes and lack of backward compatibility.

Most applications just pick the Windows Win32 API or macOS, forget about everything else, and are never released for Linux or other Unices, unless the application uses Electron, Qt with C++, Qt with Python, or Java Swing like the JetBrains IDEA IDE family. The main problem with the Linux desktop is the lack of a high-level graphics toolkit with a stable API, a stable ABI, and a C interface that does not introduce breaking changes on every release. Unlike GTK, Qt is more stable, but it lacks a C API, and C++ has a fragile ABI that is not friendly to foreign-function interfaces or cross-language linking.


Do you see a future where cross-platform could essentially be a set of tools/frameworks to build high performance native experiences targeting the browser as the primary runtime?


Eventually - Google is pushing this with Flutter and Microsoft is pushing this with Blazor.


> There are no great cross-platform toolkits.

True. But there are several entirely adequate cross-platform toolkits. Pick a toolkit, not a platform.

(GTK+ (via gtkmm) user for 23 years, on 3+ platforms)


> it is better to use GTK as a backend instead of targeting X11 directly, which may even be replaced with Wayland on major Linux distributions;

X11 can work on Windows (many X servers available, since forever), macOS (it used to ship with one, but now you've got to install one), and you can run Xwayland under Wayland. So it's as cross-platform as you can get. Might not look great, but then cross-platform doesn't usually look great anyway.

If you're going to do X, avoid Xlib; it adds restrictive abstractions on top of the X protocol and really confuses things. XCB is much closer to just a reasonable interface to the protocol. Ultimately, X11 is a distributed-systems communication protocol that happens to have graphical output as a side effect; understanding the communications part first lets you get the most out of it.


I've written a fair share of GUI apps in C++ on multiple platforms, starting with Win32 API and MFC on Windows and later Qt on Linux. The way you write applications with these frameworks is different enough to have a significant impact on the structure and design of the UI layer.

I don't have any suggestions for how to build such a framework, but I would encourage you to play around with a few existing frameworks to see how they are designed, what you like about them, and what ideas to avoid.

For just one example, see the difference between managing the event loop entirely by hand with Win32[1] and using "signals and slots" in Qt[2]. There is a lot more to UI frameworks than this, and they vary on many more aspects.

- [1] https://docs.microsoft.com/en-us/windows/win32/winmsg/using-...

- [2] https://doc.qt.io/qt-5/signalsandslots.html


Thanks! I had a look at signals and slots. Is this similar to Smalltalk's idea of objects sending messages to each other? Like messages they don't know about can be ignored, or something like that (I only watched Alan Kay's videos on this topic, never programmed in Smalltalk myself). If I'm not mistaken, Objective-C had something similar and was also successful for UI, so maybe message passing mechanism is something to explore more…


With Qt you don't really have to deal with unexpected messages. A signal is triggered by an event: a button click, a slider value changing, a checkbox being toggled, that kind of thing. A slot is just a place to receive it; it's a regular function or method with a compatible signature for the kind of event it's meant to process. There are no unexpected calls because you have to explicitly connect a signal to a slot, something like:

    QObject::connect(&button, &QPushButton::clicked,
                     &mainWindowController, &MainWindowController::onButtonClicked);
If no slot is connected, the signal just doesn't go anywhere.

Something that's special with Qt is that signals and slots are declared in the class using custom syntax, e.g.

    public slots:
        void onValueChanged(int value);
    signals:
        void valueChanged(int newValue);
Yes, here "signals:" and "slots:" look like the same kind of keywords as "public:" or "private:" in a class declaration, even though internally "signals" is #defined as "public" and "slots" expands to nothing (which is why a slot declaration needs an access specifier such as "public" in front of "slots:").

Qt makes use of a tool called MOC – the Meta Object Compiler – to generate code for signals and slots. It's not a pre-processor but a code generator, which has to run before you can actually compile the code. I think you could probably implement a similar system of event subscriptions without this additional code generation step.

See this page for details: https://woboq.com/blog/how-qt-signals-slots-work.html


Signals and slots are just an anonymous callback system. There's no more (or less) magic going on. Anonymous? That means the thing initiating the callback has no clue which objects will execute the callbacks.

Gtkmm, or rather its underlying library libsigc++, does not require a separate build step like MOC; it can handle signals and slots "natively" because it uses a more modern version of C++. Qt required MOC when it was first written and never moved away from it, despite the language having evolved to allow the required functionality all by itself.
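For illustration, the required functionality in modern C++ without any code generation looks roughly like this. It's a toy, not libsigc++'s actual API; real libraries add disconnection, lifetime tracking, thread safety, and more:

```cpp
#include <cassert>
#include <functional>
#include <vector>

// A MOC-free signal: connect any callable with a matching signature,
// then emit to all connected slots in connection order.
template <typename... Args>
class Signal {
public:
    void connect(std::function<void(Args...)> slot) {
        slots_.push_back(std::move(slot));
    }
    void emit(Args... args) {
        for (auto& slot : slots_) slot(args...);  // invoke every connected slot
    }
private:
    std::vector<std::function<void(Args...)>> slots_;
};
```

To be fair to Qt, MOC also buys things templates alone don't (runtime introspection, queued cross-thread connections), which is part of why it stuck around.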


http://www.cmyr.net/blog/gui-framework-ingredients.html has a bunch of what you'll need to know about building a GUI from scratch, especially if you're interested in any of the "hard parts" rather than making a toy. It's Rust rather than C++ but many of the principles are similar.

Other than that, I do recommend the immediate-mode GUI approach if you're writing everything yourself, as it has the lowest overall system complexity of all the approaches. It has a number of drawbacks, and things don't abstract/compose as easily as in a standard object-oriented retained-mode approach or a newer React/SwiftUI approach, but exposing the raw parts has its upsides.


The Rust world really needs a GUI; it would be great if you helped out making a GUI for Rust: https://www.areweguiyet.com/

There is a start of a native Rust GUI which you could contribute to: https://github.com/linebender/druid


If you're looking into building toolkits for not-too-popular languages, I feel like Erlang should be interesting, the process model sounds like a perfect match to GUI components. I've never seen anyone trying to build that, for some reason. I have a feeling GUIs with borrow checkers are hard to get right, wouldn't advise building Rust toolkits as practice.


Rust is expected to become more commonly used because it better handles the large domain of C/C++-style code: it is less error-prone while still being fast and energy-efficient. The Linux kernel is starting to support Rust now.

Rust, for instance, has no garbage collection overhead, while Erlang runs a garbage collector per process.


> If you're looking into building toolkits for not-too-popular languages,

Best shade thrown on HN all week! Bravo.


It would also help you learn Rust, which will likely be a more in-demand skill than C++ in the future.


I've got a little Rust GUI library that's pretty young, and relatively simple, so maybe fun to contribute to: https://github.com/audulus/rui :)


So much this. A UI that understands traits and structs and automatically displays a control for that type, in both bulk-edit and single-form mode, would get money from me.


I highly recommend watching Casey Muratori's presentation on immediate mode GUIs, and his Handmade Hero series (links below).

https://caseymuratori.com/blog_0001

https://handmadehero.org/


> The dummy app itself will be less about forms but more about data representation, e.g. an audio editor or a node-based system, where data should be updated and visualized in real time, and everything should feel responsive.

Would you be interested in a 'no restrictions, informal' collaboration?

I have little to no experience writing GUIs, but your "dummy apps" sound to me like the sorts of things I could populate the 'guts' of, to make useful.

For instance, I work a lot with audio signal processing, have considerable experience in that area, and am "half-assedly" (low priority side project) writing tools that would benefit from a GUI.

My email is on my profile.


I find the SerenityOS userland code base quite approachable. It's all written in C++.


That looks like a good idea, probably can explore allocators, etc. from there. Thanks for that!


I've been enjoying hacking on Fidget as a side project! It's inspired by Figma and is written in pure Nim with an OpenGL backend. Previously I'd been dabbling with ImGui, but wanted something with an efficient event loop driven renderer. It's enjoyable writing in a compact language that's fast and provides memory management.

Here's my fork of Fidget: https://github.com/elcritch/fidget

I've uploaded a few youtube samples: https://www.youtube.com/playlist?list=PLfEcHAJujZRlloKeDj6iK...

Based on what you're asking about you might be interested in following an example where I tied in Nim's async system into Fidget: https://github.com/elcritch/fidget/blob/devel/tests/progress...


I would only like to urge you to think about accessibility. There are many custom GUI libraries out there, but almost all of them are made without thinking about it. Accessibility kind of forces a certain architecture, which may be hard to fit in later. A11y APIs expose widgets in the form of a tree.

I've thought about writing a cross-platform GUI toolkit that would start out as a cross-platform accessibility library.


To be honest, I'm aiming to simply practice UI architecture and optimization, and it's unlikely that something will come out of it. But I didn't know about A11Y before — thank you for bringing this up, will probably check it later.


If there's one angle I'd jump on the bandwagon of harping on about, it's definitely this one, because a) the subject doesn't get nearly the attention it needs, and b) the heavyweight toolkits (Qt and Electron would probably be the mainstream contenders) seem to be where the minimum-viable implementations are (although I could be wrong), which unfortunately seems to bring "learn how to leverage $toolkit_x" to the dependency table if you want to do accessibility well.

Installing a screen reader like NVDA and observing how it interacts with the screen might be interesting to learn about the scope/depth of the accessibility APIs in general. (And then there's observing how screen readers handle different websites...)

Imgui and similar generally have zero accessibility support whatsoever, although it's certainly not impossible to add - just a bunch of messages back and forth to the accessibility layers in the OS - but the support would wind up being a module, and a bolted-on one at that that would have very little uptake because everyone who didn't need it (for themselves) would just compile it out. Whereas with Qt and Electron it's kind of like rabbithoooooookaybackawayfromtheinternals and everyone just leaves all the bits in :P and the support just comes along for free. (An interesting study in the tradeoffs between minimalism, modularity, and the commoditization of support of niche use cases...)

Also, regarding practice of UI architecture and optimization, I actually sometimes think about the idea of drive-by contributing to open source projects - they have pre-existing contexts and conventions that must be adjusted to (an important skill that is tricky to develop alone), and it's possible to approach them with an objective impartiality that is sometimes tricky to square with emotional investment in personal projects. This may be one approach to independently evolving mechanical abilities (structure, domain-specific mental modeling and problem-solving, best practices) separately from fundamental personal development, which may (seems to?) go a little slower.


Could you elaborate a bit more on the a11y-focused architectural constraints that you want to enforce?


Yeah. Don't.

GUI frameworks are like cryptography: you never want to roll your own. You will spend endless time implementing all the widgets a typical UI framework has, tweaking it until it feels kinda right, and even then they won't feel totally right until you've done extensive user testing and acted on the resultant feedback. Hundreds or thousands of hours of work for something that won't be as good as Windows, macOS, or Qt.

And I haven't even gotten started on internationalization or accessibility! Ohohohoho, boy.

There's a reason why the best (only good, really) UI framework is proprietary to a large, multi-trillion-dollar company.

In conclusion, just use your OS's native framework or one of the half-decent ones for Linux if you're working on Linux.


If you have a mostly 2D UI in mind, you might go for something even simpler than OpenGL, like SDL or raylib: anything that lets you put pixels on a screen.

My personal experience is that starting development with OpenGL always ends up wasting hours getting the right libs installed for potentially full 3D GPU acceleration, which might be beside the point if you just want to draw stuff.

Casey Muratori is spot on in saying that a single, simple, universal, and basic API to "open a window and draw pixels" is what we lost the most during the 90s.

Anyway, try to bypass the parts that aren't interesting to you, and have fun!


I would definitely suggest starting with SDL, because it is a "single, simple, universal and basic API to 'open a window and draw pixels'" and it recently added an API to draw geometry directly without OpenGL. It also gives you a flexible event system that can be used for messaging.


Noted, thanks! Yes, I was thinking more about 2D


I cannot believe that no one has mentioned HTML Canvas with JavaScript. I'm no fan-boi/gurl for JavaScript, but you can do some crazy and amazing things with HTML Canvas in a modern web browser. Plus, "everyone" seems to think that TypeScript is a pretty great advance on JavaScript, by adding a bunch of useful things, like stricter types. As an added bonus, it's easy to share the results with others -- all they need is a browser.


I think the original question was about C++?


I checked the original question again. It says: <<probably in C++>>

My point was to open the door to the amazing web-browser-based GUI platform that is HTML Canvas + JavaScript/TypeScript. :)


First, you open a port to $DISPLAY...

xlib is quite a pleasure to use though, and it probably doesn't qualify as a "framework" that you don't want to use.


X is deprecated and virtually abandonware at this point, so... username checks out.


Xorg is open source, and open source software doesn't become 'abandonware'.


Sure they do, once nobody develops for them anymore, which is almost the case for X11. Nobody wants to touch that moldy old code base. There's somebody keeping the lights on I guess, but for all intents and purposes, ALL of the developer effort is on Wayland now. Wayland will catch up with and surpass X.


People do develop Xorg (X11 is the protocol, Xorg is the software), and as long as there is at least one person "keeping the lights on" it will be developed. Hell, even if every single programmer on earth stops working on the codebase, it can still be resurrected at some point later if someone decides to work on it.

Open source software is resilient to 'death' and 'abandonment' issues plaguing closed source software.

I'm writing this as someone who is using a window manager (Window Maker) whose original developers abandoned it for years; then someone else picked it up some years ago to continue development, and I personally have contributed bug fixes and features to it years after both of those.

OSS doesn't die for as long as there is someone out there who is interested in it.


I tried to make bindings for GTK in Go back when its C interface didn't even support callbacks (so I used polling to capture events). I didn't get too far because there was a lot of repetitive work involved, but I learned a lot in the process.

If you see yourself doing something 3+ times, automate it early!


There is no right way to design a GUI library, so ignore the people who insist you read a book and follow the paradigms it mentions to a T. In fact, we really need fewer of those OOP GUI libraries right now. Definitely poke around with reactive approaches, etc.


I'd recommend actually building something with Dear ImGui (IMO, the best GUI framework in existence by a country mile). Merely reading about the differences between retained and immediate mode cannot do justice to the paradigm.


I think you should check out the JUCE framework (www.juce.com): https://github.com/juce-framework/JUCE.


Can someone comment on how https://github.com/aseprite/aseprite/ works?


I'm keeping an eye on Skia to try out some vector based UI ideas I've been marinating for a while.


Skia is a drawing library.

GUI frameworks/toolkits/libraries require drawing but also require an event handling framework, which Skia does not provide.


I got the impression that the point was actually building something, not stacking frameworks.


Think about the various "hierarchies" / aspects of the relations of the things in it. Event-passing is one; containment (x part-of y) is another; visual overlap yet another (that one depends on point of view); there are more, depending on the purpose.


There're definitely keywords I'm going to Google :) As for hierarchy, I'd like to avoid having class hierarchies where everything is a subclass of Widget, etc. I guess I want to explore how much of the data required for rendering of such UI can lie in contiguous memory — hope it makes sense…


Start with Bresenham's line algorithm :) if you want to go with "from scratch".


"First one must invent the universe".

That being said, I'm interested to see the range of answers to OP's question, partially as I'm curious what HN considers "from scratch" and partially because I've thought about doing the same thing.

Edit: Same thing making-gui-from-scratch, not the audio stuff.


I started doing this in QuickBasic when I was 14 (trying to copy Windows 3.1 LOL). That little project taught me more about what I didn't know, and it also plainly revealed my inexperience with abstractions. Before that I had not really reflected on my own level of understanding about abstractions.


> Before that I had not really reflected on my own level of understanding about abstractions.

This is the main reason I try to do this when I can. When building things for fun I try to do it from the lowest level of abstraction that I can so that I understand more of what the higher levels of abstraction cover up.

I guess I just like turning unknown unknowns into known unknowns.


From scratch == floppy disk and magnetized needle.

But seriously from scratch in this case is a canvas plus possibly some primitives for marking the canvas. Cairo provides these things.

I used to have an old 2MB DOS notebook that I would fire up now and then to play with my ideas in this space. I could get mouse input through an INT op code of some sort, and there was the keyboard and getting access to video RAM.

I have tried to play with the Linux kernel drivers underlying things like wayland and X but I lack the determination to see that through.

Another post here someone listed the object model paradigms and that’s probably closer to what OP is interested in.


FWIW /dev/fb0 et al are considered/marked deprecated and KMS/DRM is considered The Way Of The Future™, but the old framebuffer interface still works just fine and is actually surprisingly fun to play with. Your problem may not entirely be lack of determination: I took a look at KMS myself a while back, and found so little documentation on how to use it (both correctly and incorrectly - very helpful to understand when learning) that I just gave up and used the deprecated framebuffer API, which is as complicated to use as mmap() (and poking the KDSETMODE ioctl() to tell the kernel's tty layer not to touch the framebuffer).

What's particularly fun is that an optimally-compiled kernel can boot to userspace disorientatingly quickly - like, ~500ms on decade+-old hardware. So you can be running your code almost as the user's finger is still moving away from the power button (yeah, okay, notwithstanding BIOS/EFI dwell time).

QEMU is also just as fast (and has no POST delays), and rebuilding small initramfs images takes like a few hundred milliseconds if done carefully, so you can end up with iteration processes where you can tweak a line of code, hit ^S and have an updated initramfs running the newly-compiled code in a fresh boot in like 1 second or so. This was on a low-end i3 NUC.


This is intriguing. I did find some drm example code and play with it a little, but I was either in my actual desktop environment or in a vbox that I needed to keep working. I should go into it again sometime after shaving the qemu yak as preparation.


Huh, you went a little further than I did then, cool. I also vaguely recall playing with some example code, but IIRC I wound up stuck with something that cycled my laptop's backlight every time I loaded it, and the documentation (and my poor attempts at googling) all pointed at that only being an issue if I was using code that was different, or something. Yeah I just gave up lol.

As for what I did get working... uhh... I did get graphics on the screen in the end, but the actual hardware I was testing with proved a tad flaky at the time and I never had the first clue how to figure out what was going on, so I ended up moving on to other things. I did make reasonable progress on the QEMU yak-shaving front before getting to that stage though.

- Compartmentalizing yak-shaving QEMU distinctly from everything else might be accelerated slightly by borrowing your distro kernel for a bit (because they generally always have the modules necessary to reach an initramfs compiled-in).

- You can skip out on needing a disk image, partition scheme and conventional bootloader by passing `-kernel` and `-initrd` (and `-append` to set the boot arguments). This neatly solves an entire dimension of issues at once, and is awesome.

- Specifying `-serial mon:stdio` (and `-append 'console=ttyS0,115200'`) writes serial port output to QEMU's stdout *while* giving you a way to fall-through to the QEMU monitor using ^A c, and insta-kill the VM with ^A k (orrr it might be ^A q or ^A x or something, I don't remember lol). This is awesome for bringup and furthermore irreplaceably practical as a way to get guest-side stdout (aka printf() \o/) onto a consistently-located spot on the screen (ie no pesky windows opening in random locations etc). (As a point of comparison, the -curses option does a fancy thing where 80x25 VGA textmode is translated into a curses window - yup, booting MS-DOS this way is fun - but in this mode terminal scrollback doesn't work; -serial mon:stdio doesn't initialize curses, so terminal scrollback works normally/correctly.) All this is complemented well by `-nographic` which disables VGA output, likely markedly irrelevant beyond bringup, but probably quite useful to begin with.

- QMP, the QEMU Monitor Protocol, is a JSON-based just-verbose-enough-to-be-annoying control protocol that lets you do nice tricks like telling running QEMU instances to reset/reboot (fully restarting QEMU will definitely slow things down a bit). QMP can listen via TCP or a domain socket. Thankfully the initialization handshake is effectively one-way, so IIRC you can just stuff the "yes hello version 1 blah blah" down the wire along with the reboot request as a fixed payload crammed through netcat.

- "Oh but of course" surprising-but-not-surprising thing: in order for the text console to have any practical purpose, writes to it must be synchronous so that the "about to load X" has fully made it (to the VGA text region|to video memory|out the serial port) before X happens. That synchronicity is a truly significant source of boot-time slowdown, and (maybe second to jiggling what kernel code is compiled in, what's in modules, and what's left out), completely disabling boot messages is one of the secrets to kernels that load virtually instantaneously. This is straightforward - `-append 'loglevel=0'` - aaaaand naturally happens after bringup :). Once you get there (and have commandline reboots working), with all the noise gone (QEMU prints nothing itself) you can make decisions about things like whether you'd like to preserve the output from previous boots in the terminal scrollback (incidentally `tput clear` resets that).

- Playing with an initramfs-less kernel (or the distro stock initramfs) can be useful to figure things out, but gets boring after approximately 2 minutes. One frequently-used approach to generating initramfs images is `find | cpio --create --format=newc > initrd.img`, but the cpio method has the caveat that the character and device files (/dev/console etc) must exist in the source directory that gets scanned from (and be of exactly the correct type/major/minor) for the resulting initramfs to work correctly. An alternative, even faster and IMHO honestly easier-to-manage way can be found in the `usr/gen_initramfs.sh` script that comes with the kernel, which basically just passes `usr/gen_init_cpio` a specially-formatted file list (see `default_cpio_list` for a (small) example - incidentally the cpio generated by that file is hiding in every kernel that doesn't have an initramfs embedded in it). This program has the benefit/advantage that you pass a virtual file list and can create character/device files and directories just by describing them in the file list. I would wager that taking the shell script apart and figuring out how to wrangle this approach is honestly less wall-of-text than it may initially look; you just describe what you want put where. (One note: the cpio format is cute - if you store /dev/console, but don't store a directory reference for /dev before it, /dev/console won't exist as far as the kernel is concerned >:D you have to spell out the directory entries!)

- Just in case... if you get the idea to unpack an existing initramfs with cpio perhaps to visualize how it works (and maybe start by repacking an existing known-working initramfs), you'll need to unpack the initramfs as root so its contents are given the correct ownership and permissions and cpio can create device and character files and whatnot (so that the initramfs gets re-packed with files that have the correct permissions and types and etc). In this scenario I point you at cpio's `--no-absolute-filenames` option; it is the negation/opposite of the `--absolute-filenames` option whose description is: "Do not strip file system prefix components from the file names. This is the default." This obtuse documentation is telling you that, by default, cpio will extract the image to /, which it will do successfully because it must be run as root; and you will presently find you have a very, very broken system :) - I had to reinstall Arch a decade ago because of this lol, not even completely reinstalling every single package on the system fixed it. Basically, if using cpio in extract/"copy-out" mode, forget `--no-absolute-filenames` at your computer's peril. With `--no-absolute-filenames` cpio will extract very happily into the current directory.

- HI I AM VERY LOUD AND I AM HERE TO TELL YOU TO NOTICE AND REMEMBER THE ABOVE POINT IF YOU TAKE NOTHING ELSE IN

- The kernel and initramfs are typically compressed somehow. You might want to do some experimenting to find what works best in QEMU - given that the input files will likely be sitting in the page cache, and QEMU's -kernel and -initrd options work by mapping the files directly into the guest's memory space in one go, skipping initramfs compression (on regen) and decompression (on boot), along with disabling kernel compression, theoretically has virtually no downsides. I forget whether I found gzip to (for whatever reason) be slightly faster despite all of this.

- Copying all the source files to, and generating the initramfs into, a tmpfs will basically mean that the process of re-cutting the image boils down to memcpy() with more steps. IIRC this is what took things from "cool! 5 seconds!" to ^S "...wat." ^S ^S "...how..." ^S ^S ^S ^S ^S "wat how is this possible" ^S ^S ^S :) QEMU can pretend to be Firecracker partially successfully :P

- In terms of actually doing anything at all in the initramfs, a first-class startpoint would probably be BusyBox, not just because it'll be very helpful, but because its (entirely approachable) build system wants to compile a static binary out-of-the-box by design. You'll need to make sure all its little symlinks are in place in the resulting initramfs (but you can also copy the busybox binary to /bin/sh in a pinch), but beyond that, the generated binary should work both on your host and inside the VM.

- Going beyond BusyBox involves selecting a libc. There's nothing stopping you simply just copying your distro copy of glibc into the initramfs, at which point everything you compile normally with gcc will immediately work (and this also sets up the playing field to be able to "argh what file does it need *now*" other stuff into the system, like for example gdb). The convenience of this approach is kind of hard to beat, but pursuing other options is generally how you cut down on initramfs size. I never moved on to properly playing around with alternative libcs, but I did note that diet libc includes an invocation wrapper that handles calling gcc for you. (And on a sidenote, getting remote host-side gdb debugging working would probably be quite the fun nightmare, but is probably the correct route to go down... because gdb will probably want to pull in python... :v)

- Under certain circumstances you might want to transfer data in and out of the running VM. Rewriting the initramfs and rebooting stays pretty fast up to ~20MB or so (!) so that works for getting even largeish amounts of data in (where rebooting is okay), but beyond that, and for two-way I/O, 9P protocol support is built into the kernel as a straightforwardly lightweight alternative to NFS, and `diod` works very well on the host/serving side. (The side project/quest is that the kernel must now support networking and you get to play with busybox's `ip` (or `dhcpcd` if you're lazy). Welp.)

- A lot (not all) of distro kernel configs enable the option to cram a compressed copy of .config into /proc/config.gz, which could work as a known-functional starting point if it's available. (Or you could just `make allnoconfig` then headdesk for the next few hours... this option does admittedly have a meditative component to it, and you learn where all the important stuff is :3)

- On real (Intel) hardware, I (very belatedly) learned about `i915.fastboot=1` only the other day, which skips the backlight cycle on boot when transitioning from the VESA or EFI framebuffer to properly accelerated graphics.

- A footnote that I'm mentioning for completeness: you'll likely use an approach where /sbin/init is a shell script that runs whatever tasks you want at boot time, with a `while :; do /bin/sh; done` at the end (if nothing else this saves you 100% CPU usage of one core from Linux's panic routine spinning the CPU in an infinite loop). This init script would likely stay open in your primary editing session and change frequently instead of having to hammer away repeatedly at the shell on each reboot.

- I mention ^S throughout - just to clarify, I use inotifywait to react to file save events, and in this situation I'd probably do something like `while true; do ./build.sh; inotifywait -qq -e moved_to .; done` (moved_to might need changing, see `inotifywait --help` and experiment with `-m`), where build.sh might call GCC, re-cut the initramfs, then hit the QEMU reboot endpoint. (Yeah, the workflow I ended up with had one terminal for build and another for QEMU/stdout. If I were revisiting this today I'd probably use `printf "\e[1t\e[5t"` to unminimize+raise the build terminal on errors and otherwise keep it minimized, but that's just my own workflow. (Sidenote, searching "man dtterm(5)" FTW. The terminal in question is long gone but everything else supports the same escape sequences. Haven't found a better reference yet.))

- Oh, look into `openvt`. BusyBox comes with a copy.

All of the above basically pretends Buildroot doesn't exist and reinvents it, arguably poorly :). (I should probably look into it sometime and make a proper comparison.) In all fairness Buildroot is legitimately probably slower than this approach.

At the point I got bored with all this and moved onto something else I was bikeshedding the filelist generator for gen_init_cpio (and debating just writing the cpio archive myself, although that would have been of marginal help). This was a moderate-difficulty devops problem: I had virtual files I wanted to always be present along with piles of real files from all over the place in different source directories that I wanted brought together into one image, I needed to ensure directory entries existed before their contents, and then I wanted to be able to toggle groups of things on and off depending on different requirements. This will realistically probably surface as a small "build tooling" concern at some point, because the flexibility afforded by gen_init_cpio (compared to cpio and being stuck with moving things in and out of source trees) affords sufficient complexity to encounter this sort of scaling problem. I guess it's subjective whether that complexity affordance is a bug or feature. All this is objectively a small trivial detail, I'm probably just remembering it as a bit of a saga because this is what I was working on when my attention span went out to lunch.
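Stitched together, the edit-to-boot loop described in the bullets above might look roughly like this command sketch (paths, the QMP port number, and the exact gen_initramfs.sh invocation are assumptions; the QEMU flags, the QMP commands, and the inotifywait loop are the ones named in the comment):

```shell
#!/bin/sh
# One long-running QEMU: kernel + initramfs mapped directly into guest
# memory (-kernel/-initrd), serial console on stdout, QMP on TCP :4444.
qemu-system-x86_64 \
    -kernel bzImage -initrd /tmp/build/initrd.img \
    -append 'console=ttyS0,115200 loglevel=0' \
    -serial mon:stdio -nographic \
    -qmp tcp:localhost:4444,server,nowait &

# build: compile init statically, re-cut the initramfs into a tmpfs,
# then cram the QMP handshake + reboot request through netcat as one
# fixed payload (the handshake is effectively one-way, as noted above).
build() {
    gcc -static -o /tmp/build/init init.c || return 1
    sh usr/gen_initramfs.sh -o /tmp/build/initrd.img filelist.txt
    printf '%s\n' '{"execute":"qmp_capabilities"}' \
                  '{"execute":"system_reset"}' | nc -q1 localhost 4444
}

# Rebuild + reboot on every editor save (the ^S mentioned throughout).
while true; do build; inotifywait -qq -e moved_to .; done
```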


Thanks for all of this.

I work on projects that mostly use buildroot to make their images, but I've also done that other ways when I was playing with an RPi.


> But seriously from scratch in this case is a canvas plus possibly some primitives for marking the canvas. Cairo provides these things.

You need an event system also. Cairo does not provide this.

We wrote our own custom 2D canvas with Cairo for Ardour, but the event handling has to originate somewhere else (in our case, GDK).


Thanks for the reference, bookmarked! I think this might be too much for current practice goals, as I'd like to practice architecture that would enable me to create somewhat flexible GUIs and then to attempt to optimize it for cache.


I recommend OpenGL for this instead of starting with GUIs. You can make easy small simulations and shoot yourself in the foot with rendering pipelines fast and easily, and OpenGL resources are all over the place. Good luck.


Super! Thanks everyone for your answers. A lot of things to explore!


Yikes. Just use Java or C#. They perform just as well as C++, if not better, since with C++ you've got to do some things a very specific way to make it work. If anything, use the Qt framework. That seems to be the biggest one for C++.



