Show HN: AutoHotkey for Linux (github.com/phil294)
500 points by phil294 on Aug 30, 2022 | 183 comments
Hello HN,

this is the first functional reimplementation of AutoHotkey [1] for Unix-like systems, as far as I am aware. Half the commands are still missing, but everything important is done, as I have worked a lot on it over the past two months. Converting scripts into standalone binaries is also supported. I hope this will find some adoption eventually. :-) This implementation focuses on the v1.0-like classic syntax from 2004 (!), which is a significant subset of the popular current v1.1 syntax on Windows (AHK_L). The reason it does not (yet?) target the full Windows spec is how complex that spec is. Notably, there's also another ongoing project, KeySharp [2], which targets v2.

If you are not aware of what AHK is, it is an easy but capable scripting language for automation and Hotkeys, and all sorts of visual things like GUIs.

If you want to learn more, there is plenty of info in the repo and the HTML docs, and there's an active AHK Discord too. I am personally also checking the forums and HN, of course.

[1] https://www.autohotkey.com/ [2] https://bitbucket.org/mfeemster/keysharp/




You did this? Nice job!

> Please also check out Keysharp, a WIP fork of IronAHK, another complete rewrite of AutoHotkey in C# that tries to be compatible with multiple OSes and support modern, v2-like AHK syntax with much more features than this one. In comparison, AHK_X11 is a lot less ambitious and more compact, and Linux only.

Fascinating, lots to check out here. Thank you.

BTW this kind of work falls into poweruser tools IMO, which is an important area for cultivation of Linux Desktop focus. Traditionally in the Linux community this has been more of a distro-maintainer's support choice (offer poweruser tweaks and tools, or not) so it's always nice to see new distro-independent, non-dev-poweruser options coming into being.


I see the benefit in this, having one script that works everywhere.

It's also good to know linux has autokey (https://github.com/autokey/autokey) for the same result. The name is close, but it's a different project.

It's not compatible with autohotkey, but you can script using python. I used it to enter dates and emails and to automate dynalist.io actions.


Yes for autokey, I've used that to get emacs-like keys everywhere (modified from https://github.com/psederberg/autokey-emacs-config to https://github.com/podiki/autokey-emacs-config/tree/key-twea...)


Autokey is "good enough" but unmaintained and really rough around the edges. I really wish someone picked it up because it's 85% of the way to being a great automation tool.



There's also sxhkd and xbindkeys, for anyone looking for alternatives.


Nice! I used to love AHK to death, incidentally it also got me my first paid programming job.

It's about the only software I dearly missed after switching to Mac, but recently I found Hammerspoon and couldn't have been happier.

Yep, Lua is a bit weird and exotic, but hey, so is AHK. In the end, you trade in some of AHK's terseness for better modularity and compatibility with existing packages.

https://github.com/Hammerspoon/hammerspoon


I’ve been using hammerspoon for several years and it has really become integral to my workflow.

You may want to check out the extension package spacehammer[0]. It includes a bunch of workflows and shortcuts that I’ve found extremely useful.

Interestingly (for me at least), it’s authored in Fennel [1], a lisp that compiles to lua. I actually found spacehammer originally when I was working on converting my personal hammerspoon config to Fennel.

[0] https://github.com/agzam/spacehammer

[1] https://fennel-lang.org/


You might also want to check KeyboardMaestro on Mac. It's honestly the only programme I miss after switching to Linux desktop. 40€ but worth every cent.


Love hammerspoon! It would be nice if it supported Linux as well.


> incidentally it also got me my first paid programming job

How?


AHK is GOLD for me. Power user? Sure. But it ties me to Windows only, and so I'm VERY keen to see a version for another OS. AutoKey failed miserably in my testing, and I have so much tied directly to AHK that I'm okay with choosing Windows to run it. There's a lot of other things that are Windows-only for me, so it's not a big deal. I actually LIKE W10 and W11. But it's great to have options!


I'm interested in hearing what you've built with AHK such that you're so closely tied to it, and by extension windows.


I have plenty of uses:

---

Keyboard layout:

my whole keyboard layout used to be a remapping with AHK :)

not done by me though and the new implementation of the layout is something else (https://www.neo-layout.org/)

---

Complicated Shortcuts:

I still use it for generating and filling site specific mail addresses with a shortcut. My keyboard types "shortcut_site_specific_mailaddress" and ahk replaces it with news.ycombinator.com+is_allowed_to_pass_catch_all@mydomain.com.

For some time it was also the one tool that could dim my dell desktop screen by hotkeys with some shady dll someone provided.

---

Usability:

I also used it for alt+drag for a while (but now have a more native tool with exactly that name).

Capslock lights up if I am not in my native input language

---

Stress Relief:

When mobile testing a lot in the emulator I got slight inflammation from clicking the mouse too much. If that happens I remap the left mouse key to the keyboard.
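The site-specific address trick mentioned above can be sketched in a few lines. This is only an illustration in Python, not the commenter's AHK script: the function name, the way the current site is obtained, and the tag are all assumptions.

```python
from urllib.parse import urlsplit


def site_specific_address(url_or_host: str, tag: str, domain: str) -> str:
    """Build '<host>+<tag>@<domain>' for a catch-all mailbox (hypothetical helper)."""
    # urlsplit only fills .hostname when a '//' authority marker is present,
    # so bare hosts such as 'example.org' are prefixed with '//' first.
    target = url_or_host if "//" in url_or_host else "//" + url_or_host
    host = urlsplit(target).hostname
    if not host:
        raise ValueError(f"no host found in {url_or_host!r}")
    return f"{host}+{tag}@{domain}"
```

A hotkey or hotstring would then type the returned string into the focused field; how the script learns the current site (window title, clipboard, address bar) is left open here.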


Like another commenter I learned to program with AHK. I've used it for automating disparate software that didn't natively talk to each other, automating MS Office tasks via COM objects (think VBA), making use of DLLs for certain tasks, and obviously for text expansion and replacement.

I've always looked for software similar for Linux but nothing comes close.

The documentation is second to none as well.

Anyway I'm really happy to see this, well done I hope it all goes well.


I also compiled a list of most useful AHK scripts https://gourav.io/blog/autohotkey-scripts-windows


I found a lot of inspiration in this list - each compiled into exe's but ship with the source included:

https://www.dcmembers.com/skrommel/downloads/

And here is my script file (10+ years in the making!), some are the same as from your site (like toggle on top) but with minor improvements:

https://gist.github.com/raveren/bac5196d2063665d2154


I learned programming with AHK back in the day.

One neat feature is that you can bundle a bunch of scripts with the runtime into a single file binary for easy distribution. If AutoHotkey runs on multiple operating systems, maybe combining this with cosmopolitan libc into an 'αcτµαlly pδrταblε εxεcµταblε' would let you bundle a script and use it across different OSes. That'd be pretty fun.


Yes it's awesome and works on both platforms. But these are two different, independent runtimes both highly dependent on their target systems, so there can be no cross-os binary. For that, maybe ahkx [1] (abandonware) could be feasible because it builds on Wine, or, much rather, KeySharp (linked above).

[1] https://github.com/tinku99/ahkx


Interesting, I'm curious though, how does one interact with a bunch of scripts that are in a single binary?


Ah I could be a bit more clear, it's more like a script and its dependencies. That said, you can combine a bunch of different utilities into a single script and select between them at runtime (cli, gui, etc) or even have them running in parallel in the same runtime without too much trouble, though you have to be careful since some scripts make cavalier use of the global namespace.
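The "several utilities in one file, selected at runtime" idea is not specific to AHK. As a rough analogy in Python (the utility names and return strings are made up for illustration), a small dispatcher keeps each utility in its own function so nothing leaks into a shared global namespace:

```python
import argparse


def expand_text() -> str:
    # Placeholder for a text-expansion utility.
    return "text expansion active"


def toggle_on_top() -> str:
    # Placeholder for an always-on-top toggle.
    return "toggled always-on-top"


# One entry per bundled utility; state stays local to each function.
UTILITIES = {"expand-text": expand_text, "toggle-on-top": toggle_on_top}


def run(argv: list[str]) -> str:
    """Select one bundled utility by name from the command line arguments."""
    parser = argparse.ArgumentParser(prog="bundle")
    parser.add_argument("utility", choices=sorted(UTILITIES))
    args = parser.parse_args(argv)
    return UTILITIES[args.utility]()
```

Calling `run(["toggle-on-top"])` picks the second utility; an unknown name makes argparse print the valid choices and exit.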


The state of GUI automation and keyboard remapping on Linux is pretty meh. AutoHotkey seems pretty bespoke but it might be the best option right now.


I completely agree. Windows had a drag-and-drop keyboard mapping utility twenty years ago. And Mac OS has Automator and AppleScript, with capabilities Windows and Linux can only dream of, since it requires taking a top-down approach with your language and your compiler and basically forcing people through the narrow gates of Xcode so that all the compiled binaries get their dictionaries for interfacing with the OSA scripting stuff.


Meh. OSX and Windows can dream upon upgrading everything with Unix/Linux utilities.

Half of this crap can be solved with xdotool/wmctrl and some scripts. Everything else is made to be scripted. Cron, package managers, X11 events, notifications...

One guy here posted a smart automation menu sequence to create a jar from a class within Eclipse. Any competent Unix programmer would create that for Scite, Gvim, Emacs or whatever with an external script (man java and man javac, FFS) in seconds.


I have never used AHK before. Can someone please explain to me what _could_ it bring to the Linux world given that scripting is much easier in general?


It's a simple and valid question, but I'm afraid the answer is rather nuanced. Most things you do with AHK are also possible on Linux in general, but in a much more fragmented way. AHK (in its complete form) unifies most functionality of xbindkeys, xdotool, GUI creation (no common tool I am aware of), popups, wmctrl, unix tools like echo/cp/, assistive technology, window manager tools, grabc, and many more, out of which some aren't even possible without sophisticated programming like Hotstrings (auto-replace while typing), ImageSearch, StatusBarGetWait and all those things. It also ships with many common programming language features such as math operations, and brings its own equivalents to unix tools like FileCopy and FileAppend.

Then there's also AHK's ease of use, which also makes it appealing to non-programmers or beginners. Every piece of logic (even loops etc) is basically encapsulated in a single-word command instruction, almost like ASM, but for humans. You create GUIs with `Gui, add, button`, maximize windows with `WinMaximize Firefox`, hotkeys with `a::MsgBox You pressed a` and so on, and all of that with basically no code. I'd recommend you take a short look at the AHK documentation, such as the intro or alphabetical list of commands. It gives you an idea of what you can do with it.

Besides, the entire language is visually oriented (and so is this implementation of mine): You create, edit, run, reload, and (optionally) compile scripts usually not by using terminal, but by context menus. This makes it far more accessible for people migrating from Windows, apart from the fact that you can apply your acquired AHK skills from Windows.

Finally, I have found that the existing Linux automation-specific tools (which mostly boil down to xdotool) are inferior in comparison to Windows AHK. Getting Windows-like automation functionality on Linux is quite a task. The main difference is that very often, you just don't need window or hotkey automation on Linux, because most programs are CLI-first, thankfully.
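To make the "fragmented" point concrete, here is a rough sketch of how a few one-word AHK commands map onto separate Linux tools. It is not how AHK_X11 works internally; the function name and the tiny parser are assumptions, and `Send` is simplified to plain typing without AHK's special key syntax.

```python
BUTTONS = {"left": "1", "middle": "2", "right": "3"}


def to_linux_argv(ahk_line: str) -> list[str]:
    """Translate a tiny subset of AHK commands into argv lists for Linux tools."""
    command, _, rest = ahk_line.strip().partition(" ")
    command = command.rstrip(",").lower()
    rest = rest.strip()
    if command == "winmaximize":
        # wmctrl toggles window manager state on the window matching the title.
        return ["wmctrl", "-r", rest, "-b", "add,maximized_vert,maximized_horz"]
    if command == "msgbox":
        # zenity shows a simple GTK dialog.
        return ["zenity", "--info", f"--text={rest}"]
    if command == "send":
        # xdotool types the text into the focused window.
        return ["xdotool", "type", rest]
    if command == "mouseclick":
        button, x, y = [part.strip() for part in rest.split(",")]
        return ["xdotool", "mousemove", x, y, "click", BUTTONS[button.lower()]]
    raise ValueError(f"unsupported command: {command}")
```

Three different programs already appear for four commands, and hotkeys (`a::...`) would need yet another tool such as xbindkeys or sxhkd.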


Awesome comment.

It shows a failure of the "Unix Way" imo. Everyone hand-rolling their homebrew combination of 20 different tools together is sometimes not the best way to go.

We can look at Docker, Kubernetes, Dropbox(etc), desktop environments, and many other things to see that people greatly value almost the complete opposite approach, within some sweet-spot.


This is exactly right.

The reason that no nontrivial popular program exists written using shell scripts is because the Unix Way of gluing together separate programs just isn't a sustainable software development methodology that leads to nontrivial useful software.


> GUI creation (no common tool I am aware of)

You can create simple GUIs straight from bash too - take a look at zenity (I'm aware it's not the only option BTW)


Here are a few things I've done in AHK on Windows:

1. Move cursor to the toolbar icon of an app, right click, navigate menus, and select an option (all with one keystroke).

2. Click a button in one program. Copy file. Switch to Excel. Switch to the right worksheet. Click a VBA button. Switch to another sheet. All with one keystroke.

3. With one keystroke, set certain settings in Visual Studio (via the menu) and run the debugger.

4. With a keystroke, sign out of Teams/Skype.

5. Launch screenshot program, present me with a dialog box asking for filename, and have it save the screenshot in a directory and provide me an Org-mode link to the saved file.

These are all "easy" in AHK. I never learned it well enough to do more advanced stuff.
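Item 5 translates fairly directly to Linux tools. The following is a sketch under assumptions: `zenity --entry` asks for the name, `scrot` stands in for whatever screenshot program is used, and the helper name and directory are hypothetical. It only builds the commands and the Org-mode link rather than running anything.

```python
from pathlib import Path

# Dialog that prints the entered filename on stdout.
ASK_NAME = ["zenity", "--entry", "--text=Screenshot name:"]


def screenshot_plan(name: str, directory: str) -> tuple[list[str], str]:
    """Return the capture command and the Org-mode link for one screenshot."""
    target = Path(directory) / f"{name}.png"
    capture = ["scrot", str(target)]  # assumed screenshot tool
    link = f"[[file:{target}]]"       # Org-mode file link syntax
    return capture, link
```

A hotkey handler would run `ASK_NAME`, pass the answer to `screenshot_plan`, run the capture command, and put the link on the clipboard.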


AHK's prime utility seems like hacking optimizations on inefficient-for-power-users gui software.

It kind of makes sense then that this was made for and got popular on windows, because that OS is absolutely rife with annoying gui software written for the lowest common denominator of user.

Whenever I infrequently have to use Windows I'm a bit shocked at the number of clicks you need to get anything done.


OK, here's one for you. How would you automate a task in Firefox? Here's something I need to do often (to get around a bug in FF):

1. Press Ctrl P to get to the Print dialog

2. Scroll to "Print Using System Dialog"

3. Click on Page Setup.

4. Select 2 Pages per side.

5. Choose "Letter" for Paper Size

6. Click on "Print"

I can automate this using AHK on Windows. How would you do it on Linux? And how is running Firefox on Linux helping me any more than Windows?


I don't have an answer, but I've a question :P

How do you trigger the behavior? Is it bound to a key or is there a different way to trigger the action?

For example, I'm an emacs user and though I remember a lot of bindings, I also use heavily helm-M-x which provides fuzzy completion on command name.

For example, I may not remember all org-mode shortcuts, but I can do:

M-x org subtree

And I'll get a list of all org-mode commands on subtrees.

(I've never used AHK.)


> 1. Press Ctrl P to get to the Print dialog 2. Scroll to "Print Using System Dialog" 3. Click on Page Setup. 4. Select 2 Pages per side. 5. Choose "Letter" for Paper Size 6. Click on "Print"

How exactly does this work? Are you just sending a bunch of shortcuts? Or does autohotkey really understand when it reached "Print Using System Dialog" and which button has "Page Setup"?


You might just send all the expected commands in sequence and be fine, but that can be unreliable. Autohotkey can see the text contents of many dialogs and other window components, which can be parsed and reacted to by the script. And it has functions for sending mouse clicks and keyboard commands, as well as directly editing controls, activating/minimizing/maximizing windows, etc. Unfortunately, I've found that many of the more modern UI frameworks used in newer software expose fewer of the 'clues' that can be seen. But Autohotkey does have an image matching function that works well in a lot of cases when the traditional way fails.
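The difference between blindly sending keys and waiting for the dialog can be sketched with xdotool, whose `search --sync` blocks until a matching window exists. This is an illustration only: the function name is made up, and the window title and key list for any real dialog are assumptions that would have to be found by inspection.

```python
def guarded_sequence(open_key: str, window_title: str, keys: list[str],
                     delay_ms: int = 150) -> list[list[str]]:
    """Build xdotool commands: open a dialog, wait for it, then send keys."""
    return [
        # Trigger the dialog in the focused application.
        ["xdotool", "key", open_key],
        # Block until a window whose name matches the pattern has appeared.
        ["xdotool", "search", "--sync", "--name", window_title],
        # Send the remaining keys with a pause between keystrokes.
        ["xdotool", "key", "--delay", str(delay_ms), *keys],
    ]
```

Each inner list can be handed to `subprocess.run(argv, check=True)` in order. This still only checks that the window exists, not what it contains, which is the weaker part compared to reading control text as described above.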


> AHK's prime utility seems like hacking optimizations on inefficient-for-power-users gui software.

That's like saying shell scripts' prime utility seems like hacking optimizations on inefficient-for-power-users CLI software. Every power user has always had a demand for automation, independent of the interface, software and OS.

> because that OS is absolutely rife with annoying gui software written for the lowest common denominator of user.

And yet, Linux is even worse in that regard.


There are those pesky extra cmd windows that sometimes show up when one GUI program has to use a subprocess to run cmd.exe to start another GUI program. AutoHotkey was the only, only, only solution I found after trying purported solutions in Visual Basic, PowerShell, C++, Python, etc. Somehow, AutoHotKey's code for finding invisible windows is better than even what's on the MSDN site.


Another solution to the 'flashing console' problem is to wrap the cmd invocation in a VBS script. It allows you to set the flag that hides the console window.


There's also a few RunHidden.exe type tools out there, as it's just a ShellExecute WinAPI call.


You can do similar things with xdotool (actually a lot more complex things) but it's missing a good GUI-based recorder tool.


xdotool does about a quarter of what AutoHotKey does, and even combined with sxhkd, Espanso, Autokey etc. still doesn't match up. xdotool doesn't do hotkeys itself, and doesn't work well being run with hotkeys involving modifiers (yes I'm aware of hacks involving getting xdotool to release modifier keys - I haven't succeeded getting them to work for anything nontrivial). AutoHotKey also does hotstrings (Espanso is the closest Linux equivalent I know, and doesn't work as well IME), app- or window-specific behaviour, and more.


> xdotool does about a quarter of what AutoHotKey does, and even combined with sxhkd, Espanso, Autokey etc. still doesn't match up.

That may be true of AutoHotKey on Windows but we're talking about a Linux port here where some of the features straight up don't work or don't add anything that didn't already exist in Linux.

> xdotool doesn't do hotkeys itself, and doesn't work well being run with hotkeys involving modifiers

It doesn't do those things because it doesn't need to on any version of KDE or Gnome (or probably anything else) since about 1998, possibly earlier though my memory gets foggy beyond that point.

> AutoHotKey also does hotstrings (Espanso is the closest Linux equivalent I know, and doesn't work as well IME)

Espanso and AHK probably suffer the same fate, given this line from the AHK for Linux Documentation:

> Hotstrings don't work in some applications. Not sure if this is fixable (help needed!). The only reliable alternative is using Hotkeys.

> app- or window-specific behaviour, and more.

You can absolutely do app- and window-specific things with xdotool. It merely requires adding some steps to detect the window or process, which is identical to AHK's approach if you're just writing scripts.
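The "detect the window first" step might look like the sketch below. `xdotool getactivewindow getwindowname` prints the title of the focused window; the rule table, the key bindings in it, and the function names are hypothetical.

```python
import subprocess
from typing import Optional


def active_window_title() -> str:
    """Ask xdotool for the title of the currently focused window."""
    result = subprocess.run(
        ["xdotool", "getactivewindow", "getwindowname"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


# (title substring, command to run) pairs; first match wins.
RULES = [
    ("Firefox", ["xdotool", "key", "ctrl+p"]),
    ("Eclipse", ["xdotool", "key", "F12"]),
]


def action_for(title: str, rules=RULES) -> Optional[list[str]]:
    """Pick the command for the window title, or None if no rule applies."""
    for needle, argv in rules:
        if needle.lower() in title.lower():
            return argv
    return None
```

A hotkey daemon such as sxhkd would call a script that runs `action_for(active_window_title())` and executes the result, which is roughly what an AHK `IfWinActive` check does in one place.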

The best thing this port has going for it is that it has a great UI that people are already familiar with and in spite of its shortcomings as a port it is worth checking out.

I'd love if someone spent the time to develop something like this into a script generator for xdotool that could do things like recording mouse movements, etc.

Maybe a wayland-ized version can support hotstrings and the other warty edge cases that don't work on Linux the way they do on Windows.


For developer work, it can automate an entire sequence of GUI annoyances into a hotkey. I'm making this sequence up because I don't recall how it exactly went, but the volume of bizarreness was about right. AHK helped me painlessly build JAR files in Eclipse. I'd hit F12 and:

AHK would pull up file menu, export menu, select java, type jar filename, hit next, select jar file, confirm overwrite, next, next, select seal the jar, yes overwrite, click that one random dialog button that's bizarrely not hotkey selectable, type main class as 'default.java', next, confirm, click finish. Click confirm again despite finish because GUI. And bam, another JAR file ready to test.


Unix IDEs and, of course, any good Unix editor (even SciTE) would make it trivial with a command.


Well, you always get to weigh the effort of rebuilding a toolchain vs working within the existing clunky toolchain. However, I daresay that reimplementing everything that eclipse does for java is no trivial matter and for a day-long project, not worth my effort.

Though not everything can be done with different development tools, and AHK is a versatile Swiss army knife. It can also do things like bind a hotkey to make any window stay on top, send a window resize message to a runaway or disobedient window, or turn hotkeys into window clicks for games. I use it in Stormworks to make my own shortcuts to the edit / wiring / mirror buttons since they're 54 inches apart, and hopefully this all shows where AHK's versatility shines.


It automates your desktop with a mini programming language that can be used to build desktop apps; but it's the text expansion capabilities that are highly prized, working across all desktop and web apps and saving you 100s of hours of typing.


https://github.com/espanso/espanso does just this if it's all you need


For sure it does but AHK has a wider user base


Back when I used Windows and played MMOs during the Vista or 7 days, I used AHK to do all sorts of grind-y tasks in-game. For example, grinding professions in SW:G to try to unlock Jedi (game might have survived if you could just pick Jedi classes from the start, just saying ...) and similar painful mechanics in most MMO(RPGs at least).

I did use PyUserInput/PyMouse to "play" Universal Paperclips one time on Linux. Similar, though not nearly as featured, to AHK.


I wrote this almost a decade ago, and it's still relevant:

https://www.tidbitsfortechs.com/2013/10/using-autohotkey-to-...


Are you on Windows? Check out this sample list of AHK utilities. Download them and give it a try! https://www.dcmembers.com/skrommel/downloads/


I use AHK to script interactions with a GUI application that I support but don't control the source or database. It allows me to script some repetitive tasks in the GUI and trigger them with a defined key sequence.


Just to add to the list of very bespoke hotkeys, I have one to change the "default" text color of e-mail replies in outlook. If I press Win+A and the active window is part of outlook, it will go into File, Options, Mail, Stationery and Fonts, "Font" for replies, and open the color picker

Helpful when I'm trying to send inline responses in an otherwise busy thread with multiple colors... and then I can switch back to my default blue after that with the same keystroke


I'm not sure that scripting is easier.

For automation, xdotool is pretty good. xprop and wmctrl are not so user friendly though.

As for keyboard remapping, sxhkd is great, but finding what keys are called requires xev or something like it. xmodmap is not easy for beginners either.

At the very least you have to know what to look for. I'm not an advocate of AHK, but having all those options in the same place would be good.


I've used it pretty extensively in Windows to expand text, execute clicks on certain buttons in GUIs, execute clicks on specific coordinates on screen or within a window, and to combine all these things into complex macros.


On Windows I use it as a text-expander. @<x><y> sans chevrons expands to my email address, and I have many of these.
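The core of a text expander is small: keep a buffer of what was typed and swap the trigger for its replacement as soon as the buffer ends with it. The sketch below simulates that on a string instead of real keystrokes; the trigger `@hn` and the address are made-up examples, and unlike AHK's default behaviour it fires immediately rather than waiting for an ending character.

```python
def expand(typed: str, hotstrings: dict[str, str]) -> str:
    """Replay characters one at a time, expanding triggers as they complete."""
    out = ""
    for ch in typed:
        out += ch
        for trigger, replacement in hotstrings.items():
            if out.endswith(trigger):
                # "Backspace" over the trigger and type the replacement instead.
                out = out[: -len(trigger)] + replacement
                break
    return out
```

For example, `expand("mail me at @hn please", {"@hn": "me@example.com"})` yields `mail me at me@example.com please`. The hard part on Linux is not this logic but reliably observing and rewriting keystrokes in every application.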


One reason is because there are already a lot of existing AHK scripts out there. This could allow them to be cross-platform.


I like seeing languages like Crystal in the wild! Good job.


I don't have a linux machine handy but I wonder if this can run the excellent Lintalist[0]. I use it daily in my support work and I can't imagine ever getting away from it (also a huge fan of Clipjump[1])

[0]https://lintalist.github.io/ [1]http://clipjump.sourceforge.net/


Likely not, as it requires 1.1.31+[0], and this port supports up to 1.0.

[0]: https://github.com/lintalist/lintalist/blob/master/lintalist...


I've been looking for something that can recognize the mute button in call software (Zoom, Meet, etc) and click it, but everything I've found seems to do image detection.

I understand that running in a browser makes things hard, but is there a better method? I guess I could write an extension for Meet and Jitsi, but Zoom would then be the odd one out.


I know what you are asking is useful on its own, but Zoom supports starting with the microphone muted, or even not joining the audio stream at all. Check out these options: Settings->Audio->Mute my microphone when joining a meeting and Settings->Audio->Automatically join audio by computer when joining a meeting.


On Mac, you can do this with the renowned automation software Keyboard Maestro. The relevant feature is called "Found Image" Actions and documented here https://wiki.keyboardmaestro.com/Found_Image?s[]=image

You can see a real-world example by Jason Snell [1]. There, he used the feature to find the "deactivate account" button in Slack and saved hundreds of clicks.

[1] https://sixcolors.com/post/2022/08/keyboard-maestro-comes-th...


Not Linux, and not exactly what you're asking for, but Windows PowerToys has VideoConference mute which seems to install an audio driver that can globally mute. Not sure if something like this exists for Linux.

https://docs.microsoft.com/en-us/windows/powertoys/video-con...


It's easy to globally mute on Linux, but that doesn't show the "you are muted" UI elements, and makes the UX a bit more confusing. I wanted something more explicit, which is hard.


Does image detection with "always show meeting controls" not work?


It does, but at 5 fps and unacceptable CPU usage, sadly.


…you only have to do detection when required though (after the hotkey is pressed). Curious about the architecture of the 5fps solution.


I need to do it constantly, as I want to update the status if I mute/unmute manually. I want to make a physical mute button with a LED, so that's why I need that.


I believe Autohotkey's image detection has a parameter to select the relative range inside the application window to actually perform a search. If you constrain it to a smaller area (because you know that the UI only appears in that area) this can greatly improve the performance versus searching the entire window.

Edit: I should mention I have no idea if this works on the Linux version of AutoHotkey.
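Why a smaller search area helps can be shown with a naive exact-match template search. This is a toy in pure Python over lists of pixel values, not AutoHotkey's actual algorithm; the function name and the `(left, top, right, bottom)` region convention (right and bottom exclusive) are assumptions.

```python
def find_template(image, template, region=None):
    """Return ((x, y), positions_checked) for the first exact match, or (None, n)."""
    th, tw = len(template), len(template[0])
    left, top, right, bottom = region or (0, 0, len(image[0]), len(image))
    checked = 0
    for y in range(top, bottom - th + 1):
        for x in range(left, right - tw + 1):
            checked += 1
            # Compare the template row by row against the image slice.
            if all(image[y + dy][x:x + tw] == template[dy] for dy in range(th)):
                return (x, y), checked
    return None, checked


# 6x6 blank image with a 2x2 "button" at x=4, y=3.
image = [[0] * 6 for _ in range(6)]
image[3][4], image[3][5] = 1, 2
image[4][4], image[4][5] = 3, 4
button = [[1, 2], [3, 4]]
```

Searching the whole image checks 20 positions before the hit, while restricting to `region=(3, 2, 6, 6)` checks 4. On a full screen polled several times per second, the same ratio is what separates acceptable from unacceptable CPU usage.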


Nut.js


aka “nut.js is a desktop automation framework for Node.js that allows you to program your mouse and keyboard with JavaScript or TypeScript”

https://nutjs.dev/


I know JavaScript pretty well too, but I prefer to write local automation in Bash. Why do you need such a twisted solution?


Perhaps this is the wrong place to ask, but it seems relevant. Having skimmed the documentation I can't see any options around automating selection/clicking of dialogue box buttons?

I have a user who has limited dexterity, and the ability to have a macro that selects/clicks on (modal?) dialogue box buttons based on their label (OK/Cancel etc.) would be a life saver for them. (A pre-determined set of keystrokes based on knowledge of the tab order would not work - this needs to be generic, based on the button text for my application.)

As things stand, I'm looking at hacking a video capture device and openCV together, but I can't help but think this must be a solved problem and I just have poor google-fu?


Have you tried Sikuli? It uses OpenCV in the way you want.

http://sikulix.com/


How fast is it? I tried to do something similar myself but it pegged a CPU core for 5 fps.


> I can't see any options around automating selection/clicking of dialogue box buttons?

I can only give you some technical details for ahk_x11 here. It would be ControlClick I think, here [1] is the Windows docs for it. Not present on ahk_x11 right now, I plainly don't know if it is possible yet [2]. If it is, I definitely want to have it. Other than that, I can only think of clicking on fixed coordinates, such as `MouseClick, left, 200, 300`. I haven't done `CoordMode` yet which would be rather important for that, out of sheer prioritization. For a generic solution, there's also ImageSearch [3] for more recent Windows AHK versions which I'd like to have at some point too. Some OCR command would be cool as well.

This will all take a while though, if nobody else joins in.

Thanks for your viewpoints though, I agree with the sibling comment that these insights are valuable.

[1] https://www.autohotkey.com/docs/commands/ControlClick.htm [2] https://github.com/phil294/AHK_X11/issues/3 [3] https://www.autohotkey.com/docs/commands/ImageSearch.htm


Out of curiosity is there a reason you're not just biting the bullet and going with a more accessibility-focused proprietary OS? Seems like a big compromise for supporting FOSS, so I'm sure there's more to it. Always interested in hearing accessibility-involved use cases as an interface designer.


By "a more accessibility-focused proprietary OS", do you mean MS Windows? Or MacOS? Or both? So far, the strategy of controlling X-windows apps via piecemeal scripts seems to be paying off OK. The ideal would be full voice control of all UI interactions at an application level. Free form text entry in, say, an editor is easier to deal with. It is handled by injection of keyboard events from third party speech recognition tools that allow for correction prior to final confirmation if required.

If PowerShell or Applescript have significantly better capabilities in this arena (or if there is an alternative tool I should be looking at, or other resources that might be useful in the quest), it would be great to hear. At the end of the day though, the end user is a dev, and wants a Linux desktop.


Pretty sure Mac has some similar stuff, but this is the Windows 10/11 built-in voice command functionality.

https://support.microsoft.com/en-us/windows/windows-speech-r...

I imagine someone could use their computer almost entirely through that save things that require precise mouse usage like photo editing or video games. Even then, you can achieve rough mouse usage through their on-screen grid, so you can do some things.


I know that Windows is generally the choice for blind users because of the well-worn accessibility framework access.


You could probably do it pretty easily for some dialogs, i.e. those drawn by some framework or other. But good luck doing it for all the other frameworks too. Your thought is probably the most effective...


> this must be a solved problem

Yes, this is called a11y (accessibility). Install at-spi, run xwininfo, dogtail/sniff, accerciser, qdbusviewer to find dialogues and buttons, use any of the desktop automation tools to press them.

> buttons based on their label (OK/Cancel etc.)

That only works for English. Automate based on the type, not the label.


Thank you for the a11y tip, that's really helpful!

The point around internationalisation is well made. To be clear, I'm looking to implement a generic solution that potentially works for any application on the desktop. I do not have the source code for the underlying applications, so I'm not sure how/if I can discover the button type 'externally'? A config file per-application would be acceptable, though, and could address the language related issues, especially as they could then potentially be crowd-sourced.
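The per-application config idea could look something like this sketch. It assumes the accessibility tree has already been read into `(role, label)` pairs (for instance via AT-SPI, where buttons report the role name "push button"); the app name, the label sets, and the function are hypothetical.

```python
from typing import Optional

# Hypothetical crowd-sourceable config: per app, per intent, the known labels.
CONFIG = {
    "example-editor": {
        "confirm": {"OK", "Ja", "Oui"},
        "cancel": {"Cancel", "Abbrechen", "Annuler"},
    },
}


def pick_button(app: str, intent: str, widgets: list[tuple[str, str]]) -> Optional[int]:
    """Return the index of the first button whose label matches the intent."""
    labels = CONFIG.get(app, {}).get(intent, set())
    for index, (role, label) in enumerate(widgets):
        # Filter on the widget type first, then on the localized label.
        if role == "push button" and label in labels:
            return index
    return None
```

Matching on role first avoids clicking a text label that merely reads "OK", and the label sets carry the translations that a type-only match cannot distinguish.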


I used to work as a photographer at a car dealership. I was tasked with taking pictures of all the cars that came to the lot, and uploading them to our web site. I would take pictures in the afternoon, taking 1-2 hours. In the morning I would clock in and then go to a coffee shop for a few hours to let AHK do all the heavy lifting of uploading the images to the 3rd party clunker of a back end website. I would have definitely installed and used Linux if this was around then.

Awesome work.


Good work

I have used AHK a lot and looked for a Linux alternative in the past, like AutoKey, Actiona... which did not feel as good

But nowadays I think I prefer the "Linux way" tools, the bulk can be done with: sxhkd, bash, zenity, xdotool, espanso...

AHK is all-in-one personal automation tool, very useful in Windows, a single tool with a unified language

Linux has a CLI tool for everything, more dispersed, different implementations, but easier to re-use modules elsewhere

Still choice is good


I'd still like my customisations to be cross-platform, so I'm watching this project with great interest.


Shameless plug https://atbswp.com not as powerful, but good enough for quick hacks.


Oh, this is sweet. Back in my Windows days I relied heavily on AHK for hotkeys as well as some glue for various tasks, and while I've managed to figure out most of the glue items on Linux (which I've used for the past five years or so), I've always missed hotkeys, for which I've never found a really good AHK-like solution.

We are truly living in wondrous times.


In case you're looking for QMK[1]-like capabilities for regular non-programmable keyboards - check out kmonad[2]

[1]: https://github.com/qmk/qmk_firmware

[2]: https://github.com/kmonad/kmonad


Thank you! I haven't tried it yet, but this was one of the only things that I sorely missed when I went from Windows to Linux.


Does it work with Wayland?


No


Which is coincidentally the same answer as every other "does X (heh) work with Wayland" question


I disagree. There are plenty of programs that run just fine on Wayland.

I acknowledge the effort that OP has put into this project. And if the OP plans to use X11 for the foreseeable future, then it makes sense to target X11. But for any new project with a wide audience Wayland is a much more reasonable target.


There is no way to "target wayland" for a tool like this. For security reasons pretty much everything this tool does is blocked on wayland. You could perhaps make a version for sway and other wlroots-using desktops that works mostly like the current tool. For GNOME you might be able to get away with rewriting it in JS as a shell extension, no idea for KDE.


IMO, a position of "LOL, it works just fine on Wayland, just write it from scratch in six different programming languages to cover 80-90% of DEs/WMs out there" is not very developer (or for that matter, user) friendly. At least X11 is a single interface to target.


What's interesting is, very often, "for security reasons" is a complete BS kludge to say "we don't want to implement this." I normally wouldn't expect that sort of thing from Linux folk...perhaps until now.


it's a variant of "for the children ..."


yeah. this is as far as you get: https://github.com/ReimuNotMoe/ydotool.

begs the question. how come ahk can run on windows(tm) but not wayland? my understanding is that win10/11 are pretty secure.


I literally just recently had to switch a brand-new Ubuntu installation to X11 because Synergy/Barrier wasn't working in Wayland mode [0]. Until Wayland gets its head in the game, I'm staying on X11 for as long as possible.

[0]: https://github.com/symless/synergy-core/issues/4090


Not true. OBS in particular, AFAIK, works on Wayland.


:D


Nice.

I installed autohotkey on windows because I wanted to duplicate subcommands that I use with linux for ages (xmonad has it more or less built in)

    meta m n # next track
    meta m space # play/pause
It was relatively easy to implement after I got help in r/autohotkey
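For anyone wondering how a prefix-style binding like `meta m n` works under the hood, here is a minimal sketch in Python. It is not the AHK script from that subreddit and not how xmonad does it; the key names, the `CHORDS` table and the `ChordMatcher` class are all made up for illustration, and feeding in real key events is left out.

```python
# Hypothetical chord table: the prefix key (Meta+m) arms the matcher,
# and the next key selects the action.
CHORDS = {
    ("meta+m", "n"): "next_track",
    ("meta+m", "space"): "play_pause",
}


class ChordMatcher:
    """Tiny state machine for two-step key chords (illustrative only)."""

    def __init__(self, chords):
        self.chords = chords
        self.prefixes = {prefix for prefix, _ in chords}
        self.pending = None  # prefix that was pressed and is awaiting a second key

    def feed(self, key):
        """Return an action name when a chord completes, otherwise None."""
        if self.pending is None:
            if key in self.prefixes:
                self.pending = key
            return None
        # A prefix is pending: consume it, whether or not the chord is known.
        prefix, self.pending = self.pending, None
        return self.chords.get((prefix, key))
```

Usage would be something like `matcher = ChordMatcher(CHORDS)` and then calling `matcher.feed(key)` for every key press; an unknown second key simply cancels the chord.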


you can implement the wm manager of your dreams in ahk ... in like 500 lines. it's amazing stuff.

you can also go all out: https://github.com/fuhsjr00/bug.n


Does AutoHotkey have a keyboard and mouse recording function? When I first learned about scripting 20 years ago, Winbatch had that function, which turned a non-programmer who just wanted to speed up repetitive tasks into an entry-level programmer after a few weeks of use.


Does this support converting new keys into new modifiers? IIRC AHK allows you to do things like:

  - If CapsLock is pressed, treat it as a new modifier.
  - If CapsLock is released, send "ESCAPE".
If so, that would be swell.
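To make the question concrete, the logic could be sketched like this in Python. This is only the decision logic, not AHK_X11 or xcape code; the `TapOrHold` class and the `Caps+<key>` output naming are hypothetical, and I am assuming ESCAPE should only be sent when CapsLock is released without having been used as a modifier.

```python
class TapOrHold:
    """Treat CapsLock as a modifier while held, and as Escape when tapped."""

    def __init__(self, tap_output="Escape"):
        self.tap_output = tap_output
        self.held = False              # CapsLock is currently down
        self.used_as_modifier = False  # another key was pressed while it was down

    def on_event(self, key, pressed):
        """Return the list of outputs to emit for one key event."""
        if key == "CapsLock":
            if pressed:
                self.held = True
                self.used_as_modifier = False
                return []
            self.held = False
            # Assumption: only a "clean" tap produces Escape.
            return [] if self.used_as_modifier else [self.tap_output]
        if pressed and self.held:
            self.used_as_modifier = True
            return [f"Caps+{key}"]
        # Other keys pass through on press; releases emit nothing in this sketch.
        return [key] if pressed else []
```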


you can do this with devinput alone, so for sure. see ydotool, xcape, or it's not hard to write your own.


Wonderful news. Now we have hope of convincing Tom Scott to join the winning team.


For those unaware, here is the video: https://yewtu.be/watch?v=lIFE7h3m40U

I wonder if it is actually possible to just "connect the flumberboozle to the GKX virtual port"


Cross platform emoji keyboard!


The one thing I miss from Windows is AHK. Thanks for posting this!


FYI - this gets close, but not quite.


It looks like the calls needed to control a GUI application are not yet implemented. Does anyone know of a Linux tool that can do GUI automation? E.g. find window A, click menu B, click menu item C…


Sikuli works on Linux - http://sikulix.com/



https://github.com/hofstadter-io/self-driving-desktop

cross platform self-driving mouse & keyboard, can also record


One of the only things I truly missed under Linux. Much can be solved using various tools such as AutoKey, XbindKeys+Guile etc. but nothing comes close to what AHK was capable of under Windows.


Nice, I really missed AHK when I moved away from Windows.


This is so great! I have fond memories of AutoHotkey when I used Windows. It has just the right balance between scripting requirements and features.


AHK is excellent; created a fishing bot during the hardcore WoW raiding days (Stonescale eel for Flask of Titans, anyone?)


Does anyone know of a stable way to implement SpaceFN under X?

Apologies for the hijack, but this thread feels like the right place to ask.


xcape.


AHK is one of the things I miss from my Windows days (along with Macrium Reflect).


Advantages over xbindkeys?


I don't use xbindkeys so maybe my impression is wrong, but AHK is far more powerful. You can incorporate state into your scripts, or read things from the screen and act conditionally on it (though the repo says WindowSpy is unsupported. I think that's what allowed screen-reading).

You can compile it into a binary with no dependencies, so you can easily bring your script to other computers or share it. You can create a GUI to run commands rather than binding them to key patterns.

I used to use it in World of Warcraft to do things like have my character's entire skill rotation on one key (e.g. just keep pressing "1" and AHK will handle pressing the next "real" key for the sequence. Hit another key and I can reset the sequence early). And to automatically target myself and cast a heal when my health goes below 50%, which was done by polling a certain pixel on the screen for a specific value of red.
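Roughly, the two ideas look like this as a Python sketch. It is not the original script (that was AHK); the `Rotation` class, the `needs_heal` function, the key names and the colour thresholds are all hypothetical, and actually reading the pixel and sending keys is left out. I am assuming the polled pixel sits at the 50% mark of a red health bar, so it stops being red once health drops below that.

```python
class Rotation:
    """One trigger key steps through a fixed sequence of 'real' keys."""

    def __init__(self, keys):
        self.keys = list(keys)
        self.index = 0

    def next_key(self):
        """Return the key to send for this press and advance the sequence."""
        key = self.keys[self.index]
        self.index = (self.index + 1) % len(self.keys)
        return key

    def reset(self):
        """Restart the sequence early (bound to a second key)."""
        self.index = 0


def needs_heal(pixel_rgb, bar_red=200, tolerance=30):
    """True when the polled pixel no longer shows the health bar's red.

    bar_red and tolerance are made-up values for illustration.
    """
    red, _green, _blue = pixel_rgb
    return abs(red - bar_red) > tolerance
```

A polling loop would call `needs_heal(read_pixel())` every so often, where `read_pixel` is whatever screen-reading call your tool provides.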


Absolute level of awesome ( even though some companies do clamp down hard on features like these for a variety of reasons ). Have you tried offering those scripts to players? I am more than certain that most MMOs would have loved to use them.


Sorry :p this was way-way back in vanilla days, before I took backing up my code seriously.

I actually had a way more advanced, but very buggy script I used for "multi boxing", which is when you pay for multiple accounts and play them simultaneously. I used WASD, TFGH, IJKL, for moving 3 characters simultaneously, the numpad for targeting, and ctrl/shift/alt + number-row for casting spells on any of the windows. So ctrl+2 would cast the warrior 2nd ability, shift+alt+4 would cast the druid and mage 4th ability, ctrl+shift+alt+2 would cast all character's 2nd ability, etc.
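The modifier routing part can be sketched in a few lines of Python. This is a reconstruction for illustration, not the lost script: the `TARGETS` table and `route` function are hypothetical, and which of shift/alt went to the druid versus the mage is a guess.

```python
# Assumed mapping from modifier key to the character window it addresses.
TARGETS = {"ctrl": "warrior", "shift": "druid", "alt": "mage"}


def route(modifiers, slot):
    """Return (character, ability slot) pairs for one hotkey press."""
    return [
        (TARGETS[mod], slot)
        for mod in ("ctrl", "shift", "alt")  # fixed order keeps output stable
        if mod in modifiers
    ]
```

So `route({"shift", "alt"}, 4)` addresses the druid and mage windows, and holding all three modifiers addresses every window.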

Blizzard used to be a lot more lax about macros[0], and they liked multiboxing because it was more money. I had a GM message me once to confirm I was playing all 3 and not afk botting. We chatted about AHK and how he wanted to learn to code.

[0] also the tools to detect macros were way worse back then, it might have been that they just accepted it rather than actually embracing it.


Blizzard does ban for autohotkey, and even stuff like gaming keyboard macros.


I'm wondering, why introduce a new scripting language; wouldn't it make more sense to release it as e.g. a Python library? Or something that you can script with Bash?


Well this was back in the Python 2.4 days, I don't think I had even heard of it. And bash isn't easily available on Windows, at least interacting with running win32 programs. Back then it was even worse, I think I used mingw32 when I needed that? Blizzard did offer some sort of official macro system using Lua, but that was a bit advanced for me at the time.

A dedicated language is nice because the syntax can be optimized for its purpose. As a teen with little coding experience, AHK was so easy to get simple ideas working. Often one easy line of code.

Some of the bigger monstrosities I wrote would have definitely been more manageable in another language, but then I'd have to handle always-active input listening, window management, emulating the keyboard and cursor, reading screen pixels, my own state machine for key chording, and so on. Or cobble together libraries (and still learn those).

Also probably the most important part, it was fun to work with and immediate. I wasn't a professional (or even good) programmer. But I had a goal in mind and AHK never made me want to quit.


well done. don't be so hard on yourself. ahk is amazing stuff.


AHK is fairly well established in its niche, so not a new language by any stretch. That's even before you get to the complexity of Python or Bash vs a dedicated DSL.


Nice work! And thanks for the x_do.cr shoutout :-)


Anyone know of a comparable program for macOS?


BetterTouchTool has a lot of overlap with AHK I believe. Lets you remap keyboard/mouse actions globally and/or per application and perform actions or run scripts on keyboard/mouse input and lots of features of that kind.

(Never used AHK myself but the couple times I looked for something similar to BetterTouchTool on Windows I got recommended AHK)


As someone who has used both, that's not entirely off-base, though the details are very different: AHK is better at lower-level keyboard remapping and related I/O — you really need to combine BTT with something like Karabiner-Elements for more than rough equivalence here — but BTT is much better than AHK for out-of-box support for non-keyboard event triggers, with native support for mouse button-press events[1], trackpad and Magic Mouse gestures, touch bar, Siri remote, MIDI messages, hot corners, BT LE proximity events, all manner of system events (battery status, lock/unlock, Apple Events, distributed notifications, etc.), scheduled and recurring timer events, and the list goes on.

[1] Including support for all buttons on n-button mice for all values of n supported by USB standards, something that, last I checked, Windows itself was incapable of handling at the driver level without dropping down to low-level HID event processing that's a bit of a stretch, though not technically impossible, for AHK.

This, incidentally, is why many-buttoned Logitech gaming mice come preconfigured to send keyboard events for all but the first few buttons.


Keyboard Maestro? The built-in AppleScript, Services and Keyboard shortcuts support?


I absolutely love hammerspoon for MacOS automation: https://www.hammerspoon.org/



I think the reason something like this doesn't exist for Linux is that most Linux users prefer to use a more powerful language to create their own scripts instead.


For Linux I used .xbindkeysrc and xdotool


Wayland support when ~Civvie~?


That would need to be implemented by the Compositor. Wayland does not support this kind of functionality.


[deleted]


Isn’t it a bit late to do something like this tied to X? I get the impression almost all distributions now default to Wayland. This feels like something that would have been good a decade ago, but is too late now.

(Mind you, this would be a good deal harder to implement in Wayland, where possible at all, since Wayland is predicated around a security model where this kind of thing is deliberately not supported. You can either go with something compositor-specific, and I don’t know if this is possible for all compositors, though I get the impression that it should be possible for at least wlroots; or work more the way ydotool does, where possible, in basically providing a fake input device.)


I mean, as you note, Wayland almost certainly can't provide this kind of functionality by design. It offers some ways to fake input, but AFAIK no way to hook/listen to input events without hoping that each compositor adds a way to do so. In contrast, X11 trivially enables read and write access to input events, as well as the ability to intercept input and rewrite it before applications see it. Also, while many distros now ship Wayland as the nominal default, last I'd seen it was actually used by a minority of users compared to Xorg (if we exclude ChromeOS, which seems reasonable for this conversation).


> Also, while many distros now ship Wayland as the nominal default, last I'd seen it was actually used by a minority of users compared to Xorg

You touched on a good point. The one and only metric to measure success in software is the number of active users. However, I do not know of any statistics on Xorg vs. Wayland users. Personally, I have no reason to switch away from X11. On the contrary, it has been a loyal companion for decades. I trust the Lindy effect that X11 will stay relevant for a very long time. So kudos to the OP for creating this!


So you are not using fractional scaling which is reason enough to switch to Wayland.


This is a Gnome issue. It works in KDE (and there are patches to make it work in Gnome too)

https://wiki.archlinux.org/title/HiDPI#Xorg


Different people care about different features. If you want fractional scaling and can't get it on X11, use Wayland. If you want AHK, use X11. Unfortunately, neither option currently has all the features of the other.


Wayland has had very little traction outside of RH and other corp-heavy groups. It didn't work for too long, now it works and still requires an X server to do anything useful while having fewer applications, worse features, higher resource usage in the available systems. I haven't seen anything to motivate me to try to switch since the "new shiny" wore off a decade or so ago.


I used i3 on older laptops, and decided to try switching to Sway when I got my current laptop 16 months ago. I’ve switched back to i3 on it on a few occasions for screen sharing in Zoom, and wow, I’d forgotten how bad the tearing is, and how not-great the management of inputs and outputs (though this is unlikely to affect full desktop environments like GNOME and KDE), and how annoying it is to not be able to use my XF86AudioMicMute key because its key code is 256 (it was a pain to get working in Sway, requiring xkb_file with a manually-tweaked keymap, but as far as I can tell it’s simply not possible under Xorg which has a hard limit of 255). Plus scaling simply doesn’t work at all well in X/i3, to say nothing of flat-out not supporting mixed scaling, whereas it all just works in Sway (the only apps I have to tweak are ones run via Xwayland, which sometimes but not always get it right out of the box).
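The 256 comes from simple arithmetic, sketched below in Python. The constants are assumptions taken from the Linux input headers and the usual evdev-to-X11 convention (kernel `KEY_MICMUTE` is 248, X keycodes are kernel codes plus 8 and must fit in 8 bits); the `x11_keycode` helper is made up for illustration.

```python
EVDEV_TO_X11_OFFSET = 8   # assumed: X keycode = kernel keycode + 8
X11_MAX_KEYCODE = 255     # core X protocol keycodes are a single byte

KEY_MUTE = 113     # kernel code, assumed from linux/input-event-codes.h
KEY_MICMUTE = 248  # kernel code, assumed from linux/input-event-codes.h


def x11_keycode(evdev_code):
    """Return the X11 keycode for a kernel keycode, or None if it cannot fit."""
    code = evdev_code + EVDEV_TO_X11_OFFSET
    return code if code <= X11_MAX_KEYCODE else None
```

Under those assumptions `x11_keycode(KEY_MICMUTE)` is `None` because 248 + 8 = 256, while an ordinary key such as `KEY_MUTE` still maps fine.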


> I haven't seen anything to motivate me to try to switch since the "new shiny" wore off a decade or so ago.

I get some terrible screen tearing on X, especially when using OBS. On Wayland it's solid. With X/Wayland the best thing it can do is be invisible.


I haven't seen screen tearing anywhere in more than a decade, so that doesn't have much weight with me, but I can see how it could be an issue for video editing (if OBS is what Google suggests it is). Now I'm interested in why X and Wayland would be different - screen tearing seems like it would be at a lower level (mesa?) than the display server? I don't understand how any of a modern graphics stack actually works though.


Same, I've wanted to dig into the modern linux graphics stack for a while now but never seem to have time for it. So much complexity these days


There is some rough functionality for operations with wlroots via wlrctl https://git.sr.ht/~brocellous/wlrctl and more generically (via uinput) ydotool https://github.com/ReimuNotMoe/ydotool


>Isn’t it a bit late to do something like this tied to X?

At the moment, Wayland is used only by a minority of users, and most are still on X11. Sure, distros may often default to Wayland, yet almost no distro has switched the user's default upon upgrading.

Most users aren't new users or installing on new systems, and even for new users X is occasionally still default (e.g. every existing BSD, some KDE distros).

I estimate it's another 5 years before another new computer cycle and Wayland maturing will make X less relevant. 15-25-more years before we could even consider removing support.


Wayland will be an interesting challenge, but I think there's still plenty of time before that happens.

If we figure out that this kind of automation is impossible on Wayland, I think it should still be possible to hook into the raw keyboard events with kbd and such, no? Requires root though.



Reading that thread is infuriating and still full of the ridiculous idea that every single use case should be defined through specific protocols


Love the project but I really wish projects (especially linux projects) would include a "plain english" description of what this does at the top. Like, even one sentence.

1. Not everyone knows what AutoHotkey is and what it does. I had to google it.

2. Not everyone speaks English. Diving into a description that includes words like "fault tolerant, extensible, high availability" is so infuriatingly confusing.


I don't think that (2) is OP's fault.

Look the words up in a dictionary. It's ok not to know and to learn but you shouldn't expect everyone to adapt to your lack of knowledge.


A dictionary probably won't give you a useful definition for most of those terms. They're jargon.

But then again I would add (3) they're just buzzwords that don't actually tell you anything...


Perhaps not HA, but 'fault tolerant' and 'extensible' are pretty apt.


....explain it like I'm 5 years old?


If everyone wrote software to be used by 5-year-olds without any interest in looking up things they don't understand (yet) themselves, we'd never move the field forward.


AutoHotKey is not normally aimed at developers but highly skilled administrators. If you target your advertising at this market you would likely get more traction.

Moving the field of software development is about solving more problems for people without putting a large learning curve in their way.

Yes some aspects will always require expertise but that is not an advantage.


Dictionary is good for learning words.


1. Thank you for the suggestion. I added the one-liner description from the Windows website to the top of the Readme.

2. Those words aren't in the Readme so I'm a bit confused. I guess these were just examples though. I am not a native speaker either though... if you really think there's uncommon words to be found, I would welcome a PR.


  > I added the one-liner description from the Windows website.
Bad idea, you don't need an infringement case. Try this instead: "An easy to learn language for automating your Linux computer". People who don't know what AHK is don't care that this project is a port of it, so no need to mention that in the description.


tbh your suggestion would confuse me as well, i think the scope or better said usp of ahk is more specific, and the proposed description could very well mean bash or other comparable tools


#2 is pretty common software lingo. If you’re esl you definitely are going to want to look that one up.


Pretty ironic to write "esl" instead of the expanded form.


what's ironic about it?


If someone "is ESL", it's less likely that they will know what "ESL" means, so it's another layer for them to parse.


That doesn't really fit the definition of irony...


Oh no. I guess it's my ESL showing.


Bit of a tangent/slightly off topic:

how do HNers make Windows not evil/terrible? Like, it tries to force you into creating a Microsoft account to install Windows, I hear it has ads in the OS, forces reboots when you don't want them, and appears to generally treat its users with contempt and as a resource to exploit.

Do you need an MSDN subscription to get LTSC? Any third-party tools that try to fix the horror show?

Such a shame to have a solid platform run for evil motives.


- chocolatey for installing apps (most of what I need anyway)

- WSL for command line stuff, python, rust

- bleachbit for clean up of files, be careful and pay attention obviously

- Immediately switch over to firefox (from Edge) and have brave installed as a backup

- I just don't use the microsoft store at all unless I need to

- malwarebytes for nightly scans, avira free as a virus program instead of defender

- Shutup10++ to turn off most of the privacy invasion antifeatures of windows

- Make notepad++ or gvim default editor

- You can delay or turn off updates in the settings. I just live with the reboots though. I have a fast machine and most programs will have session save features anyway.

That's mostly enough to make it a decent environment.


You can now use winget to install ms store tools via CLI.


I've used winget to fetch a couple of things (because of windows support pages). I still use choco as my default though. Unless someone shows a huge advantage for winget then I think I'll continue on. I like using programs that aren't under Microsoft's umbrella if I can. Much the same way as avoiding google and amazon.


When it comes to Windows 10, I basically "pave over" as much of the shell as possible. This includes replacing the Start menu with Classic Shell/Open Shell, potentially replacing the taskbar with RetroBar [0], erasing as many UWP apps (Cortana and Edge included) as possible from the system, basically anything goes.

For things that are relatively useful to keep like Windows Explorer, I try the numerous registry tweaks that remove features I don't like (for example, OneDrive integration). Couple this with Tron and similar scripts.

Installing an LTSC edition would obviously be nice, but it's hard and expensive to get (see [1] - the general consensus seems to be that you need to go through a VAR, and then buy an LTSC license alongside some cheap shopping basket fillers to get above the minimum number of licenses limit). When it's available, it makes the above "paving over" process much easier than a retail/pro edition.

[0]: https://github.com/dremin/RetroBar

[1]: https://answers.microsoft.com/en-us/windows/forum/all/how-do...




