An intro to finding things in Linux (madebygps.com)
178 points by mooreds on Nov 23, 2021 | hide | past | favorite | 82 comments


    find /home -type f -name test.*
This will silently fail if you happen to have one file that matches test.* (like test.txt) in the current working directory, because test.* will be replaced by the name of this file and that's what find will see.

This will fail in zsh if no files match in the current working directory, because globbing will fail.

You need to quote this to avoid unintuitive results, in any shell.

    find /home -type f -name 'test.*'
Not doing this will probably work most of the time, but it will probably confuse you the one time it won't and drive you crazy if you don't realize what is going on.

I've become very aware of this kind of thing by using zsh, which is stricter with failing globs than bash. Also quote anything that contains {, }, [, ], (, ), ?, ! or $ for similar reasons. Beware of ~ too. And & or | obviously, and also ;.

Do yourself a favor: quote the hell out of everything in shells that's not a simple alphanumeric string or option name. Even if it's only alphanumeric, actually, if it is a parameter value, quote it. This way, when you edit your command and somehow add a special character, you are already covered. It also makes your values stand out, arguably making your command easier to read because it looks more uniform.

edit: and quote with single quotes, unless you need variable expansion or need to quote single quotes themselves (but be extra careful then)
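A quick way to see the difference is a throwaway directory (all paths below are scratch locations, not anything on your system):

```shell
# Demonstrate quoted vs. unquoted -name patterns in a scratch directory.
set -eu
dir=$(mktemp -d)
mkdir -p "$dir/sub"
touch "$dir/test.txt" "$dir/sub/test.log"
cd "$dir"
# Unquoted: the shell expands test.* against the current directory first,
# so find only ever sees the literal name "test.txt" and misses sub/test.log.
find . -type f -name test.*
# Quoted: find itself receives the pattern and matches in every directory.
find . -type f -name 'test.*'
```

The unquoted version prints only ./test.txt; the quoted version finds both files.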


> Do yourself a favor: quote the hell out of everything in shells that's not a simple alphanumeric string or option name.

This should be the first sentence in any shell manual. It will bite you sooner or later.


This is because glob expansion happens before the program runs.

> The order of expansions is: brace expansion; tilde expansion, parameter and variable expansion, arithmetic expansion, and command substitution (done in a left-to-right fashion); word splitting; and filename expansion.

Straight from the manual

https://www.gnu.org/software/bash/manual/html_node/Shell-Exp....
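A one-liner makes this visible (scratch directory, made-up filenames): the program only ever receives the expanded result, never the glob itself.

```shell
# The glob is expanded by the shell before printf runs; printf receives
# the filenames as separate arguments and never sees the '*' itself.
cd "$(mktemp -d)"
touch a.txt b.txt
printf 'arg: %s\n' *.txt
```

Here printf gets two separate arguments, one per matching file.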


shellcheck warns about constructs like these, and many more.


Absolutely.

One doesn't use shellcheck when using the shell interactively though, so it's still a good habit to develop.


Globs in an interactive shell should get expanded and then force you to press enter to confirm the command with expanded globs. Are there any shells that implement this?


In zsh, you can press tab after something that will get expanded, and it expands in place. I actually use it quite often to check whether the expansion is right.

However, this does not cover the case where an unexpected globbing happens like in this find situation.


That would become unreadable in many cases, like du -sh *


I looked through zshoptions(1) and didn’t see anything like this. However you can turn off globbing entirely with “unsetopt glob”. There are however other expansions that occur outside of globbing, such as parameter substitution.

It might be possible to implement this using zsh’s line editor, but it would take some digging through the man pages to figure this out (or finding a zsh expert). Try zshzle(1).

Yeah this is probably a much more complicated answer than you were looking for.


What you really should do is stick

    shopt -s failglob

at the top of your shell scripts (and in ~/.bashrc), so that a glob that doesn't expand is always an error. That way, the unquoted test.* won't work by accident, and it's easier to train yourself to always quote.

Or, just use zsh, which has this behavior by default.

Anyway here are other shell scripting tips: https://sipb.mit.edu/doc/safe-shell/
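A minimal demonstration of the difference, run in subshells so the failure doesn't kill your session:

```shell
# Default bash: an unmatched glob is passed through literally,
# so echo happily prints the pattern itself.
bash -c 'cd "$(mktemp -d)"; echo *.txt'
# With failglob: the same unmatched glob is a hard error and the
# command never runs at all.
bash -c 'shopt -s failglob; cd "$(mktemp -d)"; echo *.txt' \
  || echo "failglob: unmatched glob rejected"
```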


Also `=` when it starts a word: it works basically like `command -v`, expanding to the full path of the executable. For example, `echo =grep` results in `/usr/bin/grep`.


Also:

find /home -name test.\*


find /home -type f -name test.*


Personally, I find find(1) to be an utter offender against the Unix philosophy of "doing one thing well". It can do a dozen things or so, and not particularly well either. (Execute actions? Delete stuff? Look for a string within files?). Just look at this post and how long the find(1) section is wrt. the ones about which(1), whereis(1), and locate(1).

I am sure that well-versed users can achieve wonderful things with it; myself, I either use fd or pipe "du -a" into grep (or rg), and move on with my life.


Agreed. My most common use case for "finding stuff" on Linux boxes is looking for particular strings in clear-text files, so I often end up doing "grep string -Ri {/etc,/var} | less". Also, though find seems to be POSIX compliant, I just don't like its syntax and how flags are handled.


For that I like to install ripgrep or `rg` for short, very nice defaults and formats the output nicely too.


Installing the tools you prefer is not always an option, but I agree, ripgrep is great.


You can always put some binaries in $HOME/bin.


You cannot "always" do this because some of us frequently or occasionally work in environments where external, non-reviewed software simply isn't allowed.


`fd` seems pretty useful, but I tend to just `find . -name '*whatevs*'`. It's a bit longer, but one less tool to make sure you have available (and find is everywhere), and one less tool to learn. Even just `find . | grep whatevs` is fine.


I never bothered to learn all the find options, so find|grep is also my usual goto.


POSIX find is a lot less daunting than GNU if you're looking for somewhere to start.

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/f...


My problem with find -name x is that it’s painfully slow


I think this debate is missing one of my favorites,

locate.


I find it funny that desktop search engines are not mentioned at all here. They would be the first place to start when using Linux on a desktop (but of course not on a server, i.e. without any desktop environment).

These search engines are very powerful because they do deep file scanning and are based on mature frameworks such as Apache Lucene.

For KDE, this was called Nepomuk https://userbase.kde.org/Nepomuk and is nowadays called Baloo https://community.kde.org/Baloo, and can in fact be used from the command line (baloosearch).

For GNOME, this is (apparently; I haven't used GNOME in recent years) called Tracker https://gitlab.gnome.org/GNOME/tracker


Recoll is also worth a shout-out.

https://www.lesbonscomptes.com/recoll/


My speedy system tanks when Baloo (and fstrim) tries to run after booting because everything runs at default priority. They really need to set nice and ionice on those processes by default. Thus locate is what I'll usually end up using.


I tried Plasma (with i3 to replace kwin; it didn't turn out to be a good experience btw, back to bare i3) some time ago and directly deactivated this baloofile after seeing it was taking 1.7 GB at boot (maybe it was temporary). I just use fzf when I don't know where something is.


I find the current Nautilus search to be very powerful, especially for searching inside texts & PDFs. It certainly does a far better job than Mac Spotlight or Windows Search. Under the hood, I am pretty certain it runs an extended command with regexes.


I will never understand why the find command needs its flags after the positional argument, in contrast to every other unix tool.


The find command existed when UNIX was in its infancy, long before the existence of CLI argument conventions such as double-dash long options, or positional vs. non-positional order.


Most of the arguments one would think of as flags are components of the search expression, which can include the operators '!', '(', and ')'.


And '-o', for 'OR'.


If you forget what all those system directories are for (/bin, /sbin, /usr/bin, etc), another useful command is "man hier". It's a reference page for the filesystem.


I can't believe I didn't know this for all these years. I was just guessing/remembering bits of info.


These are all great, but the one I find myself using constantly on source code and other text-oriented files is The Silver Searcher (ag)[1]. It’s not as useful for file _names_, but most of the time, I care about contents and this searches, in realtime, at an incredible speed. Add the -l flag to list only filenames and you’ve got an amazing code location tool.

[1] https://github.com/ggreer/the_silver_searcher


You can search file names (instead of contents). Add -g or -G to your -l flag.

     $ ag -l -g which /usr/bin/
     /usr/bin/which
     /usr/bin/kpsewhich
     /usr/bin/sgmlwhich

     $ ag -l -g /which$ /usr/bin/
     /usr/bin/which


Nice! Hadn't noticed that.


This is what I use:

https://github.com/sharkdp/fd

Which you can install as a binary, or via cargo. fd is spectacular.

To that you can add: https://github.com/junegunn/fzf

Which you can bind to a key in your shell for convenience.


Here's my fzf and fd config:

    export FZF_DEFAULT_OPTS='--reverse --border --exact --height=50%'
    export FZF_ALT_C_COMMAND='fd --type directory'
    export FZF_CTRL_T_COMMAND="mdfind -onlyin . -name ."
Obviously mdfind is mac only.


mdfind also requires enabling Spotlight on the target locations, which can slow things down.


This tool combined w/ FZF sounds like a dream. Gonna give it a try


Missing "command -v", which was suggested as a replacement to which in debian base system. https://lwn.net/Articles/874049/


> For all that, which is not a standardized component on Unix-like systems; POSIX does not acknowledge its existence.

Given that `which` is not POSIX compliant, you wonder why it is so popular compared with `command -v`, which is only slightly more complicated.
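For reference, the POSIX way looks like this (the exact paths printed will vary by system):

```shell
# command -v prints how the shell would resolve a name: a path for
# external programs, the bare name for shell builtins.
command -v ls    # a path such as /bin/ls
command -v cd    # prints "cd" -- it's a builtin, which(1) may not report it
```

The builtin case is one place where `command -v` is actually more accurate than `which`, since `which` only searches $PATH.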


Not for finding filenames, but for finding files containing stuff, there's the magical `grep` ! Probably my most used command for finding stuff in a massive legacy codebase.

    grep --include='*.js' -rn methodName 
will recursively search the current directory and all subdirectories for .js files containing the word 'methodName'

Obviously IDEs can do this as well, but since you can supply regex, you can get some pretty complicated cases.

Random arbitrary example but difficult to do with an IDE -

    grep --include='*.js' -rn 'methodName([^,]*,[^,]*,[^,]*)'
same as above, but only find methods with 3 arguments


You often have to remember to use -print0 for find and then -0 to xargs:

    find . -print0|grep -z something|xargs -0 ls -l
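A small illustration of why the null separators matter (scratch directory; the filename with a space in it is the point):

```shell
# Whitespace in filenames breaks plain whitespace-delimited xargs;
# null separators do not.
set -eu
cd "$(mktemp -d)"
touch 'file with spaces.txt'
# Each null-terminated path arrives at xargs as one intact argument.
find . -type f -print0 | xargs -0 ls -l
```

Without -print0/-0, xargs would split that name into three separate arguments and ls would fail on all of them.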


I also highly recommend plocate as an alternative to locate. It's much faster and accepts all the same flags as locate.

https://plocate.sesse.net/


locate is, for whatever reason, tragically slow. The database format it uses is nonsensical and completely optimized for size on very outdated assumptions.

I use an implementation I have written in the shell itself whose database format is nothing more than every file path on the system separated by null bytes, that is simply grepped to find files; the speed difference is absurd.

  —— — time locate */meme.png
 /storage/home/user/pictures/macro/meme.png

 real  0m0.885s
 user  0m0.806s
 sys 0m0.010s
  —— — time greplocate /meme.png$
 /storage/home/user/pictures/macro/meme.png

 real  0m0.089s
 user  0m0.079s
 sys 0m0.011s
This implementation is highly naïve and simplistic, and offloads all the searching to GNU Grep, yet outperforms the actual `locate` command by an order of magnitude.
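The commenter's tool isn't shown, but the idea can be sketched roughly like this (the database path is a hypothetical scratch file, and the pattern is the commenter's example):

```shell
# Build a null-separated database of every path under $HOME, then
# "locate" files by grepping it. grep -z treats input as null-terminated
# records, so even filenames containing newlines stay intact.
db=$(mktemp)
find "$HOME" -print0 > "$db" 2>/dev/null || true
# Lookup: anchor the pattern at the end of each record, print one per line.
grep -z -- 'meme\.png$' "$db" | tr '\0' '\n'
```

Regenerating the database is just re-running the find, which plays the role of updatedb.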


And plocate is yet orders of magnitude faster than GNU grep. :-) And updates its database faster. You don't specify which locate you're using, but mlocate and BSD locate are basically obsolete by now.

(Disclosure: I'm the author of plocate.)


I'm a huge fan and use your program basically daily. I don't know why it's not the default in most systems.


Thank you! It's becoming the default now, slowly (e.g., it will be the default in Debian and Ubuntu from the next releases, and Fedora is in a process to make it replace mlocate right now). It's just a tad too new, only about a year since 1.0.0. :-)


Nice. I do something similar. Why null byte instead of new line?


Presumably to support filenames with newlines in them.


Yeah but… gee


Why take the risk? I'm fairly certain I have no files with newlines on my system but I see no reason to take the risk.


Unrelated, but can I ask about your prompt?


They are simply em-dashes; if an error occurs, they are replaced with numbers to indicate the error code. They are also color-coded, changing to another color to indicate, for instance, when I'm in an SSH session:

  —— — true
  —— — false
  —— 1 sh -c 'exit 120'
   120 sh -c 'exit 20'
  — 20

It doesn't show the colors here of course.


It's astonishing that searching for a file name by substring across a file system is still not instantaneous on most systems. On my laptop (a 2GHz quad-core Intel Core i5), `find / -iname 'quux'` takes 2 minutes to find matches across an APFS partition with 2m files. Grepping for a substring in a file with 2m lines takes a few milliseconds. Why don't modern file systems implement something like the `locate` database, except one that is always up to date, so that scanning for a file does not require an expensive traversal?


Spotlight does just that and has since 2005.

mdfind in the terminal if you prefer it that way.


Exactly, I use it with fzf and ctrl-t:

    export FZF_CTRL_T_COMMAND="mdfind -onlyin . -name ."
Unfortunately it's not instant - it takes about a second for the tens of thousands of file entries to populate. But fzf searching that list is practically instant. After selecting the file I want, enter just returns me to the command line with the full path to the file. I can then ctrl-a and type 'vim' or 'code' or whatever. It's not a perfect workflow, but it's pretty good for finding files in complex folder structures.


From the title, I thought this was about finding the implementation for various components in the Linux kernel.


Somewhat related is Recoll (desktop full-text search tool), recently discussed here: https://news.ycombinator.com/item?id=28950947


If you want to speed up the find command, you can use -prune to skip over directories that are full of files that are known not to be of interest.

To skip the .git directory:

`find . -type d -a -name .git -prune -o -type f -a -iname '*.json' -print`


I find listing a package's installed files very useful, so as to locate the eventual configuration, templates, assets…

dpkg -L packagename | grep doc

On Debian for instance.

Or even finding out which package installed a specific file with apt-file.


For those using dnf

  dnf repoquery -l packagename
To list the files within a package.

  dnf whatprovides "**/bin/executable"
To find what package provides a certain file. Definitely very useful when trying to build packages from source and you're not sure in what packages the reported missing libs/headers are located.


And the most obvious

  dnf search <something>


You could use dpkg-query -S if apt-file is not installed.


One thing I sometimes do is "tree | less", then use less's '/' search to try to find what I need.


apropos is missing, and it's really helpful if you don't remember the exact spelling of the invocation.


My favorite trick for finding a file which I know some app is using, but I don't know where it is:

   strace some_app 2>&1 | grep open
And it will tell me which files it is opening. You will also see files it tried but failed to open, i.e. missing files.


You could also do `strace --trace=open some_app` or `strace --trace=%file some_app` to have the strace do the filtering by itself (the latter will match any syscall that takes a filename)


As hobbyist Linux [server] user, I've been using which, with hit and miss results, and 'find' never really worked for me (I clearly don't have the magical aptitude for it). I've just given whereis a go - and that's perfect for me. Nice.


I more or less stopped using find the day I discovered mlocate.


Ls and Grep should be in this list

To include dired would be nice, but I understand that is not everybody's sauce.


Here’s a version of ‘find’ I use quite often; I usually download it to run as “fstring”. It outputs the file name and the string match:

https://github.com/figital/fstring


This entire HN comments page visualizes so effectively why I don't use Linux.

Even when someone gathers enough knowledge to write an introductory guide to using the damn thing, it turns out it's all wrong, has bugs, fails in unexpected ways or is a deprecated method that's being phased out.

Respect to those of you who work with it.


It's not a problem with linux.

It's a problem with a tutorial written by someone who is underqualified to be doing so, so that she can get page hits to boost her "brand."

She's a fucking streamer, employed by Microsoft to be a .NET "advocate" https://developer.microsoft.com/en-us/advocates/gwyneth-pena...

Her site is just regurgitating the usual novice guides, of which you can find a dozen better examples.


You can't make these gotchas affecting Linux distributions go away by blaming someone who presents things the way many people use POSIX interactive shells.

Being a streamer and a .NET advocate does not make one incompetent in POSIX shell. Thing is, POSIX shell is tricky and it's all too easy to fall into traps, even for experienced people.


This kind of thing can be found in any OS, especially widespread ones, which take their roots from long ago. We now have enough experience to make things with fewer gotchas, but this would break compatibility and habits, and we can't get rid of the huge existing ecosystem anyway.

In the shell area, there are attempts to fix the syntax, like fish [1]. I'm afraid this kind of thing is doomed to remain niche, because by the time you deeply understand what it is trying to fix, you are probably used enough to the classic POSIX shell that you may not want to change. Fish users also need to go back to bash or zsh to follow a number of tutorials and documentations, so even they can't avoid POSIX shells.

I myself adopted zsh to have the features of fish and keep the bash-like syntax since I need to deal with this syntax either way.

In the Windows world, and actually Unix too, there is also Powershell. Can't comment that much since I don't know it, but it does not seem to have taken off on Unix, and many people on Windows seem to use bash through WSL anyway.

My conclusion is that bash and bash-like shells are ruling the world, even on Windows, and we are stuck with the POSIX shell syntax for now and probably a long time.

[1] https://fishshell.com/


Actually, they work on most UNIXes, not only Linux.


fd + rg


Sure, but it's good to also be able to use the default tools, because there are going to be times when fd and rg are (for whatever reason) not available.

Some reasons could be: not installed on machine, and not allowed (or not supposed to) install it globally or locally, embedded machines which don't have space to install new software.



