An intro to finding things in Linux (madebygps.com)
178 points by mooreds on Nov 23, 2021 | hide | past | favorite | 82 comments


    find /home -type f -name test.*
This will silently fail if you happen to have one file that matches test.* (like test.txt) in the current working directory, because test.* will be replaced by the name of this file and that's what find will see.

This will fail in zsh if no files match in the current working directory, because globbing will fail.

You need to quote this to avoid unintuitive results, in any shell.

    find /home -type f -name 'test.*'
Not doing this will probably work most of the time, but it will probably confuse you the one time it won't and drive you crazy if you don't realize what is going on.

I've become very aware of this kind of thing by using zsh, which is stricter with failing globs than bash. Also quote anything that contains {, }, [, ], (, ), ?, ! or $ for similar reasons. Beware of ~ too. And & or | obviously, and also ;.

Do yourself a favor: quote the hell out of everything in shells that's not a simple alphanumeric string or option name. Even if it's only alphanumeric, actually, if it is a parameter value, quote it. This way, when you edit your command and somehow add a special character, you are already covered. It also makes your values stand out, arguably making your command easier to read because it looks more uniform.

edit: and quote with single quotes, unless you need variable expansion or need to quote single quotes themselves (but be extra careful then)
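A quick way to see the difference is a throwaway directory (all paths below are scratch locations, not anything on your system):

```shell
# Demonstrate quoted vs. unquoted -name patterns in a scratch directory.
set -eu
dir=$(mktemp -d)
mkdir -p "$dir/sub"
touch "$dir/test.txt" "$dir/sub/test.log"
cd "$dir"
# Unquoted: the shell expands test.* against the current directory first,
# so find only ever sees the literal name "test.txt" and misses sub/test.log.
find . -type f -name test.*
# Quoted: find itself receives the pattern and matches in every directory.
find . -type f -name 'test.*'
```

The unquoted version prints only ./test.txt; the quoted version finds both files.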


> Do yourself a favor: quote the hell out of everything in shells that's not a simple alphanumeric string or option name.

This should be the first sentence in any shell manual. It will bite you sooner or later.


This is because glob expansion happens before the program runs.

> The order of expansions is: brace expansion; tilde expansion, parameter and variable expansion, arithmetic expansion, and command substitution (done in a left-to-right fashion); word splitting; and filename expansion.

Straight from the manual

https://www.gnu.org/software/bash/manual/html_node/Shell-Exp....
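A one-liner makes this visible (scratch directory, made-up filenames): the program only ever receives the expanded result, never the glob itself.

```shell
# The glob is expanded by the shell before printf runs; printf receives
# the filenames as separate arguments and never sees the '*' itself.
cd "$(mktemp -d)"
touch a.txt b.txt
printf 'arg: %s\n' *.txt
```

Here printf gets two separate arguments, one per matching file.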


shellcheck warns about constructs like these, and many more.


Absolutely.

One doesn't use shellcheck when using the shell interactively though, so it's still a good habit to develop.


Globs in an interactive shell should get expanded and then force you to press enter to confirm the command with expanded globs. Are there any shells that implement this?


In zsh, you can press tab after something that will get expanded, and it expands in place. I actually use it quite often to check whether the expansion is right.

However, this does not cover the case where an unexpected globbing happens like in this find situation.


That would become unreadable in many cases, like du -sh *


I looked through zshoptions(1) and didn’t see anything like this. However you can turn off globbing entirely with “unsetopt glob”. There are however other expansions that occur outside of globbing, such as parameter substitution.

It might be possible to implement this using zsh’s line editor, but it would take some digging through the man pages to figure this out (or finding a zsh expert). Try zshzle(1).

Yeah this is probably a much more complicated answer than you were looking for.


What you really should do is stick

    shopt -s failglob

at the top of your shell scripts (and in ~/.bashrc), so that a glob that doesn't expand is always an error. That way, the unquoted test.* won't work by accident, and it's easier to train yourself to always quote.

Or, just use zsh, which has this behavior by default.

Anyway here are other shell scripting tips: https://sipb.mit.edu/doc/safe-shell/
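A minimal demonstration of the difference, run in subshells so the failure doesn't kill your session:

```shell
# Default bash: an unmatched glob is passed through literally,
# so echo happily prints the pattern itself.
bash -c 'cd "$(mktemp -d)"; echo *.txt'
# With failglob: the same unmatched glob is a hard error and the
# command never runs at all.
bash -c 'shopt -s failglob; cd "$(mktemp -d)"; echo *.txt' \
  || echo "failglob: unmatched glob rejected"
```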


Also `=` when it starts a word: it works basically like `command -v`, expanding to the full path of the executable. For example, `echo =grep` results in `/usr/bin/grep`.


Also:

find /home -name test.\*


find /home -type f -name test.*


Personally, I find find(1) to be an utter offender against the Unix philosophy of "doing one thing well". It can do a dozen things or so, and not particularly well either. (Execute actions? Delete stuff? Look for a string within files?). Just look at this post and how long the find(1) section is wrt. the ones about which(1), whereis(1), and locate(1).

I am sure that well-versed users can achieve wonderful things with it; myself, I either use fd or pipe "du -a" into grep (or rg), and move on with my life.


Agreed. My most common use case for "finding stuff" on Linux boxes is looking for particular strings in clear-text files, so I often end up doing "grep string -Ri {/etc,/var} | less". Also, though find seems to be POSIX compliant, I just don't like its syntax and how flags are handled.


For that I like to install ripgrep or `rg` for short, very nice defaults and formats the output nicely too.


Installing the tools you prefer is not always an option, but I agree, ripgrep is great.


You can always put some binaries in $HOME/bin.


You cannot "always" do this because some of us frequently or occasionally work in environments where external, non-reviewed software simply isn't allowed.


`fd` seems pretty useful, but I tend to just `find . -name '*whatevs*'`. It's a bit longer, but one less tool to make sure you have available (and find is everywhere), and one less tool to learn. Even just `find . | grep whatevs` is fine.


I never bothered to learn all the find options, so find|grep is also my usual goto.


POSIX find is a lot less daunting than GNU if you're looking for somewhere to start.

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/f...


My problem with find -name x is that it’s painfully slow


I think this debate is missing one of my favorites,

locate.


I find it funny that desktop search engines are not mentioned at all here. They would be the first place to start when using Linux on a desktop (but of course not on a server, i.e. without any desktop environment).

These search engines are very powerful because they do deep file scanning and are based on mature frameworks such as Apache Lucene.

For KDE, this was called Nepomuk https://userbase.kde.org/Nepomuk and is nowadays called Baloo https://community.kde.org/Baloo, and can in fact be used from the command line (baloosearch).

For GNOME, this is (apparently; I haven't used GNOME in recent years) called Tracker https://gitlab.gnome.org/GNOME/tracker


Recoll is also worth a shout-out.

https://www.lesbonscomptes.com/recoll/


My speedy system tanks when Baloo (and fstrim) tries to run after booting because everything runs at default priority. They really need to set nice and ionice on those processes by default. Thus locate is what I'll usually end up using.


I tried Plasma (with i3 to replace kwin; it didn't turn out to be a good experience btw, back to bare i3) some time ago and directly deactivated this baloofile after seeing it was taking 1.7 GB at boot (maybe it was temporary). I just use fzf when I don't know where something is.


I find the current Nautilus search to be very powerful, especially for searching inside texts & PDFs. It certainly does a far better job than Mac Spotlight or Windows Search. Under the hood, I am pretty certain it runs an extended command with regexes.


I will never understand why the find command needs its flags after the positional argument, in contrast to every other unix tool.


The find command existed when UNIX was in its infancy, long before the existence of CLI argument conventions such as double-dash long options, or positional vs. non-positional order.


Most of the arguments one would think of as flags are components of the search expression, which can include the operators '!', '(', and ')'.


And '-o', for 'OR'.


If you forget what all those system directories are for (/bin, /sbin, /usr/bin, etc), another useful command is "man hier". It's a reference page for the filesystem.


I can't believe I didn't know this for all these years. I was just guessing/remembering bits of info.


These are all great, but the one I find myself using constantly on source code and other text-oriented files is The Silver Searcher (ag)[1]. It’s not as useful for file _names_, but most of the time, I care about contents and this searches, in realtime, at an incredible speed. Add the -l flag to list only filenames and you’ve got an amazing code location tool.

[1] https://github.com/ggreer/the_silver_searcher


You can search file names (instead of contents). Add -g or -G to your -l flag.

     $ ag -l -g which /usr/bin/
     /usr/bin/which
     /usr/bin/kpsewhich
     /usr/bin/sgmlwhich

     $ ag -l -g /which$ /usr/bin/
     /usr/bin/which


Nice! Hadn't noticed that.


This is what I use:

https://github.com/sharkdp/fd

Which you can install as a binary, or via cargo. fd is spectacular.

To that you can add: https://github.com/junegunn/fzf

Which you can bind to a key in your shell for convenience.


Here's my fzf and fd config:

    export FZF_DEFAULT_OPTS='--reverse --border --exact --height=50%'
    export FZF_ALT_C_COMMAND='fd --type directory'
    export FZF_CTRL_T_COMMAND="mdfind -onlyin . -name ."
Obviously mdfind is mac only.


mdfind also requires enabling Spotlight on the target locations, which can slow things down.


This tool combined w/ FZF sounds like a dream. Gonna give it a try


Missing "command -v", which was suggested as a replacement to which in debian base system. https://lwn.net/Articles/874049/


> For all that, which is not a standardized component on Unix-like systems; POSIX does not acknowledge its existence.

Given that `which` is not POSIX compliant, you wonder why it is so popular compared with `command -v`, which is only slightly more complicated.
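For reference, the POSIX way looks like this (the exact paths printed will vary by system):

```shell
# command -v prints how the shell would resolve a name: a path for
# external programs, the bare name for shell builtins.
command -v ls    # a path such as /bin/ls
command -v cd    # prints "cd" -- it's a builtin, which(1) may not report it
```

The builtin case is one place where `command -v` is actually more accurate than `which`, since `which` only searches $PATH.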


Not for finding filenames, but for finding files containing stuff, there's the magical `grep` ! Probably my most used command for finding stuff in a massive legacy codebase.

    grep --include='*.js' -rn methodName 
will recursively search the current directory and all subdirectories for .js files containing the word 'methodName'

Obviously IDEs can do this as well, but since you can supply regex, you can get some pretty complicated cases.

Random arbitrary example but difficult to do with an IDE -

    grep --include='*.js' -rn 'methodName([^,]*,[^,]*,[^,]*)'
same as above, but only find methods with 3 arguments


You often have to remember to use -print0 for find and then -0 to xargs:

    find . -print0|grep -z something|xargs -0 ls -l
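A small illustration of why the null separators matter (scratch directory; the filename with a space in it is the point):

```shell
# Whitespace in filenames breaks plain whitespace-delimited xargs;
# null separators do not.
set -eu
cd "$(mktemp -d)"
touch 'file with spaces.txt'
# Each null-terminated path arrives at xargs as one intact argument.
find . -type f -print0 | xargs -0 ls -l
```

Without -print0/-0, xargs would split that name into three separate arguments and ls would fail on all of them.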


I also highly recommend plocate as an alternative to locate. It's much faster and accepts all the same flags as locate.

https://plocate.sesse.net/


locate is, for whatever reason, tragically slow. The database format it uses is nonsensical and completely optimized for size on very outdated assumptions.

I use an implementation I have written in the shell itself whose database format is nothing more than every file path on the system separated by null bytes, that is simply grepped to find files; the speed difference is absurd.

  —— — time locate */meme.png
 /storage/home/user/pictures/macro/meme.png

 real  0m0.885s
 user  0m0.806s
 sys 0m0.010s
  —— — time greplocate /meme.png$
 /storage/home/user/pictures/macro/meme.png

 real  0m0.089s
 user  0m0.079s
 sys 0m0.011s
This implementation is highly naïve and simplistic, and offloads all the searching to GNU Grep, yet outperforms the actual `locate` command by an order of magnitude.
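The commenter's tool isn't shown, but the idea can be sketched roughly like this (the database path is a hypothetical scratch file, and the pattern is the commenter's example):

```shell
# Build a null-separated database of every path under $HOME, then
# "locate" files by grepping it. grep -z treats input as null-terminated
# records, so even filenames containing newlines stay intact.
db=$(mktemp)
find "$HOME" -print0 > "$db" 2>/dev/null || true
# Lookup: anchor the pattern at the end of each record, print one per line.
grep -z -- 'meme\.png$' "$db" | tr '\0' '\n'
```

Regenerating the database is just re-running the find, which plays the role of updatedb.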


And plocate is yet orders of magnitude faster than GNU grep. :-) And updates its database faster. You don't specify which locate you're using, but mlocate and BSD locate are basically obsolete by now.

(Disclosure: I'm the author of plocate.)


I'm a huge fan and use your program basically daily. I don't know why it's not the default in most systems.


Thank you! It's becoming the default now, slowly (e.g., it will be the default in Debian and Ubuntu from the next releases, and Fedora is in a process to make it replace mlocate right now). It's just a tad too new, only about a year since 1.0.0. :-)


Nice. I do something similar. Why null byte instead of new line?


Presumably to support filenames with newlines in them.


Yeah but… gee


Why take the risk? I'm fairly certain I have no files with newlines on my system but I see no reason to take the risk.


Unrelated, but can I ask about your prompt?


They are simply em-dashes; if an error occurs, they are replaced with numbers to indicate the error code. They are also color-coded, changing to another color to indicate, for instance, when I'm in an SSH session:

  —— — true
  —— — false
  —— 1 sh -c 'exit 120'
   120 sh -c 'exit 20'
  — 20

It doesn't show the colors here of course.


It's astonishing that searching for a file name by substring across a file system is still not instantaneous on most systems. On my laptop (a 2GHz quad-core Intel Core i5), `find / -iname 'quux'` takes 2 minutes to find matches across an APFS partition with 2m files. Grepping for a substring in a file with 2m lines takes a few milliseconds. Why don't modern file systems implement something like the `locate` database, except one that is always up to date, so that scanning for a file does not require an expensive traversal?


Spotlight does just that and has since 2005.

mdfind in the terminal if you prefer it that way.


Exactly, I use it with fzf and ctrl-t:

    export FZF_CTRL_T_COMMAND="mdfind -onlyin . -name ."
Unfortunately it's not instant - it takes about a second for the tens of thousands of file entries to populate. But fzf searching that list is practically instant. After selecting the file I want, enter just returns me to the command line with the full path to the file. I can then ctrl-a and type 'vim' or 'code' or whatever. It's not a perfect workflow, but it's pretty good for finding files in complex folder structures.


From the title, I thought this was about finding the implementation for various components in the Linux kernel.


Somewhat related is Recoll (desktop full-text search tool), recently discussed here: https://news.ycombinator.com/item?id=28950947


If you want to speed up the find command, you can use -prune to skip over directories that are full of files that are known not to be of interest.

To skip the .git directory:

`find . -type d -a -name .git -prune -o -type f -a -iname '*.json' -print`


I find listing a package's installed files very useful, so as to locate the eventual configuration, templates, assets…

dpkg -L packagename | grep doc

On Debian for instance.

Or even finding out which package installed a specific file with apt-file.


For those using dnf

  dnf repoquery -l packagename
To list the files within a package.

  dnf whatprovides "**/bin/executable"
To find what package provides a certain file. Definitely very useful when trying to build packages from source and you're not sure in what packages the reported missing libs/headers are located.


And the most obvious

  dnf search <something>


You could use dpkg-query -S if apt-file is not installed.


One thing I sometimes do is "tree | less", then use less's '/' search to try to find what I need.


apropos is missing, and it's really helpful if you don't remember the exact spelling of the invocation.


My favorite trick for finding a file which I know some app is using, but I don't know where it is:

   strace some_app 2>&1 | grep open
And it will tell me which files it is opening. You will also see files it tried but failed to open, i.e. missing files.


You could also do `strace --trace=open some_app` or `strace --trace=%file some_app` to have the strace do the filtering by itself (the latter will match any syscall that takes a filename)


As hobbyist Linux [server] user, I've been using which, with hit and miss results, and 'find' never really worked for me (I clearly don't have the magical aptitude for it). I've just given whereis a go - and that's perfect for me. Nice.


I more or less stopped using find the day I discovered mlocate.


Ls and Grep should be in this list

To include dired would be nice, but I understand that is not everybody's sauce.


Here’s a version of ‘find’ I use quite often; I usually download it to run as “fstring”. It outputs the file name and the string match:

https://github.com/figital/fstring


This entire HN comments page visualizes so effectively why I don't use Linux.

Even when someone gathers enough knowledge to write an introductory guide to using the damn thing, it turns out it's all wrong, has bugs, fails in unexpected ways or is a deprecated method that's being phased out.

Respect to those of you who work with it.


It's not a problem with linux.

It's a problem with a tutorial written by someone who is underqualified to be doing so, so that she can get page hits to boost her "brand."

She's a fucking streamer, employed by Microsoft to be a .NET "advocate" https://developer.microsoft.com/en-us/advocates/gwyneth-pena...

Her site is just regurgitating the usual novice guides, of which you can find a dozen better examples.


You can't make these gotchas affecting Linux distributions go away by blaming someone who presents things the way many people use POSIX interactive shells.

Being a streamer and a .NET advocate does not make one incompetent in POSIX shell. Thing is, POSIX shell is tricky and it's all too easy to fall into traps, even for experienced people.


This kind of thing can be found in any OS, especially widespread ones, which take their roots from long ago. We now have enough experience to make things with fewer gotchas, but this would break compatibility and habits, and we can't get rid of the huge existing ecosystem anyway.

In the shell area, there are attempts to fix the syntax, like fish [1]. I'm afraid this kind of thing is doomed to remain niche, because by the time you deeply understand what it is trying to fix, you are probably used enough to the classic POSIX shell that you may not want to change. Fish users also need to go back to bash or zsh to follow a number of tutorials and documentations, so even they can't avoid POSIX shells.

I myself adopted zsh to have the features of fish and keep the bash-like syntax since I need to deal with this syntax either way.

In the Windows world, and actually Unix too, there is also Powershell. Can't comment that much since I don't know it, but it does not seem to have taken off on Unix, and many people on Windows seem to use bash through WSL anyway.

My conclusion is that bash and bash-like shells are ruling the world, even on Windows, and we are stuck with the POSIX shell syntax for now and probably a long time.

[1] https://fishshell.com/


Actually, they work on most UNIXes, not only Linux.


fd + rg


Sure, but it's good to also be able to use the default tools, because there are going to be times when fd and rg are (for whatever reason) not available.

Some reasons could be: not installed on machine, and not allowed (or not supposed to) install it globally or locally, embedded machines which don't have space to install new software.



