Cquery: highly-scalable, low-latency language server for C++

krylon · on Nov 17, 2017

Wow!

I consider myself lucky that I practically do not have to deal with C++, but if I were a C++ kind of person, right now I would deeply appreciate how much hard work goes into a tool like this. [1]

In order to be any good, it has to be even more clever than the compiler, at least in a few ways.

And I have a hunch that in the field of programming languages that are used these days, C++ is probably not the easiest one to write such an analyzer for. ;-)

TL;DR - Respect! I can only - but vividly - imagine how hard it must be to create a tool like this.

[1] I have mixed feelings regarding C++, because I have only used it in tiny doses. I treat the language the same way I would treat a bear, if one suddenly appeared: Keep my distance, avoid any sudden motions, but also, under no circumstances, dare I turn my back on it because of the slight chance that it might come after me eventually. I do not actively dislike C++, I just have not found the time to learn it, and I get the impression that if I really want to learn it, it will take much more time than I can currently afford to invest. Maybe some day it will come to pass (that will be a good day also to finally read Moby Dick all the way through from end to end.)

phaedrus · on Nov 17, 2017

Some advice if you do decide to learn C++: there has been a steady stream of improvements in standard C++ the past 8 years (C++11-17), and some (Bjarne Stroustrup, Herb Sutter, others) promote it as a better starting point for beginners under the name 'Modern C++'.

From: https://msdn.microsoft.com/en-us/library/hh279654.aspx

"[...] Over the years, features have been added to the language, together with highly-tested standard libraries of data structures and algorithms. It's these additions that have made the modern C++ style possible. Modern C++ emphasizes:

Stack-based scope instead of heap or static global scope.

Auto type inference instead of explicit type names.

Smart pointers instead of raw pointers.

std::string and std::wstring types (see <string>) instead of raw char[] arrays.

Standard template library (STL) containers like vector, list, and map instead of raw arrays or custom containers. See <vector>, <list>, and <map>.

STL algorithms instead of manually coded ones.

Exceptions, to report and handle error conditions.

Lock-free inter-thread communication using STL std::atomic<> (see <atomic>) instead of other inter-thread communication mechanisms.

Inline lambda functions instead of small functions implemented separately.

Range-based for loops [...]"
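To make the quoted list a bit more concrete, here is a small sketch that touches several of its points at once (smart pointers, std::string, STL containers and algorithms, auto, a lambda, and a range-based for loop). The `Shape` type and the function names are invented for illustration and do not come from the MSDN article; `std::make_unique` assumes C++14 or later.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical example type, made up for this sketch.
struct Shape {
    std::string name;  // std::string instead of a raw char[] buffer
    double area;
};

// Smart pointers own the heap objects, so there is no manual delete anywhere.
std::vector<std::unique_ptr<Shape>> make_shapes() {
    std::vector<std::unique_ptr<Shape>> shapes;  // STL container instead of a raw array
    shapes.push_back(std::make_unique<Shape>(Shape{"square", 4.0}));
    shapes.push_back(std::make_unique<Shape>(Shape{"circle", 3.0}));
    shapes.push_back(std::make_unique<Shape>(Shape{"line", 0.0}));
    return shapes;
}

// Range-based for loop with auto type inference.
double total_area(const std::vector<std::unique_ptr<Shape>>& shapes) {
    double total = 0.0;
    for (const auto& shape : shapes) {
        assert(shape && "shapes must not contain null pointers");
        total += shape->area;
    }
    return total;
}

// STL algorithm plus an inline lambda instead of a hand-written counting loop.
std::size_t count_nonempty(const std::vector<std::unique_ptr<Shape>>& shapes) {
    return static_cast<std::size_t>(
        std::count_if(shapes.begin(), shapes.end(),
                      [](const std::unique_ptr<Shape>& s) { return s->area > 0.0; }));
}
```

Calling `make_shapes()` and passing the result to `total_area` or `count_nonempty` never requires the caller to free anything; the vector and its `unique_ptr` elements clean up when they go out of scope.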

weberc2 · on Nov 17, 2017

C++ as a language has certainly modernized, but its tools are still stuck in the 90's. Contrary to the increasingly bizarre rationalizations offered up by its proponents, CMake and friends are not acceptable package managers or build tools, at least to those of us who have used newer languages. I don't need a build tool that assumes all projects are unique snowflakes in need of their own hand-crafted build system; I just want to get up and running quickly without sticking a bunch of maintenance into my build system.

I don't mean to dump on C++; I like the language and I take this post as a sign that things are beginning to improve, and I hope this pace of improvement accelerates. However, I have been disappointed by the head-in-the-sand attitude of some proponents.

flohofwoe · on Nov 18, 2017

The funny thing about cmake is though that simple things are indeed simple, you can describe an executable made of several compile units in a single line of cmake script. The problem is that it doesn't enforce any standardization beyond that of how to structure complex projects, especially if external dependencies are involved.

PS: I don't think you will find a single "fan" of cmake who would say it's the best thing since sliced bread. Most people know it sucks, but grudgingly admit that it's better to have one bad standard than a dozen competing solutions.

weberc2 · on Nov 18, 2017

> I don't think you will find a single "fan" of cmake who would say it's the best thing since sliced bread.

This happens all the time. At one point here on HN I lamented that C++ didn't have a build system like Rust's Cargo, and lots of C++ folks came out of the woodwork to argue that Cargo is a shitty tool because it can't support certain byzantine projects. Similar arguments were made in a thread about a new C++ package manager that attempts to modernize C++ package management. Almost all of the pro-CMake arguments focus on how capable it is of handling bizarre projects, without realizing that most such projects are bizarre only because they each had to implement their own build system instead of using one that encouraged them toward some standard structure. I can't recall anyone making an argument that resembled "it's better to have one bad standard than a dozen competing standards", even in response to something like "CMake is the best of all C++ tools, but still a far cry from other languages' tooling." Some people would allude to C++ projects being uniquely difficult to build, and so it would be impossible to emulate another language's tools; however, this too seemed to be confusing "C++ projects lack a standard structure" with "C++ projects can't have a standard structure". I could go on and on.

pjmlp · on Nov 17, 2017

Depends on which tools we are talking about.

QtCreator and Visual C++ (with UWP), are only now allowing what was already quite comfortable with C++ Builder in the 90's.

As I posted in another thread, Lucid and IBM had quite advanced environments based on Lisp and Smalltalk environments, that failed on the market due to hardware requirements and heavy price tags.

Other than that, I feel your pain.

C++ is no longer my daily tool, but I still enjoy using it, and would also like to have better tooling available for when I do.

The anti-modules discussion post-CppCon is a good example of what you are stating.

krylon · on Nov 17, 2017

Thank you!

Like I said, I am not turning my back on it. I have noticed that C++'s evolution has been gaining momentum over the last ten years. I have even read Bjarne's book on the design and evolution of C++ - it was fascinating, and I think it is worth reading even for people who have never used C++.

The thing is that from all I have heard and read I get the impression that learning C++ - really learning it, not just C-with-classes - is a long-term commitment. That is why, as I said, I am keeping a safe distance between me and C++. ;-)

When the day comes, I will not be entirely unprepared.

yoz-y · on Nov 17, 2017

Although C++ has lots of warts, I still really like the language mainly thanks to the tooling around it. It has had very good IDEs for years and when I see some of the advancements in debuggers and profilers for more modern languages I feel that they are still quite behind.

Note that this server is based on libclang. This is good because good language support practically means writing a compiler for the language, so using an actual compiler is the best way.

However, it is true that I have always used a full-fledged IDE – mostly Qt Creator. I never had the courage to configure Vim for it and now I will not invest in that. But if a good and fast language server for it had existed a few years back, I would definitely have jumped on it.

pjmlp · on Nov 17, 2017

It existed, but sadly both were too early for the market.

The first one was from Lucid, after pivoting away from Lisp Machines.

https://www.dreamsongs.com/Cadillac.html

Here is a demo on YouTube.

https://youtu.be/pQQTScuApWk

IBM also tried to create a Smalltalk-style repository in Visual Age for C++ v4.0, which was quite resource hungry for mid-90's PCs.

So we had to wait for the idea of the compiler as a library to catch on to get those ideas back.

lispm · on Nov 18, 2017

I think I mentioned that before, Lucid as a company has never done anything with Lisp Machines. Their target market was Unix and various UNIX platforms. What they were famous for was Lucid CL as a very high quality implementation/compiler/runtime of Common Lisp for UNIX. They sold licenses to various companies (like IBM, DEC, SUN, ...), which sold it under their brand with their enhancements. Lucid CL was used for application delivery by commercial customers, because the compiler and runtime were fairly good.

The Lucid CL IDE wasn't THAT advanced. Fairly standard CL stuff.

Example: SUN sold Lucid CL as Sun Common Lisp and the IDE was called Symbolic Programming Environment:

http://3e8.org/pub/scheme/doc/lisp-pointers/v2i2/p5-endelman...

That's nice, but not too advanced and with a basic look&feel.

Lucid as a company failed, because the C++/Energize product failed in the market and it consumed all the money they had earned from milking their Lisp business, which itself was in a shrinking market.

pjmlp · on Nov 19, 2017

Thanks for clarifying it.

Still, if such products had been successful earlier, most likely our C++ IDEs would be much more joyful to use even without plugging into clang.

It always feels like so many products were market failures just because people weren't yet ready for them, only for others to conquer exactly the same market 10 or 20 years later.

lispm · on Nov 21, 2017

Prices of Energize IDE in 1992:

Single user: $5425

Five users: $2975 each

Ten users: $3250 each

A single Energize C++ compiler was slightly more than $1000.

On top of that Energize was a relatively complex piece of software. Probably it would also have been a good idea to have a capable UNIX desktop machine, which even in 1992 wasn't that cheap.

The whole thing was aimed at professional C++ programmers in a relatively limited market of 'mission critical applications'.

cookboo · on Nov 17, 2017

It is really odd that most of the people who complain about C++ are the people who never or seldom use it.

sient · on Nov 17, 2017

Author here! Wasn't quite ready to post to HN yet since cquery is still in development, and I plan to eventually publish on the vscode marketplace so using cquery should be as simple as using the existing C/C++ extension.

Let me know if you have any questions.

archgoon · on Nov 17, 2017

Hi! I see you're using compile_commands.json. How are you handling header files? I've found that header files present problems for compile_commands since a number of tools using compile_commands (like bear) only look at the compiler commands for .c files, and don't notice that the header files should be added as well.

I know that some tools, like YCM, attempt to intelligently map a header file to its associated .cpp/.cc/.c file to guess what the compiler commands are, but this doesn't always work.

What is the state of generating compile_commands.json for Chromium? I remember running into this issue a few years back.

sient · on Nov 18, 2017

> Hi! I see you're using compile_commands.json. How are you handling header files? I've found that header files present problems for compile_commands since a number of tools using compile_commands (like bear) only look at the compiler commands for .c files, and don't notice that the header files should be added as well.

When a cc file is indexed cquery will index the associated header files. There is some logic to deduplicate multiple header file parsing so it only happens once, but that is fundamentally how it works. cquery then knows which header files are associated to which cc files.

> I know that some tools, like YCM, attempt to intelligently map a header file to its associated .cpp/.cc/.c file to guess what the compiler commands are, but this doesn't always work.

cquery does this as well, because you can, for example, create a new file that is not in compile_commands.json. cquery has sophisticated logic here, as it will also try to infer whether the file is test or platform specific (using general postfix matching), as those often have a very different set of arguments.
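The kind of guess described here (borrowing compile flags for a file that has no entry of its own) could look roughly like the sketch below. This is not cquery's actual code: `guess_flags_donor`, `strip_extension`, and `common_prefix_length` are made-up names, the fallback rule (longest shared path prefix) is an assumption, and the test/platform postfix matching is left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the path without its final extension ("src/foo.h" -> "src/foo").
std::string strip_extension(const std::string& path) {
    const std::size_t slash = path.find_last_of('/');
    const std::size_t dot = path.find_last_of('.');
    if (dot == std::string::npos ||
        (slash != std::string::npos && dot < slash)) {
        return path;  // no extension in the file name itself
    }
    return path.substr(0, dot);
}

// Number of leading characters the two strings have in common.
std::size_t common_prefix_length(const std::string& a, const std::string& b) {
    std::size_t n = 0;
    while (n < a.size() && n < b.size() && a[n] == b[n]) {
        ++n;
    }
    return n;
}

// Picks the indexed source file whose compile flags we would borrow for
// `file`. Assumed heuristic: an exact stem match wins ("window.h" ->
// "window.cc"); otherwise take the source sharing the longest path prefix.
std::string guess_flags_donor(const std::string& file,
                              const std::vector<std::string>& indexed_sources) {
    assert(!indexed_sources.empty());
    const std::string stem = strip_extension(file);
    std::size_t best_index = 0;
    std::size_t best_length = 0;
    for (std::size_t i = 0; i < indexed_sources.size(); ++i) {
        if (strip_extension(indexed_sources[i]) == stem) {
            return indexed_sources[i];
        }
        const std::size_t length = common_prefix_length(file, indexed_sources[i]);
        if (length > best_length) {
            best_length = length;
            best_index = i;
        }
    }
    return indexed_sources[best_index];
}
```

With sources `src/net/socket.cc` and `src/ui/window.cc` indexed, a lookup for `src/ui/window.h` returns `src/ui/window.cc`, while a brand-new `src/net/dns.h` falls back to `src/net/socket.cc` because it lives in the same directory.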

> What is the state of generating compile_commands.json for Chromium? I remember running into this issue a few years back.

Chrome compiles using ninja, which natively supports compile_commands.json, so generating the file works well and is easy to do. I have not run into any issue here.

izym · on Nov 17, 2017

Does this support Objective-C++? If not, do you have any plans to implement support for it?

sient · on Nov 17, 2017

I have not tested it, but it should be relatively straightforward to get working. Feel free to file an issue on github.

systems · on Nov 17, 2017

Any plans for Cmake support?

sient · on Nov 17, 2017

In what sense? cmake can generate compile_commands.json so projects built using cmake are supported.

jhasse · on Nov 17, 2017

Wow! I've just tested this and it's awesome. Much better than the C/C++ extension, since it's more robust (no false-positives for go-to-definition), faster and has more features (e.g. refactoring, better auto-complete, ...). It basically turns VS Code into the best C++ IDE possible for me.

makapuf · on Nov 17, 2017

It's not really linked to VSCode, is it? I was under the impression that, for example, Sublime integration via LSP was possible.

iainmerrick · on Nov 17, 2017

Yep, the idea is that the same protocol can work with any editor: https://langserver.org

rombix · on Nov 17, 2017

This project seems to be very similar to clangd. It is also similar to YouCompleteMe, though YCM does not support LSP yet. How does Cquery compare to those projects in terms of features?

In particular, should it be considered as an alternative to clangd? Could it make sense to combine efforts between clangd and Cquery?

sient · on Nov 17, 2017

At the moment, clangd and ycm are very similar projects; they are very limited compared to cquery. clangd and ycm support code completion, diagnostics, fixits, and goto declaration (but not definition), whereas cquery supports references, derived types, callers, etc. Basically, if the feature requires knowledge across multiple translation units, clangd/ycm do not support it.

cquery is designed to support very large projects, so it makes very specific design decisions w.r.t. the data model, indexing pipeline, and multithreading model. I hope clangd can match the performance - but so far every project I've seen simply does not run nearly fast enough on a code-base the size of Chrome/ChromeOS.

rombix · on Nov 17, 2017

Thanks for the explanations!

In the meantime I tried it out with VSCode on my MacBook. Works like a charm! Awesome job!!!

snowAbstraction · on Nov 17, 2017

Is this like https://github.com/Andersbakken/rtags but for vscode?

Any key differences?

jchw · on Nov 17, 2017

For one thing, it sounds like it implements the Language Server protocol, which will make it usable for other IDEs and text editors with support for this protocol (a quickly growing list.)

Not aware of Rtags but I'll assume it's similar to the omnipresent C tags but with an actual C++ parser. If that's the case, in theory cquery is a lot more powerful; it would include the symbol indexing of Rtags but also code refactoring tools ("fixits",) fully aware context-sensitive auto-complete, ability to detect dead code (i.e. #if 0 or #ifndef _WIN32) and support for the type hierarchy. Some of that stuff is definitely not supported by LSP so I imagine it's custom stuff for VSCode only.

All in all, it sounds like it would provide most of the IDE experience to VS Code. The only annoying part would be that you'll have to extract the compile flags from your build system yourself, but that's not usually too big of a deal.

to3m · on Nov 17, 2017

One interesting thing that rtags does is provide a gcc/clang wrapper script. You make symlinks to it called cc/c++/gcc/g++/etc. in a folder that's in PATH ahead of wherever gcc/clang proper live. When invoked, the script submits an appropriate tags job to the rtags server, complete with compile flags actually used, then passes the command line on to the appropriate tool so that the compile actually gets done.

So quite often you won't need to do anything. (If you use CMake, you will have to rebuild your Ninja/make files after first setting this up.)

I didn't find rtags perfect. There were a few files that it simply never seemed to create tags for, and the rtags setup is complicated enough that I couldn't quickly figure out why, so I just put up with it. But when it works, it works very well...

(This sort of thing seems to be par for the course with this kind of tool. I've seen a number of people complain about how terrible Visual Studio's code browsing is, for example, but it's been a long time since I've had a problem with it...)

zarkov99 · on Nov 17, 2017

Rtags is great for code browsing, but completion is slow and dumb.

to3m · on Nov 17, 2017

I found it fine once I'd set it up to do completion only by request, so I'd know that I might have to wait. The default, whereby it tries to pop up as you type, was absolutely maddening...

sient · on Nov 17, 2017

> All in all, it sounds like it would provide most of the IDE experience to VS Code. The only annoying part would be that you'll have to extract the compile flags from your build system yourself, but that's not usually too big of a deal.

This can be done similarly to what rtags does (hooking into gcc/clang invocations). Alternatively, if you use ninja, this is relatively simple: generate a compile_commands.json file.

    ninja -C out/Release -t compdb cxx cc > compile_commands.json

My eventual plan is to automate compile_commands.json generation if using ninja, so cquery is install and go.

snowAbstraction · on Nov 17, 2017

rtags is also using libclang for parsing, fixits, etc.

Here is a demo of what rtags can do with a proper emacs setup: CppCon 2015: Atila Neves "Emacs as a C++ IDE" https://youtu.be/5FQwQ0QWBTU

jchw · on Nov 17, 2017

edit: Comment above has updated with more information, making this reply redundant. :)

sient · on Nov 17, 2017

I tried using rtags before developing cquery, but found it did not perform well enough for Chrome when doing a huge number of semantic operations (I was hacking in support for code lens). I spent some time trying to figure out if it could be fixed, but I believe it would have been too large of an architectural change.

- cquery interacts with an editor via the language server protocol, letting it work with any editor with relatively minimal work

- cquery handles larger repositories better (ie, indexing all of Chrome takes 20-30 minutes on a high-end workstation)

- cquery responds to semantic requests within 10ms or so

There are some other differences but those are related to the features implemented, ie, cquery supports code lens.

I'm not sure how code completion is in rtags, but I've spent a fair amount of time making it work as fast as possible in cquery. There is quite a bit of caching built on-top of the clang API which often makes it feel instantaneous.

qyron · on Nov 18, 2017

I've been a happy user of the ycmd+rtags tandem for a couple of years. A killer feature of rtags for me is its ability to run the server on a remote machine (of course the source code must be mirrored too). This allows me to develop on my weak 4-core laptop and offload indexing to a fast 32-core workstation.

Regarding indexing time, all of these tools seem to parse source code using either libclang (C API) or "native" C++ API (RecursiveASTVisitor etc.), so IMHO any difference in indexing time between rtags and cquery should come from such factors as number of parsing threads, database for storing tags, caching etc.

Anyway I'm really excited about cquery and even consider moving to VSCode just because of it (being a long-term VIM user). Reliable "Find references" feature is (IMHO) a must-have functionality for large codebases and currently (thanks to cquery and rtags) is supported much better in modern C++ than in other system languages (such as Go and Rust).

sient · on Nov 18, 2017

> A killer feature of rtags for me is its ability to run the server on a remote machine (of course the source code must be mirrored too).

I have a similar use case in mind, so I'm planning on trying to get this working by writing a simple script that proxies language server messages over SSH/TCP. Ideally it should work with any language server.
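Whatever transport such a proxy uses, it has to split the byte stream into individual messages before forwarding them. The language server protocol frames each JSON message with a `Content-Length` header followed by a blank line, so the core of a proxy could be sketched as below. The function names are made up for illustration, the header is assumed to be spelled exactly `Content-Length: `, and the actual SSH/TCP plumbing is omitted.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Wraps a JSON payload in the LSP base-protocol framing:
// "Content-Length: <bytes>\r\n\r\n<payload>".
std::string frame_message(const std::string& json_body) {
    return "Content-Length: " + std::to_string(json_body.size()) +
           "\r\n\r\n" + json_body;
}

// Tries to remove one complete message from the front of `buffer`.
// Returns true and fills `json_body` on success; returns false if more
// bytes are needed. Throws std::invalid_argument on a malformed header.
bool pop_message(std::string& buffer, std::string& json_body) {
    const std::string separator = "\r\n\r\n";
    const std::string key = "Content-Length: ";

    const std::size_t header_end = buffer.find(separator);
    if (header_end == std::string::npos) {
        return false;  // header block not fully received yet
    }
    const std::size_t key_pos = buffer.find(key);
    if (key_pos == std::string::npos || key_pos > header_end) {
        throw std::invalid_argument("missing Content-Length header");
    }
    // std::stoul stops at the first non-digit, i.e. at the "\r" ending the line.
    const std::size_t value_pos = key_pos + key.size();
    const std::size_t length =
        std::stoul(buffer.substr(value_pos, header_end - value_pos));

    const std::size_t body_start = header_end + separator.size();
    if (buffer.size() - body_start < length) {
        return false;  // body not fully received yet
    }
    json_body = buffer.substr(body_start, length);
    buffer.erase(0, body_start + length);
    return true;
}
```

A proxy loop would append whatever bytes arrive from one side to `buffer`, call `pop_message` until it returns false, and write `frame_message(json_body)` to the other side for each message it extracts.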

> Regarding indexing time, all of these tools seem to parse source code using either libclang (C API) or "native" C++ API (RecursiveASTVisitor etc.), so IMHO any difference in indexing time between rtags and cquery should come from such factors as number of parsing threads, database for storing tags, caching etc.

Yea, it is amazing how big of a difference the architecture around indexing makes - any sort of global lock/shared state really hurts performance. I spent a significant amount of time finding the right architecture to make each index job as independent as possible. Most of the design decisions in cquery are oriented towards either latency or throughput at the cost of things like memory and total system load (I've since reduced memory usage, but at one point cquery used 30gb after indexing Chrome - now it is around 5gb).

> Anyway I'm really excited about cquery and even consider moving to VSCode just because of it (being a long-term VIM user). Reliable "Find references" feature is (IMHO) a must-have functionality for large codebases and currently (thanks to cquery and rtags) is supported much better in modern C++ than in other system languages (such as Go and Rust).

I'd like to see cquery support in vim as well using a vim LSP implementation :). But yes, I agree with you - cquery makes me want to continue using C++ over Rust simply because the tooling works a lot better.

chillee · on Nov 19, 2017

> I have a similar use case in mind, so I'm planning on trying to get this working by writing a simple script that proxies language server messages over SSH/TCP. Ideally it should work with any language server.

I think I know at least 2 efforts to do this by now. One at Facebook with Nuclide, and I believe VS Live Share does something similar.

I think it's an idea whose time has come, so I think it's pretty cool that so many people are doing it.

sient · on Nov 19, 2017

I'm happy with whatever works, but what I have in mind is very simple so I don't expect it to take a lot of engineering time :)

yalph · on Nov 17, 2017

Use c++ on a daily basis. I think the direction the language is going is really really troublesome. It was a mistake to follow boost’s lead. At this point the language is more or less a clusterfuck. While python peeps are saving the world, here on this dark side of the world we wrestle with move semantics. Simplicity and beauty of C is long gone.

sobellian · on Nov 17, 2017

I used to use C++ a lot for a mixture of reasons. I avoid it like the plague now, because I'm not confident that anyone completely understands the semantics of a non-trivial C++ program.

Even if I did, the cognitive overhead of this stuff detracts from real programming.

kraghen · on Nov 17, 2017

I have a suspicion that all non-trivial C++ programs contain undefined behaviour (e.g. not checking for or preventing overflow at every arithmetic operation involving signed integers), so trying to understand the semantics of these programs is kind of a moot point.
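As an illustration of what "checking for overflow at every arithmetic operation" would mean in practice, here is a minimal sketch of a checked signed addition. The name `checked_add` is made up for this example; the point is only that the range test has to happen before the addition, because the overflowing addition itself is the undefined behaviour.

```cpp
#include <limits>

// Adds a and b only if the sum is representable as an int.
// Returns false (leaving `result` untouched) instead of overflowing.
bool checked_add(int a, int b, int& result) {
    if (b > 0 && a > std::numeric_limits<int>::max() - b) {
        return false;  // a + b would exceed INT_MAX
    }
    if (b < 0 && a < std::numeric_limits<int>::min() - b) {
        return false;  // a + b would fall below INT_MIN
    }
    result = a + b;
    return true;
}
```

Few real programs wrap every `+` this way, which is the practical basis for the suspicion above.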

On the other hand, if anyone tried to formalise the semantics of C++ and prove some kind of soundness theorem they would probably find that the language definition is technically inconsistent.

Paraphrasing Feynman, if you think you understand C++ then you don't understand C++.

fhood · on Nov 17, 2017

Also use c++ on a daily basis. It is a dark place, full of nooks and crannies inhabited by venomous insects that only get more dangerous with age. Also you can't declare a variable without constructing it or making it a raw pointer. Which. is. bullshit.

Edit: also, the memory issue for these types of services is a huge problem for me, since I am usually working on multiple projects at once, and when combined with all 32 of my chrome tabs, my work laptop starts to indicate some unhappiness.

gpderetta · on Nov 18, 2017

You can trivially declare a variable without constructing it by wrapping it in an anonymous union.

You'll have to destroy it yourself though because the compiler can't guarantee to be able to prove that it has been constructed at the end of scope. You can wrap it in something like boost optional to get safety.
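A minimal sketch of that trick follows. Because an anonymous union cannot declare its own constructor or destructor, this version places it inside a small wrapper struct whose empty constructor and destructor deliberately leave the member alone. `Uninitialized` and `Tracked` are made-up names; `Tracked` exists only to make it visible when construction actually happens.

```cpp
#include <new>
#include <string>
#include <utility>

// Storage for a T that is declared without running T's constructor.
template <typename T>
struct Uninitialized {
    union {
        T value;  // anonymous union member: not constructed automatically
    };
    Uninitialized() {}   // deliberately does not construct `value`
    ~Uninitialized() {}  // deliberately does not destroy `value`

    // Constructs `value` in place; the caller must later call destroy().
    template <typename... Args>
    void construct(Args&&... args) {
        ::new (static_cast<void*>(&value)) T(std::forward<Args>(args)...);
    }

    // Runs the destructor by hand, as the compiler will not do it for us.
    void destroy() { value.~T(); }
};

// Hypothetical helper type that counts how many instances are alive.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    Tracked(const Tracked&) { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;
```

Declaring `Uninitialized<Tracked> slot;` leaves `Tracked::alive` at 0 until `slot.construct()` is called. Forgetting `destroy()`, or calling it on a slot that was never constructed, is exactly the safety gap that something like boost optional closes.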

mmjaa · on Nov 19, 2017

>> Simplicity and beauty of C is long gone.

Lives on in Lua. Everything I want to do in C++, I can do with Lua instead. Plus: still C under it all.

speps · on Nov 17, 2017

Is there one similar to this but that doesn't require your code to compile under Clang? I have plenty of projects at home that don't instantly work under Clang. So far, I've never seen any C++ language server that doesn't rely on Clang unfortunately.

orbifold · on Nov 17, 2017

Unlikely because GCC deliberately made it hard to reach into its internals for a long time. The only other implementation is the one by Microsoft based (presumably) on their C++ compiler.

emmelaich · on Nov 18, 2017

There is an LSP server in progress for clang C/C++.

https://reviews.llvm.org/diffusion/L/browse/clang-tools-extr...

snowAbstraction · on Nov 17, 2017

I remember an old project to use the Eclipse CDT C++ support via vim instead of in the Eclipse IDE: http://eclim.org/index.html

I think there is/was some attempt to do it for emacs as well.

feanaro · on Nov 17, 2017

What are the usual culprits that prevent it from working under clang?

jhasse · on Nov 17, 2017

Clang doesn't support

    template<class ...Args>
    void foo(int x = 0, Args&&... args)

for example (default argument before variadic template args).
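For anyone who wants to try this on their own toolchain, a complete version of that declaration might look like the sketch below. The `int` return type and the body are invented here so that there is something to call, and it will only build on a compiler that accepts a default argument ahead of a function parameter pack.

```cpp
// Same shape as the declaration above, with a made-up body:
// returns x plus the number of extra arguments that were passed.
template <class... Args>
int foo(int x = 0, Args&&... args) {
    return x + static_cast<int>(sizeof...(args));
}
```

On a compiler that accepts it, `foo()` uses the default for `x` with an empty pack, and `foo(5, 'a', 2.0)` binds `x` to 5 and puts the remaining two arguments in the pack.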

lfowles · on Nov 17, 2017

Wow, I was going to guess that was illegal anyways, but cppref validates that behavior[0].

0: http://en.cppreference.com/w/cpp/language/default_arguments

jdright · on Nov 17, 2017

Run away from Jonathan Blow: as of today he was in full rant mode about language servers and similar stuff, including about people who think this is acceptable stuff, notably the Hacker News audience.

FYI, I 150% disagree with his useless rants.

j_s · on Nov 17, 2017

https://twitter.com/Jonathan_Blow/status/931147749306896385

iainmerrick · on Nov 17, 2017

I saw that on Twitter but I couldn't figure out what his actual complaint was. I think I'd probably disagree but I'm still interested in knowing.

cgag · on Nov 17, 2017

That doing all of this stuff via rpcs vs function calls is crazy I believe.

iainmerrick · on Nov 17, 2017

He's not totally wrong, but casting it as a 100% clear-cut decision -- with the RPC version as the obviously bad, "highly damaging" choice -- is just wilfully stupid and dishonest. It's clearly a tradeoff.

Too · on Nov 18, 2017

Maybe if your code lives in a small bubble with everybody using the same IDE but definitely not for larger projects.

For huge projects you want the build/index performance of a fast build server rather than a laptop, the build environment inside a docker container, the IDE in windows and the build in linux, and a free choice of IDE rather than tightly tied plugins.

Though I fully agree that json might not be the best packaging.

pjmlp · on Nov 17, 2017

Especially from a security point of view.

There is a reason why most of us learned our lesson and moved plugins out-of-process.

Performance can still be achieved via shared memory, exposing much less data from the host process.

emmelaich · on Nov 18, 2017

Not only via rpc but json over rpc, because latency.

He has a point but there's nothing to stop you bringing some of this in-process or caching aggressively.

blaket · on Nov 17, 2017

What's the word on using this with emacs?

sient · on Nov 17, 2017

Eventually this will be supported with lsp-mode. If you check https://gitter.im/cquery-project/Lobby @topisani made good progress here quickly and already has cquery up and running.

snowAbstraction · on Nov 17, 2017

Search for the discussion about rtags. It does something similar for emacs using libclang.

jdlyga · on Nov 17, 2017

This looks nice. Reminds me a lot of the Clang Code Model that I use with QtCreator.

photonios · on Nov 17, 2017

I can only applaud this effort!

pollow · on Nov 17, 2017

Looks pretty. I wonder how it is different from ycmd?

sient · on Nov 17, 2017

See https://news.ycombinator.com/item?id=15725119