
Funny how you can take the "optimized" C++ and make it significantly faster (vs. the GNU/Linux standard libraries) with little effort. A lookup table for to_lower, instead of bit math, helps slightly. Building with the LLVM-libc string functions helps a lot, because GNU's std::string_view::operator== calls an AVX-optimized memcmp via the PLT, which is a terrible strategy for short strings. LLVM-libc has an inlined bcmp (note: bcmp can be significantly faster than memcmp because it only has to report equality, not ordering, but on GNU they are aliases) that is resolved for the target platform at build time instead of run time.
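For anyone curious what the lookup-table to_lower might look like, here is a minimal sketch, assuming C++17 and ASCII-only input. The names `make_lower_table`, `kLowerTable` and `table_to_lower` are made up for illustration and are not taken from optimized.cpp.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: build a 256-entry ASCII lowercase table at compile time.
constexpr std::array<unsigned char, 256> make_lower_table() {
  std::array<unsigned char, 256> t{};
  for (std::size_t i = 0; i < t.size(); ++i) {
    const auto c = static_cast<unsigned char>(i);
    // Map 'A'..'Z' to 'a'..'z'; every other byte maps to itself.
    t[i] = (c >= 'A' && c <= 'Z') ? static_cast<unsigned char>(c + ('a' - 'A')) : c;
  }
  return t;
}

inline constexpr auto kLowerTable = make_lower_table();

// One indexed load per character instead of compare-and-mask bit math.
inline char table_to_lower(char c) {
  return static_cast<char>(kLowerTable[static_cast<unsigned char>(c)]);
}
```

The cast to unsigned char matters: indexing with a negative char would read outside the table.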

Edit:

Even if you ignore LLVM-libc, just slapping this into optimized.cpp and replacing the critical `==` with `our_bcmp` makes it 10% faster. IFUNC calls to micro-optimized SIMD functions are counterproductive. It is far, far better that the compiler can see all the code at build time.

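  #include <cstddef>  // std::size_t

  // bcmp semantics: 0 if the first sz bytes match, nonzero otherwise.
  // A plain loop the compiler can see and inline at the call site.
  int our_bcmp (const char* a, const char* b, std::size_t sz) {
    for (std::size_t i = 0; i < sz; i++) {
      if (a[i] != b[i]) return 1;
    }
    return 0;
  }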
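Roughly how the `==` replacement could look at the call site, as a sketch only. `sv_equal` is a hypothetical wrapper name, and I am assuming the operands are std::string_view as in the operator== mentioned above. The lengths have to be compared first, since `our_bcmp` only looks at the first sz bytes. The loop is repeated here so the snippet compiles on its own.

```cpp
#include <cstddef>
#include <string_view>

// Same byte loop as our_bcmp above, marked static inline so it stays
// visible to the optimizer within this translation unit.
static inline int our_bcmp(const char* a, const char* b, std::size_t sz) {
  for (std::size_t i = 0; i < sz; i++) {
    if (a[i] != b[i]) return 1;
  }
  return 0;
}

// Hypothetical stand-in for std::string_view::operator==.
// Different lengths can never be equal, so check size before the bytes.
static inline bool sv_equal(std::string_view a, std::string_view b) {
  return a.size() == b.size() && our_bcmp(a.data(), b.data(), a.size()) == 0;
}
```

With that in place, a comparison like `word == other` becomes `sv_equal(word, other)`, and no call goes through the PLT.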

