Hacker News | CalmStorm's comments

For the first time, it introduced native sparse attention into the full training process, achieving up to 11× inference speedup while maintaining model performance.


LLaSA is a simple framework for speech synthesis that employs a single-layer vector quantizer (VQ) codec and a single Transformer architecture to fully align with standard LLMs such as LLaMA.


Probably the title should have the correct capitalization then. Because I was fully expecting a speech synthesis tool that sounded like llamas speaking human language, and now I'm bummed out!


I don’t quite understand why OpenAI would pay so much when there’s a solid open-source alternative like Cline. I tried both, and I feel that Cline with DeepSeek V3 is comparable to Cursor and more cost-effective.


Plus they already have a coding agent. This is beginning to feel like Yahoo 2.0.

When you have just raised $40 billion, spending $3 billion on a company whose product you also build yourself is dumb as rocks.


People do this exact thing all the time. Facebook paid $1 billion for Instagram when Facebook's cash and marketable securities were only $9-10 billion, even though Facebook already had a mobile social media app.


That was a mostly stock deal, right?


I couldn't remember, thanks for raising that point. So, I searched and found this CNET article[0] which states Facebook ended up only paying $715 million: "Although Instagram's original price tag was $1 billion, Facebook ends up paying $521 million in cash and $194 million in stock for the popular photo-sharing network."

I guess not mostly stock, but still half a billion cash, although not $1 billion. I guess my original point still stands, though it isn't quite as impactful an example. :)

[0] https://www.cnet.com/tech/services-and-software/facebooks-fi...


That describes Google buying YouTube and Facebook buying WhatsApp; those seem to have turned out okay.


WhatsApp is not profitable yet. At least not as of October 2024.

YouTube had the advantage of being able to host pirated videos, something that I'm not sure Google would have been able to do itself. YouTube gained traction in ways Google couldn't duplicate.


People clearly want the subscription model where they don’t have to worry about API keys and such. I bet a huge chunk of this market is non technical people who can’t code and don’t realize how bad the code they’re writing is when Windsurf and Cursor chop off the context to make it cheaper to run.


Maybe enterprise customers. Open-source alternatives aren't really an alternative for many huge enterprises, due to the lack of support.


they get the people too... they are still hiring like crazy.


If it achieves its goal of being both “safe” and “superintelligent”, it may have commercial value; e.g. enterprises may want to use it instead of OpenAI’s ChatGPT. Think of the recent Air Canada chatbot lawsuit.


One of the most popular HSMs is the Thales Luna Network HSM, which can perform 20,000 ECC operations per second [1]. Even at the scale of Azure AD, Microsoft may not need many HSMs for signing purposes. HSMs are not particularly easy to manage, though; maybe that is one of the reasons they are not used as much as they should be.
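To put that throughput figure in perspective, here's a back-of-envelope sketch. The load number below is purely hypothetical (not a real Azure AD figure); only the 20,000 ops/sec comes from the vendor page above:

```python
import math

ECC_OPS_PER_SEC_PER_HSM = 20_000  # Luna Network HSM figure cited above

def hsms_needed(signings_per_sec: float, utilization: float = 0.5) -> int:
    """HSM count needed at a given signing rate, running each unit
    at the given target utilization to leave headroom for peaks."""
    usable = ECC_OPS_PER_SEC_PER_HSM * utilization
    return math.ceil(signings_per_sec / usable)

# Even an assumed peak of 50,000 token signings per second needs
# only a handful of units at 50% utilization:
print(hsms_needed(50_000))  # -> 5
```

Which is consistent with the point above: the bottleneck is operational complexity, not raw signing throughput.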

[1] https://cpl.thalesgroup.com/encryption/hardware-security-mod...


For modern CPUs, is microcode still used for division, or are there better and faster alternatives?


The 80186 already implemented the loop for multiplication and (unsigned) division in dedicated logic that could do 1 bit per cycle.

The microcode only had to set up the registers, test for overflow, and undo the final subtraction if it underflowed (since it now used a non-restoring algorithm).

Starting with the 486 there is likely no longer any microcode involved, and newer chips can handle multiple bits per step. See the end of the article: the first algorithm to do this dates from 1957!
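For reference, the non-restoring loop being described (shift in one dividend bit per step, subtract or add the divisor depending on the sign of the partial remainder, then fix up a negative final remainder) can be sketched in Python. This is an illustrative model of the algorithm, not the actual 80186 microcode:

```python
def nonrestoring_divide(dividend: int, divisor: int, bits: int = 16):
    """Unsigned non-restoring division, one quotient bit per iteration,
    mimicking the 1-bit-per-cycle hardware loop. Returns (quotient, remainder)."""
    assert 0 <= dividend < (1 << bits) and divisor > 0
    r, q = 0, 0
    for i in range(bits - 1, -1, -1):
        bit = (dividend >> i) & 1
        if r >= 0:
            r = (r << 1) + bit - divisor   # try a subtraction
        else:
            r = (r << 1) + bit + divisor   # add back instead of restoring
        q = (q << 1) | (1 if r >= 0 else 0)
    if r < 0:          # the final "undo the last subtraction" step
        r += divisor
    return q, r

print(nonrestoring_divide(1000, 7))  # -> (142, 6)
```

The "non-restoring" trick is that a failed subtraction isn't immediately undone; the next step compensates by adding the divisor instead, so every step is exactly one add/subtract.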


It's way more complex than this on modern CPUs, so it's harder to explain.

> is microcode still used for division?

Yes, almost everything on a modern CPU uses "microcode" of some kind, although the term gets kind of hazy, since everything is out-of-order and a lot of instructions are issued in parallel. In a typical modern CPU, the "frontend" will decompose instructions into uOps, which then get pushed into a "reservation station" / "scheduler." The scheduler queues and reorders various uOps in various surprising and complicated ways to try to account for interdependencies and memory latency. Eventually, a uOp is issued to an "execution port," which is connected to a fixed-function piece of logic that actually performs part or all of an operation (for example, an Arithmetic Unit / ALU).

But, while microcode will be _involved_ still, most modern CPUs will have fixed-function hardware for the meaty parts of the division instruction - they generally speaking won't implement division _purely_ using microcode like the algorithm documented in the article.

> are there better and faster alternatives?

They're not "alternatives" per se, but there are a _lot_ of ways to implement division algorithmically, and a _lot_ of ways to trade size for speed.
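As one example of those trade-offs: many multiplier-based dividers refine a reciprocal with Newton-Raphson iteration, which doubles the number of correct bits per step. A hedged Python sketch of the idea (illustrative only; real hardware seeds the guess from a small lookup table and works in fixed point, and this is not what any particular CPU does):

```python
import math

def newton_divide(a: float, d: float, iters: int = 6) -> float:
    """Compute a/d by Newton-Raphson refinement of 1/d:
    x_{n+1} = x_n * (2 - d * x_n), quadratic convergence."""
    assert d > 0
    # Crude seed: 2**-e is within a factor of 2 of 1/d.
    e = math.frexp(d)[1]
    x = math.ldexp(1.0, -e)
    for _ in range(iters):
        x = x * (2.0 - d * x)   # each step roughly squares the error
    return a * x
```

With a seed accurate to 1 bit, 6 iterations reach full double precision; a hardware table seed of ~8 correct bits would need only 2-3, which is where the size-versus-speed trade lives.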

Improving integer division performance has been a fairly big focus in newer CPU microarchitectures, with major improvements arriving in the latest Intel, AMD, and Apple Silicon architectures.

https://uops.info/table.html will show how many uOps a given x86 instruction decomposes to, what ports it uses, and rough latency estimates for the instruction's execution.

Here's some reading I found, with a lot of references:

* A discussion of various modern division implementations: https://stackoverflow.com/questions/71420116/why-is-there-on...

* Performance comparison of integer division in modern architectures with some implementation speculation: https://news.ycombinator.com/item?id=27133804

* An in-depth look at the division unit in mid-old Intel CPUs (Penryn), look for "Radix-16" : https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&d...


>almost everything on a modern CPU uses "microcode" of some kind, although the term gets kind of hazy

µOps are different from the kind of microcode described here. Older x86 CPUs basically had a "bytecode interpreter" in microcode ROM, every instruction (except for some trivial set/clear flag operations) would go to a specific entry point, and even something simple like addition would take at least two µ-instrs.

The 80486 was the first generation that could decode some opcodes directly into one-cycle µOps.

edit

The term "interpreter" is of course a simplified description. The decoding itself is done outside of microcode, and there is logic to select different registers or ALU operations etc. But conceptually it's similar in that almost every opcode transfers control to some sequence of microinstructions ending in "RNI", which acts like a jump back to the main interpreter loop.

The 8086 is actually the closest to the "RISC-like microcode" meme, in that even address computation is done by a series of µ-instrs.


Edit: just saw your edit, that's something I'd never really thought about before - 8086 is the "purest" microcoded processor from the x86 series, in that every instruction runs through an actual interpreter rather than some form of fixed-function instruction issue unit!

In the case of integer division, I think that it's also the "true" kind of microcoded instruction on many modern CPUs. That is to say, the instruction goes through the actual microcode interpreter to issue the uOps, rather than the fixed-function decoder. Although, it's been a while since I had to worry about microcode switches, and it looks like maybe this isn't true anymore in the very newest microarchitectures?

I think this is for two reasons: so that the microcode can switch between "fast" and "slow" division and issue a different uOp program for fast division, and because most division is longer than the fixed-function decoder width (I think on Intel it used to be anything longer than 4 uOps?).

Anyway, I figured this was probably a bit more detail than what OP needed for their question about division algorithms, so here are my takeaways:

* Yes, integer division is implemented as multiple operations on many modern CPUs, although it is increasingly moving towards hardware (fewer uOps).

* Sometimes the micro-instruction programs for division are encoded in a fixed operation decoder and sometimes they are themselves generated by microcode.

* But, at the same time no, division is not implemented algorithmically using purely non-division functions, there is usually some fixed-function division logic of various types.

* This is a hot topic so change is rapid.


While division may still decode to multiple uOps, I seriously doubt that there's a loop in microcode on modern processors. The pipeline latency makes that infeasible.

The looping logic is almost certainly a bit of fixed function hardware in the execution unit.


Hmm. This gets into the fuzzy definition of "loop in microcode" depending on how you look at the system. I don't think the actual looping happens in microcode, that is, it's not like the ucode unit jumps to earlier ucode - this wouldn't make sense architecturally for a variety of reasons.

However, in the case of 64-bit integer division on mid-aged Intel processors (for example, Kaby Lake), I do think that division is both iterative and microcoded (versus fixed-function logic), but that the ucode emits an _unrolled_ loop into the scheduler.

IDIV with 64-bit operands on Kaby Lake takes 56/57 uOps (!) vs the still-huge 11 uOps for 32-bit IDIV. (for comparison, we're down to 5/4 uOps for 64-bit division on Alder Lake).


Take a look at Agner Fog's https://www.agner.org/optimize/instruction_tables.pdf

For example, Zen4 64-bit DIV is listed as: 2 uOps, 10-18 cycles latency, 7-12 cycles inverse throughput.

This suggests uOps with variable execution lengths, i.e. iteration happening in the execution unit and not just a fixed unrolled loop streamed by the microcode part of the frontend.

You may be right that there were some CPUs that did the fixed unrolling, but it doesn't seem that common.


My understanding is that there can be both: on some implementations the execution pipes themselves have a 'nanocode' for things like CORDICs and maybe division, whose execution streams are kicked off from the one or two high-level uOps that the instruction decoder emits.


Sort of. uOps don't have to come out of ucode. There's fixed function hardware that cracks most instructions into uOps and only falls back to ucode for particularly complex operations.



The article doesn’t mention whether SQLite WAL mode is on. That could make SQLite significantly faster: https://sqlite.org/wal.html
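For anyone wanting to try it: WAL is a persistent, per-database-file setting enabled with a pragma. A minimal Python sketch (the file path is just an example):

```python
import os, sqlite3, tempfile

path = os.path.join(tempfile.mkdtemp(), "tmp.db")
conn = sqlite3.connect(path)

# journal_mode=WAL is persistent: set it once and it sticks for the file.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)  # -> wal

# synchronous=NORMAL is the usual pairing with WAL: still safe against
# application crashes, with far fewer fsyncs than FULL.
conn.execute("PRAGMA synchronous=NORMAL")
conn.close()
```

The pragma returns the mode actually in effect, so checking the result is also how you verify WAL was accepted (in-memory databases, for example, stay in `memory` mode).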


It mentions it at the end: ./tmp.db?_timeout=5000&_journal=WAL&_sync=1

Not sure if it was updated after your comment.


Oracle’s ownership of MySQL AB seems to have had a negative impact on its development. My concern, which is also raised in the article, is whether similar things will happen to PostgreSQL.


I doubt that's possible, for a number of reasons.

1) Neither company owns any part of the PostgreSQL code / docs. We do have various proprietary tools, of course, but so do other companies.

2) The companies employ quite a few PostgreSQL hackers, but if you look at the development stats (e.g. http://rhaas.blogspot.com/2020/05/who-contributed-to-postgre...) you'll see there are many committers and developers in other companies.

That being said, I agree with the concerns expressed by Bruce in his blog post. Or more precisely - I do understand why people in the community have those concerns, but I believe (hope) it's not going to harm the community.


This is why I am thankful that Percona [1] is already involved with PG.

Percona created their own MySQL distribution [2] with a number of improvements, which was quite helpful for the MySQL community while awaiting Oracle's (very long) release cycles.

1 - https://www.percona.com/software/postgresql-distribution

2 - https://www.percona.com/software/mysql-database


What programming languages does it support? I couldn’t find out from the press release.


I had to click around a bit from the PR but:

CodeQL code scanning automatically detects code written in the supported languages

   C/C++
   C#
   Go
   Java
   JavaScript/TypeScript
   Python
Source: https://docs.github.com/en/free-pro-team@latest/github/findi...


Interesting. Consider that GitHub uses lots of Ruby, but the tool does not support Ruby from the get-go.


PM from GitHub here.

We're adding Ruby support to CodeQL (the scanning engine used in code scanning by default). It's our top requested language, and one we use extensively internally. Adding each new language to CodeQL takes about 6-9 months and needs a team to maintain it in perpetuity, which is why we don't have it yet, but we're starting that work now.

The other languages we hear the most demand for CodeQL support on are PHP, Kotlin and Swift. We'll get to all of those - it will just take a little time.

In the meantime, all of the code scanning experiences are extensible, so you can use other scanning engines with it, like Brakeman for Ruby.


Is the CodeQL project itself open source? I would love to contribute support for Elixir.



CodeQL is based on an existing product from a company called Semmle, which GitHub acquired in late 2019 [1].

They have been part of GitHub for barely a year, so it's not too surprising, especially given they are continuing to support the product for the enterprise customers they had previously, not just GitHub.

[1] https://techcrunch.com/2019/09/18/github-acquires-code-analy...


I have been noticing that a lot in other places as well. Ruby is no longer a "default supported" language in many new projects.

OpenTelemetry, for example, doesn't include Ruby in its initial beta program announcement:

".NET, Java, JavaScript, Python, Go, and Erlang!"


>Ruby is no longer a "Default supported" language in many projects.

Well, it has always been like that. Amazon and Google have always had a thing about Ruby, and it is a minority market, so I am not surprised; it doesn't make sense from a business perspective. But GitHub is a heavy Ruby user, so I would have thought Ruby would be a first-class citizen. I wonder if it has something to do with the language's complexity.

Edit: From Github.

https://news.ycombinator.com/item?id=23094160

We (GitHub) absolutely plan to expand the list of languages CodeQL supports, and Ruby is a language we'd love to add (we're heavy users of it internally). In the meantime, because code scanning is extensible you can plug in third party analysis engines to scan the languages that CodeQL doesn't support.


Kind of surprised PHP isn't a front runner here


OK, so I won’t be using it. I write in Swift, ObjC, and PHP.

I really think that PHP should be on that list.


That's a pretty good start! I'm happy to see security checking be democratized.


Code scanning itself isn't language-specific. There are open source and commercial tools for many languages available. CodeQL is GitHub's first-party tool, and it supports C, C++, C#, Go, Java, JavaScript, TypeScript, and Python.


I'm not sure but from [1]:

> This open-source repository contains the extractor, CodeQL libraries, and queries that power Go support in LGTM and the other CodeQL products that GitHub makes available to its customers worldwide.

[1]: https://github.com/github/codeql-go#go-analysis-support-for-...


If I read correctly, code scanning is based on CodeQL, and I found this: https://help.semmle.com/codeql/supported-languages-and-frame...

So it looks like it supports C/C++/C#/Go/Java/JS/Python/TS

