More

mrfox321 · 2025-11-29T17:28:32 1764437312

... the toppings contains potassium benzoate

bn-l · 2025-11-30T05:05:17 1764479117

Is that bad?

mrfox321 · 2025-07-08T18:41:56 1752000116

I know right?

They have always been behind. Why would this time be any different?

mrfox321 · 2025-06-15T13:26:43 1749994003

Read the paper.

They control for the data being in-distribution

Their dataset also has examples of the problem being solved.

mrfox321 · 2025-05-21T17:50:32 1747849832

Sounds like excuses

mrfox321 · 2025-03-07T10:51:21 1741344681

They didn't show this, they just increased the length where accuracy breaks down.

energy123 · 2025-03-07T11:00:41 1741345241

Explain? OpenAI showed the new scaling law in December 2024 that performance keeps increasing proportional to ln(N reasoning tokens)

mentalgear · 2025-03-07T12:17:09 1741349829

link?

mrfox321 · 2025-02-19T15:07:22 1739977642

What's your leg circumference at, now?

Mine is also smaller, due to patella tendinopathy.

wkirby · 2025-02-19T16:08:31 1739981311

They are within 1 inch of each other, which is fine with me. I haven’t measured in over a year, I know there’s still a strength imbalance but that’s not what feels limiting to me anymore.

mrfox321 · 2025-01-02T19:10:23 1735845023

Who do you think the shareholders are?

What demographic are they?

sosborn · 2025-01-02T19:12:39 1735845159

Do you think it's a generational issue, or a class issue?

superkuh · 2025-01-03T14:25:51 1735914351

Exactly. It's not the fault of the youths and it's not the fault of the old. Both it and mine are obviously false claims yet one is accepted as a WSJ article and one is pointed at as silly.

My "alternately" was not to say that the alternate was an accurate description of reality but only to highlight how silly such broad statements are when you look from the other end.

mrfox321 · 2024-12-20T15:14:34 1734707674

I mean, it's their serving infra, but not their features. See my reply in another parent comment.

mrfox321 · 2024-12-20T13:44:24 1734702264

Not true. Why do IG reels and YouTube shorts suck, then?

They clearly built something superior. And it can't seem to be matched by the biggest tech companies.

xnx · 2024-12-20T13:52:20 1734702740

More data beats better algorithms. TikTok has vastly more interaction data by nature of its design. IG and YouTube shorts don't have nearly the volume of engaged users and are reluctant to disrupt the cash cow of their traditional interfaces.

mrfox321 · 2024-12-20T14:08:51 1734703731

IG has plenty of data (I did ML at IG). Don't be naive.

PaulHoule · 2024-12-20T14:41:24 1734705684

It is the users and not the company.

I haven't worked on sites as big as YouTube but on sites with 100,000 members who are very much engaged with one "game" you usually find they are mainly indifferent when you offer them another "game" to play.

I like YouTube for what it is. I have interacted very little with shorts but Google has scarily seen into my imagination. I don't want to go into that rabbit hole.

mrfox321 · 2024-12-20T15:07:06 1734707226

IG users and Tiktok users were / are quite similar. Especially when Tiktok wasn't yet eating IGs lunch.

PaulHoule · 2024-12-20T17:26:30 1734715590

The definition of "similar" is the problem w/ vector search isn't it?

Two populations can be similar in terms of conventional demographics such as age, gender, race, what kind of clothes they wear, etc. but be different in their behavior. IG users are "players of the Instagram game" and TikTok are "players of the TikTok game" and a whole system of values and behaviors are involved.

To take an example playing the "engagement farming" game on Bluesky I can follow people and know some fraction of people will follow me back, but who do I want to follow?

I postulated that the people I want are people who will repost my photos so I can try following people who repost photos but I find that reposters are not "followers" whereas I get a much better response rate if I follow people who follow another social media photographer since those people are "followers". People have an online behavior signature like that which for me matters more than the color of your skin.

newsclues · 2024-12-20T14:05:02 1734703502

I question the assertion that TikTok has more interaction data than Google.

xnx · 2024-12-20T15:21:58 1734708118

Google has a ton of interaction data to be sure, but the app design decisions of TikTok (auto play, auto loop, easy swipe, easy like, etc.) extract so much more usable/actionable interaction data. The size of the like button on YouTube is a tiny percent of the screen. On TikTok the like button is the whole video.

leobg · 2024-12-20T21:39:33 1734730773

Not just that. The whole UI is designed for behavioral data aggregation.

It’s not just “did you click the like button”. It’s “did you swipe it away? How long did you watch until you swiped it away? Did you come back afterwards? Did you let it loop multiple times before moving on?”.

They’ll capture likes and dislikes you yourself probably didn’t even knew you had, just from tens and hundreds of these micro actions. And they’ll do it in the very first hour of you using the app, whereas YouTube won’t know too much about you even after months of you using it.

wiseowise · 2024-12-20T14:18:06 1734704286

More data? Seriously? What has more data than YouTube?

xnx · 2024-12-20T15:17:49 1734707869

A TikTok user may watch hundreds of videos and like dozens of them in a single viewing session. A YouTube user might watch ... 4? YouTube tried to force 10+ minute videos so they could insert television-style commercials.

4ad · 2024-12-20T14:55:25 1734706525

IG reels and YouTube shorts are crap because creators create good content only for the place where all the audience is, which is TikTok. When users open TikTok they expect TikTok-style content. When users open YouTube they don't expect TikTok-style content, in fact they hate it. Same with IG reels.

It has nothing to do with the quality of the algorithm. In fact the YT algorithm has gotten worse since they introduced shorts because they shove shorts into people's faces.

A better question would be why is the regular YouTube algorithm so bad. And the answer is because it doesn't optimize at all for the consumers, but for the producers (producers of ads, that is). TT has figured out it doesn't matter what people consume as long as they consume, whereas YT is bullish into controlling what people consume.

davidjfelix · 2024-12-20T13:51:31 1734702691

My take: it's a mix of brand bundling and lack of data. They're roughly equivalent but shorts is bundled with youtube which has its own brand perception and reels are bundled with IG/FB and have their own brand perception. Additionally fewer users means less algorithmic data to keep viewers.

Tiktok was allowed to establish its own brand and develop a community while shorts and reels are intrinsically tied to their past. They may be able to escape that history but I don't think it's helping them be fast movers or win "cool" points.

Majromax · 2024-12-20T13:57:56 1734703076

> My take: it's a mix of brand bundling and lack of data. They're roughly equivalent but shorts is bundled with youtube which has its own brand perception and reels are bundled with IG/FB and have their own brand perception. Additionally fewer users means less algorithmic data to keep viewers.

My intuition would work the other way around. I'd expect offerings from more established companies to have a big leg up in terms of usable data. Youtube should be able to use a viewer's entire watch/subscription history to inform itself about what shorts a user might like, even before they've interacted with their first short. Bytedance, on the other hand, has to start from scratch with each truly new user.

The coolness or stodginess of the company would be secondary to its effects. If boring-old-Youtube could promise shorts creators great exposure to an enthusiastic audience, it would win the platform regardless of its brand.

PaulHoule · 2024-12-20T14:58:01 1734706681

I'll argue that TikTok's structure which offers you one video at a time gives you much more useful information than YouTube's interface, which looks like

https://www.threads.net/@mikeynerd/post/DB7DS7LzsVU

TikTok gets a definite thumbs up or thumbs down for every video it shows you whereas if you click on one particular sidebar video YouTube can make no conclusion about how you felt about the other videos in the sidebar. The recommendation literature talks about "negative sampling" to overcome this, I never could really believe in it, I think now it doesn't really work.

I built a system like that and found that, paradoxically, you have to make it blend in a good amount of content that it doesn't think you'd like for it to be able to calibrate itself.

trogdor · 2024-12-20T15:37:20 1734709040

> If boring-old-Youtube could promise shorts creators great exposure to an enthusiastic audience, it would win the platform regardless of its brand.

Just a guess, as someone who makes their living from YouTube: YouTube creators are driven to create content that earns them money. As compared to long-form content, YouTube shorts earn next-to-nothing, and it’s not clear that they drive significant new traffic to more-valuable content.

Most large creators on YouTube are focused on the bottom line, not exposure.

fragmede · 2024-12-20T15:40:59 1734709259

The reason shorts don't earn any money, as compared to Instagram and TikTok, is that they don't advertise crap for me to buy (I have YT premium), so I don't end up buying shit there like I do the other two.

spamizbad · 2024-12-20T15:00:04 1734706804

Having read the paper, what's unique about Bytedance's approach is how relatively simple it is at its core - obviously there's a lot of complexity around it to do it at scale, but I feel like it's simpler than the social-graph based approaches.

mrfox321 · 2024-12-20T15:12:07 1734707527

It's simpler intuition but more complex from a data / ml perspective.

Their algorithm is really built around their features. Specifically, temporal representations of user interest:

https://ieeexplore.ieee.org/document/9458799/

The features used by their algorithm tells you what a user is interested, historically.

Contrast this to Meta, which uses the social graph as their features. Imagine features like the number of times a user likes another author's / cluster's content.

Tiktok will serve you $TOPIC because you have $INTERACTED with $TOPIC historically.

Meta will serve you $TOPIC because you have $INTERACTED with $PEOPLE who post $TOPIC, historically.

Meta only coincidentally gives you what you like.

Tiktok knows what you like.

This is the difference. This is why IG is losing.

xnx · 2024-12-20T15:14:17 1734707657

That's a crazy design choice by meta if true. The interests of those in my social graph have very little connection to my interests.

mrfox321 · 2024-12-20T15:28:39 1734708519

It's because they originally built their recommendation system to recommend friends and their content. Here, the social graph makes complete sense as the foundation for their simple search algorithm. But as they expanded their recommendation capabilities, the features stuck around. It's the same reason why tech debt accumulates. Data sticks around in the same way code does. But data is even higher friction, since it's a superset of the code.

ldjkfkdsjnv · 2024-12-20T14:41:26 1734705686

Reels isnt getting the organic growth data, tiktok has a data moat.

solarkraft · 2024-12-20T17:29:46 1734715786

I haven’t used TikTok, but I find both Reels and Shorts addictive enough.

deeviant · 2024-12-20T14:56:01 1734706561

What part of right place at the right time did you miss?

mrfox321 · on Nov 29, 2024

At a big co I worked at, the lack of consistency between trading systems caused money to (dis)appear (into)out of thin air.

Prior to one of these hiccups, I hypothesized, given how shitty the codebase was, that they must be tracking this stuff poorly.

This led to an argument with my boss, who assumed things magically worked.

Days later, we received an email announcing an audit one one of these accounting discrepancies.

JPMC proposed using crypto, internally, to consistently manage cash flow.

Not sure if it went anywhere.

HolyLampshade · on Nov 29, 2024

At all of the exchanges and trading firms I’ve worked with (granted none in crypto) one of the “must haves” has been a reconciliation system out of band of the trading platforms. In practice one of these almost always belongs to the risk group (this is usually dependent on drop copy), but the other is entirely based on pcaps at the point of contact with every counterparty and positions/trades reconstructed from there.

If any discrepancies are found that persist over some time horizon it can be cause to stop all activity.

ajb · on Nov 29, 2024

Wait, pcap as in wireshark packet capture?

tnlnbn · on Nov 29, 2024

I'm not the commenter, but yes, often trading firms record all order gateway traffic to from brokers or exchanges at the TCP/IP packet level, in what are referred to as "pcap files". Awkwardly low-level to work with, but it means you know for sure what you sent, not what your software thought it was sending!

pclmulqdq · on Nov 29, 2024

The ultimate source of truth about what orders you sent to the exchange is the exact set of bits sent to the exchange. This is very important because your software can have bugs (and so can theirs), so using the packet captures from that wire directly is the only real way to know what really happened.

generic92034 · on Nov 29, 2024

But then the software capturing, storing and displaying the packets can also have bugs.

bostik · on Nov 29, 2024

Among all the software installed in a reputable Linux system, tcpdump and libpcap are some of the most battle tested pieces one can find.

Wireshark has bugs, yes. Mostly in the dissectors and in the UI. But the packet capture itself is through libpcap. Also, to point out the obvious: pcap viewers in turn are auditable if and when necessary.

wjholden · on Nov 29, 2024

Cisco switches can mirror ports with a feature called Switch Port Analyzer (SPAN). For a monitored port, one can specify the direction (frames in, out, or both), and the destination port or VLAN.

SPAN ports are great for network troubleshooting. They're also nice for security monitors, such as an intrusion detection system. The IDS logically sees traffic "on-line," but completely transparent to users. If the IDS fails, traffic fails open (which wouldn't be acceptable in some circumstances, but it all depends on your priorities).

generic92034 · on Nov 30, 2024

When I think "Cisco" I think error-free. /s

No, really, I get where you and your parent are coming from. It is a low probability. But occasionally there is also thoroughly verified application code out there. That is when you are asking yourself where the error really is. It could be any layer.

kortilla · on Nov 29, 2024

They can, but it’s far less likely to be incorrect on the capture side. They are just moving bytes, not really doing anything with structured data.

Parsing the pcaps is much more prone to bugs than capturing and storing, but that’s easier to verify with deserialize/serialize equality checks.

chairmansteve · on Nov 29, 2024

The result of bitter lessons learnt I'm sure. Lessons the fintechs have not learned.

baq · on Nov 29, 2024

That makes sense - but it's still somewhat surprising that there's nothing better. I guess that's the equivalent of the modern paper trail.

HolyLampshade · on Nov 29, 2024

It’s the closest to truth you can find (the network capture, not the drop copy). If it wasn’t on the network outbound, you didn’t send it, and it’s pretty damn close to an immutable record.

ajb · on Nov 29, 2024

It makes sense. I'm a little surprised that they'd do the day to day reconciliation from it but I suppose if you had to write the code to decode them anyway for some exceptional purpose, you might as well use it day to day as well.

thomasjudge · on Nov 29, 2024

The storage requirements of this must be impressive

bostik · on Nov 29, 2024

Storage is cheap, and the overall figures are not that outlandish. If we look at a suitable first page search result[0], and round figures up we get to about 700 GB per day.

And how did I get that figure?

I'm going to fold pcap overhead into the per-message size estimate. Let's assume a trading day at an exchange, including after hours activity, is 14 hours. (~50k seconds) If we estimate that during the highest peaks of trading activity the exchange receives about 2M messages per second, then during more serene hours the average could be about 500k messages per second. Let's guess that the average rate applies 95% of the time and the peak rate the remaining 5% of the time. That gives us an average rate of about 575k messages per second. Round that up to 600k.

If we assume that an average FIX message is about 200 bytes of data, and add 50 bytes of IP + pcap framing overhead, we get to ~250 bytes of transmitted data per message. At 600k messages per second, 14 hours a day, the total amount of trading data received by an exchange would then be slightly less than 710GB per day.

Before compression for longer-term storage. Whether you consider the aggregate storage requirements impressive or merely slightly inconvenient is a more personal matter.

0: https://robertwray.co.uk/blog/the-anatomy-of-a-fix-message

tetha · on Nov 29, 2024

And compression and deduplication should be very happy with this. A lot of the message contents and the IP/pcap framing overheads should be pretty low-entropy and have enough patterns to deduplicate.

It could be funny though because you could be able to bump up your archive storage requirements by changing an IP address, or have someone else do that. But that's life.

oblio · on Nov 30, 2024

Why? They're not streaming 4k video, it's either text protocol or efficient binary protocols.

w23j · on Nov 29, 2024

I would also really like to know that!

It generally seems to be a thing in trading: https://databento.com/pcaps

There is also this (though this page does not specify what pcap means): https://www.lseg.com/en/data-analytics/market-data/data-feed...

alexwasserman · on Nov 29, 2024

Look up Corvil devices by Pico.

Commonly used in finance.

https://www.pico.net/corvil-analytics/

alexwasserman · on Nov 29, 2024

Typically not a literal pcap. Not just wireshsrk running persistently everywhere.

There are systems you can buy (eg by Pico) that you mirror all traffic to and they store it, index it, and have pre-configured parsers for a lot of protocols to make querying easier.

Think Splunk/ELK for network traffic by packet.

cjalmeida · on Nov 29, 2024

Except it is literal “pcap” as they capture all packets at layer 3. I don’t know the exact specifications of Pico appliances, but it would not surprise me they’re running Linux + libpcap + some sort of timeseries DB

alexwasserman · on Nov 29, 2024

Well, probably, but I meant more like it's not typically someone running tcpdump everywhere and someone analyzing with Wireshark, rather than a systems configured to do this at scale across the desktop.

kelnos · on Nov 30, 2024

I don't think that's what anyone was assuming. A "pcap" is a file format for serialized network packets, not a particular application that generates them.

hn_version_0023 · on Nov 29, 2024

The Corvil devices used by Pico have IME largely been replaced by Arista 7130 Metamux platforms at the capture “edge”

HolyLampshade · on Dec 1, 2024

Which is great for the companies that have made the switch because those corvils were truly terrible.

HolyLampshade · on Nov 30, 2024

Looks like tnlnbn already answered, but the other benefit to having a raw network capture is often this is performed on devices (pico and exablaze just to name two) that provide very precise timestamping on a packet by packet basis, typically as some additional bytes prepended to the header.

Most modern trading systems performing competitive high frequency or event trades have performance thesholds in the tens of nanos, and the only place to land at that sort of precision is running analysis on a stable hardware clock.

Loic · on Nov 29, 2024

I suppose Pre-Calculated Aggregated Positions, but I am not an expert in the field.

SnorkelTan · on Nov 29, 2024

Looking at the order messages sent to and received from another trading system was not uncommon when I worked in that neck of the woods

chairmansteve · on Nov 29, 2024

The crypto firms are moving fast and breaking things. No need for that kind of safety shit, right? Would slow things down. Reminds me of Boeing.

nobodyandproud · on Nov 29, 2024

So is this capture used to reconstruct FIX messages?

HolyLampshade · on Nov 30, 2024

Yeah, FIX or whatever proprietary binary fixed-length protocols (OUCH or BOE for example) the venue uses for order instructions.

Some firms will also capture market data (ITCH, PITCH, Pillar Integrated) at the edge of the network at a few different cross connects to help evaluate performance of the exchange’s edge switches or core network.

phire · on Nov 30, 2024

Fun fact, centralized crypto exchanges don't use crypto internally, it's simply too slow.

As a contractor, I helped do some auditing on one crypto exchange. At least they used a proper double-entry ledger for tracking internal transactions (built on top of an SQL database), so it stayed consistent with itself (though accounts would sometimes go negative, which was a problem).

The main problem is that the internal ledger simply wasn't reconciled with with the dozens of external blockchains, and problems crept in all the time.

oblio · on Nov 30, 2024

> Fun fact, centralized crypto exchanges don't use crypto internally, it's simply too slow.

I know you're not arguing in their favor, just describing a reality, but the irony of that phrase is through the roof :-)))

Especially the "centralized crypto".

phire · on Nov 30, 2024

Yeah, that fact alone goes a long way to proving there is no technical merit to cryptocurrencies.

The reason they are now called "centralised crypto exchanges" is that "decentralised crypto exchanges" now exist, where trades do actually happen on a public blockchain. Though, a large chunk of those are "fake", where they look like a decentralised exchange, but there is a central entity holding all the coins in central wallets and can misplace them, or even reverse trades.

You kind of get the worst of both worlds, as you are now venerable to front-running, they are slow, and the exchange can still rug pull you.

The legit decentralised exchanges are limited to only trading tokens on a given blockchain (usually ethereum), are even slower, are still vulnerable to front-running. Plus, they spam those blockchains with loads of transactions, driving up transaction fees.

naasking · on Nov 29, 2024

> JPMC proposed using crypto, internally, to consistently manage cash flow.

Yikes, how hard is it to just capture an immutable event log. Way cheaper than running crypto, even if only internally.

imglorp · on Nov 29, 2024

Harder than you'd think, given a couple of requirements, but there are off the shelf products like AWS's QLDB (and self hosted alternatives). They: Merkle hash every entry with its predecessors; normalize entries so they can be consistently hashed and searched; store everything in an append-only log; then keep a searchable index on the log. So you can do bit-accurate audits going back to the first ledger entry if you want. No crypto, just common sense.

Oddly enough, I worked at a well known fintech where I advocated for this product. We were already all-in on AWS so another service was no biggie. The entrenched opinion was "just keep using Postgres" and that audits and immutability were not requirements. In fact, editing ledger entries (!?!?!?) to fix mistakes was desirable.

to11mtm · on Nov 29, 2024

> The entrenched opinion was "just keep using Postgres" and that audits and immutability were not requirements.

If you're just using PG as a convenient abstraction for a write-only event log, I'm not completely opposed; you'd want some strong controls in place around ensuring the tables involved are indeed 'insert only' and have strong auditing around both any changes to that state as well as any attempts to change other state.

> In fact, editing ledger entries (!?!?!?) to fix mistakes was desirable.

But it -must- be write-only. If you really did have a bug fuck-up somewhere, you need a compensating event in the log to handle the fuck-up, and it better have some sort of explanation to go with it.

If it's a serialization issue, team better be figuring out how they failed to follow whatever schema evolution pattern you've done and have full coverage on. But if that got to PROD without being caught on something like a write-only ledger, you probably have bigger issues with your testing process.

rdpintqogeogsaa · on Nov 29, 2024

Footnote to QLDB: AWS has deprecated QLDB[1]. They actually recommend using Postgres with pgAudit and a bunch of complexity around it[2]. I'm not sure how I feel about such a misunderstanding of one's own offerings of this level.

[1] https://docs.aws.amazon.com/qldb/latest/developerguide/what-...

[2] https://aws.amazon.com/blogs/database/replace-amazon-qldb-wi...

imglorp · on Nov 29, 2024

Yeah. I'm surprised it didn't get enough uptake to succeed, especially among the regulated/auditable crowds, considering all the purpose built tech put into it.

naasking · on Nov 29, 2024

I think you're forgetting how many businesses are powered by Excel spreadsheets. This solution seems too advanced and too auditable.

baq · on Nov 29, 2024

I'll just leave that here for no particular reason at all:

https://www.sec.gov/enforcement-litigation/whistleblower-pro...

voidfunc · on Nov 29, 2024

Better hurry, Elon is gonna dismantle the SEC in about 45 days

fragmede · on Nov 29, 2024

Importantly, the SEC is empowered to give 10-30% of the money siezed via whistleblowing too the whistle blower.

chipsa · on Nov 30, 2024

> Merkle hash every entry with its predecessors; normalize entries so they can be consistently hashed and searched; store everything in an append-only log;

Isn’t this how crypto coins work under the hood? There’s no actual encryption in crypto, just secure hashing.

limit499karma · on Nov 29, 2024

Theoretically they even have a better security environment (since it is internal and they control users, code base and network) so the consensus mechanism may not even require BFT.

hooverd · on Nov 29, 2024

It's all merkle trees under the hood. I feel like the crypto coin stuff has overshadowed the useful bits.

trog · on Nov 29, 2024

Is a Merkle tree needed or is good old basic double ledger accounting in a central database sufficient? If a key requirement is not a distributed ledger then it seems like a waste of time.

Onavo · on Nov 29, 2024

Merkle tree is to prevent tampering, not bad accounting practices

nly · on Nov 29, 2024

It only prevents tampering if the cost of generating hashes is extremely high.

Internally in your company you're not going to spend millions of $'s a year in GPU compute just to replace a database.

xorcist · on Nov 29, 2024

"Prevents tampering" lacks specificity. git is a blockchain that prevents tampering in some aspects, but you can still force push if you have that privilege. What is important is understand what the guarantees are.

limit499karma · on Nov 29, 2024

? If I use something like Blake3 (which is super fast and emits gobs of good bits) and encode a node with say 512 bits of the hash, you are claiming that somehow I am vulnerable to tampering because the hash function is fast? What is the probable number of attempts to forge a document D' that hashes to the very same hash? And if the document in structured per a standard format, you have even less degrees of freedom in forging a fake. So yes, a Merkel tree definitely can provide very strong guarantees against tampering.

oconnor663 · on Nov 29, 2024

Fwiw, increasing the BLAKE3 output size beyond 256 bits doesn't add security, because the internal "chaining values" are still 256 bits regardless of the final output length. But 256 bits of security should be enough for any practical purpose.

limit499karma · on Nov 29, 2024

Good to know. But does that also mean that e.g. splitting the full output to n 256 chunks would mean there is correlation between the chunks? (I always assumed one could grab any number of bits (from anywhere) in a cryptographic hash.)

oconnor663 · on Nov 30, 2024

You can take as many bytes from the output stream as you want, and they should all be indistinguishable from random to someone who can't guess the input. (Similar to how each of the bytes of a SHA-256 hash should appear independently random. I don't think that's a formal design goal in the SHA-2 spec, but in practice we'd be very surprised and worried if that property didn't hold.) But for example in the catastrophic case where someone found a collision in the default 256-bit BLAKE3 output, they would probably be able to construct colliding outputs of unlimited length with little additional effort.

benlivengood · on Nov 30, 2024

Certificate transparency logs achieve tamper-resistance without expensive hashes.

agentultra · on Nov 29, 2024

Write-Once, Read Many drives also prevent tampering. Not everything needs crypto.

chaboud · on Nov 29, 2024

In a distributed setting where a me may wish to join the party late and receive a non-forged copy, it’s important. The crypto is there to stand in for an authority.

trog · on Nov 30, 2024

> In a distributed setting where a me may wish to join the party late and receive a non-forged copy, it’s important. The crypto is there to stand in for an authority.

Yeh, but that's kinda my point: if your primary use case is not "needs to be distributed" then there's almost never a benefit, because there is always a trusted authority and the benefits of centralisation outweigh (massively, IMO) any benefit you get from a blockchain approach.

chaboud · on Dec 1, 2024

100% agreed there. A central authority can just sign stuff. Merkle trees can still be very valuable for integrity and synchronization management, but burning a bunch of energy to bogo-search nonces is silly if the writer (or federated writers) can be cryptographic authorities.

jchanimal · on Nov 29, 2024

We launched Fireproof earlier this month on HN. It’s a tamperproof Merkle CRDT in TypeScript, with an object storage backend for portability.

See our Show HN: https://news.ycombinator.com/item?id=42184362

We’ve seen interest from trading groups for edge collaboration, so multi-user apps can run on-site without cloud latency.

hluska · on Nov 29, 2024

What disrespectful marketing. We don’t care that you use Merkle trees because that’s irrelevant. I guess I can add Fireproof to my big list of sketchy products to avoid. It’s embarrassing.

jchanimal · on Nov 29, 2024

I figured the responses would be more interesting. Questions about CRDT guarantees etc.

Perhaps worth seeding the convo with a remark about finality.

hluska · on Nov 29, 2024

While your intentions may have been around discussion, I don’t want to be marketed to when I’m trying to understand something unrelated. I have a business degree so I intimately understand that HN is technically free and it’s nice to get free eyeballs, but we are people too. I’m so much more than a credit card number, yet you’ve reduced me to a user acquisition in the most insulting way possible.

Perhaps instead of your ideas, it’s worth seeding your own personal make up with a firm statement of ethics??

Are you the kind of person who will hijack conversations to promote your product? Or do you have integrity?

Just purely out of concern for your business, do you have a cofounder who could handle marketing for you? If so, consider letting her have complete control over that function. It’s genuinely sad to see a founder squander goodwill on shitty marketing.

jchanimal · on Nov 29, 2024

In founder mode, I pretty much only think about these data structures. So I am (admittedly) not that sensitive to how it comes across.

Spam would be raising the topic on unrelated posts. This is a context where I can find people who get it. The biggest single thing we need now is critical feedback on the tech from folks who understand the area. You’re right I probably should have raised the questions about mergability and finality without referencing other discussions.

Because I don’t want to spam, I didn’t link externally, just to conversation on HN. As a reader I often follow links like this because I’m here to learn about new projects and where the people who make them think they’ll be useful.

ps I emailed the address in your profile, I have a feeling you are right about something here and I want to explore.

wasabi991011 · on Nov 30, 2024

> Spam would be raising the topic on unrelated posts.

I think you need to reread the conversation, because you did post your marketing comment while ignoring the context, making your comment unrelated.

If you want it distilled down from my perspective, it went something like this:

> Trog: Doubts about the necessity of Merkle trees. Looking for a conversation about the pros and cons of Merkle trees and double ledger accounting.

> You: Look at our product. Incidentally it uses Merkle trees, but I am not going to mention anything about their use. No mention of pros and cons of Merkle trees. No mention of double ledger accounting.

nearting · on Nov 29, 2024

This doesn't address the question in any way except to note that you also use Merkle Trees. Do you reply to any comment mentioning TypeScript with a link to your Show HN post as well?

gr4vityWall · on Dec 1, 2024

Sorry, but your post came off as blatant advertising. There is no need to link to your company announcement just because it benefits you.

jchanimal · on Dec 2, 2024

Thanks y'all -- feedback taken. If I were saying it again I'd say something like:

Merkle proofs are rad b/c they build causal consistency into the protocol. But there are lots of ways to find agreement about the latest operation in distributed systems. I've built an engine using deterministic merge -- if anyone wants to help with lowest common ancestor algorithms it's all Apache/MIT.

While deterministic merge with an immutable storage medium is compelling, it doesn't solve the finality problem -- when is an offline peer too out-of-date to reconcile? This mirrors the transaction problem -- we all need to agree. This brings the question I'm curious about to the forefront: can a Merkle CRDT use a Calvin/Raft-like agreement protocol to provide strong finality guarantees and the ability to commit snapshots globally?

Apologies for the noise.

csomar · on Nov 29, 2024

Crypto/Blockchain makes it harder to have an incorrect state. If you fk up, you need to take down the whole operation and reverse everything back to the block in question. This ensures that everything was accounted for. On the other hand, if you fk in a traditional ledger system you might be tempted to keep things running and resolve "only" the affected accounts.

delfinom · on Nov 29, 2024

It's a question of business case. While ensuring you are always accounted correctly seems like a plus, if errors happen too often potentially due to volume, it makes more business sense sometimes to handle it while running rather than costing the business millions per minute having a pause.

necovek · on Nov 29, 2024

It's mostly a different approach to "editing" a transaction.

With a blockchain, you simply go back, "fork", apply a fixed transaction, and replay all the rest. The difference is that you've got a ledger that's clearly a fork because of cryptographic signing.

With a traditional ledger, you fix the wrong transaction in place. You could also cryptographically sign them, and you could make those signatures depend on previous state, where you basically get two "blockchains".

Distributed trust mechanisms, usually used with crypto and blockchain, only matter when you want to keep the entire ledger public and decentralized (as in, allow untrusted parties to modify it).

koolba · on Nov 29, 2024

> With a traditional ledger, you fix the wrong transaction in place.

No you don’t. You reverse out the old transaction by posting journal lines for the negation. And in the same transactions you include the proper booking of the balance movements.

You never edit old transactions. It’s always the addition of new transactions so you can go back and see what was corrected.

necovek · on Nov 30, 2024

Right, thanks for the correction: I wanted to highlight the need for "replaying" all the other transactions with a blockchain.

olddustytrail · on Nov 30, 2024

In git terms, it's like `revert` Vs `rebase`.

lelanthran · on Nov 30, 2024

> With a traditional ledger, you fix the wrong transaction in place.

That's not how accounting works. You post a debit/credit note.

solatic · on Nov 30, 2024

> With a blockchain, you simply go back, "fork", apply a fixed transaction, and replay all the rest.

You're handwaving away a LOT of complexity there. How are users supposed to trust that you only fixed the transaction at the point of fork, and didn't alter the other transactions in the replay?

necovek · on Nov 30, 2024

My comment was made in a particular context. If you can go back, it's likely a centralized blockchain, and users are pretty much dependent on trusting you to run it fairly anyway.

With a proper distributed blockchain, forks survive only when there is enough trust between participating parties. And you avoid "editing" past transactions, bit instead add "corrective" transactions on top.

im3w1l · on Nov 29, 2024

If its for internal why not just use a normal append only log. x amount transferred from account y to account z. A three column csv oughta do it.

sneak · on Nov 29, 2024

Any time your proposal entails a “why not just”, it is almost certainly underestimating the mental abilities of the people and teams who implemented it.

A good option is “what would happen if we” instead of anything involving the word “just”.

qazxcvbnmlp · on Nov 29, 2024

“Just” usually implies a lack of understanding of the problem space in question. If someone says “solution X” was considered because of these factors which lead to these tradeoffs however since then fundamental assumption Y has changed which allows this new solution then it’s very interesting.

jknoepfler · on Nov 29, 2024

Sure. When I ask "why don't we just" I'm suggesting that the engineering solutions on the table sound over-engineered to the task, and I'm asking why we aren't opting for a straightforward, obvious, simple solution. Sometimes the answer is legitimate complexity. Equally as often, especially with less experienced engineers, the answer is that they started running with a shiny and didn't back up and say "why don't we just..." themselves.

PittleyDunkin · on Nov 29, 2024

Counterfactuals strike me as even less useful than underestimating competency would be. Surely basic double-entry accounting (necessarily implying the use of ledgers) should be considered table stakes for fintech competency.

foobarbecue · on Nov 29, 2024

Lots of threads on this here, most recently https://news.ycombinator.com/item?id=42038139#42038572 . I think this example is perfect, with the "oughta do it"

tonyhart7 · on Nov 29, 2024

it literally ledger, its only show where money went but not showing "why" the money move

double entry with strong check that ensure its always balance fix this

foobarbecue · on Nov 29, 2024

https://news.ycombinator.com/item?id=42038139#42038572

DanielHB · on Nov 29, 2024

> I hypothesized, given how shitty the codebase was, that they must be tracking this stuff poorly.

That is like half of the plot of Office Space

mlloyd · on Dec 1, 2024

This sounds like a situation that I know about at the placed identified by name in your comment. It took months to track down the issue.