
Is this a job for regulators or just criminal prosecution? Sounds like step (b) is either fraud or not, depending on how the trumpeting gets done.

Criminal prosecution only works on poor people.

Prosecuting crimes where the individual gets elected president, or just gets rich, doesn't accomplish anything.


This quote is important, but in isolation it's not clear that they are claiming to have beaten this problem: they are saying the new model, voyage-multimodal-3, instead identifies linked concepts across modalities. That would indeed be pretty cool -- if there is a latent space that can cluster the same idea, whether represented visually or in text.

> ... the vectors truly capture the semantic content contained in the screenshots. This robustness is due to the model’s unique approach of processing all input modalities through the same backbone.

With that said, I think this benchmark is a pretty narrow way of thinking about multi-modal embedding. Having text embed close to images of related text is cool and convenient, but doesn't necessarily extend to other notions of related visual expression (e.g. "rabbit" vs a photo of a rabbit). And on the narrow goal of indexing document images, I suspect there are other techniques that could also work quite well.

This seems like a great opportunity for a new benchmark dataset with multi-modal concept representations beyond media-of-text.
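If the claim holds, you can sanity-check it directly: embed a caption and an image of the same concept and verify they land near each other in the shared space. A minimal sketch of that check, with random stand-in vectors where a real model's output would go (the `embed(...)` comments are hypothetical, not the actual API):

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
# Stand-ins for shared-backbone output; a real test would call the model.
text_vec  = rng.normal(size=1024)                        # embed("a rabbit")
image_vec = text_vec + rng.normal(scale=0.3, size=1024)  # embed(rabbit_photo)
unrelated = rng.normal(size=1024)                        # embed("tax filing")

# If the latent space clusters one idea across modalities, the
# cross-modal pair should score far above the unrelated pair.
print(cosine(text_vec, image_vec))  # high (~0.96 for these stand-ins)
print(cosine(text_vec, unrelated))  # near zero
```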


Not baloney. The culture around data in 2005-2010 -- at least / especially in academia -- was night and day compared to where it is today. It's not that people didn't understand that more data enabled richer and more accurate models; it's that they accepted data constraints as part of the problem setup.

Most methods research went into ways of building beliefs about a domain into models as biases, so that they could be more accurate in practice with less data. (This describes a lot of PGM work). This was partly because there was still a tug of war between CS and traditional statistics communities on ML, and the latter were trained to be obsessive about model specification.

One result was that the models that were practical for production inference were often trained to the point of diminishing returns on their specific tasks. Engineers deploying ML weren't wishing for more training instances, but better data at inference time. Models that could perform more general tasks -- like differentiating 90k object classes rather than just a few -- were barely even on most people's radar.

Perhaps folks at Google or FB at the time have a different perspective. One of the reasons I went ABD in my program was that it felt like industry had access to richer data streams than academia. Fei-Fei Li's insistence on building an academic computer science career around giant data sets really was ingenious, and even subversive.


The culture was, and is, skeptical in biased ways. Between '04 and '08 I worked with a group that had trained neural nets for 3D reconstruction of human heads. They were using it for prenatal diagnostics and as a facial-recognition pre-processor, and I was using it for creating digital doubles in VFX filmmaking. By '08 I'd developed a system suitable for use in mobile advertising, creating ads with people in them, and 3D games with your likeness as the player.

VCs thought we were frauds, and their tech advisors told them our tech was an old, discredited technique that could not do what we claimed. We spoke to every VC, some of whom literally kicked us out. Finally, after years of "no", that same AlexNet success began to change minds -- but now they wanted the tech to create porn. By that point I was making children's educational media, and there was no way I was gonna do porn. Plus, the president of my co was a woman, famous for creating children's media. Yeah, the culture was different then, not too long ago.


Who's offering VC money for neural network porn technology? As far as I can tell, there is huge organic demand for this but prospective users are mostly cheapskates and the area is rife with reputational problems, app store barriers, payment processor barriers, and regulatory barriers. In practice I have only ever seen investors scared off by hints that a technology/platform would be well matched to adult entertainment.


Wow, so early for generative -- although I assume you were generating parameters that got mapped to mesh positions, rather than generating pixels?

I definitely remember that bias about neural nets, to the point of my first grad ML class having us recreate proofs that you should never need more than two hidden layers (one can pick up the thread at [1]). Of all the ideas clunking around in the AI toolbox at the time, I don't really have background on why people felt the need to kill NN with fire.

[1] https://en.wikipedia.org/wiki/Universal_approximation_theore...
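For reference, the one-hidden-layer form of the theorem at [1] says, roughly: for any continuous target on a compact set, a finite sum of shifted activations gets within any epsilon:

```latex
% f continuous on a compact K \subset \mathbb{R}^d, \sigma non-polynomial;
% then for any \varepsilon > 0 there exist N, c_i, b_i \in \mathbb{R},
% w_i \in \mathbb{R}^d such that
\sup_{x \in K} \left| f(x) - \sum_{i=1}^{N} c_i \, \sigma\!\left(w_i^{\top} x + b_i\right) \right| < \varepsilon
```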


It was annotated face images and 3D scans of heads, trained to map one to the other. Past a threshold in training-data size, a single photo was enough to generate the 3D mesh positions with good-to-great results, and then to map the photo onto the mesh surface. Do that with multiple frames, and one is firmly in the Uncanny Valley.
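A modern toy version of that mapping (my sketch, not the original system; the image size, mesh resolution, and architecture are all invented) would be a straight regression from photo to vertex positions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

N_VERTS = 5000  # hypothetical mesh resolution

# Toy stand-in for the photo -> 3D head mapping described above: a net
# trained on pairs of (annotated face image, scanned head mesh).
model = nn.Sequential(
    nn.Flatten(),                 # 64x64 grayscale crop -> 4096 features
    nn.Linear(64 * 64, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, N_VERTS * 3),  # x, y, z per vertex
)

imgs    = torch.rand(8, 1, 64, 64)    # batch of face crops
targets = torch.rand(8, N_VERTS * 3) # corresponding scan vertices

opt = torch.optim.Adam(model.parameters(), lr=1e-3)
opt.zero_grad()
loss = F.mse_loss(model(imgs), targets)
loss.backward()
opt.step()
# Mapping the photo onto the mesh surface (texturing) would be a second step.
```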


> they accepted data constraints as a part of the problem setup.

I've never heard it put so succinctly! Thank you.


Very well said!


It seems shocking to me that the server enumerates and transmits all legal next moves. I get that there could be chess variants with server-side information, but the article also says it might be good for constrained clients. Is it really cheaper to read moves off a serialized interface than to compute them client-side?


pretty sure computing moves is in NP so probably yep


Nope: a finite number of pieces, and a finite number of viable moves to check on each. Not sure what you're thinking of, but the entire concept of a complexity class only applies if there is some axis of scaling (an n×n chess board?).
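On the client-side cost: move generation is a solved, cheap problem. A quick sketch with the python-chess library (my choice of library, assuming the client can ship one) enumerates every legal move in a few lines:

```python
import chess  # pip install chess

board = chess.Board()                  # standard starting position
legal = list(board.legal_moves)
print(len(legal))                      # 20 legal first moves for White
print([board.san(m) for m in legal][:5])

board.push_san("e4")                   # play a move, then re-enumerate
print(len(list(board.legal_moves)))    # 20 replies for Black
```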


I think you might be misunderstanding:

Yes, any given instance of chess is finite, but the problem of computing moves is inherently in NP.

The key is that just because a problem is in NP doesn't mean it's difficult to solve instances with small parameters.

See the famous coloring problem, SAT, or any other comparable NP problem...


When we talk about what class a problem belongs in, we have to define the problem with respect to some scaling axis. For example, coloring with K=3 colors is NP-complete with respect to N = # nodes in the graph, but not with fixed N and scaling K. But I think it would actually be an interesting and non-trivial exercise to define a variant of chess with a scaling axis such that computing a list of valid moves for one player is NP-complete. Just scaling board size won't do it: an n×n board has O(n²) pieces with at most O(n) sliding moves each, so enumerating and checking them stays polynomial. Any suggestions?


Sorry I think I was talking about a different thing. With depth as scaling axis it should be NP-complete, but not with depth=1 which is what was being talked about. My bad.


What a great example of Dunning-Kruger as applied to elites. I remember the spike of interest in DK bias during the pandemic [1], largely as a way of explaining how uneducated folks could be so confidently incorrect about vaccination strategies. In reality it can strike in any social stratum -- like a bunch of professional traders wielding billions of dollars, smugly misunderstanding Kelly.

[1] https://trends.google.com/trends/explore?date=today%205-y&ge...
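For reference, the Kelly math being fumbled is a one-line formula. A sketch for a simple binary bet (a 60/40 coin at even money, for instance, where the answer is to stake 20% of bankroll):

```python
def kelly_fraction(p: float, b: float = 1.0) -> float:
    """Optimal fraction of bankroll to stake on a binary bet.

    p: probability of winning; b: net odds (profit per unit staked).
    f* = (b*p - (1 - p)) / b, clamped at 0 when the edge is negative.
    """
    return max(0.0, (b * p - (1 - p)) / b)

print(kelly_fraction(0.6))   # 0.2 -> bet 20% on a 60/40 even-money coin
print(kelly_fraction(0.45))  # 0.0 -> negative edge, don't bet at all
```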


Well written essay! I fell into a rabbit hole around his quote "context is that which is scarce".

After squinting at the linked Tyler Cowen essay [1], I think it's a convoluted way of saying "context is valuable, and a lot of the time when things suck it's because there's not enough of it". I was hoping he was going to give an operational definition of context. Does anyone have a more developed take?

[1] https://marginalrevolution.com/marginalrevolution/2022/02/co...


So I haven't read Marginal Revolution or followed Tyler Cowen consistently for probably 15 years, so take my word for what it's worth, but as far as I can tell, he's just referring here to prerequisite knowledge: other information you have to have ingested and understood before you have any real hope of understanding something else. "That which is scarce" is not itself a definition; lots of things are scarce and most of them are not context. He seems to be using that phrase to indicate that context is scarce among a lot of readers and consumers, and this harms their ability to understand anything.

Given the milieu we find ourselves in on the web, this is probably hard to avoid. We're deluged with information nonstop, typically in fairly shallow small bites, often from sources with very limited and biased points of view. Doing a true deep dive to understand background and context before forming any conclusion of your own is difficult, time-consuming, and contrary to human instinct, because we want to participate; if we don't have an opinion, we feel like we can't, and at bare minimum, admitting as much surely won't put you at the top of an upvote-based sorting scheme.

Take this very thread. I'd heard of Palantir in the sense of hearing the name, knowing where that name comes from, and knowing it is associated with Peter Thiel. That's about all I knew of it before right now. After reading this blog and all the comments, what do I know now? A little bit more. My prior on them being part of an explicit and intentional conspiracy to abet genocide and prepare the population for an eventual authoritarian takeover, in which regular people are getting jailed left and right for buying Plan B and whatnot, is low, so I tend to dismiss speculation like that. They seem to make a product for synthesizing data from sources that don't have compatible schemas and seemingly no APIs for common-format export. That was largely just manual work at first, maybe still is, but they've tried to make a product of it. Some commenters are saying it is snake oil. Some are saying it's amazing and useful. My takeaway is that they are trying to solve a very hard problem, and sometimes what they do works and sometimes it doesn't. They seemingly take on customers that are political hot potatoes and unpopular with the stereotypical demographic of a Silicon Valley workforce -- customers more typical of those you'd see taken on by a Raytheon or a Lockheed.

I guess I'm supposed to have an opinion beyond that. I don't know. My brother-in-law works for Anduril and has spent most of his time the past three years deployed to theaters of combat, teaching soldiers to use drones. My wife works for Raytheon on a spy-satellite orchestration system that is literally named Cyberdyne and would almost certainly be considered dystopian by any average person on Hacker News who heard about it and didn't have the context of working on it for two decades. I don't believe they're evil. I was an Army officer commanding tank units in Iraq and Afghanistan, and I don't believe I'm evil.

I'm not sure how people think we're supposed to approach subjects like this. We're going to have international conflicts and laws. They're both a part of civilization. Given that, it seems somewhat inevitable and reasonable that countries will also have military and law enforcement agencies. Balancing action with inaction, false positives with false negatives, is impossible to get right all of the time, but what is the takeaway? Should all humans everywhere refuse to work for any military or law enforcement agency? Should all businesses refuse to sell to them? Wouldn't that mean we effectively have no defense and no laws? Where is the line between acknowledging that sometimes even your own country is guilty of atrocities and overreach and simply throwing up your hands and saying we should build no weapons and have no sort of intelligence gathering activities of any kind?

I don't buy that the US or Israel is uniquely evil here, and seemingly neither is Palantir simply for doing business with ICE and the IDF. I'm obviously motivated to believe that, but again, surely there is some spectrum, isn't there? If we look at, say, the 20th-century histories of France and Germany, there are no saints. France was an imperial power that did a lot of bad shit in Africa. They gassed protesters and have had some obvious law-enforcement brutalities. But they didn't commit a holocaust and try to conquer all of Europe. I guess that's a low bar to clear, but still: should no one ever sell anything to the French military? The German military? Doesn't that again mean they wouldn't have militaries? If neither the European powers nor the US had militaries, then seemingly all of Europe would currently be Soviet republics. That is surely not better than where we actually find ourselves, even if where we find ourselves isn't the best we can do.


The underlying problem is how this essay uses the word "own": as dictation. In almost any context, ownership should primarily be about responsibility for the outcome, with the minimum authority necessary to coordinate work toward that outcome. People who are good at owning things push most decisions to people closer to each sub-problem, while giving an overall framework that keeps everyone rowing in the same direction. For Product, that manifests as a roadmap with high-level prioritization, but plenty of room for engineers to tweak scope or ordering to take on targets of opportunity.


Not a lawyer, but while the streaming seems to be fraudulent, it's less obvious from the evidence presented that the generation itself was. Does anyone believe this sets a precedent for generated content specifically?

Edit: oops, I see the charges include specific quotes about intent to evade detection.


Just did some preliminary testing on decrypting ROT ciphertext that would have been viable for a human on paper. The output was pretty disappointing: lots of "workish" steps creating letter counts, identifying common words, etc., but many steps were incorrect or not followed up on. In the end, it claimed to check its work and delivered an incorrect solution that did not satisfy the previous steps.

I'm not one to judge AI on pratfalls, and ciphers are a somewhat adversarial task. However, there was no aspect of the reasoning that seemed more advanced or consistent than previous chain-of-thought demos I've seen. So the main proof point we have is the paper, and I'm not sure how I'd go from there to trusting this on the kind of task it's intended for. Do others have patterns by which they get utility from chain-of-thought engines?

Separately, chain-of-thought outputs really make me long for tool use, because the LLM is often forced to simulate algorithmic outputs. It feels like a commercial chain-of-thought product like this should have a standard library of functions it can call for 100% reliability on things like letter counts.
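As a concrete example of what such a library call could look like: a brute-force Caesar/ROT solver is trivial to make fully reliable. A minimal sketch (the unigram-frequency scoring is my assumption; any English-likeness measure would do):

```python
from string import ascii_lowercase as AL

# Approximate English letter frequencies (percent), for scoring candidates.
FREQ = dict(zip(AL, [8.2, 1.5, 2.8, 4.3, 12.7, 2.2, 2.0, 6.1, 7.0, 0.15,
                     0.77, 4.0, 2.4, 6.7, 7.5, 1.9, 0.095, 6.0, 6.3, 9.1,
                     2.8, 0.98, 2.4, 0.15, 2.0, 0.074]))

def rot(text: str, k: int) -> str:
    """Shift alphabetic characters by k, preserving case and punctuation."""
    out = []
    for ch in text:
        if ch.isalpha():
            base = ord('a') if ch.islower() else ord('A')
            out.append(chr((ord(ch) - base + k) % 26 + base))
        else:
            out.append(ch)
    return ''.join(out)

def crack_caesar(ciphertext: str) -> tuple[int, str]:
    """Try all 26 shifts; keep the one that looks most like English."""
    score = lambda t: sum(FREQ.get(c, 0.0) for c in t.lower())
    return max(((k, rot(ciphertext, k)) for k in range(26)),
               key=lambda kv: score(kv[1]))

print(crack_caesar("mzilj ivdrzej ufidrek ze kyv sfup"))
# -> should print (9, 'virus remains dormant in the body')
```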


Hmm, are you sure it was using the o1 model and not gpt4o? I've been using the o1 model and it does consistently well at solving rotation ciphers.


Does it do better than Claude? Claude (3.5 Sonnet) handled ROTs perfectly and was even able to respond in ROT.


Just tried, no joy from Claude either:

Can you decrypt the following? I don't know the cypher, but the plaintext is Spanish.

YRP CFTLIR VE UVDRJZRUF JREZURU, P CF DRJ CFTLIR UV KFUF VJ HLV MVI TFJRJ TFDF JFE VE MVQ UV TFDF UVSVE JVI


I just tried it with the o1 model and it said it couldn't decipher it. It told me what to try, but said it doesn't have the time to do so. Kind of an unusual response.


The chain of thought does seem to take quite a long time, so maybe there's a new mechanism for reducing server load: estimating the reasoning effort a problem needs and weighing that against the current pressure on the servers.


I got this response from o1-mini with the exact same prompt:

Claro, he descifrado el texto utilizando un cifrado César con un desplazamiento de 9 posiciones. Aquí está el texto original y su correspondiente traducción: [Translation: "Sure, I've decrypted the text using a Caesar cipher with a shift of 9 positions. Here is the original text and its corresponding translation:"]

*Texto Cifrado:* ``` YRP CFTLIR VE UVDRJZRUF JREZURU, P CF DRJ CFTLIR UV KFUF VJ HLV MVI TFJRJ TFDF JFE VE MVQ UV TFDF UVSVE JVI ```

*Texto Descifrado:* ``` HAY LOCURA EN DEMASADO SANIDAD, Y LO MÁS LOCURA DE TODO ES QUE VER COSAS COMO SON EN VEZ DE COMO DEBEN SER ```

*Traducción al Inglés:* ``` THERE IS MADNESS IN OVERLY HEALTH, AND THE MOST MADNESS OF ALL IS TO SEE THINGS AS THEY ARE INSTEAD OF AS THEY SHOULD BE ```

Este descifrado asume que se utilizó un cifrado César con un desplazamiento de +9. Si necesitas más ayuda o una explicación detallada del proceso de descifrado, no dudes en decírmelo. [Translation: "This decryption assumes a Caesar cipher with a shift of +9 was used. If you need more help or a detailed explanation of the decryption process, don't hesitate to tell me."]

Interestingly, it made a spelling mistake ("DEMASADO" for "DEMASIADO"), but other than that it did manage to solve it.


Interesting: it was able to guess that it's ROT17 (a shift of 17 to encrypt is a shift of 9 to decrypt), but it translated it wrong, although "HAY" and some other words were correct.

I've tried only in English so far though.

It told me it's 17, and "HAY GENTE MU DIFERENTE LECTURA, A LO MUY GENTE DE TODO ES QUE VER COSAS COMO SON EN VEZ DE COMO DEBEN SER"

although it really should be "HAY LOCURA EN DEMASIADO SANIDAD, Y LO MAS LOCURA DE TODO ES QUE VER COSAS COMO SON EN VEZ DE COMO DEBEN SER"


Claude made similar mistakes, generating decryptions that were similar to the plaintext but with stuff mixed in. I suspect my version of the quote (Miguel de Cervantes) is an apocryphal translation, and both models have some attractor pulling it towards the real one. With that said, I did not see o1-preview get as close as you did.


For testing, I think it's better to use uncommon sentences, and to start with English first; if it can solve that, then try other languages.


HAY LOCURA EN DEMASIADO SANIDAD, Y LO MAS LOCURA DE TODO ES QUE VER COSAS COMO SON EN VEZ DE COMO DEBEN SER

Is that correct? I don't know anything but basic Spanish. All I did was:

"The plaintext is in Spanish but I don't know anything else, solve this and explain your reasoning as you go step by step."


That's correct. I finally got access to o1-preview myself now, but interestingly I'm getting inconsistent results with this so far; need to keep trying.


o1-preview gave me this...

Final Decrypted Message:

"Por ejemplo te agradeceré, y te doy ejemplo de que lo que lees es mi ejemplo"

English Translation:

"For example, I will thank you, and I give you an example of what you read is my example."

... initially it gave up and asked if I knew what type of cipher had been used. I said I thought it was a simple substitution.


Why did it add the accents onto the e (é)? Surely that wasn't part of it, and it actually "thought a bit too far"?


On my machine it just works with 4o:

https://chatgpt.com/share/66e34020-33dc-800d-8ab8-8596895844...

No drama. I'm not sure the bot's answer is correct, but it looks correct.


o1-preview. Were you using common plaintexts by chance (e.g. proverbs), or ROT13 specifically? Mine uses all the right steps but just can't string them together.


It just solved this one for me "Can you solve this cipher? droboscxylecsxoccvsuocryglecsxocc"


Could you share your prompt?


Shared one above.

Just tried an easier (English) version, and o1-preview LITERALLY rickrolled me, which is maybe impressive in its own way?

UPDATE: Claude crushed it, however.

o1-preview Transcript follows.

**

Can you decrypt the following? I don't know the cypher.

mzilj ivdrzej ufidrek ze kyv sfup reu tre ivrtkzmrkv crkvi ze czwv

Certainly! The encrypted text is:

``` mzilj ivdrzej ufidrek ze kyv sfup reu tre ivrtkzmrkv crkvi ze czwv ```

When we decrypt it, we get:

``` never gonna give you up let you down and desert you ```

It appears to be a line from Rick Astley's song "Never Gonna Give You Up."


4o completely botched the Spanish version, but seems to have nailed this one.

> Without knowing the cipher used to encrypt this message, it’s hard to decrypt it directly. However, the structure and letter patterns suggest it might be a simple substitution cipher, possibly Caesar or some variation of it.

> Let me attempt a few common methods to see if I can identify the cipher.

> The most frequent letters in the encrypted message are 'r', 'z', 'v', and 'e'. In English, the most common letters are typically 'E', 'T', 'A', and 'O'. This suggests the cipher could be a simple substitution or a Caesar cipher.

> I'll now attempt to apply a Caesar cipher decryption by shifting the alphabet through all possible rotations to see if we get readable text.

> One of the decrypted messages, with a shift of 17, reads:

> *"virus remains dormant in the body and can reactivate later in life"*

> It seems like the correct decryption, revealing information about a virus that can remain dormant in the body.


That's weird. Claude was able to translate it correctly to "virus remains dormant in the body and can reactivate later in life", but it thought it was shifted 10 times, when it's really ROT17.


If o1 did rickroll you deliberately, then it would indeed be more impressive than solving ciphertexts, and I'd start preparing to bow down to our AGI overlords :)


Definitely. A teammate pointed out Reddit posts used in training as a probable cause :)


It's RL, so it's going to be great on tasks they created for training, but not so much on others.

Impressive, but the problem with RL is that it requires knowledge of the future.


Out of curiosity, can you try the same thing with Claude? When I tried Claude with any sort of ROT, it had amazing performance compared to GPT.


I'm confused about some of the implications regarding crowd infections.

Certainly for some of these -- say, chicken pox -- we know that early and controlled exposure is preferable to a first case later in life, and that the cycle of herd immunity in survivors leads to endemicity and less deadly strains.

It seems like both things can be true: it's best not to get infections at all, but if you live in a world where you will get infected, doing so early and outside a correlated epidemic can be advantageous. And, while our immune system evolved for a different social density, it still plays a critical role in mitigating the long-term collective impact of crowd infections.


We used to think that "early and controlled exposure is preferable to a first case later in life" for chicken pox, but AFAICT that's no longer widely accepted in most western countries. Chicken pox in children can lead to serious complications in some cases, including bacterial skin infections, pneumonia, and encephalitis. The virus remains dormant in the body and can reactivate later in life as shingles. Now that there's an effective 2-shot vaccine, that's the recommended approach.


Chicken pox can also return as shingles in adulthood if you got the actual virus instead of the vaccine.


Even if you got the vaccine, you can still get shingles.


Though it's less common.

