Looking through the code, it appears this uses your personal Apple Mail entitlements to pull the locations collected by devices on the FindMy network:
Everyone is hating on gRPC in this thread, but I thought I'd chime in as to where it shines. Because of the generated message definition stubs (which require additional tooling), clients almost never send malformed requests and the servers send a well-understood response.
This makes stable APIs so much easier to integrate with.
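For concreteness, here is a minimal client-side sketch in Python, assuming the Greeter/HelloRequest names from the standard gRPC hello-world example (nothing from this thread):

    import grpc
    import helloworld_pb2        # generated by protoc from helloworld.proto
    import helloworld_pb2_grpc   # generated service stubs

    with grpc.insecure_channel("localhost:50051") as channel:
        stub = helloworld_pb2_grpc.GreeterStub(channel)
        # The generated message class rejects unknown fields up front:
        # HelloRequest(nme="world") would raise ValueError before anything is
        # sent, which is where the "almost never malformed" claim comes from.
        reply = stub.SayHello(helloworld_pb2.HelloRequest(name="world"))
        print(reply.message)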
> Because of the generated message definition stubs (which require additional tooling), clients almost never send malformed requests and the servers send a well-understood response.
Sure. Until you need some fields to be optional.
> This makes stable APIs so much easier to integrate with.
Only on your first iteration. After a year or two of iterating, you're back to JSON, checking if fields exist, and re-validating your data. Also, there are a half dozen bugs that you can't reproduce and you don't know why they happen, so you just work around them with retries.
There’s also a gaping security hole in its design.
They don’t have sane support for protocol versioning or required fields, so every field of every type ends up being optional in practice.
So, if a message has N fields, there are 2^N combinations of fields that the generated stubs will accept and pass to you, and it’s up to business logic to decide which combinations are valid.
It’s actually worse than that, since the other side of the connection could be too new for you to understand. In that case, the bindings just silently accept messages with unknown fields, and it’s up to you to decide how to handle them.
All of this means that, in practice, the endpoints and clients will accumulate validation bugs over time. At that point maliciously crafted messages can bypass validation checks, and exploit unexpected behavior of code that assumes validated messages are well-formed.
I’ve never met a gRPC proponent who understands these issues, and all the gRPC applications I’ve worked with have had these problems.
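To make the combinatorial point concrete, here is a minimal sketch in plain Python (field names are invented, and this is not tied to any particular codegen) of the validation that ends up living in business logic:

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Payment:                 # hypothetical message: every field optional
        amount_cents: Optional[int] = None
        currency: Optional[str] = None
        card_token: Optional[str] = None
        bank_account: Optional[str] = None

    def validate(msg: Payment) -> None:
        # Only a few of the 2^4 = 16 presence combinations are legal, and
        # nothing in the generated stubs enforces that - every endpoint has
        # to re-implement (and keep in sync) checks like these.
        if msg.amount_cents is None or msg.currency is None:
            raise ValueError("amount and currency are required")
        if (msg.card_token is None) == (msg.bank_account is None):
            raise ValueError("exactly one payment method must be set")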
I have yet to see a good way to do backward compatibility in anything. The only thing I've found that really works is that sometimes you can add an argument with a default value. Removing an argument only works if everyone is using the same value for it anyway - otherwise they are expecting the behavior that other value causes, and so you can't remove it.
Thus all arguments should be required, in my opinion. If you make a change, add a whole new function with the new arguments. If allowed, the new function can have the same name (whether overloading should be done this way is somewhat controversial - I'm coming out in favor, but the arguments against do make good points which may be compelling to you). That way the complexity is managed, since only a limited subset of the combinatorial explosion is possible.
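A sketch of the two options described above (names invented; plain Python, which has no overloading, so the "new function" gets a new name here):

    # Option 1: add an argument with a default that preserves the old behavior.
    # Existing callers are untouched; new callers opt in explicitly.
    def export_report(path: str, compress: bool = False) -> None:
        ...

    # Option 2: leave the old signature alone and add a whole new function
    # with the new required arguments (an overload in languages that allow it).
    def export_report_with_format(path: str, fmt: str, compress: bool) -> None:
        ...

    # Removing or repurposing an existing argument is the case that breaks:
    # callers passing the old value are relying on the behavior it selects.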
> every field of every type ends up being optional in practice.
This also means that you can't write a client without loads of branches, harming performance.
I find it odd that gRPC has a reputation for high performance. It's at best good performance, given a bunch of assumptions about how schemas will be maintained and evolved.
An easy way to make all LLMs somewhat good at chess is to make a Chess Eval that you publish and get traction with. Suddenly you will find that all newer frontier models are half decent at chess.
I thought the same based on the title, but the article feels different. The theme system is about making a commitment while making failure harder, to prevent demoralization and promote adaptability. Quests, as presented here, still have a measurable goal; they are specific. That should never be the case with a yearly theme.
My guess is that this is a post-training (RLHF) artifact on world-model prompts. There were likely many “logical inconsistency” prompts which humans coerced to the above response.
No. In the common use of the word fine-tuning, one is in the supervised learning scenario. One has an input prompt, and an output sentence. One teaches the model to say that output in response to that prompt. In the reinforcement learning scenario, one has a prompt, and a way of rewarding the model for different outputs. One can have, for instance, a reward model, that assigns a reward for a given model output. One could also have a pairwise reward model, where the learner is sampled with that prompt twice (with different RNGs), and the reward model gives a reward based on the better of the two samples. You could also have humans give these pointwise or pairwise rewards.
In essence, one is not telling the model "This. This is what you should output next time." but rather "I liked this reply. Have a cookie." The behaviors that you can learn in RL are more subtle, but you get a lot less information per step. That's because, in a causal language modeling objective, when I tell you "For the prompt X, you should output exactly Y[0..m)", you get a gradient for P(Y[0] | X), another one for P(Y[1] | X Y[0..1)), another for P(Y[2] | X Y[0..2)), another for P(Y[3] | X Y[0..3)), and so on. It's a lot more step-by-step guidance than the sentence-wise reward you get in the RL framework. In RL, I'd give you a cookie for P(Y | X). What part of Y made me give you that cookie? Was there even such a part? Was it perhaps some internal representation that made everything in Y better? That's for the model to learn.
One wrinkle is that it is now common to fine-tune on previously derived RL datasets, with the tested inputs and preferred sample outputs as the training data.
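To make the contrast concrete, here is a toy PyTorch-style sketch (shapes and names are my own, not from any particular codebase) of the two kinds of feedback: per-token supervision versus a single scalar reward for the whole sampled reply:

    import torch
    import torch.nn.functional as F

    # Supervised fine-tuning: every target token Y[t] gets its own gradient
    # through P(Y[t] | X, Y[<t]) via teacher forcing.
    def sft_loss(logits: torch.Tensor, target_ids: torch.Tensor) -> torch.Tensor:
        # logits: [seq_len, vocab_size], target_ids: [seq_len]
        return F.cross_entropy(logits, target_ids)

    # REINFORCE-style RL: one scalar reward ("have a cookie") for the whole
    # sampled reply, applied through log P(Y | X); nothing in the signal says
    # which tokens earned it.
    def rl_loss(logits: torch.Tensor, sampled_ids: torch.Tensor,
                reward: float) -> torch.Tensor:
        log_probs = F.log_softmax(logits, dim=-1)
        seq_log_prob = log_probs.gather(1, sampled_ids.unsqueeze(1)).sum()
        return -(reward * seq_log_prob)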
This seems great; I would use something like this day to day if it were a vscode plugin. Often, two side-by-side panes are not enough to get the intertwined context of a large program that spans code / subprocess / API boundaries. Currently, I solve this problem by keeping a set of tab groups for different "workflows" that I need to edit frequently.
https://github.com/seemoo-lab/openhaystack/blob/8d214aa5eb68...
I wonder if this would also be possible by making an Apple developer account.