More

drdaeman · 2025-03-26T20:27:18 1743020838

Note that despite any requests the genetic data and some personal information (DOB and sex) probably won't be deleted, at least because of CLIA requirements: https://news.ycombinator.com/item?id=41781879 (more details in https://bourniquelaw.com/2024/10/09/data-23-and-me/, linked from the thread there)

nelox · 2025-03-26T21:48:39 1743025719

23andMe does not operate as a laboratory itself but contracts with U.S.-based labs that are certified under CLIA and accredited by the College of American Pathologists (CAP). According to their website, all saliva samples are processed in CLIA-certified and CAP-accredited labs, ensuring compliance with federal standards for accuracy and reliability. This certification is crucial, as it aligns with FDA requirements for certain health-related genetic tests. This distinction is significant, as CLIA primarily regulates labs, not the companies that contract them, potentially affecting the applicability of retention requirements to 23andMe’s broader operations.

CLIA’s record retention requirements, as per Section 493.1105, states labs must retain test requisitions, authorizations, and reports for at least 2 years, with longer periods for specific tests like pathology (10 years for slides).

CLIA Laboratory Record Retention Requirements:

- Test requisitions and authorizations: 2 years minimum. - Test reports: 2 years minimum, 10 years for pathology reports. - Cytology slide preparations: 5 years. - Histopathology slides: 10 years. - Pathology specimen blocks: 2 years. - Tissue: Until diagnosis is made.

Notably, these requirements focus on test-related records, such as requisitions (which may include patient details like date of birth and sex) and reports (which for genetic tests would include interpreted results). However, there is no explicit mention of retaining raw genetic data, such as the full genotype data, in the CLIA regulations. This raises questions about whether 23andMe’s assertion to retain raw genetic information is strictly required by CLIA or if it extends beyond the regulation for other reasons, such as research or quality control.

CharlesW · 2025-03-26T20:30:58 1743021058

Here's a great post by a lawyer, linked to further down in that thread: https://bourniquelaw.com/2024/10/09/data-23-and-me/ It suggests a way to challenge them on their assertions that they must keep your data and samples.

bpodgursky · 2025-03-26T20:46:43 1743022003

I'm sorry but this lawyer has absolutely no idea what he is talking about with regards to CLIA compliance. And he even admits as much, but keeps talking anyway.

CharlesW · 2025-03-26T20:52:51 1743022371

CLIA is one of the excuses 23andMe uses to explain why they retain your genetic information, date of birth, and sex. The author cites the code sections he believes 23andMe are referencing to make this claim, then explains why he believes it doesn't apply. As a CLIA expert, do you mind explaining what he's getting wrong for our benefit?

drdaeman · 2025-03-19T19:15:29 1742411729

I'm not sure working in parallel is always a good decision.

An anecdote: just the other day I've had to implement batching instead of concurrent parallel processing because PostgreSQL really hated me having thousands of concurrent transactions on the same table.

My particular workflow was essentially this - I get a batch (full state dump) with some products, and I need to update my `products` table to keep track of them (soft-deleting what had disappeared, inserting new, updating existing):

    BEGIN;
    -- Quickly load the batch into a temporary table
    CREATE TEMPORARY TABLE products_tmp (LIKE products INCLUDING ALL) ON COMMIT DROP;
    COPY products_tmp FROM STDIN;
    -- Soft-delete products missing from the current batch
    UPDATE products SET is_active = FALSE WHERE is_active AND store_id = ANY($1) AND id NOT IN (SELECT id FROM products_tmp WHERE store_id = ANY($1));
    -- Upsert products from the current batch (add new, update existing)
    INSERT INTO products (...) SELECT ... FROM products_tmp ON CONFLICT (id) DO UPDATE SET ...;
    COMMIT;

With just a few thousands of concurrent writers things started to look quite ugly, with constant serialization failures (I started at SERIALIZABLE, then downgraded to REPEATABLE READ, was reluctant to use READ COMMITTED) and deadlocks preventing me from performing some DDL (schema migrations) on the products table.

So I've started to batch those batches elsewhere and dump them at periodic intervals - and things started to look a lot better. Maybe that was a naive/bruteish approach and I should've done some parameter tweaking and/or fancy table partitioning or something else (idk) for congestion control instead, but at least it worked.

drdaeman · 2025-03-19T18:45:09 1742409909

In your code, in the `ensure` part you want `SET synchronous_commit TO DEFAULT` instead of the explicit `on` (https://www.postgresql.org/docs/current/sql-set.html)

Then, I suspect you may probably rather want to use `SET LOCAL` rather than just `SET`, so it applies to a single transaction only. So maybe something more like

    def asynchronous_transaction(&block)
      ActiveRecord::Base.connection.transaction do
        ActiveRecord::Base.connection.execute("SET LOCAL synchronous_commit = off")
        yield
      end
    end

Please note that I don't really know Rails (in particular, I've no idea whenever `Base.connection` is guaranteed to be exactly the same throughout this function's lifecycle or if e.g. there's a connection pool underneath), so I could be introducing some subtle bugs here.

drdaeman · 2025-03-16T19:27:18 1742153238

I'm curious, why have you concluded that it's naive to build a formal model out of a legal document?

I'm not sure about making laws better, but surely a machine can replace lawyers for trivial and well-defined things (where you don't really need a lawyer - and there are a lot of such cases), saving time for a layman person finding appropriate clauses and navigating them while investigating some well-defined (minimally ambiguous and least likely to raise any exceptions) happy paths. Some companies make a lot of money from formalizing tax codes, after all.

kelseyfrog · 2025-03-16T20:03:32 1742155412

To be specific I believe it's naive to think that formally specified laws will be an improvement on the legal system.

I believe laws ultimately are subject to Godel's theorem. They can either be complete or consistent, but not both (and likely neither). We should recognize the difference. A complete legal system can render a verdict in every case. A consistent legal system has no contradictions. We prioritize the former.

What would it look like for a case to be presented to a judge and for them to respond, "We simply cannot determine if this is legal or not." What would you even do in such a scenario? Rather, we have judges render decisions and deal with the consequences that they are inconsistent and we have further cases to untangle those inconsistencies, in an ever growing patchwork of legal epicycles.

drdaeman · 2025-03-16T21:03:02 1742158982

Thank you!

I'm not sure I agree with the idea of us prioritizing completeness. I don't think the laws can or need to cover any possible life situation. I mean, there aren't any laws about whenever someone can or cannot wave a toy chainsaw or whenever guests of the Oval Office must wear a costume. We just look at this, make our opinions, but when we need to make a legal one we check out that laws say nothing on the matter, then dismiss it as "nothing against the law" (of course, we can always make or amend the new laws if we think we need some).

I'm not a lawyer, but from my crude understanding I believe it's rather the norm for the laws to be incomplete and have intentional ambiguity so the lawmakers aren't spending their time down the rabbit hole of determining the theory of truth (which they cannot do, if Godel's theorem is applicable - I'm not entirely sure because laws are not really self-defining and we don't have or need it to be a closed system with its own set of axioms and stuff - that'd probably mess up the reasons we have laws), but rather write something that works for the majority of their intent, and leave the edge cases to whoever hits them and goes to the court or legislators. I'd say the whole legal system is deliberately designed to "crash" at an undefined behavior at any time and require a human in the loop to look at it.

So I suspect that consistency makes more sense for a legal system than completeness. We probably want rule of the law to be universal and fair (and thus, more consistent) than complete, because no one needs a law to describe every single life situation. People don't normally think of such law (save for thought experiments), and it would probably violate Kant's categorical imperative (as we'll all become just the means for such law) to design it so. I could argue that it's because life changes with every passing moment, with someone doing something novel and never done before.

I hope I understood you correctly (my apologies if I didn't get something right), and my arguments here make sense.

kelseyfrog · 2025-03-16T21:46:47 1742161607

I want to clarify that I'm considering the legal system as a whole. When I say completeness I'm referring to the legal system's ability to render a judgement: is something legal or illegal? That's different than having a law for every case. We can have ambiguous laws and still have a judge say that something is legal or illegal.

What we don't have or can't have is a court case where the judge says, "The legality of this is indeterminate." Our ability to always render a judgement is completeness.

drdaeman · 2025-03-17T04:11:02 1742184662

Hm. I was also thinking about the whole system, and I'm really not sure why a legal system needs to be able to answer such questions.

> What we don't have or can't have is a court case where the judge says, "The legality of this is indeterminate."

Why not? In reality judges can say "the legality of this is not for this court to decide" after all, can't they? And while (IIRC) the Supreme Court must use the law to establish new law, even then they can say "the law says nothing about this - go bug the legislative, not judicial". And then the legislative doesn't really need to use the existing law itself to codify new laws (as I get it, they can even pass laws that contradict the Constitution - it'll be up to the judicial to strike them down as such).

The way I understand how things are in real world, a legal system should be able to state if something is illegal, but it has no need to be able to universally state whenever something is legal or (and doesn't normally do so, save for doing it as a way to make exceptions from broader rules) or not. Kinda like how criminal law decrees one as "guilty" (proven to have done something illegal) or "not guilty" (not proven to have done something illegal) but never declares one innocent (and it doesn't need to - it works without it). Maybe this is where I'm possibly getting confused?

Can you please help me understand why do you think a legal system needs to be able to always answer that legality question, and particularly what's the use case for it (what does such property tries to achieve in practice)?

drdaeman · 2025-03-16T18:52:50 1742151170

> "unless the contrary is shown". Shown by whom? To what standard? What must be shown?

I have no idea about British law, but... As I get it, here we see the fallback logic that defines some defaults when dealing with lack of information (an abandoned newborn). Here we don't know newborn's nationality, so we must make an assumption - the lawmakers decided they didn't want to have proverbial NULLs in here.

- "What must be shown?" - I could be misunderstanding things (especially because English is not my native language), but I believe the intent here is pretty clear. To me it reads as applicability scope, constraining when the fallback kicks in, so we don't create logical contradictions. My understanding is that it says that if there's a known fact that contradicts (a) or (b) it would make those defaults inapplicable - which is logical, because we can't proceed to assume A if we know that ¬A is true. E.g. if there's a known fact (say, a DNA test or a witness testimony or some other evidence) that confirms e.g. that the infant's parent wasn't and isn't a British citizen or permanent resident, this piece won't apply and we won't consider infant to have been born of such father. I think the program at the bottom of the page 374 (page 5 in the PDF) agrees with me on this.

- "Shown by whom?" - as this is clearly unspecified, can't we safely assume that the existence of the fact alone is what should affect the execution flow, not who brought this fact to the light? That is, I believe it reads so anyone or anything can show it, but only the fact (and its truthfulness) itself matters. Again, don't see the issue here - but I'm curious if you have alternative interpretations how it could be read?

- "To what standard?" - this is the only real ambiguity here. Probably because the lawmakers writing the act weren't exactly capable or even aiming to figure out the theory of truth for a natural language. As I get it, for the legal stuff it's pretty normal to have undefined behaviors and return to them when they are triggered and someone is unhappy about the outcome and goes to the court to follow up and clarify the details. When we're rendering laws as machine programs, we mustn't forget what they apply to - to the society governed by those laws. Unless we're running a simulation, the society is external to the program - and it's the only sane design (that doesn't violate Kant's categorical imperative, so the laws should not treat the society as means to some goal but as the goal itself - if I understand the idea correctly, please correct me if I don't) that any or almost any statement may raise an exception-like situation that would require a human to look into it and fix (thus always deferring ultimate legislating to a rational being's will). In other words, it's our usual "I don't want to think too much about this edge case, gotta ship this thing already, so we'll get to this later if we get an error report". This is the universal mechanism that applies not just to this particular question, but to the other questions above that I thought are clear, if someone disagrees.

drdaeman · 2025-03-15T22:46:43 1742078803

> It's the extra parens, semi-colons, keywords and type annotations.

I always thought such minor syntactic differences are unimportant, except for the folks who still learn syntax and haven't seen too many languages out there to stop caring much about it.

YMMV of course, but whenever I need to jump hoops with some API or have things conveniently returned to me in a single call matters a lot for my developer happiness. Whenever my code needs semicolons or indentation or parens feels such a negligibly tiny nuance to me but things like this don't even blip on my mental radar... I always think about what the code does, and don't even see those details (unless I have a typo lol).

Maybe my opinion on this is just the echoes from the ancient C vs Pascal vs BASIC syntax holy wars while I was still a schoolkid, idk. I mean, when I wrote Scheme or Lisp I haven't really "seen" all those parentheses (but then, I just checked some Lisp code and syntax looks off and takes time to get through, since I haven't practiced it in a long while and it's pretty different from anything I've used any recently).

Again, YMMV, but `const chat = new LLM.Chat();` and `chat = RubyLLM.chat` are exactly the same thing to me - I don't remember actual tokens from the screen, I immediately mentally process those both as something like "instantiate a chat object and assign `chat` to it" (without really verbalizing it much, but as an concept/idea). And I don't think a little syntactic noise like `const` or `;` is making things worse or better for me. Although, to be fair, I could be wrong here - I haven't really did any experiments in this regards, with properly defined methodology and metrics, and my subjective perception could be deceptive. Sadly, I'm no scientist and not even sure how to set up one correctly...

drdaeman · 2025-03-15T22:10:59 1742076659

Surely you can have the same API elegancy and overall semantics in Python?

    chat = python_llm.Chat()
    _ = chat.ask"What's the best way to learn Python?")

    # Analyze images
    _ = chat.ask("What's in this image?", image="python_conf.jpg")

    # Generate images
    _ = python_llm.paint("a sunset over mountains in watercolor style")

    # Stream responses
    for chunk in chat.ask("Tell me a story about a Python programmer"):
        print(chunk.content)

    # Can be a class if necessary, but for this weather thingy we can probably do with a simple callable
    # Requires Python 3.9+ for typing.Annotated
    def get_weather(
        latitude: Annotated[Decimal, "Latitude of the location"], 
        longitude: Annotated[Decimal, "Longitude of the location"]
    ) -> str:
        """
        Gets current weather for a location.
        """
        ...

    _ = chat.with_tool(get_weather).ask("What's the weather in Berlin? (52.5200, 13.4050)")

(The `_ =` bits are mine, to emphasize we have a meaningful result and we're knowingly and willingly discarding it. Just a habit, I hope it doesn't bug people.)

Ruby has significantly more capable metaprogramming facilities, but they aren't used in RubyLLM, it's all just objects and methods (biggest difference being use of iterable in Python vs providing a block in Ruby, as I felt an iterable would be more Pythonic here), which is nothing Ruby-specific.

And IMHO advanced metaprogramming should be used carefully, as it may make code pretty but really hard to comprehend and analyze. My largest issue with Rails is difficulty to tell where things are coming from and what's available (lack of implicit imports and ability to re-open any class or module and inject more stuff in there so there's no single place that defines it is a double-edged sword that may lead to chaos if wielded carelessly - YMMV, of course, I'm merely stating my personal preferences here).

drdaeman · 2025-03-15T21:59:32 1742075972

Surely you can have semantically the same API in Go:

    // Must[T](T, error) T is necessary because of Go error handling differences
    chat := Must(gollm.Chat().WithModel("claude-3-7-sonnet-20250219"))
    
    resp := Must(chat.Ask("What's the difference between an unexported and an exported struct field?"))
    resp = Must(chat.Ask("Could you give me an example?"))

    resp = Must(chat.Ask("Tell me a story about a Go programmer"))
    for chunk := range resp {  // Requires Go 1.23+ for iterators
        fmt.Print(chunk.Content)
    }

    resp = Must(chat.WithImages("diagram1.png", "diagram2.png").Ask("Compare these diagrams"))

    type Search struct {
        Query string `description:"The search query" required:"true"`
        Limit int    `description:"Max results" default:"5"`
    }
    func (s Search) Execute() ([]string, error) { ... }

    resp = Must(chat.WithTool[Search]().Ask("Find documents about Go 1.23 features"))

And so on. Syntax is different, of course, but semantics (save for language-specific nuances, like error handling and lack of optional arguments) are approximately the same, biggest difference being WithSomething() having to precede Ask()

drdaeman · 2025-03-15T04:05:06 1742011506

I think it's the very nice-looking and clean high-level API that should be a pleasure to use (when it fits the job, of course).

I'm pretty sure this API semantics (instance builder to configure, and then it's ask/paint/embed with language-native way to handle streaming and declarative tools) would look beautiful and easy to use in many other languages, e.g. I can imagine a similar API - save, of course, for the Rails stuff - in Python, C# or Erlang. While this level of API may be not perfectly sufficient for all possible LLM use cases, it should certainly speed up development time when this level of API is all that's possible needed.

drdaeman · 2025-03-15T03:46:40 1742010400

Individual package.json files say ISC (e.g. https://github.com/JigsawStack/jigsawstack-mcp-server/blob/c...), although I've no idea whenever author mentioning the license name but not providing the full text of the license legally counts.