Sure: https://chat.openai.com/share/0a3e52c6-1db8-422a-a98c-cb35006e5066 I laid ...

klipt · on Nov 14, 2023

Not bad, but suppose the dictionary has n lines and you only want to randomly sample k=100 of them, where n is so huge that you don't want to scan over the whole file at all.

Can you use random access into the file to sample k lines in O(k) time instead of O(n) time?

coder543 · on Nov 14, 2023

That is a problematic request for multiple obvious reasons, and for those same reasons, ChatGPT resisted providing an implementation that didn't require indexing the file. Once I told it "no indexing is allowed, provide a best effort solution", it relented and provided one.

Here is the provided solution and some discussion of the problems with the problem itself: https://chat.openai.com/share/54807663-17ca-4e7d-bc76-cd3cf3...

klipt · on Nov 14, 2023

> That is a problematic request for multiple obvious reasons

I'd prefer to think it's more like a real engineering problem, and less like a simple interview question :-)

And it definitely shows the limits of GPT here: it pointed out that the ends of the file might be tricky, but ignored the very conceptually simple solution of considering the file as circular (if you go past either end you simply wrap around).

And it misses the real problem with its implementation: the probability of sampling each line is now directly proportional to the length of the line before it (because it seeks into that line first and then skips it!)

So the word after "begins" is roughly twice as likely to come up as the word after "and" (6 letters versus 3, before counting the newline).
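To put numbers on that bias, here is a minimal sketch that computes the exact selection probabilities. It assumes a particular model of the seek-and-skip sampler (uniform random byte offset, discard the rest of the line it lands in, return the next line, wrap around at the end of the file); the function name and the word list are made up for illustration and are not taken from the linked chat.

```python
from fractions import Fraction


def seek_and_skip_probabilities(words):
    """Exact selection probability per word for the seek-then-skip sampler.

    Assumed model: pick a byte offset uniformly at random, discard the rest
    of the line it lands in, and return the following line, treating the
    file as circular. Words are assumed to be distinct.
    """
    # Each line occupies its encoded length plus one byte for the "\n".
    lengths = [len(w.encode("utf-8")) + 1 for w in words]
    total = sum(lengths)
    # Word i is returned whenever the offset lands anywhere in word i-1.
    # Index -1 for i == 0 is the circular wrap-around.
    return {w: Fraction(lengths[i - 1], total) for i, w in enumerate(words)}


probs = seek_and_skip_probabilities(["and", "cat", "begins", "dog"])
# "cat" follows "and" (4 bytes with newline); "dog" follows "begins" (7 bytes).
print(probs["dog"] / probs["cat"])  # 7/4
```

With the newline byte counted, the ratio is 7:4 rather than exactly 2:1, but the point stands: the chance of picking a line depends on the length of its predecessor, not on the line itself.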

PS in the case of dictionary words with a length limit of say 30 letters, there is still an O(k) general solution using rejection sampling.
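One way the rejection sampling idea could look, as a sketch and not necessarily the exact scheme I had in mind: pick a uniformly random byte offset and accept it only if it is the first byte of a line. Every line has exactly one accepting offset, so accepted lines are uniform regardless of length, and with lines capped at about 30 letters the expected number of tries per accepted sample is bounded by a constant. The name `sample_lines` and its parameters are hypothetical, and the code assumes a non-empty, newline-delimited UTF-8 file with at least k lines.

```python
import os
import random


def sample_lines(path, k, rng=random):
    """Return k distinct lines chosen uniformly at random, without a full scan.

    Assumptions: the file is non-empty, uses "\n" line endings, and has at
    least k lines (otherwise the loop below never finishes). Expected cost is
    O(k) seeks when line length is bounded and k is small relative to the
    number of lines.
    """
    size = os.path.getsize(path)
    starts = set()
    with open(path, "rb") as f:
        while len(starts) < k:
            pos = rng.randrange(size)  # uniform byte offset in [0, size)
            if pos > 0:
                f.seek(pos - 1)
                if f.read(1) != b"\n":
                    continue  # not the first byte of a line: reject
            starts.add(pos)  # duplicate offsets are absorbed by the set
        lines = []
        for pos in sorted(starts):
            f.seek(pos)
            lines.append(f.readline().rstrip(b"\n").decode("utf-8"))
    return lines
```

Something like `sample_lines("/usr/share/dict/words", 100)` would be the intended call (the path is just an example). The acceptance rate is one over the average line length in bytes, so the constant factor grows with line length, which is why the length limit matters.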

coder543 · on Nov 14, 2023

If you had actually read what it wrote:

"Remember, this is a probabilistic approach and works well if the lines in your file are roughly the same length. If the line lengths vary significantly, some lines will have a higher or lower chance of being selected."

It had already addressed "the real problem with its implementation" that you pointed out.

> PS in the case of dictionary words with a length limit of say 30 letters, there is still an O(k) general solution using rejection sampling.

Again, what ChatGPT wrote:

"In a typical scenario where lines can have variable lengths, true O(k) random sampling isn't feasible without some prior knowledge about the file."

Knowing that the limit is 30 characters without question counts as "some prior knowledge".

As an interviewer, it sounds like you're not hearing what the candidate is saying.

> And it definitely shows the limits of GPT here

I don't think anyone here is claiming that ChatGPT is limitless. The topic is "a coder considers the waning days of the craft", not "a coder considers the bygone days of the craft." ChatGPT is capable of solving many real world problems already. If it continues improving, some people are concerned about what that could mean, especially for less experienced developers.

How many people have you interviewed with that brainteaser that have actually provided the complete solution you're looking for? Vanishingly few, I would imagine, unless you were dropping some serious hints. It's not a real world problem. Most brainteasers have solutions that are "conceptually simple" once you already know the solution.

> I'd prefer to think it's more like a real engineering problem, and less like a simple interview question

It's absolutely not, though. It's exactly like the infamous trick questions that many tech interviews are known for, which have nothing to do with real engineering that you would encounter on the job.

You might as well have someone invert a binary tree for all the value that it provides.

klipt · on Nov 14, 2023

> How many people have you interviewed with that brainteaser

Zero, I just wanted to push the limits of the question in this thread to see what GPT did.

But you seem to not be enjoying that so let's call this quits.