bmichel · on March 26, 2025

There is an open-source tool named Aider that can use Gemini: https://aider.chat/

bmichel · on Oct 2, 2021

CouchDB 4 may change a few things in this space. It will be based on top of FoundationDB, which should make it a lot better for running "at scale PaaS".

WorldMaker · on Oct 4, 2021

Though from what I'm reading, the FoundationDB internals are adding even more compatibility breaks for databases that work just fine in CouchDB 2 and PouchDB.

It's nice to see "we're trying for internals that make it easier to run it at scale", but it's not really an "at scale PaaS" play if no vendors on the major clouds other than IBM's are interested in actually running whatever those internals are as a service.

bmichel · on June 7, 2020

The name SQLancer suggests that it is made for SQL databases. Do you think it would be easy to adapt it for document databases like CouchDB or MongoDB?

mrigger · on June 7, 2020

I believe that the approaches that we designed could be implemented/extended to document-oriented DBMS. Since I don't have any experience with those DBMS, I have no intuition on whether the approaches would uncover any bugs.

Implementation-wise, supporting CouchDB and MongoDB in SQLancer would be feasible. SQLancer shares little infrastructure between the testing implementations for the individual DBMS, and thus imposes only a few restrictions on them. We would definitely welcome implementations for any NoSQL DBMS in SQLancer!

bmichel · on May 31, 2020

The former proposal was made by 2 members of the Go core team, and they are also on the team behind the latter proposal. So, no, the first proposal wasn't good enough for the Go authors, and the new proposal is in no way a guerrilla attempt.

jhoechtl · on June 1, 2020

Thanks for the clarification. I followed the discussion in https://go-review.googlesource.com/c/go/+/187317 loosely, thought that this was the only way to go, and wasn't aware of a second investigation stream.

bmichel · on May 31, 2020

The syntax really reminds me of Zig - https://ziglang.org/documentation/master/#Introducing-the-Co.... I'd be curious to know if the authors of this paper have looked at it.

bmichel · on May 29, 2020

Out of curiosity, did you try Lezer?

https://marijnhaverbeke.nl/blog/lezer.html

carapace · on May 29, 2020

That is keen!

bmichel · on April 21, 2020

Thanks for sharing your experience on this rewrite. I have a few questions:

1. I'm a bit surprised that you don't persist the scenarios of the tests after a failure, only its seed. Does it mean that when you want to replay it, you have to redo the minimization phase? Or, do you have a way to find a seed for the minimized scenario?

2. Do you have some tests where you generate a set of operations and play them twice: one time with the mocks and one time on the real servers to check that they have the same results?

3. The article says "Note also the importance of the commit hash, as another type of “test input” alongside the seed: if the code changes, the course of execution may change too!". How do you ensure that a commit really fixes a bug, and does not just change the execution path to a happy path where the conditions of the bug are not met? By replaying a lot of tests, or do you write a new unit test that exhibits the bug to ensure reproducibility?

4. Do you think we can say that CanopyCheck applies randomized testing at the unit-test level and Trinity applies it at the integration-test level?

rbtying · on April 21, 2020

I'm not Sujay, but I worked on this system once upon a time too! (not anymore)

1. It does redo the minimization phase, but the actual execution is extremely fast, so this cost is minimal. Storing test outputs gets pretty expensive when you are running millions of tests, and since there are very few failures, recomputing this is worthwhile.

2. Yes and no! The article talks about this a little, but the "heirloom" system does essentially this, and the "native" filesystem variant of Trinity runs the same Trinity tester code against a real filesystem. The "no" is due to the issues with randomized testing -- since any operation that you do can affect the RNG, the exact operation that is run for a particular seed can change if you swap any part of the system. For regression tests, the operation sequence can be put into a separate, non-randomized test.

3. both

4. Testing in Nucleus is a sliding scale from "unit-test-like" to "integration-test-like" -- Trinity is mocking plenty of functionality; CanopyCheck is simultaneously testing many different components. It would probably be more accurate to say that CanopyCheck is testing a smaller subset of components with much greater fidelity, and Trinity is testing as much of the sync engine as is practical.
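
For readers who want to see the seed-only idea from point 1 in miniature, here is a rough Python sketch. It is not Dropbox's code: `generate_ops`, `run`, `minimize`, and the "move a deleted node" bug are all hypothetical names and behavior made up for illustration. The only thing it tries to show is that if the operation sequence is derived purely from the seed and the minimizer is deterministic, storing the seed is enough to get the same minimized scenario back.

```python
import random

def generate_ops(seed, n=20):
    """Derive a pseudo-random operation sequence from the seed alone (hypothetical generator)."""
    rng = random.Random(seed)  # private RNG, so nothing else can disturb the sequence
    return [(rng.choice(["create", "delete", "move"]), rng.randrange(5)) for _ in range(n)]

def run(ops):
    """Toy system under test. Returns False when the made-up bug is triggered."""
    deleted = set()
    for op, node in ops:
        if op == "delete":
            deleted.add(node)
        elif op == "create":
            deleted.discard(node)
        elif op == "move" and node in deleted:
            return False  # hypothetical bug: moving a node that was deleted
    return True

def minimize(ops):
    """Greedily drop one operation at a time while the failure still reproduces."""
    changed = True
    while changed:
        changed = False
        for i in range(len(ops)):
            candidate = ops[:i] + ops[i + 1:]
            if not run(candidate):  # still failing, so keep the shorter sequence
                ops = candidate
                changed = True
                break
    return ops

# Find a failing seed; only this integer needs to be persisted.
failing_seed = next(s for s in range(1000) if not run(generate_ops(s)))

# Later: regenerate from the seed and redo the minimization.
first = minimize(generate_ops(failing_seed))
replayed = minimize(generate_ops(failing_seed))
```

In this toy, `first` and `replayed` are identical because every step is a pure function of the seed. The caveat from point 2 applies here too: if `generate_ops` or `run` changes between commits, the same seed may no longer produce the same operations, which is why a found failure is better pinned as an explicit, non-randomized sequence for regression tests.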

bmichel · on April 22, 2020

Thanks for the answers, I appreciate that.

If you have time, I have another round of questions:

1. Did you try formal methods like TLA+ on the client? I think that the logic covered by CanopyCheck may be a nice target.

2. Do you have some tests with several clients running at the same time on a shared directory? In particular, I am thinking of the termination invariant when the clients are fighting because several users have reorganized the directory by moving a lot of stuff, and each client is trying to converge in a different direction (i.e., they are performing operations that cancel the ones made by other clients).

3. The article says "In the Nucleus data model, nodes are represented by a unique identifier". Does it happen that a node has to change its identifier? For example, in a scenario like this one:

      (ada) $ offline
    (grace) $ offline
    (grace) $ mv ~/Dropbox/shared/TODO.txt ~/Dropbox/private-grace/
      (ada) $ mv ~/Dropbox/shared/TODO.txt ~/Dropbox/private-ada/
      (ada) $ echo 'foo' >> ~/Dropbox/private-ada/TODO.txt
    (grace) $ echo 'bar' >> ~/Dropbox/private-grace/TODO.txt
      (ada) $ online
    (grace) $ online

rbtying · on April 22, 2020

1. We thought about this at one point, IIRC, and various parts of Dropbox sync have been formally modeled. But CanopyCheck and Trinity are useful in part because they test the real, production code -- the hash-indexing bug that Sujay mentioned elsewhere on this page is an implementation error, not a design error.

2. Kind of. There are tests for interactions between instances (in particular, between different Dropbox folders on the same machine), but running Dropbox twice on the same folder is explicitly not supported.

3. Yes, this happens. There's logic to handle these changes, since some applications expect to use in-place edits and others swap in a new file -- you don't actually need grace to have an interesting situation. You can experiment with this yourself in a Dropbox folder :)

bmichel · on Feb 18, 2020

https://medium.com/@herbcaudill/lessons-from-6-software-rewr... suggests building a new product without throwing away the legacy one.

bmichel · on Feb 10, 2019

Transpiling from ES2015 to ES5 is no longer the main use of Babel. For example, there is transforming JSX to normal JS. And object rest/spread properties (ES2018) are often used by front-end developers, even though they are not supported by Edge. So, I don't think that Babel and swc will be obsolete in 2 years. The most used transform plugins for those will change, but the parser and engine will remain. And having a fast and robust tool here is a good thing for this field.

bmichel · on Jan 15, 2019

There is also Blast (golang), built on top of Bleve.

- https://github.com/mosuka/blast
- http://blevesearch.com/

hardwaresofton · on Jan 15, 2019

Wow I actually forgot about Bleve!

I watched a talk on the new indexing engine a while back:

https://www.youtube.com/watch?v=zjG2Y01i3Kk

Can we attribute some of this renewed zeal in the search space to the creation of more approachable systems languages (i.e. Golang and Rust)? Maybe I just haven't been watching the search space but I feel it wasn't always this full of new projects putting up good numbers.

ddorian43 · on Jan 15, 2019

Yeah but it's golang, so it's kinda like java, so I see no pros in it TBH.

hardwaresofton · on Jan 15, 2019

There are a lot of differences between Golang and Java. As much as I dislike writing Java when I have a choice, the JVM (with Java or whatever else on top) is a very capable tool... Could you explain what you mean by there being "no pros"?

Are you maybe trying to get at the difficulty of tuning the JVM?

ddorian43 · on Jan 15, 2019

rust/c++/c has no gc and better performance/efficiency compared to java/golang. so you get excited for a library/db in those languages

golang is kinda a java alternative. a db/search-engine in java/golang kinda sucks (it will under pressure)

hardwaresofton · on Jan 15, 2019

While I definitely agree with you on the broad strokes of the differences between rust/c++/c and java/golang (representing languages without runtimes and those with them respectively), I'd say that golang is a bit more than a java alternative if we consider more than whether a runtime is included or not.

Of course, if the only consideration is whether a runtime is there or not, golang is identical to java but also identical to common lisp or maybe even interpreted languages like python.

I do want to point out that it's possible to write horribly buggy code in c++/c (less so in rust :), which can tank performance/efficiency when compared to a java/golang program. All things considered though, the ceiling on performance and efficiency is of course higher in manual memory management land.

Thanks for clarifying what you meant!

rakoo · on Jan 15, 2019

golang isn't even close to using the same amount of memory as java, so at least there's that.