I use it through Kagi Assistant, which serves the proper R1 model via Together.ai/Fireworks.ai.
My standard test is to ask the model to write a QSyntaxHighlighter subclass that uses Tree-sitter to implement syntax highlighting. o1 can do it after a few iterations, but R1’s output has been a mess. That said, its thought process revealed a few issues that I then fixed in my canonical implementation.
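For anyone who wants to try the same test, here is a rough sketch of one piece such a highlighter needs, not the implementation mentioned above. QSyntaxHighlighter calls highlightBlock once per text block, and setFormat takes a start and count relative to that block, while a Tree-sitter parse gives node ranges for the whole document, so the ranges have to be clipped per block. The `Span` type, the `spans_for_block` helper, and the capture names are hypothetical, and the sketch assumes the offsets have already been converted from Tree-sitter byte offsets to character positions in the document.

```python
from typing import NamedTuple


class Span(NamedTuple):
    start: int  # document-wide position where the highlighted node begins
    end: int    # document-wide position one past the node's last character
    kind: str   # hypothetical capture name, e.g. "comment"


def spans_for_block(spans, block_start, block_length):
    """Clip document-wide spans to one block.

    Returns (start_in_block, count, kind) tuples, the shape a
    highlightBlock override could pass on to setFormat.
    """
    block_end = block_start + block_length
    clipped = []
    for span in spans:
        lo = max(span.start, block_start)
        hi = min(span.end, block_end)
        if lo < hi:  # the span overlaps this block
            clipped.append((lo - block_start, hi - lo, span.kind))
    return clipped


# "int x;" is block 0, "/* a" is block 1, "b */" is block 2.
source = "int x;\n/* a\nb */\n"
spans = [Span(0, 3, "type"), Span(7, 16, "comment")]

first = spans_for_block(spans, 0, 6)    # block "int x;"
second = spans_for_block(spans, 7, 4)   # block "/* a"
third = spans_for_block(spans, 12, 4)   # block "b */"
```

The multi-line comment ends up as one clipped range in each of the two blocks it covers, which is the case a per-block highlighter otherwise has to track with block state.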
Thanks for adding detail! My prompts have been very in-the-bubble-of-Qt I'd say, less so about mashing together Qt and something else, which I agree is a good real-world test case.
I haven’t had the chance to try it with R1 yet, but if you implement a debugger class that screenshots the widget/QML element, dumps its metadata like GammaRay, and includes the source, you can feed that context into Sonnet and o1. They are scarily good at identifying bugs and making modifications if you include all that context, although you have to be selective about which metadata you include. I usually just dump a few things like properties, bindings, signals, etc.
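A minimal sketch of the bundling step, assuming the screenshot bytes and the metadata dump have already been collected on the Qt side. The `build_context` function, the `KEPT_KEYS` allow-list, and the JSON field names are all hypothetical; only the idea of keeping a few categories (properties, bindings, signals) comes from the comment above.

```python
import base64
import json

# Hypothetical allow-list: keep only the categories worth sending to the model.
KEPT_KEYS = ("properties", "bindings", "signals")


def build_context(metadata, source, screenshot_png, kept_keys=KEPT_KEYS):
    """Bundle a screenshot, a filtered metadata dump, and source into a JSON string."""
    # Drop every metadata category that is not on the allow-list.
    selected = {key: metadata[key] for key in kept_keys if key in metadata}
    return json.dumps(
        {
            "screenshot_png_base64": base64.b64encode(screenshot_png).decode("ascii"),
            "metadata": selected,
            "source": source,
        },
        indent=2,
        sort_keys=True,
    )


# Made-up dump for illustration; a real one would come from the debugger class.
dump = {
    "properties": {"width": 120},
    "signals": ["clicked()"],
    "children": ["..."],
}
context = build_context(dump, "Button { width: 120 }", b"\x89PNG")
```

Filtering with an allow-list rather than a deny-list keeps the payload small by default, so new metadata categories only reach the model when they are added on purpose.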