Hacker News new | past | comments | ask | show | jobs | submit | andrewnc's comments login

Hey this is great! How long does it take to generate the Wiki when setting up, and what have you found is the best way to keep it up to date?


It currently takes a few hours, but we're adding parallelization so typical repos will only take a few minutes.

To keep the wiki up to date, simply keep an active subscription and we'll take care of the rest!


I think "The Gong Show" was an old TV show about amateur talent acts. Sometimes good, most of the time terrible and hilariously unaware. Not sure if that's what was intended here.


So is the GP criticizing Python? If yes, I am curious to know why. No, I am not here to defend Python; the constant runtime exceptions due to typing mistakes are so tiring.


Yes! It definitely should be


My favorite part here is that it's a noun and a verb

"i-kwan mo yung kwan" -> "just thing the thing"


We have some preliminary work in this direction https://github.com/gretelai/multi-table

I love the idea of "table space" though. It would be fun to traverse this space and output a new database at each step, like a VAE.
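The traversal idea could be sketched as a toy latent walk: interpolate between two latent codes and decode each step into a new table. Everything here is hypothetical; `decode` is a stand-in for a trained VAE decoder, and the two-dimensional latents are just for illustration.

```python
def interpolate_latents(z_start, z_end, steps):
    """Linearly interpolate between two latent vectors (plain lists)."""
    out = []
    for i in range(steps):
        a = i / (steps - 1)
        out.append([(1 - a) * s + a * e for s, e in zip(z_start, z_end)])
    return out

# Stand-in decoder: a real VAE would map a latent code to table rows.
def decode(z):
    return {"feature_%d" % i: v for i, v in enumerate(z)}

# Each intermediate point would become one synthetic database.
tables = [decode(z) for z in interpolate_latents([0.0, 1.0], [1.0, 0.0], steps=5)]
```

The endpoints reproduce the two source tables; the steps in between are the "table space" being traversed.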


We've been working hard to make it super simple to get started with useful synthetic data.

If you want to know how to use this for your own problems, check out some of our other posts:

https://gretel.ai/blog/how-to-safely-work-with-another-compa...


My use case is generating JSON-NL events at a very high rate (10k up to 100k events/s) from samples of JSON-NL-encoded log data. Is this supported in OSS Gretel?

FYI, I'd built a hand-crafted generator using JSONNet templates and Golang, but I really wanted something that could model source data distributions accurately. The use case is large-scale load testing of customer workloads without requiring actual data.
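For comparison, here is a minimal baseline sketch in Python that fits per-field empirical value pools from sample events and resamples them as JSON-NL. It resamples fields independently, which is exactly the shortcoming a learned model addresses (no cross-field correlations); all names here are illustrative, not any library's API.

```python
import json
import random

def fit_empirical(samples):
    """Collect a per-field pool of observed values from sample events."""
    pools = {}
    for event in samples:
        for key, value in event.items():
            pools.setdefault(key, []).append(value)
    return pools

def generate_jsonl(pools, n, seed=0):
    """Emit n events as JSON-NL by independently resampling each field."""
    rng = random.Random(seed)
    lines = [json.dumps({k: rng.choice(v) for k, v in pools.items()})
             for _ in range(n)]
    return "\n".join(lines)

samples = [{"status": 200, "path": "/a"}, {"status": 500, "path": "/b"}]
print(generate_jsonl(fit_empirical(samples), 3))
```

A baseline like this is fast enough for load testing, but every field is drawn in isolation, which is why modeling the source distribution jointly is the harder, more valuable part.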


We're currently beta testing something that fits this use case directly. The models we have today are really great at capturing the original distribution, but they're not always the fastest; this new work will change that. Feel free to reach out (maybe on our Slack?) and we can see if we can get something working.


Blog post is out now https://gretel.ai/blog/introducing-gretel-amplify

They get 43,300 records per second on this example, which seems to be the right order of magnitude for you.


One example of this is back translation [0].

It works fairly well as model size increases.
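Back translation paraphrases text by translating it to a pivot language and back. A sketch of the round trip, with a hypothetical lookup table standing in for a real machine translation model (a real pipeline would call an MT model, e.g. MarianMT, in both directions):

```python
def translate(text, src, tgt):
    # Hypothetical lookup standing in for a real MT model.
    table = {
        ("en", "de", "the cat sat on the mat"): "die katze sass auf der matte",
        ("de", "en", "die katze sass auf der matte"): "the cat was sitting on the mat",
    }
    return table[(src, tgt, text)]

def back_translate(text, pivot="de"):
    """Round-trip through a pivot language to get a paraphrase."""
    return translate(translate(text, "en", pivot), pivot, "en")

paraphrase = back_translate("the cat sat on the mat")
```

The round-trip output is a paraphrase of the input, which can then be added to the training set as augmented data.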

Also, shameless plug: we're pretty proud of our generated anythings at Gretel [1]. It's tabular, text, and time series for now, but we recently had a blog post that shows how generated data can be useful for downstream ML [2].

[0] https://arxiv.org/abs/2110.05448

[1] https://gretel.ai/

[2] https://gretel.ai/blog/how-to-safely-work-with-another-compa...


> https://gretel.ai/

Where were you the last decade of my professional life?

I couldn’t find anyone to take my money for exactly this.


I saw a few DALL-E cover arts in there ;) Very cool!


Thanks for the feedback, I was hoping the course would be something of a redemption.

I self-published the book to prove I could do something like that. It got much more traction than I had planned and, in hindsight, I wish I had paid for editing and formatting at a minimum.

As for the quality of the paperback, that was unfortunately out of my control, as I used Amazon's print-on-demand service. Definitely a painful lesson for me.

In any case, I appreciate this comment and others here. I'm definitely working towards much higher substance with increased polish. :)


I've been a long-time fan of your content! Thanks, all, for these criticisms; we'll definitely take them to heart.

