Hacker News new | past | comments | ask | show | jobs | submit | andrewnc's comments login

Hey this is great! How long does it take to generate the Wiki when setting up, and what have you found is the best way to keep it up to date?


It currently takes a few hours, but we're adding parallelization so typical repos will only take a few minutes.

To keep the wiki up to date, simply keep an active subscription and we'll take care of the rest!


I think "The Gong Show" was an old TV show about amateur talent acts. Sometimes good, most of the time terrible and hilariously unaware. Not sure if that's what was intended here.


So is the GP criticizing Python? If yes, I am curious to know why. No, I am not here to defend Python; the constant runtime exceptions due to typing mistakes are so tiring.


Yes! It definitely should be


My favorite part here is that it's a noun and a verb

"i-kwan mo yung kwan" -> "just thing the thing"


We have some preliminary work in this direction https://github.com/gretelai/multi-table

I love the idea of "table space" though. It would be fun to traverse this space and output a new database at each step, like a VAE.
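The traversal idea could be sketched as a toy latent walk: interpolate between two latent codes and decode each step into a new table. Everything here is hypothetical; `decode` is a stand-in for a trained VAE decoder, and the two-dimensional latents are just for illustration.

```python
def interpolate_latents(z_start, z_end, steps):
    """Linearly interpolate between two latent vectors (plain lists)."""
    out = []
    for i in range(steps):
        a = i / (steps - 1)
        out.append([(1 - a) * s + a * e for s, e in zip(z_start, z_end)])
    return out

# Stand-in decoder: a real VAE would map a latent code to table rows.
def decode(z):
    return {"feature_%d" % i: v for i, v in enumerate(z)}

# Each intermediate point would become one synthetic database.
tables = [decode(z) for z in interpolate_latents([0.0, 1.0], [1.0, 0.0], steps=5)]
```

The endpoints reproduce the two source tables; the steps in between are the "table space" being traversed.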


We've been working hard to make it super simple to get started with useful synthetic data.

If you want to know how to use this for your own problems, check out some of our other posts:

https://gretel.ai/blog/how-to-safely-work-with-another-compa...


My use case is generating JSON-NL events at a very high rate (10k up to 100k events/s) from samples of JSON-NL-encoded log data. Is this supported in OSS Gretel?

FYI, I'd built a hand-crafted generator using JSONNet templates and Golang, but I really wanted something that could model source data distributions accurately. The use case is large-scale load testing of customer workloads without requiring actual data.
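For comparison, here is a minimal baseline sketch in Python that fits per-field empirical value pools from sample events and resamples them as JSON-NL. It resamples fields independently, which is exactly the shortcoming a learned model addresses (no cross-field correlations); all names here are illustrative, not any library's API.

```python
import json
import random

def fit_empirical(samples):
    """Collect a per-field pool of observed values from sample events."""
    pools = {}
    for event in samples:
        for key, value in event.items():
            pools.setdefault(key, []).append(value)
    return pools

def generate_jsonl(pools, n, seed=0):
    """Emit n events as JSON-NL by independently resampling each field."""
    rng = random.Random(seed)
    lines = [json.dumps({k: rng.choice(v) for k, v in pools.items()})
             for _ in range(n)]
    return "\n".join(lines)

samples = [{"status": 200, "path": "/a"}, {"status": 500, "path": "/b"}]
print(generate_jsonl(fit_empirical(samples), 3))
```

A baseline like this is fast enough for load testing, but every field is drawn in isolation, which is why modeling the source distribution jointly is the harder, more valuable part.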


We're currently beta testing something that fits this use case directly. The models we have today are really great at capturing the original distribution, but they're not always the fastest; this new work will change that. Feel free to reach out (maybe on our Slack?) and we can see if we can get something working.


Blog post is out now https://gretel.ai/blog/introducing-gretel-amplify

They get 43,300 records per second on this example, which seems to be the right order of magnitude for you.


One example of this is back translation [0].

It works fairly well as model size increases.
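Back translation paraphrases text by translating it to a pivot language and back. A sketch of the round trip, with a hypothetical lookup table standing in for a real machine translation model (a real pipeline would call an MT model, e.g. MarianMT, in both directions):

```python
def translate(text, src, tgt):
    # Hypothetical lookup standing in for a real MT model.
    table = {
        ("en", "de", "the cat sat on the mat"): "die katze sass auf der matte",
        ("de", "en", "die katze sass auf der matte"): "the cat was sitting on the mat",
    }
    return table[(src, tgt, text)]

def back_translate(text, pivot="de"):
    """Round-trip through a pivot language to get a paraphrase."""
    return translate(translate(text, "en", pivot), pivot, "en")

paraphrase = back_translate("the cat sat on the mat")
```

The round-trip output is a paraphrase of the input, which can then be added to the training set as augmented data.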

Also, shameless plug: we're pretty proud of our generated anythings at Gretel [1]. It's tabular, text, and time series for now, but we recently had a blog post that shows how generated data can be useful for downstream ML [2].

[0] https://arxiv.org/abs/2110.05448

[1] https://gretel.ai/

[2] https://gretel.ai/blog/how-to-safely-work-with-another-compa...


> https://gretel.ai/

Where were you the last decade of my professional life?

I couldn’t find anyone to take my money for exactly this.


I saw a few DALL-E cover arts in there ;) Very cool!


Thanks for the feedback, I was hoping the course would be something of a redemption.

I self-published the book to prove I could do something like that. It got much more traction than I had planned and, in hindsight, I wish I had paid for editing and formatting at a minimum.

As for the quality of the paperback, that was unfortunately out of my control, as I used Amazon's print-on-demand service. Definitely a painful lesson for me.

In any case, I appreciate this comment and others here. I'm definitely working towards much higher substance with increased polish. :)


I've been a long-time fan of your content! Thanks, all, for these criticisms; we'll definitely take them to heart.

