
With OpenAI and other LLMs, web development is accelerating. For example, I put together an AI call center demo ( https://www.youtube.com/watch?v=Vv7mI_qRrhE ) using OpenAI o1-preview. I could take a number of TypeScript files plus backend server code written in Python, add logs into the mix to make one massive prompt, and then let the AI reason through the cases where I needed to accelerate the writing of additional code.
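As a rough sketch of what that prompt assembly looks like (the file paths and the follow-up instruction here are hypothetical, not the exact ones from the demo):

    # Bundle source files and logs into one large prompt for a
    # reasoning model. Paths and the request are placeholders.
    from pathlib import Path
    from openai import OpenAI

    files = ["frontend/app.ts", "server/main.py", "server/app.log"]
    context = "\n\n".join(
        f"--- {p} ---\n{Path(p).read_text()}" for p in files
    )

    client = OpenAI()
    response = client.chat.completions.create(
        model="o1-preview",
        messages=[{
            "role": "user",
            "content": context + "\n\nExtend the call-routing logic.",
        }],
    )
    print(response.choices[0].message.content)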


Building on this, human preference optimization (such as Direct Preference Optimization or Kahneman-Tversky Optimization) could be used to help refine models so that they create better data.

I wrote about this more recently in the context of using LLMs to improve data pipelines. That blog post is at: https://www.linkedin.com/posts/ralphbrooks_bigdata-dataengin...


The author is actually doing something that will help with the job search, and that is reaching out by methods other than the resume.

Having a blog on the front page (coupled with the value that he can bring) should at least give him a few warm leads for job opportunities.


I hope so! Thought it was an interesting, funny post, which is why I submitted it. (I don't know the author.)


If you are using no-code solutions, increasing how often an "idea" appears in your dataset will make that idea more likely to show up in the model's output.

If you are fine-tuning your own LLM, there are other ways to get your idea to appear. In the literature this is sometimes called RLHF or preference optimization, and here are a few approaches:

Direct Preference Optimization

This learns pairwise preferences with a Bradley-Terry style model, the same family as the Elo scores used in chess and basketball to rank players who compete head-to-head.

@argilla_io on X.com has been doing some work in evaluating DPO.

Here is a decent thread on this: https://x.com/argilla_io/status/1745057571696693689?s=20
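For a feel of the mechanics, here is a minimal sketch of the DPO objective, assuming you already have the summed log-probs of each response under the policy being trained and under a frozen reference model (variable names are mine):

    # Sketch of the DPO loss (Rafailov et al., 2023).
    import torch
    import torch.nn.functional as F

    def dpo_loss(policy_chosen, policy_rejected,
                 ref_chosen, ref_rejected, beta=0.1):
        # The implicit reward is the log-ratio against the reference.
        chosen_reward = beta * (policy_chosen - ref_chosen)
        rejected_reward = beta * (policy_rejected - ref_rejected)
        # Bradley-Terry style pairwise objective: widen the margin
        # between the preferred and dispreferred response.
        return -F.logsigmoid(chosen_reward - rejected_reward).mean()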

Identity Preference Optimization

IPO is research from Google DeepMind. It removes the reliance on Elo-style pairwise scoring to address overfitting issues in DPO.

Paper: https://x.com/kylemarieb/status/1728281581306233036?s=20
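If I am reading the paper right, IPO swaps DPO's log-sigmoid for a squared regression target, which keeps the preference margin from being pushed toward infinity. A hedged sketch, reusing the inputs from the DPO snippet above:

    # Sketch of the IPO loss (Azar et al., 2023); tau plays the
    # role that beta plays in DPO.
    def ipo_loss(policy_chosen, policy_rejected,
                 ref_chosen, ref_rejected, tau=0.1):
        margin = ((policy_chosen - ref_chosen)
                  - (policy_rejected - ref_rejected))
        # Regress the margin toward 1/(2*tau) instead of maximizing it.
        return ((margin - 1 / (2 * tau)) ** 2).mean()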

Kahneman-Tversky Optimization

KTO is an approach that uses unpaired ("mono") preference data: it only asks whether a single response is good or not. This is helpful for a lot of real-world situations (e.g. "Is the restaurant well liked?").

Here is a brief discussion on it:

https://x.com/ralphbrooks/status/1744840033872330938?s=20

Here is more on KTO:

* Paper: https://github.com/ContextualAI/HALOs/blob/main/assets/repor...

* Code: https://github.com/ContextualAI/HALOs
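To make the paired-versus-unpaired distinction concrete, here is the shape of data each method consumes (a sketch; the field names are mine, though they are close to what the HALOs repo uses):

    # DPO/IPO need two responses to the same prompt, ranked:
    dpo_example = {
        "prompt": "Is the restaurant well liked?",
        "chosen": "Yes - reviews praise it consistently.",
        "rejected": "It is a hardware store.",
    }

    # KTO needs only one response plus a binary good/bad label,
    # which is far easier to collect from real-world feedback:
    kto_example = {
        "prompt": "Is the restaurant well liked?",
        "completion": "Yes - reviews praise it consistently.",
        "label": True,  # thumbs up / thumbs down
    }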


Data scientist here who spent a couple of years working with Unreal (to produce high-end data visualizations). Here are my thoughts:

> Blueprints suck! Not really. Think of Blueprints like Python: it's good for routing and keeping track of things at a high level. Think of C++ as handling things at a lower level.

> I heard you need to start with Blueprints. Not really. After going through the basic tutorial that Unreal Sensei has on YouTube (https://youtu.be/gQmiqmxJMtA?si=TqBiiIe12M5hiCda), it is better to do a mix of Blueprints and C++ if you have any programming background.

> I don't know what to use for the IDE. I used Rider for Unreal Engine and it has good integration into Unreal Engine.

> So when do you use C++? When I was doing data vis of census data, I needed a way to load 10,000 data points into memory. The "out of the box" tools in Unreal didn't support this, so custom C++ was the way to go.

> But really, if I just want to get started with Unreal and want official tutorials, where do I go?

After going through Unreal Sensei, I looked at https://dev.epicgames.com/community/unreal-engine/getting-st...; there are a ton of tutorials there for game developers.

Also, a year ago I put together an online course on how to ramp up on Unreal Engine. The course ("Data Visualization in the Metaverse") is ideal if you already have a programming background. I put the course out on YouTube for free (https://www.youtube.com/playlist?list=PLKH3Xg62luIgPaB4fiFuT...) and am happy to answer any questions about it.


> Data scientist here who spent a couple of years working with Unreal (to produce high end data visualizations)

That sounds amazing, could you share some of your visualizations?


> I don't know what to use for the IDE. I used Rider for Unreal Engine and it has good integration into Unreal Engine.

Don't forget Visual Assist! It's more flexible, is faster, uses less memory, works on uncompilable code, etc. (I work on it.)


10 thoughts on data visualization best practices and tools:

1) For interactive visualizations of data on 3D globes, I use a mix of C++, Python (for data cleaning), and Unreal Engine (with a plugin called Cesium). An example of this is at https://youtu.be/9i-tQ8Sr80o, and there is a small data-cleaning sketch after this list.

2) If I am trying to put together a 3D globe that has less quality but that can be accessed by the web, I use Mapbox GL JS, D3.js, and React. An example of this is at https://www.whiteowleducation.com/blog/2022/10/14/real-estat....

3) I have seen others use Three.js for developing 3D data visualizations on the web. An example of this in a data science context is at https://blog.fastforwardlabs.com/2019/04/29/visualizing-acti....

4) If you are trying to do 3D population density maps in R, many in the community recommend https://www.rayshader.com/.

5) If you are really trying to push the limits of data visualization, follow https://twitter.com/Arti_AR_video . He is doing data vis in AR. Robert Scoble had a good tweet the other day (https://twitter.com/Scobleizer/status/1620498790653501440?) showing Arti with 3D bar charts sitting on a table.

6) If you are doing data vis for urban planning, odds are the planners are already using ArcGIS, and you will end up working with something like it.

7) If you are trying to do data vis that relates to architecture, I would actually suggest starting with Twinmotion (which is part of the Unreal Engine ecosystem).

8) If you are trying to do data vis for simulations, it may be worth looking at https://www.nvidia.com/en-us/omniverse/ .

9) If you want to show some high-end maps fast, use GEOlayers 3. There is a YouTube channel called "Boone Loves Video" (https://www.youtube.com/channel/UCXyGw2OkrAzLhq1r7hyDZkA); Boone explains GEOlayers often in his videos.

10) My best guess is that next-gen data visualization will come from a mix of Blender, Nuke, Houdini, and After Effects. I personally have only used Blender and After Effects so far.
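Expanding on point 1: the Python data-cleaning step is mostly reshaping tabular data into something the engine can ingest. A minimal sketch, assuming a census extract with lat/lon columns (file and column names are hypothetical):

    # Reduce a census extract to the lat/lon/value triples that get
    # placed on the Cesium globe inside Unreal. Names are placeholders.
    import pandas as pd

    df = pd.read_csv("census_extract.csv")
    df = df.dropna(subset=["latitude", "longitude"])
    df = df[df["population"] > 0]

    # Unreal's DataTable CSV import wants a simple flat table.
    df[["latitude", "longitude", "population"]].to_csv(
        "globe_points.csv", index=False
    )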

Also, if you have any data visualization needs, I am currently on the job market. https://www.linkedin.com/in/ralphbrooks has details about me.


NLP machine learning can summarize text. ChatGPT can write code.

It seems like, deep down, the author wants an AI tool that summarizes code as a diagram, with different "styles" based on roles.


Building on this: for many companies, the main responsibility of the leader of the IT org is to focus on end (or outside) customer needs, delivered at the highest quality with a low "Total Cost of Ownership".


I just put together a tutorial for data analysts who want to learn BigQuery. The tutorial goes through the process that I use as a data scientist to examine data.

For the tutorial, I took a look at election contributions based on Federal Election Commission data. The numbers seem to align with what is reported in the press (so the analysis seems sound), but feedback is always welcome.
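If you would rather run the queries from Python than the BigQuery console, the client library keeps it short (a sketch; the public FEC table name below is an assumption, so verify it in the dataset browser first):

    # Query FEC contribution data from Python. The table name is an
    # assumption - check the bigquery-public-data listing first.
    from google.cloud import bigquery

    client = bigquery.Client()
    sql = """
        SELECT cmte_id, SUM(transaction_amt) AS total
        FROM `bigquery-public-data.fec.indiv20`
        GROUP BY cmte_id
        ORDER BY total DESC
        LIMIT 10
    """
    for row in client.query(sql).result():
        print(row.cmte_id, row.total)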


There is an interesting irony to this, in that SBF had contributed $38 million to PACs during this election cycle.

