With OpenAI and other LLMs, web development is accelerating. For example, I put together an AI call center demo ( https://www.youtube.com/watch?v=Vv7mI_qRrhE ) using OpenAI's o1-preview. I took a number of different TypeScript files plus the backend server code written in Python, added logs into the mix to make one massive prompt, and then let the model reason over it whenever I needed to accelerate writing additional code.
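Here is a minimal sketch of that prompt-assembly step, assuming hypothetical file paths and instruction text (the actual demo's files and prompts are not shown here):

```python
# Sketch: concatenate source files and logs into one large prompt for a reasoning model.
# File paths and the instruction text are illustrative assumptions.
from pathlib import Path
from openai import OpenAI

def build_prompt(paths, instruction):
    parts = [instruction]
    for p in paths:
        text = Path(p).read_text(encoding="utf-8")
        parts.append(f"\n--- {p} ---\n{text}")
    return "\n".join(parts)

client = OpenAI()  # reads OPENAI_API_KEY from the environment
prompt = build_prompt(
    ["frontend/callCenter.ts", "server/app.py", "logs/server.log"],  # hypothetical paths
    "Here is my call-center codebase and recent logs. Suggest the next code changes.",
)
response = client.chat.completions.create(
    model="o1-preview",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```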
Building on this, human preference optimization (such as Direct Preference Optimization or Kahneman-Tversky Optimization) could be used to help refine models to create better data.
If you are using no-code solutions, increasing how often an "idea" appears in a dataset will make that idea more likely to show up in the model's output.
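If you do have direct access to the training data, the same principle can be applied by hand. Here is a minimal sketch (the file names, target phrase, and repetition factor are assumptions) that upsamples examples containing a target "idea" in a JSONL fine-tuning set:

```python
# Sketch: duplicate (upsample) training examples that contain a target idea,
# so the fine-tuned model is more likely to reproduce it.
import json

TARGET = "refund policy"   # the "idea" to emphasize (assumption)
REPEAT = 3                 # how many extra copies to add per match (assumption)

with open("train.jsonl") as f:                    # hypothetical input file
    rows = [json.loads(line) for line in f]

upsampled = []
for row in rows:
    upsampled.append(row)
    if TARGET in json.dumps(row).lower():
        upsampled.extend([row] * REPEAT)          # extra copies of matching examples

with open("train_upsampled.jsonl", "w") as f:
    for row in upsampled:
        f.write(json.dumps(row) + "\n")
```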
If you are fine-tuning your own LLM, there are other ways to get your idea to appear. In the literature this is sometimes called RLHF or preference optimization, and here are a few approaches:
Direct Preference Optimization
This uses Elo-style scoring to learn pairwise preferences; Elo is the rating system used in chess and basketball to rank competitors who face off head-to-head.
@argilla_io on X.com has been doing some work in evaluating DPO.
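Here is a minimal sketch of the DPO objective, assuming you already have per-sequence log-probabilities from the policy being trained and from a frozen reference model (in practice a library such as TRL handles all of this):

```python
# Sketch: the core DPO loss for a batch of (chosen, rejected) preference pairs.
# Inputs are summed log-probabilities of each full response under the policy
# being trained and under a frozen reference model.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # Implicit reward margins relative to the reference model.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Pairwise logistic loss: push the chosen response above the rejected one.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage with made-up log-probabilities for a batch of two pairs.
loss = dpo_loss(torch.tensor([-12.0, -9.5]), torch.tensor([-14.0, -9.0]),
                torch.tensor([-12.5, -9.8]), torch.tensor([-13.5, -9.2]))
print(loss.item())
```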
KTO is an approach that uses unpaired ("mono") preference data. For example, it only asks whether a single response is "good or not." This is helpful for a lot of real-world situations (e.g. "Is the restaurant well liked?").
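To see the difference in data requirements, here is a sketch of what the two formats look like (the example texts and field names are illustrative; libraries such as TRL use similar layouts):

```python
# DPO needs paired data: the same prompt with a preferred and a rejected response.
dpo_example = {
    "prompt": "Is the restaurant well liked?",
    "chosen": "Yes, most recent reviews are positive.",
    "rejected": "I refuse to answer that.",
}

# KTO needs only unpaired, binary-labeled data: one response marked good or bad.
kto_examples = [
    {"prompt": "Is the restaurant well liked?",
     "completion": "Yes, most recent reviews are positive.", "label": True},
    {"prompt": "Is the restaurant well liked?",
     "completion": "I refuse to answer that.", "label": False},
]
```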
Data scientist here who spent a couple of years working with Unreal (to produce high-end data visualizations). Here are my thoughts:
> Blueprints suck!
Not really. Think of Blueprints like Python: it's good for routing and keeping track of things at a high level. Think of C++ as handling things at a lower level.
> I heard you need to start with blueprints.
Not really. After going through the basic tutorial that Unreal Sensei has on YouTube (https://youtu.be/gQmiqmxJMtA?si=TqBiiIe12M5hiCda), it is better to use a mix of Blueprints and C++ if you have any programming background.
> I don't know what to use for the IDE.
I used Rider for Unreal Engine, and it integrates well with the engine.
> So when do you use C++?
When I was doing data vis of census data, I needed a way to load 10,000 data points into memory. The out-of-the-box tools for Unreal didn't support this, so custom C++ was the way to go.
> But really, if I just want to get started with Unreal and want official tutorials, where do I go?
Also, a year ago I put together an online course on how to ramp up on Unreal Engine. The course ("Data Visualization in the Metaverse") is ideal if you already have a programming background. I put the course out on YouTube for free (https://www.youtube.com/playlist?list=PLKH3Xg62luIgPaB4fiFuT...) and I'm happy to answer any questions about it.
> Data scientist here who spent a couple of years working with Unreal (to produce high-end data visualizations)
That sounds amazing, could you share some of your visualizations?
10 thoughts on data visualization best practices and tools:
1) For interactive visualizations of data on 3D globes, I use a mix of C++, Python (for data cleaning), and Unreal Engine (with a plugin called Cesium). An example of this is at https://youtu.be/9i-tQ8Sr80o, and a sketch of the Python data-cleaning step appears after this list.
4) If you are trying to do 3D population density maps in R, many in the community recommend https://www.rayshader.com/.
6) If you are doing data vis for urban planning, odds are the planners are already using ArcGIS, and odds are you will end up using something like it too.
7) If you are trying to do data vis that relates to architecture, I would actually suggest starting with Twinmotion (which is part of the Unreal Engine ecosystem).
9) If you want to show some high-end maps fast, use GEOlayers 3. There is a YouTube channel called "Boone Loves Video" (https://www.youtube.com/channel/UCXyGw2OkrAzLhq1r7hyDZkA) where Boone often explains GEOlayers in his videos.
10) If you are trying to get to next-gen data visualization, my best guess is that you would use a mix of Blender, Nuke, Houdini, or After Effects. I have only used Blender and After Effects so far.
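For item 1, here is a rough sketch of the Python data-cleaning step, assuming a hypothetical CSV of point data with latitude/longitude/value columns that then gets imported into Unreal/Cesium (the column names and file paths are assumptions):

```python
# Sketch: clean a CSV of geolocated points before importing it into Unreal Engine
# (via Cesium or a custom C++ loader). Column names and file paths are assumptions.
import pandas as pd

df = pd.read_csv("raw_points.csv")                     # hypothetical input

# Drop rows with missing or out-of-range coordinates.
df = df.dropna(subset=["lat", "lon", "value"])
df = df[df["lat"].between(-90, 90) & df["lon"].between(-180, 180)]

# Normalize the value column so it maps cleanly onto bar height / color in the scene.
df["value_norm"] = (df["value"] - df["value"].min()) / (df["value"].max() - df["value"].min())

df.to_csv("clean_points.csv", index=False)             # file the engine-side loader reads
```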
Building on this, for many companies the leader of the IT org is primarily responsible for meeting end (or external) customer needs at the highest quality with a low total cost of ownership.
I just put together a tutorial for data analysts who want to learn BigQuery. The tutorial goes through the process that I use as a data scientist to examine data.
For the tutorial, I took a look at election contributions based on Federal Election Commission data. The numbers seem to align with what is reported in the press (so the analysis seems sound), but feedback is always welcome.
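As a hedged sketch of the kind of query the tutorial works through (the dataset, table, and column names below are placeholders, not the exact ones from the tutorial), you can run BigQuery from Python like this:

```python
# Sketch: aggregate contribution totals by state with the BigQuery Python client.
# The table and column names are hypothetical; substitute the FEC tables you load.
from google.cloud import bigquery

client = bigquery.Client()  # uses your default GCP project and credentials

query = """
    SELECT state, SUM(contribution_amount) AS total_contributions
    FROM `my_project.fec.individual_contributions`   -- hypothetical table
    GROUP BY state
    ORDER BY total_contributions DESC
    LIMIT 10
"""

for row in client.query(query).result():
    print(f"{row.state}: {row.total_contributions:,.0f}")
```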