Many product and data teams have to deal with broken pipelines because analytics events get changed in the tracking code. We built an open source tool to solve this problem. It infers the event schema and monitors changes to it on every pull request. It integrates with GitHub, but you can plug it into other CI/CD systems as well: https://github.com/marketplace/actions/syft-event-analytics-...
We currently support tracking code written in TypeScript with the Segment, Amplitude, Mixpanel, and Google Analytics SDKs. Please try it out - we would love your feedback and contributions (and stars)! Also happy to answer your questions here. Thanks!
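To make "monitoring changes to the event schema" concrete, here is a rough sketch of comparing the schema inferred on a base branch with the one inferred on a pull request. This is purely illustrative and not how the tool itself is implemented: the function `diff_event_schemas`, the `{event: {property: type}}` layout, and the sample event are all assumptions made up for this example.

```python
def diff_event_schemas(old, new):
    """Compare two hypothetical {event_name: {property: type_name}} mappings."""
    changes = []
    # Events that exist on only one side
    for event in sorted(old.keys() - new.keys()):
        changes.append(f"event removed: {event}")
    for event in sorted(new.keys() - old.keys()):
        changes.append(f"event added: {event}")
    # Events on both sides: compare their properties
    for event in sorted(old.keys() & new.keys()):
        old_props, new_props = old[event], new[event]
        for prop in sorted(old_props.keys() - new_props.keys()):
            changes.append(f"{event}: property removed: {prop}")
        for prop in sorted(new_props.keys() - old_props.keys()):
            changes.append(f"{event}: property added: {prop}")
        for prop in sorted(old_props.keys() & new_props.keys()):
            if old_props[prop] != new_props[prop]:
                changes.append(
                    f"{event}: {prop} changed type "
                    f"{old_props[prop]} -> {new_props[prop]}"
                )
    return changes


# Made-up schemas standing in for "base branch" and "pull request"
base = {"Order Completed": {"order_id": "string", "total": "number"}}
head = {"Order Completed": {"order_id": "string", "total": "string", "coupon": "string"}}

for change in diff_event_schemas(base, head):
    print(change)
```

A CI step could fail or post a comment whenever the returned list is non-empty, which is the general idea behind catching a breaking change before it reaches the pipeline.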
After testing it on today's Advent of Code problem, this new model looks way better at code generation. I couldn't get Codex to generate a correct solution for part 2, whereas ChatGPT created a correct-looking solution for both parts! Transcript below:
--- Day 1: Calorie Counting ---
Santa's reindeer typically eat regular reindeer food, but they need a lot of magical energy to deliver presents on Christmas. For that, their favorite snack is a special type of star fruit that only grows deep in the jungle. The Elves have brought you on their annual expedition to the grove where the fruit grows.
To supply enough magical energy, the expedition needs to retrieve a minimum of fifty stars by December 25th. Although the Elves assure you that the grove has plenty of fruit, you decide to grab any fruit you see along the way, just in case.
Collect stars by solving puzzles. Two puzzles will be made available on each day in the Advent calendar; the second puzzle is unlocked when you complete the first. Each puzzle grants one star. Good luck!
The jungle must be too overgrown and difficult to navigate in vehicles or access from the air; the Elves' expedition traditionally goes on foot. As your boats approach land, the Elves begin taking inventory of their supplies. One important consideration is food - in particular, the number of Calories each Elf is carrying (your puzzle input).
The Elves take turns writing down the number of Calories contained by the various meals, snacks, rations, etc. that they've brought with them, one item per line. Each Elf separates their own inventory from the previous Elf's inventory (if any) by a blank line.
For example, suppose the Elves finish writing their items' Calories and end up with the following list:
1000
2000
3000

4000

5000
6000

7000
8000
9000

10000
This list represents the Calories of the food carried by five Elves:
The first Elf is carrying food with 1000, 2000, and 3000 Calories, a total of 6000 Calories.
The second Elf is carrying one food item with 4000 Calories.
The third Elf is carrying food with 5000 and 6000 Calories, a total of 11000 Calories.
The fourth Elf is carrying food with 7000, 8000, and 9000 Calories, a total of 24000 Calories.
The fifth Elf is carrying one food item with 10000 Calories.
In case the Elves get hungry and need extra snacks, they need to know which Elf to ask: they'd like to know how many Calories are being carried by the Elf carrying the most Calories. In the example above, this is 24000 (carried by the fourth Elf).
Write a program to find the Elf carrying the most Calories. How many total Calories is that Elf carrying?
To solve this problem, we can keep track of the total Calories for each Elf as we read in the input. When we encounter a blank line, we can start a new count for the next Elf. When we have finished reading all of the input, we can find the Elf with the most Calories by looking for the maximum total Calories.
Here is one possible solution in Python:
# Read the input
calories = []
current_elf = []
for line in input:
    line = line.strip()
    if line == '':
        # Start a new count for the next Elf
        calories.append(current_elf)
        current_elf = []
    else:
        # Add the Calories to the current Elf's count
        current_elf.append(int(line))

# Add the last Elf's count
calories.append(current_elf)

# Find the Elf with the most Calories
most_calories = 0
for elf in calories:
    total_calories = sum(elf)
    if total_calories > most_calories:
        most_calories = total_calories

# Print the result
print(most_calories)
Given the input from the example above, this program would output 24000, which is the correct answer.
--- Part Two ---
By the time you calculate the answer to the Elves' question, they've already realized that the Elf carrying the most Calories of food might eventually run out of snacks.
To avoid this unacceptable situation, the Elves would instead like to know the total Calories carried by the top three Elves carrying the most Calories. That way, even if one of those Elves runs out of snacks, they still have two backups.
In the example above, the top three Elves are the fourth Elf (with 24000 Calories), then the third Elf (with 11000 Calories), then the fifth Elf (with 10000 Calories). The sum of the Calories carried by these three elves is 45000.
Write a program to find the top three Elves carrying the most Calories. How many Calories are those Elves carrying in total?
To solve this problem, we can modify the solution from Part 1 to keep track of the top three Elves with the most Calories, instead of just the Elf with the most Calories.
Here is one possible solution in Python:
# Read the input
calories = []
current_elf = []
for line in input:
    line = line.strip()
    if line == '':
        # Start a new count for the next Elf
        calories.append(current_elf)
        current_elf = []
    else:
        # Add the Calories to the current Elf's count
        current_elf.append(int(line))

# Add the last Elf's count
calories.append(current_elf)

# Find the top three Elves with the most Calories
top_elves = []
for elf in calories:
    total_calories = sum(elf)
    if len(top_elves) < 3:
        # Add the Elf to the list of top Elves
        top_elves.append((total_calories, elf))
    else:
        # Sort the top Elves by total Calories
        top_elves = sorted(top_elves, key=lambda x: x[0], reverse=True)
        # If the current Elf has more Calories than the Elf with the
        # least number of Calories among the top Elves, replace that Elf
        if total_calories > top_elves[-1][0]:
            top_elves[-1] = (total_calories, elf)

# Calculate the total Calories for the top three Elves
total_calories = sum([x[0] for x in top_elves])

# Print the result
print(total_calories)
Given the input from the example above, this program would output 45000, which is the correct answer.
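One caveat for anyone who wants to run the transcript's code: `input` is never defined there, so it has to be replaced with an iterable of lines. Below is a self-contained sketch of the same idea that can be run directly. It is my own variant, not part of the transcript: it assumes the puzzle text is available as a string, and it uses `heapq.nlargest` in place of the manual top-three bookkeeping. The names `EXAMPLE`, `elf_totals`, and `top_n_total` are made up for this sketch.

```python
import heapq

# The example inventory from the puzzle; blank lines separate the Elves
EXAMPLE = """1000
2000
3000

4000

5000
6000

7000
8000
9000

10000
"""


def elf_totals(text):
    """Return one Calorie total per Elf."""
    totals = []
    current = 0
    seen_item = False
    for line in text.splitlines():
        line = line.strip()
        if line:
            current += int(line)
            seen_item = True
        elif seen_item:
            # A blank line closes the current Elf's inventory
            totals.append(current)
            current = 0
            seen_item = False
    if seen_item:
        # The last Elf has no trailing blank line
        totals.append(current)
    return totals


def top_n_total(text, n):
    """Sum the n largest per-Elf totals."""
    return sum(heapq.nlargest(n, elf_totals(text)))


print(top_n_total(EXAMPLE, 1))  # part 1: 24000
print(top_n_total(EXAMPLE, 3))  # part 2: 45000
```

For a real puzzle input, the string could be read from a file and passed to `top_n_total` unchanged.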
> Check out Zbigniew Michalewicz's other books @ ...
Thank you! This led me to his book on evolution programs (Genetic Algorithms + Data Structures = Evolution Programs), which is excellent as well.
This looks very cool. At-least-once semantics are the way to go because most tasks require idempotence anyway, and that helps in dealing with multiple delivery. Strict FIFO ordering is not always needed either, as long as you avoid starvation - most of the time you need "reliable" deferred execution ("durable threads").
I started prototyping something along these lines on top of Riak (it is incomplete - missing leases, etc. - but that should be straightforward to add):
https://github.com/isbo/carousel
It is a durable and loosely FIFO queue. It is AP because of Riak+CRDTs. It is a proof of concept - would be nice to build it on top of riak_core instead of as a client library.
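To illustrate the point about idempotence under at-least-once delivery, here is a minimal sketch of a handler that tolerates duplicate deliveries by remembering task IDs it has already applied. It is unrelated to the carousel code: `IdempotentWorker` and its fields are hypothetical, and the in-memory set stands in for what would have to be durable storage in a real system.

```python
class IdempotentWorker:
    """Applies each task at most once, even if it is delivered several times."""

    def __init__(self):
        # Assumption: in a real system this set would live in durable storage
        self.processed_ids = set()
        self.balance = 0

    def handle(self, task_id, amount):
        """Return True if the task was applied, False if it was a duplicate."""
        if task_id in self.processed_ids:
            return False
        self.balance += amount
        self.processed_ids.add(task_id)
        return True


worker = IdempotentWorker()

# "t1" is delivered twice, as an at-least-once queue is allowed to do
deliveries = [("t1", 10), ("t2", 5), ("t1", 10)]
results = [worker.handle(task_id, amount) for task_id, amount in deliveries]

print(results)         # [True, True, False]
print(worker.balance)  # 15
```

With handlers written this way, the queue only has to guarantee that a task is eventually delivered, not that it is delivered exactly once.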
I personally enjoyed learning more about the different languages that were used at each level. Granted, I'd been meaning to try out Node/Scala/Go in a more meaningful manner, and the CTF problems were the perfect setting. The downside is that it can be difficult to learn a new language and solve the problem at hand at the same time.
I agree level 4 was a bit undercooked and not as didactic as it could be. One could simply clear the level by using Raft as a black box. Personally, I enjoyed level 3 the most, given the trade-offs that were present in solving it.