Scrolling HN, I saw the two words I'm always wary of - "from scratch". Mostly because I'll click the link hoping to learn the mathematics behind a particular algorithm, only to see they've imported sklearn and skipped over all the explaining of how things ACTUALLY get done. Not that these types of tutorials don't have their place, but it's irksome to see something called "from scratch" when most of the hard part has been done for you.
With that, thank you Victor. Specifically because you did not do this at all and instead wrote a very easy-to-follow guide. I think this type of learning material will be very useful for CS and mathematics. The idea that very complicated algorithms should be explicitly implemented and then walked through, rather than left as symbols in a white paper, will help make the mathematics of CS more accessible to everyone.
And to anyone bringing up numpy, it's at a level of "prepackaged" I'm fine with. I'm not going to raise my own pigs and chickens to make a breakfast burrito, but saying I made it from scratch by microwaving a frozen one isn't going to cut it either. Numpy is like the basic ingredients of the recipe. While something like sklearn or tensorflow is perfectly acceptable, I wouldn't say that's the best method for learning CNNs.
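To make the "numpy as basic ingredients" point concrete, here's a rough sketch of what "from scratch with numpy" looks like in practice: the core convolution loop written out explicitly, with numpy only supplying arrays and elementwise math (a toy example, not the tutorial's actual code):

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D cross-correlation - the core op a from-scratch CNN implements.

    numpy gives us arrays and elementwise multiply/sum; the sliding-window
    logic itself is spelled out by hand, which is the whole point.
    """
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.ones((3, 3)) / 9.0  # a simple box-blur filter
print(conv2d(image, kernel).shape)  # (3, 3) - "valid" output is smaller than the input
```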
Thanks for this writeup! Having moved into Data Science as a primary career (after 25 years architecting software), it's frustrating to try to decipher the various symbology used in Physics, Economics, Computer Science whitepapers.
I need a Whitepaper Rosetta Stone. And this article is one such example.
I agree with you overall. This is purely speculative, but it seems the rise of pre-packaged ML solutions has shifted the meaning of "from scratch" toward "with sklearn". It's easy to feel like a short python script using sklearn is "from scratch" when you were using a WYSIWYG solution before.
That said, the book "Data Science from Scratch" is great, and I'd recommend it to those looking for a deeper understanding than just "import sklearn".
“Each of the 4 filters in the conv layer produces a 26x26 output, so stacked together they make up a 26x26x8 volume. All of this happens because of 3 × 3 (filter size) × 8 (number of filters) = only 72 weights!”
There seems to be a mismatch here between 4 and 8. Given that the weight count is computed as 3 × 3 × 8 = 72, the 4 is probably the typo.
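The arithmetic in the quoted passage can be checked directly. Assuming the article's MNIST-style setup (28x28 input, 3x3 filters, valid convolution - shapes taken from the quote, not verified against the article's code):

```python
import numpy as np

# Assumed shapes: 28x28 input, 3x3 filters, 8 filters (per the quoted passage).
input_size, filter_size, num_filters = 28, 3, 8

# Valid convolution shrinks each spatial dimension by (filter_size - 1).
output_size = input_size - filter_size + 1
filters = np.random.randn(num_filters, filter_size, filter_size)

print(output_size)                              # 26 - each filter yields a 26x26 map
print((output_size, output_size, num_filters))  # (26, 26, 8) - the stacked volume
print(filters.size)                             # 72 - total weights, 3 * 3 * 8
```

So the output volume is 26x26x8, consistent with 8 filters rather than 4.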
This is a great tutorial. However, every time I see RNN/CNN it's always applied to some video stream or set of images. I'd really like to find a similar tutorial applied to event logs or other text-based input. Does anyone have a good link for that?
> If you trained a network to detect dogs, you’d want it to be able to detect a dog regardless of where it appears in the image. Imagine training a network that works well on a certain dog image, but then feeding it a slightly shifted version of the same image. The dog would not activate the same neurons, so the network would react completely differently!
But CNNs only deal with translations. What if the image of the dog is rotated?
True! If dealing with stuff like rotations is a concern, you could augment the training set by applying small random transforms to it (like rotations, cropping, scaling, color adjustment, etc).
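A minimal sketch of that augmentation idea, using only numpy. For simplicity this uses axis-aligned rotations and flips; real pipelines use small-angle rotations, random crops, color jitter, etc. (illustrative only, not from the tutorial):

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(image):
    """Apply one random transform: maybe a horizontal flip, then a 90-degree rotation.

    The label stays the same - a rotated dog is still a dog - so the network
    sees more viewpoints of each class during training.
    """
    if rng.random() < 0.5:
        image = np.fliplr(image)
    k = rng.integers(0, 4)  # rotate by 0, 90, 180, or 270 degrees
    return np.rot90(image, k)

image = np.arange(16, dtype=float).reshape(4, 4)
batch = [augment(image) for _ in range(8)]

print(all(a.shape == image.shape for a in batch))           # True: shape preserved
print(all(np.isclose(a.sum(), image.sum()) for a in batch)) # True: pixels only move
```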
Well, human children do train their vision in a lot of different ways while playing, and probably develop a mechanism for dealing with all kinds of rotated objects very early.
For some objects, though, even adults aren't instantly good: reading text upside down is initially very hard, but with some practice it can be done. I've met teachers who can do it fluently from years of tutoring people across a desk.