Hacker News
Implementing a Convolutional Neural Network from Scratch in Python (victorzhou.com)
238 points by vzhou842 on May 22, 2019 | 25 comments



Scrolling HN, I saw the two words I'm always wary of: "from scratch". Mostly because I'll click the link hoping to learn the mathematics behind a particular algorithm, only to see they've imported sklearn and skipped over all the explanation of how things are ACTUALLY getting done. Not that these types of tutorials don't have their place, but it's irksome to see "from scratch" when most of the hard part is being done for you.

With that, thank you Victor, specifically because you did not do this at all and instead wrote a very easy-to-follow guide. I think this type of learning material will be very useful for CS and mathematics. Having very complicated algorithms explicitly implemented and then walked through, rather than left as symbols in a whitepaper, will help make the mathematics of CS more accessible to everyone.

And to anyone bringing up numpy: it's at a level of "prepackaged" I'm fine with. I'm not going to raise my own pigs and chickens to make a breakfast burrito, but saying I made it from scratch by microwaving a frozen one isn't going to cut it either. Numpy is like the basic ingredients of the recipe. While something like sklearn or tensorflow is perfectly acceptable, I wouldn't say it's the best method for learning CNNs.


Thanks for this writeup! Having moved into Data Science as a primary career (after 25 years architecting software), it's frustrating to try to decipher the varying symbology used in Physics, Economics, and Computer Science whitepapers.

I need a Whitepaper Rosetta Stone. And this article is one such example.


This story started off really badly, but what a plot twist!

I saw this very same article posted here on HN or Reddit last week, methinks, and one of the top comments was complaining precisely about numpy.

You disarmed that point right from the start, so kudos.


I stand corrected, based on a comment from the author found below: last week's post was the previous one in this series.


I agree with you overall. This is purely speculative, but it seems that the rise of pre-packaged ML solutions has shifted the meaning of "from scratch" toward "with sklearn". It's easy to feel like a short python script using sklearn is "from scratch" when you were using a WYSIWYG solution before.

That said, the book "Data Science from Scratch" is great, and I'd recommend it to those looking for a deeper understanding than just "import sklearn".


Hey, author here. Any/all feedback is welcome, and I'm happy to answer questions.

Previous discussion on HN of the "introduction to Neural Networks" referenced in this article: https://news.ycombinator.com/item?id=19320217

Runnable code from the article: https://repl.it/@vzhou842/A-CNN-from-scratch-Part-1

Github: https://github.com/vzhou842/cnn-from-scratch


Honestly: the "convolution" (cross-correlation) part in your article is the clearest step-by-step explanation I've ever encountered. Well done!
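For anyone skimming the thread: the operation the article walks through is plain valid (no-padding), stride-1 cross-correlation. A minimal numpy sketch of that step, not taken from the article itself:

```python
import numpy as np

def cross_correlate2d(image, kernel):
    # Slide the kernel over every valid position of the image (no padding,
    # stride 1) and take the elementwise product-sum at each location.
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.ones((3, 3))
print(cross_correlate2d(image, kernel))  # 2x2 output of 3x3 patch sums
```

Note a 4x4 image with a 3x3 kernel yields a (4-3+1) x (4-3+1) = 2x2 output, which is the same shape arithmetic the article uses for its 28x28 inputs.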


For those interested in a clear explanation of the backward pass (especially for convolution layers), here's a good resource:

https://arxiv.org/abs/1811.11987

https://github.com/Ranlot/backpropagation-CNNs
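The core of the conv-layer backward pass is compact enough to sketch here. For a valid, stride-1 layer on a single-channel input (an assumption for brevity; the linked resources cover the general case), the filter gradient is just a cross-correlation of the input with the output gradient:

```python
import numpy as np

def conv_filter_grad(image, d_out, kh=3, kw=3):
    # Gradient of the loss w.r.t. one conv filter: each output position's
    # gradient d_out[i, j] scales the input patch that produced it, and
    # those scaled patches are summed. Equivalent to cross-correlating
    # the input image with d_out.
    d_filter = np.zeros((kh, kw))
    for i in range(d_out.shape[0]):
        for j in range(d_out.shape[1]):
            d_filter += d_out[i, j] * image[i:i + kh, j:j + kw]
    return d_filter

image = np.arange(16, dtype=float).reshape(4, 4)
d_out = np.ones((2, 2))  # pretend upstream gradient
print(conv_filter_grad(image, d_out).shape)  # (3, 3), same shape as the filter
```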


came here exactly looking for the backward pass


I'll have a sequel explaining the backward pass with full code up by next week!



I like your writing style, clear and concise. Your NN posts would have been super helpful when I was taking an Intro AI course.


“Each of the 4 filters in the conv layer produces a 26x26 output, so stacked together they make up a 26x26x8 volume. All of this happens because of 3 × 3 (filter size) × 8 (number of filters) = only 72 weights!”

There seems to be a mismatch here with 4 and 8. Probably 4 is wrong.
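The shape arithmetic in the quoted passage is easy to verify directly. Assuming the article's setup of a 28x28 MNIST input and 8 filters of size 3x3 with valid padding:

```python
import numpy as np

num_filters, kh, kw = 8, 3, 3
filters = np.random.randn(num_filters, kh, kw)

input_h = input_w = 28
out_h = input_h - kh + 1  # valid padding, stride 1
out_w = input_w - kw + 1

print(out_h, out_w, num_filters)  # 26 26 8 -> a 26x26x8 output volume
print(filters.size)               # 72 weights total (3 * 3 * 8)
```

So the "26x26x8 volume" and the "72 weights" are consistent with 8 filters, which supports the reading that the "4" is the typo.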


First things first: stop stealing. You copied code and figures from Karpathy's lecture notes and did not cite him. It's called plagiarism.


This is a great tutorial. However, every time I see RNN/CNN it's always applied to some video stream or set of images. I'd really like to find a similar tutorial applied to event logs or other text-based input. Does anyone have a good link for that?


Andrej Karpathy has a pretty good introductory article to RNNs here: http://karpathy.github.io/2015/05/21/rnn-effectiveness/

He has some code which is pretty easy to follow to go along with the article: https://gist.github.com/karpathy/d4dee566867f8291f086


> If you trained a network to detect dogs, you’d want it to be able to a detect a dog regardless of where it appears in the image. Imagine training a network that works well on a certain dog image, but then feeding it a slightly shifted version of the same image. The dog would not activate the same neurons, so the network would react completely differently!

But CNNs only deal with translations. What if the image of the dog is rotated?


True! If dealing with stuff like rotations is a concern, you could augment the training set by applying small random transforms to it (like rotations, cropping, scaling, color adjustment, etc).
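A numpy-only sketch of that idea (an assumption on my part, not code from the article): restrict the random transforms to flips and 90-degree rotations so no image library is needed. Finer-grained rotations, cropping, or color jitter would need something like scipy.ndimage or Pillow.

```python
import numpy as np

def augment(image, rng):
    # Minimal augmentation: a random multiple-of-90-degree rotation
    # plus a horizontal flip half the time. Applied on the fly, each
    # epoch sees a different transformed copy of every training image.
    out = np.rot90(image, k=rng.integers(0, 4))
    if rng.random() < 0.5:
        out = np.fliplr(out)
    return out

rng = np.random.default_rng(0)
img = np.zeros((28, 28))
print(augment(img, rng).shape)  # (28, 28) -- square images keep their shape
```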


Ok, this makes me wonder how humans do it. Would a person who has never seen a rotated upside-down dog recognize it?


Well, human children do train their vision in a lot of different ways while playing, and probably develop a mechanism for dealing with all kinds of rotated objects very early on.

For some objects, though, even adults aren't instantly good. Reading text upside down is initially very hard, but with some practice it can be done; I've met teachers who can do it from years of tutoring people across a desk.


The way something looks can definitely change based on its rotation -- see http://thatchereffect.com for example.


For that level of effectiveness you would want to investigate using a capsule network.




Where's part 2? That's going to be the thing that's actually non-trivial.


It's in the works! I needed a few more days to polish it up. It'll probably be up by early next week.



