NO NO NO!!! Don’t start with Venn diagrams, sets, and other such fluff. Reminds me of the thin, little book they tried sticking on us in my probability class; undergrad EE. It was meant for math majors.
There is a book “Probability and Statistics for Engineers and Scientists” by Ronald Walpole. That book is excellent. Rolling dice and pulling colored marbles from jars is how you teach probability.
I studied probability during my undergrad (and high school) using dice, coins and other such things. It made sense to me but there was a dark area in my understanding. It felt like a blind spot and I could never get into it. In the final year of engineering, we had someone do a quick refresher on probability as a prelude to a longer course on pattern recognition and he described the whole thing using set theory (Venn diagrams, functions mapping from one space to another etc.) and I felt that the blind spot was illuminated. So, I don't know if starting from there would make sense but I do think it's useful, at least at some point in your studies, to look at the whole system through this lens.
I've been working through http://www.greenteapress.com/thinkbayes/ and am quite enjoying it. My only complaint is that he, as intended, teaches using programs and a computer and I learn better by doing stuff by hand. He also has a think stats book at http://www.greenteapress.com/thinkstats/ which people might find interesting.
There is a good connection between probability and Venn diagrams: Both are about area. Probability is about area where the area of everything under consideration is 1. So, there is a set of trials. It has area 1. Each subset of the set of trials is an event and has an area, its probability. Then we can move on to random variables, distributions of random variables, independence of events and random variables, the event that a random variable has value <= some real number x, etc.
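To make the area picture concrete, here is a rough sketch in Python. It assumes the set of trials is the 36 ordered outcomes of two fair dice, and the names (omega, prob, total and so on) are just my own choices for illustration. Area is simply the share of outcomes, so the whole set has area 1.

```python
from fractions import Fraction
from itertools import product

# Set of trials: all 36 ordered outcomes of rolling two fair dice.
omega = set(product(range(1, 7), repeat=2))

def prob(event):
    """'Area' of an event: its share of the set of trials."""
    return Fraction(len(event & omega), len(omega))

# Events are subsets of omega.
first_is_even = {w for w in omega if w[0] % 2 == 0}
total_is_seven = {w for w in omega if w[0] + w[1] == 7}

# A random variable is a function on omega; here, the sum of the dice.
def total(w):
    return w[0] + w[1]

# The event "the random variable has value <= x" is again a subset.
def total_at_most(x):
    return {w for w in omega if total(w) <= x}

# Two events are independent when the area of their overlap
# equals the product of their areas.
def independent(a, b):
    return prob(a & b) == prob(a) * prob(b)

print(prob(omega))                                 # area of everything
print(prob(total_at_most(4)))                      # area of {total <= 4}
print(independent(first_is_even, total_is_seven))  # overlap vs. product
```

Exact fractions are used instead of floats so the independence check is a true equality test rather than a comparison with rounding error.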
In pure math, since H. Lebesgue in about 1900, the usual good theory of area is Lebesgue's measure theory. The ordinary ideas of area we learned in grade school, plane geometry, and calculus are all special cases. But Lebesgue's theory of area handles some bizarre, pathological, extreme cases. And we can show that there can be no really perfect theory of area -- e.g., there have to be some bizarre subsets of the real line to which no nice theory of area can assign a length. But, once we have the Lebesgue theory, the usual way to show that there is a subset of the real line without an area uses the axiom of choice.
Well, in 1933, A. Kolmogorov wrote a paper showing how Lebesgue's theory of area would make a solid foundation for probability, and that approach is the standard one for advanced work in probability, statistics, and stochastic processes.
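For reference, the foundation usually comes down to three axioms. This is the standard textbook statement written in common modern notation (my choice of symbols, not a quotation from Kolmogorov): a set of trials $\Omega$, a collection $\mathcal{F}$ of events (subsets of $\Omega$ that get an area), and a function $P$ that assigns the area.

```latex
\begin{align*}
  &\text{(1) Non-negativity:} && P(A) \ge 0 \quad \text{for every } A \in \mathcal{F}, \\
  &\text{(2) Total area 1:}   && P(\Omega) = 1, \\
  &\text{(3) Countable additivity:} &&
     P\Bigl(\bigcup_{i=1}^{\infty} A_i\Bigr) = \sum_{i=1}^{\infty} P(A_i)
     \quad \text{for pairwise disjoint } A_1, A_2, \ldots \in \mathcal{F}.
\end{align*}
```

Axiom (3) is where the measure theory comes in: it is the same additivity requirement that Lebesgue's theory places on area, and it is why $\mathcal{F}$ sometimes cannot be every subset of $\Omega$.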
I agree that dice and marbles are great for building fundamental intuition. They only take you so far, though, and it would be terribly wasteful not to use mathematical machinery that already exists. Practically applied mathematics is a difficult tool to wield but incredibly powerful. That is, you need to know when and how to apply it, but when it's used correctly it's immensely practical.