Hello everyone,
I want to answer the question: "What is the user activity most likely to lead to the user upgrading?"
What are the best practices for capturing potentially large amounts of usage data (e.g., capturing every click on every button), storing it cost-effectively without swamping your main app db, and then running reports/analysis on that data?
Is it best to log this stuff to text and then run a parser to load it into a separate db? To use external services (and if so, which ones work best, and why)? To store it in your main db? Is it best to capture ALL the data and then figure out what you want to extract from it later? Is that practical/realistic? Or is it better to focus on specific hypotheses and record only those specific data points?
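For concreteness, when I say "log to text and parse later" I'm imagining something like appending one JSON line per event and bulk-loading those lines into a reporting db afterwards — the field names and file path below are just made up for illustration:

    require "json"
    require "time"

    # Append one JSON line per UI event; a separate job can parse these
    # lines and bulk-load them into a reporting db overnight.
    def log_event(user_id, event_name, properties = {})
      line = {
        user_id:    user_id,
        event:      event_name,
        properties: properties,
        logged_at:  Time.now.utc.iso8601
      }.to_json
      File.open("log/usage_events.log", "a") { |f| f.puts(line) }
    end

    log_event(42, "clicked_upgrade_button", plan: "pro", source: "settings_page")

Is that roughly what people do, or is there a better-trodden path?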
I'm particularly interested in these questions in the context of a Rich Internet Application - i.e., something that looks more like a desktop app than a website. Website analytics are fairly well documented, with plenty of articles, tools, etc., but I haven't found much on doing this kind of tracking for complex RIAs that don't have simple and/or obvious conversion funnels.
For example, if you built an application like Huddle, how would you record data to figure out interesting usage patterns and which ones are most likely to lead to a purchase?
Thanks for any input/insight!
One thing I've learned over the years: tracking data is easy. You can generate a virtual firehose of noise just by snapping your fingers. Tracking data which actually drives decisions meaningful for the business is a bit trickier.
I use a mix of Mixpanel, the DB, and key/value stores. I have log files, too, but log files are where data goes to die.
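Not my exact code, but the shape of that mix looks roughly like this — the event name, properties, and Redis key are invented for illustration, while the calls are the standard mixpanel-ruby and redis gem ones:

    require "mixpanel-ruby"
    require "redis"

    MIXPANEL = Mixpanel::Tracker.new(ENV.fetch("MIXPANEL_TOKEN"))
    REDIS    = Redis.new

    # Send the business-meaningful event to Mixpanel for funnels/segmentation,
    # and keep a cheap per-user counter in Redis so the app can read it
    # instantly without touching the main db.
    def record_upgrade_click(user_id, plan)
      MIXPANEL.track(user_id.to_s, "Clicked Upgrade", "plan" => plan)
      REDIS.hincrby("usage:#{user_id}", "upgrade_clicks", 1)
    end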
In general, my core activity loop is: generate a hypothesis, ask what data I need to answer it, build either just enough to capture that data or a more generalized system (particularly when I keep asking the same kind of question -- this is how I ended up writing my own A/B testing framework), then analyze the data and try a few things in response to it. The hardest part about drowning in data is understanding, in your bones, that if it doesn't drive a decision at the end of the day, you're just wasting your time. Even though the graph is pretty.
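As an aside, the bucketing at the core of an A/B testing framework can be as simple as a stable hash of the user id plus the experiment name — this is a sketch of the general technique with a hypothetical experiment name, not my actual framework:

    require "zlib"

    # Deterministic assignment: the same user always lands in the same
    # variant of a given experiment, with no extra storage required.
    def variant_for(user_id, experiment, variants = %w[control treatment])
      variants[Zlib.crc32("#{experiment}:#{user_id}") % variants.length]
    end

    variant_for(42, "upgrade_button_copy")  # stable per user/experiment pair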
I generally get more bang for my buck out of simple hypotheses than complex ones. Simple ones are easy to describe, easy to instrument, easy to test, easy to improve, and likely to apply to a large portion of my users/business/etc. Complex ones are, well, the exact opposite. For example, while it is within my capabilities to target behavior at Mac-owning Firefox users who are interested in Catholicism... I'm probably going to spend time working on buttons seen by substantially everyone.
I generally do the first cut of analysis by playing around in my Rails console. When I find something interesting, I often promote it to a graph somewhere on the backend.
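By way of illustration (User/Event and their columns here are hypothetical stand-ins for whatever your schema actually looks like), a first-cut console session tends to be a handful of lines like:

    upgraders = User.where.not(upgraded_at: nil)
    free      = User.where(upgraded_at: nil)

    # What fraction of each group ever shared a document?
    def share_rate(scope)
      shared = Event.where(name: "shared_document", user_id: scope.select(:id))
                    .distinct.count(:user_id)
      shared.to_f / scope.count
    end

    puts "upgraders: #{(share_rate(upgraders) * 100).round(1)}% shared a document"
    puts "free users: #{(share_rate(free) * 100).round(1)}% shared a document"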
I think if you counted all the data points I've got filed away somewhere, you'd come up with a number in the low tens of millions, so my chief bottleneck is less technical/scaling (3 million entries in a key/value store? Oh well, iterate over all of them.) and more that the time, creativity, and attention needed to ask the right questions are in very limited supply.
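"Iterate over all of them" really is about this much code with the redis gem; the key pattern and field name are the same hypothetical ones as in the earlier sketch:

    require "redis"

    redis  = Redis.new
    users  = 0
    clicks = 0

    # SCAN through every per-user hash and tally the counters.
    redis.scan_each(match: "usage:*") do |key|
      users  += 1
      clicks += redis.hget(key, "upgrade_clicks").to_i
    end

    puts "#{clicks} upgrade clicks across #{users} users"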