Ask HN: How do you capture and analyse usage patterns on your RIA?
18 points by swombat on March 10, 2010 | 12 comments
Hello everyone,

I want to answer the question: "What is the user activity most likely to lead to the user upgrading?"

What are the best practices for capturing potentially large amounts of usage data (e.g., perhaps, capturing every click on every button?), storing it cost-effectively without swamping your main app db, and then running reports/analysis on that data?

Is it best to log this stuff to text and then run a parser to load it into a separate db? Or to use external services (which ones work best, and why)? Or to store it in your main db? Is it best to capture ALL the data and then figure out what you want to extract from it later? Is that practical/realistic? Or is it better to focus on specific hypotheses and record only those specific data points?

I'm particularly interested in these questions in the context of a Rich Internet Application, i.e. something that looks more like a desktop app than a website. Website analytics are fairly well documented, with plenty of articles, tools, etc., but I haven't found much on doing this kind of tracking for complex RIAs that don't have simple or obvious conversion funnels.

For example, if you built an application like Huddle, how would you record data to figure out interesting usage patterns and which ones are more likely to lead to a purchase?

Thanks for any input/insight!




I'm kind of a metrics junkie.

One thing I've learned over the years: tracking data is easy. You can generate a virtual firehose of noise just by snapping your fingers. Tracking data which actually drives decisions meaningful for the business is a bit trickier.

I use a mix of Mixpanel, the DB, and key/value stores. I have log files, too, but log files are where data goes to die.

In general my core activity loop is generating a hypothesis, asking what I need to answer the hypothesis, either building just what I need to capture that data or a more generalized system (particularly when I keep asking the same kind of questions -- this is how I ended up writing my own A/B testing framework), then analyzing the data and trying a few things in response to it. The hardest part about drowning in data is understanding, in your bones, that if it doesn't drive a decision at the end of the day, you're just wasting your time. Even though the graph is pretty.

I generally get more bang for my buck out of simple hypotheses than complex ones. Simple ones are easy to describe, easy to instrument, easy to test, easy to improve, and likely to apply to a large portion of my users/business/etc. Complex ones are, well, the exact opposite. For example, while it is within my capabilities to target behavior at Mac-owning Firefox users who are interested in Catholicism... I'm probably going to spend time working on buttons seen by substantially everyone.

I generally do the first cut of analysis by playing around in my Rails console. When I find something interesting in playing around, I often promote it to a graph somewhere on the backend.

I think if you count all the data points I've got filed away somewhere you'd come up with a number in the low tens of millions, so my chief bottleneck is less technical/scaling (3 million entries in a key/value store? Oh well, iterate over all of them.) and more that the time, creativity, and attention to ask the right questions are very limited.


I record everything to the Google App Engine datastore, then have a script, run from cron, that exports the data into a simple text file on my machine. Then I can run Python scripts on my own computer to go through the text. I get about 60MB of events per day. Soon I will have to start deleting old events from App Engine, because I have to pay Google for the datastore entities stored.

At first I was planning on doing some clever incremental thing to get all the stats on App Engine, but it's a lot simpler to have everything locally and rerun any analytics quickly when I make mistakes. It only takes a few minutes to load a week's worth of data and go through it, so it doesn't make sense to spend days trying to be really clever about doing it on App Engine.

My favorite thing to extract is a kind of life story of users. When a user joins, what do they do on average in the first 24 hours? How about the next day? This is a pretty different point of view from Google Analytics, because it is from the user's point of view rather than a history of the whole app.
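A minimal sketch of that "life story" calculation, assuming the exported events are already loaded as (user_id, event_name, timestamp) tuples and that joining shows up as an event named "join". The tuple layout, the "join" name, and the function name are all assumptions for illustration, not what the app actually stores.

```python
from collections import Counter
from datetime import timedelta


def first_day_averages(events, window=timedelta(hours=24)):
    """Average number of each event per user within `window` of joining.

    `events` is an iterable of (user_id, event_name, timestamp) tuples.
    A user's join time is their earliest "join" event (an assumption).
    """
    events = list(events)  # iterated twice below

    # Pass 1: find when each user joined.
    joined = {}
    for user_id, name, ts in events:
        if name == "join" and (user_id not in joined or ts < joined[user_id]):
            joined[user_id] = ts

    # Pass 2: count events that fall inside each user's own window.
    totals = Counter()
    for user_id, name, ts in events:
        start = joined.get(user_id)
        if start is None or name == "join":
            continue
        if start <= ts < start + window:
            totals[name] += 1

    if not joined:
        return {}
    return {name: count / len(joined) for name, count in totals.items()}
```

Shifting the window (for example, hours 24 to 48) would answer the "how about the next day?" question in the same way.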

So does this point of view give you some new insight? At least in our case, yes. For example, we are now testing which of two profile box designs is better on our MySpace app. A simple stat alone would show that the new design has a better clickthrough rate. BUT, when I look at the whole story, it seems that users are more likely to remove the app when they have the new design, so the end result may be worse.
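Roughly, the comparison looks at two rates per design instead of one. Here is a small sketch, assuming one record per user with a variant label and two flags; the field names "variant", "clicked" and "removed" are hypothetical.

```python
from collections import defaultdict


def variant_summary(users):
    """Clickthrough and removal rates per design variant.

    `users` is an iterable of dicts with keys "variant", "clicked" and
    "removed" (hypothetical field names, one dict per user).
    """
    counts = defaultdict(lambda: {"users": 0, "clicked": 0, "removed": 0})
    for u in users:
        c = counts[u["variant"]]
        c["users"] += 1
        c["clicked"] += bool(u["clicked"])
        c["removed"] += bool(u["removed"])

    return {
        variant: {
            "clickthrough_rate": c["clicked"] / c["users"],
            "removal_rate": c["removed"] / c["users"],
        }
        for variant, c in counts.items()
    }
```

A variant that wins on `clickthrough_rate` but loses on `removal_rate` is exactly the case where the single stat is misleading.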

Additionally I use Google Analytics event tracking, but I end up looking more at the life story than at those events, because it is difficult to pull out how many events happen per visit, or how many events happen per visit in the first 24 hours after a user joined, etc.


Would love more detail about how you do this... how exactly do you "record everything to GAE data store"? Did you create a GAE REST app? Why GAE? What do you record? Can you post an example of a few events that you record?


The app server is already in App Engine, so it was easy to just add an Event model and create an Event entity every time the user does anything that I might want to track. This is what it looks like to browse the events in App Engine admin panel: http://i.imgur.com/yJTeZ.png

The model is just 10 lines of code, and storing things there is even easier. For events that happen on the client side I have to do a silent Ajax request to the server to let it know what happened, similar to what Google Analytics does, but nearly everything that happens in the app is already known to the server.
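To give an idea of the shape, here is a plain-Python stand-in rather than the actual App Engine model. The field names and the `EventStore` class are assumptions for illustration; on App Engine the class would be a datastore model and `record` would save an entity instead of appending to a list.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Event:
    """One tracked user action (field names are illustrative assumptions)."""
    user_id: str
    name: str  # e.g. "steal"
    properties: dict = field(default_factory=dict)
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class EventStore:
    """In-memory stand-in for the datastore; a real app would persist events."""

    def __init__(self):
        self._events = []

    def record(self, user_id, name, **properties):
        """Create and keep one event; called wherever the app handles an action."""
        event = Event(user_id=user_id, name=name, properties=properties)
        self._events.append(event)
        return event

    def for_user(self, user_id):
        """All events recorded for one user, in insertion order."""
        return [e for e in self._events if e.user_id == user_id]
```

Usage is one line at each point of interest, for example `store.record(user_id, "steal", points=5)`.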


I'm using jQuery and Google Analytics to track this. Tie the interface actions to different analytics tags and use Google to store and display the activity.

http://www.thewhyandthehow.com/tracking-events-with-google-a...


Agreed. Basically, even when there are no page loads, GA allows you to register events, actions, etc. to be tracked. You can send dynamic names to Google Analytics using JavaScript, and everything will be visible when you check your stats, in addition to the normal metrics like page views.


http://mixpanel.com (YC '09)

Basically, when something interesting happens, you make an AJAX call to a logging service. Mixpanel then graphs events and conversions over time and provides user profiling similar to Google Analytics.

You can also create virtual page views in Google Analytics by calling pageTracker._trackPageview.


The problem with this is, what if you don't know what your funnels are? What if you don't know what your key events are?

Recording everything into Mixpanel would get very expensive very quickly.


Hmm, what is the order of magnitude of your action volume? Worst case, you might write them sequentially to log files via a trivial web service and deal with them later, a la "tin" [ https://nosqleast.com/2009/#speaker/anglade ]
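A rough sketch of the "write sequentially, deal with them later" part, assuming one JSON object per line. The HTTP endpoint is left out, and the function names and record keys are made up for illustration; this is not a description of how "tin" works.

```python
import json
import time


def append_event(path, user_id, name, **properties):
    """Append one event as a JSON line; the file is only ever appended to."""
    record = {"ts": time.time(), "user": user_id, "event": name, "props": properties}
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, separators=(",", ":")) + "\n")


def read_events(path):
    """Yield events back in the order they were written, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)
```

With several server processes writing at once, some coordination (one log file per process, say) would be needed; that is outside this sketch.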


Why not try Google Analytics for Flash's event tracking? I've used that in the past and it's been interesting


It seems great in theory, but I've had difficulties getting out the info I need. I'd like to know how many of a certain event users trigger per visit. For example, I have a feature in my app that lets users steal points from other users. I can see the aggregate stats for a certain time period just fine for the whole event category: http://i.imgur.com/22kxL.png

But if I click on a certain event, "steal" in this case, I get this: http://i.imgur.com/mXUtQ.png It says total events: zero, even though the graph shows something very nonzero. And there doesn't seem to be a way to see how many times users used the "steal" function per visit. Or what percentage of users use the "steal" function during their first month of using the app? Maybe it's possible, but I haven't quite figured out how to get these answers from GA.
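For comparison, with raw events on hand the first-month question is only a few lines. This is a sketch under assumptions: join times are available as a user_id to datetime mapping, events are (user_id, event_name, timestamp) tuples, and "first month" means the first 30 days. None of this is a GA feature.

```python
from datetime import timedelta


def share_using_feature(join_times, events, feature, window=timedelta(days=30)):
    """Percentage of users who triggered `feature` within `window` of joining.

    `join_times` maps user_id -> join datetime; `events` is an iterable of
    (user_id, event_name, timestamp) tuples. Both layouts are assumptions.
    """
    used = set()
    for user_id, name, ts in events:
        start = join_times.get(user_id)
        if name == feature and start is not None and start <= ts < start + window:
            used.add(user_id)  # count each user once, however often they used it
    if not join_times:
        return 0.0
    return 100.0 * len(used) / len(join_times)
```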


My concern is the same: there are a lot of questions I can ask of my current, relatively rudimentary system that I couldn't ask of GA.



