NYC taxi visualization (nyctaxi.herokuapp.com)
259 points by lalwanivikas on July 14, 2014 | 77 comments



The taxi data just has startpoint and endpoint correct?

So the route in between each trip is just a guess?

Edit: Whoops, from the about page:

> The raw data include only start and end locations for each trip. These points were run through Google's Directions API to create the routes shown in this visualization. Of course, these are Google's best choice, not necessarily the one the taxi took.


This explains some very slow drifts in the middle


This doesn't seem to work on OS X Mavericks, Chrome 36.0.1985.103 beta. When I click on "Begin", nothing happens (and there are no JavaScript errors either).


I'm getting a 503 after it finishes loading the initial assets.


Doesn't work on Chrome for Android (Nexus 7) either (latest version, whatever that is).


I was surprised at how inefficient some of the drivers' behavior was. I thought after dropping off a fare they would all immediately head to an area they knew of that might have a high likelihood of generating fares. Instead many of them seem to just wander around aimlessly looking for the next fare.

edit: spelling / grammar


From the page: Empty Taxis also follow the "best route" between a dropoff and the next pickup. Just as with the trips, this is just an effective way to move the marker around, but doesn't reflect the reality of where the taxi traveled.


It's also the case that taxi drivers need to eat, use the toilet, do their shopping, see the sights, etc. They aren't robots, and they don't necessarily have to dash to the next customer in a mad scramble for survival.

Imagine if somebody made a visualization of your workday and put it out in public. Would your manager say "he seems to just wander around aimlessly instead of efficiently moving from one piece of code to the next. I thought after compiling one project, they would all immediately head to the next one."


The analogy is OK, but it mostly fails because developers are not paid commission for each line of code written, whereas taxi drivers have a very strong profit motive to act efficiently and quickly. I think your examples of why they would deviate from this expected behavior are pretty solid on their own.


Seems like this data could be used to build an app that suggests where a taxi driver should go at any given time to maximize their chances of getting a fare. Uber & Lyft already do this for their drivers, but I'm not aware of any app that does this for NYC yellow cabs.
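A minimal sketch of how such a suggestion could be derived from pickup records: bucket pickups into a coarse grid per hour of day and rank the cells. The sample records, the 0.005-degree cell size, and the function names are all assumptions for illustration, not anything taken from the visualization or from Uber/Lyft.

```javascript
// Hypothetical pickup records: hour of day plus coordinates (made-up sample values).
const pickups = [
  { hour: 8, lat: 40.7580, lon: -73.9855 },
  { hour: 8, lat: 40.7583, lon: -73.9851 },
  { hour: 8, lat: 40.7128, lon: -74.0060 },
  { hour: 23, lat: 40.7306, lon: -73.9866 },
];

// Assumed cell size in degrees; a real app would tune this.
const CELL_DEG = 0.005;

// Map a coordinate to a grid-cell key of the form "latIndex:lonIndex".
function cellKey(lat, lon) {
  return `${Math.floor(lat / CELL_DEG)}:${Math.floor(lon / CELL_DEG)}`;
}

// Count pickups per (hour, cell) pair.
function buildCounts(records) {
  const counts = new Map();
  for (const { hour, lat, lon } of records) {
    const key = `${hour}|${cellKey(lat, lon)}`;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

// Return the busiest cells for a given hour, most pickups first.
function busiestCells(counts, hour, topN = 3) {
  const prefix = `${hour}|`;
  return [...counts.entries()]
    .filter(([key]) => key.startsWith(prefix))
    .map(([key, n]) => ({ cell: key.slice(prefix.length), pickups: n }))
    .sort((a, b) => b.pickups - a.pickups)
    .slice(0, topN);
}
```

Calling `busiestCells(buildCounts(pickups), 8)` ranks the cells seen at 8 AM. Historical counts only say where fares used to be, so this is a starting point rather than a prediction.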


Most taxi drivers already know the right times/areas to go to get hailed, and to get the kind of hails they want (i.e., not to JFK or LGA, the Bronx, most of Queens/BK). Even though it is just tribal knowledge, I'd be really surprised if it wasn't largely accurate.


Why don't they want to get hailed to JFK?


Long drive, capped rates as already mentioned. Also once there you cannot immediately pick up another passenger - you have to go to the central "taxi pool" where you get to line up with dozens/hundreds of other cabs for the privilege of picking up a fare.

So as a driver your choices are to wait forever in a line to pick up a fare from the airport, or beeline it back to Manhattan as quickly as you can with an empty car. Neither are great choices.


Although in the wee hours (between 1 and 6AM), getting a hail to JFK is not bad, because it only takes ~20 minutes to get there, and yields a $60 fare. But during rush hour, getting stuck going to JFK is the worst thing possible, as you lose ~2-3 hours of the best fare times.

Which is why you should use the AE/Penn Station train during rush hour (it'll be faster than a cab), or tip your cabbie very well (30%+), though it still ends up being sucky for them and stressful for you.


Certain destinations (outer-boroughs) will earn you a "shorty" that lets cabbies skip the airport taxi pool lines. Most cabbies prefer this scenario.


It's a long drive (from Manhattan) and the airport rates are capped at around $40.


Damn, I wish we had that in the Bay Area. I usually take Uber or another black car service from SFO because to Oakland in a taxi it's $90+ on the meter and lots of extra fees.


Do they really? I just asked my Uber driver yesterday and he said that Uber does not provide him anything like that. I was quite surprised.


An UberX driver said the same to me and I was surprised as well, but after thinking about it, it started to make more sense. What could Uber really tell them beforehand that they wouldn't already know, or learn very quickly from driving around?

The driver thought Uber didn't provide the information because they want to be able to charge surge pricing, but that cannot be right. Uber doesn't want to charge surge prices unless they absolutely have to because they reduce the number of rides people take.

Considering how much money Uber invests in giving away free rides and temporarily reducing prices to increase ridership, they clearly see a lot of value per ride, especially for travelers who are more likely to be at an event, try Uber for the first time, and then bring that demand home to their own cities.

Surge pricing only makes sense in situations where drivers need additional incentive to go out in the first place.


I thought they built it but scrapped it after discovering the predictions weren't very good.


They use the data for pricing I believe.


:-) I thought there was no market for it. I wrote the heat map for specifically that purpose:

Where are people in NYC? Look at the places where taxis pick up their passengers.

A guide for Uber drivers: where to get passengers at a given time of day.

https://github.com/akuchlous/NYC_CAB_ANALYTICS


From what I've heard, Lyft is just testing this sort of heatmap with Android drivers, and that it isn't yet available for iOS drivers.


That's super interesting.

I wonder why my guy wandered really slowly up Riverside Ave for like 30 minutes instead of going inland to find a fare. Looks like it's really slow between 11am and 3pm, and constant driving otherwise.


The route information between trips is just a guess. So he could have really just stopped for dinner or something.


I was wondering the same thing, but as they just average the location from the last dropoff to the next pickup, it's almost definite that the driver "went to lunch" in the interim. I saw this trend in the two taxis I looked at.


Taxi drivers don't usually stop for dinner when on shift. The closest they will do is run into a bodega, or stop on the corner and have a streetfood vendor hand them something into the cab.


I've seen halal carts where the taxis line up down the block and a guy takes their order on the south end of the block and they pick it up at the north end, just like a drive-thru burger joint.


Only has data for begin/end points of the trips. They're just using google directions to move between the sampled data points - they don't have minute by minute gps location of the cab.
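A rough sketch of what moving a marker between two sampled points can look like: given a route polyline and the fraction of elapsed time between the two timestamps, interpolate a position at constant speed. The route coordinates and function names here are hypothetical, and the planar distance is a simplification; this is not the site's actual code.

```javascript
// Hypothetical route as [lat, lon] points, standing in for a decoded directions polyline.
const route = [[40.0, -74.0], [40.0, -73.0], [41.0, -73.0]];

// Planar distance between two points; fine for a sketch, not geodesically exact.
function segLength([lat1, lon1], [lat2, lon2]) {
  return Math.hypot(lat2 - lat1, lon2 - lon1);
}

// Position at fraction t (0..1) of the way along the route, assuming constant speed.
function positionAt(points, t) {
  const clamped = Math.min(1, Math.max(0, t));
  // lengths[i] is the length of the segment from points[i] to points[i + 1].
  const lengths = points.slice(1).map((p, i) => segLength(points[i], p));
  const total = lengths.reduce((a, b) => a + b, 0);
  let remaining = clamped * total;
  for (let i = 0; i < lengths.length; i++) {
    if (remaining <= lengths[i] && lengths[i] > 0) {
      const f = remaining / lengths[i];
      const [aLat, aLon] = points[i];
      const [bLat, bLon] = points[i + 1];
      return [aLat + f * (bLat - aLat), aLon + f * (bLon - aLon)];
    }
    remaining -= lengths[i];
  }
  // Degenerate route (or rounding at the very end): fall back to the last point.
  return points[points.length - 1];
}
```

With `t = (now - dropoffTime) / (nextPickupTime - dropoffTime)`, a long idle gap over a short route makes the marker crawl, which is one way an empty cab can appear to drift slowly even if the real car was parked.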


I was surprised by the number of fares that just went a few blocks. Looks like it should have been pretty walkable in a lot of those cases.


Just a few observations over time for quick taxi rides aside from weather and transporting a heavy item.

- Often rides are paid for by a company, so people will just expense them.

- If someone, say a realtor, has a client, they might opt for a taxi.

- Models will take a taxi even if their destination is only a few blocks away due to their high heels.


People will often use cabs to go somewhere when they have something to carry, even if it's a short walk. Also, if the weather is bad.


Looking at the heat maps for July 2, 2013:

People are in a hurry at around 8:30 AM: http://akuchlous.github.io/NYC_CAB_ANALYTICS/July/1/51.html

And at 10:40 AM: http://akuchlous.github.io/NYC_CAB_ANALYTICS/July/1/64.html


Could've been raining that day. Or it may be US/NY specific... I have only been to the west coast, but coming from Europe I was really surprised what's considered a "walking distance" in the city centre. 2 blocks away and people were ready to get a cab.

Then again, the cabs were probably 5+ times cheaper than what I was used to in the UK. Maybe I'd use them more if it cost me $5, not £10 to go around 10 blocks away. One is close to spare change, the other I'd have to think about...


10 blocks is very much considered walking distance in NYC. People walk here all the time - it's the default mode of transport to nearby places. Cabs are usually used for specific reasons: transporting something heavy, bad weather, rider has poor mobility (injured leg, etc.), had a lot to drink, entertaining/working with a client, etc.

This isn't true in the rest of the US, however. The rest of the country is less walkable and the infrastructure is designed for cars first and foremost, so people learn to default to cars even for nearby trips. When I lived in California it was hard to get people to walk anywhere, even if it was only 5-10 minutes away.


https://github.com/akuchlous/NYC_CAB_ANALYTICS/blob/gh-pages...

Data from chriswong: 22% of rides are less than 1 mile long.
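For anyone wanting to reproduce a figure like this, the calculation is just a filtered count over the trip distances. A small sketch, using made-up distances rather than the real dataset (the array and function name are hypothetical):

```javascript
// Hypothetical odometer distances in miles (made-up values, not the real dataset).
const tripMiles = [0.4, 0.9, 1.2, 2.5, 3.1, 0.7, 5.0, 1.8, 0.95, 2.2];

// Fraction of trips strictly shorter than the threshold.
function shareUnder(distances, thresholdMiles) {
  if (distances.length === 0) return 0;
  const shortTrips = distances.filter((d) => d < thresholdMiles).length;
  return shortTrips / distances.length;
}

// 4 of the 10 sample trips are under a mile.
const shareUnderOneMile = shareUnder(tripMiles, 1);
```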


It'd be interesting if comparable data could be gotten for different cities. Wonder what the curves would look like for e.g., Boston, SF, London, Athens, Frankfurt.


Small nit, but guessing this is crow's flight distance which doesn't equal path distance. Regardless, this still isn't surprising. Short rides happen all the time.
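To make the distinction concrete, here is a sketch comparing great-circle ("crow's flight") distance with a path that follows two sides of a street grid. The haversine formula is standard; the coordinates and the Earth radius constant of 3958.8 miles are assumptions chosen for illustration.

```javascript
const EARTH_RADIUS_MILES = 3958.8; // mean Earth radius, an approximation

function toRadians(deg) {
  return (deg * Math.PI) / 180;
}

// Great-circle distance between two lat/lon points, in miles (haversine formula).
function haversineMiles(lat1, lon1, lat2, lon2) {
  const dLat = toRadians(lat2 - lat1);
  const dLon = toRadians(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_MILES * Math.asin(Math.sqrt(a));
}

// Hypothetical trip: the cab drives east to a corner, then north to the dropoff.
const pickup = { lat: 40.75, lon: -73.99 };
const corner = { lat: 40.75, lon: -73.97 };
const dropoff = { lat: 40.77, lon: -73.97 };

// Straight-line distance between the endpoints.
const straight = haversineMiles(pickup.lat, pickup.lon, dropoff.lat, dropoff.lon);

// Distance along the two legs actually driven; never shorter than the straight line.
const driven =
  haversineMiles(pickup.lat, pickup.lon, corner.lat, corner.lon) +
  haversineMiles(corner.lat, corner.lon, dropoff.lat, dropoff.lon);
```

A share of "rides under 1 mile" computed from endpoint coordinates would therefore count more rides than the same share computed from odometer miles.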


:-) This data is path distance as measured by the odometer. I was lazy and did not use longitude/latitude for distance; the data itself has miles travelled.

A bit surprising, though: 45% of rides are single-person rides. But then I wonder what's surprising about it: is that low or is it high?


How is the data input on # of passengers? Could it be that it's just easier to hit "1 person" all the time regardless? Are there cabbies that always have the same #?


Loads of tourists are terrified of walking anywhere in NYC both due to fear of crime and fear of getting lost, and thus take cabs for distances that probably don't warrant them.


Perhaps they're disabled.


Or have small children.


You shouldn't serve your assets (/js, /css) off of heroku. They aren't loading because of the request volume.


I think Chris was exceeding his Mapbox limits, and that was causing problems.


One interesting observation is that the mean fare is ~ $10. That's the same as 4 subway rides (each one with unlimited transfers and can take you from one end of the city to the other). You need to be pretty rich to be riding a taxi regularly nowadays. Taxi fares have gone up at a far greater rate than subway fares.


Of course, you can also share a taxi... it may be an indulgence to hop on a taxi alone as part of your daily commute, but it becomes much more convenient, private, and cost-effective when in a pair or small group.

These taxi data do not seem to capture taxi capacity factor / number of seats filled.


I was making a heatmap from the taxi data, published for the first 5 days of July:

http://akuchlous.github.io/NYC_CAB_ANALYTICS/

Maybe it can help Uber/taxi drivers figure out where the most pickups are!

Hosting on github.io.

Pushing to github.io is awesome: make maps, save, and push!


[deleted]


Perhaps the link was edited since your comment, but the links "July 1" through "July 6" return taxi heatmaps, as promised.


First of all... this is really great. Really good job.

On the ipad a few things are wonky with the layout, you may want to test on there and fix the few UI issues.

It would be cool to see $/hr precalculated as well. It may be better to make the right side a table with each row being a ride and then totals at the bottom too.


I just followed a taxi that had a 12% tip rate. I thought with the new automated payment systems [1] they were supposedly getting 25% now...

edit: I should have read about the * before I posted, but it's interesting that the discrepancy then shows roughly how many transactions are cash versus charge.

[1] http://thenextweb.com/shareables/2012/05/14/how-3-simple-but...


From a brief look at the data it looks like cash tips might be counted as 0%. So that 12% average might be originally closer to 25%, but brought down by the nulled out cash tips.
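The dilution is easy to work through. A sketch, assuming every fare is the same size, card tips average 25%, and cash tips are recorded as exactly 0 (all three are simplifying assumptions, and the function names are made up):

```javascript
// Average recorded tip rate when cash tips are logged as 0.
// cardShare: fraction of fares paid by card; cardTipRate: average tip on card fares.
function recordedTipRate(cardShare, cardTipRate) {
  const cashShare = 1 - cardShare;
  return cardShare * cardTipRate + cashShare * 0;
}

// Inverse: the card share that would explain an observed average tip rate.
function impliedCardShare(observedRate, cardTipRate) {
  return observedRate / cardTipRate;
}

// A 12% recorded average with 25% card tips implies roughly 48% of fares were on card.
const cardShareEstimate = impliedCardShare(0.12, 0.25);
```

Under those assumptions, a cab whose fares were about half cash would show a tip rate of about half the real card rate.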


What's happening when I see a green blob which then turns red a few seconds later? The cab hasn't moved, but the passenger count and fare tally have both gone up. The cab then moves at a normal pace from that location to the next fare, so I don't think it's missing GPS data.

Edit: just saw another comment saying the points were calculated through a Google API, so maybe it's just gremlins in that.


I see a taxi route on: Sunday, June 29th 1913...


Me too, both of the taxis I watched showed 1913. This is some weird Y2K stuff.


Today I learned that Heroku is awful at keeping up with Hacker News + Reddit traffic.


Well, it doesn't autoscale, if that's what you mean.



Wow, I totally understand Uber's valuation now. $D

But really, UberX and Lyft driving makes more sense viscerally. Taxis earn decent money but have lots of expenses. I see potential indeed. Ha


I watched a few different cabs and they all seemed to generate around $600 in a 24 hour period. Am I the only person who thinks this is astonishingly low?


They lose a lot to fees, car stuff, gas, variable customers, the routes people want them to take (and the time to find a new fare), etc. They probably make around $25 an hour. I'm not sure what that looks like in New York but it sounds alright.


astonishingly low? I thought that was pretty high


It's decent change, but...

- Most people do a 12-hour shift, not 24 hours.

- Most drivers lease their cabs at $100/day

- Fuel bill adds up to $15-$35 depending on car.

- There are other fees.

I think you're doing well if you net $200 in a shift on average. I think it's an optimistic figure.
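Putting the numbers from this thread together as a back-of-the-envelope sketch: it assumes the roughly $600 per 24 hours mentioned earlier splits evenly across two 12-hour shifts, and it leaves "other fees" at zero because no figure was given. The function name is hypothetical.

```javascript
// Net per shift = gross fares - lease - fuel - other fees.
function netPerShift({ gross, lease, fuel, otherFees = 0 }) {
  return gross - lease - fuel - otherFees;
}

// Assumption: ~$600 per 24 hours, split evenly across two 12-hour shifts.
const grossPerShift = 600 / 2;

// Fuel at the low and high ends of the $15-$35 range quoted above.
const bestCase = netPerShift({ gross: grossPerShift, lease: 100, fuel: 15 }); // 185
const worstCase = netPerShift({ gross: grossPerShift, lease: 100, fuel: 35 }); // 165
```

On these assumptions the net lands below $200 before any other fees, which is consistent with calling $200 an optimistic figure.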


I was chatting with an UberX driver yesterday in NYC and apparently his insurance is $7K a year (it's a Toyota Camry too, nothing super fancy).

That works out to ~$27 a day if you drive the car 5 days a week, every week of the year. Not a massive amount, but significant in the numbers we're talking about.
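The per-day figure is just the annual cost spread over the days driven. A tiny sketch, with a hypothetical function name, for anyone who wants to try other schedules:

```javascript
// Annual cost spread over the days actually driven.
function costPerDrivingDay(annualCost, daysPerWeek, weeksPerYear = 52) {
  return annualCost / (daysPerWeek * weeksPerYear);
}

// $7,000 a year over 5 days a week, 52 weeks: about $26.92 per driving day.
const insurancePerDay = costPerDrivingDay(7000, 5);
```

Driving fewer weeks of the year raises the per-day cost, since the premium is fixed.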


yes, and then there is this. http://query.nytimes.com/gst/fullpage.html?res=9900E0D6153BF...

The demand for taxis is so high that a driver bluntly told me to give him my cell phone and set up the GPS route, otherwise he's not going. And that was a good 2-borough trip for $40. Bottom line: they don't need you -- you need them.


That's about $50/hr (before medallion or car costs). I wouldn't call it astonishingly low.


It's actually half of that, $25/hr.


They're not working 24hr straight. They can only work 12 hours.


To see some quick action: `for (var i = 0; i < 50; i++) $(".faster").trigger("click");` It gets interesting to watch at that speed.


Anyone know if there is similar data available (for any city) that actually includes real-world routes, rather than just endpoints?


Hitting "next" on Chrome 35 (Win 8) doesn't do anything for me. Any ideas?


I was seeing the same a few minutes ago (also in Chrome) but it's working for me now so it's worth giving it another try.


It's interesting to see how busy a taxi is in New York even at 3:30 in the morning.


They certainly drive a lot faster when they have a fare.


OpenStreetMap attribution...


Removed at this commit.

https://github.com/chriswhong/taxitracker/commit/8565b71ae74...

I'm sure the developer would appreciate an issue or a pull request.


Does not work in Safari. Come on people, if you're using some non-standard or cutting-edge feature, at least use Modernizr to detect it and tell users instead of just having a completely non-functional "begin" button.



