
I'm not sure what 'a lot faster' really means in this context.

Honestly, I've found Spatialite queries to be orders of magnitude faster for analysis than shapely or geopandas. For starters, the latter typically imply row-by-row selection and manipulation.

If you can wrangle the data into a GeoPackage first, it's super easy to run queries over it and extract what you need.
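
To give a rough idea of what that can look like: a GeoPackage is an SQLite file, so even Python's built-in sqlite3 module can run a bounding-box query against it. This is only a sketch, and it assumes things the comment doesn't say: that the layer has the standard GeoPackage R-tree index (a table named rtree_<table>_<geometry column>), and that the primary key column is called "fid", which is a common default but not guaranteed. The function name and the file name in the usage comment are made up for illustration.

```python
import sqlite3


def features_in_bbox(conn, table, geom_col, bbox, pk="fid"):
    """Return rows of `table` whose bounding box overlaps `bbox`.

    Assumes the layer has a GeoPackage R-tree index named
    rtree_<table>_<geom_col> with columns id, minx, maxx, miny, maxy,
    and that `pk` (assumed "fid" here) is the layer's primary key.
    """
    minx, miny, maxx, maxy = bbox
    rtree = f"rtree_{table}_{geom_col}"
    # Two boxes overlap when each one starts before the other ends,
    # on both axes.
    sql = (
        f'SELECT t.* FROM "{table}" AS t '
        f'JOIN "{rtree}" AS r ON t."{pk}" = r.id '
        "WHERE r.minx <= ? AND r.maxx >= ? "
        "AND r.miny <= ? AND r.maxy >= ?"
    )
    return conn.execute(sql, (maxx, minx, maxy, miny)).fetchall()


# Hypothetical usage (file and layer names are placeholders):
#   conn = sqlite3.connect("data.gpkg")
#   rows = features_in_bbox(conn, "places", "geom", (0, 0, 10, 10))
```

Note that this only compares bounding boxes. Exact geometry tests such as ST_Intersects need the Spatialite extension loaded into the connection.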




Faster both in terms of querying and in terms of doing the kind of analysis you want and getting the answers you need.

Of course, I'm happy to acknowledge that different tools might work better in different scenarios. For example, I suspect that the querying speed is really just due to the data being in memory, so if you can configure Spatialite or PostGIS to do the same, I certainly wouldn't be surprised if you could do even better.

But for one-off analyses, it's common to spend a lot of time just getting your data into the right shape, doing various manipulations, perhaps even wrangling the geometries. For that, working entirely within SQL is frustrating as heck. For example, I did an analysis on flight paths over heavily populated areas once, which involved turning infrequent point locations with gaps in the data into a smooth interpolated flight path. That's easy if you have numpy and scipy at your disposal, otherwise it's not. Another analysis involved estimating housing prices in neighborhoods without any recent sales, from prices in adjacent neighborhoods with sales, and again it's easy to code up an algorithm to fill the gaps or to run a geostatistical analysis that can impute the missing values, but not if all you have is SQL, or if you have to constantly do roundtrips between database and code.
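
As a minimal sketch of the flight path case (not the actual code from that analysis): given sparse, timestamped positions that are already in projected coordinates, numpy can resample them onto a regular time grid in a few lines. The function name, the argument names, and the choice of linear interpolation are all assumptions made here for illustration. For a genuinely smooth curve you could swap np.interp for something like scipy.interpolate.CubicSpline.

```python
import numpy as np


def interpolate_track(t, x, y, step):
    """Resample a sparse track onto a regular time grid.

    t    : timestamps in seconds (any order, no duplicates)
    x, y : projected coordinates (e.g. meters) at those timestamps
    step : spacing of the output grid in seconds

    Uses linear interpolation, which fills gaps but does not smooth.
    """
    t = np.asarray(t, dtype=float)
    order = np.argsort(t)  # np.interp needs increasing sample points
    t = t[order]
    x = np.asarray(x, dtype=float)[order]
    y = np.asarray(y, dtype=float)[order]

    # Regular grid from first to last fix; the half-step pad keeps the
    # final timestamp when the span is a multiple of `step`.
    t_new = np.arange(t[0], t[-1] + step / 2, step)
    return t_new, np.interp(t_new, t, x), np.interp(t_new, t, y)


# Three fixes 30 seconds apart, resampled every 10 seconds.
t_new, xs, ys = interpolate_track(
    [0, 30, 60], [0, 300, 900], [0, 0, 600], step=10
)
```

Doing the same thing in pure SQL would mean generating the time grid and the per-segment weights by hand, which is the kind of friction described above.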

I mention all this not to start an argument, but simply because when I first started doing GIS work, I was very confused about what the right tools and workflow were, and once I embraced projections (vs. working directly with spheroids) and in-memory analysis in Python, my productivity went way up. If other people find themselves in the same scenario, they owe it to themselves to try out both approaches to see what works best for them.



