I've been experimenting with a lot of geo-based projects over the past couple of years, and one thing has been bugging me quite a bit lately:
Why is the projection ID (SRID/EPSG) situation so horrendously bad?
As far as I can tell, the only way to take a Shapefile and figure out the proper SRID number is to take the .prj file and run it through an API like http://www.prj2epsg.org/search (which was down for several weeks earlier this year, and I haven't tried again to see if the API is back up).
It's funny, because you can't actually take map data into PostGIS without the SRID properly set, and yet there's no reliable way to get the number, so working with datasets in multiple projections is just... not possible unless you do the conversions manually?
Honestly, if I could get the data that the API above has, I'd run the service myself to at least make it reliable, even though it requires a lookup. But there should be a better system, somehow.
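For what it's worth, this lookup can now be done locally without any web service. Here's a minimal sketch with pyproj (the .prj path is a made-up example; to_epsg() matches the WKT against pyproj's bundled EPSG database):

    # Sketch: identify the EPSG code for a shapefile's .prj locally with pyproj.
    # "us_counties.prj" is a hypothetical path; substitute your own file.
    from pyproj import CRS

    with open("us_counties.prj") as f:
        wkt = f.read()

    crs = CRS.from_wkt(wkt)

    # to_epsg() returns None if no match clears the confidence threshold (0-100).
    epsg = crs.to_epsg(min_confidence=70)
    print(epsg or "no confident EPSG match", crs.name)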
The shapefile format is from a pre-internet era, so it has a .prj file that describes the projection in enough detail that any program can just look at the file and figure out how to handle the data. (That said, it's a shit format: it's vague enough that there are 2 separate, very different implementations of it, and it defines its own serialization method. Don't do that.)
The European Petroleum Survey Group maintains its own database of these projections (it's literally an sqlite database, filled with XML) for quick reference, but it's not a small database, so not every place uses it. So you've now got 2 separate ways to talk about projections: .prj files (aka WKT) and plain EPSG numbers.
Geo is becoming a 'thing' now, so you've also got outsiders coming into the geospatial area without all the legacy baggage, aka Google. They don't care about projections and local coordinate reference systems because they're working at a global scale. Hence Google Maps tends to do everything in lat/lon, and when it does project things it uses Web Mercator (EPSG:3857), which makes every cartographer and traditional GIS person die a little inside. On the other hand, given what Google Maps was designed for, it's a pretty rock-solid choice for doing local directions all over the parts of the world where people live.
And that means that a lot of the innovative stuff in the geo space is either not dealing with projections at all (like BigQuery GIS [1]), just doing all the calculations on a sphere, or is doing bespoke projections based on the data (like D3), not bothering with the standardized state plane and national projection schemes. (Those schemes are what GIS people usually mean when they argue about projections, even though at the end of the day it all comes down to matching your projection to the house projection used by whatever agency produced the data.)
I don't know exactly where I was going with this, but https://epsg.io is IMHO the best place for finding info on projections if your .prj file doesn't just have a name or a number in it that you can use directly.
Off topic: do you have any good general learning resources for someone who is relatively new but getting increasingly involved in GIS? I'm a web dev, but a lot of the applications I work on involve maps (mostly using Esri stuff).
GeoTools (a Java GIS library) ships several versions of the EPSG database as JARs, and you can use its CRS class to do lookups in both directions.
The situation is bad for Shapefiles because people forget to export the .prj file, and it can get lost along the way when data is transformed. I feel your frustration - it's just a very old format that spreads its data across 4 different files.
This, however, has nothing to do with PostGIS. PostGIS will ingest any geo data you give it, and if you don't know the SRID you can use -1 (or 0, which newer PostGIS versions treat as 'unknown'). As for the problems with Shapefiles, there are better alternative formats now, like GeoJSON and SpatiaLite-based geodatabases, that encode the SRID in a single file.
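To make that concrete, here's a sketch with psycopg2 (the connection string, table name, and SRID 26915 are made up for illustration, and the column is assumed to be an untyped geometry column):

    # Sketch: ingest geometry into PostGIS without knowing the SRID yet.
    import psycopg2

    conn = psycopg2.connect("dbname=gis")  # hypothetical database
    with conn, conn.cursor() as cur:
        # SRID 0 means "unknown" in PostGIS 2.x+ (older versions used -1).
        cur.execute(
            "INSERT INTO parcels (geom) VALUES (ST_GeomFromText(%s, 0))",
            ("POLYGON((0 0, 0 1, 1 1, 1 0, 0 0))",),
        )
        # Once you've identified the projection, stamp it in place:
        cur.execute("UPDATE parcels SET geom = ST_SetSRID(geom, 26915)")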
I sort of hope GeoJSON takes off a bit more; it just seems easier to work with, at the very minimum with things like Leaflet. Of course you have Bing, which uses its own objects for the latest rendition of the Bing Maps v8 JS API. Thankfully, using Leaflet we were able to use their imagery instead. Having dealt with Shapefiles, I can say it definitely felt like dealing with something quite dated, especially compared to how much simpler JSON can be.
Everybody is hoping something other than shapefiles eventually takes off, as it's universally agreed to be a pretty terrible format. The problem is that the industry hasn't unified around a single successor, with GeoJSON, GeoPackage, and to a lesser extent KML all trying to become the next shapefile.
As it stands, however, everybody and everything can read shapefiles, and that cannot be said of any other format.
The really annoying thing is that one of the reasons shapefiles are so bad, and so hard to replace, is that they fill a bunch of niches in a half-assed way, and no sane replacement format is going to displace the shapefile in all of those situations at once.
Need a standardized way to disseminate data that's super easy to ingest? GeoJSON and KML work great there, while GeoPackage is not a great fit for things like APIs, plus you can't do a streaming read of it.
Need a way to edit and work with data locally and maybe send stuff around your office? GeoPackage is going to be a much better fit than GeoJSON or KML.
Another 'hidden' feature of shapefiles that I've seen regularly abused (and have abused myself more than once) is that they use the standard dBASE format to store their attribute data. That means all kinds of apps that know nothing about shapefiles (like Excel or Access) can be used to read and analyse the data stored within them.
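For instance, here's a sketch using the dbfread package (the filename is made up) that reads a shapefile's attribute table without touching the geometry at all:

    # Sketch: read a shapefile's attribute table directly via its .dbf component.
    from dbfread import DBF

    for record in DBF("roads.dbf"):  # hypothetical file
        # Each record maps column names to values.
        print(dict(record))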
Also, most of the data I'm using doesn't have a GeoJSON version publicly available - I'd have to trust that someone else converted it correctly, and at a satisfactory resolution.
-1 means unknown projection. For GeoJSON, though, you can usually assume it either has an EPSG code in it or it's 4326. (The only difference between GeoJSON versions is that in newer versions it HAS to be in 4326, while older ones could declare something else; in practice most things will break if they get a GeoJSON that's not 4326, even if the GeoJSON has its CRS info in it.)
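A small sketch of that defaulting logic (the filename is hypothetical; the legacy 'crs' member is the pre-RFC 7946 named-CRS form, which the current spec dropped entirely):

    # Sketch: work out which CRS a GeoJSON file claims, defaulting to EPSG:4326.
    import json

    def geojson_crs(path):
        with open(path) as f:
            doc = json.load(f)
        crs = doc.get("crs")
        if crs is None:
            return "EPSG:4326"  # the only thing newer GeoJSON allows
        # Legacy form: {"type": "name", "properties": {"name": "urn:ogc:def:crs:EPSG::3857"}}
        return crs.get("properties", {}).get("name", "EPSG:4326")

    print(geojson_crs("parcels.geojson"))  # hypothetical file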
I've plugged Shapefiles into QGIS, which was able to discern the corresponding EPSG numbers.
The .prj files also contain enough information to figure out projection IDs, and various tools exist to manipulate and extract information from those files.
It looks like that web service is a wrapper around one or more of those tools.
You can use custom projections in PostGIS by inserting rows into the spatial_ref_sys table (no need to know an EPSG-assigned SRID; use whatever you like as long as it doesn't collide with already-assigned ones). This, however, involves knowing both the WKT and the PROJ.4 forms of the projection string: [1] is a table of mappings between WKT and PROJ.4 expressions, and [2] is a way to automate the conversion between the two using OGR.
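Here's a sketch of that automation using OGR's Python bindings, followed by the spatial_ref_sys insert (the .prj path, connection string, and self-assigned SRID 990001 are all made up):

    # Sketch: convert a .prj's WKT to PROJ.4 with OGR, then register both
    # in spatial_ref_sys under a self-assigned SRID.
    from osgeo import osr
    import psycopg2

    with open("custom.prj") as f:  # hypothetical .prj file
        wkt = f.read()

    srs = osr.SpatialReference()
    srs.ImportFromWkt(wkt)
    proj4 = srs.ExportToProj4()

    conn = psycopg2.connect("dbname=gis")  # hypothetical database
    with conn, conn.cursor() as cur:
        cur.execute(
            "INSERT INTO spatial_ref_sys "
            "(srid, auth_name, auth_srid, srtext, proj4text) "
            "VALUES (%s, %s, %s, %s, %s)",
            (990001, "custom", 990001, wkt, proj4),
        )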
Generally I've been harvesting them manually using tools like RGeo or ex_shape.
I also don't see how I could be "doing it bad", since what I need to do is use data from sets that use multiple SRIDs. If I can't reliably pick out the SRID, I also can't reliably bring them all into my DB.
I once worked on a project where we were forced (over the entire engineering team's objections) to use Oracle Spatial (the geospatial add-on for Oracle) instead of Postgres+PostGIS :( Not nearly as good. Really expensive as hell, and it doesn't even come close to PostGIS in performance and capability.
Reading the changelog, one feature strikes me as very useful for machine learning projects:
ST_QuantizeCoordinates
With this you should be able to store coordinates at whatever precision they come with, but get back query results that only care about 'city level' or 'block level' precision and can be encoded with an exact number of bytes for a neural net.
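A quick sketch of what that might look like (the table and column names are hypothetical; ST_QuantizeCoordinates takes the number of decimal digits to preserve, and 3 digits is roughly block-level near the equator):

    # Sketch: read back coordinates quantized to ~3 decimal places.
    import psycopg2

    conn = psycopg2.connect("dbname=gis")  # hypothetical database
    with conn, conn.cursor() as cur:
        cur.execute(
            "SELECT id, ST_AsText(ST_QuantizeCoordinates(geom, 3)) FROM places"
        )
        for row in cur.fetchall():
            print(row)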
I think the answer is still no. Have you had success using GeoGig? I have been watching the space and have hopes for solutions like the Rust rewrite of Dat: https://datrs.yoshuawuyts.com
I'm just looking into a match between needs and capabilities at the moment. I think we want something we can run analysis on in R-spatial, and I'm surprised this doesn't already exist.