Show HN: Explore 16 Years of Green Card Applications (jobsintech.io)
173 points by negrit on June 3, 2015 | 83 comments



Hello everyone. I quickly built this little tool based on public records provided by the government. I did the same a couple of weeks ago with H-1Bs, and I just added over 1.1 million green card records.

The data provided is not perfect, so I'm still working on cleaning it, but it should give you an idea of what is going on.

If you have any questions, feel free to ask.


This is awesome Theo. When can I have data about Americans in France?


I can't seem to find the H1B statistics page, maybe I'm just looking in the wrong place though. Got a link?


"If you have any questions, feel free to ask."

What tools did you use/craft to clean the source data?


PostgreSQL :)


So you just dump the data in and process it via SQL. What about pre-processing, i.e. different data sources (PDF, CSV), incomplete or overlapping data? I imagine some code had to be written to do this?


All my data sources were CSV and MDB files, so I imported them directly into PostgreSQL. Very little code was written to clean the data. 99% of the cleaning was done with SQL queries.

The code is mostly used to display the data. Cleaning the data with SQL queries is much faster than writing code.
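
For anyone curious what "cleaning with SQL queries" can look like, here is a minimal sketch. It is not the author's actual pipeline: the table and column names (`perm_raw`, `employer`, `state`, `wage`) are hypothetical, and SQLite (via Python's built-in `sqlite3`) stands in for PostgreSQL so the example runs without a server.

```python
import sqlite3

# Hypothetical raw table; the real schema is not given in the thread.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE perm_raw (employer TEXT, state TEXT, wage TEXT)")
conn.executemany(
    "INSERT INTO perm_raw VALUES (?, ?, ?)",
    [
        ("  Google Inc. ", "ca", "120,000"),
        ("GOOGLE INC", "CA", "95000"),
        ("Acme Corp", "ny", ""),
    ],
)

# Cleaning expressed purely as SQL: trim whitespace, normalise case,
# drop dots from employer names, and turn wage strings into integers
# (an empty wage string becomes NULL).
conn.execute(
    """
    CREATE TABLE perm_clean AS
    SELECT
        UPPER(TRIM(REPLACE(employer, '.', ''))) AS employer,
        UPPER(TRIM(state)) AS state,
        CAST(NULLIF(REPLACE(wage, ',', ''), '') AS INTEGER) AS wage
    FROM perm_raw
    """
)

rows = conn.execute(
    "SELECT employer, state, wage FROM perm_clean ORDER BY rowid"
).fetchall()
for row in rows:
    print(row)
```

After this step the two spellings of the same employer collapse into one value, which is what makes grouping by company possible.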


Great tool! There seems to be a bug though when I drill down to year plus country of citizenship and then try to filter by job description and state or city. The job description search term ends up in the state input field.

Also, filter by company would be great!


My bad.

Fixing it ASAP


This looks great! Can the green card data be shown by categories? Like how many green cards applications were filed in each category like EB1, EB2, etc.


Neat tool, though it's a little disconcerting to be able to easily find my application - not your fault, though, the data was already there!


Kinda weird that all the time series charts have time going backwards. Going from lower to higher is certainly the standard way to do it. E.g.: http://data.jobsintech.io/companies/google-inc


Any way for us to run `select` queries ourselves?


I haven't planned it but why not. I don't really know how to do it though.


Kudos my friend.


Great work mate


Just as an FYI, citizenship is not the same as country of chargeability (usually country of birth), which is what the USCIS looks at for placing you in EB-ROW/India/China/Mexico/Philippines. So you could have an Indian passport but if you are born in, let's say, Saudi Arabia, you are placed in EB-ROW.


That's correct. Also, if your spouse was born in a different country than you, you can charge your application to that country. For example, if your spouse was born in Kenya but you were born in India, you can charge your GC app to Kenya, which is ROW (Rest of World).


What is this? This website indexes all available LCAs from 2001 to 2015. Where does the information come from? LCAs are public records and provided by the "Office of Foreign Labor Certification".

IMHO, this is misleading. LCAs don't have a 1-1 correlation to 'Green Cards'. This data is based on the PERM process, which is just one stage in the green card process. LCAs aren't green card applications, but a labour market test based on which green cards are applied for.


Yep, my bad, I forgot to update the FAQ. I do have 5.2 million searchable LCAs as well as 1.1 million PERMs.


The labor condition application has nothing to do with immigrant visas, maybe you are thinking of labor certification?

https://en.wikipedia.org/wiki/Labor_certification#Difference...


LCAs (labor condition applications) are for H1Bs, not green cards. Did you mean LCs (labor certifications)?


I hope he did mean LCs. It's easy to confuse the two: http://en.wikipedia.org/wiki/Labor_certification#Differences...


This is roughly consistent with what I know: Indian engineers took nearly half of all H1Bs, while India's major peer, China, is taking less than 10%.

I could never figure out what makes the gap so big; I would think each taking roughly 15~20% makes more sense.

One theory is that the Indian IT giants apply for lots of H1Bs and then fill them once they're approved; they know the system very well. Meanwhile, Chinese IT workers and students don't have that kind of group effort.

Also, Indian managers like to hire Indians, while Chinese does the opposite; over time that also makes a big difference.

No bias, just curious here. Nice work indeed.


I work at Microsoft and the number of Indians there seems disproportionate to me. I too wonder why so many of them come to the US. Maybe IT education in India is really strong?

Two funny things I observe regularly:

1) Sometimes I take an MS shuttle to work. The shuttle is full, and I'm the only non-Indian in it.

2) Sometimes I'm in my building's lobby and when I look around, I'm the only (or one of 2-3 people) non-Indian there.

I pity those guys because they have to wait for 10+ years to get their Green Cards, even if they're EB-2.


> Maybe IT education in India is really strong?

More like systemic visa sponsorship fraud for reasons of cost saving. The visa candidates are often as not frauds as well: http://www.dawn.com/news/1080040


Indian IT community is much more entrenched in the US market. If a Chinese engineer wants to find H1-B work, he will most likely have to go at it by himself, and you know how hard that is.

Language is another barrier: while Chinese engineers most likely read English, few speak it fluently.

Lastly, this is subjective, finding work in China is easy for good engineers. They also enjoy upper middle class pay AND social status. So while moving to the States means higher salary, it does not necessarily translate to better life.


> while Chinese does the opposite

Can you elaborate on that? I'd never heard of such a thing.


In general, the culture in China encourages individuals to be their best and to care less about teamwork and group leadership. It's slowly changing, but this will take some time.


The 'salary' feature is pretty telling. I just looked up a former coworker by searching for the company and narrowing down by the hire year. I knew what country he was from so I was able to see his (starting) salary.


Wow, I would really like to see my American coworkers' starting salaries the same way they are able to look mine up in this table :)

Don't think that would ever happen though, it would be labeled as a shocking breach of privacy.


You can look up most US citizens' salaries if they work for a gov't agency.


I know that.

But I work for a private company, and I would bet the majority of HN readers do too. Imagine if your starting salary was known to your coworkers on the day you started, but not vice versa.


Can somebody explain to me why Americans need to have a green card to work in the US? According to the data, there are 1659 applicants with a success rate of 68%.


Maybe it refers to US nationals as opposed to US citizens. For example, the people of American Samoa and Guam are US nationals but not US citizens, at least not automatically. But I don't know what all the differences between the two categories are.

Reference: https://www.youtube.com/watch?v=CesHr99ezWE


Also, the Soviet Union was dissolved in 1991, how come there were 4 USSR nationals between 2001 and 2015 applying for permanent residency?


Ah, I should have made it more clear. It's the decision date, not the application date. So probably those applicants applied when the USSR was still a thing.


So, in essence, the bureaucracy took so long that at some point during the process, their country ceased to exist. Kafka would be proud :)


Might be interesting to somehow show the delta between application date and decision date, in particular by country.
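
A rough sketch of that idea, assuming each record carries both dates (the thread does not confirm that the dataset includes an application date). The records and the helper name `mean_wait_days_by_country` are made up for illustration.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical records: (country, application_date, decision_date).
records = [
    ("India", date(2010, 1, 1), date(2010, 7, 1)),
    ("India", date(2011, 1, 1), date(2011, 4, 1)),
    ("China", date(2010, 3, 1), date(2010, 5, 1)),
]


def mean_wait_days_by_country(rows):
    """Average number of days between application and decision, per country."""
    waits = defaultdict(list)
    for country, applied, decided in rows:
        waits[country].append((decided - applied).days)
    return {country: mean(days) for country, days in waits.items()}


wait_by_country = mean_wait_days_by_country(records)
print(wait_by_country)
```

A median would probably be more robust than a mean here, since a handful of very old applications could skew the average.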


Yeah, Czechoslovakia is also on the list, even though they split in 1993.


Could be that they were Americans who renounced their citizenship and later wanted to work in the US. Or it could be malformed data :)


Thanks for the work to present that data in a nice and explorable way!

It would be awesome to have another column next to "% of Greencards" that says "% of population".

One of the reasons India has by far the most green cards is that they also are up there when it comes to total population.


The site says India 40.1% and China 6.7%. According to the CIA factbook, the respective populations are 1.2 billion and 1.3 billion.


The "% of Green Cards" refers to percent of Green Card applications in database, not against population of the country.


yes, and that's why I'd love to have another column to figure out if there are any countries that have statistically significant higher/lower Greencards per capita :)

It's probably not a super useful metric, but it would be interesting to see
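
For a quick back-of-the-envelope version, here is a small sketch using only figures quoted in this thread (roughly 280K and 45K records for India and China, populations of about 1.2 and 1.3 billion). The function name `per_million` is just illustrative.

```python
# Approximate figures as quoted by commenters in the thread, not exact data.
records = {"India": 280_000, "China": 45_000}
population = {"India": 1_200_000_000, "China": 1_300_000_000}


def per_million(counts, pops):
    """Records per million inhabitants for each country in `counts`."""
    return {country: counts[country] / pops[country] * 1_000_000 for country in counts}


rates = per_million(records, population)
for country, rate in rates.items():
    print(f"{country}: {rate:.1f} records per million people")
```

With those rough inputs India comes out at about 233 records per million people and China at about 35, so population alone does not explain the gap.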


Is there some filter on this data? Green Card for a certain type of job? India has 280K greencards and Mexico only 40K greencards in 16 years? That doesn't seem right.


I assume this is only employment green cards. Most Mexican Americans probably come in on family based green cards, or just got citizenship by birth in the US.


Even China has only 45K greencards. They seem to follow the same pattern as India for immigration, so what category are they coming in under? Last I checked, both countries (India, China) have a similar number of immigrants in the US.


I don't know, but the EB2 and EB3 backlogs for India are longer than for China so there must be some difference.

Assuming those numbers are from PERM applications though, they don't correspond one-to-one to immigrants. For example Indians have a multi year wait to get a green card after their PERM is approved, and anytime they change employers while waiting they will likely apply for a new PERM, so by the time they get their green card one Indian could have gone through many PERM approvals (which this website will probably count separately).


As mentioned in one of the other comments, you're getting NaN% for some success rates. I'm guessing it's because you're using the number of certified as the denominator, which will return NaN if it's 0.

Also, maybe this is a dumb question, but is this all green cards? For example, your site says there were 261 green card recipients from Nigeria, which seems quite low.

(Goes without saying - really cool!)
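
A guarded version of the calculation might look like the sketch below. How the site actually computes its success rate is not stated in the thread; this assumes it is certified cases divided by decided cases (certified plus denied), and the function names are hypothetical.

```python
from typing import Optional


def success_rate(certified: int, denied: int) -> Optional[float]:
    """Percentage of decided cases that were certified, or None if none were decided."""
    decided = certified + denied
    if decided == 0:
        # Nothing to divide by: report "no data" instead of NaN.
        return None
    return 100.0 * certified / decided


def format_rate(rate: Optional[float]) -> str:
    """Render the rate for a table cell, using 'n/a' when it is undefined."""
    return "n/a" if rate is None else f"{rate:.1f}%"


print(format_rate(success_rate(2, 0)))
print(format_rate(success_rate(0, 0)))
```

Returning `None` and formatting it separately keeps the "no data" case distinct from a genuine 0% rate.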


> Also, maybe this is a dumb question, but is this all green cards? For example, your site says there were 261 green card recipients from Nigeria, which seems quite low.

I wondered the same thing. Fiji is listed with 2 recipients (last I checked Fiji's annual quota was 600).


Excellent way of drawing attention to the subject! Kudos.

Might be interesting to explore the option of adding a beautiful data vis on the home page to capture the user's attention from the get-go. Looking at the sub pages you're obviously skilled enough at creating fascinating visualizations.


Curious: what sort of attention are you hoping this brings to what subject?


Thanks, I'm planning on adding a few charts.


Are these across all LPR categories—i.e. employment based, family based, DV, asylum, etc.? The numbers make me inclined to think not, but I can't find any explanation on the website of what it covers.


Since it has salary data it must be based on PERMs, which only cover a subset of employment green cards.


It'll be fun to map the stock price with the hiring graph. Does a big jump in hiring signal an oncoming price surge in the next year or so?


If you ever get the chance I would consider redoing the graphs using https://dc-js.github.io/dc.js/ Being able to interact with data by clicking on graphs can make a massive difference to the way you can interrogate data. Kudos though.


This is awesome! I'm def considering it.


Search fields don't appear to work properly (Safari on OS X) - if you key in a full search (title, city, state) and hit 'refine' it puts the title in all three fields while showing you everything, not just search results. Interesting site though - I want to find me :)


On which page?


Go to a specific country, then a year. Try searching for a city or a state, it doesn't work.

I can't now reproduce the earlier behavior I saw where a job title ended up in all 3 search fields.



Getting an 'NaN' for the Success Rate in the table at the bottom of that page. May be something for OP to look into.


So reading into India's stats, there was a dip in 2013 followed by a rise to the maximum number of GC applications by Indians ever? Is there a way to tell how many of those were upgrades? That would be interesting, because then you could tell how many might just be repetitions.


I also wonder what happened to Indian applicants in 2012[0]. A 0.03% success rate.

[0] - http://data.jobsintech.io/green-cards/india/2012


The dates in this dataset are "decision" date, not the application date.

In that year, there was a fast forward movement in the backlog of India's GC applications... Green cards have a per-country cap: at the end of the fiscal year, USCIS can use the "unused" GC quota of other countries for the backlogged countries. This is why you see that the "priority date" for India stays at a fixed place (e.g., 2005 for EB2 India) for most of the year, and then suddenly moves forward at the end of the fiscal year.


Awesome!

Nit-pick: Any chance you could right-align the columns with numbers, and add thousand delimiters?


Always interesting to look up your own company for activity. You may be surprised.


What's the difference between salary and prevailing wage?

http://en.m.wikipedia.org/wiki/Prevailing_wage


A staggering 500 applications from North Korea, wonder how they got out..


What is notable here is the low approval rate.


Spies?


This is a really great tool. Kudos to negrit.


Very interesting. Wonder what's the reason for the huge dip in Indian Software Applications 2013 onwards?

http://data.jobsintech.io/green-cards/india?utf8=%E2%9C%93&r...


could be a reporting issue -- these things take time to show up in the data?


Interestingly:

>Soviet Union 4 | 0.0 | 2 | 0 | 0 | 2 | 100.0%


In a company view, can you add Browse LCAs by job title?


Definitely.


Oh man Palau has NaN% software engineers accepted :(


don't despair, maybe they have infinite engineers.


is there anything similar for canada?


Canada 0 in 2012 is weird.



