Officially, police departments in North Dakota tell us that they have no way of knowing the refugee status of their inmates, and so cannot report this data. Pro-refugee advocates transform this into “there is no evidence that refugees commit more crime in our area!”
Of course, they could just as easily say “there is no evidence that refugees commit less crime”, or “there’s no evidence”.
As it turns out, while there isn’t direct evidence, if you’re willing to do some work and some analysis, you can make an educated guess. There is a public website you can visit to find the current inmate roster of the Cass County Jail. For each inmate, the website shows the name, birthdate, arrest date, and charge(s) of the inmate.
Since I’m a software engineer specializing in data analysis and “big data”, I decided to see if it was possible to find some signals in the inmate data. I have Cass County jail inmate data from January 1st, 2012, until January 28th, 2017. This represents about 66,000 separate arrest records and 20,198 unique inmates.
If you go to the Cass County inmate roster on any given day, you see many, many Muslim names. Names carry cultural information: we all realize that someone named “Sven Svengaard” is more often than not going to be the child of parents who are culturally Scandinavian.
In the same way, when we see inmates named “ZIAD MOHANNAD QUDAH” and “ABDULLAHI MOHAMED IMAN”, we can conclude that they are probably not 5th generation North Dakotans who homesteaded here from Norway.
Can we use “Muslim sounding name” as a reliable suggestion of refugee status?
Not necessarily. Some of these “Muslim sounding names” might very well belong to legal immigrants who were well vetted and who paid their own way here. One criticism I have of some pro-refugee arguments I’ve seen is that they carelessly conflate immigrants and refugees. I don’t want to do that here – if possible, I’d like to treat refugees and immigrants separately, somehow. But without specific immigration status information in the data, how could we do that?
It turns out that there is a very interesting “signal” in the inmate data. The inmate data records the date of birth for each inmate. In the US, birth dates are roughly uniformly distributed across the year. That is, apart from modest seasonal and holiday variation, about the same number of babies are born every day of the year. There are about as many Americans born on February 5th as there are on August 21st, and this holds approximately true no matter what day of the year you talk about, and no matter what year you’re talking about.
In general, if we don’t have a reason to believe that criminals are only born on certain days of the year, we’d expect the distribution of inmate birthdays to match the distribution of birthdays in the overall population. That is, we’d expect the inmate birthdays to be roughly evenly distributed across every day of the year.
And they are – except for one magical date.
It turns out that for the Cass County jail inmate data I have, on average, 54 inmates were born each day of the year.
But strangely, 341 inmates claim to have been born on Jan 1st.
January 1st, being the first date of the year, is exactly what you’d put as a birth date if you didn’t know your real birthday, or weren’t disclosing it.
In the Cass County inmate data, January 1st birthdays are more than six times as common as they should be (341 versus an expected 54). Here is the histogram of inmate birthday data:
That giant spike on the left-hand side of the graph is January 1st. (Ignore that the graph says “1970” for the year; that’s just an Excel artifact.)
You can see how every other day of the year hovers tightly around the 54-per-day average.
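The comparison above can be sketched in a few lines of Python. This is only an illustration of the counting step, not the author's actual Excel workflow: the function names are hypothetical, and the sample data is synthetic stand-in data, not the real Cass County roster.

```python
from collections import Counter
from datetime import date


def birthday_counts(birthdates):
    """Count people per (month, day), ignoring the birth year."""
    return Counter((d.month, d.day) for d in birthdates)


def january_first_ratio(birthdates):
    """Return (jan1_count, mean_count_on_other_days, ratio).

    The mean is taken over the other calendar days that appear in the data.
    With roughly 20,000 records every day of the year should appear, so this
    matches a plain per-day average; with tiny samples it would not.
    """
    counts = birthday_counts(birthdates)
    jan1 = counts.get((1, 1), 0)
    others = [n for key, n in counts.items() if key != (1, 1)]
    mean_other = sum(others) / len(others)
    return jan1, mean_other, jan1 / mean_other


# Synthetic stand-in data (NOT the real roster): 12 people listed on
# January 1st and 3 people on each of four ordinary days.
sample = (
    [date(1980, 1, 1)] * 12
    + [date(1985, 3, 14)] * 3
    + [date(1990, 6, 2)] * 3
    + [date(1975, 9, 30)] * 3
    + [date(1988, 12, 25)] * 3
)

jan1, mean_other, ratio = january_first_ratio(sample)
print(jan1, mean_other, ratio)  # 12 3.0 4.0
```

Applied to the figures quoted in the text, the same ratio would be 341 / 54, or roughly 6.3.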
This is a classic data problem. We know beyond any shadow of doubt that the birthday information for most of the inmates claiming to be born on January 1st is simply bad data.
What kind of situation exists in America where the police and the jail would allow someone to have invalid birthday data?
Perhaps the data can tell us something about who is failing to provide accurate birthday data to the jail?
I have a list here (PDF) of all 341 Cass County inmates who claim to have been born on January 1st.
Does a certain “pattern” jump out at you from this list?
I have to say, a lot of these names ring my “sounds Muslim” bell.
Now, you’ll also notice that many of these names don’t sound Muslim at all. That’s to be expected. Remember, statistically, we should expect about 54 people to have actually been born on January 1st – just like every other day of the year. So, you should expect to see about 54 culturally American sounding names on this list. And that’s about what I see. I am not actually an expert on the cultural histories of names, so I may have mischaracterized some of them.
But the general pattern holds – we’ve got a big huge pile of “Muslim sounding” inmate names, and we absolutely know that the majority of them have false birth date information.
My basic theory is this: When we have a list of inmates with Muslim names and false birth date information, we can reliably guess that these are, more often than not, refugees from majority-Muslim nations.
Why? Because most refugees claim to be escaping failed states with civil wars and non-functioning governments. That’s precisely the kind of place that wouldn’t send over reliable birth date information – assuming it was collected to begin with.
So, the police can tell you that they don’t keep records of who is or isn’t a refugee, and therefore, don’t know.
But we can make some educated guesses, and do our best.
Of the 341 names in the January 1st club, I’m confident that at least 255 of them have a high likelihood of being refugees from majority-Muslim nations. Statistically, that number should be closer to 287 (the 341 claimed January 1st birthdays minus the roughly 54 we’d expect to be genuine). You can look over the list I’ve shared and make your own count; let me know what estimate you come up with.
So, let’s assume for the sake of argument that I am correct – that at least 255 of the Cass County inmates were refugees.
That would mean that of the 20,198 inmates in the 5-year period, 255 of them, or about 1.3%, were refugees.
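For readers who want to check the 287 figure, here is the arithmetic as a short Python sketch. It uses only the numbers quoted in the text; the variable names are illustrative, and the 255 is the author's own manual count rather than anything computed.

```python
# Figures quoted in the text above.
jan1_claimed = 341      # inmates listing a January 1st birth date
expected_per_day = 54   # average number of inmates born on a typical day
authors_count = 255     # names the author counted as likely refugees

# Excess over what a typical calendar day would produce.
excess = jan1_claimed - expected_per_day

# Share of the January 1st list that this excess accounts for.
excess_share = excess / jan1_claimed

print(excess)                           # 287
print(f"{excess_share:.0%}")            # 84%
print(f"{authors_count / excess:.0%}")  # 89%
```

So the excess accounts for about 84% of the January 1st list, and the manual count of 255 covers about 89% of that excess.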
That is not a very large percentage.
However – refugees are a tiny part of the Cass County population. So, can we adjust this incarceration rate for population?
How many refugees has Lutheran Social Services brought into Cass County?
LSS claims that they have settled about 4000 refugees in all of North Dakota in the last 14 years.
If 255 refugees have done jail time, out of a total population of about 4000, that works out to around 6% of refugees having done jail time. That’s not an exceptionally high rate of incarceration.
But we need to make two important adjustments.
People who work with refugees would be the first to tell you that not all refugees are the same. We have refugees from several different people groups that have been resettled into the Fargo area.
Refugee arrival data for the state of ND shows that the largest refugee population brought to Cass County is Bhutanese. I encourage you to read up on the refugee situation in Bhutan, but the key point is that Bhutanese refugees are not Muslim and do not have Muslim names. So, the likely-Muslim-refugee inmate names we see are probably not from the Bhutanese population.
If you total up the values in column “Cass County Total”, you get 4239, which is in line with the public claims made by LSS.
Now, if you remove the Bhutanese top row, you are left with 2619 total refugees in Cass County from all other nations.
So our actual Muslim nation refugee data now looks like this: out of about 2600 non-Bhutanese refugees, about 255 of them appear to have spent time in Cass County Jail. That’s about 10%.
But there’s something else.
The refugees brought in by LSS have had about a 50/50 gender ratio of male to female. But the majority of inmates appear to have been male.
If we suppose that there are around 2600 non-Bhutanese refugees in Cass County, and only around half of them are men, then we’d have about 1300 male non-Bhutanese refugees in Cass County.
And 255 of them appear to have spent time in jail.
That’s 20%. That’s one out of five.
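The three percentages in this section come from dividing the same estimated count by successively smaller populations. Here is a minimal Python sketch of that arithmetic. All inputs are the rounded estimates used in the text, not verified figures, and the names `jail_rate` and `POPULATIONS` are hypothetical. Note that the last step divides all 255 by the male population, as the text does, which assumes every one of the 255 is male.

```python
def jail_rate(inmates, population):
    """Fraction of a population that appears in the inmate count."""
    return inmates / population


LIKELY_REFUGEE_INMATES = 255  # the author's estimate from the January 1st list

# Rounded population estimates used in the text, largest to smallest.
POPULATIONS = [
    ("all refugees settled in ND by LSS", 4000),
    ("non-Bhutanese refugees in Cass County", 2600),
    ("male non-Bhutanese refugees (about half)", 1300),
]

rates = [
    (label, jail_rate(LIKELY_REFUGEE_INMATES, population))
    for label, population in POPULATIONS
]

for label, rate in rates:
    print(f"{label}: {rate:.1%}")
# all refugees settled in ND by LSS: 6.4%
# non-Bhutanese refugees in Cass County: 9.8%
# male non-Bhutanese refugees (about half): 19.6%
```

Because the numerator never changes, each halving of the denominator doubles the rate, so the final figure is only as good as the population estimates and the assumption that all 255 belong to that population.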
Data tells us that the arrest rate for white Americans in the USA is not one out of five. It’s not even one-out-of-five for white males. It’s significantly less than that.
(I focus on white males since Cass County has an overwhelmingly white population.)
It would appear from this analysis that likely refugees from Muslim nations are much more likely to have been in jail than the overall Cass County population.
If true, this would represent a significant economic and social cost to the Cass County community. Contrary to the claims of refugee supporters that there is no evidence of increased costs to our community from absorbing refugee populations, here we appear to have some very specific and quantifiable data that suggests otherwise.
It would be nice if we had detailed information from jails, schools, welfare agencies, job agencies, and so on, that could help us better quantify the refugee group outcomes in our community. We might observe, for instance, that it is significantly more expensive to absorb Somali refugees than Bhutanese refugees. That appears to be the case, based on inmate data. Having better data would let us have these discussions.
However, I get the distinct sense that there are many people who are motivated to make sure data-driven discussions about refugee policy are impossible. That’s an inexcusable attitude for anyone in public service. The people have the right to good data about the impacts of government programs that they pay for, and which impact their community. Shouting “you are a racist” is not a valid answer to a request for good data.
I will close by saying that I am a lay person. I am not an expert on refugee issues, cultural naming practices, inmate housing, jail record keeping, or other significant issues. What I have some experience with is data analysis, and, despite being a lay person, I appear to have done what all of the “experts” in our area have said is impossible: I have made some plausible attempts to quantify the relative criminality of refugees in our area, despite not having a specific refugee signal in the data set.
I expect that some people who read this document will point out methodology flaws in what I have done. I welcome your corrections, and hope to be able to incorporate them to provide a more useful document with future revisions. It could be that I have made several serious errors that completely invalidate the tentative conclusion presented. It is to the benefit of the entire community that such errors be made public and rigorously discussed.
I also hope that this document will serve to highlight the very real need for better data. The initial findings in this paper suggest that the problem is larger than commonly thought, and warrants additional investigation.