My colleague Robert Allison has a knack for finding fascinating data. Last week he did it again by locating data about how blood types and Rh factors vary among countries.
He produced a series of eight world maps, each showing the prevalence of a blood type (A+, A-, B+, B-, AB+, AB-, O+, and O-) in various countries around the world. As I studied his maps, I noticed that the distribution of blood types in certain ethnic groups (Chinese, Japanese, Indians,...) was different from the distribution in Western Europe and former European colonies.
When dealing with multivariate data, a single visualization is rarely sufficient to answer all the questions that you can ask. Robert's maps answer the question, "What is the spatial distribution of each blood type?" I was curious about a different question: "Within each country, what is the distribution of blood types?" To answer my question, I needed a different visualization of the same multivariate data.

My attempt is shown to the left. The graph is a stacked bar chart of the percentage of each blood type for 63 countries, sorted by the percentage of types that have positive Rh factors. Blood types with positive Rh factors are plotted on the right; negative Rh factors are plotted on the left. The A+ and A- types are plotted closest to the 0% reference line. The next types are AB, B, and O, at increasing distances from the 0% reference line.
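The layout described above can be sketched in a few lines of Python. This is not the SAS program that produced the chart; it is a minimal illustration that assumes each country has eight percentages keyed by blood type. The country names and numbers are made up, the function names are hypothetical, and the ascending sort order is inferred from the description of which countries appear at the top of the chart.

```python
# Made-up percentages for three hypothetical countries (NOT the article's data).
data = {
    "Country Y": {"A+": 30.0, "A-": 2.0, "AB+": 5.0, "AB-": 0.5,
                  "B+": 20.0, "B-": 1.5, "O+": 38.0, "O-": 3.0},
    "Country Z": {"A+": 28.0, "A-": 0.0, "AB+": 7.0, "AB-": 0.0,
                  "B+": 25.0, "B-": 0.5, "O+": 39.0, "O-": 0.5},
    "Country X": {"A+": 36.0, "A-": 7.0, "AB+": 3.0, "AB-": 1.0,
                  "B+": 8.0, "B-": 1.0, "O+": 37.0, "O-": 7.0},
}

# Stacking order described in the text: A is closest to the 0% line, then AB, B, O.
STACK_ORDER = ["A", "AB", "B", "O"]


def rh_positive_total(pcts):
    """Total percentage of the population with a positive Rh factor."""
    return sum(value for key, value in pcts.items() if key.endswith("+"))


def diverging_segments(pcts):
    """Return (blood type, start, end) for each bar segment.

    Rh-positive segments grow to the right of 0; Rh-negative segments
    grow to the left of 0.
    """
    segments = []
    right = 0.0  # running edge of the Rh-positive stack
    left = 0.0   # running edge of the Rh-negative stack
    for group in STACK_ORDER:
        pos = pcts[group + "+"]
        neg = pcts[group + "-"]
        segments.append((group + "+", right, right + pos))
        segments.append((group + "-", left - neg, left))
        right += pos
        left -= neg
    return segments


# Assumption: countries are listed by increasing Rh-positive percentage,
# so the countries with the most Rh-negative blood come first.
ordered = sorted(data, key=lambda country: rh_positive_total(data[country]))

for country in ordered:
    print(country, rh_positive_total(data[country]))
    for label, start, end in diverging_segments(data[country]):
        print("   ", label, start, end)
```

Each (start, end) pair is the horizontal extent of one bar segment, which is the information that a plotting routine needs to draw the diverging bars.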
A few ethnic differences are apparent. At the top of the chart are Western European countries and former European colonies such as Brazil, Australia, and New Zealand. A little lower on the list are countries in Eastern Europe and Scandinavia.
After that, the list starts to get geographically jumbled. The United States, Canada, and South Africa were all settled by people of multiple ethnicities. The middle of the list is dominated by countries from the Middle East, Northern Africa, and the Near East.
The next set of countries includes South American countries such as Argentina and Bolivia, Caribbean countries, India, and African countries.
Finally, the bottom of the list features Asian populations such as China, Japan, Mongolia, and the Philippines. These populations have almost no negative Rh factors in their blood. The distributions of blood types in those countries are similar to each other, although regional differences appear, such as the relatively small proportion of A+ blood in Thailand.
This one-dimensional ranking of countries by blood types reflects historical connections between peoples as a result of conquest, trade, and colonization.
A few countries seem "out of place" in their list order. Lebanon, Ireland and Iceland, and Peru and Chile are some of the countries whose distributions of blood types differ from those adjacent to them in the list.
Relationships between countries
Some countries probably seem "out of place" because it is hard to linearly order the countries when there are eight variables to consider. Principal component analysis (PCA) is a statistical technique that projects multivariate data onto a few dimensions, which can reveal groups of observations that have similar characteristics. In SAS software, you can use the PRINCOMP procedure to conduct a principal component analysis.
The analysis reveals that 81% of the variation in the data can be explained by the first two principal components. About 92% can be explained by using three principal components, which means that the eight variables (percentages of each blood type) fit well into these lower-dimensional linear subspaces.
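To make "proportion of variation explained" concrete, here is a rough Python sketch of the computation. It is not the PRINCOMP analysis itself: the data are made-up numbers for four hypothetical variables, and the function names are illustrative. The sketch forms the correlation matrix (which PROC PRINCOMP analyzes by default), finds its eigenvalues with a basic Jacobi rotation method, and divides each eigenvalue by the sum of all eigenvalues.

```python
import math


def correlation_matrix(rows):
    """Pearson correlation matrix of the columns of `rows`."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centered = [[r[j] - means[j] for j in range(m)] for r in rows]
    sds = [math.sqrt(sum(c[j] ** 2 for c in centered) / (n - 1))
           for j in range(m)]
    return [[sum(c[i] * c[j] for c in centered) / ((n - 1) * sds[i] * sds[j])
             for j in range(m)] for i in range(m)]


def jacobi_eigenvalues(matrix, sweeps=50, tol=1e-20):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    n = len(matrix)
    a = [row[:] for row in matrix]  # work on a copy
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes the (p, q) element.
                theta = (0.5 * math.atan(2 * a[p][q] / diff)
                         if diff != 0.0 else math.pi / 4)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


# Made-up data: 6 hypothetical countries, 4 hypothetical variables.
rows = [
    [36.0, 8.0, 37.0, 16.0],
    [30.0, 20.0, 38.0, 7.0],
    [28.0, 25.0, 39.0, 1.0],
    [40.0, 10.0, 33.0, 14.0],
    [22.0, 30.0, 42.0, 2.0],
    [18.0, 8.0, 70.0, 2.0],
]

eigenvalues = jacobi_eigenvalues(correlation_matrix(rows))
total = sum(eigenvalues)
proportions = [ev / total for ev in eigenvalues]

# Cumulative proportion of variation explained by the first k components.
cumulative = []
running = 0.0
for prop in proportions:
    running += prop
    cumulative.append(running)

print(proportions)
print(cumulative)
```

The second entry of the cumulative list plays the role of the "first two principal components" figure quoted above, and the third entry plays the role of the three-component figure. The values printed here come from the made-up numbers, not from the blood type data.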

The score plot from a two-dimensional PCA is shown to the left. I added colors to the data to indicate a geographical region for the countries; the regions came from the United Nations list of countries and geographic regions. This plot shows the relationships between countries based on similarities in the distribution of blood types.
The middle of the plot contains African and West Asian nations. (West Asia is the UN name for the region that many people call the Middle East.) The right side of the plot is dominated by European countries and their former colonies. The upper left quadrant contains the Asian countries. The lower left quadrant includes Caribbean, Central American, and South American countries. This presentation once again shows that the distributions of blood types in Peru and Chile are different from those of other countries but are similar to each other.
You can download the data and the SAS program that analyzes it and do additional analyses.
What interesting features can you find in these data? Are there other ways to view these data? Leave a comment.
23 Comments
Interesting to look at, but from what year are the blood type data taken?
Europe today has a very mixed population, with immigration increasing since 1990.
It would be interesting with a timeseries plot of changes over time.
The source of the data compiles statistics from many different sources and years.
Gee Rick did you forget the 40 percent A+ Blood type in Japan? Gee if that is not an anomaly I don't know what is!
I find the PCA score plot provides some interesting information. Upper left quadrant is quite homogeneous, with almost all Asian countries. Lower right is most diverse. Switzerland and Chile are the most distant. I wonder what makes Chile and Peru so unique and different from the others?
It's an awesome article, but it shows the same 8 blood types in India as everywhere else, so how does it show that different countries have different blood groups?
Can anyone please answer? Thank you in advance.
For each country, the graph shows the proportion of the population that has each of the 8 different blood types.
immigrants.
The blood group B+ is believed to have developed in the Himalayas, and A+ and the others elsewhere.
That's the reason many people in India have B+.
You say "A few countries seem 'out of place' in their list order. Lebanon, Ireland and Iceland, and Peru and Chile, are some of the countries whose distribution of blood types differ from those adjacent to them in the list."
Can you please explain why they are out of place? My ancestors come from most of those places. What is it that makes them different? A specific blood type? Interested to know.
They are only "out of place" in the sense that the geographic location of the country differs from the geographic location of adjacent countries. The most likely explanation is the mixing of blood types through immigration. For example, in the 1990s there was a mass immigration to Ireland of people from Eastern Europe and Asia. This immigration altered the blood types in Ireland. Thus Ireland appearing next to Serbia on the list can be explained through immigration. I'm not a historian, but there are probably similar historical facts that explain other relationships.
Can you please confirm, is AB - mainly found in Peru Bolivia, the brown. My mother carried that. Those places consul in my DNA markings.?
Many thanks
AB- is most prevalent in Switzerland (2%). The following countries have AB- factors in about 1% of the population:
Australia, Austria, New Zealand, United Kingdom, Denmark, Ireland, Sweden, France, Germany, Czech Republic, Poland, South Africa, Finland, Israel, and Ethiopia.
The percentage in Bolivia is 0.1%. In Peru it is a tiny 0.02%.
what is the most common blood group in Cape verde and Afghanistan??
The data do not provide that information. You might assume that the proportions in Afghanistan are similar to those of its neighbor Pakistan. Cape Verde was settled by many groups, so you need actual data.
I'm interested in AB+. I read in another source that this blood type is a more recent development. That is, it seems to have developed after 900 AD. Do your findings support this too? I'm researching a project, and cannot find much information on when these more recent blood types first appeared in populations.
Thank you, in advance, for your time and help.
Sorry, but I do not know.
I am new at this. I have A+ blood, so where would my ancestry be?
You need to run a genetic profile to obtain ancestry information. The data show that certain countries and regions tend to have a certain proportion of blood types, but you can't infer a person's ancestry from their blood type.
It would be interesting to see blood type distribution based on age ranges by country. Younger groups will represent the effects of very recent immigration while older groups will represent the population as it has been since the last immigration wave in each country.
Only about 15% of the population of the African continent is included in this graph. Africa has 1.2B, but only about 0.2B are included. Why were they excluded? Lack of data? I was hoping to compare Congo, Mongolia, Nepal, Bolivia, and Poland.
Thanks for your comment. This visualization is based on the data provided at
http://www.rhesusnegative.net/themission/bloodtypefrequencies/
in Nov 2014. You can investigate their site to see if they have any additional data.
What are your thoughts as to why Argentina and Chile, adjacent on the map, have such a difference in the distribution of blood types? Visually they appear quite dissimilar, and that was the first thing that jumped out at me from the chart. Chile has minimal O- but an overabundance of O+, and Argentina has a significantly larger O- population.
Most likely differences in post-Colonial immigration. Although both were initially colonized by Spain, Argentina had a massive influx of European immigrants in 1880-1910. For example, 62% of Argentinians have Italian ancestry.