The distribution of blood types by country


My colleague Robert Allison has a knack for finding fascinating data. Last week he did it again by locating data about how blood types and Rh factors vary among countries.

He produced a series of eight world maps, each showing the prevalence of a blood type (A+, A-, B+, B-, AB+, AB-, O+, and O-) in various countries around the world. As I studied his maps, I noticed that the distribution of blood types in certain ethnic groups (Chinese, Japanese, Indians, ...) was different from the distribution in Western Europe and former European colonies.

When dealing with multivariate data, a single visualization is rarely sufficient to answer all the questions that you can ask. Robert's maps answer the question, "What is the spatial distribution of each blood type?" I was curious about a different question: "Within each country, what is the distribution of blood types?" To answer my question, I needed a different visualization of the same multivariate data.

[Figure: stacked bar chart of blood type percentages by country]

My attempt is a stacked bar chart of the percentage of each blood type for 63 countries, sorted by the percentage of types that have positive Rh factors. Blood types with positive Rh factors are plotted on the right; negative Rh factors are plotted on the left. The A+ and A- types are plotted closest to the 0% reference line. The next types are AB, B, and O, at increasing distances from the 0% reference line.
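The layout described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the SAS program that produced the chart: the function names and the percentages for "Country X" and "Country Y" are hypothetical placeholders, not values from the data set.

```python
# Stacking order outward from the 0% reference line, as described in the text.
ORDER = ["A", "AB", "B", "O"]

def diverging_segments(row):
    """Return (rh_positive_total, segments) for one country.

    row maps labels such as "A+" or "O-" to percentages.
    Each segment is (label, start, end) on a horizontal axis where
    Rh-positive types extend to the right of 0 and Rh-negative types
    extend to the left of 0.
    """
    segments = []
    right = 0.0  # running edge of the Rh-positive stack
    left = 0.0   # running edge of the Rh-negative stack
    for group in ORDER:
        pos = row[group + "+"]
        neg = row[group + "-"]
        segments.append((group + "+", right, right + pos))
        right += pos
        segments.append((group + "-", left - neg, left))
        left -= neg
    return right, segments

def sort_countries(table):
    """Sort country names by total Rh-positive percentage, ascending."""
    return sorted(table, key=lambda name: diverging_segments(table[name])[0])

# Hypothetical percentages for illustration only (NOT from the data set).
example = {
    "Country Y": {"A+": 27, "A-": 0.5, "AB+": 7, "AB-": 0.1,
                  "B+": 25, "B-": 0.4, "O+": 39.5, "O-": 0.5},
    "Country X": {"A+": 30, "A-": 6, "AB+": 4, "AB-": 1,
                  "B+": 9, "B-": 2, "O+": 40, "O-": 8},
}

order = sort_countries(example)
total_x, segments_x = diverging_segments(example["Country X"])
```

Each bar is then drawn from the `start` to the `end` of its segment, so the sort key (the right-hand edge of the positive stack) is visible directly in the chart.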

A few ethnic differences are apparent. At the top of the chart are Western European countries and former European colonies such as Brazil, Australia, and New Zealand. A little lower on the list are countries in Eastern Europe and Scandinavia.

After that, the list starts to get geographically jumbled. The United States, Canada, and South Africa were all settled by people of multiple ethnicities. The middle of the list is dominated by countries from the Middle East, Northern Africa, and the Near East.

The next set of countries includes South American countries such as Argentina and Bolivia, Caribbean countries, India, and African countries.

Finally, the bottom of the list features Asian populations such as China, Japan, Mongolia, and the Philippines. These populations have almost no negative Rh factors in their blood. The distributions of blood types in those countries are similar to each other, although regional differences appear, such as the relatively small percentage of A+ blood in Thailand.

This one-dimensional ranking of countries by blood types reflects historical connections between peoples as a result of conquest, trade, and colonization.

A few countries seem "out of place" in their list order. Lebanon, Ireland and Iceland, and Peru and Chile are some of the countries whose distributions of blood types differ from those of the countries adjacent to them in the list.


Relationships between countries

Some of the countries probably seem "out of place" because it is hard to linearly order the countries when there are eight variables to consider. Principal component analysis (PCA) is a statistical technique that can reveal groups of observations that have similar characteristics. In SAS software, you can use the PRINCOMP procedure to conduct a principal component analysis.

The analysis reveals that 81% of the variation in the data can be explained by the first two principal components. About 92% can be explained by using three principal components, which means that the eight variables (percentages of each blood type) fit well into these lower-dimensional linear subspaces.
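To make "proportion of variation explained" concrete, here is a minimal NumPy sketch of the computation. It is not the SAS program used for this post. It assumes the analysis is run on the correlation scale (the PROC PRINCOMP default), and it uses randomly generated percentages as a stand-in for the real data, so it will not reproduce the 81% and 92% figures. The function name is hypothetical.

```python
import numpy as np

def pca_explained_variance(X, standardize=True):
    """Return (proportions, scores) for the rows of X.

    proportions[k] is the share of total variance explained by
    principal component k; scores holds the coordinates of each
    observation on the principal components.
    """
    X = np.asarray(X, dtype=float)
    Z = X - X.mean(axis=0)                # center each variable
    if standardize:                       # correlation-scale PCA
        Z = Z / X.std(axis=0, ddof=1)
    # SVD of the centered (and scaled) data; rows of Vt are the components
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    eigvals = s**2 / (X.shape[0] - 1)     # variances along each component
    proportions = eigvals / eigvals.sum()
    scores = Z @ Vt.T
    return proportions, scores

# Synthetic stand-in data: 63 rows of 8 percentages that sum to 100.
# These are random numbers, NOT the blood type data.
rng = np.random.default_rng(0)
X = 100 * rng.dirichlet(np.ones(8), size=63)

proportions, scores = pca_explained_variance(X)
cumulative = np.cumsum(proportions)       # cumulative[1] is the two-component share
```

Because the eight percentages in each row sum to 100, one variable is a linear combination of the others, so at most seven components carry any variance. The first two columns of `scores` are the coordinates that a two-dimensional score plot displays.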

[Figure: score plot of the first two principal components, colored by geographic region]

In the score plot from the two-dimensional PCA, I added colors to the data to indicate a geographical region for each country; the regions came from the United Nations list of countries and geographic regions. This plot shows the relationships between countries based on similarities in the distribution of blood types.

The middle of the plot contains African and West Asian nations. (West Asia is the UN name for the region that many people call the Middle East.) The right side of the plot is dominated by European countries and their former colonies. The upper left quadrant contains the Asian countries. The lower left quadrant includes Caribbean, Central American, and South American countries. This presentation once again shows that the distributions of blood types in Peru and Chile are different from those in other countries but similar to each other.

You can download the data and the SAS program that analyzes it and do additional analyses.

What interesting features can you find in these data? Are there other ways to view these data? Leave a comment.


About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of PROC IML and SAS/IML Studio. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

21 Comments

  1. Interesting to look at, but from what year are the blood type data taken?
    Europe today has a very mixed population because of immigration that began around 1990 and has increased since.
    It would be interesting to see a time series plot of changes over time.

  2. Sanjay Matange

    I find the PCA score plot provides some interesting information. Upper left quadrant is quite homogeneous, with almost all Asian countries. Lower right is most diverse. Switzerland and Chile are the most distant. I wonder what makes Chile and Peru so unique and different from the others?

  3. It's an awesome article, but it shows eight blood type groups, and the same eight are found in India too. So how can it show that different countries have different blood groups?
    Can anyone please answer? Thank you in advance.

  4. You say "A few countries seem 'out of place' in their list order. Lebanon, Ireland and Iceland, and Peru and Chile, are some of the countries whose distribution of blood types differ from those adjacent to them in the list."

    Can you please explain why they are out of place? My ancestors come from most of those places. What is it that makes them different? A specific blood type? Interested to know.

    • Rick Wicklin

      They are only "out of place" in the sense that the geographic location of the country differs from the geographic location of adjacent countries. The most likely explanation is the mixing of blood types through immigration. For example, in the 1990s there was a mass immigration to Ireland of people from Eastern Europe and Asia. This immigration altered the blood types in Ireland. Thus Ireland appearing next to Serbia on the list can be explained through immigration. I'm not a historian, but there are probably similar historical facts that explain other relationships.

  5. Can you please confirm: is AB- mainly found in Peru and Bolivia (the brown)? My mother carried that. Those places consul in my DNA markings.

    Many thanks

    • Rick Wicklin

      AB- is most prevalent in Switzerland (2%). The following countries have AB- factors in about 1% of the population:
      Australia, Austria, New Zealand, United Kingdom, Denmark, Ireland, Sweden, France, Germany, Czech Republic, Poland, South Africa, Finland, Israel, and Ethiopia.

      The percentage in Bolivia is 0.1%. In Peru it is a tiny 0.02%.

    • Rick Wicklin

      The data do not provide that information. You might assume that the proportions in Afghanistan will be similar to those of its neighbor Pakistan. Cape Verde was settled by many groups, so you need actual data.

  6. Pingback: Ahh, that's smooth! Anti-aliasing in SAS statistical graphics - The DO Loop

  7. I'm interested in AB+. I read in another source that this blood type is a more recent development. That is, it seems to have developed after 900 AD. Do your findings support this too? I'm researching a project, and cannot find much information on when these more recent blood types first appeared in populations.
    Thank you, in advance, for your time and help.

    • Rick Wicklin

      You need to run a genetic profile to obtain ancestry information. The data show that certain countries and regions tend to have a certain proportion of blood types, but you can't infer a person's ancestry from their blood type.

      • It would be interesting to see blood type distribution based on age ranges by country. Younger groups will represent the effects of very recent immigration, while older groups will represent the population as it has been since the last immigration wave in each country.

  8. Robert Meszaros

    Only about 15% of the population of the African continent is included in this graph. Africa has 1.2B people, but only about 0.2B are included. Why were they excluded? Lack of data? I was hoping to compare Congo, Mongolia, Nepal, Bolivia, and Poland.
