North Africa in Genetics and History

Detailed map of African and Middle Eastern ethnicities in Haak et al.’s dataset

North Africa is an often misunderstood region in human genetics. Since it is in Africa, people often assume that it contains the same variety of people referenced in terms like “African Americans,” “black Africans,” or even just “Africans.” In reality, the African continent contains members of all three of the great human clades–Sub-Saharan Africans in the south, Austronesians (Asian clade) in Madagascar, and Caucasians in the north.

“The North African Middle Stone Age and its place in recent human evolution” provides an overview of the first 275,000 years of humanity’s history in the region (300,000-25,000 years ago, more or less), including the development of symbolic culture and early human dispersal. Unfortunately, the paper is paywalled.

Throughout most of human history, the Sahara–not the Mediterranean or Red Sea–has been the biggest local impediment to human migration. North Africans are thus much closer, genetically, to their neighbors in Europe and the Middle East than to their neighbors across the desert. (Before the domestication of the camel, about 3,000 years ago, the Sahara was even harder to cross.)

But from time to time, global weather patterns change and the Sahara becomes a garden: the Green Sahara. The last time we had a Green Sahara was about 9,000-7,000 years ago; during this time, people lived, hunted, fished, herded and perhaps farmed throughout areas that are today nearly uninhabited wastes.

“The peopling of the last Green Sahara revealed by high-coverage resequencing of trans-Saharan patrilineages” sheds light on how the Green (and subsequently brown) Sahara affected the spread (and separation) of African groups into northern and sub-Saharan populations:

In order to investigate the role of the last Green Sahara in the peopling of Africa, we deep-sequence the whole non-repetitive portion of the Y chromosome in 104 males selected as representative of haplogroups which are currently found to the north and to the south of the Sahara. … We find that the coalescence age of the trans-Saharan haplogroups dates back to the last Green Sahara, while most northern African or sub-Saharan clades expanded locally in the subsequent arid phase. …

Our findings suggest that the Green Sahara promoted human movements and demographic expansions, possibly linked to the adoption of pastoralism. Comparing our results with previously reported genome-wide data, we also find evidence for a sex-biased sub-Saharan contribution to northern Africans, suggesting that historical events such as the trans-Saharan slave trade mainly contributed to the mtDNA and autosomal gene pool, whereas the northern African paternal gene pool was mainly shaped by more ancient events.

In other words, modern North Africans have some maternal (female) Sub-Saharan DNA that arrived recently via the Islamic slave trade, but most of their Sub-Saharan Y-DNA (male) is much older, hailing from the last time the Sahara was easy to cross.

Note that not much DNA is shared across the Sahara:

After the African humid period, the climatic conditions became rapidly hyper-arid and the Green Sahara was replaced by the desert, which acted as a strong geographic barrier against human movements between northern and sub-Saharan Africa.

A consequence of this is that there is a strong differentiation in the Y chromosome haplogroup composition between the northern and sub-Saharan regions of the African continent. In the northern area, the predominant Y lineages are J-M267 and E-M81, with the former being linked to the Neolithic expansion in the Near East and the latter reaching frequencies as high as 80 % in some north-western populations as a consequence of a very recent local demographic expansion [8–10]. On the contrary, sub-Saharan Africa is characterised by a completely different genetic landscape, with lineages within E-M2 and haplogroup B comprising most of the Y chromosomes. In most regions of sub-Saharan Africa, the observed haplogroup distribution has been linked to the recent (~ 3 kya) demic diffusion of Bantu agriculturalists, which brought E-M2 sub-clades from central Africa to the East and to the South [11–17]. On the contrary, the sub-Saharan distribution of B-M150 seems to have more ancient origins, since its internal lineages are present in both Bantu farmers and non-Bantu hunter-gatherers and coalesce long before the Bantu expansion [18–20].

In spite of their genetic differentiation, however, northern and sub-Saharan Africa share at least four patrilineages at different frequencies, namely A3-M13, E-M2, E-M78 and R-V88.

A recent article in Scientific Reports, “Whole Y-chromosome sequences reveal an extremely recent origin of the most common North African paternal lineage E-M183 (M81),” tells some of North Africa’s fascinating story:

Here, by using whole Y chromosome sequences, we intend to shed some light on the historical and demographic processes that modelled the genetic landscape of North Africa. Previous studies suggested that the strategic location of North Africa, separated from Europe by the Mediterranean Sea, from the rest of the African continent by the Sahara Desert and limited to the East by the Arabian Peninsula, has shaped the genetic complexity of current North Africans [15,16,17]. Early modern humans arrived in North Africa 190–140 kya (thousand years ago) [18], and several cultures settled in the area before the Holocene. In fact, a previous study by Henn et al. [19] identified a gradient of likely autochthonous North African ancestry, probably derived from an ancient “back-to-Africa” gene flow prior to the Holocene (12 kya). In historic times, North Africa has been populated successively by different groups, including Phoenicians, Romans, Vandals and Byzantines. The most important human settlement in North Africa was conducted by the Arabs by the end of the 7th century. Recent studies have demonstrated the complexity of human migrations in the area, resulting from an amalgam of ancestral components in North African groups [15,20].

According to the article, E-M81 is dominant in Northwest Africa and absent almost everywhere else in the world.

The authors tested various men across North Africa in order to draw up a phylogenetic tree of the branching of E-M183:

The distribution of each subhaplogroup within E-M183 can be observed in Table 1 and Fig. 2. Indeed, different populations present different subhaplogroup compositions. For example, whereas in Morocco almost all subhaplogroups are present, Western Sahara shows a very homogeneous pattern with only E-SM001 and E-Z5009 being represented. A similar picture to that of Western Sahara is shown by the Reguibates from Algeria, which contrast sharply with the Algerians from Oran, which showed a high diversity of haplogroups. It is also worth to notice that a slightly different pattern could be appreciated in coastal populations when compared with more inland territories (Western Sahara, Algerian Reguibates).

Overall, the authors found that the haplotypes were “strikingly similar” to each other and showed little geographic structure besides the coastal/inland differences:

As proposed by Larmuseau et al. [25], the scenario that better explains Y-STR haplotype similarity within a particular haplogroup is a recent and rapid radiation of subhaplogroups. Although the dating of this lineage has been controversial, with dates proposed ranging from Paleolithic to Neolithic and to more recent times [17,22,28], our results suggested that the origin of E-M183 is much more recent than was previously thought. … In addition to the recent radiation suggested by the high haplotype resemblance, the pattern showed by E-M183 imply that subhaplogroups originated within a relatively short time period, in a burst similar to those happening in many Y-chromosome haplogroups [23].

In other words, someone went a-conquering.

Alternatively, given the high frequency of E-M183 in the Maghreb, a local origin of E-M183 in NW Africa could be envisaged, which would fit the clear pattern of longitudinal isolation by distance reported in genome-wide studies [15,20]. Moreover, the presence of autochthonous North African E-M81 lineages in the indigenous population of the Canary Islands, strongly points to North Africa as the most probable origin of the Guanche ancestors [29]. This, together with the fact that the oldest indigenous individuals have been dated 2210 ± 60 ya, supports a local origin of E-M183 in NW Africa. Within this scenario, it is also worth to mention that the paternal lineage of an early Neolithic Moroccan individual appeared to be distantly related to the typically North African E-M81 haplogroup [30], suggesting again a NW African origin of E-M183. A local origin of E-M183 in NW Africa > 2200 ya is supported by our TMRCA estimates, which can be taken as 2,000–3,000, depending on the data, methods, and mutation rates used.

However, the authors also note that they can’t rule out a Middle Eastern origin for the haplogroup, since their study simply doesn’t include genomes from Middle Eastern individuals. They rule out a spread during the Neolithic expansion (too early) but not the Islamic expansion (“an extensive, male-biased Near Eastern admixture event is registered ~1300 ya, coincidental with the Arab expansion [20].”) Alternatively, they suggest E-M183 might have expanded near the end of the Third Punic War. Sure, Carthage (in Tunisia) was defeated by the Romans, but the era was otherwise one of great North African wealth and prosperity.


Interesting papers! My hat’s off to the authors. I hope you enjoyed them and get a chance to RTWT.

So who is White?

“White” is a nebulous category. “Black” is actually easier to define, because there’s a pretty hard boundary (the Sahara) between black Africa and everywhere else. To be fair, there are also groups like the Bushmen (who are more of a tawny brown) and the Pygmies, who are genetically separate from other sub-Saharan Africans by over 100,000 years, but these groups are pretty small on the global scale. But “whites” and “Asians” occupy the same continent, and thus shade into each other.

If we use a strictly skin tone definition (as the word “white” implies) we can just pull up a map of global skin tone variation:

source: Wikipedia

Of course, this implies that either Spaniards and Finns aren’t white, or Chinese and Eskimos are. Either way is fine, of course, though this would contradict most people’s usage. (And I kind of question that data on the Finns:

credit: The Postnational Monitor)

These composites of faces from around the world offer us some more data, though depending on how they were made, they may not accurately reflect skin tone in all countries (i.e., if the creator relied on pictures of famous people available on the internet, then these will reflect local beauty norms rather than group averages.)

(Plus, I wonder why the Romanians are pink.)

J. B. Huang has taken some of the Eurasian faces from this set and gone through the effort of trying to quantify their shapes, as displayed in this graph (at least, that’s what I think they’re doing):

Interestingly, while some of the faces cluster together the way you might expect–China, Taiwan, Korea, and Japan are all near each other, as are Belgium and the Netherlands–many of the groupings are near random, e.g., Mongolia, Turkey, and the Philippines. Hungary and Austria are closer to India and Japan than to Poland or Finland. The European faces are all over the map.

Maybe this doesn’t mean anything at all, or maybe it means that there’s a lot of variation in European faces.

This is actually not too surprising, given that modern Europeans are genetically descended from three different groups who conquered the peninsula in successive waves, leaving more or less of their DNA in different areas: the hunter-gatherers who were there first, followed by farmers who spread out from Anatolia (modern Turkey), followed by the “Indo-Europeans” aka the Yamnaya, who were part hunter-gatherer (by DNA, not profession) and part another group whose origins have yet to be located, but which I call the “teal people” because their DNA is teal on Haak’s graph.

Oh yes, we are getting to Haak.

From Haak et al.

This isn’t the full graph, but it’s probably enough for our purposes. The European countries show a characteristic profile of Orange, Dark Blue, and Teal. (By contrast, the east Asian countries, which cluster closely together on the facial map, are mostly yellow with only a bit of red.)

Obviously DNA isn’t actually colored. It’s just a visual aid.

Haak’s graph makes it fairly easy to rule out the groups that are definitely different (at least genetically.) The American Indians, Inuit, West Africans, Chinese, and Aborigines are distinctly out. This leaves us with Europe, the Middle East, North Africa, India, and parts of central Asia/Siberia:


The Orange-centric region, which Haak et al arranged to display the movements of the Anatolian farmer people.


The heavily teal Indian section (the middle part, from Hazara to Tlingit, is obviously not Indian).

And finally some Siberian DNA.

Now, I could stare at these all day; I love them. They tell so many fascinating stories about people and where they went. Of the three ancestries found in Europeans, the oldest, the dark blue (hunter-gatherers), is found throughout India, Siberia, and even the Aleutian islands (though I caution that some of this could just be because of Russians raping the Aleuts back in the day.) The dark blue appears to hit a particular low point in the Caucasus region, which of course is about where the teal got its start.

The orange–Anatolian farmers–shows up throughout the Middle East and Europe, but is near totally absent in India and Siberia. (Not much farming in Siberia!)

At a lower resolution (not pictured,) India, central Asia, and Siberia appear to have a mix of–broadly speaking–“European” and “Asian” ancestry. (Not too surprising, since they are in the middle of the continent.) Obviously the middle of Asia is a big crossroads between different groups–red (Siberian) yellow (east Asian) teal and dark blue, and bits of the same DNA that shows up in the Eskimo (Inuit) and Aleuts.

But this is all kind of complicated. Luckily for us, this is only one way to visualize DNA–I’ve got others!

Credit Robert Lindsay, Beyond Highbrow

If you’re not familiar with these sorts of trees, the basic story is that geneticists gathered DNA samples (from spit, I think, which is pretty awesome,) from ethnic groups from all over the world, and then measured how many genes they have in common. More genes in common = groups more closely related to each other. Fewer genes = more genetic distance from each other.

Since different teams use different genetic samples and computer models, they have produced slightly different genetic trees.
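For the curious, here is a minimal sketch of the "more in common = closer on the tree" idea. It is not what any of these teams actually ran: the group names, the six allele frequencies, and the crude distance measure are all made up for illustration, and the joining rule is plain average-linkage clustering. The toy data also includes one group built as an even mix of two others, to show roughly where a mixed group lands.

```python
from itertools import combinations

# Hypothetical allele frequencies at six made-up variant sites (not real data).
FREQS = {
    "GroupA": [0.90, 0.80, 0.10, 0.20, 0.70, 0.30],
    "GroupB": [0.85, 0.75, 0.15, 0.25, 0.65, 0.35],
    "GroupC": [0.10, 0.20, 0.90, 0.80, 0.30, 0.70],
}
# A toy admixed group: an even mix of GroupA and GroupC (an assumption for the demo).
FREQS["Mixed"] = [(a + c) / 2 for a, c in zip(FREQS["GroupA"], FREQS["GroupC"])]


def distance(p, q):
    """Mean absolute difference in allele frequency (a crude genetic distance)."""
    return sum(abs(x - y) for x, y in zip(p, q)) / len(p)


def leaves(node):
    """Return the group names contained in a (possibly nested) tree node."""
    return [node] if isinstance(node, str) else leaves(node[0]) + leaves(node[1])


def cluster_distance(a, b):
    """Average-linkage distance between two tree nodes."""
    pairs = [(x, y) for x in leaves(a) for y in leaves(b)]
    return sum(distance(FREQS[x], FREQS[y]) for x, y in pairs) / len(pairs)


def build_tree(names):
    """Repeatedly join the two closest clusters until a single tree remains."""
    nodes = list(names)
    while len(nodes) > 1:
        a, b = min(combinations(nodes, 2), key=lambda ab: cluster_distance(*ab))
        nodes = [n for n in nodes if n is not a and n is not b] + [(a, b)]
    return nodes[0]


tree = build_tree(sorted(FREQS))
print(tree)  # nested tuples: groups joined earlier are more similar

for x, y in combinations(sorted(FREQS), 2):
    print(f"{x:7s} {y:7s} {distance(FREQS[x], FREQS[y]):.3f}")
```

In this toy data GroupA and GroupB are joined first because they differ least, and the mixed group sits closer to each of its two sources than those sources sit to each other. Change the numbers or the distance measure and the tree can change, which is the same reason different teams get slightly different trees.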

Note that since the tree is constructed by comparing the number of genes two groups have in common, a group could end up in a particular spot because it is descended from a common ancestor with other nearby groups, or because of mixing between two groups. Ashkenazi Jews, for example, cluster with southern Europeans because they’re about half Italian (and obviously half ancient Israeli.) Here’s another chart, giving us another perspective:

I totally stole this from Razib Khan–though he got it from here.

This chart also shows us genetic differences between groups, with strong clustering among Africans and East Asians, respectively, and then a sort of scattered group of Europeans and Indians (South Asians.)

Also credit Robert Lindsay

Neither of these graphs shows Siberians or central Asians in great detail, because they are tiny groups, but I think it’s safe to say the Siberians at least cluster near their neighbors, the other Asians and far-north Americans.

The central and south Asians, though, are quite the interesting case!

Between archaeology and genetics, we’ve been able to trace the path of human expansion, from central Africa to the world:

I think this map came from that recent article I discussed in the post about possibly finding traces of the first out-of-Africa event in Papuans.

Since this post is already image heavy, here are links to a graph showing finer detail on European and North African groups, and to pictures of Moroccans (Berbers), an Aleut woman, Sardinians, Sami (Lapps), Iranians, Gujarati (and another), a Dravidian, a Brahmin, Dalits, Altai, Uyghur, and Selkup. (Look at the pictures!)

Well, ultimately, there’s no hard division between most ethnic groups or races–you can draw dividing lines where you want them. The term “white” implies dermal paleness, of course, so you may prefer a narrower definition for “white” than “Caucasian.” Greater minds than mine have already covered the subject in more authoritative detail, of course. I merely offer my thoughts for entertainment.