The Negritos are a fascinating group of short-statured, dark-skinned, frizzy-haired peoples from southeast Asia–chiefly the Andaman Islands, Malaysia, Philippines, and Thailand. (Spelling note: “Negritoes” is also an acceptable plural, and some sources use the Spanish Negrillos.)
Because of their appearance, they have long been associated with African peoples, especially the Pygmies. “Pygmy” is formally defined as any group whose adult men average 4’11” or less, though the term is almost always used specifically to refer to African Pygmies; the term pygmoid is sometimes used for groups whose men average 5’1” or below, including the Negritos. (Some of the Bushmen tribes, Bolivians, Amazonians, the remote Taron, and a variety of others may also be pygmoid, by this definition.)
However, genetic testing has long indicated that they, along with other Melanesians and Australian Aborigines, are more closely related to other east Asian peoples than to any African groups. In other words, they’re part of the greater Asian race, albeit a distant branch of it.
But how distant? And are the various Negrito groups closely related to each other, or do there just happen to be a variety of short groups of people in the area, perhaps due to convergent evolution triggered by insular dwarfism?
Jinam et al. found that the Negrito groups they studied “are basal to other East and Southeast Asians” (basal: forming the bottom layer or base; in this case, it means they split off first) “and that they diverged from West Eurasians at least 38,000 years ago.” (West Eurasians: Caucasians, consisting of Europeans, Middle Easterners, North Africans, and people from India.) “We also found relatively high traces of Denisovan admixture in the Philippine Negritos, but not in the Malaysian and Andamanese groups.” (Denisovans are a group of extinct humans similar to Neanderthals, but we’ve yet to find many of their bones. Just as Neanderthal DNA shows up in non-Sub-Saharan-Africans, so Denisovan DNA shows up in Melanesians.)
Figure 1 (A) shows PC analysis of Andamanese, Malaysian, and Philippine Negritos, revealing three distinct clusters:
In the upper right-hand corner, the Aeta, Agta, Batak, and Mamanwa are Philippine Negritos. The Manobo are non-Negrito Filipinos.
In the lower right-hand corner, the Jehai, Kintak, and Batek are Malaysian Negritos.
And in the upper left, we have the extremely isolated Andamanese Onge and Jarawa Negritos.
(Phil-NN and Mly-NN I believe are Filipino and Malaysian Non-Negritos.)
You can find the same chart, but flipped upside down, with Papuan and Melanesian DNA in the supplemental materials. Of the three groups, they cluster closest to the Philippine Negritos, along the same line with the Malaysians.
By excluding the Andamanese (and Kintak) Negritos, Figure 1 (B) allows a closer look at the structure of the Philippine Negritos.
The Agta, Aeta, and Batak form a horizontal “comet-like pattern,” which likely indicates admixture with non-Negrito Philippine groups like the Manobo. The Mamanwa, who hail from a different part of the Philippines, also show this comet-like pattern, but along a different axis–likely because they intermixed with the different Filipinos who lived in their area. As you can see, there’s a fair amount of overlap–several of the Manobo individuals clustered with the Mamanwa Negritos, and the Batak cluster near several non-Negrito groups (see supplemental chart S4 B)–suggesting high amounts of mixing between these groups.
ADMIXTURE analysis reveals a similar picture. The non-Negrito Filipino groups show up primarily as Orange. The Aeta, Agta, and Batak form a clear genetic cluster with each other and cline with the Orange Filipinos, with the Aeta the least admixed and Batak the most.
The white area on the chart isn’t a data error, but the unique signature of the geographically separated Mamanwa, who are highly mixed with the Manobo–and the Manobo, in turn, are mixed with them.
But this alone doesn’t tell us how ancient these populations are, nor if they’re descended from one ancestral population. For this, the authors constructed several phylogenetic trees, based on all of the data at hand and assuming from 0 to 5 admixture events. The one on the left assumes 5 events, but for clarity only shows three of them. The Denisovan DNA is fascinating and well-documented elsewhere in Melanesian populations; that Malaysian and Philippine Negritos mixed with their neighbors is also known, supporting the choice of this tree as the most likely to be accurate.
Regardless of which you pick, all of the trees show very similar results, with the biggest difference being whether the Melanesians/Papuans split before or after the Andamanese/Malaysian Negritos.
In case you are unfamiliar with these trees, I’ll run down a quick explanation: This is a human family tree, with each split showing where one group of humans split off from the others and became an isolated group with its own unique genetic patterns. The orange and red lines mark places where formerly isolated groups met and interbred, producing children that are a mix of both. The first split in the tree, going back hundreds of thousands of years, is between all Homo sapiens (our species) and the Denisovans, a sister species related to the Neanderthals.
All humans other than sub-Saharan Africans have some Neanderthal DNA because their ancestors met and interbred with Neanderthals on their way Out of Africa. Melanesians, Papuans, and some Negritos also have some Denisovan DNA, because their ancestors met and made children with members of this obscure human species, but Denisovan DNA is quite rare outside these groups.
Here is a map of Denisovan DNA levels the authors found, with 4% of Papuan DNA hailing from Denisovan ancestors, and Aeta nearly as high. By contrast, the Andamanese Negritos appear to have zero Denisovan. Either the Andamanese split off before the ancestors of the Philippine Negritos and Papuans met the Denisovans, or all Denisovan DNA has been purged from their bloodlines, perhaps because it just wasn’t helpful for surviving on their islands.
Back to the Tree: The second node is where the Biaka, a group of Pygmies from the Congo Rainforest in central Africa, split off. Pygmy lineages are among the most ancient on earth, potentially going back over 200,000 years, well before any Homo sapiens had left Africa.
The next group that splits off from the rest of humanity is the Yoruba, a single ethnic group chosen to stand in for the entirety of the Bantus. Bantus are the group that you most likely think of when you think of black Africans, because over the past three millennia they have expanded greatly and conquered most of sub-Saharan Africa.
Next we have the Out of Africa event and the split between Caucasians (here represented by the French) and the greater Asian clade, which includes Australian Aborigines, Melanesians, Polynesians, Chinese, Japanese, Siberians, Inuit, and Native Americans.
The first groups to split off from the greater Asian clade (aka race) were the Andamanese and Malaysian Negritos, followed by the Papuans/Melanesians. Australian Aborigines are closely related to Papuans, as Australia and Papua New Guinea were connected in a single continent (called Sahul) back during the last Ice Age. Most of Indonesia and parts of the Philippines were also connected into a single landmass, called Sunda. Sensibly, people reached Sunda before Sahul. (Perhaps at that time the Andaman Islands, to the northwest of Sumatra, were also connected or at least closer to the mainland.)
Irrespective of the exact order in which Melanesians and individual Negrito groups split off, they all split well before all of the other Asian groups in the area.
This is supported by legends told by the Filipinos themselves:
Legends, such as those involving the Ten Bornean Datus and the Binirayan Festival, tell tales about how, at the beginning of the 12th century when Indonesia and Philippines were under the rule of Indianized native kingdoms, the ancestors of the Bisaya escaped from Borneo from the persecution of Rajah Makatunaw. Led by Datu Puti and Datu Sumakwel and sailing with boats called balangays, they landed near a river called Suaragan, on the southwest coast of Panay, (the place then known as Aninipay), and bartered the land from an Ati [Negrito] headman named Polpolan and his son Marikudo for the price of a necklace and one golden salakot. The hills were left to the Atis while the plains and rivers to the Malays. This meeting is commemorated through the Ati-atihan festival.
The study’s authors estimate that the Negritos split from Europeans (Caucasians) around 30-38,000 years ago, and that the Malaysian and Philippine Negritos split around 13-15,000 years ago. (This all seems a bit tentative, IMO, especially since we have physical evidence of people in the area going back much further than that, and the authors themselves admit in the discussion that their time estimate may be too short.)
The authors also note:
Both our NJ (fig. 3A) and UPGMA (supplementary fig. S10) trees show that after divergence from Europeans, the ancestral Asians subsequently split into Papuans, Negritos and East Asians, implying a one-wave colonization of Asia. … This is in contrast to the study based on whole genome sequences that suggested Australian Aboriginal/Papuan first split from European/East Asians 60 kya, and later Europeans and East Asians diverged 40 kya (Malaspinas et al. 2016). This implies a two-wave migration into Asia…
The matter is still up for debate/more study.
In conclusion: All of the Negrito groups are likely descended from a common ancestor, (rather than having evolved from separate groups that happened to develop similar body types due to exposure to similar environments,) and were among the very first inhabitants of their regions. Despite their short stature, they are more closely related to other Asian groups (like the Chinese) than to African Pygmies. Significant mixing with their neighbors, however, is quickly obscuring their ancient lineages.
I wonder if all ancient human groups were originally short, and height a recently evolved trait in some groups?
In closing, I’d like to thank Jinam et al. for their hard work in writing this article and making it available to the public, as well as their sponsors and the unique Negrito peoples themselves for surviving so long.
A species may live in relative equilibrium with its environment, hardly changing from generation to generation, for millions of years. Turtles, for example, have barely changed since the Cretaceous, when dinosaurs still roamed the Earth.
But if the environment changes–critically, if selective pressures change–then the species will change, too. This was most famously demonstrated with English moths, which changed color from white-and-black speckled to pure black when pollution darkened the trunks of the trees they lived on. To survive, these moths need to avoid being eaten by birds, so any moth that stands out against the tree trunks tends to get turned into an avian snack. Against light-colored trees, dark-colored moths stood out and were eaten. Against dark-colored trees, light-colored moths stand out.
This change did not require millions of years. Dark-colored moths were virtually unknown in 1810, but by 1895, 98% of the moths were black.
The time it takes for evolution to occur depends simply on (A) the frequency of a trait in the population and (B) how strongly you are selecting for (or against) it.
Let’s break this down a little bit. Within a species, there exists a great deal of genetic variation. Some of this variation happens because two parents with different genes get together and produce offspring with a combination of their genes. Some of this variation happens because of random errors–mutations–that occur during copying of the genetic code. Much of the “natural variation” we see today started as some kind of error that proved to be useful, or at least not harmful. For example, all humans originally had dark skin similar to modern Africans’, but random mutations in some of the folks who no longer lived in Africa gave them lighter skin, eventually producing “white” and “Asian” skin tones.
(These random mutations also happen in Africa, but there they are harmful and so don’t stick around.)
Natural selection can only act on the traits that are actually present in the population. If we tried to select for “ability to shoot x-ray lasers from our eyes,” we wouldn’t get very far, because no one actually has that mutation. By contrast, albinism is rare, but it definitely exists, and if for some reason we wanted to select for it, we certainly could. (The incidence of albinism among the Hopi Indians is high enough–1 in 200 Hopis vs. 1 in 20,000 Europeans generally and 1 in 30,000 Southern Europeans–for scientists to discuss whether the Hopi have been actively selecting for albinism. This still isn’t a lot of albinism, but since the American Southwest is not a good environment for pale skin, it’s something.)
You will have a much easier time selecting for traits that crop up more frequently in your population than traits that crop up rarely (or never).
Second, we have intensity–and variety–of selective pressure. What % of your population is getting removed by natural selection each year? If 50% of your moths get eaten by birds because they’re too light, you’ll get a much faster change than if only 10% of moths get eaten.
Selection doesn’t have to involve getting eaten, though. Perhaps some of your moths are moth Lotharios, seducing all of the moth ladies with their fuzzy antennae. Over time, the moth population will develop fuzzier antennae as these handsome males out-reproduce their less hirsute cousins.
No matter what kind of selection you have, nor what part of your curve it’s working on, all that ultimately matters is how many offspring each individual has. If white moths have more children than black moths, then you end up with more white moths. If black moths have more babies, then you get more black moths.
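To make the two factors concrete, here is a minimal sketch of how starting frequency and selection strength combine. It assumes a deliberately simplified model that is mine, not from any study: the trait is passed straight from parent to offspring (no dominance, no sexual recombination), and carriers leave a fixed percentage more offspring than non-carriers. The function names and the example numbers are hypothetical.

```python
def next_frequency(p, advantage):
    """One generation of selection on a heritable trait.

    p         -- current fraction of the population carrying the trait
    advantage -- reproductive edge of carriers (0.10 = 10% more offspring)
    """
    carriers = p * (1.0 + advantage)   # carriers' share of the next generation
    others = 1.0 - p                   # non-carriers reproduce at the baseline rate
    return carriers / (carriers + others)


def generations_until(start, target, advantage, limit=100_000):
    """Count generations for the trait to rise from `start` to `target`."""
    p, generations = start, 0
    while p < target and generations < limit:
        p = next_frequency(p, advantage)
        generations += 1
    return generations


if __name__ == "__main__":
    # Illustrative values only: a rare vs. a common trait,
    # under weak vs. strong selection.
    for start in (0.001, 0.1):
        for advantage in (0.1, 0.5):
            n = generations_until(start, 0.98, advantage)
            print(f"start={start} advantage={advantage}: {n} generations to 98%")
```

Under these assumptions, a trait that starts out common or enjoys a large reproductive edge reaches 98% in far fewer generations than a rare trait with a small edge, which is the whole point of factors A and B.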
So what happens when you completely remove selective pressures from a population?
Back in 1968, ethologist John B. Calhoun set up an experiment popularly called “Mouse Utopia.” Four pairs of mice were given a large, comfortable habitat with no predators and plenty of food and water.
Predictably, the mouse population increased rapidly–once the mice were established in their new homes, their population doubled every 55 days. But after 211 days of explosive growth, reproduction began–mysteriously–to slow. For the next 245 days, the mouse population doubled only once every 145 days.
The birth rate continued to decline. As births and deaths reached parity, the mouse population stopped growing. Finally the last breeding female died, and the whole colony went extinct.
As I’ve mentioned before, Israel is (AFAIK) the only developed country in the world with a total fertility rate (TFR) above replacement.
It has long been known that overcrowding leads to population stress and reduced reproduction, but overcrowding can only explain why the mouse population began to shrink–not why it died out. Surely by the time there were only a few breeding pairs left, things had become comfortable enough for the remaining mice to resume reproducing. Why did the population not stabilize at some comfortable level?
Professor Bruce Charlton suggests an alternative explanation: the removal of selective pressures on the mouse population resulted in increasing mutational load, until the entire population became too mutated to reproduce.
Unfortunately, randomly changing part of your genetic code is more likely to give you no skin than skintanium armor.
But it’s only the worst genetic problems that never see the light of day. Plenty of mutations merely reduce fitness without actually killing you. Down Syndrome, famously, is caused by an extra copy of chromosome 21.
While a few traits–such as sex or eye color–can be simply modeled as influenced by only one or two genes, many traits–such as height or IQ–appear to be influenced by hundreds or thousands of genes:
Differences in human height is 60–80% heritable, according to several twin studies and has been considered polygenic since the Mendelian-biometrician debate a hundred years ago. A genome-wide association (GWA) study of more than 180,000 individuals has identified hundreds of genetic variants in at least 180 loci associated with adult human height. The number of individuals has since been expanded to 253,288 individuals and the number of genetic variants identified is 697 in 423 genetic loci.
Obviously, most of these genes each play only a small role in determining overall height (and this is of course holding environmental factors constant). There are a few extreme conditions–gigantism and dwarfism–that are caused by single mutations, but the vast majority of height variation is caused by which particular mix of those 700 or so variants you happen to have.
The general figure for the heritability of IQ, according to an authoritative American Psychological Association report, is 0.45 for children, and rises to around 0.75 for late teens and adults. In simpler terms, IQ goes from being weakly correlated with genetics, for children, to being strongly correlated with genetics for late teens and adults. … Recent studies suggest that family and parenting characteristics are not significant contributors to variation in IQ scores; however, poor prenatal environment, malnutrition and disease can have deleterious effects.…
Despite intelligence having substantial heritability (0.54) and a confirmed polygenic nature, initial genetic studies were mostly underpowered. Here we report a meta-analysis for intelligence of 78,308 individuals. We identify 336 associated SNPs (METAL P < 5 × 10⁻⁸) in 18 genomic loci, of which 15 are new. Around half of the SNPs are located inside a gene, implicating 22 genes, of which 11 are new findings. Gene-based analyses identified an additional 30 genes (MAGMA P < 2.73 × 10⁻⁶), of which all but one had not been implicated previously. We show that the identified genes are predominantly expressed in brain tissue, and pathway analysis indicates the involvement of genes regulating cell development (MAGMA competitive P = 3.5 × 10⁻⁶). Despite the well-known difference in twin-based heritability for intelligence in childhood (0.45) and adulthood (0.80), we show substantial genetic correlation (rg = 0.89, LD score regression P = 5.4 × 10⁻²⁹). These findings provide new insight into the genetic architecture of intelligence.
The greater the number of genes that influence a trait, the harder they are to identify without extremely large studies, because any small group of people might not even have the same set of relevant genes.
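Another way to see why highly polygenic traits demand huge samples is a back-of-the-envelope power calculation. The sketch below is my own illustration, not the method of any study quoted here: it assumes every locus contributes equally and additively, uses the standard normal approximation for detecting a small correlation, and borrows the 5 × 10⁻⁸ significance threshold from the abstract above. The heritability of 0.5 and the locus counts are hypothetical round numbers.

```python
from statistics import NormalDist


def variance_per_locus(heritability, n_loci):
    """Share of trait variance explained by one locus, if all contribute equally."""
    return heritability / n_loci


def sample_size_needed(r_squared, alpha=5e-8, power=0.8):
    """Approximate sample size to detect an effect explaining `r_squared` of variance.

    Normal approximation for a small correlation:
        n ~ (z_alpha + z_power)^2 / r_squared
    with a two-sided significance threshold `alpha`.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    z_power = z.inv_cdf(power)
    return (z_alpha + z_power) ** 2 / r_squared


if __name__ == "__main__":
    for n_loci in (10, 100, 1000):
        r2 = variance_per_locus(0.5, n_loci)
        print(f"{n_loci} loci: about {sample_size_needed(r2):,.0f} people per detectable locus")
```

In this toy setup the required sample size grows in direct proportion to the number of loci, so a trait spread over a thousand genes needs roughly a hundred times the sample of a trait spread over ten.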
High IQ correlates positively with a number of life outcomes, like health and longevity, while low IQ correlates with negative outcomes like disease, mental illness, and early death. Obviously this is in part because dumb people are more likely to make dumb choices which lead to death or disease, but IQ also correlates with choice-free matters like height and your ability to quickly press a button. Our brains are not some mysterious entities floating in a void, but physical parts of our bodies, and anything that affects our overall health and physical functioning is likely to also have an effect on our brains.
The study focused, for the first time, on rare, functional SNPs – rare because previous research had only considered common SNPs and functional because these are SNPs that are likely to cause differences in the creation of proteins.
The researchers did not find any individual protein-altering SNPs that met strict criteria for differences between the high-intelligence group and the control group. However, for SNPs that showed some difference between the groups, the rare allele was less frequently observed in the high intelligence group. This observation is consistent with research indicating that rare functional alleles are more often detrimental than beneficial to intelligence.
Greg Cochran has some interesting Thoughts on Genetic Load. (Currently, the most interesting candidate genes for potentially increasing IQ also have terrible side effects, like autism, Tay-Sachs, and torsion dystonia. The idea is that–perhaps–if you have only a few genes related to the condition, you get an IQ boost, but if you have too many, you get screwed.) Of course, even conventional high IQ has a cost: increased maternal mortality (larger heads).
the difference between the fitness of an average genotype in a population and the fitness of some reference genotype, which may be either the best present in a population, or may be the theoretically optimal genotype. … Deleterious mutation load is the main contributing factor to genetic load overall. Most mutations are deleterious, and occur at a high rate.
There’s math, if you want it.
Normally, genetic mutations are removed from the population at a rate determined by how bad they are. Really bad mutations kill you instantly, so their carriers are never born. Slightly less bad mutations might allow you to survive, but never reproduce. Mutations that are only a little bit deleterious might have no obvious effect, but result in having slightly fewer children than your neighbors. Over many generations, such a mutation will eventually disappear.
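For those who do want a little of the math, here is a minimal sketch of mutation-selection balance. It is a toy model under assumptions of my own: a single harmful variant, carriers who leave a fraction (1 - s) as many offspring as non-carriers, and fresh copies of the variant arising in a fraction u of non-carriers each generation. The names `step` and `equilibrium` and the rates used are hypothetical.

```python
def step(q, u, s):
    """One generation: selection against carriers, then new mutations.

    q -- current frequency of the harmful variant
    u -- fraction of non-carriers who newly acquire it each generation
    s -- selection coefficient (1.0 = carriers never reproduce)
    """
    q_after_selection = q * (1.0 - s) / (1.0 - s * q)
    return q_after_selection + u * (1.0 - q_after_selection)


def equilibrium(u, s, generations=20_000):
    """Iterate from a mutation-free population until the frequency settles."""
    q = 0.0
    for _ in range(generations):
        q = step(q, u, s)
    return q


if __name__ == "__main__":
    u = 1e-5  # illustrative mutation rate, not a measured value
    for s in (1.0, 0.1, 0.01):
        print(f"s={s}: settles near {equilibrium(u, s):.6f} (u/s = {u / s:.6f})")
```

In this model the variant never fully disappears. It settles at roughly u/s, so the weaker the selection against it, the more common it stays.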
(Some mutations are more complicated–sickle cell, for example, is protective against malaria if you have only one copy of the mutation, but gives you sickle cell anemia if you have two.)
Throughout history, infant mortality was our single biggest killer. For example, here is some data from Jakubany, a town in the Carpathian Mountains:
We can see that, prior to the 1900s, the town’s infant mortality rate stayed consistently above 20%, and often peaked near 80%.
When I first ran a calculation of the infant mortality rate, I could not believe certain of the intermediate results. I recompiled all of the data and recalculated … with the same astounding result – 50.4% of the children born in Jakubany between the years 1772 and 1890 would die before reaching ten years of age! …one out of every two! Further, over the same 118 year period, of the 13306 children who were born, 2958 died (~22 %) before reaching the age of one.
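The quoted figures are easy to sanity-check. This short sketch only re-derives numbers from the quote itself; the variable names are mine, and the under-ten death count is back-calculated from the quoted 50.4%, not taken from the original records.

```python
births = 13306          # children born in Jakubany, 1772-1890 (from the quote)
infant_deaths = 2958    # died before age one (from the quote)

years_covered = 1890 - 1772
infant_mortality = infant_deaths / births

# Back-calculated from the quoted 50.4%; not a figure from the source data.
implied_deaths_before_ten = round(0.504 * births)

if __name__ == "__main__":
    print(f"period covered: {years_covered} years")
    print(f"infant mortality: {infant_mortality:.1%}")
    print(f"implied deaths before age ten: {implied_deaths_before_ten}")
```

The numbers hold together: the period is 118 years, and 2958 out of 13306 is a little over 22%.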
Historical infant mortality rates can be difficult to calculate in part because they were so high, people didn’t always bother to record infant deaths. And since infants are small and their bones delicate, their burials are not as easy to find as adults’. Nevertheless, Wikipedia estimates that Paleolithic man had an average life expectancy of 33 years:
Based on the data from recent hunter-gatherer populations, it is estimated that at 15, life expectancy was an additional 39 years (total 54), with a 0.60 probability of reaching 15.
In other words, a 40% chance of dying in childhood. (Not exactly the same as infant mortality, but close.)
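These three numbers also show just how much early deaths drag down the average. Here is a rough worked example, assuming (as the Wikipedia passage seems to) that the 33-year figure and the hunter-gatherer survival figures can be combined, which is a simplification. The function name is hypothetical.

```python
def implied_mean_age_of_early_deaths(life_expectancy_at_birth,
                                     p_reach_15,
                                     mean_age_at_death_of_survivors):
    """Solve e0 = p * survivors_mean + (1 - p) * early_mean for early_mean."""
    contribution_of_survivors = p_reach_15 * mean_age_at_death_of_survivors
    return (life_expectancy_at_birth - contribution_of_survivors) / (1.0 - p_reach_15)


if __name__ == "__main__":
    early_mean = implied_mean_age_of_early_deaths(33, 0.60, 54)
    print(f"implied average age at death for those who die before 15: {early_mean:.1f} years")
```

If 60% of people live to an average of 54 and the overall average is 33, then the other 40% must have died at an average age of about a year and a half. In other words, taken at face value, the numbers imply that most of the childhood deaths were in fact infant deaths.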
Wikipedia gives similarly dismal stats for life expectancy in the Neolithic (20-33), Bronze and Iron ages (26), Classical Greece (28 or 25), Classical Rome (20-30), Pre-Columbian Southwest US (25-30), Medieval Islamic Caliphate (35), Late Medieval English Peerage (30), early modern England (33-40), and the whole world in 1900 (31).
Over at ThoughtCo: Surviving Infancy in the Middle Ages, the author reports estimates of infant mortality rates between 30 and 50%. I recall a study on Anasazi nutrition, which I sadly can’t locate right now, that found 100% malnutrition rates among adults (based on enamel hypoplasias) and 50% infant mortality.
As Priceonomics notes, the main driver of increasing global life expectancy–48 years in 1950 and 71.5 years in 2014 (according to Wikipedia)–has been a massive decrease in infant mortality. The average life expectancy of an American newborn back in 1900 was only 47 and a half years, whereas a 60 year old could expect to live to be 75. In 1998, the average infant could expect to live to about 75, and the average 60 year old could expect to live to about 80.
Michael A Woodley suggests that what was going on [in the Mouse experiment] was much more likely to be mutation accumulation; with deleterious (but non-fatal) genes incrementally accumulating with each generation and generating a wide range of increasingly maladaptive behavioural pathologies; this process rapidly overwhelming and destroying the population before any beneficial mutations could emerge to ‘save’ the colony from extinction. …
The reason why mouse utopia might produce so rapid and extreme a mutation accumulation is that wild mice naturally suffer very high mortality rates from predation. …
Thus mutation selection balance is in operation among wild mice, with very high mortality rates continually weeding-out the high rate of spontaneously-occurring new mutations (especially among males) – with typically only a small and relatively mutation-free proportion of the (large numbers of) offspring surviving to reproduce; and a minority of the most active and healthy (mutation free) males siring the bulk of each generation.
However, in Mouse Utopia, there is no predation and all the other causes of mortality (eg. Starvation, violence from other mice) are reduced to a minimum – so the frequent mutations just accumulate, generation upon generation – randomly producing all sorts of pathological (maladaptive) behaviours.
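The logic of the argument quoted above can be sketched as a toy simulation. To be clear, this is my own illustration of the hypothesis, not a model from Charlton or Woodley, and it does not test whether the hypothesis is true of real mice. It assumes asexual reproduction, at most one new harmful mutation per offspring, no beneficial mutations, and (in the "wild" case) survival of only the least-mutated offspring. All names and parameter values are hypothetical.

```python
import random


def simulate(generations, pop_size=200, offspring_per_parent=4,
             mutation_prob=0.5, selective=True, seed=0):
    """Return the mean mutation count per individual after each generation."""
    rng = random.Random(seed)
    population = [0] * pop_size        # everyone starts mutation-free
    history = []
    for _generation in range(generations):
        children = []
        for load in population:
            for _child in range(offspring_per_parent):
                # Each child inherits its parent's load, plus at most one new mutation.
                new_mutation = 1 if rng.random() < mutation_prob else 0
                children.append(load + new_mutation)
        if selective:
            # "Wild" conditions: only the least-loaded children survive to breed.
            children.sort()
            population = children[:pop_size]
        else:
            # "Utopia": survivors are drawn without regard to load.
            population = rng.sample(children, pop_size)
        history.append(sum(population) / pop_size)
    return history


if __name__ == "__main__":
    wild = simulate(50, selective=True)
    utopia = simulate(50, selective=False)
    print(f"mean load after 50 generations, with selection:    {wild[-1]:.2f}")
    print(f"mean load after 50 generations, without selection: {utopia[-1]:.2f}")
```

With harsh culling, the average load in this toy population stays near zero; with none, it climbs generation after generation. Whether real populations behave this way depends on everything the model leaves out.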
Today, almost everyone in the developed world has plenty of food, a comfortable home, and doesn’t have to worry about dying of bubonic plague. We live in humantopia, where the biggest factor influencing how many kids you have is how many you want to have.
Back in 1930, infant mortality rates were highest among the children of unskilled manual laborers, and lowest among the children of professionals (IIRC, this is British data). Today, infant mortality is almost non-existent, but voluntary childlessness has now inverted this phenomenon:
Yes, the percent of childless women appears to have declined since 1994, but the overall pattern of who is having children still holds. Further, while only 8% of women with postgraduate degrees have 4 or more children, 26% of those who never graduated from high school have 4+ kids. Meanwhile, the age of first-time moms has continued to climb.
Take a moment to consider the high-infant mortality situation: an average couple has a dozen children. Four of them, by random good luck, inherit a good combination of the couple’s genes and turn out healthy and smart. Four, by random bad luck, get a less lucky combination of genes and turn out not particularly healthy or smart. And four, by very bad luck, get some unpleasant mutations that render them quite unhealthy and rather dull.
Infant mortality claims half their children, taking the least healthy. They are left with 4 bright children and 2 moderately intelligent children. The four bright children succeed at life, marry well, and end up with several healthy, surviving children of their own, while the moderately intelligent do okay and end up with a couple of children.
On average, society’s overall health and IQ should hold steady or even increase over time, depending on how strong the selective pressures actually are.
Or consider a consanguineous couple with a high risk of genetic birth defects: perhaps a full 80% of their children die, but 20% turn out healthy and survive.
Today, by contrast, your average couple has two children. One of them is lucky, healthy, and smart. The other is unlucky, unhealthy, and dumb. Both survive. The lucky kid goes to college, majors in underwater intersectionist basket-weaving, and has one kid at age 40. That kid has Down Syndrome and never reproduces. The unlucky kid can’t keep a job, has chronic health problems, and 3 children by three different partners.
Your consanguineous couple migrates from war-torn Somalia to Minnesota. They still have 12 kids, but three of them are autistic with IQs below the official retardation threshold. “We never had this back in Somalia,” they cry. “We don’t even have a word for it.”
People normally think of dysgenics as merely “the dumb outbreed the smart,” but genetic load applies to everyone–men and women, smart and dull, black and white, young and especially old–because we all make random copying errors when replicating our DNA.
I could offer a list of signs of increasing genetic load, but there’s no way to avoid cherry-picking trends I already know are happening, like falling sperm counts or rising (diagnosed) autism rates, so I’ll skip that. You may substitute your own list of “obvious signs society is falling apart at the genes” if you so desire.
Nevertheless, the transition from 30% (or greater) infant mortality to almost 0% is amazing, both on a technical level and because it heralds an unprecedented era in human evolution. The selective pressures on today’s people are massively different from those our ancestors faced, simply because our ancestors’ biggest filter was infant mortality. Unless infant mortality acted completely at random–taking the genetically loaded and unloaded alike–or on factors completely irrelevant to load, the elimination of infant mortality must continuously increase the genetic load in the human population. Over time, if that load is not selected out–say, through more people being too unhealthy to reproduce–then we will end up with an increasing population of physically sick, maladjusted, mentally ill, and low-IQ people.
If all of the above is correct, then I see only 4 ways out:
1. Do nothing: Genetic load increases until the population is non-functional and collapses, resulting in a return of Malthusian conditions, invasion by stronger neighbors, or extinction.
2. Sterilization or other weeding out of high-load people, coupled with higher fertility by low-load people
3. Abortion of high-load fetuses
#1 sounds unpleasant, and #2 would result in masses of unhappy people. We don’t have the technology for #4, yet. I don’t think the technology is quite there for #3, either, but it’s much closer–we can certainly test for many of the deleterious mutations that we do know of.
The Sino-Tibetan languages, in a few sources also known as Tibeto-Burman or Trans-Himalayan, are a family of more than 400 languages spoken in East Asia, Southeast Asia and South Asia. The family is second only to the Indo-European languages in terms of the number of native speakers. The Sino-Tibetan languages with the most native speakers are the varieties of Chinese (1.3 billion speakers), Burmese (33 million) and the Tibetic languages (8 million). Many Sino-Tibetan languages are spoken by small communities in remote mountain areas and as such are poorly documented.
But the claim that Tibetans and Chinese people are genetically disparate looks more questionable. While the Wikipedia page on Sino-Tibetan claims that, “There is no ethnic unity among the many peoples who speak Sino-Tibetan languages,” in the next two sentences it also claims that, “The most numerous are the Han Chinese, numbering 1.4+ billion (in China alone). The Hui (10 million) also speak Chinese but are officially classified as ethnically distinct by the Chinese government.”
But the Chinese government claiming that a group is an official ethnic group doesn’t make it a genetic group. “Hui” just means Muslim, and Muslims of any genetic background can get lumped into the group. I actually read some articles about the Hui ages ago, and as far as I recall, the category didn’t really exist in any official way prior to the modern PRC declaring that it did for census purposes. Today (or recently) there are some special perks for being an ethnic minority in China, like exceptions to the one-child policy, which lead more people to embrace their “Hui” identity and start thinking about themselves in this pan-Chinese-Muslim way rather than in terms of their local ethnic group, but none of this is genetics.
So right away I am suspicious that this claim is more “these groups see themselves as different” than “they are genetically different.” And I totally agree that Tibetan people and Chinese people are culturally distinct and probably see themselves as different groups.
For genetics, let’s turn back to Haak et al’s representation of global genetics:
Just in case you’re new around here, the part dominated by bright blue is sub-Saharan Africans, the yellow is Asians, and the orange is Caucasians. I’ve made a map to make it easier to visualize the distribution of these groups:
The first thing that jumps out at me is that the groups in the Sino-Tibetan language family do not look all that genetically distinct, at least not on a global scale. They’re more similar than Middle Easterners and Europeans, despite the fact that Anatolian farmers invaded Europe several thousand years ago.
The Wikipedia page on Sino-Tibetan notes:
J. A. Matisoff proposed that the urheimat of the Sino-Tibetan languages was around the upper reaches of the Yangtze, Brahmaputra, Salween, and Mekong. This view is in accordance with the hypothesis that bubonic plague, cholera, and other diseases made the easternmost foothills of the Himalayas between China and India difficult for people outside to migrate in but relatively easily for the indigenous people, who had been adapted to the environment, to migrate out.
The Yangtze, Brahmaputra, Salween and Mekong rivers, as you might have already realized if you took a good look at the map at the beginning of the post, all begin in Tibet.
Since Tibet was recently conquered by China, I was initially thinking that perhaps an ancient Chinese group had imposed their language on the Tibetans some time in the remote past, but Tibetans heading downstream and possibly conquering the people below makes a lot more sense.
According to About World Languages, Proto-Sino-Tibetan may have split into its Tibeto- and Sinitic- branches about 4,000 BC. This is about the same time Proto-Indo-European started splitting up, so we have some idea of what a language family looks like when it’s that old; much older, and the languages start becoming so distinct that reconstruction becomes more difficult.
But if we look at the available genetic data a little more closely, we see that there are some major differences between Tibetans and their Sinitic neighbors–most notably, many Tibetan men belong to Y-Chromosome haplogroup D, while most Han Chinese men belong to haplogroup O with a smattering of Haplogroup C, which may have arrived via the Mongols.
The distribution of Haplogroup D-M174 is found among nearly all the populations of Central Asia and Northeast Asia south of the Russian border, although generally at a low frequency of 2% or less. A dramatic spike in the frequency of D-M174 occurs as one approaches the Tibetan Plateau. D-M174 is also found at high frequencies among Japanese people, but it fades into low frequencies in Korea and China proper between Japan and Tibet.
It is found today at high frequency among populations in Tibet, the Japanese archipelago, and the Andaman Islands, though curiously not in India. The Ainu of Japan are notable for possessing almost exclusively Haplogroup D-M174 chromosomes, although Haplogroup C-M217 chromosomes also have been found in 15% (3/20) of sampled Ainu males. Haplogroup D-M174 chromosomes are also found at low to moderate frequencies among populations of Central Asia and northern East Asia as well as the Han and Miao–Yao peoples of China and among several minority populations of Sichuan and Yunnan that speak Tibeto-Burman languages and reside in close proximity to the Tibetans.
Unlike haplogroup C-M217, Haplogroup D-M174 is not found in the New World…
Haplogroup D-M174 is also remarkable for its rather extreme geographic differentiation, with a distinct subset of Haplogroup D-M174 chromosomes being found exclusively in each of the populations that contains a large percentage of individuals whose Y-chromosomes belong to Haplogroup D-M174: Haplogroup D-M15 among the Tibetans (as well as among the mainland East Asian populations that display very low frequencies of Haplogroup D-M174 Y-chromosomes), Haplogroup D-M55 among the various populations of the Japanese Archipelago, Haplogroup D-P99 among the inhabitants of Tibet, Tajikistan and other parts of mountainous southern Central Asia, and paragroup D-M174 without tested positive subclades (probably another monophyletic branch of Haplogroup D) among the Andaman Islanders. Another type (or types) of paragroup D-M174 without tested positive subclades is found at a very low frequency among the Turkic and Mongolic populations of Central Asia, amounting to no more than 1% in total. This apparently ancient diversification of Haplogroup D-M174 suggests that it may perhaps be better characterized as a “super-haplogroup” or “macro-haplogroup.” In one study, the frequency of Haplogroup D-M174 without tested positive subclades found among Thais was 10%.
Haplogroup D’s sister clade, Haplogroup E, (both D and E are descended from Haplogroup DE), is found almost exclusively in Africa.
Haplogroup D is therefore very ancient, estimated at 50-60,000 years old. Haplogroup O, by contrast, is only about 30,000 years old.
On the subject of Han genetics, Wikipedia states:
Y-chromosome haplogroup O3 is a common DNA marker in Han Chinese, as it appeared in China in prehistoric times. It is found in more than 50% of Chinese males, and ranging up to over 80% in certain regional subgroups of the Han ethnicity. However, the mitochondrial DNA (mtDNA) of Han Chinese increases in diversity as one looks from northern to southern China, which suggests that male migrants from northern China married with women from local peoples after arriving in modern-day Guangdong, Fujian, and other regions of southern China. … Another study puts Han Chinese into two groups: northern and southern Han Chinese, and it finds that the genetic characteristics of present-day northern Han Chinese was already formed as early as three-thousand years ago in the Central Plain area.
(Note that 3,000 years ago is potentially some three thousand years after the first expansion of Proto-Sino-Tibetan, if the split dates to about 4,000 BC.)
The estimated contribution of northern Hans to southern Hans is substantial in both paternal and maternal lineages and a geographic cline exists for mtDNA. As a result, the northern Hans are the primary contributors to the gene pool of the southern Hans. However, it is noteworthy that the expansion process was dominated by males, as is shown by a greater contribution to the Y-chromosome than the mtDNA from northern Hans to southern Hans. These genetic observations are in line with historical records of continuous and large migratory waves of northern China inhabitants escaping warfare and famine, to southern China.
Interestingly, the page on Tibetans notes, “It is thought that most of the Tibeto-Burman-speakers in Southwest China, including the Tibetans, are direct descendants from the ancient Qiang.”
This ancient tribe is said to be the progenitor of both the modern Qiang and the Tibetan people. There are still many ethnological and linguistic links between the Qiang and the Tibetans. The Qiang tribe expanded eastward and joined the Han people in the course of historical development, while the other branch that traveled southwards, crosses over the Hengduan Mountains, and entered the Yungui Plateau; some went even farther, to Burma, forming numerous ethnic groups of the Tibetan-Burmese language family. Even today, from linguistic similarities, their relative relationship can be seen.
So here’s what I think happened (keeping in mind that I am in no way an expert on these subjects):
About 8,000 years ago: neolithic people lived in Asia. (People of some sort have been living in Asia since Homo erectus, after all.) The ancestors of today’s Sino-Tibetans lived atop the Tibetan plateau.
About 6,000 years ago: the Tibetans headed downstream, following the course of local rivers. In the process, they probably conquered and absorbed many of the local tribes they encountered.
About 4,000 years ago: the Han and Qiang are ethnically and linguistically distinct, though the Qiang are still fairly similar to the Tibetans.
The rest of Chinese history: Invasion from the north. Not only did the Mongols invade and kill somewhere between 20 and 60 million Chinese people in the 13th century, but there were also multiple invasions/migrations by people who were trying to get away from the Mongols.
Note that while the original proto-Sino-Tibetan invasion likely spread Tibetan Y-Chromosomes throughout southern China, the later Mongol and other Chinese invasions likely wiped out a large percent of those same chromosomes, as invaders both tend to be men and to kill men; women are more likely to survive invasions.
Most recently, of course, the People’s Republic of China conquered Tibet in 1951.
I’m sure there’s a lot I’m missing that would be obvious to an expert.
Continuing with our discussion of German/Polish history/languages/genetics, let’s look at what some actual geneticists have to say.
(If you’re joining us for the first time, the previous two posts summarize to: due to being next door to each other and having been invaded/settled over the millennia by groups which didn’t really care about modern political borders, Polish and German DNA are quite similar. More recent events, however, like Germany invading Poland and trying to kill all of the Poles, and ethnic Germans subsequently fleeing or being expelled from Poland at the end of the war, have created conditions necessary for genetic differentiation in the two populations.)
So I’ve been looking up whatever papers I can find on the subject.
The male genetic landscape of the European continent has been shown to be clinal and influenced primarily by geography rather than by language.1 One of the most outstanding phenomena in the Y-chromosomal diversity in Europe concerns the population of Poland, which reveals geographic homogeneity of Y-chromosomal lineages in spite of a relatively large geographic area seized by the Polish state.2 Moreover, a sharp genetic border has been identified between paternal lineages of neighbouring Poland and Germany, which strictly follows a political border between the two countries.3 Massive human resettlements during and shortly after the World War II (WWII), involving millions of Poles and Germans, have been proposed as an explanation for the observed phenomena.2, 3 Thus, it was possible that the local Polish populations formed after the early Slavic migrations displayed genetic heterogeneity before the war owing to genetic drift and/or gene flow with neighbouring populations. It has been also suggested that the revealed homogeneity of Polish paternal lineages existed already before the war owing to a common genetic substrate inherited from the ancestral Slavic population after the Slavs’ early medieval expansion in Europe.2 …
We used high-resolution typing of Y-chromosomal binary and microsatellite markers first to test for male genetic structure in the Polish population before massive human resettlements in the mid-20th century, and second to verify if the observed present-day genetic differentiation between the Polish and German paternal lineages is a direct consequence of the WWII or it has rather resulted from a genetic barrier between peoples with distinct linguistic backgrounds. The study further focuses on providing an answer to the origin of the expansion of the Slavic language in early medieval Europe. For the purpose of our investigation, we have sampled three pre-WWII Polish regional populations, three modern German populations (including the Slavic-speaking Sorbs) and a modern population of Slovakia. …
AMOVA in the studied populations revealed statistically significant support for two linguistically defined groups of populations in both haplogroup and haplotype distributions (Table 2). It also detected statistically significant genetic differentiation for both haplogroups and haplotypes in three Polish pre-WWII regional populations (Table 2). The AMOVA revealed small but statistically significant genetic differentiation between the Polish pre-war and modern populations (Table 2). When both groups of populations were tested for genetic structure separately, only the modern Polish regional samples showed genetic homogeneity (Table 2). Regional differentiation of 10-STR haplotypes in the pre-WWII populations was retained even if the most linguistically distinct Kashubian speakers were excluded from the analysis (RST=0.00899, P=0.01505; data not shown). Comparison of Y chromosomes associated with etymologically Slavic and German surnames (with frequencies provided in Table 1) did not reveal genetic differentiation within any of the three Polish regional populations for all three (FST, ΦST and RST) genetic distances. Moreover, the German surname-related Y chromosomes were comparably distant from Bavaria and Mecklenburg as the ones associated with the Slavic surnames (Supplementary Figure S2). MDS of pairwise genetic distances showed a clear-cut differentiation between German and Slavic samples (Figure 2). In addition, the MDS analysis revealed the pre-WWII populations from northern, central and southern Poland to be moderately scattered in the plot, on the contrary to modern Polish regional samples, which formed a very tight, homogeneous cluster (Figure 3).
This all seems very reasonable. Modern Poland is probably more homogenous than pre-war Poland in part because modern Poles have cars and trains and can marry people from other parts of Poland much more easily than pre-war Poles could, and possibly because the war itself reduced Polish genetic diversity and displaced much of the population.
Genetic discontinuity along the Polish-German border also makes sense, as national, cultural, and linguistic boundaries all make intermarriage more difficult.
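To make "genetic differentiation" a little more concrete, here is a minimal sketch of Wright's F_ST computed from haplogroup frequencies. This is a simpler cousin of the AMOVA, ΦST and RST statistics the paper actually reports, not a reimplementation of them, and the frequencies below are hypothetical numbers chosen for illustration, not values from the paper's tables.

```python
def heterozygosity(freqs):
    """Expected diversity, 1 - sum(p^2), for one population's haplogroup frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())


def fst(pop_a, pop_b):
    """Wright's F_ST for two equally weighted populations: (H_T - H_S) / H_T."""
    groups = set(pop_a) | set(pop_b)
    # Pooled frequencies: simple average of the two populations.
    pooled = {g: (pop_a.get(g, 0.0) + pop_b.get(g, 0.0)) / 2 for g in groups}
    h_s = (heterozygosity(pop_a) + heterozygosity(pop_b)) / 2  # mean within-population diversity
    h_t = heterozygosity(pooled)                               # diversity of the pooled sample
    return (h_t - h_s) / h_t if h_t > 0 else 0.0


# Hypothetical frequencies for illustration only (NOT taken from the paper).
region_1 = {"R1a": 0.55, "R1b": 0.15, "I1": 0.10, "other": 0.20}
region_2 = {"R1a": 0.20, "R1b": 0.45, "I1": 0.20, "other": 0.15}

print(round(fst(region_1, region_1), 4))  # a population compared with itself
print(round(fst(region_1, region_2), 4))  # two populations with different mixes
```

A value near zero means the two samples look like draws from one population (the "tight, homogeneous cluster" of modern Poland); larger values mean more of the total diversity sits between the populations than within them.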
The Discussion portion of this paper is very interesting; I shall quote briefly:
Kayser et al3 revealed significant genetic differentiation between paternal lineages of neighbouring Poland and Germany, which follows a present-day political border and was attributed to massive population movements during and shortly after the WWII. … it remained unknown whether Y-chromosomal diversity in ethnically/linguistically defined Slavic and German populations, which used to be exposed to intensive interethnic contacts and cohabit ethnically mixed territories, was clinal or discontinuous already before the war. In contrast to the regions of Kaszuby and Kociewie, which were politically subordinated to German states for more than three centuries and before the massive human resettlements in the mid-20th century occupied a narrow strip of land between German-speaking territories, the Kurpie region practically never experienced longer periods of German political influence and direct neighbourhood with the German populations. Lusatia was conquered by Germans in the 10th century and since then was a part of German states for most of its history; the modern Lusatians (Sorbs) inhabit a Slavic-speaking island in southeastern Germany. In spite of the fact that these four regions differed significantly in exposure to gene flow with the German population, our results revealed their similar genetic differentiation from Bavaria and Mecklenburg. Moreover, admixture estimates showed hardly detectable German paternal ancestry in Slavs neighbouring German populations for centuries, that is, the Sorbs and Kashubes. However, it should be noted that our regional population samples comprised only individuals of Polish and Sorbian ethnicity and did not involve a pre-WWII German minority of Kaszuby and Kociewie, which owing to forced resettlements in the mid-20th century ceased to exist, and also did not involve Germans constituting since the 19th century a majority ethnic group of Lusatia. 
Thus, our results concern ethnically/linguistically rather than geographically defined populations and clearly contrast the broad-scale pattern of Y-chromosomal diversity in Europe, which was shown to be strongly driven by geographic proximity rather than by language.1 …
Two main factors are believed to be responsible for the Slavic language extinction in vast territories to the east of the Elbe and Saale rivers: colonisation of the region by the German-speaking settlers, known in historical sources as Ostsiedlung, and assimilation of the local Slavic populations, but contribution of both factors to the formation of a modern eastern German population used to remain highly speculative.8 Previous studies on Y-chromosomal diversity in Germany by Roewer et al17 and Kayser et al3 revealed east–west regional differentiation within the country with eastern German populations clustering between western German and Slavic populations but clearly separated from the latter, which suggested only minor Slavic paternal contribution to the modern eastern Germans. Our ancestry estimates for the Mecklenburg region (Supplementary Table S3) and for the pooled eastern German populations, assessed as being well below 50%, definitely confirm the German colonisation with replacement of autochthonous populations as the main reason for extinction of local Slavic vernaculars. The presented results suggest that early medieval Slavic westward migrations and late medieval and subsequent German eastward migrations, which outnumbered and largely replaced previous populations, as well as very limited male genetic admixture to the neighbouring Slavs (Supplementary Table S4), were likely responsible for the pre-WWII genetic differentiation between Slavic- and German-speaking populations. Woźniak et al18 compared several Slavic populations and did not detect such a sharp genetic boundary in case of Czech and Slovak males with genetically intermediate position between other Slavic and German populations, which was explained by early medieval interactions between Slavic and Germanic tribes on the southern side of the Carpathians. Anyway, paternal lineages from our Slovak population sample were genetically much closer to their Slavic than German counterparts. …
Note that they are discussing paternal ancestry. This does not rule out the possibility of significant Slavic maternal ancestry. Finally:
Our coalescence-based divergence time estimates for the two isolated western Slavic populations almost perfectly match historical and archaeological data on the Slavs’ expansion in Europe in the 5th–6th centuries.4 Several hundred years of demographic expansion before the divergence, as detected by the BATWING, support hypothesis that the early medieval Slavic expansion in Europe was a demographic event rather than solely a linguistic spread of the Slavic language.
I left out a lot of interesting material, so I recommend reading the complete discussion if you want to know more about Polish/German genetics.
Mitochondrial DNA (mtDNA) sequence variation was examined in Poles (from the Pomerania-Kujawy region; n = 436) and Russians (from three different regions of the European part of Russia; n = 201)… The classification of mitochondrial haplotypes revealed the presence of all major European haplogroups, which were characterized by similar patterns of distribution in Poles and Russians. An analysis of the distribution of the control region haplotypes did not reveal any specific combinations of unique mtDNA haplotypes and their subclusters that clearly distinguish both Poles and Russians from the neighbouring European populations. The only exception is a novel subcluster U4a within subhaplogroup U4, defined by a diagnostic mutation at nucleotide position 310 in HVS II. This subcluster was found in common predominantly between Poles and Russians (at a frequency of 2.3% and 2.0%, respectively) and may therefore have a central-eastern European origin. …
The analysis of mtDNA haplotype distribution has shown that both Slavonic populations share them mainly with Germans and Finns. The following numbers of the rare shared haplotypes and subclusters were found between populations analyzed: 10% between Poles and Germans, 7.4% between Poles and Russians, and 4.5% between Russians and Germans. A novel subcluster U4-310, defined by mutation at nucleotide position 310 in HVS II, was found predominantly in common between Poles and Russians (at frequency of 2%). Given the relatively high frequency and diversity of this marker among Poles and its low frequency in the neighbouring German and Finnish populations, we suggest a central European origin of U4-310, following by subsequent dispersal of this mtDNA subgroup in eastern European populations during the Slavonic migrations in early Middle Ages.
In other words, for the most part, Poles, Russians, Germans, and even Finns(!) (who do not speak an Indo-European language and are usually genetic outliers in Europe,) all share their maternal DNA.
Migrants, immigrants, and invaders tend disproportionately to be male (just look at any army) while women tend to stay behind. Invading armies might wipe each other out, but the women of a region are typically spared, seen as booty similar to cattle to be distributed among the invaders rather than killed. Female populations therefore tend to be sticky, in a genetic sense, persisting long after all of the men in an area were killed and replaced. The dominant Y-chromosome haplogroup in the area (R1a) hails from the Indo-European invasion (except in Finland, obviously,) but the mtDNA likely predates that expansion.
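The "sticky" female lineage idea can be sketched with a toy model. Assume each wave of invaders replaces some fixed share of the local male lines and a much smaller share of the local female lines; the replacement rates below are made-up numbers for illustration, not estimates from any study, and the function name is my own.

```python
def surviving_fraction(replaced_per_wave, waves):
    """Share of the original lineages left after repeated partial replacement."""
    return (1.0 - replaced_per_wave) ** waves


# Assumed, illustrative rates: each wave replaces 50% of local male lines
# but only 5% of local female lines. These numbers are NOT from any study.
MALE_REPLACEMENT = 0.50
FEMALE_REPLACEMENT = 0.05

for waves in range(1, 5):
    y = surviving_fraction(MALE_REPLACEMENT, waves)
    mt = surviving_fraction(FEMALE_REPLACEMENT, waves)
    print(f"after {waves} wave(s): local Y-DNA {y:.1%}, local mtDNA {mt:.1%}")
```

Even with these crude assumptions, a handful of waves leaves the local Y-chromosome lines a small minority while most of the local mtDNA persists, which is the pattern described above.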
These data allow us to suggest that Europeans, despite their linguistic differences, originated in the common genetic substratum which predates the formation of the most modern European populations. It seems that considerable genetic similarity between European populations, which has been revealed by mtDNA variation studies, was further accelerated by a process of gene redistribution between populations due to the multiple migrations occurring in Europe during the past millennia…
It is interesting, though, that recent German invasions of Poland left very little in the way of a genetic contribution. I’d wager that WWII was quite a genetic disaster for everyone involved.
If you want more information, Khazaria has a nice list of studies plus short summaries on Polish DNA.
Commentator Unknown123 asks what we can tell about the differences between German and Polish DNA. Obviously German is here referring to one of the Germanic peoples who occupy the modern nation of Germany and speak a Germanic language. But as noted before, just because people speak a common language doesn’t necessarily mean they have a common genetic origin. Germans and English both speak Germanic languages, but Germans could easily share more DNA with their Slavic-language speaking neighbors in Poland than with the English.
It is suggested by geneticists that the movements of Germanic peoples have had a strong influence upon the modern distribution of the male lineage represented by the Y-DNA haplogroup I1, which is believed to have originated with one man, who lived approximately 4,000 to 6,000 years ago somewhere in Northern Europe, possibly modern Denmark … There is evidence of this man’s descendants settling in all of the areas that Germanic tribes are recorded as having subsequently invaded or migrated to.[v] However, it is quite possible that Haplogroup I1 is pre-Germanic, that is I1 may have originated with individuals who adopted the proto-Germanic culture, at an early stage of its development or were co-founders of that culture. Should that earliest Proto-Germanic speaking ancestor be found, his Y-DNA would most likely be an admixture of the aforementioned I1, but would also contain R1a1a, R1b-P312 and R1b-U106, a genetic combination of the haplogroups found among current Germanic speaking peoples. …
According to a study published in 2010, I-M253 originated between 3,170 and 5,000 years ago, in Chalcolithic Europe. A new study in 2015 estimated the origin as between 3,470 and 5,070 years ago or between 3,180 and 3,760 years ago, using two different techniques. It is suggested that it initially dispersed from the area that is now Denmark.
A 2014 study in Hungary uncovered remains of nine individuals from the Linear Pottery culture, one of whom was found to have carried the M253 SNP which defines Haplogroup I1. This culture is thought to have been present between 6,500 and 7,500 years ago.
In 2002 a paper was published by Michael E. Weale and colleagues showing genetic evidence for population differences between the English and Welsh populations, including a markedly higher level of Y-DNA haplogroup I in England than in Wales. They saw this as convincing evidence of Anglo-Saxon mass invasion of eastern Great Britain from northern Germany and Denmark during the Migration Period. The authors assumed that populations with large proportions of haplogroup I originated from northern Germany or southern Scandinavia, particularly Denmark, and that their ancestors had migrated across the North Sea with Anglo-Saxon migrations and Danish Vikings. The main claim by the researchers was:
“That an Anglo-Saxon immigration event affecting 50–100% of the Central English male gene pool at that time is required. We note, however, that our data do not allow us to distinguish an event that simply added to the indigenous Central English male gene pool from one where indigenous males were displaced elsewhere or one where indigenous males were reduced in number … This study shows that the Welsh border was more of a genetic barrier to Anglo-Saxon Y chromosome gene flow than the North Sea … These results indicate that a political boundary can be more important than a geophysical one in population genetic structuring.”
In 2003 a paper was published by Christian Capelli and colleagues which supported, but modified, the conclusions of Weale and colleagues. This paper, which sampled Great Britain and Ireland on a grid, found a smaller difference between Welsh and English samples, with a gradual decrease in Haplogroup I frequency moving westwards in southern Great Britain. The results suggested to the authors that Norwegian Viking invaders had heavily influenced the northern area of the British Isles, but that both English and mainland Scottish samples all have German/Danish influence.
But the original question was about Germany and Poland, not England and Wales, so we are wandering a bit off-track.
A score of “1” on this graph means that the two populations in question are identical–fully inter-mixing. The closer to 1 two groups score, the more similar they are. The further from one they score, (the bigger the number,) the more different they are.
For example, the most closely related peoples on the graph are Austrians and their neighbors in southern Germany and Hungary (despite Hungarians speaking a non-Indo-European language brought in by recent steppe invaders.) Both groups scored 1.04 relative to Austrians, and a 1.08 relative to each other.
Northern and southern Germans also received a 1.08–so southern Germans are about as closely related to northern Germans as they are to Hungarians, and are more closely related to Austrians than to northern Germans.
This might reflect the pre-Roman empire population in which (as we discussed in the previous post) the Celtic cultures of Hallstatt and La Tene dominated a stretch of central Europe between Austria and Switzerland, with significant expansion both east and west, whilst the proto-Germanic peoples occupied northern Germany and later spread southward.
The least closely related peoples on the graph are (unsurprisingly) Spain and the Sami (Lapp) town of Kuusamo in northeastern Finland, at 4.21. (Finns are always kind of outliers in Europe, and Spaniards are kind of outliers in their own, different way, being the part of mainland Europe furthest from the Indo-European expansion starting point and so having received fewer invaders.)
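If you want to play with these numbers yourself, here is a minimal sketch that stores only the pairwise scores quoted in the paragraphs above and sorts them from most to least similar. The dictionary layout and function names are my own choices, not anything from the original study, and pairs not mentioned in the text are simply absent.

```python
# Pairwise scores quoted in the text (1.0 would mean no detectable difference).
scores = {
    frozenset({"Austria", "South Germany"}): 1.04,
    frozenset({"Austria", "Hungary"}): 1.04,
    frozenset({"South Germany", "Hungary"}): 1.08,
    frozenset({"North Germany", "South Germany"}): 1.08,
    frozenset({"Kuusamo", "Spain"}): 4.21,
}


def score(a, b):
    """Look up a pair in either order; returns None if the text gives no value."""
    return scores.get(frozenset({a, b}))


def ranked_pairs():
    """All quoted pairs, most similar (closest to 1) first."""
    return sorted(scores.items(), key=lambda item: item[1])


for pair, value in ranked_pairs():
    print(" / ".join(sorted(pair)), value)
```

Using `frozenset` keys means the order of the two names does not matter, which matches how the graph treats each pair.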
So what does the table say about Germans and their neighbors?
Relative to North Germany:
South Germany 1.08
Czech Repub 1.15

Relative to South Germany:
North Germany 1.08
Czech Repub 1.16

Relative to Poland:
Czech Repub 1.09
North Germany 1.18
South Germany 1.23
Obviously I didn’t include all of the data in the original table; all of the other sampled European groups, such as Italians, Spaniards, and Finns are genetically further away from north and south Germany and Poland than the listed groups.
So northern Germany and Poland are quite closely related–even closer than northern Germans are to the French (whose country is named after a Germanic tribe, the Franks, who conquered it during the Barbarian Migrations at the Fall of the Roman Empire,) or the Swiss, many of whom speak German. By contrast, southern Germany is more closely related to France and Switzerland than to Poland, but still more closely related to the Poles than Italians or Spaniards.
Note: This post still contains a lot of oversimplification for the sake of explaining a few things.
Welcome back to our discussion of the geographic dispersion of humanity. On Tuesday, we discussed how two great barriers–the Sahara desert and the Himalayas + central Asian desert–have impeded human travelers over the millennia, resulting in three large, fairly well-defined groups of humans, the major races: Sub-Saharan Africans (SSA), Caucasians, and east Asians.
Of course, any astute motorist, having come to a halt at the Asian end of our highway, might observe that there is, in fact, a great deal of land in the world that we have not yet explored. So we head to the local shop and pick up a better map:
Our new map shows us navigational directions for getting to Melanesia and Australia–in ice age times, it instructs us, we can drive most of the way. If there isn’t an ice age, we’ll have to take a boat.
The people of Melanesia and Australia are related, the descendants of one of the first groups of humans to split off from the greater tribe that left Africa some 70k years ago.
As the name “Melanesian” implies, they are quite dark-skinned–a result of never having ventured far from the equatorial zone.
Today, they live in eastern Indonesia, Papua New Guinea, Australia, and a smattering of smaller islands. (Notably, the Maori of New Zealand are Polynesians like the Hawaiians, not Melanesians, descendants of a different migration wave that originated in Taiwan.)
There is some speculation that they might have once been wider-spread than they currently are, or that various south-Asian tribes might be related to them, (eg, “A 2009 genetic study in India found similarities among Indian archaic populations and Aboriginal people, indicating a Southern migration route, with expanding populations from Southeast Asia migrating to Indonesia and Australia,”) but I don’t think any mainland group would today be classed as majority Melanesian by DNA.
They may also be related to the scattered tribes of similarly dark-skinned, diminutive people known as the Negritos:
Males from the Aeta (or Agta) people of the Philippines are of great interest to genetic, anthropological and historical researchers, as at least 83% of them belong to haplogroup K2b, in the form of its rare primary clades K2b1* and P* (a.k.a. K2b2* or P-P295*). Most Aeta males (60%) carry K-P397 (K2b1), which is otherwise uncommon in the Philippines and is strongly associated with the indigenous peoples of Melanesia and Micronesia. Basal P* is rare outside the Aeta and some other groups within Maritime South East Asia. …
A study of blood groups and proteins in the 1950s suggested that the Andamanese were more closely related to Oceanic peoples than African Pygmies. Genetic studies on Philippine Negritos, based on polymorphic blood enzymes and antigens, showed they were similar to surrounding Asian populations.
However, the Negritos are a very small set of tribes, and I am not confident that they are even significantly related to each other, rather than just some short folks living on a few scattered islands. We must leave them for another day.
The vast majority of Aborigines and Melanesians live in Australia, Papua New Guinea, and nearby islands. They resemble Africans, because they split off from the rest of the out-of-Africa crew long before the traits we now associate with “whites” and “Asians” evolved, and have since stayed near the equator, but they are most closely related to–sharing DNA with–south Asians (and Indians.)
So we have, here, on the genetic level, a funny situation. Melanesians are–relatively speaking–a small group. According to Wikipedia, there are about 12 million Melanesians and 606,000 Aborigines. By contrast, Tokyo prefecture has 13 million people and the total Tokyo metro area has nearly 38 million. Meanwhile, the Han Chinese–not a race but a single, fairly homogeneous ethnic group–number around 1.3 billion.
Of all the world’s peoples, Melanesians/Aborigines are most closely related to other Asians–but this is a distant relationship, and those same Asians are more closely related to Caucasians than to Aborigines.
As I mentioned on Tuesday, the diagram, because it is 1-dimensional, can only show the distance between two groups at a time, not all groups. The genetic distance between Caucasians and Aborigines is about 50k to 60k years, while the distance between Asians and Caucasians is around 40k years, but the distance between Sub-Saharan Africans and ALL non-SSAs is about 70k years, whether they’re in Australia, Patagonia, or France. Our map is not designed to show this distance, only the distances between individual pairs.
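Here is a rough sketch, in Python, of why a single line cannot show all of these distances at once. It uses only the approximate figures quoted above; the Caucasian-Aborigine figure is given as a range, so the value of 60 used here is my assumption, and the function names are made up for illustration. The check relies on a simple fact: three points can sit on one line only if the largest of their three distances equals the sum of the other two.

```python
# Approximate divergence times in thousands of years, as quoted in the text.
# Assumption: the Caucasian-Aborigine figure is a range (50-60k); 60 is used here.
# The Asian-Aborigine distance is not given in the text, so it is left out.
DIVERGENCE_KYA = {
    frozenset({"Sub-Saharan African", "Caucasian"}): 70,
    frozenset({"Sub-Saharan African", "Asian"}): 70,
    frozenset({"Sub-Saharan African", "Aborigine"}): 70,
    frozenset({"Caucasian", "Asian"}): 40,
    frozenset({"Caucasian", "Aborigine"}): 60,
}


def distance(a, b):
    """Look up the divergence time between two groups (order does not matter)."""
    if a == b:
        return 0
    key = frozenset({a, b})
    if key not in DIVERGENCE_KYA:
        raise ValueError(f"no distance recorded for {a} and {b}")
    return DIVERGENCE_KYA[key]


def fits_on_a_line(a, b, c):
    """True only if the three groups could be placed on one straight line."""
    d = sorted([distance(a, b), distance(b, c), distance(a, c)])
    # On a line, the longest distance must equal the sum of the two shorter ones.
    return d[2] == d[0] + d[1]


if __name__ == "__main__":
    print(fits_on_a_line("Sub-Saharan African", "Caucasian", "Asian"))
```

With these numbers the check comes out false for Africans, Caucasians, and Asians (70 is not 70 + 40), which is just the point made above in different words: a line can honor one pair at a time, not every pair together.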
Now if we hopped back in our car and zoomed back to the beginning of our trip, pausing to refuel in Lagos, we’d note another small group that has been added to the other end of the map: the Bushmen, aka the Khoi-San people. Wikipedia estimates 90,000 San and doesn’t give an estimate for the Khoi people, but their largest group, the Nama, has about 200,000 people. We’ll estimate the total, therefore, at around 500,000 people, just to be safe.
The Bushmen are famous for being among the world’s last hunter-gatherers; their cousins the Khoi people are pastoralists. There were undoubtedly more of them in the past, before both Europeans and Bantus arrived in southern Africa. Some people think Bushmen look a little Asian, because their complexions are lighter than those of their more equatorial African cousins.
Mitochondrial DNA studies also provide evidence that the San carry high frequencies of the earliest haplogroup branches in the human mitochondrial DNA tree. This DNA is inherited only from one’s mother. The most divergent (oldest) mitochondrial haplogroup, L0d, has been identified at its highest frequencies in the southern African San groups.
In a study published in March 2011, Brenna Henn and colleagues found that the ǂKhomani San, as well as the Sandawe and Hadza peoples of Tanzania, were the most genetically diverse of any living humans studied. This high degree of genetic diversity hints at the origin of anatomically modern humans.
Recent analysis suggests that the San may have been isolated from other original ancestral groups for as much as 100,000 years and later rejoined, re-integrating the human gene pool.
A DNA study of fully sequenced genomes, published in September 2016, showed that the ancestors of today’s San hunter-gatherers began to diverge from other human populations in Africa about 200,000 years ago and were fully isolated by 100,000 years ago … 
So the total distance between Nigerians and Australian Aborigines is 70k years; the distance between Nigerians and Bushmen is at least 100k years.
When we zoom in on the big three–Sub-Saharan Africans, Caucasians, and Asians–they clade quite easily and obviously into three races. But when we add Aborigines and Bushmen, things complicate. Should we have a “race” smaller than the average American city? Or should we just lump them in with their nearest neighbors–Bushmen with Bantus and Aborigines with Asians?
I am fine with doing both, actually–but wait, I’m not done complicating matters! Tune in on Monday for more.
Note: There is a territorial dispute between India and Pakistan. I am not trying to wade into that dispute or pass judgment on who really controls what. Also, I don’t know what distinguishes the 4 Gujarati samples, so they’re just in ABC order.
And finally, greater Asia (plus Australia):
Note that I had to leave off some groups from this map that appeared on earlier maps, like most of the Caucasian ethnicities. (Note that central Siberia is not actually as badly sampled as it looks, because this is a Mercator projection which makes Siberia look bigger than it actually is. Yes, I know, I don’t like Mercator projections, either, but it’s hard to find a nice, blank map with Asia on the left and Alaska on the right, and a cylindrical projection allows me to just switch the two halves without messing up the angles of the continents.)
“White” is a nebulous category. “Black” is actually easier to define, because there’s a pretty hard boundary (the Sahara) between black Africa and everywhere else. To be fair, there are also groups like the Bushmen (who are more tawny brownish,) and the Pygmies who are genetically separate from other sub-Saharan Africans by over 100,000 years, but these are pretty small on the global scale. But “whites” and “Asians” occupy the same continent, and thus shade into each other.
If we use a strictly skin tone definition (as the word “white” implies) we can just pull up a map of global skin tone variation:
Of course, this implies that either Spaniards and Finns aren’t white, or Chinese and Eskimos are. Either way is fine, of course, though this would contradict most people’s usage. (And I kind of question that data on the Finns.)
These composites of faces from around the world offer us some more data, though depending on how they were made, they may not accurately reflect skin tone in all countries (i.e., if the creator relied on pictures of famous people available on the internet, then these will reflect local beauty norms rather than group averages.)
(Plus, I wonder why the Romanians are pink.)
J. B. Huang has taken some of the Eurasian faces from this set and gone through the effort of trying to quantify their shapes, as displayed in this graph (at least, that’s what I think they’re doing):
Interestingly, while some of the faces cluster together the way you might expect–China, Taiwan, Korea, and Japan are all near each other, as are Belgium and the Netherlands–many of the groupings are near random, eg, Mongolia, Turkey, and the Philippines. Hungary and Austria are closer to India and Japan than to Poland or Finland. The European faces are all over the map.
Maybe this doesn’t mean anything at all, or maybe it means that there’s a lot of variation in European faces.
This is actually not too surprising, given that modern Europeans are genetically descended from three different groups who conquered the peninsula in successive waves, leaving more or less of their DNA in different areas: the hunter-gatherers who were there first, followed by farmers who spread out from Anatolia (modern Turkey,) followed by the “Indo-Europeans” aka the Yamnaya, who were part hunter-gatherer (by DNA, not profession) and part another group whose origins have yet to be located, but which I call the “teal people” because their DNA is teal on Haak’s graph.
Oh yes, we are getting to Haak.
This isn’t the full graph, but it’s probably enough for our purposes. The European countries show a characteristic profile of Orange, Dark Blue, and Teal. (By contrast, the east Asian countries, which cluster closely together on the facial map, are mostly yellow with only a bit of red.)
Obviously DNA isn’t actually colored. It’s just a visual aid.
Haak’s graph makes it fairly easy to rule out the groups that are definitely different (at least genetically.) The American Indians, Inuit, West Africans, Chinese, and Aborigines are distinctly out. This leaves us with Europe, the Middle East, North Africa, India, and parts of central Asia/Siberia:
The Orange-centric region, which Haak et al arranged to display the movements of the Anatolian farmer people.
The heavily teal Indian section (The middle part, from Hazara to Tlingit, is obviously not Indian).
And finally some Siberian DNA.
Now, I could stare at these all day; I love them. They tell so many fascinating stories about people and where they went. Of the three ancestries found in Europeans, the oldest, the dark blue (hunter-gatherers,) is found throughout India, Siberia, and even the Aleutian islands (though I caution that some of this could just be because of Russians raping the Aleuts back in the day.) The dark blue appears to hit a particular low point in the Caucasus region, which of course is about where the teal got its start.
The orange–Anatolian farmers–shows up throughout the Middle East and Europe, but is near totally absent in India and Siberia. (Not much farming in Siberia!)
At a lower resolution (not pictured,) India, central Asia, and Siberia appear to have a mix of–broadly speaking–“European” and “Asian” ancestry. (Not too surprising, since they are in the middle of the continent.) Obviously the middle of Asia is a big crossroads between different groups–red (Siberian) yellow (east Asian) teal and dark blue, and bits of the same DNA that shows up in the Eskimo (Inuit) and Aleuts.
But this is all kind of complicated. Luckily for us, this is only one way to visualize DNA–I’ve got others!
If you’re not familiar with these sorts of trees, the basic story is that geneticists gathered DNA samples (from spit, I think, which is pretty awesome,) from ethnic groups from all over the world, and then measured how many genes they have in common. More genes in common = groups more closely related to each other. Fewer genes = more genetic distance from each other.
Since different teams use different genetic samples and computer models, they have produced slightly different genetic trees.
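For the curious, here is a minimal sketch of the general idea of turning pairwise distances into a tree. It uses plain average-linkage clustering (repeatedly merging the two closest clusters), which is a simple textbook approach and not necessarily what any of the teams behind these trees actually used. The groups A through D and their distances are invented toy values, not real data, and the function names are my own.

```python
from itertools import combinations

# Hypothetical toy distances between four made-up groups (not real data).
TOY_DISTANCES = {
    frozenset({"A", "B"}): 2,
    frozenset({"C", "D"}): 3,
    frozenset({"A", "C"}): 8,
    frozenset({"A", "D"}): 8,
    frozenset({"B", "C"}): 8,
    frozenset({"B", "D"}): 8,
}


def average_distance(members_a, members_b, dist):
    """Mean pairwise distance between two clusters (average linkage)."""
    pairs = [(a, b) for a in members_a for b in members_b]
    return sum(dist[frozenset(pair)] for pair in pairs) / len(pairs)


def build_tree(dist):
    """Merge the two closest clusters until one is left; return nested tuples."""
    names = sorted({name for pair in dist for name in pair})
    # Each cluster is (tree, members): tree is a name or a nested tuple.
    clusters = [(name, [name]) for name in names]
    while len(clusters) > 1:
        # Find the indices of the two clusters with the smallest average distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_distance(
                clusters[ij[0]][1], clusters[ij[1]][1], dist
            ),
        )
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


if __name__ == "__main__":
    print(build_tree(TOY_DISTANCES))
```

In this toy case A pairs with B and C pairs with D before the two pairs join at the root. Change a sample or a distance a little and the merge order can change too, which is one way to picture why different teams end up with slightly different trees.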
Note that since the tree is constructed by comparing # of genes two groups have in common, a group could end up in a particular spot because it is descended from a common ancestor with other nearby groups, or because of mixing between two groups. Ashkenazi Jews, for example, cluster with southern Europeans because they’re about half Italian (and obviously half ancient Israeli.) Here’s another chart, giving us another perspective:
This chart also shows us genetic differences between groups, with strong clustering among Africans and East Asians, respectively, and then a sort of scattered group of Europeans and Indians (South Asians.)
Neither of these graphs shows Siberians or central Asians in great detail, because they are tiny groups, but I think it’s safe to say the Siberians at least cluster near their neighbors, the other Asians and far-north Americans.
The central and south Asians, though, are quite the interesting case!
Between archaeology and genetics, we’ve been able to trace the path of human expansion, from central Africa to the world:
Well, ultimately, there’s no hard division between most ethnic groups or races–you can draw dividing lines where you want them. The term “white” implies dermal paleness, of course, so you may prefer a narrower definition for “white” than “Caucasian.” Greater minds than mine have already covered the subject in more authoritative detail, of course. I merely offer my thoughts for entertainment.
Disclaimer: I am not a geneticist. For those of you who are new here, this is basically a genetics fan blog. I am trying to learn about genetics, and you know what?
Genetics is complicated.
I fully admit that there’s a lot of stuff that I don’t know yet, nor fully understand.
Luckily for me, there are a few genetics basics that are easy enough to understand that even a middle school student can master them:
“Evolution” is the theory that species change over time due to some individuals within them being better at getting food, reproducing, etc., than other individuals, and thereby passing on their superior traits to their children.
“Genes,” (or “DNA,”) are the biological code for all life, and the physical mechanism by which traits are passed down from parent to child.
“Mendel squares” work for modeling the inheritance of simple traits
More complicated traits are modeled with more complicated math
Lamarckism doesn’t work.
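To make the “Mendel squares” item in the list above concrete (they are usually called Punnett squares), here is a small sketch for a single gene with two alleles. The function name and the 'A'/'a' allele letters are just illustrative choices of mine; this is the standard middle-school version and ignores everything that makes real traits messier.

```python
from collections import Counter
from itertools import product


def punnett_square(parent1, parent2):
    """Count offspring genotypes for a one-gene cross, e.g. 'Aa' x 'Aa'."""
    offspring = Counter()
    # Each parent passes on one of its two alleles; try every combination.
    for allele1, allele2 in product(parent1, parent2):
        # Sort so that 'aA' and 'Aa' count as the same genotype
        # (uppercase letters sort before lowercase ones).
        genotype = "".join(sorted(allele1 + allele2))
        offspring[genotype] += 1
    return offspring


if __name__ == "__main__":
    for genotype, count in sorted(punnett_square("Aa", "Aa").items()):
        print(genotype, count)
```

Crossing two 'Aa' parents fills the four boxes of the square with one 'AA', two 'Aa', and one 'aa', which is the familiar 1:2:1 ratio.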
Lamarck was a naturalist who, in the days before genes were discovered, theorized that creatures could pass on “acquired” characteristics. For example, an animal with a relatively normal neck in an area with tall trees might stretch its neck in order to reach the tastiest leaves, and then pass on this longer-neck to its children, who would also stretch their necks and then pass on the trait to their children, until you get giraffes.
A fellow with similar ideas, Lysenko, was a Soviet scientist who thought he could make strains of cold-tolerant wheat simply by exposing wheat kernels to the cold.
We have the luxury of thinking that Lysenko’s ideas sound silly. The Soviet peasants had to actually try to grow his wheat, and scientists who pointed out that this was nonsense got sent to the gulag.
The problem with Lamarckism is that it doesn’t work. You can’t make wheat grow in Antarctica by sticking it in your freezer for a few months and animals don’t have taller babies just because you stretch their necks.
Pop science articles talk about epigenetics as if it were Lamarckism. Through the magic of epigenetic markers, acquired traits can supposedly be passed down to one’s children and grandchildren, infinitely.
Actual epigenetics, as scientists actually study it, is a real and interesting field. But the effects of epigenetic changes are not so large and permanent as to substantially change most of the way we model genetic inheritance.
Epigenetics is, in essence, part of how you learn. Suppose you play a disturbing noise every time a mouse smells cherries. Pretty soon, the mouse would learn to associate “fear” and “cherry smell,” and according to Wikipedia, this gets encoded at the epigenetic level. Great, the mouse has learned to be afraid of cherries.
If these epigenetic traits get passed on to the mouse’s children–I am not convinced this is possible but let’s assume it is–then those children can inherit their mother’s fear of cherries.
This is pretty neat, but people take it too far when they assume that as a result, the mouse’s fear will persist over many generations, and that you have essentially just bred a new, cherry-fearing strain of mice.
You see, you learn new things all the time. So do mice. Your epigenetics therefore keep changing throughout your life. The older you are, the more your epigenetics have changed since you were born. This is why even identical twins differ in small ways from each other. Sooner or later, the young mice will figure out that there isn’t actually any reason to be afraid of cherries, and they’ll stop being afraid.
If people were actually the multi-generational heirs of their ancestors’ trauma, pretty much everyone in the world would be affected, because we all have at least one ancestor who endured some kind of horrors in their life. The entire continent of Europe should be a PTSD basket case due to WWI, WWII, and the Depression.
Thankfully, this is not what we see.
Epigenetics has some real and very interesting effects, but it’s not Lamarckism 2.0.