1 Can human behaviour be studied scientifically?
The human sciences — psychology, economics, sociology, anthropology, political science — aspire to apply the methods of science to human behaviour. The ambition is justified: if we can know the world through systematic observation and experiment, why shouldn’t this apply to the most important thing in it, which is us? But the application raises a fundamental problem that distinguishes the human sciences from the natural sciences in ways that go beyond technique.
The SNB EUR/CHF Floor and Its Removal (2011–2015)
On 6 September 2011, with the Swiss franc rising sharply against the euro during the European sovereign-debt crisis, the Swiss National Bank announced an exchange-rate floor: it would no longer tolerate a EUR/CHF exchange rate below CHF 1.20 and was prepared to buy unlimited amounts of foreign currency to defend the floor.1 The policy worked because market participants believed the SNB would defend it. Trading firms, exporters, mortgage banks, and international corporations restructured their balance sheets and their hedging on the assumption of CHF 1.20; Polish, Croatian, and Hungarian retail customers signed mortgages denominated in Swiss francs; Swiss watchmakers and pharmaceutical exporters built three-year forecasts on the same number. By December 2014 the SNB’s foreign-exchange reserves had grown to roughly 85 % of Swiss GDP — the largest balance sheet relative to GDP of any major central bank in modern times.2 On 15 January 2015 at 10:30 CET, SNB Chairman Thomas Jordan announced abandonment of the floor without prior consultation with the EU or with Swiss-bank customers.3 Within two minutes the franc appreciated approximately 30 % against the euro. KOF research subsequently estimated that Swiss exporters lost CHF 4.5 billion of margin in 2015. Polish frankowicze — over 700,000 households who had taken out CHF-denominated mortgages between 2007 and 2011 — found their loan principal in złoty up by roughly a third overnight; the resulting Polish judicial proceedings have run continuously since 2015 and the Polish Supreme Court ruled against the banks in April 2024. Several FX brokers (Alpari UK, FXCM in part) went insolvent the same day. 
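The arithmetic of the frankowicze shock is simple enough to sketch. The figures below are hypothetical placeholders (a round loan size and a stylised 30 % franc appreciation of the order the text describes), not actual 2015 market rates:

```python
# Illustrative only: hypothetical figures, not historical CHF/PLN rates.
def pln_exposure(principal_chf: float, chf_pln: float) -> float:
    """Zloty value of a CHF-denominated loan principal at a given CHF/PLN rate."""
    return principal_chf * chf_pln

principal = 100_000.0            # hypothetical CHF loan principal
rate_before = 3.50               # hypothetical CHF/PLN rate the day before the floor fell
rate_after = rate_before * 1.30  # a stylised 30% franc appreciation

before = pln_exposure(principal, rate_before)
after = pln_exposure(principal, rate_after)
print(f"Principal in PLN before: {before:,.0f}")
print(f"Principal in PLN after:  {after:,.0f} (+{after / before - 1:.0%})")
```

A borrower's income is in złoty; the principal is in francs. Nothing about the loan contract changes on the day of the announcement, yet the złoty value of the debt jumps by exactly the size of the currency move.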
The case is the cleanest contemporary illustration of the phenomenon whose canonical instance is the Black–Scholes options-pricing model of 1973 (treated below in the body), transposed into a monetary-policy register: a model held by participants creates the conditions the model assumes; once the participants stop believing, the model breaks; and the breakdown is visible in mortgage-payment notices that arrive in the post the next month.
1.1 The Object/Subject Problem
In physics, the objects of study — particles, waves, fields — do not know they are being studied. They cannot change their behaviour in response to the physicist’s attention, cannot understand the theory being used to describe them, and cannot adjust their properties in light of the conclusions drawn. The physicist studies a world that is genuinely independent.
In the human sciences, the object of study is human beings. Human beings are conscious, self-interpreting, theory-reading agents. They know they are being studied. They can read the published findings of sociology and adjust their behaviour accordingly. They can understand what experimenters expect and either conform to or subvert those expectations. The sociological law that “rising incomes reduce crime” can be rendered false if people read it, believe it, and use it as an excuse not to invest in crime prevention. There are no such feedback loops in physics.
The feedback loop is not a technical problem to be solved by better experimental control; it is a structural feature of the relationship between the human sciences and their subject matter.
1.2 Weber’s Verstehen
Max Weber, in The Methodology of the Social Sciences (essays published 1904–1917), argued that social science requires a distinct kind of understanding that is irreducible to the causal explanation used in natural science. He called it Verstehen — understanding.4
To explain why a Catholic population has lower rates of economic productivity than a Protestant one (Weber’s question in The Protestant Ethic and the Spirit of Capitalism, 1905),5 it is not sufficient to identify statistical correlations. You need to understand the meaning that economic activity has for agents shaped by different religious frameworks — the sense in which the Protestant doctrine of calling made mundane labour a spiritual vocation, while Catholic frameworks provided different routes to salvation. Causal explanation without Verstehen produces correlations without understanding.
Weber’s claim is epistemological: understanding human action requires grasping the subjective meaning it has for the actor, and this kind of grasping is different from the third-person causal knowledge that natural science produces.
Peter Winch, in The Idea of a Social Science (1958),6 radicalises Weber’s point: understanding a social practice is not like explaining a natural phenomenon at all. To understand what a rule-governed activity (a religious ritual, a legal proceeding) is requires grasping the concepts that constitute it from the inside. An account that translates these concepts into a causal vocabulary distorts them.
Individualism vs. holism. Jon Elster, in Explaining Social Behavior (rev. ed. 2015), insists that “in principle, explanations in the social sciences should refer only to individuals and their actions” — talk of households, classes, or nations is shorthand for the choices of the individuals who compose them.7 The macro counterweight is functionalism: the doctrine that social institutions are explained by the function they serve in maintaining the social system (Durkheim’s “social facts,” Parsons’s structural functionalism, Merton’s manifest/latent distinction). Carl Hempel’s 1959 critique is that even if a rain dance reinforces group identity, “at most, one can explain that the rain dance or some functional equivalent exists” — the function does not determine the mechanism.8 G. A. Cohen defended functional explanation in Marxism by arguing that consequence laws can be confirmed without specifying the underlying mechanism.9 Elster’s reply: a functional explanation that does not specify the feedback loop by which the beneficial consequence selects for the behaviour producing it is “worthless.”10
1.3 Reasons and Causes
Weber’s Verstehen leaves an open question that the 20th century took half a century to answer. If understanding human action requires grasping the agent’s reasons — beliefs, desires, intentions — and explaining a natural phenomenon requires identifying its causes, are reasons and causes two different categories of explanation, or two descriptions of the same thing?
Donald Davidson’s “Actions, Reasons, and Causes” (Journal of Philosophy, 1963) is the canonical answer. Davidson argued, against the then-dominant Wittgensteinian view that reasons-explanations and causal explanations belong to logically distinct categories, “the ancient — and commonsense — position that rationalization is a species of causal explanation.”11 Giving an agent’s reason for an action — the belief and desire that together rationalise it — is not an alternative to identifying the cause of the action; the reason is the cause, under a particular description. The “primary reason” for an action, on Davidson’s account, “is its cause.”12
Take an everyday case. “She shouted because she was angry.” The clause names the reason — what made the shout intelligible to anyone who knows her — and the cause — what produced the shout. There are not two events here, the reason-event and the cause-event; there is one event under two descriptions.
The implication for social-scientific method is structural rather than rhetorical. Reasons and causes do not mark off two epistemically independent domains; an interpretive social science that grasps an agent’s meaning has not thereby exited the space of causal claims.
Rosenberg draws the consequence: if Davidson is right, the categorial gap between Verstehen and Erklären (understanding and causal explanation) that Weber and the German hermeneutic tradition treated as foundational collapses into a methodological difference — reasons require interpretive access, causes do not — without a difference in the logical type of the explanation produced.13 Whether Davidson’s argument actually does the work Rosenberg asks of it is itself contested in the philosophy of action (Anscombe, Sellars, the more recent neo-Wittgensteinian programme of Hutto and others have all denied it).
This does not settle the Verstehen debate; it relocates it. Even if reasons are causes, the question remains whether the social sciences can identify them with the rigour the natural sciences identify their causes — and whether, as Winch insists, grasping the concepts that constitute a rule-governed practice is a precondition of identifying any reasons at all.
1.4 Ethnography in Action: Hochschild’s Flight Attendants
The Verstehen / Erklären dispute can be made concrete by considering what a properly Verstehen-shaped piece of social science actually looks like. Arlie Russell Hochschild, The Managed Heart: Commercialization of Human Feeling (University of California Press, 1983), is the standard exemplar. Hochschild trained with Delta Air Lines flight attendants — observing the company’s training programmes, riding flights as passenger and observer, interviewing trainees and senior staff over three years — and produced a sustained account of what she called emotional labour: the active management of one’s own felt and displayed emotions as a condition of paid employment. Her central concepts (surface acting, in which the displayed emotion does not match the felt one; deep acting, in which the worker actually persuades herself to feel what the role requires; the transmutation of private emotional capacities into a managed work product) are not produced by any econometric instrument; they are produced by sustained close observation of what trained people do and what they say about it. Forty years on, emotional labour is a load-bearing concept across labour economics, the sociology of work, organisation studies, and the contemporary literature on care work and burnout — and the path by which it entered all those fields was not statistical but ethnographic. The point for present purposes is not that ethnography replaces econometrics but that any account of “evidence in the human sciences” that treats the experiment and the regression as exhaustive misses where many of the field’s most-cited concepts came from. The same point can be made using Erving Goffman’s frontstage / backstage distinction (The Presentation of Self in Everyday Life, 1959), or Elinor Ostrom’s fieldwork on Maine lobster fisheries (Governing the Commons, 1990) — both produced through observational and interview methods that no quantitative reformulation captures without remainder.
1.5 The Hawthorne Effect
The Hawthorne Works was a Western Electric factory in Cicero, Illinois, where, between 1924 and 1932, a series of experiments was conducted to test the effects of working conditions (lighting levels, break schedules, pay structures) on worker productivity.14 The findings were puzzling: almost every change in conditions — better lighting, worse lighting; longer breaks, shorter breaks — produced improved productivity. The researchers concluded that the workers were responding not to the specific changes but to the fact of being observed and studied.
The Hawthorne Effect has become the name for this general phenomenon: humans change their behaviour when they know they are being watched. The epistemological implication is severe: any study of human behaviour that involves observation may be measuring something other than natural human behaviour — it may be measuring how humans behave when they know they are being measured.
Later researchers challenged the Hawthorne findings sharply: Levitt and List (2011), re-examining the original Western Electric records, were essentially unable to detect the classical Hawthorne effect at all in the lighting study — the productivity gains they did find could be explained by ordinary trends and incentives, not by observation per se.15 The “Hawthorne effect” survives as a pedagogical label more securely than as a clean empirical finding traceable to the original studies. But the general principle it stands for — that observation is rarely neutral in the human sciences, that the act of measuring can change what is being measured — remains methodologically important; it is just that the textbook anchor is shakier than the principle requires.
1.6 The Performativity of Social Theory
There is a deeper version of the observer problem that goes beyond the Hawthorne effect. When social scientists publish their findings about human behaviour, those findings enter the world — and change the behaviour they describe. This is the performativity of social theory: the human sciences do not merely describe a pre-existing social reality; they actively constitute it.
The clearest case is economics. When Black and Scholes published their option-pricing model (1973), traders began using it. The model became self-fulfilling: markets behaved according to the formula not because the formula captured a pre-existing truth about markets, but because market participants started using it as the basis for decisions. Donald MacKenzie (An Engine, Not a Camera, 2006) calls this Barnesian performativity: a theory that is used by enough actors in a system tends to make the system conform to the theory, because the actors reshape their behaviour to match the model’s assumptions.16
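The object that traders adopted in MacKenzie's story is a single closed-form formula, compact enough to state. A minimal sketch of the 1973 price for a European call option, in standard notation (spot S, strike K, risk-free rate r, volatility sigma, time to expiry T); the numerical inputs below are illustrative, not drawn from any historical market:

```python
from math import log, sqrt, exp, erf

def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black_scholes_call(S: float, K: float, r: float, sigma: float, T: float) -> float:
    """Black-Scholes (1973) price of a European call option."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

# Illustrative inputs: spot 100, strike 100, 5% rate, 20% volatility, one year.
print(round(black_scholes_call(100, 100, 0.05, 0.20, 1.0), 2))  # → 10.45
```

The performativity claim is not about the mathematics, which is uncontroversial, but about adoption: once enough desks priced off the same formula, observed option prices converged toward its outputs.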
A weaker analogue may appear in psychology. After Kahneman and Tversky’s findings about cognitive biases became widely known, the bias literature was absorbed by behavioural economics, by trader-training, and by debiasing programmes — the Nudge programme (below) assumes that this absorption is incomplete and that biases survive awareness. The honest performativity claim here is narrower than for Black–Scholes: salient, high-stakes presentations can dampen specific biases for trained populations,17 but the claim that Thinking, Fast and Slow “changed the population it described” overstates the debiasing literature. Performativity in economics — where market participants act on the model and reshape prices — is structural; performativity in cognitive psychology is, at most, an unevenly successful intervention.
This is not the Hawthorne effect. The Hawthorne effect shows that the act of observation changes behaviour during the study. Performativity shows that the published theory changes behaviour permanently, restructuring what it was supposed to describe. In the natural sciences, publishing a paper about black holes does not change how black holes behave. In the human sciences, publishing a paper about how humans make decisions changes how humans make decisions. The feedback loop between theory and phenomenon is structural, not incidental.
The replication crisis is partly a Hawthorne/performativity problem. Many failed replications involve social priming effects — findings that may have been real in a particular cultural moment but dissolved as the population’s awareness of them changed. The distinction between what the human sciences discover and what they produce is genuinely difficult to draw.
Forced Fork: Did the SNB EUR/CHF Floor Produce Knowledge About the Swiss Franc?
The case is in the info-box above. From 6 September 2011 the SNB committed to buy unlimited euros at the floor price of CHF 1.20. The policy worked because participants believed the SNB would defend it; the franc traded at or just above CHF 1.20 for over three years; mortgage products, hedging, exporter pricing, and central-bank reserves were all restructured around it. On 15 January 2015 the SNB abandoned the floor; within minutes the equilibrium price the policy had created disappeared. The earlier Black–Scholes case (treated above in the body) raises the same question in canonical form for options markets in 1973–87.
Position A (yes, the SNB had knowledge): The human sciences can produce knowledge even when their findings shape the phenomena. Medicine does this all the time — a diagnostic category like “post-traumatic stress disorder” changes how patients understand themselves and how clinicians treat them. We do not thereby deny that medicine is a science. The SNB had a calibrated macroeconomic model; the floor implemented a sound monetary-policy decision given the eurozone crisis; the policy was abandoned when its costs (a balance sheet of 85 % of GDP) outweighed its benefits. The fact that the price-level CHF 1.20 was produced by the policy rather than independently discovered by it is a feature of monetary policy generally — every central bank target works this way — and does not disqualify the SNB’s analysis as a piece of economic knowledge. Black–Scholes is the same shape: a model that becomes accurate as it is used is still a model whose accuracy can be verified.
Position B (no, the SNB intervention shows performativity breaks the knowledge claim): There is a fundamental difference between medicine’s therapeutic reshaping of patients and the SNB’s making the franc match a previously fictional level. In the medical case, a real illness pre-exists the category. In the SNB case, the price level CHF 1.20 became an equilibrium because the central bank committed to defend it; on 15 January 2015, the moment that commitment ended, the equilibrium ceased to exist. A “science” whose predictions become true because its practitioners act on them is not describing an independent domain; it is participating in that domain’s construction. The same applies to Black–Scholes: the formula became accurate because traders adopted it, not because it tracked a pre-existing regularity. The SNB’s monetary-policy expertise is a kind of practical knowledge; calling it scientific knowledge in the way that Newton’s mechanics is scientific knowledge is a category mistake.
Choose one. If you pick A, explain why MacKenzie’s finding for Black–Scholes — that the pricing of out-of-the-money options matched the formula only after the formula was published — does not count against the claim that the model was tracking a pre-existing regularity. The same point applies to the EUR/CHF floor: the equilibrium at 1.20 only existed while the SNB defended it. If you pick B, specify what could count as a piece of scientific knowledge about a market or a currency, given that any successful theory will eventually influence trader and central-bank behaviour and thus its own subject matter.
1.7 Questions to Argue About
- If humans can change their behaviour in response to theories about them, does this mean the human sciences are fundamentally different from the natural sciences — or just more complicated?
- Weber’s Verstehen implies that explaining human action requires grasping its subjective meaning. Can this be done rigorously? Or does it introduce unacceptable levels of interpretive latitude?
- The Hawthorne Effect suggests that any observational study of human behaviour is contaminated by the observation itself. Is there any way out of this problem? Or must the human sciences accept it as a permanent limitation?
- If social-scientific theories are performative — if publishing them changes the phenomena they describe — does this make the human sciences less objective, or just differently objective from the natural sciences?
- Davidson says reasons just are causes (under a particular description). Weber, Winch, and the Verstehen tradition treat reason-explanations and causal explanations as categorially distinct. If you side with Davidson, what is left of Verstehen as a distinctive method? If you side with the Verstehen tradition, what do you say to the charge that you are simply giving causes a different name?
2 Are humans rational?
Mainstream economics, much political and moral philosophy, and most of the law are built on the assumption that humans are rational. Decades of empirical research have shown the assumption to be systematically wrong as a description of how people actually make decisions.
Amos Tversky, Daniel Kahneman, and the Asian Disease Problem
In 1981, Amos Tversky and Daniel Kahneman published a study in Science in which 152 Stanford undergraduates and physicians were presented with a public health scenario.18 Participants were randomly assigned one of two frames. In the first: Programme A saves 200 of the 600 people at risk for certain; Programme B has a one-third probability of saving all 600 and a two-thirds probability of saving no one. In the second: Programme C results in 400 deaths for certain; Programme D has a one-third probability of no deaths and a two-thirds probability of 600 deaths. Programmes A and C are identical, as are B and D — only the framing differs. In the first frame, 72 per cent chose Programme A; in the second frame, 78 per cent chose Programme D. The two preference patterns are jointly inconsistent, and participants typically acknowledged the inconsistency themselves once the equivalence was pointed out. The same pattern appeared when the experiment was run with experienced physicians making clinical decisions. The finding was devastating for the homo economicus model of rational choice, which requires consistent preferences across logically equivalent descriptions. Kahneman and Tversky's prospect theory, which models how people actually make decisions under uncertainty, has since become one of the most cited bodies of work in the history of economics.
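The equivalence that the framing obscures can be checked mechanically: all four programmes have the same expected number of survivors. A minimal verification (assuming the standard two-thirds complements for the two gambles):

```python
def expected_survivors(outcomes) -> float:
    """Expected number of survivors for a list of (probability, survivors) pairs."""
    return sum(p * n for p, n in outcomes)

# The four programmes from Tversky and Kahneman (1981), 600 people at risk.
prog_a = [(1.0, 200)]             # 200 saved for certain
prog_b = [(1/3, 600), (2/3, 0)]   # one-third chance all 600 are saved
prog_c = [(1.0, 600 - 400)]       # 400 die for certain, so 200 survive
prog_d = [(1/3, 600), (2/3, 0)]   # one-third chance no one dies

for name, prog in [("A", prog_a), ("B", prog_b), ("C", prog_c), ("D", prog_d)]:
    print(name, round(expected_survivors(prog), 6))  # each prints 200.0
```

Rational-choice theory requires indifference between A and C and between B and D; the observed 72 / 78 per cent split is what violates it.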
2.1 Homo Economicus
Adam Smith’s analysis in The Wealth of Nations (1776) rests on the assumption that individuals act in their own self-interest in ways that, through the mechanism of the market, produce collective benefit — the “invisible hand.”19 This is not quite the same as assuming people are fully rational, but it became the basis for the model of Homo economicus developed in 19th and 20th century economic theory: an agent who knows their preferences, has consistent beliefs, and makes choices to maximise expected utility.
The model has real theoretical virtues: it is tractable, it generates precise predictions, and it captures something genuine about human motivation in competitive market contexts. It also happens to describe a kind of being that does not exist — not because human beings are irrational in random ways, but because they are irrational in consistent, predictable, and thoroughly documented ways. This distinction matters: if irrationality were random, it would cancel out. The disturbing finding is that it does not.
Two further caveats before Homo economicus gets used as a punching bag. First, working economists rarely treat the rational-agent model as a literal description of psychology — Milton Friedman’s 1953 “as if” methodology argued explicitly that the question is not whether agents are rational but whether their behaviour can be predicted as if they were, and Becker’s economic-imperialism programme extends the model to crime, family, and discrimination on the same instrumentalist grounds. Second, Smith himself was more behaviourally complex than the textbook caricature: the Wealth of Nations grounds market exchange not in maximisation but in “the propensity to truck, barter, and exchange” — and locates motivation in an interplay of self-love, fellow-feeling, and status, not in expected-utility calculation.20
Amartya Sen’s “Rational Fools” (1977) is the canonical philosophical critique that calibrates rather than dismisses rational choice — pointing out that an agent with a single preference ordering, indifferent to commitment and sympathy, is “a social moron.” The behavioural-economics work below refines the rational-choice apparatus; it does not refute it wholesale.
2.2 Kahneman and Tversky
Daniel Kahneman and Amos Tversky spent decades documenting the systematic ways in which human judgement deviates from the predictions of rational choice theory. Their key findings, summarised in Kahneman’s Thinking, Fast and Slow (2011):21
Anchoring. When asked to estimate a quantity (the population of Chicago, the length of the Amazon), people’s estimates are heavily influenced by arbitrary numbers they were exposed to just before — even when those numbers were generated by a roulette wheel in front of them and they know the numbers are random. Rational agents should not be influenced by irrelevant anchors. (Original demonstration: Tversky and Kahneman, “Judgment Under Uncertainty: Heuristics and Biases,” Science 185, 1974.)22
Loss aversion. People weigh losses approximately twice as heavily as equivalent gains. A prospect of losing £100 is more aversive than a prospect of gaining £100 is attractive. This violates the symmetry required by standard utility theory and produces a range of irrational choice patterns (status quo bias, endowment effect, risk aversion in the domain of gains, risk-seeking in the domain of losses). (Original account: Kahneman and Tversky, “Prospect Theory: An Analysis of Decision Under Risk,” Econometrica 47.2, 1979.)23
Availability bias. People estimate the probability of events by how easily examples come to mind. Plane crashes are overestimated relative to car accidents because they are more memorable; deaths by lightning are underestimated relative to deaths by falling furniture. Rational Bayesian agents should estimate probability from base rates, not from availability. (Original demonstration: Tversky and Kahneman, “Availability: A Heuristic for Judging Frequency and Probability,” Cognitive Psychology 5, 1973.)24
Framing effects. Whether a decision is framed as a gain or a loss changes what people choose, even when the options are logically equivalent. A medical treatment described as “90% survival rate” is chosen more often than the same treatment described as “10% mortality rate.”
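The loss-aversion finding above has a standard functional form: the prospect-theory value function, concave for gains and convex but steeper for losses. A minimal sketch, using the median parameter estimates from Tversky and Kahneman's 1992 cumulative prospect theory follow-up (alpha ≈ 0.88, lambda ≈ 2.25); the stake of 100 is illustrative:

```python
def prospect_value(x: float, alpha: float = 0.88, lam: float = 2.25) -> float:
    """Kahneman-Tversky value function: concave for gains, steeper for losses.
    Defaults are the 1992 median parameter estimates."""
    if x >= 0:
        return x ** alpha
    return -lam * ((-x) ** alpha)

gain = prospect_value(100)   # subjective value of gaining 100
loss = prospect_value(-100)  # subjective value of losing 100
print(round(gain, 1), round(loss, 1), round(-loss / gain, 2))  # → 57.5 -129.5 2.25
```

The ratio in the last column is the "losses weigh roughly twice as heavily as gains" claim made quantitative: lambda ≈ 2.25 is where the factor of two comes from.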
A caveat that matters. The four findings above — products of Kahneman and Tversky’s 1970s heuristics-and-biases programme — have weathered the replication crisis comparatively well. What has not survived is the social priming literature that grew on top of them in the 1990s and 2000s: Bargh’s “elderly priming” walking-speed studies, the “Florida effect,” lexical primes shifting downstream behaviour. Kahneman gave the stratum an admiring treatment in Thinking, Fast and Slow under the heading “The Marvels of Priming.”
Richard Thaler and Cass Sunstein’s Nudge (2008)25 applies Kahneman and Tversky’s findings to policy: if people are systematically irrational in predictable ways, governments can “nudge” behaviour toward better outcomes by changing the context of choice (default options, framing, salience) without restricting freedom. This raises its own ethical questions about paternalism and manipulation.
Doyen and colleagues’ 2012 failure to replicate Bargh, the wider replication crisis, and Kahneman’s own public concession in 2017 that the priming chapter rested on shaky evidence have together pulled that stratum apart.26 On the standard reading of the resulting picture, the judgment-under-uncertainty findings sit on firmer ground than the priming findings, and the two strata should be evaluated separately. On a more sceptical reading (Gigerenzer, parts of the small-effects literature), the same selection pressures that made priming look strong also touch the heuristics-and-biases material; how far the doubt spreads is itself contested.
2.3 The 2008 Financial Crisis
The 2008 financial crisis is often cited as a real-world stress-test of rational-agent models. The crisis was produced by mortgage-backed securities rated AAA by agencies whose models assumed US housing prices would not fall nationally (they had not done so since the 1930s); banks holding far more risk than their capital reserves could absorb; and a global financial system in which closely related models were being used by nearly every major institution at the same time. The acute phase ran from the failure of Bear Stearns (March 2008, sold to JPMorgan with Federal Reserve assistance) through Lehman Brothers’ Chapter 11 filing on 15 September 2008 — the largest bankruptcy in US history at over $600 billion in assets — to the $85 billion Federal Reserve loan to AIG the following day, which prevented a chain failure of credit-default-swap counterparties.27
A careful description matters here. The operative models did not, strictly, assume agents would “behave rationally” in the philosophical sense. Ratings models assumed historical default correlations would continue to hold — a statistical assumption about the data, not a psychological assumption about agents. Bank risk models (Value-at-Risk, Gaussian copula pricing) used distributional assumptions that systematically underweighted tail events. The misjudgement that mattered was about the joint distribution of housing-market outcomes, not about any trader’s rationality. Some economists — Raghuram Rajan in 2005, Nouriel Roubini from 2006 — warned publicly, using broadly orthodox tools, that the system was accumulating tail risk; the failure was as much institutional and incentive-driven as it was a failure of the rational-agent abstraction.28
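The distributional point can be made concrete. Under a Gaussian assumption, the probability of an extreme daily move collapses toward zero far faster than observed return series warrant. A minimal sketch, computing the two-sided probability of a move beyond k standard deviations if returns were normally distributed:

```python
from math import erfc, sqrt

def normal_upper_tail(k: float) -> float:
    """P(Z > k) for a standard normal variable, via the complementary error function."""
    return 0.5 * erfc(k / sqrt(2.0))

# Daily probability of a move of at least k sigma (either direction), under normality.
for k in (3, 5, 10):
    p = 2 * normal_upper_tail(k)
    print(f"{k} sigma: p = {p:.3g}, about one day in {1 / p:,.0f}")
```

On these numbers a 10-sigma day should occur roughly once in 10^23 trading days, vastly longer than the age of the universe; real markets have produced moves of that nominal size within single careers. That gap between the model's tail and the world's tail is the precise sense in which Value-at-Risk and copula models "systematically underweighted tail events."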
Kahneman: “The illusion that we understand the past fosters overconfidence in our ability to predict the future.”29 The traders, bankers, and economists who believed their models were, in many cases, genuinely overconfident — not dishonest. The models were seductive because they were elegant, mathematically sophisticated, and had worked well for a period when the underlying assumptions happened to hold.
2.4 The Evolutionary Function of “Irrational” Behaviour
Framing cognitive biases as “errors” or “bugs” presupposes a benchmark of rationality — the rational agent of economic theory — and measures human cognition against it. But this framing has an alternative: what if the biases are adaptations, behaviours that were functionally rational in the environments in which they evolved?
A warning before any example. Reverse-engineering specific cognitive biases to specific ancestral selection pressures is the most overreached move in evolutionary psychology. “Ancestral environment” is not a single thing; it covers hundreds of thousands of years across radically different ecologies. The evidence that any particular contemporary bias was selected for a particular function in that period is almost always indirect, and the field’s track record of confidently reverse-engineering specific adaptations has been poor. The points below are plausibility sketches, not established causal stories. Hold them accordingly.
Loss aversion is the usual example. A hypothesis: in a foraging context where losing your food cache can cost you your life and missing an extra cache cannot, a weighting of losses over gains would be functionally rational rather than a cognitive error — and the bias is only plainly irrational when applied to financial decisions in environments where the payoffs are symmetric. This is a plausible story and no more; direct empirical support that loss aversion was specifically selected on this pressure is limited.
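The asymmetry in that hypothesis can be made vivid with a toy expected-survival calculation. Every number below is invented for illustration, in the spirit of the warning above; this is a plausibility sketch, not evidence about ancestral selection:

```python
# Toy model: payoffs in survival probability, not resources. Hypothetical numbers.
BASELINE = 0.90     # chance of surviving the season with the current food cache
LOSE_CACHE = 0.50   # chance of surviving after losing the cache
EXTRA_CACHE = 0.95  # chance of surviving with one extra cache

def expected_survival(p_win: float) -> float:
    """Survival chance if you take a gamble: win an extra cache, or lose yours."""
    return p_win * EXTRA_CACHE + (1 - p_win) * LOSE_CACHE

# A 50/50 gamble that is symmetric in *resources* is far from symmetric in *survival*:
print(round(expected_survival(0.5), 3))  # below the 0.90 baseline, so decline the bet
```

A forager who weights the potential loss far more heavily than the potential gain is, in this toy environment, simply computing survival correctly; the same weighting transplanted to a symmetric financial bet looks like an error.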
The availability heuristic — judging the probability of events by how easily examples come to mind — is accurate in environments where memory is a good proxy for frequency, and arguably fails in modern media environments where salience and frequency are systematically decoupled (plane crashes are rare but memorable; car accidents are common but unremarkable). Again: this is a functional story about current performance across environments, not a verified account of what was selected for.
Anchoring — adjusting estimates insufficiently from an initial reference point — may reflect a general principle of efficient computation: start from the best available estimate and adjust, rather than constructing every estimate from scratch. The failure is not the principle but its application in contexts where the anchor is arbitrary or deliberately manipulated.
Robert Sapolsky (Behave, 2017)30 argues that to understand any human behaviour, you must understand it at multiple levels simultaneously: the immediate neurological cause, the hormonal context of the past hours, the developmental history of the individual, and the evolutionary history of the species. Sapolsky’s analogy — calling a behaviour “irrational” without asking what selection pressure produced it is like calling a leg “irrational” for being poorly designed for swimming — is more rhetorical than evidential, since it licenses the conclusion only when the evolutionary etiology of the bias is established, which (per the warning above) it usually is not. What survives is a methodological prescription: do not call a behaviour an “error” without specifying the benchmark of correct functioning, and on what grounds that benchmark is taken to be relevant.
This does not vindicate the biases. Loss aversion still produces bad financial decisions in modern markets. The availability heuristic still leads to poor risk assessment in media-saturated environments. But understanding that they are adaptations rather than mere failures tells us something important: they will not be easily corrected by information or argument alone, because they are not produced by information-processing failures. Changing the context of decision-making (Thaler and Sunstein’s “nudge”) is more effective than changing the content of reasoning, precisely because the biases are structural rather than informational.
2.5 Questions to Argue About
- Kahneman distinguishes “System 1” (fast, intuitive) and “System 2” (slow, deliberate) thinking. Is irrational behaviour just System 1 operating in the wrong context? Or is it something more fundamental?
- Are cognitive biases like loss aversion and anchoring “bugs” in human cognition — features that should be corrected — or “features” that served adaptive purposes in our evolutionary environment?
- The 2008 crisis was partly caused by economists’ models of rational behaviour. Does this mean those models are simply false? Or were they appropriate in some contexts and misapplied in others?
- If people are systematically irrational, should governments use “nudges” to steer them toward better decisions? Or is this paternalistic in a way that undermines autonomy?
Forced Fork: Was the 2008 Crisis a Refutation of Homo Economicus?
Consider the US subprime mortgage market 2004–2008. Borrowers took mortgages they could not service on the assumption of continuously rising house prices. Lenders extended credit on the assumption that mortgages could be sold on. Rating agencies assigned AAA ratings to packages of subprime loans on the assumption that default correlations would remain historically low. When house prices stopped rising in 2006, every assumption failed simultaneously. The question: was the rational-agent framework adequate to the task it was being asked to do?
Position A (no — replace it with a Minsky-Keynesian framework of endogenous instability): Hyman Minsky’s Stabilizing an Unstable Economy (1986) argued that financial systems generate their own crises through a hedge-financing → speculative-financing → Ponzi-financing cycle that no rational-agent model can capture, because the cycle depends on the rationality of each individual decision coexisting with the systemic irrationality of the asset-price spiral every individual decision contributes to. Minsky predicted exactly the structure of the 2008 crisis in 1986 (and was ignored for two decades). The post-2008 turn to systemic-risk modelling, macroprudential regulation, and stress-testing is in effect a replacement of the rational-agent framework with a Minsky-Keynesian framework, even where economists who sponsored the change continue to call themselves rational-expectations theorists.
Position B (the crisis was a regulatory and incentive failure within an adequate framework): 2008 was not a failure of the rational-agent model; it was a failure of regulation and incentive-alignment in institutions that knew perfectly well they were not operating under the model’s conditions. Each actor was behaving rationally by their own lights — the borrower, the lender, the rater. The model’s prediction failed in aggregate because the conditions for aggregation (risk internalised by the actor; accurate price signals) had been engineered away by financial structures that externalised default risk to taxpayers. The right response is to repair the conditions, not to replace the framework. The Minsky-style claim that crises are endogenous to capitalism in a way no rational-agent model can capture is a stronger metaphysical claim than the 2008 evidence supports.
Choose one. Position A must say what Minsky’s framework would have predicted in 2002 (when the bubble was inflating) that the rational-agent framework did not, beyond Minsky’s general theoretical statement. Position B must explain what reform would have made the rational-agent model applicable in the mortgage market without effectively replacing it with behavioural economics under a different name.
3 What can experiments tell us about human nature?
Ordinary people, under specific contrived laboratory conditions, do things they would have been confident they would never do — or so a famous run of mid-twentieth-century experiments was taken to show. The laboratory method, applied to human behaviour, isolates variables and (in principle) supports causal rather than merely correlational claims. Its findings have been treated as some of the most disturbing in social science. They are also contested: Le Texier’s 2018 archival reconstruction has substantially impeached the canonical reading of Zimbardo’s prison experiment, and the canonical reading of Milgram has been challenged by Perry’s archival work and by the Haslam–Reicher “engaged followership” reanalysis (though Burger’s 2009 partial replication cuts in the other direction). The popular take-aways from this literature have routinely outrun what the underlying experiments can support; how far the underlying findings themselves survive is what the next generation of historians of psychology is sorting out.
The Power Pose Studies and the Chain from Laboratory to Policy
In 2010, the social psychologist Amy Cuddy, together with Dana Carney and Andy Yap, published a paper claiming that holding an expansive, “high-power” posture for two minutes before a stressful evaluation increased testosterone, decreased cortisol, and improved performance.31 The study was based on 42 participants. Cuddy’s subsequent TED talk, “Your Body Language May Shape Who You Are,” became the second most viewed TED talk in history, with over 68 million views.32 The talk’s recommendations were adopted by career coaches, business schools, and job preparation programmes worldwide. In 2015, Eva Ranehill and colleagues attempted a pre-registered replication with 200 participants and found no hormonal effects.33 In 2016, Dana Carney — one of the original authors — publicly stated that she no longer believed the power pose effect was real.34 The episode illustrates the specific mechanism by which the replication crisis propagates harm: a study with 42 participants generates a headline finding; a TED talk reaches 68 million people; career advice based on unreplicated research enters millions of job interviews; and by the time the failure to replicate is established, the advice has become unchallengeable common sense.
3.1 Milgram’s Obedience Experiments
Stanley Milgram designed an experiment in 1961–1962 in the basement laboratory of Linsly-Chittenden Hall at Yale University, published in 1963, that became one of the most replicated and discussed studies in psychology (“Behavioral Study of Obedience,” Journal of Abnormal and Social Psychology 67.4, 1963).35 Participants — forty male volunteers per condition, recruited by newspaper advertisement and paid $4.50 — were told they were part of a Yale study on the effects of punishment on learning. Each played the role of “teacher,” while a confederate (a mild-mannered accountant named Mr Wallace) played the role of “learner,” strapped to a chair in an adjacent room with electrodes attached to his wrist. The teacher was instructed to administer electric shocks to the learner using a fake shock generator with thirty switches — labelled in 15 V increments from “Slight Shock” (15 V) through “Danger: Severe Shock” (375 V) to an unmarked “XXX” (450 V) — whenever the learner gave a wrong answer on a paired-associates word task. The shocks were fake; the learner’s protests, screams, and pounding on the wall were tape-recorded scripts standardised across participants; at 300 V the learner pounded on the wall and demanded to be released, and from 315 V onward fell silent — a silence the experimenter instructed the teacher to treat as a wrong answer.
When the experimenter (dressed in a grey laboratory coat) instructed the teacher to continue in the face of the learner’s apparent distress, 65% of participants in Milgram’s first variation administered what they believed were potentially lethal shocks at the maximum voltage. Milgram expected this rate to be around 1–2%.36 The 65% figure is the one most often quoted, but Milgram ran eighteen variations of the experiment and obedience rates ranged from 0% (when two confederates rebelled) to 92.5% (when the teacher only had to relay orders rather than press the shock lever). Gina Perry’s 2013 archival reconstruction shows that Milgram and his collaborators were selective in which conditions they emphasised. The phenomenon is real; the canonical figure is one slice of it.
What does this prove? Several interpretations are in play:
- That ordinary people will commit atrocities when instructed to by legitimate authority (the “situationist” interpretation).
- That people in ambiguous situations defer to those who appear to know what they are doing.
- That human moral psychology is more fragile and context-dependent than we like to believe.
The Milgram studies have been heavily criticised on ethical grounds: participants experienced significant distress, and many reported lasting psychological effects. The APA revised its ethical guidelines substantially in response. Whether the knowledge gained — about human obedience — justifies the harm to participants is a question without an agreed answer.
Milgram himself connected his findings to the Holocaust — ordinary Germans obeying orders. Hannah Arendt’s phrase “the banality of evil” (from Eichmann in Jerusalem, 1963, published the same year as Milgram’s study)37 addresses the same phenomenon.
3.2 The Stanford Prison Experiment
Philip Zimbardo’s Stanford Prison Experiment, run over six days in August 1971 and funded by the US Office of Naval Research,38,39 assigned twenty-four male student volunteers — selected from seventy-five applicants screened as the most psychologically stable — randomly to the roles of “guards” (n=12, working three at a time in eight-hour shifts) and “prisoners” (n=12) in a mock prison constructed in the basement of Jordan Hall, the Stanford psychology department. Prisoners were “arrested” at their homes by the Palo Alto police, booked, and dressed in numbered smocks; guards were issued khaki uniforms, mirrored sunglasses, and wooden batons (which they were told not to use). The experiment was scheduled to last two weeks; it was terminated after six days when it became clear that the “guards” were engaging in increasingly sadistic behaviour toward the “prisoners” — including forced exercise, sleep deprivation, and humiliation — and the “prisoners” were showing signs of genuine psychological distress, with five released early after acute breakdowns.
What Zimbardo claimed. The situation — the roles, the uniforms, the institutional context — produced the behaviour, not the individual characters of the participants. The same people, randomly assigned to different roles, behaved differently. For decades this was the canonical demonstration of situationism: character is weak; context is strong.
What the evidence actually supports. The canonical reading does not survive scrutiny. Journalistic investigations (notably Ben Blum’s 2018 exposé) and later researchers (notably Thibault Le Texier’s archival work, published in French in 2018 and in American Psychologist in 2019)40 revealed that Zimbardo had coached his “guards” to be tough at the experiment’s start; that he himself occupied the role of “prison superintendent” that made him an active participant, not a neutral observer; that the “prisoners” who broke down most dramatically may have been performing the distress they took to be expected of them; and that the experiment’s protocols were not remotely adequate for extracting a causal claim about situational power. The study is now regarded as methodologically flawed and ethically compromised. The weaker conclusion it can still support: strong institutional roles, plus an active authority figure shaping behaviour, can produce cruelty faster than participants expect — which is worth knowing, but is a much narrower claim than “situation determines character.”
3.3 The Replication Crisis
Beginning around 2011, it became clear that a significant proportion of landmark psychology experiments could not be reproduced. The Open Science Collaboration’s systematic replication attempt (2015) successfully reproduced only 36–39 of 100 published psychology studies, depending on the criterion used (statistical significance of the replication effect, or the replication teams’ subjective judgement).41 Among social psychology studies — including the kind of high-profile situationist work represented by Milgram and Zimbardo — the replication rate was lower still.
The structural causes of the replication crisis are well understood: p-hacking (analysing data in multiple ways and reporting only the one that reaches statistical significance); small samples that produce unstable estimates; publication bias (journals publish positive results, not failed replications or null findings); and questionable research practices (deciding to stop collecting data once a significant result emerges, adding or removing participants after data collection begins).
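One of these mechanisms, analysing many outcomes and reporting only the one that reaches significance, can be simulated directly under a true null (all numbers here are invented; no real study is modelled):

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(1)

def p_value(a, b):
    # Two-sided z-test on the difference of means (normal approximation,
    # adequate at n = 100 per group).
    se = sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    z = abs(a.mean() - b.mean()) / se
    return 2 * (1 - 0.5 * (1 + erf(z / sqrt(2))))

n_studies, n_outcomes, n = 2000, 10, 100
hits_honest = 0   # report only the single pre-registered outcome
hits_hacked = 0   # measure 10 outcomes, report whichever looks best
for _ in range(n_studies):
    # A true-null world: the "treatment" affects nothing at all.
    ps = [p_value(rng.normal(size=n), rng.normal(size=n))
          for _ in range(n_outcomes)]
    hits_honest += ps[0] < 0.05
    hits_hacked += min(ps) < 0.05

print(hits_honest / n_studies)  # close to the nominal 0.05
print(hits_hacked / n_studies)  # close to 1 - 0.95**10, roughly 0.40
```

The hacked strategy reports a “significant effect” in roughly 40% of studies even though no effect exists anywhere, which is why a literature built from such studies does not replicate.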
The epistemological implication: a published finding in a peer-reviewed psychology journal is not reliable evidence of a true effect. That does not mean psychology finds nothing — but it requires a more sceptical reading of the literature than was previously common.
A parallel methodological response in economics — what is sometimes called the credibility revolution — has tightened what counts as a defensible causal claim from observational data. Joshua Angrist and Jörn-Steffen Pischke, in Mostly Harmless Econometrics (2009), make the benchmark explicit: a randomised trial is what you would like to run; absent that, the researcher’s job is to find a “natural experiment” — some accident of policy, geography, or biology that assigns the variable of interest as if at random.42
Their canonical example is Angrist and Krueger’s 1991 study. Compulsory-schooling laws set a minimum age at which children may leave school; depending on quarter of birth, two children with the same intended school-leaving age end up with slightly different total years of schooling — the calendar, not their ambition, decides the difference. That accident of birth-month is what econometricians call an instrument: a variable that nudges schooling without itself depending on family background, ability, or motivation, and so lets the researcher read off schooling’s effect on adult wages without those confounders contaminating the estimate. The credibility revolution’s defenders take this to specify the narrow conditions under which observational social science can establish causation. Its critics — Angus Deaton on the imperialism of randomised controlled trials, James Heckman on external validity, structural and ethnographic schools more broadly — argue that the conditions are too narrow to recover most questions the human sciences need to answer, and that the methodological prestige of the technique has narrowed what economists ask.
Cross-reference: The Natural Sciences unit’s lesson “Can science be objective?” covers the replication crisis as it affects both natural and human sciences, citing the Nature 2016 survey. The two treatments describe the same underlying problem from different disciplinary angles — read them together for the full picture.
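The Angrist–Krueger quarter-of-birth logic can be checked on simulated data. A minimal sketch, with all numbers invented: an unobserved “ability” confounder inflates the naive regression of wages on schooling, while a Wald/IV estimator built on a random instrument recovers the true effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Invented simulation in the spirit of Angrist-Krueger (1991):
# unobserved ability raises both schooling and wages (a confounder);
# a quarter-of-birth-style instrument shifts schooling but is
# independent of ability.
ability = rng.normal(size=n)
late_born = rng.integers(0, 2, size=n)               # the instrument
schooling = 10 + 0.5 * late_born + 2.0 * ability + rng.normal(size=n)
true_return = 0.08                                   # causal effect per year of schooling
log_wage = 1.0 + true_return * schooling + 0.3 * ability + rng.normal(scale=0.5, size=n)

# Naive OLS slope: biased upward, because ability moves both variables.
ols = np.cov(log_wage, schooling)[0, 1] / np.var(schooling, ddof=1)

# Wald/IV estimator: effect of the instrument on wages, divided by
# its effect on schooling. The ability confounder cancels out of both
# differences, because the instrument is independent of it.
iv = (log_wage[late_born == 1].mean() - log_wage[late_born == 0].mean()) / (
      schooling[late_born == 1].mean() - schooling[late_born == 0].mean())

print(round(ols, 3), round(iv, 3))  # OLS sits well above 0.08; IV close to it
```

The load-bearing assumption is the exclusion restriction: `late_born` must affect wages only through schooling. If quarter of birth moved wages by any other channel, the Wald ratio would be biased too, which is exactly where the critics press.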
3.4 Questions to Argue About
- In Milgram’s best-known condition, 65% of ordinary people administered apparently lethal shocks when instructed by an authority figure. What exactly follows from this? Does it tell us something general about human nature, or something about behaviour in highly specific contexts?
- The Stanford Prison Experiment has been criticised as methodologically flawed and ethically compromised. But its findings — that role assignment shapes behaviour — are widely cited. Can flawed studies produce valid insights?
- The replication crisis shows that many landmark findings don’t hold up. Does this mean the human sciences have been producing false knowledge, or just that they are correcting themselves over time?
- What ethical constraints should govern psychological experiments? Is there any level of harm to participants that can be justified by the knowledge the experiment produces?
Forced Fork: Does Milgram Show That Situation Determines Action?
Position A (a strong-but-bounded situationist reading): Even after Perry’s archival critique and the partial replications (Burger 2009, capped at 150 V on ethical grounds, found 70% continuing past that threshold — not directly comparable to Milgram’s headline 65% to 450 V, but close to Milgram’s own 82.5% pass-through at the 150 V mark),43 the core finding survives: ordinary people, given legitimate-seeming authority, gradually escalating compliance demands, and diffused responsibility, participate in harm at rates that radically exceed their own and observers’ predictions. The historical record of bureaucratic atrocity — Eichmann, the Wehrmacht’s “ordinary men,” the Rwandan génocidaires — is broadly consistent. We over-explain evil by reference to evil people; situational and institutional engineering is the lever that actually moves the dial.
Position B (situationism overstated; character-by-situation interaction is the real finding): The “65%” headline obscures the more interesting datum: obedience rates ranged from 0% to 92.5% across Milgram’s eighteen variations, depending on small features of the setup (whether the experimenter was physically present, whether peer confederates rebelled, whether the teacher had to press the lever themselves). This is not a finding that “situation determines character”; it is a finding that specific situational features interact with individual disposition to produce specific behavioural outcomes. The post-2010 reanalysis literature — Haslam and Reicher in particular — argues that participants who obeyed were not “submitting to authority” but identifying with the experimenter’s project of producing scientific knowledge; the operative variable was perceived legitimacy, not authority per se.44
Choose one. If you pick A, name a situational lever you would build into institutional design (military, medical, corporate) on the strength of Milgram, and say what evidence would convince you the lever does not in fact reduce harm. If you pick B, explain what survives of the Milgram research programme once the strong situationist reading is given up — and whether the residue is robust enough to do the moral work that Milgram, Arendt, and the Holocaust-explanation literature have asked it to do.
4 Can the human sciences make predictions?
If the human sciences cannot predict anything, it is unclear in what sense they are sciences rather than sophisticated storytelling about the past. The natural sciences make their reputation on prediction — general relativity tells us in advance where starlight will bend; quantum mechanics tells us in advance what particles will do. If the human sciences can also predict, the predictions should be checkable. If they cannot, we should be honest about what they are instead. The record, when examined honestly, is mixed at best — and in some cases, actively instructive about the difficulty of the enterprise.
Philip Tetlock’s Forecasting Tournaments and the Expert Problem
In 1984, the political scientist Philip Tetlock began a study that would run for twenty years, asking political experts to make specific, verifiable predictions about future events — election outcomes, economic trends, wars, diplomatic agreements. The sample eventually included 284 experts making nearly 28,000 forecasts. Tetlock published the results in Expert Political Judgment (2005).45 The findings were uncomfortable: the average expert barely outperformed a simple random baseline, and was consistently worse than basic statistical models. Worse, the most prominent experts — those with regular media appearances and confident public pronouncements — performed no better than their less visible colleagues. Tetlock identified two thinking styles: “foxes” who drew on many frameworks and held views tentatively, and “hedgehogs” who organised everything around a single big idea and expressed views with great confidence. Foxes outperformed hedgehogs substantially. The study prompted a US intelligence community programme (the Good Judgment Project) that trained civilians in probabilistic forecasting; top “superforecasters” in the programme consistently outperformed professional intelligence analysts with access to classified information.46 The lesson was not that prediction is impossible, but that calibrated uncertainty — knowing what you don’t know — is more epistemically valuable than confident expertise.
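Tetlock’s tournaments and the Good Judgment Project scored such forecasts with the Brier score, the mean squared error of stated probabilities against realised outcomes. A toy illustration, with invented events and probabilities:

```python
def brier(forecast, outcome):
    # Squared error between a stated probability and what happened (0 or 1).
    # 0.0 is perfect; a permanent 50% hedge scores 0.25 on every event.
    return (forecast - outcome) ** 2

# Four invented events, three of which occur (outcome 1) and one not (0).
events = [1, 1, 1, 0]
hedgehog = [0.95, 0.95, 0.95, 0.95]   # confident on everything, including the miss
fox = [0.80, 0.70, 0.75, 0.20]        # graded, calibrated confidence

def mean_brier(forecasts):
    return sum(brier(f, o) for f, o in zip(forecasts, events)) / len(events)

print(mean_brier(hedgehog), mean_brier(fox))  # the fox scores better (lower)
```

The scoring rule makes the chapter’s point mechanical: one overconfident miss costs the hedgehog more than four hedged calls cost the fox, so calibration beats confidence on average.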
4.1 Economic Forecasting
The discipline of economics has a poor record of predicting economic crises. The 2008 financial crisis was not predicted by most mainstream economists; the 2020 COVID-19-induced recession was genuinely unpredictable, but the depth and duration of the subsequent recovery were again misjudged by most forecasters. The IMF, the World Bank, and national treasury departments routinely produce GDP forecasts that are significantly wrong within two years.
This failure has been explained in several ways. Complexity: economies are systems with billions of interacting agents, each responding to expectations about what others will do; small changes in initial conditions produce large differences in outcomes (sensitivity to initial conditions). Self-defeating prophecy: economic forecasts affect the behaviour of agents who read them, changing the conditions being forecast. Model inadequacy: standard macroeconomic models assume rational agents, smooth adjustment processes, and equilibrium — assumptions that fail catastrophically during crises.
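The sensitivity point is not specific to economics; any strongly non-linear feedback system shows it. A standard toy example, the logistic map (not an economic model, purely illustrative):

```python
# The logistic map x -> r*x*(1-x) at r = 4 is chaotic: a measurement
# error of one part in a million destroys the forecast within a few
# dozen steps, because the error roughly doubles every iteration.
def trajectory(x, r=4.0, steps=40):
    out = []
    for _ in range(steps):
        x = r * x * (1 - x)
        out.append(x)
    return out

a = trajectory(0.400000)   # the "true" initial condition
b = trajectory(0.400001)   # the forecaster's measurement, off by 1e-6
diffs = [abs(p - q) for p, q in zip(a, b)]

print(diffs[0], max(diffs))  # negligible at first, then total divergence
```

The point for forecasting: even a perfect model of the dynamics cannot outrun measurement error in the initial state, which is one reason multi-year GDP point forecasts degrade so quickly.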
The physicist-turned-financial-theorist Jean-Philippe Bouchaud has argued that economics suffers from “physics envy” — the desire to achieve the precision and predictive power of physics using models that are not accurate descriptions of economic reality.47 The mathematical sophistication of the models can be inversely related to their real-world accuracy.
4.2 Malthus and the Demographic Prediction That Failed
Thomas Malthus, in An Essay on the Principle of Population (1798),48 argued that population growth inevitably outstrips food production (because population grows geometrically while food production grows arithmetically), leading to periodic catastrophes of famine, disease, and war that restore population to sustainable levels. His prediction: given 18th-century trajectories, mass starvation was mathematically inevitable.
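Malthus’s arithmetic can be written in a few lines (the units are invented; only the two growth shapes matter):

```python
# Malthus's premise in code: population doubles each generation
# (geometric growth), food supply gains one fixed increment per
# generation (arithmetic growth).
population, food = 1.0, 1.0
for generation in range(1, 9):
    population *= 2    # 2, 4, 8, ..., 256
    food += 1          # 2, 3, 4, ..., 9
    print(generation, population, food, round(food / population, 4))
# After eight generations population is 256x its starting level while
# food is only 9x; the per-capita share has collapsed from 1.0 to ~0.035.
```

Under these premises catastrophe really is mathematically inevitable; the failure, as the next paragraph explains, is that both growth laws changed.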
The prediction failed. Why? Malthus did not anticipate the Agricultural Revolution, the Industrial Revolution, the Green Revolution of the 1960s, or the demographic transition (the consistent finding that as societies industrialise, birth rates fall). Each of these involved non-linear changes that were not extrapolations of existing trends. Historical prediction based on extrapolating current trends fails when the conditions generating those trends change.
The Malthusian failure is not evidence that the human sciences cannot make predictions. It is evidence that linear extrapolation of current trends fails when structural changes occur. The interesting question is whether structural changes themselves are predictable — and whether cliodynamics or related approaches can identify when they are coming.
4.3 Cliodynamics
Peter Turchin, in Ages of Discord (2016) and earlier work, proposes a mathematical approach to historical dynamics he calls cliodynamics.49 His model identifies two cycles of political instability: a roughly 50-year cycle driven by generational dynamics, and a longer “secular cycle” of 200–300 years driven by elite overproduction (growing numbers of elite aspirants competing for a limited number of elite positions) and popular immiseration (the declining share of income going to labour). When both cycles peak simultaneously, political violence increases.
In 2010, Turchin predicted that the US would experience severe political instability around 2020.50 The 2020 racial justice protests, the COVID-19 political crisis, and the January 6, 2021 Capitol attack have been cited as confirmations.
The sceptical objections: the prediction is broad enough to encompass a range of outcomes; the model has few genuine out-of-sample tests; the categories (what counts as “political instability,” “elite overproduction”) involve significant interpretive choices; and there is a risk of selection bias — selecting historical cases that fit the model while ignoring those that don’t. Turchin’s response is that his model makes quantitative predictions (with confidence intervals) that can be tested, not just post-hoc rationalisations.
4.4 Popper’s Challenge
Popper argued that the social sciences should be judged by the same falsifiability standard as natural science: produce falsifiable predictions, test them, and revise or abandon theories that fail. He argued, in The Poverty of Historicism (1957),51 against any social science that makes claims about inevitable historical trajectories (he was targeting Marxist historical materialism). Grand historical laws, he argued, cannot be tested and should be abandoned.
The question is whether Popper’s criterion is appropriate for the human sciences, given their subject matter. History does not repeat; controlled experiments are rarely possible; the “ceteris paribus” conditions required for a clean test are almost never satisfied. Perhaps the human sciences require different standards of rigour that are adapted to their subject matter rather than imported from physics.
4.5 Questions to Argue About
- Economic forecasting has a poor record. Does this mean economics is not a science? Or that it is a science of a different kind from physics?
- Malthus’s prediction failed because he didn’t anticipate structural change. Is structural change in principle predictable? Or does this mark a fundamental limit on social science prediction?
- Turchin predicted US political instability around 2020. The events of 2020–21 seem to confirm this. Is this genuine scientific prediction, or is the model flexible enough to accommodate any outcome?
- Popper argues that social sciences should make falsifiable predictions or stop claiming to be sciences. Is this the right standard for the human sciences? What alternative criteria of rigour might be more appropriate?
Forced Fork: Was Turchin’s 2020 Prediction Science?
In Secular Cycles (2009)52 and a 2010 Nature commentary, Peter Turchin predicted a peak in US political instability around 2020. The indicators he cited — elite overproduction, rising inequality, declining state capacity — all moved in the direction he forecast, and 2020–2021 produced the January 6 Capitol assault, the largest mass protests since 1968, and several months of visibly elevated political violence.
Position A (it was science; cliodynamics has a track record the social sciences should respect): Turchin specified in 2010 — a decade before the event — a mechanism (elite overproduction + state fiscal stress), a rough window (around 2020), and indicators that had to rise in a specific pattern for the model to be on track. All three moved as predicted. The cliodynamic methodology has a wider track record than the single 2020 case: Jack Goldstone’s Revolution and Rebellion in the Early Modern World (1991) had earlier identified the same demographic-fiscal cycle structure across the seventeenth-century crises (English Civil War, Fronde, Ottoman Celali revolts); Turchin’s own out-of-sample fits to the 1870s and 1920s US cycles were on the record before 2020. The prediction is at least as precise as many in climate science or epidemiology that no one disputes as scientific. The social sciences can make reliable predictions when they work with structural indicators over long time horizons.
Position B (it was sociologically informed storytelling, not prediction): Turchin’s “prediction” lacked exactly the feature that distinguishes a scientific prediction from a lucky gesture. A scientific prediction specifies in advance what would count as its disconfirmation. Turchin’s framework is flexible enough that many alternative outcomes in 2020 — low-level polarisation, a different kind of instability, no January 6 — could equally have been presented as consistent with the secular cycle. And on the output side: instability in 2020 is overdetermined by the pandemic, by inequality, by Trump specifically, by social media — each of which is a live alternative explanation that Turchin’s model cannot adjudicate. Calling this a successful scientific prediction is a post-hoc narrative dressed as forecasting.
Choose one. If you pick A, specify a second Turchin-style prediction, made in advance today, that you would take as a genuine test of his framework over the next decade — and the indicator changes that would count as its failure. If you pick B, explain what a legitimate prediction in the domain of political instability could look like, given the genuine difficulty of isolating variables in complex historical systems.
5 How do culture and society shape what we know — and what we study?
The human sciences study human beings. But human scientists are also human beings — members of particular societies, shaped by particular educational systems, funded by particular institutions. This sounds like a truism, the sort of thing you could put on a poster with a sunset photograph. It is not a truism. It is a specific claim with specific consequences: their research is conducted on particular populations, using methods developed in particular cultural contexts, and this creates a systematic risk of mistaking culturally specific findings for universal truths — then building entire disciplines on the mistake.
The Mead-Freeman Controversy and the Limits of Fieldwork
In 1928, Margaret Mead published Coming of Age in Samoa,53 based on nine months of fieldwork on the island of Ta’ū. The book argued that Samoan adolescent girls experienced sexuality freely and without the psychological turbulence characterising American adolescence — evidence, Mead concluded, that the storm and stress of Western teenage life was culturally constructed rather than biologically inevitable. It sold millions of copies and became foundational to cultural anthropology and progressive education. In 1983, the New Zealand anthropologist Derek Freeman published Margaret Mead and Samoa,54 arguing that Mead had been systematically misled by her informants — who told her what she wanted to hear — and that Samoan adolescence was in fact characterised by considerable constraint, jealousy, and violence. Freeman has himself been challenged: Paul Shankman and others argue that Freeman selectively used his own evidence and that the strongest “hoaxing” version of his thesis is not supportable from the archival record.55 Where matters now stand: Mead’s specific factual claims were probably overstated; Freeman’s complete reversal probably overstates the correction. The dispute does illustrate a structural feature of fieldwork both sides accept: the observer’s positioning (Mead’s gender, youth, nationality; Freeman’s later arrival, gender, theoretical commitments) shapes what informants choose to tell them and what gets recorded as data. The WEIRD bias — the systematic over-reliance on Western, Educated, Industrialised, Rich, Democratic samples in psychology — is the same observer-positioning problem scaled up to a discipline.
5.1 WEIRD Bias
In a landmark 2010 paper, Joseph Henrich, Steven Heine, and Ara Norenzayan documented what they called the WEIRD problem:56 in a survey of papers in six top psychology journals between 2003 and 2007, 96% of subjects came from Western, Educated, Industrialised, Rich, and Democratic societies — populations that together represent roughly 12% of the world’s population. (The figure is sample-specific; the broader pattern is well-attested across reviews.)
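The scale of that imbalance is worth making explicit. A back-of-envelope calculation (the 96% and 12% figures are from the Henrich, Heine, and Norenzayan survey above; the derived ratios are our own illustration):

```python
# Back-of-envelope: how over-represented are WEIRD subjects?
# Input figures from the Henrich, Heine & Norenzayan (2010) survey cited above.
weird_share_of_subjects = 0.96   # share of journal subjects from WEIRD societies
weird_share_of_world = 0.12      # share of world population living in WEIRD societies

# Simple over-representation factor: a WEIRD person is 8x more likely to
# appear in the sample than population share alone would predict.
over_representation = weird_share_of_subjects / weird_share_of_world

# Odds-ratio version: compare the sampling odds of a WEIRD person
# against those of a non-WEIRD person.
odds_ratio = (weird_share_of_subjects / (1 - weird_share_of_subjects)) / \
             (weird_share_of_world / (1 - weird_share_of_world))

print(f"over-representation factor: {over_representation:.0f}x")
print(f"sampling odds ratio:        {odds_ratio:.0f}x")
```

In odds terms, a person from a WEIRD society was on the order of a hundred times more likely to serve as a psychology subject than a person from anywhere else — which is why the paper’s critics dispute the generality of the findings, not the existence of the imbalance.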
More importantly, WEIRD populations are outliers on many psychological dimensions, not typical humans. The Müller-Lyer illusion (see Core) has a much larger effect on WEIRD populations than on populations who live in environments without rectangular rooms and straight lines. Concepts of fairness, as measured by standard economic games (the Ultimatum Game), vary dramatically across cultures in ways that the WEIRD literature had not captured. Self-construal (the degree to which people define themselves as independent agents vs. members of groups) varies substantially across cultures and shapes fundamental cognitive patterns.
The implication Henrich and colleagues draw: psychological universals claimed on the basis of WEIRD samples may not be universal at all but features of a particular type of society. The implication drawn by critics of the strong WEIRD argument (Norenzayan in his subsequent work; Apicella and others) is more limited: WEIRD samples bias the generality of the claims that can be made, but not every finding is equally affected, and several of the cases these critics cite (basic perceptual constancies, fundamental memory architecture) do hold up cross-culturally. How wide the WEIRD problem actually runs remains an open empirical question.
5.2 Margaret Mead and Derek Freeman
Margaret Mead’s Coming of Age in Samoa (1928) claimed that adolescent sexual behaviour in Samoa was largely free of the storm and stress characterising Western adolescence, demonstrating that what Americans took to be natural features of puberty were culturally determined. The book was a landmark of cultural anthropology and of progressive social thought: it provided evidence that human nature was malleable, that culture rather than biology determined behaviour.
In 1983, Derek Freeman published Margaret Mead and Samoa: The Making and Unmaking of an Anthropological Myth, arguing that Mead had been systematically misled by her informants, who told her what she wanted to hear, and that Samoan adolescence was not in fact unusually harmonious. Freeman’s critique was itself contested: he was accused of political motivation (he was a critic of cultural determinism), of selectively using his own evidence, and of misrepresenting Mead’s actual claims.
The Mead-Freeman debate was partly about facts and partly about theoretical priors. Mead was a cultural determinist who found evidence for cultural determinism. Freeman, a committed critic of cultural determinism who stressed biology, found evidence against it. Neither was purely disinterested. This is not a scandal; it is how science works — but it is also why replication and independent verification matter.
The controversy is epistemologically instructive not because it proves Mead wrong (the jury is still out on specific factual claims) but because it illustrates the structural vulnerabilities of anthropological fieldwork: the observer’s expectations shape what they observe; fieldwork is conducted under conditions that make independent verification difficult; and the theoretical commitments of the field shape what kinds of findings get published and cited.
5.3 Relativism and Universalism
The question that the Mead-Freeman debate frames: is there a universal human nature, or are humans sufficiently plastic that culture determines everything important?
Relativism: human behaviour, values, and ways of knowing are culturally specific. What counts as rational, what counts as beautiful, what counts as a good life — these vary so much across cultures that they cannot be measured by a single universal standard. The attempt to apply universal standards is typically an exercise of cultural imperialism.
Universalism: the anthropological evidence for cultural variation is real but overstated. Some things do not vary — basic emotional expressions, certain cognitive capacities, a range of social structures and prohibitions. The debate is not between variation and universality but about where the limits of variation lie.
Paul Ekman’s research on facial expressions of emotion (1971) claimed to find six universal basic emotions (happiness, sadness, fear, disgust, anger, surprise) expressed in culturally invariant ways.57 This has been challenged by Lisa Feldman Barrett’s theory of constructed emotion, which argues that emotional expressions are more culturally variable than Ekman claimed.58 The dispute is ongoing.
A note on how live this binary actually is. Working anthropology has largely moved past the relativism-versus-universalism stand-off that TOK textbooks treat as the central question. After Geertz’s interpretive turn, after Said’s Orientalism, and after the reflexive critique of “the native point of view,” most contemporary fieldworkers practise something closer to calibrated humility — a methodological assumption that observed differences require careful explanation before judgement, without thereby committing to incommensurability or to the claim that no cross-cultural comparison is possible. Webb Keane’s Ethical Life (2016) and the late work of Marshall Sahlins are representative; the Cultural Anthropology journal has run a long line of “Beyond the Native Point of View” debates in the same direction.59 The binary is still useful as a TOK exam frame, because it sharpens the issues; but treating it as the live state of the discipline overstates how cleanly the field divides.
5.4 Questions to Argue About
- The WEIRD bias shows that psychology has been generalising from a non-representative population. Does this invalidate existing psychological findings, or only make them less general?
- Margaret Mead found evidence for cultural determinism because she was looking for it. Does this mean anthropological fieldwork is inevitably circular? Or are there methods that can discipline it?
- Is anthropological relativism — the principle that cultures should be understood on their own terms — an epistemological principle (the most accurate method) or an ethical one (cultures deserve respect)? Does the distinction matter?
- If some supposedly universal psychological effects (like susceptibility to the Müller-Lyer illusion) vary across cultures, what remains of claims about universal human nature? How would you distinguish genuine universals from WEIRD generalisations?
Forced Fork: What Does the Müller-Lyer Cross-Cultural Finding Show?
The Müller-Lyer illusion — two lines of identical length, one with arrowheads pointing outward and the other with arrowheads pointing inward — is one of the most-tested visual illusions in psychology. When Segall, Campbell, and Herskovits tested it across cultures in the 1960s,60 they found that American college students showed strong susceptibility (the lines “look” very different in length), while San foragers of the Kalahari showed almost none. The effect has been replicated many times since. The question: what is the right generalisation from these data?
Position A (the WEIRD critique is broadly right; most “universal” psychology is not universal): The Müller-Lyer result is the clean case: an effect that every American textbook reported as a fact about the human visual system turned out to be a fact about visual systems trained on a carpentered environment (rectangular buildings, intersecting lines, perspective art). A hundred years of psychology generalised from a biased sample and called it human nature. The result is not an isolated oddity — similar WEIRD-vs-non-WEIRD gaps have been found for other perceptual and cognitive tests. The conclusion is painful but straightforward: psychology’s corpus, built on WEIRD subjects, is largely a corpus about WEIRD people, and needs a very substantial correction before any finding is assumed to generalise.
Position B (the Müller-Lyer finding is not as damaging as Position A claims): The Müller-Lyer result shows that one specific perceptual effect depends on environmental experience. It does not show that the underlying visual processing mechanism is different across cultures — it shows that the mechanism’s calibration depends on input. This is what you would expect from any trainable perceptual system. The universal is the structure of how visual cortex learns to infer depth from 2D cues; the variation is the content it learns from. WEIRD-sample findings about underlying mechanisms — not surface effects — mostly do generalise, and the handful of failures gets disproportionate attention because they are more publishable than confirmations.
Choose one. If you pick A, give a second well-known psychological finding that you believe is probably not universal, and say why. If you pick B, specify which WEIRD-sample finding you would stake on as a structural universal, and what cross-cultural disconfirmation would have to look like for you to give it up.
6 Is there a human nature?
Aristotle argued that “the city belongs to the class of things that exist by nature, and that man is by nature a political animal” — a zōon politikon, a creature of the polis whose flourishing is constituted in political community rather than added to it.61 Nobody has successfully refuted him, though a great many have tried, often with the implicit corollary that if we could just change the social arrangements, we could change the animal. The question of human nature is simultaneously empirical (what are we actually like?) and political (what does the answer justify?). Different answers have been used to justify very different social arrangements, and the political stakes have made the empirical debate hard to conduct dispassionately. We have a powerful interest in the answer — which is precisely why we should be suspicious of whoever gives it with the most confidence.
The Verdingkinder and What Swiss State Policy Did to Children
From the early nineteenth century through the 1970s, Swiss authorities — typically Gemeinden (municipalities) under cantonal supervision, often acting under the Armenfürsorge (poor-relief) law of each canton — placed children deemed at risk of being a “social welfare burden” with farm families to work in exchange for room and board.62 The 2014 federal study by Gisela Hauss and Thomas Gabriel for the Universities of Applied Sciences Northwestern Switzerland and Zurich estimated the historically affected population in the hundreds of thousands across the period; the Bundesarchiv holds case records on roughly 60,000.63 Conditions were systematically poor by the standards of Swiss family life of the period: 14-hour working days, separation from siblings, denial of schooling beyond minimum legal requirements, sexual and physical abuse, and the routine power of farm families to refuse children’s contact with their birth families. The 1981 revision of Article 397a of the Swiss Civil Code formally ended the administrative detention mechanisms that supplied the system. On 11 April 2013 the Federal Council issued a formal apology to surviving Verdingkinder and to victims of related forms of fürsorgerische Zwangsmassnahmen (administrative compulsory measures).64 On 30 September 2016 the Federal Assembly enacted the Bundesgesetz über die Aufarbeitung der fürsorgerischen Zwangsmassnahmen vor 1981 (AFZFG); over 9,000 surviving Verdingkinder applied for compensation under the CHF 300-million Wiedergutmachungsfonds. 
Long-term-outcomes research (Marco Leuenberger’s 2014 doctoral dissertation, Thomas Huonker’s 2003 archival study, Paul Hugger’s earlier ethnographic work) documents persistent attachment difficulties, depression rates substantially above population baseline, and reduced life expectancy among surviving Verdingkinder.65 The Bucharest Early Intervention Project of the 2000s — a randomised controlled trial of foster-care versus continued institutional care for 136 Romanian orphans, treated below in the body — is the methodologically cleaner controlled study of the same question. The Swiss case has the inverse evidential structure: an enormous, historically unconcealed cohort across two centuries, with retrospective rather than randomised data, and a federal apology that has made the long-term-outcome studies a matter of public record rather than research curiosity. Most students will have grandparents whose own school years overlapped with the system. In the Verdingkinder, the Hobbes/Locke/Rousseau/Pinker question — what a person becomes when normal developmental conditions are withheld — has Swiss state-archival data running across two centuries and a Federal Council apology delivered within their parents’ adult lifetimes.
6.1 The Bucharest Early Intervention Project
In the late 1990s, Romania’s transition from the Ceaușescu regime — whose pronatalist policies had institutionalised over 100,000 children — left orphanages whose conditions were severe by any measure. From 2000, a research team led by Charles A. Nelson (Harvard / Boston Children’s Hospital), Nathan A. Fox (University of Maryland), and Charles H. Zeanah (Tulane) ran the Bucharest Early Intervention Project (BEIP), a randomised controlled trial that recruited 136 institutionalised children aged 6–31 months from six Bucharest orphanages.66 Children were randomly assigned either to continued institutional care (the existing standard for Romanian orphans) or to high-quality foster care that the project arranged and supervised. Assessments at 30, 42, and 54 months showed that children placed in foster care before about 24 months had substantially better outcomes on cognitive testing, language development, and brain electrical activity (EEG and event-related-potential measures) than those who remained institutionalised; children placed after 24 months showed measurable but smaller gains.67 Sixteen-year follow-up data showed persistent EEG, attachment, and psychiatric-symptom differences between the institutional and foster-care groups; the foster-care group remained substantially better off but neither reached the typical-development comparison group.68 The research raised acute ethical questions — the team had to argue that the control arm was the existing standard of care and that the trial was defensible only because the project was simultaneously producing the foster-care alternative the study would later show was better. The BEIP is the contemporary, randomised, ethically-supervised counterpart to the Verdingkinder (info box above) and to the case of Genie (treated below) — three different evidential structures (RCT, historical state cohort, single tragic biography) addressing the same question of what becomes of children deprived of normal developmental conditions.
6.2 The Genie Case (1970)
In November 1970, a thirteen-year-old girl known to researchers as Genie was discovered in a house in Arcadia, California, where she had spent her entire childhood in isolation, strapped to a potty chair in a darkened room, forbidden to make noise, beaten if she did. Her father believed she was brain-damaged and enforced near-total sensory and linguistic deprivation. When Genie was found, she could not speak and had no language. Linguist Susan Curtiss spent years attempting to teach her English; Genie acquired a vocabulary and a rudimentary ability to communicate. She never acquired grammar. Curtiss argued that Genie’s case was direct evidence for the critical-period hypothesis of language acquisition (formulated by Eric Lenneberg, building on Chomsky’s argument that linguistic capacity is innate):69 the capacity for language is biological and time-limited, and if the critical window closes without exposure, some aspects of linguistic competence are permanently unavailable. The case sits at the intersection of Hobbes, Locke, Rousseau, and Pinker. It is not a clean illustration of any of their positions: Genie was not a blank slate written on by culture, but nor was her biological endowment sufficient to produce full language without environmental input at the right time. Human nature, on this reading, is neither a fixed essence nor an entirely social construction but a set of developmental potentials that require specific environmental conditions to actualise. The Bucharest Early Intervention Project (treated above in the body) is the methodologically rigorous controlled successor — randomised, large-sample, ethically supervised — to the question that Genie’s biography could only test through a single tragic life; the Verdingkinder archive (info box above) provides the third evidential register, a Swiss state-record cohort across two centuries.
6.3 The Philosophical Background
Three foundational positions, simplified but not caricatured:
Hobbes (Leviathan, 1651): in the “state of nature,” human life is “solitary, poor, nasty, brutish, and short.”70 Without the coercive apparatus of the state, humans will fight each other for scarce resources. Human nature is competitive, self-interested, and requires external constraint to produce social order.
Locke (Two Treatises of Government, 1689):71 humans are rational, cooperative, and capable of governing themselves through consent. The state exists not to suppress natural brutishness but to protect natural rights that exist prior to it. Human nature is not benign, but it is not primarily aggressive either.
Rousseau (Discourse on the Origin and Foundation of Inequality Among Men, 1755):72 humans are naturally good and peaceful; it is society — the introduction of property, inequality, and competition — that corrupts them. “Man is born free, and everywhere he is in chains” (the opening line of The Social Contract, 1762).73
These are not just historical curiosities. They shape contemporary debates about criminal justice (are criminals bad people or products of bad circumstances?), economic policy (do people need incentives and constraints to work productively, or do they work from intrinsic motivation?), and political organisation (do democracies require educated, engaged citizens, or can they function with strategic self-interested voters?).
6.4 Pinker’s Case for Human Nature
Steven Pinker, in The Blank Slate (2002),74 argues that the dominant view in social science — what he calls the “blank slate” (no innate human nature), “noble savage” (natural goodness corrupted by civilisation), and “ghost in the machine” (mind separate from brain) — is empirically refuted by the evidence from evolutionary psychology, genetics, and neuroscience.
Humans have a nature shaped by millions of years of evolution: cognitive architecture, emotional systems, social instincts, and behavioural tendencies that are part of what we are rather than products of socialisation. This does not mean these tendencies are unchangeable or morally justified; it means they must be understood rather than denied.
Pinker’s violence statistics (those of The Better Angels of Our Nature, treated immediately below) have been challenged by Nassim Taleb (for underweighting tail risks — the possibility of catastrophic violence),75 John Gray (for ignoring cycles of violence that make long-run trends misleading),76 and Edward Herman and David Peterson (for contestable choices in what counts as “violence”). The methodological debates are substantive.
Pinker’s The Better Angels of Our Nature (2011)77 extends the argument: despite our evolved tendencies toward violence, rates of violent death (per capita) have declined dramatically over millennia, and the decline can be explained by the growth of states, trade, literacy, and the expansion of moral empathy. Human nature is malleable in important ways — not infinitely, but enough to produce a measurable decline in violence.
6.5 Sartre’s Counter
Jean-Paul Sartre’s existentialist counter begins with a slogan:
“Existence precedes essence.” — Jean-Paul Sartre, Existentialism is a Humanism (lecture 1945, published 1946)78
For Sartre, there is no fixed human nature that precedes individual existence. We are not made with a predetermined essence, the way a paper knife is made with a predetermined function. We exist first, define ourselves through our choices, and create our own nature in the process. This is the source of both human freedom and human anguish: there is no nature to appeal to, no pre-given function to fulfil. We are “condemned to be free.”79
The empirical critique of Sartre, advanced most forcefully by Pinker and the evolutionary-psychology programme, is that the blank-slate view of human nature is implausible: there are documented species-typical cognitive architectures, emotional systems, and developmental constraints (the Verdingkinder, the Bucharest Early Intervention Project, and the Genie case, all treated above, are examples on three different evidential registers). Whether this refutes Sartre depends on the reading of “existence precedes essence” one starts with. The strongest reading — that humans have no species-typical cognitive endowments and are infinitely malleable — is hard to defend after a half-century of cognitive science. The weaker and more interesting reading — that whatever endowment there is underdetermines what an individual human becomes, leaving genuine choice in the gap — is compatible with the evidence and is closer to what Sartre actually argued in the 1945 lecture. The load-bearing claims are about the phenomenology of agency, the experience of choosing without a script, and what it means to take responsibility for one’s choices rather than blaming them on nature, society, or God — not a metaphysical denial of biology.
6.6 Questions to Argue About
- Is the debate between “human nature is fixed” and “humans are blank slates” a genuinely empirical dispute? Or is it partly a political one, with empirical findings used to support pre-existing commitments?
- If evolutionary psychology shows that humans have evolved tendencies toward violence, dominance, and tribalism, what follows politically? Does knowing this change anything — or does it just redescribe what we already knew?
- Sartre says there is no human nature — we define ourselves through choices. But if our choices are themselves shaped by neurology, genetics, and socialisation, is the “freedom” Sartre describes real?
- What is at stake, politically, in the claim that humans are naturally good (Rousseau) versus naturally brutish (Hobbes)? Can the empirical question be answered without addressing the political stakes?
Forced Fork: Has Violence Actually Declined?
Steven Pinker’s Better Angels of Our Nature (2011) argues that violence per capita has declined steeply across human history — homicide rates from medieval to early-modern to modern Europe, war deaths per capita across the long peace since 1945, the eradication of cruel punishments. Has it?
Position A (yes — the trend is robust): Pinker draws on multiple independent data sources — the Eisner European homicide compilation, the UCDP/PRIO armed conflict dataset, the Correlates of War project, criminological time-series — that converge on the same descending curve. The convergence is the evidence: independent groups, different methodologies, similar conclusions. The mechanisms Pinker identifies (state monopolies on legitimate violence, Hobbes; the “civilising process,” Norbert Elias; expanded circles of moral concern, Peter Singer) supply causal stories consistent with the descriptive trend.
Position B (no — the apparent decline is an artefact of coding and tail-risk underweighting): Nassim Nicholas Taleb’s “The ‘Long Peace’ is a Statistical Illusion” (2015, with Pasquale Cirillo) argues that the per-capita-deaths-from-war distribution has power-law tails — extreme events are concentrated and rare, and 75 years without a war on the scale of 1939–45 is statistically unsurprising even if the underlying violence-generating process has not changed at all. John Gray, The Silence of Animals (2013), pushes the broader point: Pinker’s smoothing presupposes the very Enlightenment progress narrative the data are supposed to support, and the mid-20th century alone produced more violence than the entire preceding millennium of European history. The decline is in the eye of the smoother.
Choose one. Position A must say what kind of evidence would falsify the long-peace claim, given that Taleb’s tail-risk argument predicts long quiet stretches under both the “violence has declined” and the “violence has not declined” hypotheses. Position B must say whether any large-scale demographic claim about human violence over centuries can in principle be supported, and if not, what alternative epistemic stance the historian of violence is supposed to adopt.
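Taleb’s tail-risk point can be made with a deterministic toy calculation rather than historical data. The sketch below is our own illustration, not Cirillo and Taleb’s actual estimation (their fitted tail exponents are fatter still, near or below 1, where the mean itself is infinite). It compares a fat-tailed Pareto model of war sizes with a thin-tailed exponential model, asking in each case what share of expected total deaths the worst 1% of wars accounts for:

```python
import math

# Toy comparison: share of expected total deaths contributed by the top
# fraction p of wars, under a fat-tailed vs a thin-tailed size model.
# (Illustrative parameters only; not fitted to conflict data.)

def pareto_top_share(p: float, alpha: float) -> float:
    # For a Pareto distribution with tail exponent alpha > 1, the expected
    # mass above the (1 - p) quantile is p ** ((alpha - 1) / alpha) of the mean.
    return p ** ((alpha - 1) / alpha)

def exponential_top_share(p: float) -> float:
    # For an exponential distribution, the same quantity is p * (1 - ln p).
    return p * (1 - math.log(p))

p = 0.01       # the worst 1% of wars
alpha = 1.5    # an illustrative fat-tail exponent (still a finite mean)

fat = pareto_top_share(p, alpha)    # roughly 0.215
thin = exponential_top_share(p)     # roughly 0.056

print(f"fat-tailed:  top 1% of wars -> {fat:.1%} of expected deaths")
print(f"thin-tailed: top 1% of wars -> {thin:.1%} of expected deaths")
```

Under the fat-tailed model, a handful of extreme events dominates the totals, so a long stretch without one is weak evidence that the violence-generating process has changed; under the thin-tailed model, the same stretch would be informative. That asymmetry is the formal core of Position B.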
7 What is the relationship between the human sciences and power?
In the human sciences, every act of categorisation is political as well as scientific. To define “mental illness” is to decide who needs treatment and who needs policing. To measure “intelligence” is to rank people and justify the ranking. To classify societies as “developed” and “developing” is to establish a hierarchy with specific beneficiaries. Human science knowledge is not produced in a vacuum. It is funded by governments and corporations; it is applied in clinical, judicial, and policy settings; and its categories — mental illness, race, intelligence — have shaped who gets included in and excluded from social goods. The relationship between knowledge and power in the human sciences is not incidental. It is structural.
DSM-5’s Gender Dysphoria Reclassification (2013) and ICD-11 (2019)
In May 2013, the American Psychiatric Association published DSM-5, the fifth edition of its diagnostic manual.80 One of its most contested changes concerned the diagnosis previously listed as Gender Identity Disorder. In DSM-IV (1994) the diagnosis had been placed in the chapter on “Sexual and Gender Identity Disorders,” and the diagnostic criteria identified the disorder with cross-gender identification itself. DSM-5 renamed the diagnosis Gender Dysphoria, moved it into its own chapter, and reformulated the criteria so that the disorder lay in the clinically significant distress arising from the incongruence between experienced gender and assigned sex — not in the incongruence itself. The change was the outcome of a working-group process chaired by Kenneth Zucker (whose own clinical practice and research views were independently contested) and Jack Drescher (concerned specifically with depathologisation), running from 2008 through 2012.81 In June 2018 the World Health Organization went further: ICD-11 moved Gender Incongruence out of the chapter on Mental and Behavioural Disorders entirely, placing it in a new chapter on Conditions Related to Sexual Health, with the explanation that “gender incongruence is a phenomenon that exists across the world and is not in itself a mental disorder.”82 ICD-11 came into force on 1 January 2022. The reclassification trajectory — disorder, to dysphoria-as-disorder, to non-mental-disorder condition — was contested at every step. Clinicians who treated gender-incongruent young people split into roughly three positions: that the category should be retained as a mental disorder (some older clinical literature); that the DSM-5 dysphoria framing was the correct settlement (the APA majority); and that any pathologising framework was a category mistake (the WHO and most LGBTQ+ rights organisations). 
The 1973 APA removal of homosexuality from DSM-II (treated immediately below) is the structural precedent — and the case Foucault’s argument about pouvoir/savoir has been most often illustrated by. Both cases combine accumulating research evidence with political mobilisation; in both, what changed was a category whose stake for the people it classified made the category itself a contested object.
7.1 The 1973 Removal of Homosexuality from the DSM
Until 1973, the Diagnostic and Statistical Manual of Mental Disorders classified homosexuality as a psychiatric disorder — as a “sociopathic personality disturbance” in DSM-I (1952) and as a “sexual deviation” in DSM-II (1968). The classification was used to justify criminal prosecution, involuntary institutionalisation, electroconvulsive therapy, and chemical castration of gay and lesbian individuals. The relevant scientific evidence had been available for over a decade: Evelyn Hooker’s 1957 study had given the Rorschach, Thematic Apperception Test, and Make-A-Picture-Story tests to thirty homosexual and thirty heterosexual men, matched for age, IQ, and education, and asked expert clinicians (blinded to orientation) to identify pathology and to pick the homosexual subjects from the ratings. They could not — performance was at chance.83 The 1973 removal did not result from new scientific evidence Hooker’s work had overlooked. It resulted from a sustained campaign by gay-rights activists — including Frank Kameny’s confrontational disruption of the 1970 and 1971 APA conventions in San Francisco and Washington — combined with the internal work of psychiatrist Robert Spitzer, who in 1973 drafted a reformulation that removed “homosexuality” from DSM-II while introducing a new diagnosis, “sexual orientation disturbance,” for individuals distressed by their orientation. On 15 December 1973 the APA Board of Trustees adopted Spitzer’s reformulation 13–0 with two abstentions; opposing psychiatrists then forced a 1974 membership referendum, which ratified the Board’s decision by 5,854 votes to 3,810 — about 58 per cent of all ballots cast in favour of removal.84 The episode is the clearest illustration of Foucault’s argument about the relationship between psychiatric knowledge and power: the diagnostic category had not been produced by neutral scientific inquiry but had reflected, and reinforced, the dominant sexual norms of mid-century American society.
Its removal did not resolve the philosophical question — it opened it, by showing that the category’s production was a social and political process that could be reversed by social and political action. The 2013 DSM-5 reclassification of gender dysphoria and the 2018 ICD-11 reclassification of gender incongruence (info box above) are the same dispute, two generations on, with a different category in the dock.
7.2 Foucault’s Archaeology
Michel Foucault, in Discipline and Punish (1975)85 and The History of Sexuality (1976),86 argues that the human sciences are not neutral descriptions of human nature but instruments of what he calls power/knowledge (pouvoir/savoir) — the intertwining of knowledge production and social control. To classify, describe, and categorise human behaviour, on Foucault’s account, is to exercise power over it: to define what counts as normal and what counts as pathological, who counts as a subject and who as an object.
Foucault has critics. Ian Hacking accepts much of the historical analysis but distinguishes “interactive kinds” (categories that affect the people they classify) from “indifferent kinds” (natural-scientific categories that do not), arguing that Foucault’s blanket claim about all human-scientific classification overshoots the data — some psychiatric categories track real biological substrates regardless of how the category is socially deployed.87 Nancy Fraser presses a different objection: if all knowledge is power, the framework cannot distinguish knowledge that liberates (the homosexuality declassification) from knowledge that oppresses (the original classification), since both involve power, knowledge, and their interweaving.88 These are friendly criticisms within the broadly post-Foucauldian literature; they bound the claim rather than refuting it.
The psychiatric categorisation of homosexuality as a mental disorder is the case Foucauldians treat as paradigmatic. It appeared in the DSM from 1952 to 1973 (procedural detail in 7.1 above). Hooker’s 1957 evidence had been on the record for over a decade before the removal without moving the institutional apparatus; what moved it was the combination of that evidence, the work of reformers inside and around the profession (Robert Spitzer, the psychologist Charles Silverstein), and sustained political pressure from the gay rights movement led by activists such as Frank Kameny.
The Foucauldian reading: the classification reflected and enforced social norms about sexuality, and its dissolution required political work. The non-Foucauldian reading: the APA corrected an empirical error that its members were professionally and politically slow to correct, and the relevant evidence was scientific rather than political. Hacking’s interactive-kind framework offers a third, synthetic option — the category was both an empirical mistake and a social act whose effects on those classified shaped what the category came to mean — though Foucauldians and realists about psychiatric kinds each dispute whether Hacking’s middle position is a genuine synthesis or merely a concession to the other side.
7.3 IQ Tests and Immigration Restriction
In the early 20th century, the American eugenics movement used psychometric research on intelligence to support immigration restriction. Henry Goddard administered IQ tests (translated versions of the Binet-Simon scale) to immigrants arriving at Ellis Island beginning in 1912 and concluded, in results published in 1917, that 83% of Jews, 80% of Hungarians, 79% of Italians, and 87% of Russians were “feeble-minded.”89 These findings were presented as objective science and were cited in Congressional debates that led to the Immigration Act of 1924, which dramatically restricted immigration from Southern and Eastern Europe.
Stephen Jay Gould’s The Mismeasure of Man (1981, revised 1996)90 is the standard critical history of IQ research and scientific racism. Gould’s own analysis has itself been challenged on methodological grounds — most prominently over his treatment of Samuel Morton’s skull measurements — illustrating that critiques of science can be as methodologically contestable as the science they critique.
The methodological failures were spectacular: the tests were given in English to recent immigrants who spoke no English; the tests measured cultural familiarity with American middle-class life, not cognitive capacity; the samples were not representative. But the framing of the results as objective scientific measurement gave them a rhetorical power that more evidently political claims would not have had.
7.4 Decolonising Anthropology
Anthropology developed as a discipline in the 19th and early 20th centuries, while the major European powers were administering colonial empires. Colonial administrators needed to understand the societies they governed; anthropologists often produced that understanding. Talal Asad’s edited volume Anthropology and the Colonial Encounter (1973) made the entanglement explicit, and the field has been working with the resulting questions ever since.91 The historical record does not, however, support a simple “anthropology was a colonial tool” reading: Franz Boas and his American students (Mead, Benedict, Hurston) used the discipline to argue against the racial hierarchies the European powers were imposing, and several British social anthropologists (Evans-Pritchard among them) were treated with suspicion by the colonial administrations they were supposed to serve.92
The decolonisation programme — the attempt to reconstruct the discipline’s methods, categories, and institutional structures in ways that do not replicate colonial hierarchies — is an active research agenda associated with Linda Tuhiwai Smith (Decolonizing Methodologies, 1999), the Subaltern Studies collective, and more recent work by Faye V. Harrison and Eduardo Viveiros de Castro. It has both wide internal acceptance (most contemporary fieldworkers grant the basic point about positionality and consent) and contested edges: whether all Western analytic categories are colonial in the relevant sense (Adam Kuper argues they are not, in The Reinvention of Primitive Society, 2005); whether alternative “indigenous methodologies” can sustain cross-cultural comparison without collapsing into the relativism the WEIRD literature criticises; and whether the institutional reforms proposed (community veto over research outputs, indigenous co-authorship requirements) help or hinder the production of knowledge that travels.93
The “tribe” example is the canonical case for the decolonisers: the term was used by colonial administrators to impose fixed ethnic categories on more fluid social formations and was then absorbed into academic anthropology as a natural unit of analysis.94 The example is real but less universal than the strongest decolonisation arguments imply: kinship-based segmentary organisation is a documented social form (Evans-Pritchard’s Nuer, Marshall Sahlins’s analysis of Polynesian chiefdoms) whose existence does not depend on the colonial label. The methodological lesson is narrower than the political slogan: imported categories must be tested against local data rather than assumed; whether they survive that test is a case-by-case question, not a general one.
The deeper tension that decolonisation surfaces — whether any social science can be both cross-culturally generalisable and culturally specific — remains unresolved, and it runs through contemporary anthropology, sociology, and development economics.
7.5 Questions to Argue About
- Foucault argues that the human sciences produce “power/knowledge” — that scientific classification is always also an exercise of power. Does this mean human science knowledge is less valid than natural science knowledge? Or does it just add a dimension of analysis that natural science lacks?
- Homosexuality was classified as a mental disorder until 1973, then reclassified by a political process rather than new scientific evidence. What does this tell us about the DSM as a scientific instrument?
- The IQ tests administered to Ellis Island immigrants produced systematically biased results that were presented as objective science. How should this history affect our assessment of current psychometric research?
- Can anthropology be “decolonised” while remaining a rigorous academic discipline? Or does decolonisation require changes that would undermine the conditions for rigorous inquiry?
Forced Fork: Did the DSM-5 Reclassification of Gender Dysphoria Show That Diagnostic Categories Create the Disorders They Name?
The case is in the info-box above. DSM-IV (1994) called it Gender Identity Disorder and located the disorder in the cross-gender identification itself. DSM-5 (2013) renamed it Gender Dysphoria and located the disorder in the distress arising from the incongruence. ICD-11 (2018, in force 2022) moved Gender Incongruence out of the mental-disorders chapter altogether. In each step, scientific literature accumulated alongside political mobilisation; in each step, the reclassification was contested at the time and remains contested at policy implementation. The 1973 APA removal of homosexuality (treated above in the body) is the structural precedent.
Position A (constitutive — Hacking’s looping effect): The 2013 reclassification, like the 1973 case before it, is strong evidence that the original Gender Identity Disorder category was not produced by neutral inquiry. Categories revised under sustained political mobilisation, expert disagreement, and shifting cultural norms are at least partly constituted by the social norms that surround them. When a diagnostic category (ADHD, PTSD, borderline personality disorder, gender dysphoria) is introduced or revised, people are diagnosed who previously would not have been, experience themselves through its framework, and are treated accordingly. Diagnosis does not invent the distress; it shapes what the distress becomes. The ICD-11 move is the strongest version of the point: the WHO has, in effect, ruled that the category of mental disorder no longer applies to the underlying phenomenon — a change in classification that does not correspond to any change in the phenomenon being classified.
Position B (psychiatric realism — Kendler 2008): Both reclassifications show the DSM correcting errors in how it characterised an underlying phenomenon; the corrections presuppose a fact of the matter. By 1973, Hooker’s 1957 work had already shown the homosexuality classification to be empirically unsupported. Subsequent research and clinical experience showed that the Gender Identity Disorder framing pathologised cross-gender identification rather than the distress that may or may not accompany it; the reclassification corrected that. The looping effect is real for socially fluid categories, but disorders like schizophrenia and major depression present cross-culturally with stable neurobiological signatures (dopaminergic, HPA-axis) and track underlying kinds independent of the label. Reclassification is the discipline working as it should.
Choose one. The hardest case is PTSD: before the diagnosis was introduced (1980), combat trauma went by other names — “shell shock” for the First World War, “combat exhaustion” for the Second, “post-Vietnam syndrome” for the veterans whose advocacy produced the DSM-III category — different labels, different social frameworks, different treatment. Does the different label mean a different disorder? And if the disorder is partly constituted by how it is understood, what follows for the universality of psychological diagnoses across cultures — including categories like gender dysphoria whose application varies sharply by jurisdiction in 2025?
8 Can the human sciences be value-free?
Weber had a word for it — Wertfreiheit, value-freedom — and he was honest enough to admit that it named an aspiration, not an achievement. The ideal holds that the scientist’s job is to describe and explain, not to prescribe or judge. Facts about how societies work should be kept separate from values about how they ought to work. This seems methodologically reasonable. The problem is that every act of measurement is also an act of valuation — you cannot choose what to count without already having decided, at some level, what matters. The ideal turns out to be much more difficult to achieve than it looks — and, as the Swedish economist Gunnar Myrdal argued, the attempt to pursue it actively conceals the values that are already operative.
The IMF, the Washington Consensus, and Argentina 2001
In the 1990s, Argentina implemented a series of economic reforms strongly encouraged by the International Monetary Fund — fiscal discipline, privatisation, trade liberalisation, and the pegging of the Argentine peso to the US dollar at a one-to-one exchange rate. The reforms were initially successful and Argentina was held up by the IMF as a model for developing economies. By 2001, the convertibility plan had made Argentine exports uncompetitive, unemployment had climbed toward 20 per cent, and a series of bank runs prompted the government to restrict cash withdrawals. The corralito — the freezing of bank accounts — triggered street protests that killed 39 people and brought down the government. Argentina defaulted on $100 billion of public debt, the largest sovereign default in history at that time.95 The IMF’s post-crisis review acknowledged multiple failures of judgement.96 The episode illustrates several layers of the epistemological problem in economics: the Washington Consensus was not a set of empirical findings derived from dispassionate analysis but a set of ideological commitments dressed in the language of technical economics; the feedback between economic theory and economic reality made outcomes non-independent of the predictions being made; and the institutional incentives of the IMF were not separable from the knowledge claims it was making.
8.1 Weber’s Wertfreiheit
Weber articulated the ideal clearly: the social scientist must keep their value judgements out of their scientific work.97 Not because values are unimportant, but because the scientist’s authority comes from their command of facts and methods, not from their moral commitments. A sociologist who lets their preference for socialism or liberalism distort their empirical findings has abused their professional role.
Weber was not naive about the difficulty. He knew that values inevitably shape which questions social scientists ask, which phenomena they find worth studying, and which categories they use. But he thought that rigorous self-discipline could manage these influences sufficiently for empirical findings to have objective standing.
8.2 Myrdal’s Critique
Gunnar Myrdal, the Swedish economist and sociologist, argued in The Political Element in the Development of Economic Theory (1930)98 that Weber’s ideal is systematically unreachable, and that the attempt to pursue it actively hides the values that are actually operative.
Economic categories are not neutral. “Unemployment” as a category defines some people as being in an economically relevant relationship to the labour market and others as not; the choice of who counts as “unemployed” (not those in unpaid care work, not the “voluntarily unemployed,” not the long-term sick) is a political choice. “Growth” as the primary measure of economic success embeds a valuation of certain kinds of economic activity over others. Choosing GDP as the measure of national welfare is itself a value judgement — it says that welfare consists in material production and consumption, not in leisure, community, environmental quality, or spiritual life.
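Myrdal’s point about the category “unemployment” can be made concrete with a toy calculation. The sketch below uses invented labour-force counts — every number and group name is hypothetical, not drawn from any real survey — and shows only that the published rate moves when the definitional boundary moves.

```python
# Hypothetical labour-force counts (illustrative only, not real data).
population = {
    "employed": 28_000,
    "actively_seeking": 2_000,   # the narrow, ILO-style "unemployed"
    "discouraged": 1_500,        # want work, have stopped searching
    "unpaid_care_work": 4_000,   # excluded from the labour force entirely
    "long_term_sick": 1_000,     # likewise excluded
}

def unemployment_rate(unemployed_groups, labour_force_groups):
    """Rate = unemployed / labour force, under a chosen definition."""
    unemployed = sum(population[g] for g in unemployed_groups)
    labour_force = sum(population[g] for g in labour_force_groups)
    return 100 * unemployed / labour_force

# Narrow definition: only active job-seekers count as unemployed.
narrow = unemployment_rate(
    ["actively_seeking"],
    ["employed", "actively_seeking"],
)

# Broader definition: discouraged workers count as unemployed too.
broad = unemployment_rate(
    ["actively_seeking", "discouraged"],
    ["employed", "actively_seeking", "discouraged"],
)

print(f"narrow: {narrow:.1f}%")  # 2000/30000  -> 6.7%
print(f"broad:  {broad:.1f}%")   # 3500/31500  -> 11.1%
```

Neither rate is the “true” one; choosing where to draw the boundary is precisely the value judgement Myrdal is describing.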
Bhutan’s Gross National Happiness Index (introduced in the 1970s by King Jigme Singye Wangchuck) attempts to measure national welfare through nine domains including living standards, time use, psychological wellbeing, cultural resilience, and ecological diversity.99 The index reflects Buddhist values about the good life. GDP reflects different values. Both are value-laden; Bhutan’s is more transparent about it.
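The difference between GDP-style and GNH-style measurement is, at bottom, a difference in weights. The following is a minimal sketch with invented domain scores and invented weights for two hypothetical countries — none of these numbers describe real national accounts, and the three domains are a simplification, not the actual nine GNH domains.

```python
# Two hypothetical countries scored 0-100 on three illustrative domains.
# All scores and weights are invented for the sake of the argument.
scores = {
    "Country A": {"material": 90, "ecology": 40, "community": 50},
    "Country B": {"material": 60, "ecology": 85, "community": 80},
}

def composite(weights):
    """Weighted average of domain scores; weights must sum to 1."""
    assert abs(sum(weights.values()) - 1.0) < 1e-6
    return {
        country: sum(weights[d] * domain_scores[d] for d in weights)
        for country, domain_scores in scores.items()
    }

# GDP-like weighting: material production is nearly everything.
gdp_like = composite({"material": 0.90, "ecology": 0.05, "community": 0.05})

# GNH-like weighting: domains weighted roughly evenly.
gnh_like = composite({"material": 0.34, "ecology": 0.33, "community": 0.33})

print(max(gdp_like, key=gdp_like.get))  # Country A ranks first
print(max(gnh_like, key=gnh_like.get))  # Country B ranks first
```

The underlying data are identical in both calculations; only the weights differ — and the weights are where the values live.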
The values are not introduced by individual scientists making bad choices; they are built into the apparatus of measurement and the categories of analysis. Making them explicit — asking “whose interests does this framework serve?” — is not a corruption of social science; it is a form of scientific rigour.
8.3 Positive and Normative
The standard distinction in economics: positive economics describes how the economy works (“a minimum wage increase of £1 reduces employment by X%”); normative economics prescribes how it ought to work (“the minimum wage should be increased”). The positive/normative distinction is supposed to demarcate the scientist’s domain from the politician’s.
The difficulty: even if positive findings are value-free, the choice of which positive questions to investigate is value-laden. If economists predominantly study the effects of minimum wage increases on employment and rarely study the effects of low wages on worker health and family stability, that selection reflects values even if each individual study is “positive.” The choice of what to measure, what to call a “cost” and what to call a “benefit,” and what causal pathways to investigate are all value-laden choices embedded in apparently neutral analysis.
8.4 Questions to Argue About
- Weber says the social scientist should keep value judgements out of their scientific work. Is this achievable? Or does every choice of category, method, and research question already involve value commitments?
- Myrdal argues that the attempt to achieve value-free social science hides the values that are actually operating. Is explicit value commitment in social science more honest, or more dangerous?
- GDP measures national welfare in a specific way that reflects specific values about what matters. What would it mean to replace GDP with a different measure? Who should decide what to measure?
- The positive/normative distinction in economics is supposed to separate facts from values. But can they ever be cleanly separated in practice? Give an example where you think they bleed into each other.
Forced Fork: Is GDP a Measure of What Matters?
Position A: GDP should be replaced as the primary measure of national wellbeing by a more comprehensive indicator — Sen and Nussbaum’s capabilities approach,100 Bhutan’s Gross National Happiness index, or the Human Development Index. GDP measures economic activity, not welfare; it counts expenditure on disaster cleanup and cancer treatment as contributions to wellbeing; it ignores unpaid domestic labour, environmental degradation, and the distribution of income. A society that maximises GDP is not necessarily a flourishing one.
Position B: GDP should be retained as the primary metric precisely because it is simple, comparable, and reliably measured. Every proposed replacement — happiness indices, capabilities measures, composite indices — either introduces contestable normative assumptions into the measurement, loses comparability across cultures and time, or is too complex to communicate meaningfully to policymakers. The limitations of GDP are real but well-understood; the limitations of its alternatives are obscure and underappreciated.
Choose one. If you choose Position A, specify exactly what you would measure instead and how — and explain why governments would adopt your measure, given the political incentives to optimise whatever is measured. If you choose Position B, explain what you say to the objection that optimising GDP has produced genuine material poverty for millions of people in societies with high average GDP.
9 Media
- Adam McKay, The Big Short (2015) — An unusually epistemologically sophisticated film about the 2008 financial crisis; it explains financial instruments accurately while dramatising the gap between what the models assumed and what the world did. The device of having characters break the fourth wall to explain concepts is unusual and effective.
- Daniel Kahneman, Thinking, Fast and Slow (2011) — Part IV (“Choices”) examines the gap between rational choice theory and actual human decision-making and remains the strongest material in the book. The Part I social-priming chapters have largely failed to replicate; Kahneman himself acknowledged in 2017 that he had “placed too much faith in underpowered studies.” Read Part IV as the load-bearing argument and Part I as a cautionary tale about how recently we have learned which findings will hold up.
- Hannah Arendt, The Origins of Totalitarianism (1951) — The third section, “Totalitarianism,” examines how political violence is produced not by uniquely evil individuals but by institutional and ideological structures. The connection to Milgram’s later findings is direct and profound.
- Joshua Oppenheimer, The Act of Killing (2012) — Perpetrators of the 1965 Indonesian massacres re-enact their killings for the camera in the style of Hollywood genres. The film raises questions about what human beings are capable of, and how they understand themselves, that no experimental psychology can fully address.
- Asghar Farhadi, A Separation (2011) — A divorce in contemporary Tehran becomes a sustained meditation on testimony, class, religious obligation, and what it costs to discover the truth about another person’s actions. The film makes visible how human-scientific knowledge — sociological, legal, psychological — is produced under conditions of conflicting interest and incomplete information.
- George Orwell, The Road to Wigan Pier (1937) — Part I is a documentary account of working-class life in northern England; Part II is an essay on socialism and class that is methodologically self-aware about the observer’s position. A model of engaged social science that does not pretend to be value-free.
- Studs Terkel, Working (1974) — Oral history interviews with 133 Americans about their experience of work. The methodology is explicitly qualitative and testimonial; the book is both a rich empirical source and an argument about what matters in the study of economic life.
- Robert Sapolsky, Behave: The Biology of Humans at Our Best and Worst (2017) — A neurobiologist traces any human behaviour through its immediate neural causes, its hormonal context, its developmental history, and its evolutionary origins. The most important demonstration that biology and social science must be integrated rather than sequenced: you cannot understand human rationality, morality, or cooperation without all the levels simultaneously.
10 Bibliography
Angrist, Joshua D. and Jörn-Steffen Pischke. Mostly Harmless Econometrics: An Empiricist’s Companion. Princeton: Princeton University Press, 2009.
Arendt, Hannah. Eichmann in Jerusalem: A Report on the Banality of Evil. New York: Viking Press, 1963.
Aristotle. Politics. Trans. C. D. C. Reeve. Indianapolis: Hackett, 1998.
Asad, Talal, ed. Anthropology and the Colonial Encounter. London: Ithaca Press, 1973.
Asad, Talal. Genealogies of Religion: Discipline and Reasons of Power in Christianity and Islam. Baltimore: Johns Hopkins University Press, 1993.
Bayer, Ronald. Homosexuality and American Psychiatry: The Politics of Diagnosis. Rev. ed. Princeton: Princeton University Press, 1987.
Barrett, Lisa Feldman. How Emotions Are Made: The Secret Life of the Brain. Boston: Houghton Mifflin Harcourt, 2017.
Bouchaud, Jean-Philippe. “Economics Needs a Scientific Revolution.” Nature 455 (2008): 1181.
Burger, Jerry M. “Replicating Milgram: Would People Still Obey Today?” American Psychologist 64.1 (2009): 1–11.
Carney, Dana R., Amy J. C. Cuddy, and Andy J. Yap. “Power Posing: Brief Nonverbal Displays Affect Neuroendocrine Levels and Risk Tolerance.” Psychological Science 21.10 (2010): 1363–1368.
Cohan, William D. House of Cards: How Wall Street’s Gamblers Broke Capitalism. New York: Doubleday, 2009.
Curtiss, Susan. Genie: A Psycholinguistic Study of a Modern-Day “Wild Child”. New York: Academic Press, 1977.
Davidson, Donald. “Actions, Reasons, and Causes.” Journal of Philosophy 60.23 (1963): 685–700. Reprinted in Martin and McIntyre, eds., Readings in the Philosophy of Social Science, ch. 43.
Ekman, Paul. “Universals and Cultural Differences in Facial Expressions of Emotion.” In Nebraska Symposium on Motivation, ed. J. Cole, 207–283. Lincoln: University of Nebraska Press, 1971.
Elster, Jon. Explaining Social Behavior: More Nuts and Bolts for the Social Sciences. Rev. ed. Cambridge: Cambridge University Press, 2015.
Foucault, Michel. Discipline and Punish: The Birth of the Prison. 1975. Trans. Alan Sheridan. New York: Pantheon Books, 1977.
Foucault, Michel. The History of Sexuality. Vol. 1. 1976. Trans. Robert Hurley. New York: Pantheon Books, 1978.
Fraser, Nancy. Unruly Practices: Power, Discourse and Gender in Contemporary Social Theory. Minneapolis: University of Minnesota Press, 1989.
Freeman, Derek. Margaret Mead and Samoa: The Making and Unmaking of an Anthropological Myth. Cambridge, MA: Harvard University Press, 1983.
Gould, Stephen Jay. The Mismeasure of Man. Rev. ed. New York: Norton, 1996.
Hacking, Ian. The Social Construction of What? Cambridge, MA: Harvard University Press, 1999.
Haidt, Jonathan. The Righteous Mind: Why Good People Are Divided by Politics and Religion. New York: Pantheon Books, 2012.
Haslam, S. Alexander, and Stephen D. Reicher. “Contesting the ‘Nature’ of Conformity: What Milgram and Zimbardo’s Studies Really Show.” PLoS Biology 10.11 (2012): e1001426.
Henrich, Joseph, Steven J. Heine, and Ara Norenzayan. “The Weirdest People in the World?” Behavioral and Brain Sciences 33.2–3 (2010): 61–83.
Henrich, Joseph. The WEIRDest People in the World: How the West Became Psychologically Peculiar and Particularly Prosperous. New York: Farrar, Straus and Giroux, 2020.
Hobbes, Thomas. Leviathan. London: Andrew Crooke, 1651.
Hooker, Evelyn. “The Adjustment of the Male Overt Homosexual.” Journal of Projective Techniques 21.1 (1957): 18–31.
Kahneman, Daniel. Thinking, Fast and Slow. New York: Farrar, Straus and Giroux, 2011.
Kahneman, Daniel and Amos Tversky. “Prospect Theory: An Analysis of Decision Under Risk.” Econometrica 47.2 (1979): 263–291.
Kuper, Adam. The Reinvention of Primitive Society: Transformations of a Myth. London: Routledge, 2005.
Le Texier, Thibault. Histoire d’un mensonge: Enquête sur l’expérience de Stanford. Paris: La Découverte, 2018.
Levitt, Steven D. and John A. List. “Was There Really a Hawthorne Effect at the Hawthorne Plant? An Analysis of the Original Illumination Experiments.” American Economic Journal: Applied Economics 3.1 (2011): 224–238.
Locke, John. Two Treatises of Government. London: Awnsham Churchill, 1689.
MacKenzie, Donald. An Engine, Not a Camera: How Financial Models Shape Markets. Cambridge, MA: MIT Press, 2006.
Malthus, Thomas Robert. An Essay on the Principle of Population. London: J. Johnson, 1798.
Martin, Michael, and Lee C. McIntyre, eds. Readings in the Philosophy of Social Science. Cambridge, MA: MIT Press, 1994. Includes Hempel’s “Logic of Functional Analysis” (ch. 22), Cohen’s “Functional Explanation in Marxism” (ch. 24), and Davidson’s “Actions, Reasons, and Causes” (ch. 43).
Mead, Margaret. Coming of Age in Samoa. New York: William Morrow, 1928.
Milgram, Stanley. “Behavioral Study of Obedience.” Journal of Abnormal and Social Psychology 67.4 (1963): 371–378.
Myrdal, Gunnar. The Political Element in the Development of Economic Theory. 1930. Trans. Paul Streeten. London: Routledge and Kegan Paul, 1953.
Open Science Collaboration. “Estimating the Reproducibility of Psychological Science.” Science 349.6251 (2015): aac4716.
Perry, Gina. Behind the Shock Machine: The Untold Story of the Notorious Milgram Psychology Experiments. New York: The New Press, 2013.
Pinker, Steven. The Blank Slate: The Modern Denial of Human Nature. New York: Viking, 2002.
Pinker, Steven. The Better Angels of Our Nature: Why Violence Has Declined. New York: Viking, 2011.
Popper, Karl. The Poverty of Historicism. London: Routledge and Kegan Paul, 1957.
Ranehill, Eva, Anna Dreber, Magnus Johannesson, Susanne Leiberg, Sunhae Sul, and Roberto A. Weber. “Assessing the Robustness of Power Posing: No Effect on Hormones and Risk Tolerance in a Large Sample of Men and Women.” Psychological Science 26.5 (2015): 653–656.
Rosenberg, Alexander. Philosophy of Social Science. Boulder: Westview Press, 1988.
Rousseau, Jean-Jacques. Discourse on the Origin and Foundation of Inequality Among Men. Amsterdam: Marc-Michel Rey, 1755.
Rousseau, Jean-Jacques. The Social Contract. Amsterdam: Marc-Michel Rey, 1762.
Sapolsky, Robert M. Behave: The Biology of Humans at Our Best and Worst. New York: Penguin Press, 2017.
Sartre, Jean-Paul. Existentialism is a Humanism. Lecture 1945, published 1946. Trans. Carol Macomber. New Haven: Yale University Press, 2007.
Segall, Marshall H., Donald T. Campbell, and Melville J. Herskovits. The Influence of Culture on Visual Perception. Indianapolis: Bobbs-Merrill, 1966.
Shankman, Paul. The Trashing of Margaret Mead: Anatomy of an Anthropological Controversy. Madison: University of Wisconsin Press, 2009.
Smith, Adam. The Wealth of Nations. London: W. Strahan and T. Cadell, 1776.
Stiglitz, Joseph E. Globalization and Its Discontents. New York: W. W. Norton, 2002.
Stocking, George W., Jr. Race, Culture, and Evolution: Essays in the History of Anthropology. New York: Free Press, 1968.
Tetlock, Philip E. Expert Political Judgment: How Good Is It? How Can We Know? Princeton: Princeton University Press, 2005.
Tetlock, Philip E., and Dan Gardner. Superforecasting: The Art and Science of Prediction. New York: Crown, 2015.
Thaler, Richard H. and Cass R. Sunstein. Nudge: Improving Decisions About Health, Wealth, and Happiness. New Haven: Yale University Press, 2008.
Turchin, Peter and Sergey A. Nefedov. Secular Cycles. Princeton: Princeton University Press, 2009.
Turchin, Peter. “Political Instability May Be a Contributor in the Coming Decade.” Nature 463 (4 February 2010): 608.
Turchin, Peter. Ages of Discord: A Structural-Demographic Analysis of American History. Chaplin, CT: Beresta Books, 2016.
Tversky, Amos and Daniel Kahneman. “Availability: A Heuristic for Judging Frequency and Probability.” Cognitive Psychology 5 (1973): 207–232.
Tversky, Amos and Daniel Kahneman. “Judgment Under Uncertainty: Heuristics and Biases.” Science 185 (1974): 1124–1131.
Tversky, Amos and Daniel Kahneman. “The Framing of Decisions and the Psychology of Choice.” Science 211.4481 (1981): 453–458.
Weber, Max. The Methodology of the Social Sciences. 1904–1917. Trans. Edward A. Shils and Henry A. Finch. Glencoe, IL: Free Press, 1949.
Weber, Max. The Protestant Ethic and the Spirit of Capitalism. 1905. Trans. Talcott Parsons. London: Allen and Unwin, 1930.
Winch, Peter. The Idea of a Social Science and Its Relation to Philosophy. London: Routledge and Kegan Paul, 1958.
Zimbardo, Philip G. The Lucifer Effect: Understanding How Good People Turn Evil. New York: Random House, 2007.
11 Notes
Swiss National Bank press release, 6 September 2011, announcing a minimum exchange rate of CHF 1.20 per euro and unlimited willingness to buy euros to defend it. Standard reference: SNB Quartalsheft 2/2012; Daniel Kaufmann and Sandra Hanslin, “What is the SNB’s Floor Worth?”, Swiss National Bank Working Paper 2014-08; Adrien Faure, La banque centrale et le franc fort (Geneva: Slatkine, 2015). [VERIFY]↩︎
Swiss National Bank, Annual Report 2014, with foreign currency reserves of CHF 510 billion against Swiss GDP of CHF 642 billion at year-end 2014; the proportion (~85 % of GDP) was the highest among major central banks. By 2022 reserves had peaked at over CHF 950 billion. [VERIFY]↩︎
Swiss National Bank press release, 15 January 2015, 10:30 CET, announcing discontinuation of the EUR/CHF minimum exchange rate. The EUR/CHF rate fell as low as roughly CHF 0.85 per euro within minutes and closed the day at approximately CHF 1.05 per euro. Coverage: Financial Times, Neue Zürcher Zeitung, Tages-Anzeiger, Bloomberg, Reuters, 15–17 January 2015. The Polish frankowicze mortgage litigation, initiated 2015 and ongoing, is summarised in CJEU case C-260/18 Dziubak v Raiffeisen, judgment 3 October 2019; the Polish Supreme Court resolution III CZP 25/22 of April 2024 favoured borrowers. KOF Konjunkturforschungsstelle ETH Zürich, KOF Studien Nr. 80 (2017), estimated the export-margin loss to Swiss firms at CHF 4.5 billion in 2015. [VERIFY]↩︎
Max Weber, The Methodology of the Social Sciences, trans. Edward A. Shils and Henry A. Finch (Glencoe, IL: Free Press, 1949) — see “‘Objectivity’ in Social Science and Social Policy” (1904), pp. 49–112, for the value-relevance / Verstehen programme. The companion locus is “Basic Sociological Terms” in Economy and Society (1922), §1, where Weber defines sociology as the interpretive understanding of social action.↩︎
Max Weber, Die protestantische Ethik und der Geist des Kapitalismus (1904–05); trans. Parsons (Allen and Unwin, 1930). Weber’s argument runs across Chs III–V: the doctrine of the calling (Ch III), Calvinist worldly asceticism (Ch IV), and the conclusion that this asceticism “did its part in building the tremendous cosmos of the modern economic order” (Ch V).↩︎
Peter Winch, The Idea of a Social Science and Its Relation to Philosophy (1958), Chapter I.↩︎
Jon Elster, Explaining Social Behavior: More Nuts and Bolts for the Social Sciences, rev. ed. (2015), Chapter 1 (“Explanation”), in the section on “What is to be explained?” The verbatim sentence quoted is from the discussion that introduces methodological individualism as a working premise of the book; Elster qualifies it in the same passage by allowing “harmless shorthand” reference to households, firms, or nations where individual-level data are unavailable.↩︎
Carl G. Hempel, “The Logic of Functional Analysis” (1959), reprinted as Chapter 22 in Michael Martin and Lee C. McIntyre, eds., Readings in the Philosophy of Social Science (1994). The phrasing “the rain dance or some functional equivalent” comes from the editors’ summary in Part V (p. 375) of the same volume; Hempel’s own argument rests on the existence of “functional equivalents, or functional substitutes” for any given cultural item, which destroys the deductive form of functional explanation.↩︎
G. A. Cohen, “Functional Explanation in Marxism,” from Karl Marx’s Theory of History: A Defense (Princeton University Press, 1978), 278–296; reprinted as Chapter 24 in Martin and McIntyre, Readings in the Philosophy of Social Science (1994). Cohen argues that “consequence laws” of the form “if X would be useful, then X comes to exist” can be confirmed without prior knowledge of the selecting mechanism.↩︎
Jon Elster, Explaining Social Behavior (rev. ed. 2015), Chapter 11 on functional explanation; the verbatim “worthless” judgement is Elster’s on functional arguments that fail to provide a feedback mechanism (the example given is the claim that codes of honour exist among the urban aristocracy because the nobility “needed” duels). Elster’s earlier Explaining Technical Change (Cambridge University Press, 1983) develops the critique at book length.↩︎
Donald Davidson, “Actions, Reasons, and Causes,” Journal of Philosophy 60.23 (7 November 1963): 685–700; reprinted as Chapter 43 in Martin and McIntyre, Readings in the Philosophy of Social Science (1994), pp. 675–690. The “rationalization is a species of causal explanation” formulation opens Section I; the thesis that “the primary reason for an action is its cause” is the second of Davidson’s two structuring claims.↩︎
Donald Davidson, “Actions, Reasons, and Causes” (1963), as cited in the previous note.↩︎
Alexander Rosenberg, Philosophy of Social Science (Westview, 1988), Chapter 2 (“The Explanation of Human Action”), in the section “Reasons and Causes” (pp. 36–47). Rosenberg’s framing of the post-Davidsonian settlement: “The difference between reasons and causes is crucial, and every account of the explanation of human action must face it. The difference between them is sometimes difficult to keep clear, especially if, as most social scientists hold, beliefs and desires are at the same time both the reasons for actions and their causes” (p. 37). Rosenberg’s discussion of functionalism is in Chapter 5, “Functional Analysis and Functional Explanation” and “The Trouble with Functionalism” (pp. 158–169).↩︎
F. J. Roethlisberger and William J. Dickson, Management and the Worker: An Account of a Research Program Conducted by the Western Electric Company, Hawthorne Works, Chicago (Cambridge, MA: Harvard University Press, 1939) — the official chronicle of the 1924–1932 illumination, relay-assembly, and bank-wiring studies. For the contemporaneous interpretive frame, see Elton Mayo, The Human Problems of an Industrial Civilization (1933), Chapters 3–5.↩︎
Steven D. Levitt and John A. List, “Was There Really a Hawthorne Effect at the Hawthorne Plant? An Analysis of the Original Illumination Experiments,” American Economic Journal: Applied Economics 3.1 (2011): 224–238.↩︎
Donald MacKenzie, An Engine, Not a Camera (2006), Chapter 1 (“Performing Theory”) — the term “Barnesian performativity” comes from the sociologist Barry Barnes.↩︎
Morewedge et al. (2015) find durable reductions in confirmation bias, attribution error, and anchoring from a single training video; Arkes (1991) and others show many biases are robust to warnings and incentives. Honest summary: debiasing works for some biases under some conditions; the “Kahneman published a book and the population changed” framing overstates the evidence.↩︎
Amos Tversky and Daniel Kahneman, “The Framing of Decisions and the Psychology of Choice,” Science 211.4481 (1981): 453–458.↩︎
Adam Smith, An Inquiry into the Nature and Causes of the Wealth of Nations (1776), Book IV, Chapter II — the phrase “invisible hand” appears only once in the work.↩︎
Adam Smith, An Inquiry into the Nature and Causes of the Wealth of Nations (1776), Book I, Chapter II for the butcher-brewer-baker passage; Smith’s prior account in The Theory of Moral Sentiments (1759), Part I, grounds social life in sympathy (fellow-feeling) before self-interest. For the canonical instrumentalist defence of rational-choice modelling, see Milton Friedman, “The Methodology of Positive Economics,” in Essays in Positive Economics (1953), 3–43; Gary S. Becker, The Economic Approach to Human Behavior (1976), Chapter 1, extends the apparatus to non-market domains. For the philosophical recalibration, Amartya Sen, “Rational Fools,” Philosophy & Public Affairs 6.4 (1977): 317–344.↩︎
Daniel Kahneman, Thinking, Fast and Slow (2011), especially Parts III (“Overconfidence”) and IV (“Choices”).↩︎
Amos Tversky and Daniel Kahneman, “Judgment Under Uncertainty: Heuristics and Biases,” Science 185 (1974): 1124–1131.↩︎
Daniel Kahneman and Amos Tversky, “Prospect Theory: An Analysis of Decision Under Risk,” Econometrica 47.2 (1979): 263–291.↩︎
Amos Tversky and Daniel Kahneman, “Availability: A Heuristic for Judging Frequency and Probability,” Cognitive Psychology 5 (1973): 207–232.↩︎
Richard H. Thaler and Cass R. Sunstein, Nudge: Improving Decisions About Health, Wealth, and Happiness (2008).↩︎
Daniel Kahneman, Thinking, Fast and Slow (2011), Chapter 4 (“The Associative Machine”) — the section explicitly titled “The Marvels of Priming” treats John Bargh’s “elderly priming” walking-speed study, the “Florida effect,” and money primes as load-bearing examples of System 1 automaticity. Stéphane Doyen, Olivier Klein, Cora-Lise Pichon, and Axel Cleeremans, “Behavioral Priming: It’s All in the Mind, but Whose Mind?” PLoS ONE 7.1 (2012): e29081, is the high-profile failure to replicate Bargh. Daniel Kahneman, open letter to social-priming researchers, posted by Ulrich Schimmack on 14 February 2017: “I placed too much faith in underpowered studies… I have changed my views about the strength of the priming effects.” For the broader pattern, see the replication-crisis discussion below.↩︎
For the Bear Stearns sale (16 March 2008, $2/share rising to $10/share, with a $30 billion Federal Reserve loan against Bear’s mortgage assets), see William D. Cohan, House of Cards: How Wall Street’s Gamblers Broke Capitalism (Doubleday, 2009), Chapters 1–4; the opening pages of Part I anchor the 16 March 2008 sale to JP Morgan Chase. For the Lehman Brothers Chapter 11 filing of 15 September 2008 — $639 billion in assets, the largest US bankruptcy on record — see Rosalind Z. Wiggins, Thomas Piontek, and Andrew Metrick, “The Lehman Brothers Bankruptcy A: Overview,” Journal of Financial Crises 1.1 (2014): 39–62. For the Federal Reserve’s $85 billion loan to AIG of 16 September 2008 against a 79.9% equity stake — driven by AIG Financial Products’ credit-default-swap exposure on collateralised debt obligations — see Robert McDonald and Anna Paulson, “AIG in Hindsight,” Journal of Economic Perspectives 29.2 (2015): 81–106, and the Congressional Research Service report Government Assistance for AIG: Summary and Cost (R42953, updated 2013).↩︎
Raghuram G. Rajan’s “Has Financial Development Made the World Riskier?” — Jackson Hole Symposium, August 2005 — warned that financial-sector incentives were generating tail risk; he was widely dismissed at the time. Nouriel Roubini’s IMF warnings of September 2006 are documented in Roubini and Mihm, Crisis Economics (Penguin, 2010), Ch. 1. For the institutional-design diagnosis, see Admati and Hellwig, The Bankers’ New Clothes (Princeton, 2013).↩︎
Daniel Kahneman, Thinking, Fast and Slow (2011), Part III, Chapter 19 (“The Illusion of Understanding”). The sentence appears on p. 212 of the Farrar, Straus and Giroux hardcover edition.↩︎
Robert M. Sapolsky, Behave: The Biology of Humans at Our Best and Worst (2017), Chapter 1 (“Behavior”) for the layered-explanation framework.↩︎
Dana R. Carney, Amy J. C. Cuddy, and Andy J. Yap, “Power Posing: Brief Nonverbal Displays Affect Neuroendocrine Levels and Risk Tolerance,” Psychological Science 21.10 (2010): 1363–1368.↩︎
Amy Cuddy, “Your Body Language May Shape Who You Are,” TEDGlobal 2012, filmed June 2012 in Edinburgh; archived at https://www.ted.com/talks/amy_cuddy_your_body_language_may_shape_who_you_are (accessed 26 April 2026). The talk is consistently ranked, by TED’s own published view counts, as the second most viewed TED talk of all time (after Sir Ken Robinson’s “Do Schools Kill Creativity?”).↩︎
Eva Ranehill et al., “Assessing the Robustness of Power Posing: No Effect on Hormones and Risk Tolerance in a Large Sample of Men and Women,” Psychological Science 26.5 (2015): 653–656.↩︎
Dana R. Carney, “My Position on ‘Power Poses’” (statement posted to her Berkeley Haas faculty page, 26 September 2016).↩︎
Stanley Milgram, “Behavioral Study of Obedience,” Journal of Abnormal and Social Psychology 67.4 (1963): 371–378; and Obedience to Authority: An Experimental View (1974).↩︎
For the archival re-examination, see Gina Perry, Behind the Shock Machine: The Untold Story of the Notorious Milgram Psychology Experiments (2013), which draws on the Milgram papers at Yale and documents participant coercion, the experimenter’s scripted departures from protocol, and the less-than-universal acceptance of the cover story.↩︎
Hannah Arendt, Eichmann in Jerusalem: A Report on the Banality of Evil (1963), especially the Epilogue.↩︎
Philip G. Zimbardo, The Lucifer Effect: Understanding How Good People Turn Evil (2007), Chapters 2–9, gives Zimbardo’s own retrospective account.↩︎
Craig Haney, Curtis Banks, and Philip Zimbardo, “A Study of Prisoners and Guards in a Simulated Prison,” Naval Research Reviews 9 (1973): 1–17 — the first scholarly write-up of the August 1971 study, funded by the Office of Naval Research; reprinted as “Interpersonal Dynamics in a Simulated Prison,” International Journal of Criminology and Penology 1 (1973): 69–97.↩︎
Thibault Le Texier, Histoire d’un mensonge: Enquête sur l’expérience de Stanford (2018), based on the Stanford Prison Experiment archive at Stanford University Libraries; the English-language summary of the same archival work is Thibault Le Texier, “Debunking the Stanford Prison Experiment,” American Psychologist 74.7 (2019): 823–839. Le Texier documents that Zimbardo and his collaborators briefed and supervised the guards toward the experiment’s intended findings (rather than letting role-occupancy alone produce the cruelty), that the protocol and conclusions were partly written in advance, and that several of the most-cited “breakdowns” among prisoners involved participants performing the distress they took to be expected. Zimbardo has contested the framing while not, in the main, contesting the archival record.↩︎
Open Science Collaboration, “Estimating the Reproducibility of Psychological Science,” Science 349.6251 (2015): aac4716.↩︎
Joshua D. Angrist and Jörn-Steffen Pischke, Mostly Harmless Econometrics: An Empiricist’s Companion (2009), Chapter 1 (“Questions about Questions”) for the four research FAQs and the role of the “ideal experiment”; Chapter 4 (“Instrumental Variables in Action”) for the Angrist–Krueger 1991 quarter-of-birth study. Their formulation: “We hope to find natural or quasi-experiments that mimic a randomized trial by changing the variable of interest while other factors are kept balanced.”↩︎
Jerry M. Burger, “Replicating Milgram: Would People Still Obey Today?” American Psychologist 64.1 (2009): 1–11. Burger reproduced a partial Milgram protocol under modern ethics constraints, stopping participants at the 150 V switch (the point at which the learner first protests). 70% of his base-condition participants had to be stopped as they prepared to continue past 150 V — not significantly different from Milgram’s 82.5% pass-through at the same point. The 70% figure cannot be compared directly to Milgram’s “65% to 450 V” headline, since Burger could not test the higher voltages.↩︎
Haslam and Reicher’s “engaged followership” reading argues the Milgram data are better explained by participants’ active identification with the experimenter’s scientific project than by passive obedience: the 2014 experimental analogue (Haslam, Reicher, and Birney, Journal of Social Issues 70.3) shows that prods framed as orders (“you have no other choice, you must go on”) produced less compliance than prods framed as appeals to the value of the experiment. See also Haslam and Reicher, PLoS Biology (2012).↩︎
Philip E. Tetlock, Expert Political Judgment: How Good Is It? How Can We Know? (Princeton: Princeton University Press, 2005), Chapters 2–3 for the design and aggregate-accuracy results, Chapter 4 for the fox/hedgehog distinction, Chapter 5 for the inverse relationship between media prominence and forecasting calibration.↩︎
Philip E. Tetlock and Dan Gardner, Superforecasting: The Art and Science of Prediction (New York: Crown, 2015), Chapters 1–4 for the IARPA Aggregative Contingent Estimation tournament results, the design of the Good Judgment Project (with Barbara Mellers and Don Moore), and the comparison with intelligence-community analysts.↩︎
Jean-Philippe Bouchaud, “Economics Needs a Scientific Revolution,” Nature 455 (2008): 1181.↩︎
Thomas Robert Malthus, An Essay on the Principle of Population (1798), Chapters I–II for the geometric/arithmetic argument.↩︎
Peter Turchin, Ages of Discord: A Structural-Demographic Analysis of American History (2016), Chapters 2–3.↩︎
Peter Turchin, “Political Instability May Be a Contributor in the Coming Decade,” Nature 463 (4 February 2010): 608.↩︎
Karl Popper, The Poverty of Historicism (1957), especially the Introduction and Chapters III–IV.↩︎
Peter Turchin and Sergey A. Nefedov, Secular Cycles (2009), especially Chapter 1 (“The Theory”) for the elite-overproduction mechanism.↩︎
Margaret Mead, Coming of Age in Samoa (1928), particularly Chapters V–VIII on adolescent life.↩︎
Derek Freeman, Margaret Mead and Samoa: The Making and Unmaking of an Anthropological Myth (1983), Part II on the refutation.↩︎
Paul Shankman, The Trashing of Margaret Mead: Anatomy of an Anthropological Controversy (Madison: University of Wisconsin Press, 2009), Chapters 5–8, documents what Shankman calls Freeman’s “hoaxing” thesis — the late claim that Mead’s main informants were lying to her about Samoan sexuality as a joke — and shows that Freeman’s own archival evidence does not support it. Shankman is sympathetic to the view that Mead overstated the cultural-determinism case but rejects the strong refutation Freeman built on top of it.↩︎
Joseph Henrich, Steven J. Heine, and Ara Norenzayan, “The Weirdest People in the World?” Behavioral and Brain Sciences 33.2–3 (2010): 61–83 — the original meta-review and 96% / 12% statistic. Henrich’s book-length development of the argument, The WEIRDest People in the World: How the West Became Psychologically Peculiar and Particularly Prosperous (New York: Farrar, Straus and Giroux, 2020), traces the proposed mechanism (Western Church marriage prohibitions disrupting kin-based institutions) at chapter length and supersedes the 2010 paper as the standard reference for the broader argument.↩︎
Paul Ekman, “Universals and Cultural Differences in Facial Expressions of Emotion,” in Nebraska Symposium on Motivation, ed. J. Cole (1971), 207–283.↩︎
Lisa Feldman Barrett, How Emotions Are Made: The Secret Life of the Brain (2017), Chapters 1–3 for the theory of constructed emotion.↩︎
Webb Keane, Ethical Life: Its Natural and Social Histories (Princeton: Princeton University Press, 2016), Introduction and Chapter 1, develops a position on cross-cultural ethical comparison that is neither pure relativism nor unitary universalism. Marshall Sahlins’s late essays — including “What Kinship Is — And Is Not” (Chicago: University of Chicago Press, 2013) — defend a similar middle ground. For the methodological self-critique of “the native point of view,” see Clifford Geertz, “‘From the Native’s Point of View’: On the Nature of Anthropological Understanding,” Bulletin of the American Academy of Arts and Sciences 28.1 (1974): 26–45, and the Cultural Anthropology journal’s running debates in the 2000s and 2010s.↩︎
Marshall H. Segall, Donald T. Campbell, and Melville J. Herskovits, The Influence of Culture on Visual Perception (1966), Chapter 3 on the Müller-Lyer cross-cultural data.↩︎
Aristotle, Politics I.2, 1253a2, trans. Ernest Barker, rev. R. F. Stalley (Oxford World’s Classics, 1995). The verbatim formulation: “From these considerations it is evident that the city belongs to the class of things that exist by nature, and that man is by nature a political animal.” Followed at 1253a7 by: “It is thus clear that man is a political animal, in a higher degree than bees or other gregarious animals.” The Greek phrase ho anthrōpos phusei politikon zōon appears in Stalley’s editorial note to this passage.↩︎
The administrative compulsory measures (fürsorgerische Zwangsmassnahmen) under which Verdingkinder were placed include: Verdingkinderwesen proper (children placed with farm families or smaller employers in exchange for labour); Heimplatzierung (placement in homes/institutions); administrative Versorgung (administrative detention of “wayward” adolescents in correctional institutions, often for behaviour that did not constitute a criminal offence). Standard institutional reference: Bundesarchiv, Verdingung — Anstaltsversorgung — Adoption: Aspekte fürsorgerischer Zwangsmassnahmen vor 1981 (Bern: BBL, 2015). Comprehensive scholarly survey: Markus Furrer et al., eds., Fürsorge und Zwang: Fremdplatzierung von Kindern und Jugendlichen in der Schweiz 1850–1980 (Basel: Schwabe, 2014). [VERIFY]↩︎
Gisela Hauss, Thomas Gabriel, and Martin Lengwiler, eds., Fremdplatziert: Heimerziehung in der Schweiz, 1940–1990 (Zürich: Chronos, 2018), establishes the methodological basis for current estimates. The Verdingkinder historical population is conservatively estimated at “in the hundreds of thousands” across the period 1800–1980; the Bundesarchiv holds case records on approximately 60,000 individuals, mostly post-1900. The variance in estimates reflects the heterogeneity of the placement mechanisms and the loss of cantonal records. [VERIFY]↩︎
Federal Council statement of 11 April 2013 (delivered by Justice Minister Simonetta Sommaruga at the Kulturzentrum Paul Klee, Bern), formally apologising to the Verdingkinder and to victims of administrative compulsory measures. The 2014 federal Studienkommission under Hans-Ulrich Schiedt produced a definitive overview report. The Bundesgesetz über die Aufarbeitung der fürsorgerischen Zwangsmassnahmen (AFZFG, BBl 2016 7459) was passed by the Federal Assembly on 30 September 2016 and provides for individual Solidaritätsbeiträge of up to CHF 25,000 from a fund of CHF 300 million. Over 9,000 surviving Verdingkinder applied; final-state evaluation in BFS / SECO report 2023. [VERIFY]↩︎
Marco Leuenberger, Heimkinder im Kanton Bern, 1950–2000 (PhD dissertation, University of Bern, 2014); Thomas Huonker, Diagnose: «moralisch defekt». Kastration, Sterilisation und Rassenhygiene im Dienste des Schweizer Sozialstaates und zum Nutzen von Wissenschaft und Industrie 1890–1970 (Zürich: Orell Füssli, 2003); Paul Hugger, Kinder zur Pflege gegeben: Das Schicksal der Verdingkinder als Aspekt schweizerischer Sozialgeschichte (Zürich: Limmat, 1998). For a synthesis of the long-term-outcome research see Andreas Maercker, ed., Trauma und Posttraumatische Belastungsstörung (4th ed., Berlin: Springer, 2018), Chapter 22 (specifically on Verdingkinder and the Swiss compulsory-measures cohort). [VERIFY]↩︎
Charles A. Nelson III, Charles H. Zeanah, Nathan A. Fox, Peter J. Marshall, Anna T. Smyke, and Donald Guthrie, “Cognitive Recovery in Socially Deprived Young Children: The Bucharest Early Intervention Project,” Science 318.5858 (21 December 2007): 1937–1940. The randomised controlled trial enrolled 136 children aged 6–31 months from six institutions in Bucharest. The protocol and ethical review are detailed in Charles A. Nelson III, Nathan A. Fox, and Charles H. Zeanah, Romania’s Abandoned Children: Deprivation, Brain Development, and the Struggle for Recovery (Cambridge, MA: Harvard University Press, 2014), Appendix A. [VERIFY]↩︎
Nelson et al., “Cognitive Recovery in Socially Deprived Young Children” (2007). The 30-, 42-, and 54-month assessments showed substantial IQ gains for the foster-care group relative to continued-institutional, with the largest effects for children placed before approximately 24 months. EEG data published in Marshall et al., “A Comparison of the Electroencephalogram Between Institutionalized and Community Children in Romania,” Journal of Cognitive Neuroscience 16.8 (2004): 1327–1338, and follow-up papers. [VERIFY]↩︎
Sixteen-year follow-ups: Charles H. Zeanah, Nathan A. Fox, and Charles A. Nelson, “The Bucharest Early Intervention Project: Case Study in the Ethics of Mental Health Research,” Journal of Nervous and Mental Disease 200.3 (2012): 243–247 (ethical review of the trial design); Florin Tibu, Kathryn L. Humphreys, Charles H. Zeanah, et al., “Psychopathology in Young Children Reared in Severely Depriving Conditions: The Bucharest Early Intervention Project,” Journal of the American Academy of Child & Adolescent Psychiatry 55.10 (2016); subsequent ERP/EEG follow-ups in Vanderwert et al., 2010 and Debnath et al., 2020 (16-year follow-up). [VERIFY]↩︎
Susan Curtiss, Genie: A Psycholinguistic Study of a Modern-Day “Wild Child” (New York: Academic Press, 1977), §11.1 (“The Critical Period”) on Lenneberg’s hypothesis and Genie’s residual grammatical deficits despite vocabulary acquisition. For the underlying framework, see Eric H. Lenneberg, Biological Foundations of Language (1967) and Noam Chomsky, Aspects of the Theory of Syntax (1965).↩︎
Thomas Hobbes, Leviathan (1651), Part I, Chapter XIII (“Of the Natural Condition of Mankind concerning Their Felicity, and Misery”), p. 78 of the original-spelling text: “the life of man, solitary, poore, nasty, brutish, and short.”↩︎
John Locke, Two Treatises of Government (1689), especially the Second Treatise, Chapters II (“Of the State of Nature”), V (“Of Property”), and VIII–IX (on the origin and ends of political society).↩︎
Jean-Jacques Rousseau, Discourse on the Origin and Foundation of Inequality Among Men (1755), Part I on the “state of nature.”↩︎
The line “Man is born free, and everywhere he is in chains” opens Jean-Jacques Rousseau, The Social Contract (1762), Book I, Chapter 1 — not the Discourse on Inequality. It is commonly misattributed to the earlier work.↩︎
Steven Pinker, The Blank Slate: The Modern Denial of Human Nature (2002), especially Part I (Chapters 1–3) on the three “doctrines.”↩︎
Nassim Nicholas Taleb, “The ‘Long Peace’ is a Statistical Illusion” (working paper, 2012); developed formally in Pasquale Cirillo and Nassim Nicholas Taleb, “On the Statistical Properties and Tail Risk of Violent Conflicts,” Physica A: Statistical Mechanics and its Applications 452 (2016): 29–45. Steven Pinker’s reply, “Fooled by Belligerence,” is archived at https://stevenpinker.com/files/pinker/files/comments_on_taleb_by_s_pinker_1.pdf (accessed 26 April 2026).↩︎
John Gray, “Steven Pinker is Wrong about Violence and War,” The Guardian, 13 March 2015, https://www.theguardian.com/books/2015/mar/13/john-gray-steven-pinker-wrong-violence-war-declining (accessed 26 April 2026). Edward S. Herman and David Peterson’s “Reality Denial: Steven Pinker’s Apologetics for Western-Imperial Violence,” ZNet (2012), develops the contestable-coding objection.↩︎
Steven Pinker, The Better Angels of Our Nature: Why Violence Has Declined (2011), Chapters 2–7 for the historical data.↩︎
Jean-Paul Sartre, Existentialism is a Humanism (lecture 1945, published 1946).↩︎
Jean-Paul Sartre, Being and Nothingness (1943), Part IV, Chapter 1; the phrase “l’homme est condamné à être libre” is also developed in Existentialism is a Humanism (1946).↩︎
American Psychiatric Association, Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (Arlington, VA: American Psychiatric Publishing, 18 May 2013). [VERIFY]↩︎
DSM-5 Workgroup on Sexual and Gender Identity Disorders, chaired by Kenneth J. Zucker; Jack Drescher served as the APA officer concerned with depathologisation. Key working-group papers: Kenneth J. Zucker et al., “Memo Outlining Evidence for Change for Gender Identity Disorder in the DSM-5” (October 2011); Jack Drescher, “Queer Diagnoses: Parallels and Contrasts in the History of Homosexuality, Gender Variance, and the Diagnostic and Statistical Manual,” Archives of Sexual Behavior 39 (2010): 427–460; American Psychiatric Association, “Gender Dysphoria” fact sheet, 2013. [VERIFY]↩︎
World Health Organization, International Classification of Diseases, Eleventh Revision (ICD-11), formal endorsement at the 72nd World Health Assembly on 25 May 2019, in force from 1 January 2022. The relocation of Gender Incongruence from Chapter 5 (Mental and Behavioural Disorders) to a new Chapter 17 (Conditions Related to Sexual Health) was decided by the Working Group on the Classification of Sexual Disorders and Sexual Health, chaired by Geoffrey M. Reed; rationale set out in Geoffrey M. Reed et al., “Disorders Related to Sexuality and Gender Identity in the ICD-11: Revising the ICD-10 Classification Based on Current Scientific Evidence, Best Clinical Practices, and Human Rights Considerations,” World Psychiatry 15.3 (2016): 205–221. [VERIFY]↩︎
Evelyn Hooker, “The Adjustment of the Male Overt Homosexual,” Journal of Projective Techniques 21.1 (1957): 18–31. Hooker’s design — matched samples, blind clinician ratings on Rorschach, TAT, and MAPS — is widely credited as the first methodologically clean test of the diagnostic claim and is treated by Drescher (cited above) as the empirical anchor of the later declassification.↩︎
For the procedural and political history the standing source is Ronald Bayer, Homosexuality and American Psychiatry: The Politics of Diagnosis (Princeton: Princeton University Press, rev. ed. 1987), Chapters 4–5 (Kameny’s 1970–71 APA convention disruptions, Spitzer’s drafting of the DSM-II reformulation, and the 15 December 1973 Board of Trustees vote — Bayer’s Chapter 5 is the canonical narrative of the internal psychiatric politics, drawing on the Board minutes themselves). For the 1974 membership-referendum tally (5,854 yes / 3,810 no, ~58% in favour of retaining the Board’s removal decision), see Jack Drescher, “Out of DSM: Depathologizing Homosexuality,” Behavioral Sciences 5.4 (2015): 565–575.↩︎
Michel Foucault, Discipline and Punish: The Birth of the Prison (1975), Part III (“Discipline”) for the analytic of power/knowledge.↩︎
Michel Foucault, The History of Sexuality, Vol. 1 (1976), Part IV (“The Deployment of Sexuality”).↩︎
Ian Hacking, The Social Construction of What? (Cambridge, MA: Harvard University Press, 1999), Chapters 4–5; the “interactive kinds” / “indifferent kinds” distinction is developed in “The Looping Effects of Human Kinds,” in Dan Sperber, David Premack, and Ann James Premack, eds., Causal Cognition (Oxford: Clarendon Press, 1995), 351–394. Hacking’s worked examples include child abuse, multiple personality, and autism.↩︎
Nancy Fraser, “Foucault on Modern Power: Empirical Insights and Normative Confusions” (1981; in Unruly Practices, Minnesota, 1989). The charge: Foucault’s framework lacks the resources to ground the normative judgements his own historical work depends on.↩︎
Henry H. Goddard, “Mental Tests and the Immigrant,” Journal of Delinquency 2 (1917): 243–277. For critical historical context, see Stephen Jay Gould, The Mismeasure of Man (rev. ed. 1996), Chapter 5.↩︎
Stephen Jay Gould, The Mismeasure of Man, revised edition (1996), especially Chapters 5–7. Gould’s own craniometric re-analyses have been contested by Jason E. Lewis et al., “The Mismeasure of Science: Stephen Jay Gould versus Samuel George Morton on Skulls and Bias,” PLoS Biology 9.6 (2011).↩︎
Talal Asad, ed., Anthropology and the Colonial Encounter (Ithaca Press, 1973), is more careful than the “anthropology = colonial science” slogan: it documents specific institutional dependencies (Rhodes-Livingstone, the Colonial Social Science Research Council) without claiming the discipline’s content is reducible to its funding. Asad’s Genealogies of Religion (Johns Hopkins, 1993) extends the same critical method to the anthropology of religion. [VERIFY: 1973 volume not held; substituted via Asad’s Genealogies (1993, id 1457)]↩︎
George W. Stocking Jr., Race, Culture, and Evolution: Essays in the History of Anthropology (New York: Free Press, 1968), Chapters 8–9 on Boas’s anti-racialist programme; Edmund Leach’s introduction to the second edition of E. E. Evans-Pritchard, Witchcraft, Oracles, and Magic Among the Azande (Oxford: Clarendon Press, 1976), notes Evans-Pritchard’s strained relations with the Anglo-Egyptian Sudan administration.↩︎
Adam Kuper, The Reinvention of Primitive Society: Transformations of a Myth (London: Routledge, 2005), especially Chapters 6–8. Kuper, himself sharply critical of the colonial tradition in anthropology, argues that some recent “indigenous knowledge” frameworks reproduce the essentialism they were meant to overturn.↩︎
Aidan Southall, “The Illusion of Tribe,” Journal of Asian and African Studies 5.1–2 (1970): 28–50, is the canonical paper on the colonial production of “tribe.” For the longer history, see Peter Ekeh, “Social Anthropology and Two Contrasting Uses of Tribalism in Africa,” Comparative Studies in Society and History 32.4 (1990): 660–700.↩︎
For the casualty figures, the corralito mechanism, and the December 2001 default, see Paul Blustein, And the Money Kept Rolling In (and Out): Wall Street, the IMF, and the Bankrupting of Argentina (New York: PublicAffairs, 2005), Chapters 9–11. The IMF’s own retrospective is the Independent Evaluation Office report, The IMF and Argentina, 1991–2001 (Washington, DC: IMF, 2004).↩︎
Joseph E. Stiglitz, Globalization and Its Discontents (New York: W. W. Norton, 2002), especially Chapter 3 (“Freedom to Choose?”) on Washington Consensus conditionality and Chapter 8 (“The IMF’s Other Agenda”) on institutional incentives. Stiglitz’s revised account in Globalization and Its Discontents Revisited (2017) extends the argument to Argentina specifically.↩︎
Max Weber, “The Meaning of ‘Ethical Neutrality’ in Sociology and Economics” (1917), in The Methodology of the Social Sciences, trans. Edward A. Shils and Henry A. Finch (Glencoe, IL: Free Press, 1949), pp. 1–47. Weber’s defence of Wertfreiheit — the separation of empirical analysis from practical evaluation — and his admission that the choice of research questions is itself value-laden.↩︎
Gunnar Myrdal, The Political Element in the Development of Economic Theory (1930; English trans. 1953), especially Chapters I and VIII.↩︎
The phrase “Gross National Happiness is more important than Gross National Product” is attributed to King Jigme Singye Wangchuck in interviews with the Financial Times (1972) shortly after his accession to the throne. For the formal nine-domain index developed by Bhutan’s Centre for Bhutan Studies, see Karma Ura et al., An Extensive Analysis of GNH Index (Thimphu: Centre for Bhutan Studies, 2012), Chapters 2–3.↩︎
Amartya Sen, Inequality Reexamined (Oxford: Clarendon Press, 1992), Chapters 3–5 for the functioning/capability distinction; expanded in Sen, Development as Freedom (1999), Chapters 1–4. Martha Nussbaum, Women and Human Development: The Capabilities Approach (Cambridge: Cambridge University Press, 2000), Chapter 1, develops Sen’s framework into a list of ten “central human capabilities” with explicit normative content.↩︎