Goodhart's law is an adage often stated as, "When a measure becomes a target, it ceases to be a good measure".[1] It is named after British economist Charles Goodhart, who is credited with expressing the core idea of the adage in a 1975 article on monetary policy in the United Kingdom:[2]
Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes.[3]
It was used to criticize the British Thatcher government for trying to conduct monetary policy on the basis of targets for broad and narrow money,[4] but the law reflects a much more general phenomenon.[5]
Priority and background
Numerous concepts are related to this idea, at least one of which predates Goodhart's statement.[6] Notably, Campbell's law likely has precedence, as Jeff Rodamar has argued, since various formulations date to 1969.[7] Other academics had similar insights at the time. Jerome Ravetz's 1971 book Scientific Knowledge and Its Social Problems[8] also predates Goodhart, though it does not formulate the same law. Ravetz discusses how systems in general can be gamed, focusing on cases where the goals of a task are complex, sophisticated, or subtle. In such cases, the people who possess the skills to execute the tasks properly pursue their own goals to the detriment of the assigned tasks. When the goals are instantiated as metrics, this can be seen as equivalent to Goodhart's and Campbell's claims.
Shortly after Goodhart's publication, others suggested closely related ideas, including the Lucas critique (1976). In economics, the law is also implicit in the idea of rational expectations, a theory which states that those who are aware of a system of rewards and punishments will optimize their actions within that system to achieve their desired results. For example, if an employee is rewarded according to the number of cars sold each month, they will try to sell more cars, even at a loss.
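The car-sales example can be made concrete with a small toy model. The sketch below is only an illustration: the cost, the list price, and the linear demand response are all hypothetical numbers chosen for the example, not figures from any source. It compares the price a salesperson would choose when rewarded on cars sold with the price that would maximize the dealership's profit.

```python
from dataclasses import dataclass

# All figures below are hypothetical, chosen only to illustrate the incentive.
UNIT_COST = 20_000      # assumed cost of one car to the dealership
LIST_PRICE = 30_000     # assumed undiscounted price
BASE_SALES = 10         # assumed monthly sales at list price
EXTRA_PER_1000 = 2      # assumed extra sales per 1,000 of discount


@dataclass(frozen=True)
class Outcome:
    price: int
    cars_sold: int
    profit: int


def simulate(price: int) -> Outcome:
    """Return monthly sales and profit at a given price under the toy demand model."""
    discount = LIST_PRICE - price
    cars_sold = BASE_SALES + EXTRA_PER_1000 * (discount // 1_000)
    profit = cars_sold * (price - UNIT_COST)
    return Outcome(price, cars_sold, profit)


# The salesperson may set any price from 16,000 up to list price, in steps of 1,000.
prices = range(16_000, LIST_PRICE + 1, 1_000)
outcomes = [simulate(p) for p in prices]

# Rewarded on the measure (cars sold) versus on the underlying goal (profit).
chasing_target = max(outcomes, key=lambda o: o.cars_sold)
chasing_profit = max(outcomes, key=lambda o: o.profit)

print("rewarded on cars sold:", chasing_target)
print("rewarded on profit:   ", chasing_profit)
```

In this toy setup the price that maximizes cars sold lies below cost, so the measure rises while the goal it was meant to track (profit) turns negative.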
While it originated in the context of market responses, the law has profound implications for the selection of high-level targets in organizations.[3] Jon Danielsson states the law as:
Any statistical relationship will break down when used for policy purposes.
He also suggested a corollary for use in financial risk modelling:
A risk model breaks down when used for regulatory purposes.[9]
Mario Biagioli related the concept to consequences of using citation impact measures to estimate the importance of scientific publications:[10][11]
All metrics of scientific evaluation are bound to be abused. Goodhart's law [...] states that when a feature of the economy is picked as an indicator of the economy, then it inexorably ceases to function as that indicator because people start to game it.
Generalization
Later writers extended Goodhart's point about monetary policy into a more general adage about measures and targets in accounting and evaluation systems. In a book chapter published in 1996, Keith Hoskin wrote:
'Goodhart's Law' – That every measure which becomes a target becomes a bad measure – is inexorably, if ruefully, becoming recognized as one of the overriding laws of our times. Ruefully, for this law of the unintended consequence seems so inescapable. But it does so, I suggest, because it is the inevitable corollary of that invention of modernity: accountability.[12]
In a 1997 paper responding to the work of Hoskin and others on financial accounting and grades in education, anthropologist Marilyn Strathern expressed Goodhart's law as "When a measure becomes a target, it ceases to be a good measure," and linked the sentiment to the history of accounting stretching back to Britain in the 1800s:
When a measure becomes a target, it ceases to be a good measure. The more a 2.1 examination performance becomes an expectation, the poorer it becomes as a discriminator of individual performances. Hoskin describes this as 'Goodhart's law', after the latter's observation on instruments for monetary control which led to other devices for monetary flexibility having to be invented. However, targets that seem measurable become enticing tools for improvement. The linking of improvement to commensurable increase produced practices of wide application. It was that conflation of 'is' and 'ought', alongside the techniques of quantifiable written assessments, which led in Hoskin's view to the modernist invention of accountability. This was articulated in Britain for the first time around 1800 as 'the awful idea of accountability' (Ref. 3, p. 268).[1]
Examples
- San Francisco Declaration on Research Assessment – 2012 manifesto against using the journal impact factor to assess a scientist's work. The statement denounces several problems in science; as Goodhart's law explains, one of them is that a measurement has become a target. The correlation between h-index and scientific awards has been decreasing since the h-index came into widespread use.[13]
- The International Union for Conservation of Nature's (IUCN) measure of extinction can be used to justify removing environmental protections, which has resulted in the IUCN becoming more conservative in labeling a species as extinct.[14][15]
- In healthcare, the misapplication of metrics can lead to adverse outcomes. For instance, hospitals striving to reduce Length of Stay (LOS) may inadvertently discharge patients prematurely, leading to increased emergency readmissions.[16]
- According to How to Read Numbers, the law applied to the British government's response to the COVID-19 pandemic when it announced a target of 100,000 COVID-19 tests per day, initially a target for tests actually carried out and later for maximum testing capacity. When the government announced it had met the target, the number of useful diagnostic tests was far lower than the number it reported.[17]
- In the HBO television show The Wire, Roland "Prezbo" Pryzbylewski draws a parallel between the public school policy of teaching to the test and the practice of improving crime statistics by reclassifying crimes, describing both as "juking the stats." He notes that both strategies fail to produce substantive improvements in learning and public safety, respectively, because they cater to the ineffective metrics of standardized test scores and reported crime rates.[18]
See also
- Campbell's law – Adage about perverse incentives – "The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures"
- Cobra effect – Incentive that has a contrary result – when incentives designed to solve a problem end up rewarding people for making it worse
- Confirmation bias – Bias confirming existing attitudes – the tendency to search for and recall information that confirms or supports one's prior beliefs
- Hawthorne effect – Social phenomenon by which being observed causes behavioral changes – when people modify an aspect of their behavior in response to their awareness of being observed
- Gaming the system – Concepts in the philosophy of law – manipulating rules and procedures to obtain a desired outcome
- Lucas critique – 1970s paradigm shift in economic thought, named for American economist Robert Lucas – it is naive to try to predict the effects of a change in economic policy entirely on the basis of relationships observed in historical data
- McNamara fallacy – Erroneous reasoning based solely on numeric metrics – involves making a decision based solely on quantitative observations (or metrics) and ignoring all others
- Metric fixation – Tendency for decision-makers to place excessively large emphases on selected metrics
- Model collapse – Degradation of AI models trained on synthetic data
- Overfitting – Flaw in mathematical modelling – an analysis that corresponds too closely or exactly to a particular set of data
- Peter principle – Management concept by Laurence J. Peter – individuals are promoted based on success in their previous roles, and not the role of the new position
- Reflexivity (social theory) – Circular relationships between cause and effect
- Reification (fallacy) – Fallacy of treating an abstraction as if it were a real thing
- Map-territory relations – Relationship between an object and a representation of that object – sometimes referred to via the quote "the map is not the territory". It is a type of reification fallacy, wherein a model of something is treated as if it were exactly like the actual thing being modeled. Goodhart's law addresses a subset of map-territory problems.
- Specification gaming – Artificial intelligence concept
- Surrogation – Psychological phenomenon; replacement of a construct with its measurement – in business, when a measure of a construct of interest evolves to replace that construct
- Uncertainty principle – Foundational principle in quantum physics – When an attempt is made to measure one aspect more precisely, that attempt influences the measurability of another aspect
References
1. Strathern, Marilyn (1997). "'Improving ratings': audit in the British University system". European Review. 5 (3). John Wiley & Sons: 305–321. doi:10.1002/(SICI)1234-981X(199707)5:3<305::AID-EURO184>3.0.CO;2-4. S2CID 145644958.
2. Goodhart, Charles (1975). "Problems of Monetary Management: The UK Experience". Papers in Monetary Economics. Vol. 1. Sydney: Reserve Bank of Australia. pp. 1–20.
3. Goodhart, Charles (1975). "Problems of Monetary Management: The UK Experience". In Courakis, Anthony S. (ed.). Inflation, Depression, and Economic Policy in the West. Totowa, New Jersey: Barnes and Noble Books (published 1981). p. 116. ISBN 0-389-20144-8.
4. Smith, David (1987). The Rise And Fall of Monetarism. London: Penguin Books. ISBN 9780140227543.
5. Manheim, David; Garrabrant, Scott (2018). "Categorizing Variants of Goodhart's Law". arXiv:1803.04585 [cs.AI].
6. Manheim, David (29 September 2016). "Overpowered Metrics Eat Underspecified Goals". ribbonfarm. Retrieved 26 January 2017.
7. Rodamar, Jeffery (28 November 2018). "There ought to be a law! Campbell versus Goodhart". Significance. 15 (6): 9. doi:10.1111/j.1740-9713.2018.01205.x.
8. Ravetz, Jerome R. (1971). Scientific knowledge and its social problems. New Brunswick, New Jersey: Transaction Publishers. pp. 295–296. ISBN 1-56000-851-2. OCLC 32779931.
9. Daníelsson, Jón (July 2002). "The Emperor has no Clothes: Limits to Risk Modelling". Journal of Banking & Finance. 26 (7): 1273–1296. CiteSeerX 10.1.1.27.3392. doi:10.1016/S0378-4266(02)00263-7.
10. Biagioli, Mario (12 July 2016). "Watch out for cheats in citation game" (PDF). Nature. 535 (7611): 201. Bibcode:2016Natur.535..201B. doi:10.1038/535201a. PMID 27411599.
11. Varela, Diego; Benedetto, Giacomo; Sanchez-Santos, Jose Manuel (30 December 2014). "Editorial statement: Lessons from Goodhart's law for the management of the journal". European Journal of Government and Economics. 3 (2): 100–103. doi:10.17979/ejge.2014.3.2.4299. hdl:2183/23376. S2CID 152551763. Retrieved 8 February 2022.
12. Hoskin, Keith (1996). The 'awful idea of accountability': inscribing people into the measurement of objects.
13. Koltun, V; Hafner, D (2021). "The h-index is no longer an effective correlate of scientific reputation". PLOS ONE. 16 (6): e0253397. arXiv:2102.03234. Bibcode:2021PLoSO..1653397K. doi:10.1371/journal.pone.0253397. PMC 8238192. PMID 34181681. "Our results suggest that the use of the h-index in ranking scientists should be reconsidered, and that fractional allocation measures such as h-frac provide more robust alternatives."
14. Mooers, Arne (23 May 2022). "When is a species really extinct?". The Conversation. Retrieved 23 June 2023.
15. Martin, T. E.; Bennett, G. C.; Fairbairn, A.; Mooers, A. O. (March 2023). "'Lost' taxa and their conservation implications". Animal Conservation. 26 (1): 14–24. Bibcode:2023AnCon..26...14M. doi:10.1111/acv.12788. ISSN 1367-9430. S2CID 248846699.
16. Babar, Sultan M. (8 November 2023). "The Cobra Effect in Healthcare: Goodhart's Law and the Pitfalls of Misguided Metrics". sultan.babar.me. Retrieved 25 September 2024.
17. Chivers, Tom; Chivers, David (2021). "22: Goodhart's Law". How to Read Numbers. Weidenfeld & Nicolson. ISBN 9781474619974.
18. Revanka, Roshan (20 January 2016). "Juking the Stats". medium.com. Retrieved 14 August 2024.
Further reading
- Chrystal, K. Alec; Mizen, Paul D. (12 November 2001). "Goodhart's Law: Its Origins, Meaning and Implications for Monetary Policy" (PDF). Retrieved 3 July 2020.
- Manheim, David (9 June 2016). "Goodhart's Law and Why Measurement is Hard". ribbonfarm. Retrieved 3 July 2020.
- Malone, Kenny; Gonzalez, Sarah; Horowitz-Ghazi, Alexi; Goldmark, Alex (21 November 2018). "The Laws Of The Office". Planet Money (Podcast). NPR. Retrieved 3 July 2020.
- Muller, Jerry Z. (2018). The Tyranny of Metrics. Princeton University Press. ISBN 978-0-691-19126-3.