Forecasters should be tested by the Brier score and not just by the calibration score, which can always be made arbitrarily small. The Brier score is the sum of the calibration score and the refinement score; the latter measures how good the sorting into bins with the same forecast is, and thus attests to expertise.  This raises the question of whether one can gain calibration without losing expertise, which we refer to as calibeating.  We provide an easy way to calibeat any forecast, by a deterministic online procedure. We moreover show that calibeating can be achieved by a stochastic procedure that is itself calibrated, and then extend the results to simultaneously calibeating multiple procedures, and to deterministic procedures that are continuously calibrated.
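The Brier decomposition described above can be sketched numerically for binary outcomes (a minimal illustration; the binning-by-forecast-value and all names below are my own, not the paper's):

```python
from collections import defaultdict

def brier_decomposition(forecasts, outcomes):
    """Split the Brier score into calibration + refinement.

    Periods are binned by forecast value; within each bin, the mean of
    (forecast - outcome)^2 equals (forecast - bin mean)^2 plus the bin
    variance of the outcomes, so the two components sum to the Brier score.
    """
    T = len(forecasts)
    bins = defaultdict(list)
    for f, a in zip(forecasts, outcomes):
        bins[f].append(a)
    calibration = refinement = 0.0
    for f, acts in bins.items():
        w = len(acts) / T                      # fraction of periods in this bin
        mean = sum(acts) / len(acts)           # average realized action in the bin
        calibration += w * (f - mean) ** 2
        refinement += w * sum((a - mean) ** 2 for a in acts) / len(acts)
    return calibration, refinement

def brier(forecasts, outcomes):
    return sum((f - a) ** 2 for f, a in zip(forecasts, outcomes)) / len(forecasts)
```

A perfectly calibrated but uninformative forecaster (always predicting the base rate) drives the calibration term to zero while the refinement term stays large, which is why the abstract insists on scoring both.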
A formal write-up of the simple proof (1995) of the existence of calibrated forecasts by the minimax theorem, which moreover shows that N^3 periods suffice to guarantee a 1/N calibration error.
Abraham Neyman and Elon Kohlberg. 2021. “Demystifying the Math of the Coronavirus”.
We provide an elementary mathematical description of the spread of the coronavirus. We explain two fundamental relationships: how the rate of growth in new infections is determined by the effective reproductive number, and how the effective reproductive number is affected by social distancing. By making a key approximation, we are able to formulate these relationships very simply and thereby avoid complicated mathematics. The same approximation leads to an elementary method for estimating the effective reproductive number.
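The kind of approximation the abstract describes can be sketched as follows (a toy illustration, not the paper's exact method; the 4-day serial interval and the case counts are assumptions of mine):

```python
def estimate_R(daily_new_cases, serial_interval=4):
    """Toy estimate of the effective reproductive number R.

    Approximation: each new infection causes R further infections one
    serial interval later, so counts taken `serial_interval` days apart
    differ by a factor of R, and the daily growth factor is
    R ** (1 / serial_interval).
    """
    d = serial_interval
    return [daily_new_cases[t] / daily_new_cases[t - d]
            for t in range(d, len(daily_new_cases))]

# Cases doubling every 4 days, with a 4-day serial interval, give R near 2.
cases = [100 * 2 ** (t / 4) for t in range(12)]
estimates = estimate_R(cases, serial_interval=4)
```

The same relation read in reverse shows how social distancing acts: cutting R below 1 turns the geometric growth of new infections into geometric decline.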
Eyal Winter and Constantine Sorokin. 2021. “Pure Information Design in Classic Auctions”.
In many auction environments sellers are better informed about bidders' valuations than the bidders themselves. For such environments we derive a sharp and general optimal policy of information transmission in the case of independent private values. Under this policy, bidders whose (ex-post) valuation is below a certain threshold are provided with all the information (about their valuations), but those bidders whose valuation lies above the threshold receive no information whatsoever. Surprisingly, the threshold, expressed in percentiles, is independent of the probability distribution over bidders' ex-post valuations; it depends solely on the number of bidders. Similar results are also derived for the bidder-optimal policy. Our analysis builds on the approach of Bayesian persuasion and on the linearity of sellers' revenues as a function of the inverse distribution. This latter property allows us to use important results on stochastic comparisons.
In the framework of a first-price private-value auction, we consider the seller as a player in a game with the buyers in which he has private information about their realized values. We ask whether the seller can benefit by using his private information strategically. We find that, depending upon his information, set of signals, and commitment power, the seller may indeed increase his revenue by strategic transmission of his information. We study mainly the case of partial truthful commitment (VC), in which the seller can commit to send only truthful (verifiable) messages. We show that in the case of two buyers with values distributed independently and uniformly on [0,1], a seller informed of the private values of the buyers can achieve a revenue close to 1/2 by sending verifiable messages (compared to 1/3 in the standard auction), and that this is the largest revenue he can reach with any signaling strategy and any level of commitment. The case studied here provides valuable insight into the issue of the strategic use of information, which applies more generally.
A stumper is a riddle whose solution is typically so elusive that it does not come to mind, at least initially, leaving the responder stumped. Stumpers work by eliciting a (typically visual) representation of the narrative in which the solution is not to be found. In order to solve the stumper, the blocking representation must be changed, which does not happen for most respondents. I have collected all the riddles I know at this time that qualify, in my opinion, as stumpers. I have composed a few, and tested many. Whenever rates of correct solutions were available, they are included, giving a rough proxy for difficulty.
Robert J. Aumann. 2021. “Why Consciousness?”.
Emotions, especially desire and the objects of desire, like enjoyment and satisfaction, drive much of what we do; indeed, they drive all we do that is not recurrent. They are thus indispensable to human life. Inter alia, emotions enable the operation of incentives, like hunger for eating, that motivate us to perform tasks that are vital to our lives. We suggest that the adaptive function of consciousness is to enable emotions to operate.
A set of sensors is used to identify which of the users, from a pre-specified set of users, is currently using a device. Each sensor provides the name of a user and a real number representing its level of confidence in the assessment. However, the sensors measure different signals for different traits that are largely unrelated. To implement a policy based on these measurements, one needs to aggregate the information provided by all the sensors. We take an axiomatic approach to provide several reasonable trust functions. We show that by imposing a few desirable properties we can derive several solutions that are characterized by these properties. Our analysis makes use of an important result by Kolmogorov (1930).
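The Kolmogorov (1930) result referenced here characterizes quasi-arithmetic means: aggregators that transform each value, average, and invert. A minimal sketch (the sensor scores and the choice of transform are illustrative assumptions, not the paper's axioms or solutions):

```python
from math import exp, log

def quasi_arithmetic_mean(values, f, f_inv):
    """Aggregate as f_inv(mean(f(x))) — the family that Kolmogorov (1930)
    characterizes from a few natural axioms on means."""
    return f_inv(sum(f(v) for v in values) / len(values))

# Geometric mean of hypothetical per-sensor confidence scores,
# obtained as the quasi-arithmetic mean with f = log.
scores = [0.9, 0.4, 0.625]
aggregated = quasi_arithmetic_mean(scores, log, exp)
```

Choosing `f` as the identity recovers the ordinary arithmetic mean, so this one family spans many plausible trust functions.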
Maya Bar-Hillel and Daniel Kahneman. 2020. “Comment: Laplace and Cognitive Illusions”.
Reports in the 1970s of cognitive illusions in judgments of uncertainty had been anticipated by Laplace 150 years earlier. We discuss Miller and Gelman's remark that Laplace's anticipation of the main ideas of the heuristics and biases approach "gives us a new perspective on these ideas as more universal and less contingent on particular developments [that came much] later."
Eyal Winter and Alex Gershkov. 2020. “Exploitative Priority Service”.
We analyze the implications of introducing priority service for customers' welfare. In monopoly markets, introducing priority service decreases the customers' surplus despite increasing assignment efficiency: the monopolist extracts from customers a total payment higher than the total efficiency gain generated by the service, and hence leaves customers worse off compared with the situation where no priority is offered at all. In duopoly markets with homogeneous customers, the equilibrium price and customers' welfare coincide with the monopoly outcome where this monopolist faces half of the market. With heterogeneous customers as well, priority reduces aggregate consumer welfare. Our conclusion is that priority service erects barriers to competition that are embedded in the nature of the service provided, with the victims of these barriers being primarily agents with low willingness or low ability to pay for priority.
Maya Bar-Hillel and Yigal Attali. 2020. “The False Allure of Fast Lures”.
The Cognitive Reflection Test (CRT) allegedly measures the tendency to override the prepotent incorrect answers to some special problems, and to engage in further reflection. A growing literature suggests that the CRT is a powerful predictor of performance in a wide range of tasks. This research has mostly glossed over the fact that the CRT is composed of math problems. The purpose of this paper is to investigate whether numerical CRT items do indeed call upon more than is required by standard math problems, and whether the latter predict performance in other tasks as well as the CRT does. In Study 1 we selected from a bank of standard math problems items that, like CRT items, have a fast lure, as well as others which do not. A 1-factor model was the best supported measurement model for the underlying abilities required by all three item types. Moreover, the quality of all these items, CRT and math problems alike, as predictors of performance on a set of choice and reasoning tasks did not depend on whether or not they had a fast lure, but rather only on their quality as math items. In other words, CRT items seem not to be a special category of math problems, although they are quite excellent ones. Study 2 replicated these results with a different population and a different set of math problems.
Stafford (2018) found that female chess players outperform expectations when playing against men, in a study of data from over 5.5 million official games around the world. I examined whether that result could stem from not controlling for the ages of both players, as female players tend to be much younger than male players. Using the same data as Stafford, I was able to replicate his main result only when the opponent's age was ignored. When the ages of both players were included in the analysis, the gender-composition effect was reversed. Further analyses using other data demonstrated the robustness of this pattern, re-establishing that female chess players underperform when playing against men. Prior to Stafford's paper, the leading premise was that women encounter psychological obstacles that prevent them from performing at their normal capacity against men. My commentary continues that line of evidence and is consistent with the stereotype-threat explanation.
In the standard Bayesian framework the data are assumed to be generated by a distribution parametrized by θ in a parameter space Θ, over which a prior distribution is defined. A Bayesian statistician quantifies the belief that the true parameter is θ_0 in Θ by its posterior probability given the observed data. We investigate the behavior of the posterior belief in θ_0 when the data are generated under some parameter θ_1, which may or may not be the same as θ_0. Starting from stochastic orders, specifically, likelihood ratio dominance, that obtain for the resulting distributions of posteriors, we consider monotonicity properties of the posterior probabilities as a function of the sample size when data arrive sequentially. While the θ_0-posterior is monotonically increasing (i.e., it is a submartingale) when the data are generated under that same θ_0, it need not be monotonically decreasing in general, not even in terms of its overall expectation, when the data are generated under a different θ_1; in fact, it may keep going up and down many times. In the framework of simple iid coin tosses, we show that under certain conditions the overall expected posterior of θ_0 eventually becomes monotonically decreasing when the data are generated under θ_1 ≠ θ_0. Moreover, we prove that when the prior is uniform this expected posterior is a log-concave function of the sample size, by developing an inequality that is related to Turán's inequality for Legendre polynomials.
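For the iid coin-toss setting, the submartingale behavior of the θ_0-posterior can be checked numerically with a two-point prior on {θ_0, θ_1} (a simplification of my own; with this prior the expected posterior happens to move monotonically in both directions, whereas richer priors can produce the up-and-down behavior the abstract describes; parameter values below are arbitrary):

```python
from math import comb

def expected_posterior(theta0, theta1, gen, n):
    """Expected posterior weight on theta0 after n iid tosses with
    P(heads) = gen, under a uniform two-point prior on {theta0, theta1}."""
    total = 0.0
    for k in range(n + 1):                      # k = number of heads
        l0 = theta0 ** k * (1 - theta0) ** (n - k)
        l1 = theta1 ** k * (1 - theta1) ** (n - k)
        posterior = l0 / (l0 + l1)              # Bayes with equal prior weights
        total += comb(n, k) * gen ** k * (1 - gen) ** (n - k) * posterior
    return total

# Under theta0 itself, the expected theta0-posterior rises with sample size.
rising = [expected_posterior(0.6, 0.3, 0.6, n) for n in range(8)]
```

With data generated under θ_1 instead (`gen = 0.3` above), the same sequence falls, illustrating the eventual-decrease regime.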
The base-rate fallacy is people's tendency to ignore base rates in favor of, e.g., individuating information (when such is available), rather than integrate the two. This tendency has important implications for understanding judgment phenomena in many clinical, legal, and social-psychological settings. An explanation of this phenomenon is offered, according to which people order information by its perceived degree of relevance, and let high-relevance information dominate low-relevance information. Information is deemed more relevant when it relates more specifically to a judged target case. Specificity is achieved either by providing information on a smaller set than the overall population, of which the target case is a member, or when information can be coded, via causality, as information about the specific members of a given population. The base-rate fallacy is thus the result of pitting what seem to be merely coincidental, therefore low-relevance, base rates against more specific, or causal, information. A series of probabilistic inference problems is presented in which relevance was manipulated with the means described above, and the empirical results confirm the above account. In particular, base rates will be combined with other information when the two kinds of information are perceived as being equally relevant to the judged case.
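The normative benchmark against which the fallacy is measured is just Bayes' rule; a minimal sketch using the well-known cab-problem numbers (an illustrative example in the spirit of this literature, not necessarily the paper's stimuli):

```python
def bayes_posterior(base_rate, hit_rate, false_alarm_rate):
    """P(hypothesis | evidence) when the base rate and the individuating
    evidence are integrated, rather than one dominating the other."""
    p_evidence = hit_rate * base_rate + false_alarm_rate * (1 - base_rate)
    return hit_rate * base_rate / p_evidence

# 15% of cabs are Blue; a witness who is 80% reliable says "Blue".
# Integration gives about 0.41 — not the 0.80 that base-rate neglect suggests.
p_blue = bayes_posterior(0.15, 0.80, 0.20)
```

The gap between 0.41 and 0.80 is exactly what goes missing when the "merely coincidental" base rate is discarded in favor of the more specific witness report.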
Dean P. Foster and Sergiu Hart. 2019. “Forecast-Hedging and Calibration”.
Calibration means that for each forecast x the average of the realized actions in the periods in which the forecast was x is, in the long run, close to x. Calibration can always be guaranteed (Foster and Vohra 1998), but it requires the forecasting procedure to be stochastic. By contrast, smooth calibration, which combines in a continuous manner nearby forecasts, can be guaranteed by a deterministic procedure (Foster and Hart 2018). In the present paper we develop the concept of forecast-hedging, which consists of choosing the forecasts in such a way that, no matter what the realized action will be, the expected forecasting track record can only improve. This approach integrates the existing calibration results by obtaining them all from the same simple basic argument, and at the same time differentiates between them according to the forecast-hedging tools that are used: deterministic and fixed-point-based vs. stochastic and minimax-based. Additional benefits are new calibration procedures in the one-dimensional case that are simpler than all known such procedures, and a short proof for deterministic smooth calibration, in contrast to the complicated existing proof.
A multi-item questionnaire concerning lay people's attitudes toward organ procurement without consent from executed prisoners was given to several hundred respondents. The items ranged from all-out condemnation ("It is tantamount to murder") to enthusiasm ("It is great to have this organ supply"). Overall, we found two guiding principles upheld by most respondents: (1) Convicts have as much a right to their bodies and organs as other people, so the practice should be judged by the same standards as those that guide organ procurement from any donor. Procuring organs without consent is wrong. (2) Benefiting from those organs should be held to more lenient standards than are demanded for their procurement. So, benefiting from these ill-gotten organs should be tolerated.
Bar-Hillel, Noah and Frederick (2018) studied a class of riddles they called stumpers, which have simple, but curiously elusive, solutions. A canonical example is: "Andy is Bobbie's brother, but Bobbie is not Andy's brother. How come?" Though not discussed there, we found that the ability to solve stumpers correlates significantly with performance on items resembling the CRT (Cognitive Reflection Test) but not with performance on items from the CRAT (Compound Remote Associates Test). We report those results here.
We consider the basic setup of one seller, one buyer, and one good, where the seller is risk averse, and characterize the mechanism that maximizes the seller's expected utility. In contrast to the risk-neutral case, where a single deterministic price is optimal, we show that in the risk-averse case the optimal mechanism consists of a continuum of lotteries.
The Bayesian posterior probability of the true state is stochastically dominated by that same posterior under the probability law of the true state. This generalizes to notions of "optimism" about posterior probabilities.