Overall methodology

Our core scientific breakthrough is the ability to probabilistically predict the future improvement rate of any technology, building on 20 years of research at MIT. Our technological forecasting methodology rests on four key insights. First, we use manually collected, reliable estimates of the yearly rate of performance improvement for different technologies, based on specific metrics such as the energy stored per kilogram by the best batteries available each year or the energy stored per dollar spent for the best solar photovoltaics available each year. Second, we identify sets of patents that represent 1757 coherent technologies in the entire USPTO database, together covering 97.2% of US patents from 1976 to 2015. Third, we calculate the centrality of the patents belonging to each of these 1757 technologies in the overall US patent citation network, using an algorithm that our research group previously implemented for this purpose. Fourth, we feed the average patent centrality of each technology into a predictive model that was previously trained on real performance data. Thus, our algorithms provide, for the first time, predictions of the yearly improvement rate for nearly all definable technologies.


Long-term performance improvement rates are defined as the “… trend of non-dominated (i.e. record-breaker) performance data points for the overall technology domain (not for individual product generations, individual companies or components)”.

It is important that the metrics be constructed as a measure of technical benefit per technical cost: for a combustion engine, this could be W/L, W/kg or W/$. The rates reflect improvement in the envelope of technical performance for a technology, not average performance, since one dimension of performance can be traded for another. From the empirical data we have, most of these metrics tend to improve at similar rates for a single technology (Benson, 2014, page 208). This does not mean that a technology is equally good on all metrics: lithium-ion is certainly better on energy density and lead-acid better on cost. It just means that the different metrics for one technology improve at roughly the same rate.
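The notion of a record-breaking performance envelope can be made concrete. The sketch below (with made-up battery energy-density numbers) keeps only the non-dominated points of a noisy yearly series, which is the trend the improvement rates are fitted to:

```python
# Hypothetical yearly best-observed performance for one technology metric
# (e.g. Wh/kg for batteries); the years and values are illustrative only.
observations = [
    (2000, 150.0), (2001, 148.0), (2002, 165.0), (2003, 160.0),
    (2004, 172.0), (2005, 171.0), (2006, 190.0),
]

def record_breakers(points):
    """Keep only non-dominated points: each must beat every earlier record."""
    frontier = []
    best = float("-inf")
    for year, perf in sorted(points):
        if perf > best:
            frontier.append((year, perf))
            best = perf
    return frontier

print(record_breakers(observations))
# -> [(2000, 150.0), (2002, 165.0), (2004, 172.0), (2006, 190.0)]
```

Points that fail to beat the running record (e.g. 2001 and 2003 above) are dropped; they reflect products below the technical frontier, not the frontier itself.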

An understanding of improvement rates can strengthen investment strategy in a number of ways. Below is a list of potential impacts of the improvement rate (K) on investment strategy.

  1. Ks help analyze development time scales. Private-sector players are optimized to spot and develop short-horizon technologies. Organizations can therefore prioritize important technologies with longer time horizons, which private-sector players pass on, with more accurate return expectations than subject-matter experts or the popular narrative can provide.
  2. From a purely financial perspective, fast-moving technologies tend to have better return profiles and possibly shorter development cycles than slower technologies. Advanced research projects in those areas can provide faster payoffs and faster development of capabilities. These technologies are ideal for VCs, SMBs, corporates, and other players for which financial returns are important.
  3. Slow-moving technologies are good candidates for budget cuts, if need be, while limiting the impact on technical advancement in a given technology area. These are also areas in which to start looking for functional alternatives or substitutes if they are critical.

To sum up, depending on the criteria of interest for a given technology (such as size, weight, power, cost or availability), a forecast of the improvement rate can help guide the investment strategy to achieve specific targets and capabilities. Thus, we will have a higher probability of picking winners and losers among technologies that have not yet matured, helping optimize investment strategy and time horizon.

Magee et al. (2016) have built on prior work by Dosi and Arthur to define a Technological Domain (TD) as “The set of artifacts that fulfill a specific generic function utilizing a particular, recognizable body of knowledge.” Magee et al. (2016) further employed the concept of technological domains to obtain reliable empirical estimates of technology improvement rates for 30 domains over periods of decades. A consistent body of empirical evidence shows that performance improvements for individual technologies follow exponential trends over the long term, consistent with constant yearly rates of improvement (Koh and Magee 2006; Nagy et al. 2013; Farmer and Lafond 2016; Magee et al. 2016). Studies of large sets of such data (Farmer & Lafond, 2016; Magee et al., 2016) agree that a random walk around the exponential (a constant yearly percentage increase with noise) is the most appropriate description.
Short-term segments of these noisy exponentials can be described as S-curves, but these do not hold up in the long term, as is evident in the Farmer and Lafond (2016) analysis. While different technologies all improve exponentially, they do so at different rates (Koh and Magee 2006; 2008; Magee et al. 2016). However, the rate of improvement is constant (at least to a good approximation) over time within a domain (Farmer & Lafond, 2016) and for different productivity metrics within a domain (Magee et al., 2016), as shown in Figure 1.
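The constant-yearly-rate description implies that the improvement rate can be estimated by a linear fit on the logarithm of performance against time. A minimal sketch, using invented data points:

```python
import math

# Illustrative record-breaker series (the values are made up, not real data):
# performance roughly doubling every few years.
years = [2000, 2003, 2006, 2009, 2012]
perf = [10.0, 16.0, 25.0, 41.0, 64.0]

# Ordinary least squares on log(performance) vs. year:
# log P(t) = a + b*t, so the estimated yearly improvement rate is exp(b) - 1.
n = len(years)
x_mean = sum(years) / n
y = [math.log(p) for p in perf]
y_mean = sum(y) / n
b = sum((x - x_mean) * (yi - y_mean) for x, yi in zip(years, y)) \
    / sum((x - x_mean) ** 2 for x in years)
rate = math.exp(b) - 1
print(f"estimated yearly improvement rate: {rate:.1%}")
```

Fitting in log space is what makes "exponential with noise" equivalent to "straight line with noise", which is why a single percentage rate summarizes decades of data.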

Empirical improvement rate data
Source: Magee, C. L., Basnet, S., Funk, J. L., & Benson, C. L. (2016). Quantitative empirical trends in technical performance. Technological Forecasting and Social Change, 104, 237–246.

By far the most accurate and reliable indicator is a measure of the centrality of a technology’s patents in the overall U.S. patent citation network, as shown in Triulzi et al. (2020). More precisely, technologies whose patents cite very central patents tend to also have faster improvement rates, possibly as a result of enjoying more spillovers from advances in other technologies and/or because of a wider use of fast-improving technologies by other technologies, as proxied by patent citations. The measure of patent centrality used is a normalized version of the “Search Path Node Pair” (SPNP) index proposed by Hummon and Doreian (1989), operationalized in a fast algorithm by Batagelj (2003) for directed acyclic graphs, and popularized by, among others, Verspagen (2007) to identify the main paths of technological development in a patent citation network. The SPNP index is a measure of information centrality, conceptually similar to random-walk betweenness centrality (see Figure 2). It measures how often a given node appears on any path of any length connecting any given pair of nodes in the network. Central patents are therefore like information hubs in the citation network, representing inventions that are related technologically, by a path of improvements, to many other inventions that appeared before and after them.
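To illustrate the idea behind SPNP, the sketch below computes one common node-level formulation on a small, hypothetical citation DAG: the number of ancestor-descendant pairs a patent connects, counted as (ancestors + self) × (descendants + self). This is a simplification for intuition, not the exact algorithm of Batagelj (2003):

```python
# Toy patent citation DAG: an edge u -> v means patent u is cited by the
# later patent v (knowledge flows from u to v). Patent labels are made up.
children = {
    "A": ["C"], "B": ["C"],  # early patents A and B are cited by C
    "C": ["D", "E"],         # C is in turn cited by D and E
    "D": [], "E": [],
}

def reach(graph, start):
    """All nodes reachable from `start` (excluding `start` itself)."""
    seen, stack = set(), list(graph[start])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return seen

# Invert the DAG to walk citations backwards (towards ancestors).
parents = {v: [u for u in children if v in children[u]] for v in children}

def spnp(node):
    # (ancestors + self) * (descendants + self): how many node pairs are
    # connected by a citation path running through this patent.
    return (len(reach(parents, node)) + 1) * (len(reach(children, node)) + 1)

scores = {v: spnp(v) for v in children}
print(scores)  # patent "C" scores highest: it is the hub of this network
```

Patent C sits on every path from {A, B} to {D, E}, so it gets the highest score, matching the intuition that central patents are information hubs.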

Calculation of Centrality

The computation of centrality of cited patents in the overall patent citation network is illustrated above. Nodes are patents and arrows are citations. Patents depicted in blue are the most central in each filing year. These centralities are computed before the green and the red patents are filed.

Source: Triulzi, G., Alstott, J., & Magee, C. L. (2020). Estimating technology performance improvement rates by mining patent data. Technological Forecasting and Social Change, 158, 120100.

Triulzi et al. (2020) normalized the centrality index by randomizing the citation network under a set of constraints, such as the indegree and outdegree of each patent, the share of citations made by each patent that go to the focal patent's main technology field, and the age of the citing-cited pair for each citation (for more information, see Appendix B in Singh et al. (2020)). This makes centrality comparable for patents granted at different points in time and assigned to different technology fields, which in turn allows computing a comparable average centrality for patents across technology domains. The latter was shown to have a correlation of 0.8 with the log of the yearly improvement rate.
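The normalization can be illustrated as a z-score of a patent's observed centrality against its centrality in randomized versions of the network. The numbers below, and the Gaussian stand-in for the randomized distribution, are purely illustrative; in the actual procedure each randomized network must satisfy the constraints listed above:

```python
import random
import statistics

def normalized_centrality(observed, randomized_samples):
    """z-score of an observed centrality against the values the same patent
    takes in constrained randomizations of the citation network."""
    mu = statistics.mean(randomized_samples)
    sigma = statistics.stdev(randomized_samples)
    return (observed - mu) / sigma

# Hypothetical numbers: a patent's raw SPNP centrality is 120, while in 1000
# rewired networks (preserving its in/outdegree, field shares, citation ages)
# it averages about 100 with a standard deviation of about 10.
random.seed(0)
samples = [random.gauss(100, 10) for _ in range(1000)]
z = normalized_centrality(120, samples)
print(round(z, 2))  # roughly 2: more central than expected by chance
```

Because the score is expressed relative to what the constraints alone would produce, patents from different years and fields end up on a common scale.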


The patents corresponding to technological domains defined above can be reliably found using the classification overlap method (COM) described by Benson and Magee (2013; 2015). COM is an improvement over the traditional keyword search and the classification search and makes patent retrieval repeatable. Singh et al. (2020) obtained 1757 domains describing 97.2% of the US patents from 1976 to 2015, by extending, inverting and automating COM. 
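At its core, COM selects the patents that sit in the overlap of the domain-relevant US and international classification codes, which is what makes retrieval repeatable. A toy sketch with made-up patent IDs and class assignments:

```python
# Sketch of the classification-overlap idea: keep patents that appear in BOTH
# a relevant US class and a relevant international class for the domain.
# Patent IDs and class examples below are invented for illustration.
us_class_hits = {"p1", "p2", "p3", "p5"}   # e.g. patents in a USPC battery class
ipc_hits = {"p2", "p3", "p4", "p6"}        # e.g. patents in a related IPC class

domain_patents = us_class_hits & ipc_hits  # set intersection = the overlap
print(sorted(domain_patents))  # -> ['p2', 'p3']
```

Requiring membership in both classification systems filters out patents that match only incidentally in one system, unlike a plain keyword or single-classification search.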

Following the methodology from Triulzi et al. (2020) described above, Singh et al. (2020) further obtained estimated improvement rates for all 1757 domains. For each of the 5,083,263 utility patents granted by the USPTO between 1976 and 2015, the normalized centrality index was computed using the same citation-network randomization procedure presented in Triulzi et al. (2020). The average centrality of the patents in each of the 1757 identified technology domains was then computed and used to obtain the estimated yearly performance improvement rate.
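Conceptually, this final step maps a domain's average normalized centrality to a predicted rate through a relation fitted on the domains with measured improvement rates; since the correlation is with the log of the rate, the fit is log-linear. The coefficients below are hypothetical placeholders, not the published estimates:

```python
import math

# Hypothetical coefficients for log(rate) = A + B * mean_centrality; the
# real relation is fitted on domains with empirically measured rates.
A, B = -4.0, 1.2

def predicted_rate(mean_normalized_centrality):
    """Predicted yearly improvement rate from a domain's average normalized
    patent centrality (illustrative coefficients only)."""
    return math.exp(A + B * mean_normalized_centrality)

for c in (1.0, 2.0, 3.0):
    print(f"mean centrality {c}: predicted rate {predicted_rate(c):.1%}")
```

The exponential form means that modest differences in average centrality across domains translate into large differences in predicted improvement rates, which is why the measure separates fast- and slow-moving technologies so sharply.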