Crystal Ball for Interest Rates

Highlights

  • While having more data is helpful, large pretrained models make it practical to build viable systems using a very small labeled training set — perhaps just a handful of examples specific to your application. (View Highlight)
  • With self-supervised learning, pretraining can happen on unlabeled data. So, technically, the model did need a lot of data for training, but that was unlabeled, general text or image data. Then, even with only a small amount of labeled, task-specific data, you can get good performance. (View Highlight) (a minimal sketch of this recipe appears after the list below)
  • The 2010s were the decade of large supervised models; I think the 2020s are shaping up to be the decade of large pretrained models. However, there is one important caveat: this approach works well for unstructured data (text, vision, and audio) but not for structured data, and the majority of machine learning applications today are built on structured data. (View Highlight)
  • Large models pretrained on diverse unstructured data found on the web generalize to a variety of unstructured data tasks of the same input modality. This is because text/images/audio on the web have many similarities to whatever specific text/image/audio task you might want to solve. But structured data such as tabular data is much more heterogeneous. (View Highlight)
  • Why it matters: The EU is on the leading edge of regulating AI. As with many national-level efforts, Europe’s investigations into social media algorithms could reduce harms and promote social well-being well beyond the union’s borders. We’re thinking: This is a welcome step. Governments need to understand technology before they can craft thoughtful regulations to manage it. ECAT looks like a strong move in that direction. (View Highlight)
  • We’re thinking: Custom models built by teams outside the tech sector are gaining steam. Bloomberg itself — which makes most of its money providing financial data — trained a BLOOM-style model on its corpus and found that it performed financial tasks significantly better than a general-purpose model. (View Highlight)
  • TinyML shows promise for bringing deep learning to applications where electrical power is scarce, processing in the cloud is impractical, and/or data privacy is paramount. The trick is to get high-performance algorithms to run on hardware that offers limited computation, memory, and electrical power. (View Highlight)
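The first two highlights describe the pretrain-then-adapt recipe: a model pretrained with self-supervision on unlabeled data supplies general-purpose features, so only a handful of labeled, task-specific examples are needed to get a viable system. Below is a minimal sketch of that recipe, assuming the sentence-transformers and scikit-learn packages; the model name, the toy texts, and the labels are hypothetical placeholders, not taken from the article.

```python
# Minimal sketch of the pretrain-then-adapt recipe: a model pretrained on
# unlabeled web text supplies features, and a tiny labeled set trains the task head.
# The model name, example texts, and labels below are hypothetical placeholders.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

# Encoder pretrained (self-supervised) on large amounts of unlabeled text.
encoder = SentenceTransformer("all-MiniLM-L6-v2")

# A handful of labeled, task-specific examples (toy support-ticket triage).
texts = [
    "My card was charged twice for the same order",
    "How do I reset my password?",
    "The app crashes when I open the settings page",
    "Please cancel my subscription and refund this month",
]
labels = ["billing", "account", "bug", "billing"]

# Pretrained features plus a simple classifier can be viable with very few labels.
features = encoder.encode(texts)
clf = LogisticRegression(max_iter=1000).fit(features, labels)

print(clf.predict(encoder.encode(["I was billed twice this month"])))
```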
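The TinyML highlight is about squeezing a trained network onto hardware with little memory, compute, and power. One common step in that workflow is post-training quantization; the sketch below shows it with TensorFlow Lite, using a throwaway stand-in model rather than anything from the article.

```python
# Sketch of a common TinyML step: post-training quantization with TensorFlow Lite,
# which stores weights as 8-bit integers to shrink memory and compute needs.
# The tiny Keras model here is a stand-in; in practice you would convert a trained network.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(4, activation="softmax"),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # default dynamic-range quantization
tflite_model = converter.convert()

# The resulting flat buffer is what an on-device runtime (e.g., TFLite Micro) loads.
with open("model.tflite", "wb") as f:
    f.write(tflite_model)

print(f"Quantized model size: {len(tflite_model)} bytes")
```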