the conceptual pillars

The Roadmap to Superalignment


Superalignment rests on four pillars:



  • Communication

  • Compute

  • Values

  • Humanity

Each is discussed below. This is merely a conceptual roadmap, kept short for the sake of brevity. We are happy to discuss any details omitted here.


Communication

Superalignment is by and large a bidirectional communication problem. The risk of catastrophe because a superintelligence misunderstands our desired outcome, or because we misunderstand its thought process, is much greater than the risk of inherent, unprovoked malevolence. Truthful communication is therefore crucial.


We postulate that the truthfulness of biological information transfer is related to how directly it impacts human wellbeing. For example, a compliment can be either kind or cruel depending on whether it is said sincerely or sarcastically. A punch in the face leaves no such ambiguity: the intent is unmistakably to hurt. In between, light and sound can scare much more immediately and intensely than words, which is why they are used to great effect in movies. We therefore propose a hierarchy of communication truthfulness, ordered by how far removed the communication signal is from direct harm to wellbeing.
