Posts

Showing posts from January, 2023

What AI can do: enumerated

"Frankenstein was written during the first Industrial Revolution, a period of enormous changes that provoked confusion and anxiety for many. It asked searching questions about man's relationship with technology: are we creating a monster we cannot control, are we losing our humanity, our compassion, our ability to feel empathy and emotions?"  This was written by Paolo Gallo, the Chief Human Resources officer at World Economic Forum.  "Frankenstein relies on the notion that humans will inherently reject artificial intelligence as unnatural and bizarre. A great deal of that is owed to the particularly odd appearance of Frankenstein's monster... But what about when AI comes in a more attractive package, one that has real utility?" Now here is the ever-expanding list:  - playing checkers  - natural language processing  - be an intelligent personal assistant - GANs (optimizing both a generator & discriminator) - DALL-E (optimizing both an embedding prior mode

Stoicism

Every rendezvous is your exultant surrender
to my intransigent soul and unlimited ambition
I have known pain or fear or guilt never
for rationality is my best vindication
Though your defiant frankness is an unrevealing sincerity
I practice the principle of scarcity
though indulging how the corners of your eyes crinkle and unfold
I plan to let time erode
Stoicism bids at the price of innocence
so you have earned the right to be light-hearted
I am done with melancholy
all my troubled waters are charted

How likely is deceptive alignment in practice?

How likely is deceptive alignment in practice? Speaker: Buck Shlegeris

What do ML inductive biases look like?
1. High path-dependence: different training runs can converge to very different models depending on the particular path taken through model space.
2. Low path-dependence: similar training processes converge to essentially the same, simple solution, regardless of early training dynamics.

Deceptive alignment in the high path-dependence world: suppose our training process is good enough that, for the model to do well, it has to fully understand what we want; essentially, this is what you get in the limit of doing enough adversarial training.

Goal attainment for models:
1. How much marginal performance improvement do we get from each step toward the model class?
2. How many steps are needed until the model becomes a member of that class?

Types of alignment:
1. Internal alignment: an internally aligned mesa-optimizer is a robustly aligned mesa-optimizer that has internalized the

Heuristics

DesignBoom graphic: sculptural, AI-generated facades, Renaissance and Baroque forms with fluid silk.

What is a heuristic technique? From our most credible source, I gathered that it is an approach to problem-solving that employs a practical method, arriving at a satisfactory solution by taking mental shortcuts and easing the cognitive load of making a decision. These practices are not, however, resistant to cognitive biases. We operate under bounded rationality, or in my own words, conditional rationality. It is conditional in the sense that, within computational boundaries and situational urgency, the behavioral strategy we select is the best one achievable. Heuristics converge on human psychology, delving into self-consciousness and fine-tuning, one decision upon another. One of the most frequently used heuristics is anchoring and adjustment, which simply sets a lower or higher bound to allow for systematic adjustments that reasonably deviate from th
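As a toy illustration of anchoring and adjustment as just described, here is a small Python sketch. The numbers, the step_fraction parameter, and the helper name anchor_and_adjust are all hypothetical, chosen only to show how partial adjustment away from an anchor leaves the final estimate biased toward it.

# Toy sketch of the anchoring-and-adjustment heuristic (illustrative values only):
# start from an anchor (a lower or higher bound) and adjust toward the evidence
# in a few coarse steps; the adjustment typically stops short of the true value,
# which is the bias classically associated with this heuristic.

def anchor_and_adjust(anchor, evidence, steps=3, step_fraction=0.25):
    """Estimate a quantity by repeatedly nudging an initial anchor toward the evidence."""
    estimate = anchor
    for _ in range(steps):
        estimate += step_fraction * (evidence - estimate)  # partial, not full, adjustment
    return estimate

# Example: guessing a city's population from an anchor of 100,000 when the
# available evidence points to roughly 400,000.
print(anchor_and_adjust(anchor=100_000, evidence=400_000))  # ~273,000, still biased toward the anchor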

Alignment landscape, learned content

More research output appears on LessWrong than on arXiv, which is interesting.

How might white-box methods fit into the alignment plan?
1. Model internal access during training and deployment
2. The promise of AI to empower

Within every research group working on ML models, we can decompose the workforce into these categories:
1. Data team (paying humans to generate data points)
2. Oversight team
3. Deployment team for SGD-where-RLHF-is-the-algorithm

RLHF is Reinforcement Learning from Human Feedback (a minimal reward-model sketch follows after this excerpt), and the problems with baseline RLHF are oversight and catastrophes. Current proposals targeting these problems are:
1. Using AIs to help oversee (oversight)
2. Adversarial training (catastrophes)

After reading Holden Karnofsky's post "How might we align transformative AI if it’s developed very soon?", we can conclude that the remaining problems for current ML models are:
1. Eliciting latent knowledge
2. Easier to detect fakes than to produce fakes. For ChatGPT at least, it is difficult
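Since the excerpt names RLHF as the algorithm behind the deployment step, here is a minimal sketch of its first ingredient: fitting a reward model on pairwise human preference comparisons. This is a hedged toy, not the post's method; the embedding dimension, network, learning rate, and the random stand-in data are all assumptions.

# Minimal sketch of one RLHF ingredient: training a reward model on pairwise
# human preferences (chosen vs. rejected responses). Shapes and data are
# illustrative assumptions; a real system would embed actual prompt/response text.
import torch
import torch.nn as nn
import torch.nn.functional as F

emb_dim = 32
reward_model = nn.Sequential(nn.Linear(emb_dim, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

for step in range(500):
    # Stand-ins for embeddings of (prompt + chosen) and (prompt + rejected) responses.
    chosen = torch.randn(16, emb_dim)
    rejected = torch.randn(16, emb_dim)

    # Bradley-Terry style preference loss: reward(chosen) should exceed reward(rejected).
    margin = reward_model(chosen) - reward_model(rejected)
    loss = -F.logsigmoid(margin).mean()

    opt.zero_grad()
    loss.backward()
    opt.step()

# The trained reward model would then score samples from the policy inside an RL
# loop (e.g. PPO), which is the "SGD-where-RLHF-is-the-algorithm" deployment step.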

EA and the FTX collapse

EA and the state of FTX, EAGxBerkeley, Dec 2-4, 2022

- The #1 rule in cryptocurrency: never let customers trade collateral, ensuring no gambling
- Alameda Research was a customer that was allowed to gamble during the 2021 bull market
- Alameda was short about 8 billion dollars (not known at the time), which resulted in liabilities exceeding assets, one of the greatest accounting errors in history

So what can we say about the FTX liquidation in general? Perhaps the fact that it was not handling risk management well, with fraudulent accounting and risky derivatives, caused domino effects:
1. Misconduct in statements from SBF was not publicly known
2. Depositors are victims (virtue ethics...)

What can we learn from this mistake?
1. Make sure future EAs know how smart oversight works
2. More governance; sloppy accounting demanded by an agent/CEO cannot be perpetuated further
3. Creating more "firewalls"

What are some of the persistent problems?