Biography:Paul Christiano (researcher)

Short description: American AI safety researcher

Paul Christiano is an American researcher in the field of artificial intelligence (AI), with a specific focus on AI alignment, the subfield of AI safety research that aims to steer AI systems toward human interests.[1] He formerly led the language model alignment team at OpenAI and went on to found and lead the non-profit Alignment Research Center (ARC), which works on theoretical AI alignment and on evaluations of machine learning models.[2][3] In 2023, Christiano was named one of the TIME 100 Most Influential People in AI (TIME100 AI).[3][4]

In September 2023, Christiano was appointed to the UK government's Frontier AI Taskforce advisory board.[5] He is also one of the initial trustees of Anthropic's Long-Term Benefit Trust.[6]

Education

Christiano attended the Harker School in San Jose, California.[7] He competed on the U.S. team and won a silver medal at the 49th International Mathematical Olympiad (IMO) in 2008.[7][8]

In 2012, Christiano graduated from the Massachusetts Institute of Technology (MIT) with a degree in mathematics.[9][10] At MIT, he researched data structures, quantum cryptography, and combinatorial optimization.[10]

He then completed a PhD at the University of California, Berkeley.[11] While at Berkeley, Christiano collaborated with researcher Katja Grace on AI Impacts, co-developing a preliminary methodology for comparing supercomputers to brains using traversed edges per second (TEPS).[12] He also experimented with putting Carl Shulman's donor lottery concept into practice, raising nearly $50,000 in a pool to be donated to a single charity.[13]

Career

At OpenAI, Christiano co-authored the paper "Deep Reinforcement Learning from Human Preferences" (2017) and other works developing reinforcement learning from human feedback (RLHF).[14][15] He is considered one of the principal architects of RLHF,[3][6] which in 2017 was, according to The New York Times, "considered a notable step forward in AI safety research".[16] Other works, such as "AI safety via debate" (2018), focus on the problem of scalable oversight – supervising AIs in domains where humans would have difficulty judging output quality.[17][18][19]

Christiano left OpenAI in 2021 to work on more conceptual and theoretical issues in AI alignment and subsequently founded the Alignment Research Center to focus on this area.[1] One subject of study is the problem of eliciting latent knowledge from advanced machine learning models.[20][21] ARC also develops techniques to identify and test whether an AI model is potentially dangerous.[3] In April 2023, Christiano told The Economist that ARC was considering developing an industry standard for AI safety.[22]

Views on AI risks

Christiano is known for his views on the potential risks of advanced AI. In 2017, Wired reported that Christiano and his colleagues at OpenAI were not worried about the destruction of the human race by "evil robots", explaining that "[t]hey’re more concerned that, as AI progresses beyond human comprehension, the technology’s behavior may diverge from our intended goals."[23]

However, in widely quoted remarks reported by Business Insider in 2023, Christiano said that there is a “10–20% chance of AI takeover, [with] many [or] most humans dead.” He also conjectured a “50/50 chance of doom shortly after you have AI systems that are human level.”[24][1]

Personal life

Christiano is married to Ajeya Cotra of the Open Philanthropy Project.[25]

References

  1. "A.I. has a '10 or 20% chance' of conquering humanity, former OpenAI safety researcher warns". Fortune. May 3, 2023. https://fortune.com/2023/05/03/openai-ex-safety-researcher-warns-ai-destroy-humanity/.
  2. Piper, Kelsey (March 29, 2023). "How to test what an AI model can — and shouldn't — do". Vox. https://www.vox.com/future-perfect/2023/3/29/23661633/gpt-4-openai-alignment-research-center-open-philanthropy-ai-safety.
  3. Henshall, Will (September 7, 2023). "Paul Christiano – Founder, Alignment Research Center". Time. https://time.com/collection/time100-ai/6309030/paul-christiano/.
  4. Sibley, Jess (September 10, 2023). "The Future Is Now". Time 202 (11/12). https://search.ebscohost.com/login.aspx?direct=true&db=a9h&AN=172374416&lang=en-gb&site=eds-live&scope=site.
  5. Skelton, Sebastian Klovig (September 7, 2023). "Government AI taskforce appoints new advisory board members". ComputerWeekly.com. https://www.computerweekly.com/news/366551256/Government-AI-taskforce-appoints-new-advisory-board-members.
  6. Matthews, Dylan (September 25, 2023). "The $1 billion gamble to ensure AI doesn't destroy humanity". Vox. https://www.vox.com/future-perfect/23794855/anthropic-ai-openai-claude-2.
  7. Kehoe, Elaine (October 2008). "Mathematics People – 2008 International Mathematical Olympiad". Notices of the American Mathematical Society. https://www.ams.org/notices/200810/tx081001284p.pdf.
  8. Feng, Zuming; Gelca, Razvan; Le, Ian; Dunbar, Steven R. (June 2009). "News and Letters: 49th International Mathematical Olympiad". Mathematics Magazine 82: 235–238. https://www.jstor.org/stable/27765911.
  9. "Paul F. Christiano". ACM Digital Library. https://dl.acm.org/profile/81485658302.
  10. "About the Authors". Theory of Computing. https://theoryofcomputing.org/articles/v009a009/about.html.
  11. "Paul Christiano – Research Associate". Future of Humanity Institute. http://www.fhi.ox.ac.uk/.
  12. Hsu, Jeremy (August 26, 2015). "Estimate: Human Brain 30 Times Faster than Best Supercomputers". IEEE Spectrum. https://spectrum.ieee.org/estimate-human-brain-30-times-faster-than-best-supercomputers.
  13. Paynter, Ben (January 31, 2017). "Take A Chance With Your Charity And Try A Donor Lottery". Fast Company. https://www.fastcompany.com/3067596/take-a-chance-with-your-charity-and-try-a-donor-lottery.
  14. Christiano, Paul F.; Leike, Jan; Brown, Tom; Martic, Miljan; Legg, Shane; Amodei, Dario (2017). "Deep Reinforcement Learning from Human Preferences". Advances in Neural Information Processing Systems 30. Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2017/hash/d5e2c0adad503c91f91df240d0cd4e49-Abstract.html.
  15. Ouyang, Long; Wu, Jeffrey; Jiang, Xu; Almeida, Diogo; Wainwright, Carroll; Mishkin, Pamela; Zhang, Chong; Agarwal, Sandhini; et al. (December 6, 2022). "Training language models to follow instructions with human feedback". Advances in Neural Information Processing Systems 35: 27730–27744. https://proceedings.neurips.cc/paper_files/paper/2022/hash/b1efde53be364a73914f58805a001731-Abstract-Conference.html.
  16. Metz, Cade (August 13, 2017). "Teaching A.I. Systems to Behave Themselves". The New York Times. https://www.nytimes.com/2017/08/13/technology/artificial-intelligence-safety-training.html.
  17. Irving, Geoffrey; Christiano, Paul; Amodei, Dario (May 2, 2018). "AI safety via debate". arXiv:1805.00899 [stat.ML].
  18. Wu, Jeff; Ouyang, Long; Ziegler, Daniel M.; Stiennon, Nissan; Lowe, Ryan; Leike, Jan; Christiano, Paul (September 22, 2021). "Recursively Summarizing Books with Human Feedback". arXiv:2109.10862 [cs.CL].
  19. Christiano, Paul; Shlegeris, Buck; Amodei, Dario (October 19, 2018). "Supervising strong learners by amplifying weak experts". arXiv:1810.08575 [cs.LG].
  20. Burns, Collin; Ye, Haotian; Klein, Dan; Steinhardt, Jacob (2022). "Discovering Latent Knowledge in Language Models Without Supervision". arXiv:2212.03827 [cs.CL].
  21. Christiano, Paul; Cotra, Ajeya; Xu, Mark (December 2021). "Eliciting Latent Knowledge: How to tell if your eyes deceive you". Alignment Research Center. https://docs.google.com/document/d/1WwsnJQstPq91_Yh-Ch2XRL8H_EpsnjrC1dwZXR37PC8/edit?usp=embed_facebook.
  22. "How generative models could go wrong". The Economist. April 19, 2023. https://www.economist.com/science-and-technology/2023/04/19/how-generative-models-could-go-wrong.
  23. Newman, Lily Hay (September 2017). "Should We Worry? – Will AI Turn Against Me?". Wired. https://www.wired.com/2017/08/dont-worry-be-happy/.
  24. Nolan, Beatrice (2023). "Ex-OpenAI researcher says there's a 50% chance AI development could end in 'doom'". Business Insider. https://www.businessinsider.com/openai-researcher-ai-doom-50-chatgpt-2023-5.
  25. Piper, Kelsey (June 2023). "A Field Guide to AI Safety". Asterisk Magazine (3). https://asteriskmag.com/issues/03/a-field-guide-to-ai-safety.

External links

Official website: paulfchristiano.com