Programming in Scala, Fifth Edition, by Martin Odersky – B097KT5XNK ISBN-13: 978-0997148008, June 15, 2021

See: Programming in Scala, Fifth Edition, by Martin Odersky. Publisher: Artima Press; 5th edition (June 15, 2021)

Fair Use Source:

This book is the authoritative tutorial on the Scala programming language, co-written by the language’s designer, Martin Odersky. This fifth edition is a major rewrite of the entire book, adding new material to cover the many changes in Scala 3.0. In fact we have added so much new material that we split the book into two volumes. This volume is a tutorial of Scala and functional programming.

Praise for the earlier editions of Programming in Scala

Programming in Scala is probably one of the best programming books I’ve ever read. I like the writing style, the brevity, and the thorough explanations. The book seems to answer every question as it enters my mind—it’s always one step ahead of me. The authors don’t just give you some code and take things for granted. They give you the meat so you really understand what’s going on. I really like that.

  • Ken Egervari, Chief Software Architect

Programming in Scala is clearly written, thorough, and easy to follow. It has great examples and useful tips throughout. It has enabled our organization to ramp up on the Scala language quickly and efficiently. This book is great for any programmer who is trying to wrap their head around the flexibility and elegance of the Scala language.

  • Larry Morroni, Owner, Morroni Technologies, Inc.

The Programming in Scala book serves as an excellent tutorial to the Scala language. Working through the book, it flows well with each chapter building on concepts and examples described in earlier ones. The book takes care to explain the language constructs in depth, often providing examples of how the language differs from Java. As well as the main language, there is also some coverage of libraries such as containers and actors.

I have found the book really easy to work through, and it is probably one of the better written technical books I have read recently. I really would recommend this book to any programmer wanting to find out more about the Scala language.

  • Matthew Todd

I am amazed by the effort undertaken by the authors of Programming in Scala. This book is an invaluable guide to what I like to call Scala the Platform: a vehicle to better coding, a constant inspiration for scalable software design and implementation. If only I had Scala in its present mature state and this book on my desk back in 2003, when co-designing and implementing parts of the Athens 2004 Olympic Games Portal infrastructure!

To all readers: No matter what your programming background is, I feel you will find programming in Scala liberating and this book will be a loyal friend in the journey.

  • Christos KK Loverdos, Software Consultant, Researcher

Programming in Scala is a superb in-depth introduction to Scala, and it’s also an excellent reference. I’d say that it occupies a prominent place on my bookshelf, except that I’m still carrying it around with me nearly everywhere I go.

  • Brian Clapper, President, ArdenTex, Inc.

Great book, well written with thoughtful examples. I would recommend it to both seasoned programmers and newbies.

  • Howard Lovatt

The book Programming in Scala is not only about how, but more importantly, why to develop programs in this new programming language. The book’s pragmatic approach in introducing the power of combining object-oriented and functional programming leaves the reader without any doubts as to what Scala really is.

  • Dr. Ervin Varga, CEO/founder, EXPRO I.T. Consulting

This is a great introduction to functional programming for OO programmers. Learning about FP was my main goal, but I also got acquainted with some nice Scala surprises like case classes and pattern matching. Scala is an intriguing language and this book covers it well.

There’s always a fine line to walk in a language introduction book between giving too much or not enough information. I find Programming in Scala to achieve a perfect balance.

  • Jeff Heon, Programmer Analyst

I bought an early electronic version of the Programming in Scala book, by Odersky, Spoon, and Venners, and I was immediately a fan. In addition to the fact that it contains the most comprehensive information about the language, there are a few key features of the electronic format that impressed me. I have never seen links used as well in a PDF, not just for bookmarks, but also providing active links from the table of contents and index. I don’t know why more authors don’t use this feature, because it’s really a joy for the reader. Another feature which I was impressed with was links to the forums (“Discuss”) and a way to send comments (“Suggest”) to the authors via email. The comments feature by itself isn’t all that uncommon, but the simple inclusion of a page number in what is generated to send to the authors is valuable for both the authors and readers. I contributed more comments than I would have if the process would have been more arduous.

Read Programming in Scala for the content, but if you’re reading the electronic version, definitely take advantage of the digital features that the authors took the care to build in!

  • Dianne Marsh, Founder/Software Consultant, SRT Solutions

Lucidity and technical completeness are hallmarks of any well-written book, and I congratulate Martin Odersky, Lex Spoon, and Bill Venners on a job indeed very well done! The Programming in Scala book starts by setting a strong foundation with the basic concepts and ramps up the user to an intermediate level & beyond. This book is certainly a must buy for anyone aspiring to learn Scala.

  • Jagan Nambi, Enterprise Architecture, GMAC Financial Services

Programming in Scala is a pleasure to read. This is one of those well-written technical books that provide deep and comprehensive coverage of the subject in an exceptionally concise and elegant manner.

The book is organized in a very natural and logical way. It is equally well suited for a curious technologist who just wants to stay on top of the current trends and a professional seeking deep understanding of the language core features and its design rationales. I highly recommend it to all interested in functional programming in general. For Scala developers, this book is unconditionally a must-read.

  • Igor Khlystov, Software Architect/Lead Programmer, Greystone Inc.

The book Programming in Scala outright oozes the huge amount of hard work that has gone into it. I’ve never read a tutorial-style book before that accomplishes to be introductory yet comprehensive: in their (misguided) attempt to be approachable and not “confuse” the reader, most tutorials silently ignore aspects of a subject that are too advanced for the current discussion. This leaves a very bad taste, as one can never be sure as to the understanding one has achieved. There is always some residual “magic” that hasn’t been explained and cannot be judged at all by the reader. This book never does that, it never takes anything for granted: every detail is either sufficiently explained or a reference to a later explanation is given. Indeed, the text is extensively cross-referenced and indexed, so that forming a complete picture of a complex topic is relatively easy.

  • Gerald Loeffler, Enterprise Java Architect

Programming in Scala by Martin Odersky, Lex Spoon, and Bill Venners: in times where good programming books are rare, this excellent introduction for intermediate programmers really stands out. You’ll find everything here you need to learn this promising language.

  • Christian Neukirchen

Programming in Scala, Fifth Edition

Martin Odersky, Lex Spoon, Bill Venners, and Frank Sommers

Artima Press, Walnut Creek, California

Martin Odersky is the creator of the Scala language and a professor at EPFL in Lausanne, Switzerland. Lex Spoon worked on Scala for two years as a post-doc with Martin Odersky. Bill Venners is president of Artima, Inc. Frank Sommers is president of Autospaces, Inc.

Artima Press is an imprint of Artima, Inc.

2070 N Broadway Unit 305, Walnut Creek, California 94597

Copyright © 2007-2021 Martin Odersky, Lex Spoon, Bill Venners, and Frank Sommers. All rights reserved.

First edition published as PrePrint™ eBook 2007

First edition published 2008

Second edition published 2010

Third edition published 2016

Fourth edition published 2019

Fifth edition published as PrePrint™ eBook 2021

Build date of this impression July 12, 2021

Produced in the United States of America

No part of this publication may be reproduced, modified, distributed, stored in a retrieval system, republished, displayed, or performed, for commercial or noncommercial purposes or for compensation of any kind without prior written permission from Artima, Inc.

All information and materials in this book are provided “as is” and without warranty of any kind.

The term “Artima” and the Artima logo are trademarks or registered trademarks of Artima, Inc. All other company and/or product names may be trademarks or registered trademarks of their owners.

to Nastaran – M.O.

to Fay – L.S.

to Siew – B.V.

to Jungwon – F.S.


Table of Contents




  1. A Scalable Language
  2. First Steps in Scala
  3. Next Steps in Scala
  4. Classes and Objects
  5. Basic Types and Operations
  6. Functional Objects
  7. Built-in Control Structures
  8. Functions and Closures
  9. Control Abstraction
  10. Composition and Inheritance
  11. Traits
  12. Packages, Imports, and Exports
  13. Pattern Matching
  14. Working with Lists
  15. Working with Other Collections
  16. Mutable Objects
  17. Scala’s Hierarchy
  18. Type Parameterization
  19. Enums
  20. Abstract Members
  21. Givens
  22. Extension Methods
  23. Typeclasses
  24. Collections in Depth
  25. Assertions and Tests



About the Authors



Foreword

Watching the birth of a new programming language is a funny thing. To anyone who uses a programming language—whether you are dabbling with programming for the first time or are a grizzled career software engineer—programming languages just seem to exist. Like a hammer or axe, a programming language is a tool, already there, that enables us to perform our trade. We seldom think of how that tool came to be, what the process was to design it. We might have opinions about its design, but beyond that, we usually just put up with it and push forward.

Being there as a programming language is created, however, brings a totally different perspective. The possibilities for what could be seem endless. Yet at the same time, the programming language must satisfy what seems like an infinite list of constraints. It’s an odd tension.

New programming languages are created for many reasons: a personal desire to scratch a niggling itch, a profound academic insight, technical debt, or the benefit of hindsight of other compiler architectures—even politics. Scala 3 is a combination of some of these.

Whatever combination it may be, it all started when Martin Odersky disappeared one day, emerging a few days later to announce in a research group meeting that he had begun experimenting with bringing the DOT calculus to life by writing a new compiler from scratch.[1] Here we were, a group of PhD students and postdocs who had until recently been a big part of the development and maintenance of Scala 2. At the time, Scala was reaching what felt like unfathomable heights of success, especially for an esoteric and academic programming language from a school with a funny-sounding name in Switzerland. Scala had recently taken off among startups in the Bay Area, and Typesafe, later named Lightbend, had recently been formed to support, maintain, and manage releases of Scala 2. So why all of a sudden a new compiler and possibly a new and different programming language? Most were skeptical. Martin was undeterred.

Months passed. Like clockwork, at twelve noon every day, the entire lab would gather in the hallway connecting all of our offices. After a fair number of us and Martin had assembled, we’d venture together to one of EPFL’s many cafeterias, grab lunch, and later, an after-lunch coffee. Each day during this ritual, ideas for this new compiler were a recurrent theme. Discussions would ping-pong about, anywhere from focusing on something “150%” compatible with Scala 2 (to avoid a Python 2 versus Python 3 debacle), to creating a new language with full-spectrum dependent types.

One by one, the skeptics in the research group seemed to become sold by some appealing aspect of Scala 3, be it streamlining the implementation of the typechecker, the brand new compiler architecture, or the powerful additions to the type system. Over time, much of the community also came around to the idea of Scala 3 being a marked improvement over Scala 2. Different people had different reasons for this. For some, it was the improvements in readability via the decision to make braces and parentheses around conditions in conditionals optional. For others, it was improvements to the type system; for example, match types for improved type-level programming. The list went on.

Scala 3 was not designed by blindly steaming ahead on hunch alone. I can confidently assert that it is the result of much learning from design decisions of the past, and of years of conversation with the EPFL research group and the Scala community. And there was no way but to start from a clean slate, and build on clean foundations. With this from-scratch approach, what emerged is, at its core, a new programming language.

Scala 3 is a new programming language. Sure, it might be compatible with Scala 2, and it might sound like the third major release of an already-existing programming language. But don’t let that fool you. Scala 3 represents a substantial streamlining of many of the experimental ideas pioneered in Scala 2.

Perhaps what is most uniquely “Scala” about Scala 3 is what happened to implicits. Scala, since its inception, has been used by clever programmers to achieve functionality that few thought was even possible given Scala’s feature set, let alone something Scala was designed for. The feature previously known as implicits is perhaps the most well-known feature of Scala that has been exploited to bend Scala 2 in unexpected ways. Example use cases of implicits include retroactively adding a method to a class, without extending and re-compiling that class. Or, given a type signature used in some context, automatically selecting the right implementation for that context. This is just the tip of the iceberg—we even wrote a research paper attempting to catalog the multitude of ways in which developers have used implicits.[2]
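The two use cases named above can be sketched in the older implicit style. This is only an illustration, not code from the book: every name in it (Temperature, TemperatureOps, Show, describe) is hypothetical, and it is written with braces and the implicit keyword, which Scala 3 compilers still accept for compatibility as far as I know.

```scala
// Scala 2-style implicits, written with braces so the sketch reads like Scala 2 code.
// All names here (Temperature, TemperatureOps, Show, describe) are hypothetical.
object ImplicitsSketch {
  // Pretend this class lives in a library we cannot modify or recompile.
  final class Temperature(val celsius: Double)

  // Use case 1: retroactively add a method through an implicit class.
  implicit class TemperatureOps(t: Temperature) {
    def fahrenheit: Double = t.celsius * 9 / 5 + 32
  }

  // Use case 2: let the compiler pick an implementation from the type.
  trait Show[A] {
    def show(a: A): String
  }

  implicit val showInt: Show[Int] = new Show[Int] {
    def show(a: Int): String = s"Int($a)"
  }

  implicit val showString: Show[String] = new Show[String] {
    def show(a: String): String = s"String($a)"
  }

  // The implicit parameter is filled in by the compiler at each call site.
  def describe[A](a: A)(implicit s: Show[A]): String = s.show(a)
}
```

After importing the members of ImplicitsSketch, a Temperature value gains a fahrenheit method, and describe(42) and describe("hi") resolve to different Show instances without the caller naming either one. The same keyword drives both behaviors, which is the generality the foreword goes on to question.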

This is like providing a user with knobs and levers and leaving it to them to build a smoothly functioning piece of machinery, like the mechanical calculator shown in Figure 2.0. But often, what comes out instead is something that looks more like a kinetic sculpture of Theo Jansen, such as the one shown in Figure 2.0, than anything with an obvious use.[3] Simply put, you give a programming community something as basic as a lever and a knob, and the intrepid will seek to find creative ways to use it. It’s human nature. But perhaps here, Scala 2’s mistake was the idea to provide something as general-purpose as knobs and levers in the first place.

Figure: What was intended (a mechanical calculator).

Figure: What we got (a kinetic sculpture by Theo Jansen).

The point here is that in Scala 2, there was this endless array of possibilities for what implicits could be used for, which necessitated an entire research paper, and which the community generally couldn’t agree upon how to sanely use. No language feature should have such a murky purpose for being. And yet, here they were—implicits were seen by many as the unique and powerful feature of Scala that essentially no other language had, and by many others as a mysterious and often frustrating mechanism that would invasively rewrite code that you had written to be something else.

You may have heard the oft-repeated mantra that in many ways, Scala 3 represents a simplification of the Scala that came before it. The story of implicits is an excellent example. Cognizant of the back flips programmers were doing with implicits in an attempt to realize broader programming patterns like typeclass derivation, Martin, with the help of many others, came to the conclusion that we should not focus on implicits as a mechanism for people to use in the most general case. Rather, we should focus on what programmers want to do with implicits, and make that easier and more performant. This is where the mantra, “Scala 3 focuses on intent rather than mechanism,” comes from.

With Scala 3, rather than focus on the generality of implicits as a mechanism, the decision was made to focus on specific use-cases that developers had in mind when choosing to use implicits in the first place, and to make these patterns more direct to use. Examples include passing context or configuration information to methods implicitly, without the programmer having to explicitly thread through repetitive arguments; retroactively adding methods to classes; and converting between types, like Ints and Doubles during arithmetic. Now, Scala 3 makes these use cases available to programmers without needing to understand some “deep” intuition about how the Scala compiler resolves implicits. You can instead just focus on tasks like “add a method foo to class Bar without having to recompile it.” No PhD required. Just replace the previous notion of “implicit” with other more direct keywords that correspond to specific use cases, such as given and using. See Chapters 21 and 22 for more on this.
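A minimal Scala 3 sketch of the three intents listed above may help. It is an assumption-laden illustration rather than the book's own code: Config, Bar, Meters, render, and defaultConfig are hypothetical names chosen here, and only the keywords given, using, and extension and the standard Conversion class come from the language itself.

```scala
import scala.language.implicitConversions

// Hypothetical types for illustration; none of these names come from the book.
final case class Config(indent: Int)
final class Bar(val name: String)
final case class Meters(value: Double)

// Intent 1: pass context to a method without threading it through every call.
def render(text: String)(using cfg: Config): String =
  " " * cfg.indent + text

// The context value that `using` clauses will pick up.
given defaultConfig: Config = Config(indent = 2)

// Intent 2: add a method `foo` to class `Bar` without recompiling it.
extension (b: Bar)
  def foo: String = s"foo(${b.name})"

// Intent 3: convert between types, declared as a given Conversion instance.
given Conversion[Int, Meters] = n => Meters(n.toDouble)
```

With these definitions in scope, render("hello") picks up defaultConfig on its own, Bar("b").foo calls the extension method, and val m: Meters = 3 applies the conversion. Each construct states its purpose in its keyword, which is one way to read the "intent over mechanism" idea.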

This story of “prioritizing intent over mechanism” doesn’t stop at the revamping of implicits. Rather, the philosophy goes on to touch upon most every aspect of the language. Examples include additions and streamlining of many aspects of Scala’s type system, from union types, to enums, to match types—or even the cleanups to Scala’s syntax: optional braces to improve readability, and more readable “quiet” syntax for ifs, elses, and whiles, resulting in conditionals that look much more English-like.
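The features mentioned in this paragraph can be shown together in a short Scala 3 sketch. The names Color, label, Elem, firstOf, and sumOfEvens are hypothetical and chosen only for illustration; the sketch assumes a Scala 3 compiler and uses the optional-braces style throughout.

```scala
// Hypothetical names throughout; a sketch, not code from the book.

// An enum: a closed set of named values.
enum Color:
  case Red, Green, Blue

// A union type: the argument is either an Int or a String.
def label(x: Int | String): String = x match
  case n: Int    => s"number $n"
  case s: String => s"text $s"

// A match type: a type computed from another type.
type Elem[X] = X match
  case String  => Char
  case List[t] => t

// Elem[String] reduces to Char, so returning a Char typechecks.
def firstOf(s: String): Elem[String] = s.charAt(0)

// "Quiet" control syntax: no parentheses around conditions, no braces.
def sumOfEvens(upTo: Int): Int =
  var total = 0
  var i = 1
  while i <= upTo do
    if i % 2 == 0 then total += i
    i += 1
  total
```

For example, label(1) and label("a") take different branches of the same method, and sumOfEvens(6) adds 2, 4, and 6. The `if ... then` and `while ... do` forms are the conditionals the foreword describes as reading more like English.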

Don’t take my word for it. Whether you’re a newcomer to Scala, or an experienced Scala developer, I hope you find many of the new design ideas that have entered Scala 3 to be as refreshing and straightforward as I do!

Heather Miller

Lausanne, Switzerland

June 1, 2021


[1] DOT, or dependent object types, calculus is a series of formalizations attempting to characterize the essence of Scala’s type system.

[2] Krikava et al., Scala implicits are everywhere. [Kri19]

[3] For a more dynamic depiction of the kinetic sculptures of Theo Jansen, which are collectively entitled Strandbeest, see


Acknowledgments

Many people have contributed to this book and to the material it covers. We are grateful to all of them.

Scala itself has been a collective effort of many people. The design and the implementation of version 1.0 was helped by Philippe Altherr, Vincent Cremet, Gilles Dubochet, Burak Emir, Stéphane Micheloud, Nikolay Mihaylov, Michel Schinz, Erik Stenman, and Matthias Zenger. Phil Bagwell, Antonio Cunei, Iulian Dragos, Gilles Dubochet, Miguel Garcia, Philipp Haller, Sean McDirmid, Ingo Maier, Donna Malayeri, Adriaan Moors, Hubert Plociniczak, Paul Phillips, Aleksandar Prokopec, Tiark Rompf, Lukas Rytz, and Geoffrey Washburn joined in the effort to develop the second and current version of the language and tools.

Gilad Bracha, Nathan Bronson, Caoyuan, Aemon Cannon, Craig Chambers, Chris Conrad, Erik Ernst, Matthias Felleisen, Mark Harrah, Shriram Krishnamurti, Gary Leavens, David MacIver, Sebastian Maneth, Rickard Nilsson, Erik Meijer, Lalit Pant, David Pollak, Jon Pretty, Klaus Ostermann, Jorge Ortiz, Didier Rémy, Miles Sabin, Vijay Saraswat, Daniel Spiewak, James Strachan, Don Syme, Erik Torreborre, Mads Torgersen, Philip Wadler, Jamie Webb, John Williams, Kevin Wright, and Jason Zaugg have shaped the design of the language by graciously sharing their ideas with us in lively and inspiring discussions, by contributing important pieces of code to the open source effort, as well as through comments on previous versions of this document. The contributors to the Scala mailing list have also given very useful feedback that helped us improve the language and its tools.

George Berger has worked tremendously to make the build process and the web presence for the book work smoothly. As a result this project has been delightfully free of technical snafus.

Many people gave us valuable feedback on early versions of the text. Thanks goes to Eric Armstrong, George Berger, Alex Blewitt, Gilad Bracha, William Cook, Bruce Eckel, Stéphane Micheloud, Todd Millstein, David Pollak, Frank Sommers, Philip Wadler, and Matthias Zenger. Thanks also to the Silicon Valley Patterns group for their very helpful review: Dave Astels, Tracy Bialik, John Brewer, Andrew Chase, Bradford Cross, Raoul Duke, John P. Eurich, Steven Ganz, Phil Goodwin, Ralph Jocham, Yan-Fa Li, Tao Ma, Jeffery Miller, Suresh Pai, Russ Rufer, Dave W. Smith, Scott Turnquest, Walter Vannini, Darlene Wallach, and Jonathan Andrew Wolter. And we’d like to thank Dewayne Johnson and Kim Leedy for their help with the cover art, and Frank Sommers for his work on the index.

We’d also like to extend a special thanks to all of our readers who contributed comments. Your comments were very helpful to us in shaping this into an even better book. We couldn’t print the names of everyone who contributed comments, but here are the names of readers who submitted at least five comments during the eBook PrePrint™ stage by clicking on the Suggest link, sorted first by the highest total number of comments submitted, then alphabetically. Thanks goes to: David Biesack, Donn Stephan, Mats Henricson, Rob Dickens, Blair Zajac, Tony Sloane, Nigel Harrison, Javier Diaz Soto, William Heelan, Justin Forder, Gregor Purdy, Colin Perkins, Bjarte S. Karlsen, Ervin Varga, Eric Willigers, Mark Hayes, Martin Elwin, Calum MacLean, Jonathan Wolter, Les Pruszynski, Seth Tisue, Andrei Formiga, Dmitry Grigoriev, George Berger, Howard Lovatt, John P. Eurich, Marius Scurtescu, Jeff Ervin, Jamie Webb, Kurt Zoglmann, Dean Wampler, Nikolaj Lindberg, Peter McLain, Arkadiusz Stryjski, Shanky Surana, Craig Bordelon, Alexandre Patry, Filip Moens, Fred Janon, Jeff Heon, Boris Lorbeer, Jim Menard, Tim Azzopardi, Thomas Jung, Walter Chang, Jeroen Dijkmeijer, Casey Bowman, Martin Smith, Richard Dallaway, Antony Stubbs, Lars Westergren, Maarten Hazewinkel, Matt Russell, Remigiusz Michalowski, Andrew Tolopko, Curtis Stanford, Joshua Cough, Zemian Deng, Christopher Rodrigues Macias, Juan Miguel Garcia Lopez, Michel Schinz, Peter Moore, Randolph Kahle, Vladimir Kelman, Daniel Gronau, Dirk Detering, Hiroaki Nakamura, Ole Hougaard, Bhaskar Maddala, David Bernard, Derek Mahar, George Kollias, Kristian Nordal, Normen Mueller, Rafael Ferreira, Binil Thomas, John Nilsson, Jorge Ortiz, Marcus Schulte, Vadim Gerassimov, Cameron Taggart, Jon-Anders Teigen, Silvestre Zabala, Will McQueen, and Sam Owen.

We would also like to thank those who submitted comments and errata after the first two editions were published, including Felix Siegrist, Lothar Meyer-Lerbs, Diethard Michaelis, Roshan Dawrani, Donn Stephan, William Uther, Francisco Reverbel, Jim Balter, and Freek de Bruijn, Ambrose Laing, Sekhar Prabhala, Levon Saldamli, Andrew Bursavich, Hjalmar Peters, Thomas Fehr, Alain O’Dea, Rob Dickens, Tim Taylor, Christian Sternagel, Michel Parisien, Joel Neely, Brian McKeon, Thomas Fehr, Joseph Elliott, Gabriel da Silva Ribeiro, Thomas Fehr, Pablo Ripolles, Douglas Gaylor, Kevin Squire, Harry-Anton Talvik, Christopher Simpkins, Martin Witmann-Funk, Jim Balter, Peter Foster, Craig Bordelon, Heinz-Peter Gumm, Peter Chapin, Kevin Wright, Ananthan Srinivasan, Omar Kilani, Donn Stephan, Guenther Waffler.

Lex would like to thank Aaron Abrams, Jason Adams, Henry and Emily Crutcher, Joey Gibson, Gunnar Hillert, Matthew Link, Toby Reyelts, Jason Snape, John and Melinda Weathers, and all of the Atlanta Scala Enthusiasts for many helpful discussions about the language design, its mathematical underpinnings, and how to present Scala to working engineers.

A special thanks to Dave Briccetti and Adriaan Moors for reviewing the third edition, and to Marconi Lanna for not only reviewing, but providing motivation for the third edition by giving a talk entitled “What’s new since Programming in Scala.”

Bill would like to thank Gary Cornell, Greg Doench, Andy Hunt, Mike Leonard, Tyler Ortman, Bill Pollock, Dave Thomas, and Adam Wright for providing insight and advice on book publishing. Bill would also like to thank Dick Wall for collaborating on our Stairway to Scala course, which is in great part based on this book. Our many years of experience teaching Stairway to Scala helped make this book better. Lastly, Bill would like to thank Darlene Gruendl and Samantha Woolf for their help in getting the third edition completed.

Finally, we would like to thank Julien Richard-Foy for his work to bring the fourth edition of this book up to date with Scala 2.13, in particular the new collections redesign.


Introduction

This book is a tutorial for the Scala programming language, written by people directly involved in the development of Scala. Our goal is that by reading this book, you can learn everything you need to be a productive Scala programmer. All examples in this book compile with Scala version 3.0.0.

Who should read this book

The main target audience for this book is programmers who want to learn to program in Scala. If you want to do your next software project in Scala, then this is the book for you. In addition, the book should be interesting to programmers wishing to expand their horizons by learning new concepts. If you’re a Java programmer, for example, reading this book will expose you to many concepts from functional programming as well as advanced object-oriented ideas. We believe learning about Scala, and the ideas behind it, can help you become a better programmer in general.

General programming knowledge is assumed. While Scala is a fine first programming language, this is not the book to use to learn programming.

On the other hand, no specific knowledge of programming languages is required. Even though most people use Scala on the Java platform, this book does not presume you know anything about Java. However, we expect many readers to be familiar with Java, and so we sometimes compare Scala to Java to help such readers understand the differences.

How to use this book

Because the main purpose of this book is to serve as a tutorial, the recommended way to read this book is in chapter order, from front to back. We have tried hard to introduce one topic at a time, and explain new topics only in terms of topics we’ve already introduced. Thus, if you skip to the back to get an early peek at something, you may find it explained in terms of concepts you don’t quite understand. To the extent you read the chapters in order, we think you’ll find it quite straightforward to gain competency in Scala, one step at a time.

If you see a term you do not know, be sure to check the glossary and the index. Many readers will skim parts of the book, and that is just fine. The glossary and index can help you backtrack whenever you skim over something too quickly.

After you have read the book once, it should also serve as a language reference. There is a formal specification of the Scala language, but the language specification tries for precision at the expense of readability. Although this book doesn’t cover every detail of Scala, it is quite comprehensive and should serve as an approachable language reference as you become more adept at programming in Scala.



Bibliography

[Ray99] Raymond, Eric. The Cathedral & the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary. O’Reilly, 1999.

[Blo08] Bloch, Joshua. Effective Java Second Edition. Addison-Wesley, 2008.

[Bay72] Bayer, Rudolf. Symmetric binary B-Trees: Data structure and maintenance algorithms. Acta Informatica, 1(4):290–306, 1972.

[Mor68] Morrison, Donald R. PATRICIA—Practical Algorithm To Retrieve Information Coded in Alphanumeric. J. ACM, 15(4):514–534, 1968. ISSN 0004-5411.

[DeR75] DeRemer, Frank and Hans Kron. Programming-in-the-large versus programming-in-the-small. In Proceedings of the international conference on Reliable software, pages 114–121. ACM, New York, NY, USA, 1975.

[SPJ02] Peyton Jones, Simon. Haskell 98 Language and Libraries, Revised Report. Technical report, 2002.

[Vaz07] Vaziri, Mandana, Frank Tip, Stephen Fink, and Julian Dolby. Declarative Object Identity Using Relation Types. In Proc. ECOOP 2007, pages 54–78. 2007.

[Mey91] Meyers, Scott. Effective C++. Addison-Wesley, 1991.

[Rum04] Rumbaugh, James, Ivar Jacobson, and Grady Booch. The Unified Modeling Language Reference Manual (2nd Edition). Addison-Wesley, 2004.

[Goe06] Goetz, Brian, Tim Peierls, Joshua Bloch, Joseph Bowbeer, David Holmes, and Doug Lea. Java Concurrency in Practice. Addison-Wesley, 2006.

[Mey00] Meyer, Bertrand. Object-Oriented Software Construction. Prentice Hall, 2000.

[Eck98] Eckel, Bruce. Thinking in Java. Prentice Hall, 1998.

[Eva03] Evans, Eric. Domain-Driven Design: Tackling Complexity in the Heart of Software. Addison-Wesley Professional, 2003.

[Aho86] Aho, Alfred V., Ravi Sethi, and Jeffrey D. Ullman. Compilers: Principles, Techniques, and Tools. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1986. ISBN 0-201-10088-6.

[Abe96] Abelson, Harold and Gerald Jay Sussman. Structure and Interpretation of Computer Programs. The MIT Press, second edition, 1996.

[Ode03] Odersky, Martin, Vincent Cremet, Christine Röckl, and Matthias Zenger. A Nominal Theory of Objects with Dependent Types. In Proc. ECOOP’03, Springer LNCS, pages 201–225. July 2003.

[Ode11] Odersky, Martin. The Scala Language Specification, Version 2.9. EPFL, May 2011. Available on the web at (accessed April 20, 2014).

[Ode05] Odersky, Martin and Matthias Zenger. Scalable Component Abstractions. In Proceedings of OOPSLA, pages 41–58. October 2005.

[Emi07] Emir, Burak, Martin Odersky, and John Williams. Matching Objects With Patterns. In Proc. ECOOP, Springer LNCS, pages 273–295. July 2007.

[Ste99] Steele, Jr., Guy L. Growing a Language. Higher-Order and Symbolic Computation, 12:221–223, 1999. Transcript of a talk given at OOPSLA 1998.

[Str00] Strachey, Christopher. Fundamental Concepts in Programming Languages. Higher-Order and Symbolic Computation, 13:11–49, 2000.

[Kay96] Kay, Alan C. The Early History of Smalltalk. In History of programming languages—II, pages 511–598. ACM, New York, NY, USA, 1996. ISBN 0-201-89502-1.

[Jav] The Java Tutorials: Creating a GUI with JFC/Swing. Available on the web at

[Lan66] Landin, Peter J. The Next 700 Programming Languages. Communications of the ACM, 9(3):157–166, 1966.

[Fow04] Fowler, Martin. Inversion of Control Containers and the Dependency Injection pattern. January 2004. Available on the web at (accessed August 6, 2008).

[Gam95] Gamma, Erich, Richard Helm, Ralph Johnson, and John Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, 1995.

[Kay03] Kay, Alan C. An email to Stefan Ram on the meaning of the term “object-oriented programming”, July 2003. The email is published on the web at (accessed June 6, 2008).

[Dij70] Dijkstra, Edsger W. Notes on Structured Programming. April 1970. Circulated privately. Available at /users/EWD/ewd02xx/EWD249.PDF as EWD249 (accessed June 6, 2008).

[Ste15] Steindorfer, Michael J and Jurgen J Vinju. Optimizing hash-array mapped tries for fast and lean immutable JVM collections. In ACM SIGPLAN Notices, volume 50, pages 783–800. ACM, 2015.

[Kri19] Krikava, Filip, Heather Miller, and Jan Vitek. Scala implicits are everywhere: a large-scale study of the use of Scala implicits in the wild. In Proceedings of the ACM on Programming Languages, volume 3. ACM, 2019.

About the Authors

Martin Odersky is the creator of the Scala language. He is a professor at EPFL in Lausanne, Switzerland, where since 2001 he has led the team that developed the Scala language, libraries, and compiler. He is a founder of Lightbend, Inc. Lex Spoon worked on Scala for two years at EPFL and is now a software engineer at Square, Inc. Bill Venners is president of Artima, Inc. He is a community representative on the Scala Center Advisory Board, and the designer of ScalaTest. Frank Sommers is president of Autospaces, Inc.

Martin works on programming languages and systems, more specifically on the topic of how to combine object-oriented and functional programming. Since 2001 he has concentrated on designing, implementing, and refining Scala. Previously, he has influenced the development of Java as a co-designer of Java generics and as the original author of the current javac reference compiler. He is a fellow of the ACM.

Lex Spoon is a software engineer at Square, Inc., which provides easy to use business software and mobile payments. In addition to Scala, he has helped develop a wide variety of programming languages, including the dynamic language Smalltalk, the scientific language X10, and the logic language CodeQL.

Bill Venners is president of Artima, Inc., publisher of the Artima website and provider of Scala consulting, training, books, and tools. He is author of the book Inside the Java Virtual Machine, a programmer-oriented survey of the Java platform’s architecture and internals. His popular columns in JavaWorld magazine covered Java internals, object-oriented design, and Jini. Bill is a community representative on the Scala Center advisory board, and is the lead developer and designer of the ScalaTest test framework and the Scalactic library for functional, object-oriented programming.

Frank Sommers is founder and president of Autospaces, Inc., a company providing workflow automation solutions to the financial services industry. Frank has been an active Scala user for over twelve years, and has worked with the language daily ever since.

Scala Glossary from the Book:


algebraic data type A type defined by providing several alternatives, each of which comes with its own constructor. It usually comes with a way to decompose the type through pattern matching. The concept is found in specification languages and functional programming languages. Algebraic data types can be emulated in Scala with case classes.
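
As a rough illustration of this entry, the sketch below emulates an algebraic data type with a sealed trait and two case classes, then decomposes it by pattern matching. The names AdtDemo, Shape, Circle, Rect, and area are hypothetical and do not come from the book.

```scala
object AdtDemo {
  // Hypothetical type with exactly two alternatives, each with its own constructor.
  sealed trait Shape
  case class Circle(radius: Double) extends Shape
  case class Rect(width: Double, height: Double) extends Shape

  // Pattern matching decomposes a Shape back into its constructor arguments.
  def area(shape: Shape): Double = shape match {
    case Circle(r)  => math.Pi * r * r
    case Rect(w, h) => w * h
  }
}
```

Marking the trait sealed is a design choice of this sketch: it lets the compiler warn when a match omits one of the alternatives.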

alternative A branch of a match expression. It has the form “case pattern => expression.” Another name for alternative is case.

annotation An annotation appears in source code and is attached to some part of the syntax. Annotations are computer processable, so you can use them to effectively add an extension to Scala.

anonymous class An anonymous class is a synthetic subclass generated by the Scala compiler from a new expression in which the class or trait name is followed by curly braces. The curly braces contain the body of the anonymous subclass, which may be empty. However, if the name following new refers to a trait or class that contains abstract members, these must be made concrete inside the curly braces that define the body of the anonymous subclass.

anonymous function Another name for function literal.

apply You can apply a method, function, or closure to arguments, which means you invoke it on those arguments.

argument When a function is invoked, an argument is passed for each parameter of that function. The parameter is the variable that refers to the argument. The argument is the object passed at invocation time. In addition, applications can take (command line) arguments that show up in the Array[String] passed to main methods of singleton objects.

assign You can assign an object to a variable. Afterwards, the variable will refer to the object.

auxiliary constructor Extra constructors defined inside the curly braces of the class definition, which look like method definitions named this, but with no result type.
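
A minimal sketch of an auxiliary constructor, assuming a hypothetical two-field Rational class (the class and field names are illustrative only):

```scala
object AuxCtorDemo {
  // The primary constructor takes both a numerator and a denominator.
  class Rational(val numer: Int, val denom: Int) {
    // Auxiliary constructor: named this, no result type, delegates to the primary one.
    def this(n: Int) = this(n, 1)
  }
}
```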

block One or more expressions and declarations surrounded by curly braces. When the block evaluates, all of its expressions and declarations are processed in order, and then the block returns the value of the last expression as its own value. Blocks are commonly used as the bodies of functions, for expressions, while loops, and any other place where you want to group a number of statements together. More formally, a block is an encapsulation construct for which you can only see side effects and a result value. The curly braces in which you define a class or object do not, therefore, form a block, because fields and methods (which are defined inside those curly braces) are visible from the outside. Such curly braces form a template.

bound variable A bound variable of an expression is a variable that’s both used and defined inside the expression. For instance, in the function literal expression (x: Int) => (x, y), both variables x and y are used, but only x is bound, because it is defined in the expression as an Int and the sole argument to the function described by the expression.

by-name parameter A parameter that is marked with a => in front of the parameter type, e.g., (x: => Int). The argument corresponding to a by-name parameter is evaluated not before the method is invoked, but each time the parameter is referenced by name inside the method. If a parameter is not by-name, it is by-value.

by-value parameter A parameter that is not marked with a => in front of the parameter type, e.g., (x: Int). The argument corresponding to a by-value parameter is evaluated before the method is invoked. By-value parameters contrast with by-name parameters.
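
The difference between the two parameter kinds can be observed by counting evaluations. The following is only a sketch; ByNameDemo, next, twiceByName, and twiceByValue are hypothetical names introduced for the illustration.

```scala
object ByNameDemo {
  var evaluations = 0

  // Hypothetical helper: bumps the counter each time it is evaluated.
  def next(): Int = { evaluations += 1; evaluations }

  // x is by-name: the argument expression runs on every reference to x.
  def twiceByName(x: => Int): Int = x + x

  // x is by-value: the argument expression runs once, before the call.
  def twiceByValue(x: Int): Int = x + x
}
```

Calling twiceByName(next()) evaluates next() twice (once per reference to x), whereas twiceByValue(next()) evaluates it exactly once.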

class Defined with the class keyword, a class may either be abstract or concrete, and may be parameterized with types and values when instantiated. In “new Array[String](2)”, the class being instantiated is Array and the type of the value that results is Array[String]. A class that takes type parameters is called a type constructor. A type can be said to have a class as well, as in: the class of type Array[String] is Array.

closure A function object that captures free variables, and is said to be “closed” over the variables visible at the time it is created.
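
A small sketch of a closure, using hypothetical names (ClosureDemo, more, addMore). It assumes a local var as the captured free variable so that the capture of the variable itself, rather than a snapshot of its value, is visible.

```scala
object ClosureDemo {
  def demo(): (Int, Int) = {
    var more = 1                          // free variable of the literal below
    val addMore = (x: Int) => x + more    // closes over `more`
    val before = addMore(10)              // uses more == 1
    more = 100                            // the closure sees this change
    val after = addMore(10)               // uses more == 100
    (before, after)
  }
}
```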

companion class A class that shares the same name with a singleton object defined in the same source file. The class is the singleton object’s companion class.

companion object A singleton object that shares the same name with a class defined in the same source file. Companion objects and classes have access to each other’s private members. In addition, any implicit conversions defined in the companion object will be in scope anywhere the class is used.
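
The mutual access to private members can be sketched as follows. Account, open, and balanceOf are hypothetical names; the class and its companion are nested in one enclosing object here only to keep the example self-contained.

```scala
object CompanionDemo {
  // Both the constructor and the field are private to Account and its companion.
  class Account private (private val balance: Int)

  object Account {
    // The companion object may call the private constructor...
    def open(initial: Int): Account = new Account(initial)
    // ...and read the private field of its companion class.
    def balanceOf(account: Account): Int = account.balance
  }
}
```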

contravariant A contravariant annotation can be applied to a type parameter of a class or trait by putting a minus sign (-) before the type parameter. The class or trait then subtypes contravariantly with—in the opposite direction as—the type annotated parameter. For example, Function1 is contravariant in its first type parameter, and so Function1[Any, Any] is a subtype of Function1[String, Any].

covariant A covariant annotation can be applied to a type parameter of a class or trait by putting a plus sign (+) before the type parameter. The class or trait then subtypes covariantly with—in the same direction as—the type annotated parameter. For example, List is covariant in its type parameter, so List[String] is a subtype of List[Any].
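
Both variance directions can be checked with plain assignments, since the compiler accepts an assignment only when the subtype relation holds. This is a sketch with hypothetical value names:

```scala
object VarianceDemo {
  // Covariance: List[String] is accepted where List[Any] is expected.
  val strings: List[String] = List("a", "b")
  val anys: List[Any] = strings

  // Contravariance: a function accepting Any can stand in for one accepting String.
  val describe: Any => String = (x: Any) => "value: " + x
  val describeString: String => String = describe
}
```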

currying A way to write functions with multiple parameter lists. For instance def f(x: Int)(y: Int) is a curried function with two parameter lists. A curried function is applied by passing several arguments lists, as in: f(3)(4). However, it is also possible to write a partial application of a curried function, such as f(3).
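
A short sketch of the f(3)(4) and f(3) forms mentioned above. The body x + y is an assumption, since the entry does not give one, and CurryDemo, full, and addThree are hypothetical names.

```scala
object CurryDemo {
  // Curried method with two parameter lists (the body is assumed for illustration).
  def f(x: Int)(y: Int): Int = x + y

  // Full application: both argument lists are supplied.
  val full: Int = f(3)(4)

  // Partial application: only the first list is supplied, leaving an Int => Int.
  val addThree: Int => Int = f(3)
}
```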

declare You can declare an abstract field, method, or type, which gives an entity a name but not an implementation. The key difference between declarations and definitions is that definitions establish an implementation for the named entity, whereas declarations do not.

define To define something in a Scala program is to give it a name and an implementation. You can define classes, traits, singleton objects, fields, methods, local functions, local variables, etc. Because definitions always involve some kind of implementation, abstract members are declared, not defined.

direct subclass A class is a direct subclass of its direct superclass.

direct superclass The class from which a class or trait is immediately derived, the nearest class above it in its inheritance hierarchy. If a class Parent is mentioned in a class Child’s optional extends clause, then Parent is the direct superclass of Child. If a trait is mentioned in Child’s extends clause, the trait’s direct superclass is the Child’s direct superclass. If Child has no extends clause, then AnyRef is the direct superclass of Child. If a class’s direct superclass takes type parameters, for example class Child extends Parent[String], the direct superclass of Child is still Parent, not Parent[String]. On the other hand, Parent[String] would be the direct supertype of Child. See supertype for more discussion of the distinction between class and type.

equality When used without qualification, equality is the relation between values expressed by ==. See also reference equality.

expression Any bit of Scala code that yields a result. You can also say that an expression evaluates to a result or results in a value.

filter An if followed by a boolean expression in a for expression. In for(i <- 1 to 10; if i % 2 == 0), the filter is “if i % 2 == 0”. The value to the right of the if is the filter expression.

filter expression A filter expression is the boolean expression following an if in a for expression. In for(i <- 1 to 10; if i % 2 == 0), the filter expression is “i % 2 == 0”.
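
As a small sketch, the for expression below combines the generator and the filter from these entries and collects the surviving values with yield. FilterDemo and evens are hypothetical names.

```scala
object FilterDemo {
  // Generator: i <- 1 to 10; filter: if i % 2 == 0.
  val evens: Seq[Int] =
    for (i <- 1 to 10 if i % 2 == 0) yield i
}
```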

first-class function Scala supports first-class functions, which means you can express functions in function literal syntax, i.e., (x: Int) => x + 1, and that functions can be represented by objects, which are called function values.

for comprehension Another name for for expression.

free variable A free variable of an expression is a variable that’s used inside the expression but not defined inside the expression. For instance, in the function literal expression (x: Int) => (x, y), both variables x and y are used, but only y is a free variable, because it is not defined inside the expression.

function A function can be invoked with a list of arguments to produce a result. A function has a parameter list, a body, and a result type. Functions that are members of a class, trait, or singleton object are called methods. Functions defined inside other functions are called local functions. Functions with the result type of Unit are called procedures. Anonymous functions in source code are called function literals. At run time, function literals are instantiated into objects called function values.

function literal A function with no name in Scala source code, specified with function literal syntax. For example, (x: Int, y: Int) => x + y.

function value A function object that can be invoked just like any other function. A function value’s class extends one of the FunctionN traits (e.g., Function0, Function1) from package scala, and is usually expressed in source code via function literal syntax. A function value is “invoked” when its apply method is called. A function value that captures free variables is a closure.
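
The relationship between function literal syntax, the Function1 trait, and apply can be sketched like this (all value names are hypothetical):

```scala
object FunctionValueDemo {
  // A function value written with function literal syntax.
  val increase: Int => Int = (x: Int) => x + 1

  // Invoking a function value is a call to its apply method.
  val viaSugar: Int = increase(10)
  val viaApply: Int = increase.apply(10)

  // The same behavior written out by hand against the Function1 trait.
  val explicit: Function1[Int, Int] = new Function1[Int, Int] {
    def apply(x: Int): Int = x + 1
  }
}
```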

functional style The functional style of programming emphasizes functions and evaluation results and deemphasizes the order in which operations occur. The style is characterized by passing function values into looping methods, immutable data, and methods with no side effects. It is the dominant paradigm of languages such as Haskell and Erlang, and contrasts with the imperative style.

generator A generator defines a named val and assigns to it a series of values in a for expression. For example, in for(i <- 1 to 10), the generator is “i <- 1 to 10”. The value to the right of the <- is the generator expression.

generator expression A generator expression generates a series of values in a for expression. For example, in for(i <- 1 to 10), the generator expression is “1 to 10”.

generic class A class that takes type parameters. For example, because scala.List takes a type parameter, scala.List is a generic class.

generic trait A trait that takes type parameters. For example, because trait scala.collection.Set takes a type parameter, it is a generic trait.

helper function A function whose purpose is to provide a service to one or more other functions nearby. Helper functions are often implemented as local functions.

helper method A helper function that’s a member of a class. Helper methods are often private.

immutable An object is immutable if its value cannot be changed after it is created in any way visible to clients. Objects may or may not be immutable.

imperative style The imperative style of programming emphasizes careful sequencing of operations so that their effects happen in the right order. The style is characterized by iteration with loops, mutating data in place, and methods with side effects. It is the dominant paradigm of languages such as C, C++, C# and Java, and contrasts with the functional style.

initialize When a variable is defined in Scala source code, you must initialize it with an object.

instance An instance, or class instance, is an object, a concept that exists only at run time.

instantiate To instantiate a class is to make a new object from the class, an action that happens only at run time.

invariant Invariant is used in two ways. It can mean a property that always holds true when a data structure is well-formed. For example, it is an invariant of a sorted binary tree that each node is ordered before its right subnode, if it has a right subnode. Invariant is also sometimes used as a synonym for nonvariant: “class Array is invariant in its type parameter.”

invoke You can invoke a method, function, or closure on arguments, meaning its body will be executed with the specified arguments.

JVM The JVM is the Java Virtual Machine, or runtime, that hosts a running Scala program.

literal 1, “One”, and (x: Int) => x + 1 are examples of literals. A literal is a shorthand way to describe an object, where the shorthand exactly mirrors the structure of the created object.

local function A local function is a def defined inside a block. To contrast, a def defined as a member of a class, trait, or singleton object is called a method.

local variable A local variable is a val or var defined inside a block. Although similar to local variables, parameters to functions are not referred to as local variables, but simply as parameters or “variables” without the “local.”

member A member is any named element of the template of a class, trait, or singleton object. A member may be accessed with the name of its owner, a dot, and its simple name. For example, top-level fields and methods defined in a class are members of that class. A trait defined inside a class is a member of its enclosing class. A type defined with the type keyword in a class is a member of that class. A class is a member of the package in which it is defined. By contrast, a local variable or local function is not a member of its surrounding block.

meta-programming Meta-programming software is software whose input is itself software. Compilers are meta-programs, as are tools like scaladoc. Meta-programming software is required in order to do anything with an annotation.

method A method is a function that is a member of some class, trait, or singleton object.

mixin Mixin is what a trait is called when it is being used in a mixin composition. In other words, in “trait Hat,” Hat is just a trait, but in “new Cat extends AnyRef with Hat,” Hat can be called a mixin. When used as a verb, “mix in” is two words. For example, you can mix traits into classes or other traits.

mixin composition The process of mixing traits into classes or other traits. Mixin composition differs from traditional multiple inheritance in that the type of the super reference is not known at the point the trait is defined, but rather is determined anew each time the trait is mixed into a class or other trait.
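
The late binding of super can be seen by mixing the same two traits in different orders. This is only a sketch; IntSource, Ten, Doubling, and Incrementing are hypothetical names.

```scala
object MixinDemo {
  abstract class IntSource { def get: Int }
  class Ten extends IntSource { def get: Int = 10 }

  // In each trait, super.get is resolved only when the trait is mixed in.
  trait Doubling extends IntSource {
    abstract override def get: Int = 2 * super.get
  }
  trait Incrementing extends IntSource {
    abstract override def get: Int = super.get + 1
  }

  // The rightmost trait is called first; its super is the trait to its left.
  val doubleThenIncrement = new Ten with Doubling with Incrementing // (2 * 10) + 1
  val incrementThenDouble = new Ten with Incrementing with Doubling // 2 * (10 + 1)
}
```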

modifier A keyword that qualifies a class, trait, field, or method definition in some way. For example, the private modifier indicates that a class, trait, field, or method being defined is private.

multiple definitions The same expression can be assigned in multiple definitions if you use the syntax val v1, v2, v3 = exp.
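
One detail worth illustrating is that the expression is evaluated once per defined name, not once overall. The sketch below assumes a hypothetical side-effecting helper next() to make that visible.

```scala
object MultiDefDemo {
  private var counter = 0

  // Hypothetical helper that returns 1, 2, 3, ... on successive calls.
  def next(): Int = { counter += 1; counter }

  // Shorthand for three separate definitions, each evaluating next() again.
  val a, b, c = next()
}
```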

nonvariant A type parameter of a class or trait is by default nonvariant. The class or trait then does not subtype when that parameter changes. For example, because class Array is nonvariant in its type parameter, Array[String] is neither a subtype nor a supertype of Array[Any].

operation In Scala, every operation is a method call. Methods may be invoked in operator notation, such as b + 2, and when in that notation, + is an operator.

parameter Functions may take zero to many parameters. Each parameter has a name and a type. The distinction between parameters and arguments is that arguments refer to the actual objects passed when a function is invoked. Parameters are the variables that refer to those passed arguments.

parameterless function A function that takes no parameters, which is defined without any empty parentheses. Invocations of parameterless functions may not supply parentheses. This supports the uniform access principle, which enables the def to be changed into a val without requiring a change to client code.

parameterless method A parameterless method is a parameterless function that is a member of a class, trait, or singleton object.

parametric field A field defined as a class parameter.

partially applied function A function that’s used in an expression and that misses some of its arguments. For instance, if function f has type Int => Int => Int, then f and f(1) are partially applied functions.

path-dependent type A type like swiss.cow.Food. The swiss.cow part is a path that forms a reference to an object. The meaning of the type is sensitive to the path you use to access it. The types swiss.cow.Food and fish.Food, for example, are different types.
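
A minimal sketch of path dependence, using a hypothetical Farm class with an inner Animal class rather than the swiss.cow.Food example from the entry:

```scala
object PathDemo {
  class Farm {
    class Animal(val name: String)
    // The parameter type is this.Animal, so it depends on the Farm instance.
    def adopt(animal: Animal): String = animal.name
  }

  val swiss = new Farm
  val alpine = new Farm

  // cow has type swiss.Animal, which differs from alpine.Animal.
  val cow = new swiss.Animal("cow")
  val adopted: String = swiss.adopt(cow)
  // alpine.adopt(cow) would be rejected by the compiler: type mismatch.
}
```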

pattern In a match expression alternative, a pattern follows each case keyword and precedes either a pattern guard or the => symbol.

pattern guard In a match expression alternative, a pattern guard can follow a pattern. For example, in “case x if x % 2 == 0 => x + 1”, the pattern guard is “if x % 2 == 0”. A case with a pattern guard will only be selected if the pattern matches and the pattern guard yields true.
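
The guard from this entry can be placed in a complete match expression as follows. GuardDemo and bump are hypothetical names, and the second case is an assumption added so that the match covers odd numbers too.

```scala
object GuardDemo {
  def bump(n: Int): Int = n match {
    case x if x % 2 == 0 => x + 1 // selected only when the guard yields true
    case x               => x     // fallback when the guard fails
  }
}
```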

predicate A predicate is a function with a Boolean result type.

primary constructor The main constructor of a class, which invokes a superclass constructor, if necessary, initializes fields to passed values, and executes any top-level code defined between the curly braces of the class. Fields are initialized only for value parameters not passed to the superclass constructor, except for any that are not used in the body of the class and can therefore be optimized away.

procedure A procedure is a function with result type of Unit, which is therefore executed solely for its side effects.

reassignable A variable may or may not be reassignable. A var is reassignable while a val is not.

receiver The receiver of a method call is the variable, expression, or object on which the method is invoked.

recursive A function is recursive if it calls itself. If the only place the function calls itself is the last expression of the function, then the function is tail recursive.

reference A reference is the Java abstraction of a pointer, which uniquely identifies an object that resides on the JVM’s heap. Reference type variables hold references to objects, because reference types (instances of AnyRef) are implemented as Java objects that reside on the JVM’s heap. Value type variables, by contrast, may sometimes hold a reference (to a boxed wrapper type) and sometimes not (when the object is being represented as a primitive value). Speaking generally, a Scala variable refers to an object. The term “refers” is more abstract than “holds a reference.” If a variable of type scala.Int is currently represented as a primitive Java int value, then that variable still refers to the Int object, but no reference is involved.

reference equality Reference equality means that two references identify the very same Java object. Reference equality can be determined, for reference types only, by calling eq in AnyRef. (In Java programs, reference equality can be determined using == on Java reference types.)
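
A brief sketch contrasting == with eq on two separately constructed lists (the value names are hypothetical):

```scala
object EqualityDemo {
  val a = List(1, 2, 3)
  val b = List(1, 2, 3)

  val sameValue: Boolean = a == b  // compares contents
  val sameObject: Boolean = a eq b // compares references: two distinct objects
  val selfSame: Boolean = a eq a   // a reference always equals itself
}
```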

reference type A reference type is a subclass of AnyRef. Instances of reference types always reside on the JVM’s heap at run time.

referential transparency A property of functions that are independent of temporal context and have no side effects. For a particular input, an invocation of a referentially transparent function can be replaced by its result without changing the program semantics.

refers A variable in a running Scala program always refers to some object. Even if that variable is assigned to null, it conceptually refers to the Null object. At runtime, an object may be implemented by a Java object or a value of a primitive type, but Scala allows programmers to think at a higher level of abstraction about their code as they imagine it running. See also reference.

refinement type A type formed by supplying a base type a number of members inside curly braces. The members in the curly braces refine the types that are present in the base type. For example, the type of “animal that eats grass” is Animal { type SuitableFood = Grass }.
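
The “animal that eats grass” type can be sketched in a small class hierarchy. The shapes of Food, Grass, Animal, and Cow below are assumptions made for the illustration, and feedGrass is a hypothetical helper.

```scala
object RefinementDemo {
  class Food
  class Grass extends Food

  abstract class Animal {
    type SuitableFood <: Food
    def eat(food: SuitableFood): String
  }

  class Cow extends Animal {
    type SuitableFood = Grass
    def eat(food: Grass): String = "cow eats grass"
  }

  // Accepts only animals whose SuitableFood member is refined to Grass.
  def feedGrass(animal: Animal { type SuitableFood = Grass }): String =
    animal.eat(new Grass)
}
```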

result An expression in a Scala program yields a result. The result of every expression in Scala is an object.

result type A method’s result type is the type of the value that results from calling the method. (In Java, this concept is called the return type.)

return A function in a Scala program returns a value. You can call this value the result of the function. You can also say the function results in the value. The result of every function in Scala is an object.

runtime The Java Virtual Machine, or JVM, that hosts a running Scala program. Runtime encompasses both the virtual machine, as defined by the Java Virtual Machine Specification, and the runtime libraries of the Java API and the standard Scala API. The phrase at run time (with a space between run and time) means when the program is running, and contrasts with compile time.

runtime type The type of an object at run time. To contrast, a static type is the type of an expression at compile time. Most runtime types are simply bare classes with no type parameters. For example, the runtime type of “Hi” is String, and the runtime type of (x: Int) => x + 1 is Function1. Runtime types can be tested with isInstanceOf.

script A file containing top level definitions and statements, which can be run directly with scala without explicitly compiling. A script must end in an expression, not a definition.

selector The value being matched on in a match expression. For example, in “s match { case _ => }”, the selector is s.

self type A self type of a trait is the assumed type of this, the receiver, to be used within the trait. Any concrete class that mixes in the trait must ensure that its type conforms to the trait’s self type. The most common use of self types is for dividing a large class into several traits as described in Chapter 32.
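
A minimal sketch of a self type, with hypothetical traits Named and Greeter. Greeter assumes that this is a Named, so it can use name without extending Named itself.

```scala
object SelfTypeDemo {
  trait Named { def name: String }

  // Self type: any concrete class mixing in Greeter must also be a Named.
  trait Greeter { this: Named =>
    def greet: String = "Hello, " + name
  }

  // Person satisfies the self type by mixing in Named as well.
  class Person(val name: String) extends Named with Greeter
}
```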

semi-structured data XML data is semi-structured. It is more structured than a flat binary file or text file, but it does not have the full structure of a programming language’s data structures.

serialization You can serialize an object into a byte stream which can then be saved to files or transmitted over the network. You can later deserialize the byte stream, even on a different computer, and obtain an object that is the same as the original serialized object.

shadow A new declaration of a local variable shadows one of the same name in an enclosing scope.

signature Signature is short for type signature.

singleton object An object defined with the object keyword. Each singleton object has one and only one instance. A singleton object that shares its name with a class, and is defined in the same source file as that class, is that class’s companion object. The class is its companion class. A singleton object that doesn’t have a companion class is a standalone object.

standalone object A singleton object that has no companion class.

statement An expression, definition, or import, i.e., things that can go into a template or a block in Scala source code.

static type See type.

subclass A class is a subclass of all of its superclasses and supertraits.

subtrait A trait is a subtrait of all of its supertraits.

subtype The Scala compiler will allow any of a type’s subtypes to be used as a substitute wherever that type is required. For classes and traits that take no type parameters, the subtype relationship mirrors the subclass relationship. For example, if class Cat is a subclass of abstract class Animal, and neither takes type parameters, type Cat is a subtype of type Animal. Likewise, if trait Apple is a subtrait of trait Fruit, and neither takes type parameters, type Apple is a subtype of type Fruit. For classes and traits that take type parameters, however, variance comes into play. For example, because abstract class List is declared to be covariant in its lone type parameter (i.e., List is declared List[+A]), List[Cat] is a subtype of List[Animal], and List[Apple] a subtype of List[Fruit]. These subtype relationships exist even though the class of each of these types is List. By contrast, because Set is not declared to be covariant in its type parameter (i.e., Set is declared Set[A] with no plus sign), Set[Cat] is not a subtype of Set[Animal]. A subtype should correctly implement the contracts of its supertypes, so that the Liskov Substitution Principle applies, but the compiler only verifies this property at the level of type checking.

superclass A class’s superclasses include its direct superclass, its direct superclass’s direct superclass, and so on, all the way up to Any.

supertrait A class’s or trait’s supertraits, if any, include all traits directly mixed into the class or trait or any of its superclasses, plus any supertraits of those traits.

supertype A type is a supertype of all of its subtypes.

synthetic class A synthetic class is generated automatically by the compiler rather than being written by hand by the programmer.

tail recursive A function is tail recursive if the only place the function calls itself is the last operation of the function.
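
As an illustration, the hypothetical sumTo below calls itself only as its last operation. The @tailrec annotation from scala.annotation asks the compiler to reject the definition if the call is not in tail position.

```scala
import scala.annotation.tailrec

object TailRecDemo {
  // The recursive call is the last operation, so it can be compiled to a loop.
  @tailrec
  def sumTo(n: Int, acc: Long = 0L): Long =
    if (n <= 0) acc else sumTo(n - 1, acc + n)
}
```

Because the call is compiled to a loop, a large argument such as 1000000 does not grow the call stack.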

target typing Target typing is a form of type inference that takes into account the type that’s expected. In nums.filter((x) => x > 0), for example, the Scala compiler infers the type of x to be the element type of nums, because the filter method invokes the function on each element of nums.
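
The filter call from this entry, placed in a runnable sketch. The contents of nums are an assumption; x needs no type annotation because the expected parameter type Int => Boolean supplies it.

```scala
object TargetTypingDemo {
  val nums: List[Int] = List(-2, 0, 3, 7)

  // x is inferred to be Int, the element type of nums.
  val positives: List[Int] = nums.filter(x => x > 0)
}
```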

template A template is the body of a class, trait, or singleton object definition. It defines the type signature, behavior, and initial state of the class, trait, or object.

trait A trait, which is defined with the trait keyword, is like an abstract class that cannot take any value parameters and can be “mixed into” classes or other traits via the process known as mixin composition. When a trait is being mixed into a class or trait, it is called a mixin. A trait may be parameterized with one or more types. When parameterized with types, the trait constructs a type. For example, Set is a trait that takes a single type parameter, whereas Set[Int] is a type. Also, Set is said to be “the trait of” type Set[Int].

type Every variable and expression in a Scala program has a type that is known at compile time. A type restricts the possible values to which a variable can refer, or an expression can produce, at run time. A variable or expression’s type can also be referred to as a static type if necessary to differentiate it from an object’s runtime type. In other words, “type” by itself means static type. Type is distinct from class because a class that takes type parameters can construct many types. For example, List is a class, but not a type. List[T] is a type with a free type parameter. List[Int] and List[String] are also types (called ground types because they have no free type parameters). A type can have a “class” or “trait.” For example, the class of type List[Int] is List. The trait of type Set[String] is Set.

type constraint Some annotations are type constraints, meaning that they add additional limits, or constraints, on what values the type includes. For example, @positive could be a type constraint on the type Int, limiting the type of 32-bit integers down to those that are positive. Type constraints are not checked by the standard Scala compiler, but must instead be checked by an extra tool or by a compiler plugin.

type constructor A class or trait that takes type parameters.

type parameter A parameter to a generic class or generic method that must be filled in by a type. For example, class List is defined as “class List[T] { …”, and method identity, a member of object Predef, is defined as “def identity[T](x: T) = x”. The T in both cases is a type parameter.

type signature A method’s type signature comprises its name, the number, order, and types of its parameters, if any, and its result type. The type signature of a class, trait, or singleton object comprises its name, the type signatures of all of its members and constructors, and its declared inheritance and mixin relations.

uniform access principle The uniform access principle states that variables and parameterless functions should be accessed using the same syntax. Scala supports this principle by not allowing parentheses to be placed at call sites of parameterless functions. As a result, a parameterless function definition can be changed to a val, or vice versa, without affecting client code.
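
A sketch of the principle with two hypothetical classes that expose area once as a parameterless def and once as a val. Client code reads .area identically in both cases.

```scala
object UniformAccessDemo {
  // area is recomputed on every access.
  class AsDef(width: Int, height: Int) { def area: Int = width * height }

  // area is computed once and stored.
  class AsVal(width: Int, height: Int) { val area: Int = width * height }
}
```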

unreachable At the Scala level, objects can become unreachable, at which point the memory they occupy may be reclaimed by the runtime. Unreachable does not necessarily mean unreferenced. Reference types (instances of AnyRef) are implemented as objects that reside on the JVM’s heap. When an instance of a reference type becomes unreachable, it indeed becomes unreferenced, and is available for garbage collection. Value types (instances of AnyVal) are implemented as both primitive type values and as instances of Java wrapper types (such as java.lang.Integer), which reside on the heap. Value type instances can be boxed (converted from a primitive value to a wrapper object) and unboxed (converted from a wrapper object to a primitive value) throughout the lifetime of the variables that refer to them. If a value type instance currently represented as a wrapper object on the JVM’s heap becomes unreachable, it indeed becomes unreferenced, and is available for garbage collection. But if a value type currently represented as a primitive value becomes unreachable, then it does not become unreferenced, because it does not exist as an object on the JVM’s heap at that point in time. The runtime may reclaim memory occupied by unreachable objects, but if an Int, for example, is implemented at run time by a primitive Java int that occupies some memory in the stack frame of an executing method, then the memory for that object is “reclaimed” when the stack frame is popped as the method completes. Memory for reference types, such as Strings, may be reclaimed by the JVM’s garbage collector after they become unreachable.

unreferenced See unreachable.

value The result of any computation or expression in Scala is a value, and in Scala, every value is an object. The term value essentially means the image of an object in memory (on the JVM’s heap or stack).

value type A value type is any subclass of AnyVal, such as Int, Double, or Unit. This term has meaning at the level of Scala source code. At runtime, instances of value types that correspond to Java primitive types may be implemented in terms of primitive type values or instances of wrapper types, such as java.lang.Integer. Over the lifetime of a value type instance, the runtime may transform it back and forth between primitive and wrapper types (i.e., to box and unbox it).

variable A named entity that refers to an object. A variable is either a val or a var. Both vals and vars must be initialized when defined, but only vars can be later reassigned to refer to a different object.

variance A type parameter of a class or trait can be marked with a variance annotation, either covariant (+) or contravariant (-). Such variance annotations indicate how subtyping works for a generic class or trait. For example, the generic class List is covariant in its type parameter, and thus List[String] is a subtype of List[Any]. By default, i.e., absent a + or - annotation, type parameters are nonvariant.

wildcard type A wildcard type includes references to type variables that are unknown. For example, Array[_] is a wildcard type. It is an array where the element type is completely unknown.

yield An expression can yield a result. The yield keyword designates the result of a for expression.

Bibliography Cloud DevOps Java Software Engineering Spring Framework

Bibliography of Spring Framework – Spring Project – Spring Books

See: Spring Framework, Java Bibliography

  • Spring Boot Messaging: Messaging APIs for Enterprise and Integration Solutions, 1st ed., Publisher: Apress (May 4, 2017), B071VG289T, ISBN-13: 978-1484212257


Fair Use Sources:

Cloud DevOps Java Software Engineering Spring Framework

Spring Framework


The Spring Framework is an application framework and inversion of control container for the Java platform. The framework’s core features can be used by any Java application, but there are extensions for building web applications on top of the Java EE (Enterprise Edition) platform. Although the framework does not impose any specific programming model, it has become popular in the Java community as an addition to the Enterprise JavaBeans (EJB) model. The Spring Framework is open source.

Developer(s): Pivotal Software
Initial release: 1 October 2002
Stable release: 5.3.8[1] / 9 June 2021
Written in: Java
Platform: Java EE
Type: Application framework
License: Apache License 2.0

Version history

1.0 (March 24, 2004): First production release.

The first version was written by Rod Johnson, who released the framework with the publication of his book Expert One-on-One J2EE Design and Development in October 2002. The framework was first released under the Apache 2.0 license in June 2003. The first production release, 1.0, was released in March 2004.[2] The Spring 1.2.6 framework won a Jolt productivity award and a JAX Innovation Award in 2006.[3][4] Spring 2.0 was released in October 2006, Spring 2.5 in November 2007, Spring 3.0 in December 2009, Spring 3.1 in December 2011, and Spring 3.2.5 in November 2013.[5] Spring Framework 4.0 was released in December 2013.[6] Notable improvements in Spring 4.0 included support for Java SE (Standard Edition) 8, Groovy 2, some aspects of Java EE 7, and WebSocket.

Spring Boot 1.0 was released in April 2014.[7]

Spring Framework 4.2.0 was released on 31 July 2015 and was immediately upgraded to version 4.2.1, which was released on 01 Sept 2015.[8] It is “compatible with Java 6, 7 and 8, with a focus on core refinements and modern web capabilities”.[9]

Spring Framework 4.3 was released on 10 June 2016 and was to be supported until 2020.[10] It “will be the final generation within the general Spring 4 system requirements (Java 6+, Servlet 2.5+), […]”.[11]

Spring 5 was announced to be built upon the Reactive Streams-compatible Reactor Core.[12]


The Spring Framework includes several modules that provide a range of services, described in the sections that follow.

Inversion of control container (dependency injection)

Central to the Spring Framework is its inversion of control (IoC) container, which provides a consistent means of configuring and managing Java objects using reflection. The container is responsible for managing object lifecycles of specific objects: creating these objects, calling their initialization methods, and configuring these objects by wiring them together.

Objects created by the container are also called managed objects or beans. The container can be configured by loading XML (Extensible Markup Language) files or detecting specific Java annotations on configuration classes. These data sources contain the bean definitions that provide the information required to create the beans.

Objects can be obtained by means of either dependency lookup or dependency injection.[14] Dependency lookup is a pattern where a caller asks the container object for an object with a specific name or of a specific type. Dependency injection is a pattern where the container passes objects by name to other objects, via either constructors, properties, or factory methods.
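The difference between dependency lookup and dependency injection can be illustrated with a minimal, language-neutral sketch. The code below is Python, not Spring's API: the Container class, its register and get_bean methods, and the two example classes are all hypothetical names chosen for this illustration. It assumes singleton beans and constructor injection only.

```python
class Container:
    """Toy IoC container: stores bean definitions and wires beans together."""

    def __init__(self):
        self._definitions = {}   # bean name -> (factory, names of dependencies)
        self._singletons = {}    # bean name -> already created instance

    def register(self, name, factory, depends_on=()):
        """Record how a bean is created; nothing is instantiated yet."""
        self._definitions[name] = (factory, tuple(depends_on))

    def get_bean(self, name):
        """Dependency lookup: a caller asks the container for a bean by name."""
        if name not in self._singletons:
            factory, dependencies = self._definitions[name]
            # Dependency injection: the container resolves each dependency
            # and passes it to the constructor (or factory function).
            args = [self.get_bean(dep) for dep in dependencies]
            self._singletons[name] = factory(*args)
        return self._singletons[name]


class InMemoryRepository:
    def __init__(self):
        self.rows = {1: "Ada"}

    def find(self, key):
        return self.rows[key]


class GreetingService:
    def __init__(self, repository):   # the dependency arrives via the constructor
        self.repository = repository

    def greet(self, key):
        return "Hello, " + self.repository.find(key)


container = Container()
container.register("repository", InMemoryRepository)
container.register("greetingService", GreetingService, depends_on=["repository"])

service = container.get_bean("greetingService")   # lookup by name
print(service.greet(1))
```

GreetingService never constructs its repository; the container decides which object it receives, which is what makes the service easy to test with a substitute.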

In many cases one need not use the container when using other parts of the Spring Framework, although using it will likely make an application easier to configure and customize. The Spring container provides a consistent mechanism to configure applications and integrates with almost all Java environments, from small-scale applications to large enterprise applications.

The container can be turned into a partially compliant EJB (Enterprise JavaBeans) 3.0 container by means of the Pitchfork project. Some criticize the Spring Framework for not complying with standards.[15] However, SpringSource doesn’t see EJB 3 compliance as a major goal, and claims that the Spring Framework and the container allow for more powerful programming models.[16] The programmer does not directly create an object, but describes how it should be created, by defining it in the Spring configuration file. Similarly services and components are not called directly; instead a Spring configuration file defines which services and components must be called. This IoC is intended to increase the ease of maintenance and testing.

Aspect-oriented programming framework

The Spring Framework has its own Aspect-oriented programming (AOP) framework that modularizes cross-cutting concerns in aspects. The motivation for creating a separate AOP framework comes from the belief that it should be possible to provide basic AOP features without too much complexity in either design, implementation, or configuration. The Spring AOP framework also takes full advantage of the Spring container.

The Spring AOP framework is proxy pattern-based, and is configured at run time. This removes the need for a compilation step or load-time weaving. On the other hand, interception only allows for public method-execution on existing objects at a join point.
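The proxy-based approach can be sketched in a few lines. This is an illustration of the general proxy pattern in Python, not Spring AOP itself; LoggingProxy and AccountService are hypothetical names, and the "advice" simply records calls in a list.

```python
import functools


class LoggingProxy:
    """Wraps a target object and applies advice around its public methods."""

    def __init__(self, target, log):
        self._target = target
        self._log = log

    def __getattr__(self, name):
        # Called only for attributes not found on the proxy itself.
        attr = getattr(self._target, name)
        if name.startswith("_") or not callable(attr):
            return attr   # non-public members and plain data are not advised

        @functools.wraps(attr)
        def advised(*args, **kwargs):
            self._log.append("before " + name)
            try:
                return attr(*args, **kwargs)
            finally:
                self._log.append("after " + name)

        return advised


class AccountService:
    def __init__(self):
        self.balance = 100

    def withdraw(self, amount):
        self._check(amount)   # internal call on self: bypasses the proxy
        self.balance -= amount
        return self.balance

    def _check(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")


calls = []
service = LoggingProxy(AccountService(), calls)
remaining = service.withdraw(30)
print(remaining, calls)
```

Because the advice lives in the proxy, only calls that pass through the proxy are intercepted; the internal call to _check goes straight to the target, which mirrors the limitation to public method execution described above.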

Compared to the AspectJ framework, Spring AOP is less powerful, but also less complicated. Spring 1.2 includes support to configure AspectJ aspects in the container. Spring 2.0 added more integration with AspectJ; for example, the pointcut language is reused and can be mixed with Spring AOP-based aspects. Further, Spring 2.0 added a Spring Aspects library that uses AspectJ to offer common Spring features such as declarative transaction management and dependency injection via AspectJ compile-time or load-time weaving. SpringSource also uses AspectJ AOP in other Spring projects such as Spring Roo and Spring Insight, with Spring Security also offering an AspectJ-based aspect library.

Spring AOP is designed to handle cross-cutting concerns inside the Spring Framework. Any object which is created and configured by the container can be enriched using Spring AOP.

The Spring Framework uses Spring AOP internally for transaction management, security, remote access, and JMX.

Since version 2.0 of the framework, Spring provides two approaches to the AOP configuration:

  • schema-based approach[17] and
  • @AspectJ-based annotation style.[18]

The Spring team decided not to introduce new AOP-related terminology; therefore, in the Spring reference documentation and API, terms such as aspect, join point, advice, pointcut, introduction, target object (advised object), AOP proxy, and weaving all have the same meanings as in most other AOP frameworks (particularly AspectJ).

Data access framework

Spring’s data access framework addresses common difficulties developers face when working with databases in applications. Support is provided for all popular data access frameworks in Java: JDBC, iBatis/MyBatis, Hibernate, Java Data Objects (JDO, discontinued since 5.x), Java Persistence API (JPA), Oracle TopLink, Apache OJB, and Apache Cayenne, among others.

For all of these supported frameworks, Spring provides these features:

  • Resource management – automatically acquiring and releasing database resources
  • Exception handling – translating data access related exceptions to a Spring data access exception hierarchy
  • Transaction participation – transparent participation in ongoing transactions
  • Resource unwrapping – retrieving database objects from connection pool wrappers
  • Abstraction for binary large object (BLOB) and character large object (CLOB) handling

All these features become available when using template classes provided by Spring for each supported framework. Critics have said these template classes are intrusive and offer no advantage over using (for example) the Hibernate API directly.[19] In response, the Spring developers have made it possible to use the Hibernate and JPA APIs directly. This however requires transparent transaction management, as application code no longer assumes the responsibility to obtain and close database resources, and does not support exception translation.

Together with Spring’s transaction management, its data access framework offers a flexible abstraction for working with data access frameworks. The Spring Framework doesn’t offer a common data access API; instead, the full power of the supported APIs is kept intact. The Spring Framework is the only framework available in Java that offers managed data access environments outside of an application server or container.[20]

While using Spring for transaction management with Hibernate, the following beans may have to be configured:

  • Data Source like com.mchange.v2.c3p0.ComboPooledDataSource or org.apache.commons.dbcp.BasicDataSource
  • A SessionFactory like org.springframework.orm.hibernate3.LocalSessionFactoryBean with a DataSource attribute
  • A HibernateProperties like org.springframework.beans.factory.config.PropertiesFactoryBean
  • A TransactionManager like org.springframework.orm.hibernate3.HibernateTransactionManager with a SessionFactory attribute

Other points of configuration include:

  • An AOP configuration of pointcuts.
  • Transaction semantics of AOP advice.

Transaction management

Spring’s transaction management framework brings an abstraction mechanism to the Java platform. Its abstraction is capable of:

In comparison, Java Transaction API (JTA) only supports nested transactions and global transactions, and requires an application server (and in some cases also deployment of applications in an application server).

The Spring Framework ships a PlatformTransactionManager for a number of transaction management strategies:

  • Transactions managed on a JDBC Connection
  • Transactions managed on Object-relational mapping Units of Work
  • Transactions managed via the JTA TransactionManager and UserTransaction
  • Transactions managed on other resources, like object databases

In addition to this abstraction mechanism, the framework provides two ways of adding transaction management to applications:

  • Programmatically, by using Spring’s TransactionTemplate
  • Declaratively, by using metadata like XML or Java annotations (@Transactional, etc.)

Together with Spring’s data access framework — which integrates the transaction management framework — it is possible to set up a transactional system through configuration without having to rely on JTA or EJB. The transactional framework also integrates with messaging and caching engines.
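The programmatic style can be sketched as a small "template" that owns the commit and rollback logic while the application supplies only the work to run. The sketch below uses Python's standard sqlite3 module rather than Spring; run_in_transaction, transfer, and balances are hypothetical helper names, and the accounts table exists only for this illustration.

```python
import sqlite3


def run_in_transaction(connection, work):
    """Run work(connection); commit on success, roll back on any exception."""
    try:
        result = work(connection)
        connection.commit()
        return result
    except Exception:
        connection.rollback()
        raise


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(amount):
    """Build a unit of work that moves money from alice to bob."""
    def work(c):
        c.execute("UPDATE accounts SET balance = balance - ? WHERE name = 'alice'", (amount,))
        c.execute("UPDATE accounts SET balance = balance + ? WHERE name = 'bob'", (amount,))
        (balance,) = c.execute("SELECT balance FROM accounts WHERE name = 'alice'").fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
    return work


def balances():
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


run_in_transaction(conn, transfer(40))        # succeeds and commits
try:
    run_in_transaction(conn, transfer(500))   # fails, so both updates are undone
except ValueError:
    pass
print(balances())
```

The business code in transfer contains no commit or rollback calls; the declarative style goes one step further and removes even the explicit call to the template.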

Model–view–controller framework

Spring MVC/Web Reactive presentation given by Juergen Hoeller

The Spring Framework features its own model–view–controller (MVC) web application framework, which wasn’t originally planned. The Spring developers decided to write their own Web framework as a reaction to what they perceived as the poor design of the (then) popular Jakarta Struts Web framework,[21] as well as deficiencies in other available frameworks. In particular, they felt there was insufficient separation between the presentation and request handling layers, and between the request handling layer and the model.[22]

Like Struts, Spring MVC is a request-based framework. The framework defines strategy interfaces for all of the responsibilities that must be handled by a modern request-based framework. The goal of each interface is to be simple and clear so that it’s easy for Spring MVC users to write their own implementations, if they so choose. MVC paves the way for cleaner front end code. All interfaces are tightly coupled to the Servlet API. This tight coupling to the Servlet API is seen by some as a failure on the part of the Spring developers to offer a high-level abstraction for Web-based applications. However, this coupling makes sure that the features of the Servlet API remain available to developers while also offering a high abstraction framework to ease working with it.

The DispatcherServlet class is the front controller[23] of the framework and is responsible for delegating control to the various interfaces during the execution phases of an HTTP request.

The most important interfaces defined by Spring MVC, and their responsibilities, are listed below:

  • Controller: comes between Model and View to manage incoming requests and redirect to the proper response. The controller maps the HTTP request to the corresponding methods. It acts as a gate that directs the incoming information, switching between going into the model or the view.
  • HandlerAdapter: execution of objects that handle incoming requests
  • HandlerInterceptor: interception of incoming requests, comparable but not equal to Servlet filters (use is optional and not controlled by DispatcherServlet).
  • HandlerMapping: selecting objects that handle incoming requests (handlers) based on any attribute or condition internal or external to those requests
  • LocaleResolver: resolving and optionally saving of the locale of an individual user
  • MultipartResolver: facilitate working with file uploads by wrapping incoming requests
  • View: responsible for returning a response to the client. Some requests may go straight to view without going to the model part; others may go through all three.
  • ViewResolver: selecting a View based on a logical name for the view (use is not strictly required)

Each strategy interface above has an important responsibility in the overall framework. The abstractions offered by these interfaces are powerful and allow for many variations in their implementations. Spring MVC ships with implementations of all these interfaces, which together offer a feature set on top of the Servlet API. However, developers and vendors are free to write other implementations. Spring MVC uses the java.util.Map interface as a data-oriented abstraction for the Model where keys are expected to be string values.

The ease of testing the implementations of these interfaces is one important advantage of the high level of abstraction offered by Spring MVC. DispatcherServlet is tightly coupled to the Spring inversion of control container for configuring the web layers of applications. However, web applications can use other parts of the Spring Framework, including the container, and choose not to use Spring MVC.

A workflow of Spring MVC

When a user clicks a link or submits a form in their web browser, the request goes to the Spring DispatcherServlet. DispatcherServlet is the front controller in Spring MVC. It consults one or more handler mappings to choose an appropriate controller and forwards the request to it. The controller processes the request and generates a result, known as the model. This information needs to be formatted in HTML or a front-end technology such as JSP; this is the view of the application. All of this information is carried in the ModelAndView object. When the controller is not coupled to a particular view, DispatcherServlet finds the actual JSP with the help of a ViewResolver.
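The request flow described above can be sketched as a tiny front controller. This is a simplified Python illustration, not the Servlet or Spring MVC API: DispatcherSketch, greeting_controller, resolve_view, and TEMPLATES are hypothetical names, handler mappings are plain dictionaries from path to controller, and a view is just a function that renders a model.

```python
class DispatcherSketch:
    """Front controller: picks a controller, then resolves and renders a view."""

    def __init__(self, handler_mappings, view_resolver):
        self.handler_mappings = handler_mappings   # list of {path: controller}
        self.view_resolver = view_resolver         # logical view name -> view

    def dispatch(self, path, params):
        # 1. Consult the handler mappings in order to find a controller.
        controller = None
        for mapping in self.handler_mappings:
            controller = mapping.get(path)
            if controller is not None:
                break
        if controller is None:
            return "404 Not Found"
        # 2. The controller returns a model and a logical view name.
        model, view_name = controller(params)
        # 3. The view resolver turns the logical name into a concrete view.
        view = self.view_resolver(view_name)
        # 4. The view renders the model into the response body.
        return view(model)


def greeting_controller(params):
    model = {"name": params.get("name", "world")}
    return model, "greeting"


TEMPLATES = {"greeting": "<h1>Hello, {name}!</h1>"}


def resolve_view(view_name):
    template = TEMPLATES[view_name]
    return lambda model: template.format(**model)


dispatcher = DispatcherSketch([{"/greet": greeting_controller}], resolve_view)
print(dispatcher.dispatch("/greet", {"name": "Ada"}))
```

The controller knows only a logical view name, so the template behind "greeting" can change without touching the controller.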

Configuration of DispatcherServlet

DispatcherServlet must be configured in web.xml



Remote access framework

Spring’s Remote Access framework is an abstraction for working with various RPC (remote procedure call)-based technologies available on the Java platform both for client connectivity and marshalling objects on servers. The most important feature offered by this framework is to ease configuration and usage of these technologies as much as possible by combining inversion of control and AOP.

The framework also provides fault-recovery (automatic reconnection after connection failure) and some optimizations for client-side use of EJB remote stateless session beans.

Spring provides support for these protocols and products out of the box:

  • HTTP-based protocols
    • Hessian: binary serialization protocol, open-sourced and maintained by Caucho
    • RMI (1): method invocations using RMI infrastructure yet specific to Spring
    • RMI (2): method invocations using RMI interfaces complying with regular RMI usage
    • RMI-IIOP (CORBA): method invocations using RMI-IIOP/CORBA
  • Enterprise JavaBean client integration
    • Local EJB stateless session bean connectivity: connecting to local stateless session beans
    • Remote EJB stateless session bean connectivity: connecting to remote stateless session beans
  • SOAP
    • Integration with the Apache Axis Web services framework

Apache CXF provides integration with the Spring Framework for RPC-style exporting of objects on the server side.

Both client and server setup for all RPC-style protocols and products supported by the Spring Remote access framework (except for the Apache Axis support) is configured in the Spring Core container.

There is an alternative open-source implementation (Cluster4Spring) of the remoting subsystem included in the Spring Framework; it is intended to support various remoting schemes (1-1, 1-many, dynamic service discovery)…

Convention-over-configuration rapid application development

Further information: rapid application development

Spring Boot

Spring Boot is Spring’s convention-over-configuration solution for creating stand-alone, production-grade Spring-based Applications that you can “just run”.[24] It is preconfigured with the Spring team’s “opinionated view” of the best configuration and use of the Spring platform and third-party libraries so you can get started with minimum fuss. Most Spring Boot applications need very little Spring configuration. Features:

  • Create stand-alone Spring applications
  • Embed Tomcat or Jetty directly (no need to deploy WAR files)
  • Provide opinionated ‘starter’ Project Object Models (POMs) to simplify your Maven configuration
  • Automatically configure Spring whenever possible
  • Provide production-ready features such as metrics, health checks and externalized configuration
  • Absolutely no code generation and no requirement for XML configuration.

Spring Roo

Spring Roo is a community project which provides an alternative, code-generation-based approach to using convention-over-configuration to rapidly build applications in Java. It currently supports Spring Framework, Spring Security and Spring Web Flow. Roo differs from other rapid application development frameworks by focusing on:

  • Extensibility (via add-ons)
  • Java platform productivity (as opposed to other languages)
  • Lock-in avoidance (Roo can be removed within a few minutes from any application)
  • Runtime avoidance (with associated deployment advantages)
  • Usability (particularly via the shell features and usage patterns)

Batch framework

Spring Batch is a framework for batch processing that provides reusable functions that are essential in processing large volumes of records, including:

  • logging/tracing
  • transaction management
  • job processing statistics
  • job restart

It also provides more advanced technical services and features that will enable extremely high-volume and high performance batch jobs through optimizations and partitioning techniques. Spring Batch executes a series of jobs; a job consists of many steps and each step consists of a READ-PROCESS-WRITE task or single operation task (tasklet).

The “READ-PROCESS-WRITE” process consists of these steps: “read” data from a resource (comma-separated values (CSV), XML, or database), “process” it, then “write” it to other resources (CSV, XML, or database). For example, a step may read data from a CSV file, process it, and write it into the database. Spring Batch provides many classes to read/write CSV, XML, and database.

A “single” operation task (tasklet) performs one task only, such as cleaning up resources before a step starts or after it completes.

The steps can be chained together to run as a job.
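The job structure described above (a tasklet step followed by a READ-PROCESS-WRITE step) can be sketched with Python's standard library. This is not the Spring Batch API: run_step, run_job, and the other names are hypothetical, the CSV data is made up for the example, and writing in small chunks is an assumption of this sketch rather than something stated above.

```python
import csv
import io
import sqlite3

CSV_DATA = "name,year\nRobocop,1987\nStar Wars,1977\nbroken,not-a-year\n"

conn = sqlite3.connect(":memory:")


def run_step(reader, processor, writer, chunk_size=2):
    """READ each item, PROCESS it, and WRITE the results chunk by chunk."""
    chunk, written = [], 0
    for item in reader:
        processed = processor(item)
        if processed is None:        # convention in this sketch: None means skip
            continue
        chunk.append(processed)
        if len(chunk) == chunk_size:
            writer(chunk)
            written += len(chunk)
            chunk = []
    if chunk:                        # flush the final, partially filled chunk
        writer(chunk)
        written += len(chunk)
    return written


def read_csv(text):
    return csv.DictReader(io.StringIO(text))


def to_row(record):
    if not record["year"].isdigit():
        return None                  # drop records with an invalid year
    return (record["name"].upper(), int(record["year"]))


def write_rows(rows):
    conn.executemany("INSERT INTO movies VALUES (?, ?)", rows)
    conn.commit()


def create_table_tasklet():
    """Single-operation step: prepare the target table."""
    conn.execute("CREATE TABLE movies (title TEXT, year INTEGER)")


def load_movies_step():
    return run_step(read_csv(CSV_DATA), to_row, write_rows)


def run_job(steps):
    """A job is a sequence of steps executed in order."""
    return [step() for step in steps]


results = run_job([create_table_tasklet, load_movies_step])
print(results)
```

Each of the three roles is a plain callable, so a different reader (for example, one backed by a database cursor) can be swapped in without changing run_step.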

Integration framework

Spring Integration is a framework for Enterprise application integration that provides reusable functions essential to messaging or event-driven architectures.

  • routers – routes a message to a message channel based on conditions
  • transformers – converts/transforms/changes the message payload and creates a new message with transformed payload
  • adapters – to integrate with other technologies and systems (HTTP, AMQP (Advanced Message Queuing Protocol), JMS (Java Message Service), XMPP (Extensible Messaging and Presence Protocol), SMTP (Simple Mail Transfer Protocol), IMAP (Internet Message Access Protocol), FTP (File Transfer Protocol) as well as FTPS/SFTP, file systems, etc.)
  • filters – filters a message based on criteria. If the criteria are not met, the message is dropped
  • service activators – invoke an operation on a service object
  • management and auditing

Spring Integration supports pipe-and-filter based architectures.
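A pipe-and-filter flow built from a filter, a transformer, and a router can be sketched as follows. This is plain Python, not the Spring Integration API: messages are modeled as dictionaries with "headers" and "payload" keys, channels are plain lists, and all function names are hypothetical.

```python
def make_filter(predicate):
    """Filter: pass the message on only if it meets the criteria."""
    def stage(message):
        return message if predicate(message) else None   # None means dropped
    return stage


def make_transformer(fn):
    """Transformer: create a new message with a transformed payload."""
    def stage(message):
        return {"headers": message["headers"], "payload": fn(message["payload"])}
    return stage


def make_router(choose_channel, channels):
    """Router: deliver the message to a channel chosen by a condition."""
    def stage(message):
        channels[choose_channel(message)].append(message)
        return message
    return stage


def pipeline(*stages):
    """Connect stages so that each one feeds the next."""
    def send(message):
        for stage in stages:
            message = stage(message)
            if message is None:
                return None
        return message
    return send


channels = {"orders": [], "other": []}
flow = pipeline(
    make_filter(lambda m: m["payload"].strip() != ""),
    make_transformer(str.upper),
    make_router(
        lambda m: "orders" if m["headers"].get("type") == "order" else "other",
        channels,
    ),
)

flow({"headers": {"type": "order"}, "payload": "two widgets"})
flow({"headers": {"type": "order"}, "payload": "   "})   # dropped by the filter
flow({"headers": {}, "payload": "ping"})
print(channels)
```

Every stage has the same shape (message in, message or None out), which is what allows stages to be reordered or replaced independently.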



  1.
  2. “Spring Framework 1.0 Final Released”. Official Spring Framework blog. 24 March 2004. Retrieved 1 March 2021.
  3. Jolt winners 2006
  4. “JAX Innovation Award Gewinner 2006”. Archived from the original on 2009-08-17. Retrieved 2009-08-12.
  5. “Spring Framework 3.2.5 Released”. Official Spring website. 7 Nov 2013. Retrieved 16 October 2016.
  6.
  7. [1]
  8. Spring Official Blog
  9. Spring Official Blog
  10. Spring release blog
  11. Spring Official Blog
  12. Reactive Spring
  13. Spring Framework documentation for the Core Container
  14. What is the difference between the depencylookup and dependency injection – Spring Forum. (2009-10-28). Retrieved on 2013-11-24.
  15. Spring VS EJB3
  16. “Pitchfork FAQ”. Retrieved 2006-06-06.
  17. Spring AOP XML Configuration
  18. AspectJ Annotation Configuration
  19. Hibernate VS Spring
  20. “Spring Data JPA for Abstraction of Queries”. Retrieved 2018-02-06.
  21. Introduction to the Spring Framework
  22. Johnson, Expert One-on-One J2EE Design and Development, Ch. 12. et al.
  23. Patterns of Enterprise Application Architecture: Front Controller
  24. Spring Boot




Fair Use Sources:

Data Science - Big Data Python Software Engineering



SQLAlchemy is an open-source SQL toolkit and object-relational mapper (ORM) for the Python programming language released under the MIT License.[5]

Original author(s): Michael Bayer[1][2]
Initial release: February 14, 2006[3]
Stable release: 1.4.15 / May 11, 2021[4]
Written in: Python
Operating system: Cross-platform
Type: Object-relational mapping
License: MIT License[5]


SQLAlchemy’s philosophy is that relational databases behave less like object collections as the scale gets larger and performance starts being a concern, while object collections behave less like tables and rows as more abstraction is designed into them. For this reason it has adopted the data mapper pattern (similar to Hibernate for Java) rather than the active record pattern used by a number of other object-relational mappers.[6] However, optional plugins allow users to develop using declarative syntax.[7]


SQLAlchemy was first released in February 2006[8][3] and has quickly become one of the most widely used object-relational mapping tools in the Python community, alongside Django’s ORM.



The following example represents an n-to-1 relationship between movies and their directors. It shows how user-defined Python classes create corresponding database tables, how instances with relationships are created from either side of the relationship, and finally how the data can be queried, illustrating the automatically generated SQL queries for both lazy and eager loading.

Schema definition

Creating two Python classes and the corresponding database tables in the DBMS:

from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

Base = declarative_base()

class Movie(Base):
    __tablename__ = "movies"

    id = Column(Integer, primary_key=True)
    title = Column(String(255), nullable=False)
    year = Column(Integer)
    # Foreign key pointing at the primary key of the directors table.
    directed_by = Column(Integer, ForeignKey("directors.id"))

    # lazy=False loads the director eagerly, joined into the movie query.
    director = relationship("Director", backref="movies", lazy=False)

    def __init__(self, title=None, year=None):
        self.title = title
        self.year = year

    def __repr__(self):
        return "Movie(%r, %r, %r)" % (self.title, self.year, self.director)

class Director(Base):
    __tablename__ = "directors"

    id = Column(Integer, primary_key=True)
    name = Column(String(50), nullable=False, unique=True)

    def __init__(self, name=None):
        self.name = name

    def __repr__(self):
        return "Director(%r)" % (self.name,)

# Placeholder URL: substitute a real dialect, credentials, host, and database.
engine = create_engine("dbms://user:password@host/dbname")
# Create the movies and directors tables in the DBMS.
Base.metadata.create_all(engine)

Data insertion

One can insert a director-movie relationship via either entity:

Session = sessionmaker(bind=engine)
session = Session()

m1 = Movie("Robocop", 1987)
m1.director = Director("Paul Verhoeven")

d2 = Director("George Lucas")
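d2.movies = [Movie("Star Wars", 1977), Movie("THX 1138", 1971)]

# Adding either side of a relationship also saves the related objects.
session.add_all([m1, d2])
session.commit()

Querying

alldata = session.query(Movie).all()
for somedata in alldata:
    print(somedata)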
SQLAlchemy issues the following query to the DBMS (omitting aliases):

SELECT movies.id, movies.title, movies.year, movies.directed_by, directors.id, directors.name
FROM movies LEFT OUTER JOIN directors ON directors.id = movies.directed_by

The output:

Movie('Robocop', 1987L, Director('Paul Verhoeven'))
Movie('Star Wars', 1977L, Director('George Lucas'))
Movie('THX 1138', 1971L, Director('George Lucas'))

With lazy=True (the default) instead, SQLAlchemy would first issue a query to get the list of movies and then, only when needed (lazily), a query for each movie to get the name of its director:

SELECT movies.id, movies.title, movies.year, movies.directed_by
FROM movies

SELECT directors.id, directors.name
FROM directors
WHERE directors.id = %s
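The cost difference between the two loading strategies can be made visible by counting queries. The sketch below deliberately avoids SQLAlchemy and issues the same two query shapes by hand through Python's standard sqlite3 module; CountingConnection, load_eager, and load_lazy are hypothetical names. Unlike this sketch, a real ORM session may reuse rows it has already loaded, so the number of lazy queries it issues can be lower.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE directors (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE movies (
        id INTEGER PRIMARY KEY, title TEXT NOT NULL, year INTEGER,
        directed_by INTEGER REFERENCES directors(id)
    );
    INSERT INTO directors VALUES (1, 'Paul Verhoeven'), (2, 'George Lucas');
    INSERT INTO movies VALUES
        (1, 'Robocop', 1987, 1), (2, 'Star Wars', 1977, 2), (3, 'THX 1138', 1971, 2);
""")


class CountingConnection:
    """Runs queries and counts how many were sent to the database."""

    def __init__(self, connection):
        self.connection = connection
        self.count = 0

    def query(self, sql, params=()):
        self.count += 1
        return self.connection.execute(sql, params).fetchall()


def load_eager(db):
    """Eager loading: one joined query returns movies with their directors."""
    return db.query(
        "SELECT movies.title, movies.year, directors.name "
        "FROM movies LEFT OUTER JOIN directors "
        "ON directors.id = movies.directed_by ORDER BY movies.id"
    )


def load_lazy(db):
    """Lazy loading: one query for the movies, then one per movie's director."""
    result = []
    movies = db.query("SELECT title, year, directed_by FROM movies ORDER BY id")
    for title, year, directed_by in movies:
        (name,) = db.query("SELECT name FROM directors WHERE id = ?", (directed_by,))[0]
        result.append((title, year, name))
    return result


eager_db = CountingConnection(conn)
lazy_db = CountingConnection(conn)
eager_rows = load_eager(eager_db)
lazy_rows = load_lazy(lazy_db)
print(eager_db.count, lazy_db.count)
```

Both strategies return the same data; the trade-off is one larger joined query against several small ones that are only issued if the related object is actually used.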



  1. Mike Bayer is the creator of SQLAlchemy and Mako Templates for Python.
  2. Interview Mike Bayer SQLAlchemy #pydata #python
  3. “Download – SQLAlchemy”. SQLAlchemy. Retrieved 21 February 2015.
  4. “Releases – sqlalchemy/sqlalchemy”. Retrieved 17 May 2021 – via GitHub.
  5. “zzzeek / sqlalchemy / source / LICENSE”. BitBucket. Retrieved 21 February 2015.
  6. in The architecture of open source applications
  7. Declarative
  8.




Fair Use Sources:

C# .NET Java Software Engineering

Actor model – Actor-based concurrency


The actor model in computer science is a mathematical model of concurrent computation that treats the actor as the universal primitive of concurrent computation. In response to a message it receives, an actor can: make local decisions, create more actors, send more messages, and determine how to respond to the next message received. Actors may modify their own private state, but can only affect each other indirectly through messaging (removing the need for lock-based synchronization).

The actor model originated in 1973.[1] It has been used both as a framework for a theoretical understanding of computation and as the theoretical basis for several practical implementations of concurrent systems. The relationship of the model to other work is discussed in actor model and process calculi.


Main article: History of the Actor model

According to Carl Hewitt, unlike previous models of computation, the actor model was inspired by physics, including general relativity and quantum mechanics. It was also influenced by the programming languages Lisp, Simula, early versions of Smalltalk, capability-based systems, and packet switching. Its development was “motivated by the prospect of highly parallel computing machines consisting of dozens, hundreds, or even thousands of independent microprocessors, each with its own local memory and communications processor, communicating via a high-performance communications network.”[2] Since that time, the advent of massive concurrency through multi-core and manycore computer architectures has revived interest in the actor model.

Following Hewitt, Bishop, and Steiger’s 1973 publication, Irene Greif developed an operational semantics for the actor model as part of her doctoral research.[3] Two years later, Henry Baker and Hewitt published a set of axiomatic laws for actor systems.[4][5] Other major milestones include William Clinger’s 1981 dissertation introducing a denotational semantics based on power domains[2] and Gul Agha’s 1985 dissertation which further developed a transition-based semantic model complementary to Clinger’s.[6] This resulted in the full development of actor model theory.

Major software implementation work was done by Russ Atkinson, Giuseppe Attardi, Henry Baker, Gerry Barber, Peter Bishop, Peter de Jong, Ken Kahn, Henry Lieberman, Carl Manning, Tom Reinhardt, Richard Steiger and Dan Theriault in the Message Passing Semantics Group at Massachusetts Institute of Technology (MIT). Research groups led by Chuck Seitz at California Institute of Technology (Caltech) and Bill Dally at MIT constructed computer architectures that further developed the message passing in the model. See Actor model implementation.

Research on the actor model has been carried out at California Institute of Technology, Kyoto University Tokoro Laboratory, Microelectronics and Computer Technology Corporation (MCC), MIT Artificial Intelligence Laboratory, SRI, Stanford University, University of Illinois at Urbana–Champaign,[7] Pierre and Marie Curie University (University of Paris 6), University of Pisa, University of Tokyo Yonezawa Laboratory, Centrum Wiskunde & Informatica (CWI) and elsewhere.

Fundamental concepts

The actor model adopts the philosophy that everything is an actor. This is similar to the everything is an object philosophy used by some object-oriented programming languages.

An actor is a computational entity that, in response to a message it receives, can concurrently:

  • send a finite number of messages to other actors;
  • create a finite number of new actors;
  • designate the behavior to be used for the next message it receives.

There is no assumed sequence to the above actions and they could be carried out in parallel.
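The three capabilities listed above can be illustrated with a minimal sketch in Python. It is not taken from any actor library: the `System` class, the `counter` behavior and the tuple-shaped messages are hypothetical names chosen for illustration. The single-threaded loop delivers messages in FIFO order only to keep the output reproducible, which the actor model itself does not require.

```python
from collections import deque


class System:
    """Toy scheduler (an assumption of this sketch, not part of the model)."""

    def __init__(self):
        self._behaviors = {}     # address -> current behavior (a callable)
        self._pending = deque()  # (address, message) pairs not yet delivered
        self._next_address = 0

    def create(self, behavior):
        """Create a new actor and return its address."""
        address = self._next_address
        self._next_address += 1
        self._behaviors[address] = behavior
        return address

    def send(self, address, message):
        """Asynchronous send: enqueue the message and return immediately."""
        self._pending.append((address, message))

    def run(self):
        """Deliver messages until none are pending."""
        while self._pending:
            address, message = self._pending.popleft()
            # A behavior returns the behavior for the next message,
            # or None to keep the current one.
            replacement = self._behaviors[address](self, address, message)
            if replacement is not None:
                self._behaviors[address] = replacement


def counter(count):
    """Behavior of a hypothetical counter actor currently holding `count`."""
    def behavior(system, self_address, message):
        kind = message[0]
        if kind == "increment":
            return counter(count + 1)                  # designate next behavior
        if kind == "get":
            system.send(message[1], ("value", count))  # send a message
        if kind == "spawn":
            child = system.create(counter(0))          # create a new actor
            system.send(message[1], ("created", child))
        return None
    return behavior


received = []


def collector(system, self_address, message):
    """Records every message it receives."""
    received.append(message)
    return None


system = System()
sink = system.create(collector)
c = system.create(counter(0))
system.send(c, ("increment",))
system.send(c, ("increment",))
system.send(c, ("get", sink))
system.run()
print(received)  # [('value', 2)]
```

The counter never mutates its own state. Each "increment" message is answered by designating a fresh behavior that closes over the new count.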

Decoupling the sender from communications sent was a fundamental advance of the actor model enabling asynchronous communication and control structures as patterns of passing messages.[8]

Recipients of messages are identified by address, sometimes called “mailing address”. Thus an actor can only communicate with actors whose addresses it has. It can obtain an address from a message it receives, or it already holds the address because it created the actor itself.

The actor model is characterized by inherent concurrency of computation within and among actors, dynamic creation of actors, inclusion of actor addresses in messages, and interaction only through direct asynchronous message passing with no restriction on message arrival order.

Formal systems

Over the years, several different formal systems have been developed which permit reasoning about systems in the actor model.

There are also formalisms that are not fully faithful to the actor model in that they do not formalize the guaranteed delivery of messages (see Attempts to relate actor semantics to algebra and linear logic).


The actor model can be used as a framework for modeling, understanding, and reasoning about a wide range of concurrent systems. For example:

  • Electronic mail (email) can be modeled as an actor system. Accounts are modeled as actors and email addresses as actor addresses.
  • Web services can be modeled with Simple Object Access Protocol (SOAP) endpoints modeled as actor addresses.
  • Objects with locks (e.g., as in Java and C#) can be modeled as a serializer, provided that their implementations are such that messages can continually arrive (perhaps by being stored in an internal queue). A serializer is an important kind of actor defined by the property that it is continually available to the arrival of new messages; every message sent to a serializer is guaranteed to arrive.
  • Testing and Test Control Notation (TTCN), both TTCN-2 and TTCN-3, follows the actor model rather closely. In TTCN an actor is a test component: either a parallel test component (PTC) or the main test component (MTC). Test components can send and receive messages to and from remote partners (peer test components or the test system interface), the latter being identified by its address. Each test component has a behaviour tree bound to it; test components run in parallel and can be dynamically created by parent test components. Built-in language constructs allow the definition of actions to be taken when an expected message is received from the internal message queue, such as sending a message to another peer entity or creating new test components.
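The serializer mentioned in the list above can be sketched in Python as an object whose state is touched by a single worker thread, while requests arrive through an unbounded internal queue. This is only an illustration under stated assumptions: the `Serializer` class, its "add", "read" and "stop" messages, and the use of a reply queue are hypothetical and do not reproduce the construct defined by Hewitt and Atkinson.

```python
import queue
import threading


class Serializer:
    """Processes requests one at a time; senders never touch the state directly."""

    def __init__(self):
        self._inbox = queue.Queue()  # unbounded, so a send always succeeds
        self._total = 0
        self._worker = threading.Thread(target=self._loop)
        self._worker.start()

    def send(self, message):
        """Messages can arrive at any time; they wait in the internal queue."""
        self._inbox.put(message)

    def _loop(self):
        while True:
            message = self._inbox.get()
            if message == "stop":
                return
            kind, payload = message
            if kind == "add":
                self._total += payload    # only this thread modifies _total
            elif kind == "read":
                payload.put(self._total)  # reply through a caller-supplied queue

    def join(self):
        self._worker.join()


serializer = Serializer()


def client():
    for _ in range(1000):
        serializer.send(("add", 1))


clients = [threading.Thread(target=client) for _ in range(4)]
for t in clients:
    t.start()
for t in clients:
    t.join()

reply = queue.Queue()
serializer.send(("read", reply))
result = reply.get()
serializer.send("stop")
serializer.join()
print(result)  # 4000
```

No lock appears in the client code. Mutual exclusion follows from the fact that only the worker thread reads and writes the total.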

Message-passing semantics

The actor model is about the semantics of message passing.

Unbounded nondeterminism controversy

Arguably, the first concurrent programs were interrupt handlers. During the course of its normal operation, a computer needed to be able to receive information from outside (characters from a keyboard, packets from a network, etc.). When the information arrived, the execution of the computer was interrupted and special code (called an interrupt handler) was called to put the information in a data buffer where it could be subsequently retrieved.

In the early 1960s, interrupts began to be used to simulate the concurrent execution of several programs on one processor.[15] Having concurrency with shared memory gave rise to the problem of concurrency control. Originally, this problem was conceived as being one of mutual exclusion on a single computer. Edsger Dijkstra developed semaphores and later, between 1971 and 1973,[16] Tony Hoare[17] and Per Brinch Hansen[18] developed monitors to solve the mutual exclusion problem. However, neither of these solutions provided a programming language construct that encapsulated access to shared resources. This encapsulation was later accomplished by the serializer construct ([Hewitt and Atkinson 1977, 1979] and [Atkinson 1980]).

The first models of computation (e.g., Turing machines, Post productions, the lambda calculus, etc.) were based on mathematics and made use of a global state to represent a computational step (later generalized in [McCarthy and Hayes 1969] and [Dijkstra 1976]; see Event orderings versus global state). Each computational step was from one global state of the computation to the next global state. The global state approach was continued in automata theory for finite-state machines and push down stack machines, including their nondeterministic versions. Such nondeterministic automata have the property of bounded nondeterminism; that is, if a machine always halts when started in its initial state, then there is a bound on the number of states in which it halts.

Edsger Dijkstra further developed the nondeterministic global state approach. Dijkstra’s model gave rise to a controversy concerning unbounded nondeterminism (also called unbounded indeterminacy), a property of concurrency by which the amount of delay in servicing a request can become unbounded as a result of arbitration of contention for shared resources while still guaranteeing that the request will eventually be serviced. Hewitt argued that the actor model should provide the guarantee of service. In Dijkstra’s model, although there could be an unbounded amount of time between the execution of sequential instructions on a computer, a (parallel) program that started out in a well defined state could terminate in only a bounded number of states [Dijkstra 1976]. Consequently, his model could not provide the guarantee of service. Dijkstra argued that it was impossible to implement unbounded nondeterminism.

Hewitt argued otherwise: there is no bound that can be placed on how long it takes a computational circuit called an arbiter to settle (see metastability (electronics)).[19] Arbiters are used in computers to deal with the circumstance that computer clocks operate asynchronously with respect to input from outside, e.g., keyboard input, disk access, network input, etc. So it could take an unbounded time for a message sent to a computer to be received and in the meantime the computer could traverse an unbounded number of states.

The actor model features unbounded nondeterminism which was captured in a mathematical model by Will Clinger using domain theory.[2] In the actor model, there is no global state.

Direct communication and asynchrony

Messages in the actor model are not necessarily buffered. This was a sharp break with previous approaches to models of concurrent computation. The lack of buffering caused a great deal of misunderstanding at the time of the development of the actor model and is still a controversial issue. Some researchers argued that the messages are buffered in the “ether” or the “environment”. Also, messages in the actor model are simply sent (like packets in IP); there is no requirement for a synchronous handshake with the recipient.

Actor creation plus addresses in messages means variable topology

A natural development of the actor model was to allow addresses in messages. Influenced by packet switched networks [1961 and 1964], Hewitt proposed the development of a new model of concurrent computation in which communications would not have any required fields at all: they could be empty. Of course, if the sender of a communication desired a recipient to have access to addresses which the recipient did not already have, the address would have to be sent in the communication.

For example, an actor might need to send a message to a recipient actor from which it later expects to receive a response, but the response will actually be handled by a third actor component that has been configured to receive and handle the response (for example, a different actor implementing the observer pattern). The original actor could accomplish this by sending a communication that includes the message it wishes to send, along with the address of the third actor that will handle the response. This third actor that will handle the response is called the resumption (sometimes also called a continuation or stack frame). When the recipient actor is ready to send a response, it sends the response message to the resumption actor address that was included in the original communication.
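The resumption pattern described above can be sketched as follows. The sketch is illustrative only: the actor names `requester`, `squarer` and `printer`, the use of plain strings as addresses, and the FIFO delivery loop are all assumptions made to keep the example short.

```python
from collections import deque

behaviors = {}     # address -> behavior; addresses are plain strings in this sketch
pending = deque()  # undelivered (address, message) pairs


def send(address, message):
    """Asynchronous send: enqueue and return immediately."""
    pending.append((address, message))


def run():
    """Deliver messages until none are pending."""
    while pending:
        address, message = pending.popleft()
        behaviors[address](message)


log = []


def squarer(message):
    # The communication carries the payload plus the address of the resumption.
    value, resumption = message
    send(resumption, value * value)


def requester(message):
    # The requester names a third actor as the handler of the response.
    send("squarer", (message, "printer"))


def printer(message):
    # The resumption: it receives the response on the requester's behalf.
    log.append(f"result: {message}")


behaviors.update(squarer=squarer, requester=requester, printer=printer)
send("requester", 7)
run()
print(log)  # ['result: 49']
```

The `squarer` actor knows nothing about `printer` in advance. It learns that address only from the communication it receives.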

So, the ability of actors to create new actors with which they can exchange communications, along with the ability to include the addresses of other actors in messages, gives actors the ability to create and participate in arbitrarily variable topological relationships with one another, much as the objects in Simula and other object-oriented languages may also be relationally composed into variable topologies of message-exchanging objects.

Inherently concurrent

As opposed to the previous approach based on composing sequential processes, the actor model was developed as an inherently concurrent model. In the actor model sequentiality was a special case that derived from concurrent computation as explained in actor model theory.

No requirement on order of message arrival

Hewitt argued against adding the requirement that messages must arrive in the order in which they are sent to the actor. If output message ordering is desired, then it can be modeled by a queue actor that provides this functionality. Such a queue actor would queue the messages that arrived so that they could be retrieved in FIFO order. So if an actor X sent a message M1 to an actor Y, and later X sent another message M2 to Y, there is no requirement that M1 arrives at Y before M2.
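A queue actor of the kind described above might look like the following sketch. The `QueueActor` class, the "put" and "get" message names, and the use of a callback in place of a reply address are hypothetical simplifications. The sketch shows only that items are handed out in the order in which they arrived at the queue actor.

```python
from collections import deque


class QueueActor:
    """Holds arrived items and hands them out oldest-first on request."""

    def __init__(self):
        self._items = deque()    # items that arrived and were not yet retrieved
        self._waiting = deque()  # reply callbacks for requests that found it empty

    def receive(self, message):
        kind, payload = message
        if kind == "put":
            if self._waiting:
                # A retrieval request is already waiting: serve it directly.
                self._waiting.popleft()(payload)
            else:
                self._items.append(payload)
        elif kind == "get":
            if self._items:
                payload(self._items.popleft())
            else:
                self._waiting.append(payload)


out = []
q = QueueActor()
q.receive(("get", out.append))  # a request may arrive before any item
q.receive(("put", "M1"))
q.receive(("put", "M2"))
q.receive(("get", out.append))
print(out)  # ['M1', 'M2']
```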

In this respect the actor model mirrors packet switching systems, which do not guarantee that packets are received in the order sent. Not providing an order-of-delivery guarantee allows packet switching to buffer packets, use multiple paths to send packets, resend damaged packets, and provide other optimizations.

For example, actors are allowed to pipeline the processing of messages. What this means is that in the course of processing a message M1, an actor can designate the behavior to be used to process the next message, and then in fact begin processing another message M2 before it has finished processing M1. Just because an actor is allowed to pipeline the processing of messages does not mean that it must pipeline the processing. Whether a message is pipelined is an engineering tradeoff. How would an external observer know whether the processing of a message by an actor has been pipelined? There is no ambiguity in the definition of an actor created by the possibility of pipelining. Of course, it is possible to perform the pipeline optimization incorrectly in some implementations, in which case unexpected behavior may occur.


Another important characteristic of the actor model is locality.

Locality means that in processing a message, an actor can send messages only to addresses that it receives in the message, addresses that it already had before it received the message, and addresses for actors that it creates while processing the message. (But see Synthesizing addresses of actors.)

Also locality means that there is no simultaneous change in multiple locations. In this way it differs from some other models of concurrency, e.g., the Petri net model in which tokens are simultaneously removed from multiple locations and placed in other locations.

Composing actor systems

The idea of composing actor systems into larger ones is an important aspect of modularity that was developed in Gul Agha’s doctoral dissertation[6] and later developed further by Gul Agha, Ian Mason, Scott Smith, and Carolyn Talcott.[9]


A key innovation was the introduction of behavior specified as a mathematical function to express what an actor does when it processes a message, including specifying a new behavior to process the next message that arrives. Behaviors provided a mechanism to mathematically model the sharing in concurrency.
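One way to picture a behavior as a function is a sketch in which processing a message returns a pair: the behavior for the next message and the messages to be sent. The `account` example, its message names and the `run` driver are hypothetical and chosen for illustration; they are not drawn from the actor literature.

```python
def account(balance):
    """Return the behavior of a hypothetical account actor holding `balance`."""
    def behavior(message):
        kind, amount = message
        if kind == "deposit":
            return account(balance + amount), []
        if kind == "withdraw" and amount <= balance:
            return account(balance - amount), [("paid", amount)]
        if kind == "withdraw":
            return account(balance), [("refused", amount)]
        return account(balance), []  # unknown messages leave the behavior unchanged
    return behavior


def run(behavior, messages):
    """Feed messages to a behavior one by one, collecting what it sends."""
    outputs = []
    for message in messages:
        behavior, sent = behavior(message)
        outputs.extend(sent)
    return outputs


trace = run(account(10), [("withdraw", 30), ("deposit", 50), ("withdraw", 30)])
print(trace)  # [('refused', 30), ('paid', 30)]
```

Because no variable is ever assigned twice, the effect of each message is fully described by the function's result. This is what makes such a description independent of any particular interpreter.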

Behaviors also freed the actor model from implementation details, e.g., the Smalltalk-72 token stream interpreter. However, it is critical to understand that the efficient implementation of systems described by the actor model requires extensive optimization. See Actor model implementation for details.

Modeling other concurrency systems

Other concurrency systems (e.g., process calculi) can be modeled in the actor model using a two-phase commit protocol.[20]

Computational Representation Theorem

See also: Denotational semantics of the Actor model

There is a Computational Representation Theorem in the actor model for systems which are closed in the sense that they do not receive communications from outside. The mathematical denotation denoted by a closed system $\mathtt{S}$ is constructed from an initial behavior $\bot_{\mathtt{S}}$ and a behavior-approximating function $\mathbf{progression}_{\mathtt{S}}$. These obtain increasingly better approximations and construct a denotation (meaning) for $\mathtt{S}$ as follows [Hewitt 2008; Clinger 1981]:

$$\mathbf{Denote}_{\mathtt{S}} \equiv \lim_{i \to \infty} \mathbf{progression}_{{\mathtt{S}}^{i}}(\bot_{\mathtt{S}})$$

In this way, $\mathtt{S}$ can be mathematically characterized in terms of all its possible behaviors (including those involving unbounded nondeterminism). Although $\mathbf{Denote}_{\mathtt{S}}$ is not an implementation of $\mathtt{S}$, it can be used to prove a generalization of the Church-Turing-Rosser-Kleene thesis [Kleene 1943]:

A consequence of the above theorem is that a finite actor can nondeterministically respond with an uncountable number of different outputs.

Relationship to logic programming

One of the key motivations for the development of the actor model was to understand and deal with the control structure issues that arose in development of the Planner programming language.[citation needed] Once the actor model was initially defined, an important challenge was to understand the power of the model relative to Robert Kowalski’s thesis that “computation can be subsumed by deduction”. Hewitt argued that Kowalski’s thesis turned out to be false for the concurrent computation in the actor model (see Indeterminacy in concurrent computation).

Nevertheless, attempts were made to extend logic programming to concurrent computation. However, Hewitt and Agha [1991] claimed that the resulting systems were not deductive in the following sense: computational steps of the concurrent logic programming systems do not follow deductively from previous steps (see Indeterminacy in concurrent computation). Recently, logic programming has been integrated into the actor model in a way that maintains logical semantics.[19]


Migration in the actor model is the ability of actors to change locations. For example, in his dissertation, Aki Yonezawa modeled a post office that customer actors could enter, change locations within while operating, and exit. An actor that can migrate can be modeled by having a location actor that changes when the actor migrates. However, the faithfulness of this modeling is controversial and the subject of research.[citation needed]


The security of actors can be protected in the following ways:

Synthesizing addresses of actors

A delicate point in the actor model is the ability to synthesize the address of an actor. In some cases security can be used to prevent the synthesis of addresses (see Security). However, if an actor address is simply a bit string then clearly it can be synthesized although it may be difficult or even infeasible to guess the address of an actor if the bit strings are long enough. SOAP uses a URL for the address of an endpoint where an actor can be reached. Since a URL is a character string, it can clearly be synthesized although encryption can make it virtually impossible to guess.

Synthesizing the addresses of actors is usually modeled using mapping. The idea is to use an actor system to perform the mapping to the actual actor addresses. For example, on a computer the memory structure of the computer can be modeled as an actor system that does the mapping. In the case of SOAP addresses, it’s modeling the DNS and the rest of the URL mapping.
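The two ideas above, addresses that are long random bit strings and a mapping from public names to actual addresses, can be sketched together. The `AddressSpace` class and its methods are hypothetical. The only library call relied on is `secrets.token_hex` from the Python standard library, used here to produce a 128-bit random address.

```python
import secrets


class AddressSpace:
    """Maps public names to hard-to-guess actor addresses (illustrative only)."""

    def __init__(self):
        self._actors = {}  # address (random hex string) -> handler
        self._names = {}   # public name -> address

    def create(self, handler):
        """Register a handler under a fresh 128-bit random address."""
        address = secrets.token_hex(16)
        self._actors[address] = handler
        return address

    def publish(self, name, address):
        """Make an address reachable through a public name."""
        self._names[name] = address

    def resolve(self, name):
        """Perform the mapping from a name to the actual address, if any."""
        return self._names.get(name)

    def deliver(self, address, message):
        """Deliver a message; report whether the address named an actor."""
        handler = self._actors.get(address)
        if handler is None:
            return False
        handler(message)
        return True


space = AddressSpace()
inbox = []
token = space.create(inbox.append)
space.publish("printer", token)

delivered = space.deliver(space.resolve("printer"), "hello")
guessed = space.deliver("00" * 16, "spam")  # a synthesized address
print(delivered, inbox)
```

Anyone can construct a 32-character hexadecimal string, so the synthesized address is a perfectly legal value. It simply fails to name an actor unless it matches a random 128-bit token, which is overwhelmingly unlikely.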

Contrast with other models of message-passing concurrency

Robin Milner’s initial published work on concurrency[21] was also notable in that it was not based on composing sequential processes. His work differed from the actor model because it was based on a fixed number of processes of fixed topology communicating numbers and strings using synchronous communication. The original communicating sequential processes (CSP) model[22] published by Tony Hoare differed from the actor model because it was based on the parallel composition of a fixed number of sequential processes connected in a fixed topology, and communicating using synchronous message-passing based on process names (see Actor model and process calculi history). Later versions of CSP abandoned communication based on process names in favor of anonymous communication via channels, an approach also used in Milner’s work on the calculus of communicating systems and the π-calculus.

These early models by Milner and Hoare both had the property of bounded nondeterminism. Modern, theoretical CSP ([Hoare 1985] and [Roscoe 2005]) explicitly provides unbounded nondeterminism.

Petri nets and their extensions (e.g., coloured Petri nets) are like actors in that they are based on asynchronous message passing and unbounded nondeterminism, while they are like early CSP in that they define fixed topologies of elementary processing steps (transitions) and message repositories (places).


The actor model has been influential on both theory development and practical software development.


The actor model has influenced the development of the π-calculus and subsequent process calculi. In his Turing lecture, Robin Milner wrote:[23]

Now, the pure lambda-calculus is built with just two kinds of thing: terms and variables. Can we achieve the same economy for a process calculus? Carl Hewitt, with his actors model, responded to this challenge long ago; he declared that a value, an operator on values, and a process should all be the same kind of thing: an actor.

This goal impressed me, because it implies the homogeneity and completeness of expression … But it was long before I could see how to attain the goal in terms of an algebraic calculus…

So, in the spirit of Hewitt, our first step is to demand that all things denoted by terms or accessed by names—values, registers, operators, processes, objects—are all of the same kind of thing; they should all be processes.


The actor model has had extensive influence on commercial practice. For example, Twitter has used actors for scalability.[24] Also, Microsoft has used the actor model in the development of its Asynchronous Agents Library.[25] There are many other actor libraries listed in the actor libraries and frameworks section below.

Addressed issues

According to Hewitt [2006], the actor model addresses issues in computer and communications architecture, concurrent programming languages, and Web services including the following:

  • Scalability: the challenge of scaling up concurrency both locally and nonlocally.
  • Transparency: bridging the chasm between local and nonlocal concurrency. Transparency is currently a controversial issue. Some researchers have advocated a strict separation between local concurrency using concurrent programming languages (e.g., Java and C#) and nonlocal concurrency using SOAP for Web services. Strict separation produces a lack of transparency that causes problems when it is desirable or necessary to change between local and nonlocal access to Web services (see Distributed computing).
  • Inconsistency: inconsistency is the norm because all very large knowledge systems about human information system interactions are inconsistent. This inconsistency extends to the documentation and specifications of very large systems (e.g., Microsoft Windows software, etc.), which are internally inconsistent.

Many of the ideas introduced in the actor model are now also finding application in multi-agent systems for these same reasons [Hewitt 2006b 2007b]. The key difference is that agent systems (in most definitions) impose extra constraints upon the actors, typically requiring that they make use of commitments and goals.

Programming with actors

A number of different programming languages employ the actor model or some variation of it. These languages include:

Early actor programming languages

Later actor programming languages

Actor libraries and frameworks

Actor libraries or frameworks have also been implemented to permit actor-style programming in languages that don’t have actors built-in. Some of these frameworks are:

| Name | Status | Latest release | License | Languages |
|---|---|---|---|---|
| ReActed | Active | 2021-03-07 | Apache 2.0 | Java |
| Acteur | Active | 2020-04-16[43] | Apache-2.0 / MIT | Rust |
| Bastion | Active | 2020-08-12[44] | Apache-2.0 / MIT | Rust |
| Actor4j | Active | 2020-01-31 | Apache 2.0 | Java |
| Actr | Active | 2019-04-09[46] | Apache 2.0 | Java |
| Vert.x | Active | 2018-02-13 | Apache 2.0 | Java, Groovy, Javascript, Ruby, Scala, Kotlin, Ceylon |
| ActorFx | Inactive | 2013-11-13 | Apache 2.0 | .NET |
| Akka (toolkit) | Active | 2019-05-21[47] | Apache 2.0 | Java and Scala |
| Akka.NET | Active | 2020-08-20[48] | Apache 2.0 | .NET |
| Remact.Net | Inactive | 2016-06-26 | MIT | .NET, Javascript |
| Ateji PX | Inactive | ? | ? | Java |
| F# MailboxProcessor | Active | same as F# (built-in core library) | Apache License | F# |
| Korus | Active | 2010-02-04 | GPL 3 | Java |
| ActorFoundry (based on Kilim) | Inactive | 2008-12-28 | ? | Java |
| Cloud Haskell | Active | 2015-06-17[52] | BSD | Haskell |
| CloudI | Active | 2021-05-27[53] | MIT | ATS, C/C++, Elixir/Erlang/LFE, Go, Haskell, Java, Javascript, OCaml, Perl, PHP, Python, Ruby |
| Clutter | Active | 2017-05-12[54] | LGPL 2.1 | C, C++ (cluttermm), Python (pyclutter), Perl (perl-Clutter) |
| NAct | Inactive | 2012-02-28 | LGPL 3.0 | .NET |
| Nact | Active | 2018-06-06[55] | Apache 2.0 | JavaScript/ReasonML |
| Retlang | Inactive | 2011-05-18[56] | New BSD | .NET |
| Jetlang | Active | 2013-05-30[57] | New BSD | Java |
| Haskell-Actor | Active? | 2008 | New BSD | Haskell |
| GPars | Active | 2014-05-09[58] | Apache 2.0 | Groovy |
| OOSMOS | Active | 2019-05-09[59] | GPL 2.0 and commercial (dual licensing) | C. C++ friendly |
| Panini | Active | 2014-05-22 | MPL 1.1 | Programming Language by itself |
| PARLEY | Active? | 2007-22-07 | GPL 2.1 | Python |
| Peernetic | Active | 2007-06-29 | LGPL 3.0 | Java |
| PostSharp | Active | 2014-09-24 | Commercial / Freemium | .NET |
| Pulsar | Active | 2016-07-09[60] | New BSD | Python |
| Pykka | Active | 2019-05-07[62] | Apache 2.0 | Python |
| Termite Scheme | Active? | 2009-05-21 | LGPL | Scheme (Gambit implementation) |
| Libactor | Active? | 2009 | GPL 2.0 | C |
| Actor-CPP | Active | 2012-03-10[67] | GPL 2.0 | C++ |
| S4 | Inactive | 2012-07-31[68] | Apache 2.0 | Java |
| C++ Actor Framework (CAF) | Active | 2020-02-08[69] | Boost Software License 1.0 and BSD 3-Clause | C++11 |
| LabVIEW Actor Framework | Active | 2012-03-01[71] | National Instruments SLA | LabVIEW |
| LabVIEW Messenger Library | Active | 2016-06-01 | BSD | LabVIEW |
| Orbit | Active | 2019-05-28[72] | New BSD | Java |
| QP frameworks for real-time embedded systems | Active | 2019-05-25[73] | GPL 2.0 and commercial (dual licensing) | C and C++ |
| libprocess | Active | 2013-06-19 | Apache 2.0 | C++ |
| SObjectizer | Active | 2020-05-09[74] | New BSD | C++11 |
| rotor | Active | 2020-10-23[75] | MIT License | C++17 |
| Orleans | Active | 2019-06-02[76] | MIT License | C#/.NET |
| Skynet | Active | 2020-12-10 | MIT License | C/Lua |
| Reactors.IO | Active | 2016-06-14 | BSD License | Java/Scala |
| libagents | Active | 2020-03-08 | Free software license | C++11 |
| Proto.Actor | Active | 2021-01-05 | Free software license | Go, C#, Python, JavaScript, Java, Kotlin |
| FunctionalJava | Active | 2018-08-18[77] | BSD 3-Clause | Java |
| Riker | Active | 2019-01-04 | MIT License | Rust |
| Comedy | Active | 2019-03-09 | EPL 1.0 | JavaScript |
| vlingo | Active | 2020-07-26 | Mozilla Public License 2.0 | Java, Kotlin, soon .NET |
| wasmCloud | Active | 2021-03-23 | Apache 2.0 | WebAssembly (Rust, TinyGo, Zig, AssemblyScript) |
| ray | Active | 2020-08-27 | Apache 2.0 | Python |

See also


  1. ^ Hewitt, Carl; Bishop, Peter; Steiger, Richard (1973). “A Universal Modular Actor Formalism for Artificial Intelligence”. IJCAI.
  2. a b c d William Clinger (June 1981). “Foundations of Actor Semantics”. Mathematics Doctoral Dissertation. MIT. hdl:1721.1/6935.
  3. a b Irene Greif (August 1975). “Semantics of Communicating Parallel Processes”. EECS Doctoral Dissertation. MIT.
  4. a b Henry Baker; Carl Hewitt (August 1977). “Laws for Communicating Parallel Processes”. IFIP.
  5. ^ “Laws for Communicating Parallel Processes” (PDF). 10 May 1977.
  6. a b c Gul Agha (1986). “Actors: A Model of Concurrent Computation in Distributed Systems”. Doctoral Dissertation. MIT Press. hdl:1721.1/6952.
  7. ^ “Home”. Archived from the original on 2013-02-22. Retrieved 2012-12-02.
  8. ^ Carl Hewitt. Viewing Control Structures as Patterns of Passing Messages Journal of Artificial Intelligence. June 1977.
  9. a b Gul Agha; Ian Mason; Scott Smith; Carolyn Talcott (January 1993). “A Foundation for Actor Computation”. Journal of Functional Programming.
  10. ^ Carl Hewitt (2006-04-27). “What is Commitment? Physical, Organizational, and Social” (PDF).
  11. ^ Mauro Gaspari; Gianluigi Zavattaro (May 1997). “An Algebra of Actors” (PDF). Formal Methods for Open Object-Based Distributed Systems. Technical Report UBLCS-97-4. University of Bologna. pp. 3–18. doi:10.1007/978-0-387-35562-7_2. ISBN 978-1-4757-5266-3.
  12. ^ M. Gaspari; G. Zavattaro (1999). “An Algebra of Actors”. Formal Methods for Open Object Based Systems.
  13. ^ Gul Agha; Prasanna Thati (2004). “An Algebraic Theory of Actors and Its Application to a Simple Object-Based Language” (PDF). From OO to FM (Dahl Festschrift) LNCS 2635. Archived from the original (PDF) on 2004-04-20.
  14. ^ John Darlington; Y. K. Guo (1994). “Formalizing Actors in Linear Logic”. International Conference on Object-Oriented Information Systems.
  15. ^ Hansen, Per Brinch (2002). The Origins of Concurrent Programming: From Semaphores to Remote Procedure Calls. Springer. ISBN 978-0-387-95401-1.
  16. ^ Hansen, Per Brinch (1996). “Monitors and Concurrent Pascal: A Personal History”. Communications of the ACM: 121–172.
  17. ^ Hoare, Tony (October 1974). “Monitors: An Operating System Structuring Concept”. Communications of the ACM. 17 (10): 549–557. doi:10.1145/355620.361161. S2CID 1005769.
  18. ^ Hansen, Per Brinch (July 1973). Operating System Principles. Prentice-Hall.
  19. a b Hewitt, Carl (2012). “What is computation? Actor Model versus Turing’s Model”. In Zenil, Hector (ed.). A Computable Universe: Understanding Computation & Exploring Nature as Computation. Dedicated to the memory of Alan M. Turing on the 100th anniversary of his birth. World Scientific Publishing Company.
  20. ^ Frederick Knabe. A Distributed Protocol for Channel-Based Communication with Choice PARLE 1992.
  21. ^ Robin Milner. Processes: A Mathematical Model of Computing Agents in Logic Colloquium 1973.
  22. ^ C.A.R. Hoare. Communicating sequential processes. CACM. August 1978.
  23. ^ Milner, Robin (1993). “Elements of interaction”. Communications of the ACM. 36: 78–89. doi:10.1145/151233.151240.
  24. ^ “How Twitter Is Scaling « Waiming Mok’s Blog”. 2009-06-27. Retrieved 2012-12-02.
  25. ^ “Actor-Based Programming with the Asynchronous Agents Library” MSDN September 2010.
  26. ^ Henry Lieberman (June 1981). “A Preview of Act 1”. MIT AI memo 625. hdl:1721.1/6350.
  27. ^ Henry Lieberman (June 1981). “Thinking About Lots of Things at Once without Getting Confused: Parallelism in Act 1”. MIT AI memo 626. hdl:1721.1/6351.
  28. ^ Jean-Pierre Briot. Acttalk: A framework for object-oriented concurrent programming-design and experience 2nd France-Japan workshop. 1999.
  29. ^ Ken Kahn. A Computational Theory of Animation MIT EECS Doctoral Dissertation. August 1979.
  30. ^ William Athas and Nanette Boden Cantor: An Actor Programming System for Scientific Computing in Proceedings of the NSF Workshop on Object-Based Concurrent Programming. 1988. Special Issue of SIGPLAN Notices.
  31. ^ Darrell Woelk. Developing InfoSleuth Agents Using Rosette: An Actor Based Language Proceedings of the CIKM ’95 Workshop on Intelligent Information Agents. 1995.
  32. ^ Dedecker J., Van Cutsem T., Mostinckx S., D’Hondt T., De Meuter W. Ambient-oriented Programming in AmbientTalk. In “Proceedings of the 20th European Conference on Object-Oriented Programming (ECOOP), Dave Thomas (Ed.), Lecture Notes in Computer Science Vol. 4067, pp. 230-254, Springer-Verlag.”, 2006
  33. ^ Darryl K. Taft (2009-04-17). “Microsoft Cooking Up New Parallel Programming Language”. Retrieved 2012-12-02.
  34. ^ “Humus”. Retrieved 2012-12-02.
  35. ^ Brandauer, Stephan; et al. (2015). “Parallel objects for multicores: A glimpse at the parallel language encore”. Formal Methods for Multicore Programming. Springer International Publishing: 1–56.
  36. ^ “The Pony Language”.
  37. ^ Clebsch, Sylvan; Drossopoulou, Sophia; Blessing, Sebastian; McNeil, Andy (2015). “Deny capabilities for safe, fast actors”. Proceedings of the 5th International Workshop on Programming Based on Actors, Agents, and Decentralized Control – AGERE! 2015. pp. 1–12. doi:10.1145/2824815.2824816. ISBN 9781450339018. S2CID 415745.
  38. ^ “The P Language”. 2019-03-08.
  39. ^ “The P# Language”. 2019-03-12.
  40. ^ Carlos Varela and Gul Agha (2001). “Programming Dynamically Reconfigurable Open Systems with SALSA”. ACM SIGPLAN Notices. OOPSLA’2001 Intriguing Technology Track Proceedings. 36.
  41. ^ Philipp Haller and Martin Odersky (September 2006). “Event-Based Programming without Inversion of Control” (PDF). Proc. JMLC 2006.
  42. ^ Philipp Haller and Martin Odersky (January 2007). “Actors that Unify Threads and Events” (PDF). Technical report LAMP 2007. Archived from the original (PDF) on 2011-06-07. Retrieved 2007-12-10.
  43. ^ “acteur – 0.9.1· David Bonet ·”. Retrieved 2020-04-16.
  44. ^ Bulut, Mahmut (2019-12-15). “Bastion on” Retrieved 2019-12-15.
  45. ^ “actix – 0.10.0· Rob Ede ·”. Retrieved 2021-02-28.
  46. ^ “Releases · zakgof/actr · GitHub”. Retrieved 2019-04-16.
  47. ^ “Akka 2.5.23 Released · Akka”. Akka. 2019-05-21. Retrieved 2019-06-03.
  48. ^ Akka.NET v1.4.10 Stable Release GitHub – akkadotnet/ Port of Akka actors for .NET., Akka.NET, 2020-10-01, retrieved 2020-10-01
  49. ^ Srinivasan, Sriram; Alan Mycroft (2008). “Kilim: Isolation-Typed Actors for Java” (PDF). European Conference on Object Oriented Programming ECOOP 2008. Cyprus. Retrieved 2016-02-25.
  50. ^ “Releases · kilim/kilim · GitHub”. Retrieved 2019-06-03.
  51. ^ “Commit History · stevedekorte/ActorKit · GitHub”. Retrieved 2016-02-25.
  52. ^ “Commit History · haskell-distributed/distributed-process · GitHub”. Retrieved 2012-12-02.
  53. ^ “Releases · CloudI/CloudI · GitHub”. Retrieved 2021-06-21.
  54. ^ “Tags · GNOME/clutter · GitLab”. Retrieved 2019-06-03.
  55. ^ “Releases · ncthbrt/nact · GitHub”. Retrieved 2019-06-03.
  56. ^ “Changes – retlang – Message based concurrency in .NET – Google Project Hosting”. Retrieved 2016-02-25.
  57. ^ “ – jetlang – – Message based concurrency for Java – Google Project Hosting”. 2012-02-14. Retrieved 2016-02-25.
  58. ^ “GPars Releases”. GitHub. Retrieved 2016-02-25.
  59. ^ “Releases · oosmos/oosmos · GitHub”. GitHub. Retrieved 2019-06-03.
  60. ^ “Pulsar Design and Actors”. Archived from the original on 2015-07-04.
  61. ^ “Pulsar documentation”. Archived from the original on 2013-07-26.
  62. ^ “Changes – Pykka 2.0.0 documentation”. Retrieved 2019-06-03.
  63. ^ “Theron – Ashton Mason”. Retrieved 2018-08-29.
  64. ^ “Theron – Version 6.00.02 released”. Archived from the original on 2016-03-16. Retrieved 2016-02-25.
  65. ^ “Theron”. Archived from the original on 2016-03-04. Retrieved 2016-02-25.
  66. ^ “Releases · puniverse/quasar · GitHub”. Retrieved 2019-06-03.
  67. ^ “Changes – actor-cpp – An implementation of the actor model for C++ – Google Project Hosting”. Retrieved 2012-12-02.
  68. ^ “Commit History · s4/s4 · Apache”. Archived from the original on 2016-03-06. Retrieved 2016-01-16.
  69. ^ “Releases · actor-framework/actor-framework · GitHub”. Retrieved 2020-03-07.
  70. ^ “celluloid | | your community gem host”. Retrieved 2019-06-03.
  71. ^ “Community: Actor Framework, LV 2011 revision (version 3.0.7)”. 2011-09-23. Retrieved 2016-02-25.
  72. ^ “Releases · orbit/orbit · GitHub”. GitHub. Retrieved 2019-06-03.
  73. ^ “QP Real-Time Embedded Frameworks & Tools – Browse Files at”. Retrieved 2019-06-03.
  74. ^ “Releases · Stiffstream/sobjectizer · GitHub”. GitHub. Retrieved 2019-06-19.
  75. ^ “Releases · basiliscos/cpp-rotor· GitHub”. GitHub. Retrieved 2020-10-10.
  76. ^ “Releases · dotnet/orleans · GitHub”. GitHub. Retrieved 2019-06-03.
  77. ^ “FunctionalJava releases”. GitHub. Retrieved 2018-08-23.



” (WP)


Fair Use Sources:

Software Engineering

Analysis of algorithms

” (WP)

In computer science, the analysis of algorithms is the process of finding the computational complexity of algorithms – the amount of time, storage, or other resources needed to execute them. Usually, this involves determining a function that relates the length of an algorithm’s input to the number of steps it takes (its time complexity) or the number of storage locations it uses (its space complexity). An algorithm is said to be efficient when this function’s values are small, or grow slowly compared to a growth in the size of the input. Different inputs of the same length may cause the algorithm to have different behavior, so best, worst and average case descriptions might all be of practical interest. When not otherwise specified, the function describing the performance of an algorithm is usually an upper bound, determined from the worst case inputs to the algorithm.

The term “analysis of algorithms” was coined by Donald Knuth.[1] Algorithm analysis is an important part of a broader computational complexity theory, which provides theoretical estimates for the resources needed by any algorithm which solves a given computational problem. These estimates provide an insight into reasonable directions of search for efficient algorithms.

In theoretical analysis of algorithms it is common to estimate their complexity in the asymptotic sense, i.e., to estimate the complexity function for arbitrarily large input. Big O notation, Big-omega notation and Big-theta notation are used to this end. For instance, binary search is said to run in a number of steps proportional to the logarithm of the length of the sorted list being searched, or in $O(\log n)$, colloquially “in logarithmic time”. Usually asymptotic estimates are used because different implementations of the same algorithm may differ in efficiency. However, the efficiencies of any two “reasonable” implementations of a given algorithm are related by a constant multiplicative factor called a hidden constant.

Exact (not asymptotic) measures of efficiency can sometimes be computed, but they usually require certain assumptions concerning the particular implementation of the algorithm, called a model of computation. A model of computation may be defined in terms of an abstract computer, e.g., a Turing machine, and/or by postulating that certain operations are executed in unit time. For example, if the sorted list to which we apply binary search has $n$ elements, and we can guarantee that each lookup of an element in the list can be done in unit time, then at most $\log_2 n + 1$ time units are needed to return an answer.
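The unit-time bound for binary search can be checked with a small sketch. The function names `binary_search_lookups` and `lookup_bound` are illustrative and not part of the original text; each access to `sorted_items[mid]` is assumed to cost one time unit.

```python
def binary_search_lookups(sorted_items, target):
    """Return (index or None, number of element lookups performed)."""
    lo, hi = 0, len(sorted_items) - 1
    lookups = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        lookups += 1  # one unit-time lookup of sorted_items[mid]
        value = sorted_items[mid]
        if value == target:
            return mid, lookups
        if value < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None, lookups


def lookup_bound(n):
    """floor(log2 n) + 1 for n >= 1, computed exactly with integers."""
    return n.bit_length()
```

For a list of 1,000 elements, `lookup_bound(1000)` is 10, so under these assumptions no search, successful or not, should need more than 10 lookups.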

Cost models

Time efficiency estimates depend on what we define to be a step. For the analysis to correspond usefully to the actual execution time, the time required to perform a step must be guaranteed to be bounded above by a constant. One must be careful here; for instance, some analyses count an addition of two numbers as one step. This assumption may not be warranted in certain contexts. For example, if the numbers involved in a computation may be arbitrarily large, the time required by a single addition can no longer be assumed to be constant.

Two cost models are generally used:[2][3][4][5][6]

  • the uniform cost model, also called uniform-cost measurement (and similar variations), assigns a constant cost to every machine operation, regardless of the size of the numbers involved
  • the logarithmic cost model, also called logarithmic-cost measurement (and similar variations), assigns a cost to every machine operation proportional to the number of bits involved

The latter is more cumbersome to use, so it’s only employed when necessary, for example in the analysis of arbitrary-precision arithmetic algorithms, like those used in cryptography.
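The difference between the two cost models can be illustrated by computing a factorial with repeated multiplication. This is a sketch under a simplifying assumption: the logarithmic model is approximated by charging each multiplication the combined bit length of its two operands, and the name `factorial_costs` is hypothetical.

```python
def factorial_costs(n):
    """Return (uniform_cost, logarithmic_cost) of computing n! by repeated multiplication."""
    product = 1
    uniform = 0
    logarithmic = 0
    for factor in range(2, n + 1):
        # Uniform cost model: every multiplication costs one unit.
        uniform += 1
        # Logarithmic cost model (simplified): cost proportional to the bits involved.
        logarithmic += product.bit_length() + factor.bit_length()
        product *= factor
    return uniform, logarithmic
```

Under the uniform model the cost grows linearly with `n`, while the logarithmic charge grows faster because the intermediate product keeps gaining bits.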

A key point which is often overlooked is that published lower bounds for problems are often given for a model of computation that is more restricted than the set of operations that you could use in practice and therefore there are algorithms that are faster than what would naively be thought possible.[7]

Run-time analysis

Run-time analysis is a theoretical classification that estimates and anticipates the increase in running time (or run-time) of an algorithm as its input size (usually denoted as n) increases. Run-time efficiency is a topic of great interest in computer science: A program can take seconds, hours, or even years to finish executing, depending on which algorithm it implements. While software profiling techniques can be used to measure an algorithm’s run-time in practice, they cannot provide timing data for all infinitely many possible inputs; the latter can only be achieved by the theoretical methods of run-time analysis.

Shortcomings of empirical metrics

Since algorithms are platform-independent (i.e. a given algorithm can be implemented in an arbitrary programming language on an arbitrary computer running an arbitrary operating system), there are additional significant drawbacks to using an empirical approach to gauge the comparative performance of a given set of algorithms.

Take as an example a program that looks up a specific entry in a sorted list of size n. Suppose this program were implemented on Computer A, a state-of-the-art machine, using a linear search algorithm, and on Computer B, a much slower machine, using a binary search algorithm. Benchmark testing on the two computers running their respective programs might look something like the following:

n (list size) | Computer A run-time (in nanoseconds) | Computer B run-time (in nanoseconds)

Based on these metrics, it would be easy to jump to the conclusion that Computer A is running an algorithm that is far superior in efficiency to that of Computer B. However, if the size of the input-list is increased to a sufficient number, that conclusion is dramatically demonstrated to be in error:

n (list size) | Computer A run-time (in nanoseconds) | Computer B run-time (in nanoseconds)
63,072 × 10^12 | 31,536 × 10^12 ns, or 1 year | 1,375,000 ns, or 1.375 milliseconds

Computer A, running the linear search program, exhibits a linear growth rate. The program’s run-time is directly proportional to its input size. Doubling the input size doubles the run time, quadrupling the input size quadruples the run-time, and so forth. On the other hand, Computer B, running the binary search program, exhibits a logarithmic growth rate. Quadrupling the input size only increases the run time by a constant amount (in this example, 50,000 ns). Even though Computer A is ostensibly a faster machine, Computer B will inevitably surpass Computer A in run-time because it’s running an algorithm with a much slower growth rate.
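The crossover between the two machines can be sketched with a simple model. The constants are assumptions chosen to be roughly consistent with the text: 0.5 ns per list element for Computer A (31,536 × 10^12 ns for 63,072 × 10^12 elements) and 25,000 ns per doubling of the input for Computer B (50,000 ns per quadrupling). The function names and the search bracket are hypothetical.

```python
import math


def linear_time_ns(n, ns_per_item=0.5):
    """Modelled run-time of Computer A (linear search)."""
    return ns_per_item * n


def logarithmic_time_ns(n, ns_per_doubling=25_000):
    """Modelled run-time of Computer B (binary search)."""
    return ns_per_doubling * math.log2(n)


def crossover_size(lo=100_000, hi=10_000_000):
    """Smallest n in (lo, hi] at which the linear model becomes slower.

    Assumes the linear model is still faster at lo and already slower at hi.
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if linear_time_ns(mid) > logarithmic_time_ns(mid):
            hi = mid
        else:
            lo = mid
    return hi
```

With these assumed constants the crossover falls at a list size of roughly one million entries; beyond that point the slower machine running binary search wins by an ever larger margin.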

Orders of growth

Main article: Big O notation

Informally, an algorithm can be said to exhibit a growth rate on the order of a mathematical function if beyond a certain input size $n$, the function $f(n)$ times a positive constant provides an upper bound or limit for the run-time of that algorithm. In other words, for a given input size $n$ greater than some $n_0$ and a constant $c$, the running time of that algorithm will never be larger than $c \times f(n)$. This concept is frequently expressed using Big O notation. For example, since the run-time of insertion sort grows quadratically as its input size increases, insertion sort can be said to be of order $O(n^2)$.

Big O notation is a convenient way to express the worst-case scenario for a given algorithm, although it can also be used to express the average case; for example, the worst-case scenario for quicksort is $O(n^2)$, but the average-case run-time is $O(n \log n)$.

Empirical orders of growth

Assuming the execution time follows the power rule, $t \approx k n^a$, the coefficient $a$ can be found[8] by taking empirical measurements of run time $\{t_1, t_2\}$ at some problem-size points $\{n_1, n_2\}$, and calculating $t_2/t_1 = (n_2/n_1)^a$ so that $a = \log(t_2/t_1)/\log(n_2/n_1)$. In other words, this measures the slope of the empirical line on the log–log plot of execution time vs. problem size, at some size point. If the order of growth indeed follows the power rule (and so the line on the log–log plot is indeed a straight line), the empirical value of $a$ will stay constant at different ranges, and if not, it will change (and the line is a curved line) – but it could still serve for comparison of any two given algorithms as to their empirical local orders of growth behaviour. Applied to the above table:

n (list size) | Computer A run-time (in nanoseconds) | Local order of growth | Computer B run-time (in nanoseconds) | Local order of growth

It is clearly seen that the first algorithm exhibits a linear order of growth indeed following the power rule. The empirical values for the second one are diminishing rapidly, suggesting it follows another rule of growth and in any case has much lower local orders of growth (and improving further still), empirically, than the first one.
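The local order of growth formula $a = \log(t_2/t_1)/\log(n_2/n_1)$ is easy to apply in code. The sketch below uses a hypothetical helper `local_order`; the sample run-times in the usage note are illustrative values from an assumed linear model and an assumed logarithmic model, not measurements.

```python
import math


def local_order(n1, t1, n2, t2):
    """Empirical exponent a in t ~ k * n**a between two measurement points."""
    return math.log(t2 / t1) / math.log(n2 / n1)
```

For run-times that quadruple when the input quadruples, for example `local_order(16, 8, 64, 32)`, the result is 1, a linear order of growth. For run-times that only double when the input grows sixteenfold, `local_order(16, 100_000, 256, 200_000)`, the result is about 0.25, and it keeps shrinking at larger sizes, as expected for logarithmic growth.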

Evaluating run-time complexity

The run-time complexity for the worst-case scenario of a given algorithm can sometimes be evaluated by examining the structure of the algorithm and making some simplifying assumptions. Consider the following pseudocode:

1    get a positive integer n from input
2    if n > 10
3        print "This might take a while..."
4    for i = 1 to n
5        for j = 1 to i
6            print i * j
7    print "Done!"

A given computer will take a discrete amount of time to execute each of the instructions involved with carrying out this algorithm. The specific amount of time to carry out a given instruction will vary depending on which instruction is being executed and which computer is executing it, but on a conventional computer, this amount will be deterministic.[9] Say that the actions carried out in step 1 are considered to consume time T1, step 2 uses time T2, and so forth.

In the algorithm above, steps 1, 2 and 7 will only be run once. For a worst-case evaluation, it should be assumed that step 3 will be run as well. Thus the total amount of time to run steps 1-3 and step 7 is:

$$T_1 + T_2 + T_3 + T_7.$$

The loops in steps 4, 5 and 6 are trickier to evaluate. The outer loop test in step 4 will execute $(n + 1)$ times (note that an extra step is required to terminate the for loop, hence $n + 1$ and not $n$ executions), which will consume $T_4(n + 1)$ time. The inner loop, on the other hand, is governed by the value of $j$, which iterates from 1 to $i$. On the first pass through the outer loop, $j$ iterates from 1 to 1: the inner loop makes one pass, so running the inner loop body (step 6) consumes $T_6$ time, and the inner loop test (step 5) consumes $2T_5$ time. During the next pass through the outer loop, $j$ iterates from 1 to 2: the inner loop makes two passes, so running the inner loop body (step 6) consumes $2T_6$ time, and the inner loop test (step 5) consumes $3T_5$ time.

Altogether, the total time required to run the inner loop body can be expressed as an arithmetic progression:

$$T_6 + 2T_6 + 3T_6 + \cdots + (n-1)T_6 + nT_6$$

which can be factored[10] as

$$T_6\left[1 + 2 + 3 + \cdots + (n-1) + n\right] = T_6\left[\frac{1}{2}(n^2+n)\right]$$

The total time required to run the inner loop test (step 5) can be evaluated similarly:

$$\begin{aligned}
&2T_5 + 3T_5 + 4T_5 + \cdots + (n-1)T_5 + nT_5 + (n+1)T_5\\
=\ &T_5 + 2T_5 + 3T_5 + 4T_5 + \cdots + (n-1)T_5 + nT_5 + (n+1)T_5 - T_5
\end{aligned}$$

which can be factored as

$$\begin{aligned}
&T_5\left[1 + 2 + 3 + \cdots + (n-1) + n + (n+1)\right] - T_5\\
=\ &\left[\frac{1}{2}(n^2+n)\right]T_5 + (n+1)T_5 - T_5\\
=\ &T_5\left[\frac{1}{2}(n^2+n)\right] + nT_5\\
=\ &\left[\frac{1}{2}(n^2+3n)\right]T_5
\end{aligned}$$

Therefore, the total running time for this algorithm is:

$$f(n) = T_1 + T_2 + T_3 + T_7 + (n+1)T_4 + \left[\frac{1}{2}(n^2+n)\right]T_6 + \left[\frac{1}{2}(n^2+3n)\right]T_5$$

which reduces to

$$f(n) = \left[\frac{1}{2}(n^2+n)\right]T_6 + \left[\frac{1}{2}(n^2+3n)\right]T_5 + (n+1)T_4 + T_1 + T_2 + T_3 + T_7$$
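The execution counts used in this derivation can be checked by instrumenting the loops. The following is a sketch that rewrites the pseudocode's `for` loops as explicit tests so that the terminating test is counted too; the function name `count_step_executions` is hypothetical and the `print` in step 6 is omitted.

```python
def count_step_executions(n):
    """Count how often steps 4, 5 and 6 of the pseudocode execute for input n."""
    outer_tests = inner_tests = body_runs = 0
    i = 1
    while True:
        outer_tests += 1  # step 4: outer loop test, including the final failing test
        if i > n:
            break
        j = 1
        while True:
            inner_tests += 1  # step 5: inner loop test, including the failing test
            if j > i:
                break
            body_runs += 1  # step 6: inner loop body
            j += 1
        i += 1
    return outer_tests, inner_tests, body_runs
```

For any positive `n` the three counts should equal $n + 1$, $\frac{1}{2}(n^2 + 3n)$ and $\frac{1}{2}(n^2 + n)$ respectively, which are the multipliers of $T_4$, $T_5$ and $T_6$ in $f(n)$.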

As a rule-of-thumb, one can assume that the highest-order term in any given function dominates its rate of growth and thus defines its run-time order. In this example, $n^2$ is the highest-order term, so one can conclude that $f(n) = O(n^2)$. Formally this can be proven as follows:

Prove that

$$\left[\frac{1}{2}(n^2+n)\right]T_6 + \left[\frac{1}{2}(n^2+3n)\right]T_5 + (n+1)T_4 + T_1 + T_2 + T_3 + T_7 \leq cn^2,\ n \geq n_0$$

$$\begin{aligned}
&\left[\frac{1}{2}(n^2+n)\right]T_6 + \left[\frac{1}{2}(n^2+3n)\right]T_5 + (n+1)T_4 + T_1 + T_2 + T_3 + T_7\\
\leq\ &(n^2+n)T_6 + (n^2+3n)T_5 + (n+1)T_4 + T_1 + T_2 + T_3 + T_7\ (\text{for } n \geq 0)
\end{aligned}$$

Let $k$ be a constant greater than or equal to each of $T_1, \ldots, T_7$. Then

$$\begin{aligned}
&T_6(n^2+n) + T_5(n^2+3n) + (n+1)T_4 + T_1 + T_2 + T_3 + T_7 \leq k(n^2+n) + k(n^2+3n) + kn + 5k\\
=\ &2kn^2 + 5kn + 5k \leq 2kn^2 + 5kn^2 + 5kn^2\ (\text{for } n \geq 1) = 12kn^2
\end{aligned}$$

Therefore

$$\left[\frac{1}{2}(n^2+n)\right]T_6 + \left[\frac{1}{2}(n^2+3n)\right]T_5 + (n+1)T_4 + T_1 + T_2 + T_3 + T_7 \leq cn^2,\ n \geq n_0 \text{ for } c = 12k,\ n_0 = 1$$

A more elegant approach to analyzing this algorithm would be to declare that $T_1, \ldots, T_7$ are all equal to one unit of time, in a system of units chosen so that one unit is greater than or equal to the actual times for these steps. This would mean that the algorithm’s running time breaks down as follows:[11]

$$4 + \sum_{i=1}^{n} i \leq 4 + \sum_{i=1}^{n} n = 4 + n^2 \leq 5n^2\ (\text{for } n \geq 1) = O(n^2).$$

Growth rate analysis of other resources

The methodology of run-time analysis can also be utilized for predicting other growth rates, such as consumption of memory space. As an example, consider the following pseudocode which manages and reallocates memory usage by a program based on the size of a file which that program manages:

while file is still open:
    let n = size of file
    for every 100,000 kilobytes of increase in file size
        double the amount of memory reserved

In this instance, as the file size $n$ increases, memory will be consumed at an exponential growth rate, which is order $O(2^n)$. This is an extremely rapid and most likely unmanageable growth rate for consumption of memory resources.
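A closed-form sketch of this reservation policy makes the exponential growth visible. It assumes one doubling per completed 100,000-kilobyte step of file size; the function name `reserved_memory_kb` and the starting reservation of 1 KB are hypothetical.

```python
def reserved_memory_kb(file_size_kb, initial_kb=1, step_kb=100_000):
    """Memory reserved after the file has grown to file_size_kb."""
    doublings = file_size_kb // step_kb  # one doubling per completed step
    return initial_kb * 2 ** doublings
```

Under these assumptions, growing the file tenfold from 100,000 KB to 1,000,000 KB raises the reservation from 2 KB to 1,024 KB, a 512-fold increase.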


Algorithm analysis is important in practice because the accidental or unintentional use of an inefficient algorithm can significantly impact system performance. In time-sensitive applications, an algorithm taking too long to run can render its results outdated or useless. An inefficient algorithm can also end up requiring an uneconomical amount of computing power or storage in order to run, again rendering it practically useless.

Constant factors

Analysis of algorithms typically focuses on the asymptotic performance, particularly at the elementary level, but in practical applications constant factors are important, and real-world data is in practice always limited in size. The limit is typically the size of addressable memory, so on 32-bit machines $2^{32}$ bytes = 4 GiB (greater if segmented memory is used) and on 64-bit machines $2^{64}$ bytes = 16 EiB. Thus given a limited size, an order of growth (time or space) can be replaced by a constant factor, and in this sense all practical algorithms are $O(1)$ for a large enough constant, or for small enough data.

This interpretation is primarily useful for functions that grow extremely slowly: (binary) iterated logarithm ($\log^*$) is less than 5 for all practical data ($2^{65536}$ bits); (binary) log-log ($\log \log n$) is less than 6 for virtually all practical data ($2^{64}$ bits); and binary log ($\log n$) is less than 64 for virtually all practical data ($2^{64}$ bits). An algorithm with non-constant complexity may nonetheless be more efficient than an algorithm with constant complexity on practical data if the overhead of the constant time algorithm results in a larger constant factor, e.g., one may have $K > k \log \log n$ so long as $K/k > 6$.
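These slowly growing functions can be evaluated exactly at the boundary sizes with integer arithmetic. The sketch below uses a floor-based binary logarithm, which is exact for the powers of two used here but only an approximation otherwise; the names `floor_log2` and `iterated_log2` are hypothetical.

```python
def floor_log2(n):
    """floor(log2 n) for an integer n >= 1, computed without floating point."""
    return n.bit_length() - 1


def iterated_log2(n):
    """Number of times floor_log2 must be applied before the value drops to 1 or below."""
    count = 0
    while n > 1:
        n = floor_log2(n)
        count += 1
    return count
```

At the boundary itself the values are exactly 5 for `iterated_log2(2**65536)`, 6 for `floor_log2(floor_log2(2**64))` and 64 for `floor_log2(2**64)`, so any smaller input stays at or below those marks.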

For large data linear or quadratic factors cannot be ignored, but for small data an asymptotically inefficient algorithm may be more efficient. This is particularly used in hybrid algorithms, like Timsort, which use an asymptotically efficient algorithm (here merge sort, with time complexity $n \log n$), but switch to an asymptotically inefficient algorithm (here insertion sort, with time complexity $n^2$) for small data, as the simpler algorithm is faster on small data.

See also


  1. ^ “Knuth: Recent News”. 28 August 2016. Archived from the original on 28 August 2016.
  2. ^ Alfred V. Aho; John E. Hopcroft; Jeffrey D. Ullman (1974). The design and analysis of computer algorithms. Addison-Wesley Pub. Co., section 1.3
  3. ^ Juraj Hromkovič (2004). Theoretical computer science: introduction to Automata, computability, complexity, algorithmics, randomization, communication, and cryptography. Springer. pp. 177–178. ISBN 978-3-540-14015-3.
  4. ^ Giorgio Ausiello (1999). Complexity and approximation: combinatorial optimization problems and their approximability properties. Springer. pp. 3–8. ISBN 978-3-540-65431-5.
  5. ^ Wegener, Ingo (2005), Complexity theory: exploring the limits of efficient algorithms, Berlin, New York: Springer-Verlag, p. 20, ISBN 978-3-540-21045-0
  6. ^ Robert Endre Tarjan (1983). Data structures and network algorithms. SIAM. pp. 3–7. ISBN 978-0-89871-187-5.
  7. ^ Examples of the price of abstraction?,
  8. ^ How To Avoid O-Abuse and Bribes Archived 2017-03-08 at the Wayback Machine, at the blog “Gödel’s Lost Letter and P=NP” by R. J. Lipton, professor of Computer Science at Georgia Tech, recounting idea by Robert Sedgewick
  9. ^ However, this is not the case with a quantum computer
  10. ^ It can be proven by induction that $1 + 2 + 3 + \cdots + (n-1) + n = \frac{n(n+1)}{2}$
  11. ^ This approach, unlike the above approach, neglects the constant time consumed by the loop tests which terminate their respective loops, but it is trivial to prove that such omission does not affect the final result




” (WP)


Fair Use Sources:

Software Engineering

Big O notation

” (WP)

Big O notation is a mathematical notation that describes the limiting behavior of a function when the argument tends towards a particular value or infinity. Big O is a member of a family of notations invented by Paul Bachmann,[1] Edmund Landau,[2] and others, collectively called Bachmann–Landau notation or asymptotic notation.

In computer science, big O notation is used to classify algorithms according to how their run time or space requirements grow as the input size grows.[3] In analytic number theory, big O notation is often used to express a bound on the difference between an arithmetical function and a better understood approximation; a famous example of such a difference is the remainder term in the prime number theorem. Big O notation is also used in many other fields to provide similar estimates.

Big O notation characterizes functions according to their growth rates: different functions with the same growth rate may be represented using the same O notation. The letter O is used because the growth rate of a function is also referred to as the order of the function. A description of a function in terms of big O notation usually only provides an upper bound on the growth rate of the function.

Associated with big O notation are several related notations, using the symbols o, Ω, ω, and Θ, to describe other kinds of bounds on asymptotic growth rates.

Formal definition

Let $f$ be a real or complex valued function and $g$ a real valued function. Let both functions be defined on some unbounded subset of the positive real numbers, and $g(x)$ be strictly positive for all large enough values of $x$.[4] One writes

$$f(x) = O\bigl(g(x)\bigr) \quad \text{as } x \to \infty$$

if the absolute value of $f(x)$ is at most a positive constant multiple of $g(x)$ for all sufficiently large values of $x$. That is, $f(x) = O\bigl(g(x)\bigr)$ if there exists a positive real number $M$ and a real number $x_0$ such that

$$|f(x)| \leq M g(x) \quad \text{for all } x \geq x_0.$$

In many contexts, the assumption that we are interested in the growth rate as the variable $x$ goes to infinity is left unstated, and one writes more simply that

$$f(x) = O\bigl(g(x)\bigr).$$

The notation can also be used to describe the behavior of $f$ near some real number $a$ (often, $a = 0$): we say

$$f(x) = O\bigl(g(x)\bigr) \quad \text{as } x \to a$$

if there exist positive numbers $\delta$ and $M$ such that for all $x$ with $0 < |x - a| < \delta$,

$$|f(x)| \leq M g(x).$$

As $g(x)$ is chosen to be non-zero for values of $x$ sufficiently close to $a$, both of these definitions can be unified using the limit superior:

$$f(x) = O\bigl(g(x)\bigr) \quad \text{as } x \to a$$

if

$$\limsup_{x \to a} \frac{\left|f(x)\right|}{g(x)} < \infty.$$

In computer science, a slightly more restrictive definition is common: $f$ and $g$ are both required to be functions from the positive integers to the nonnegative real numbers; $f(x) = O\bigl(g(x)\bigr)$ if there exist positive integer numbers $M$ and $n_0$ such that $f(n) \leq M g(n)$ for all $n \geq n_0$.[5] Where necessary, finite ranges are (tacitly) excluded from $f$’s and $g$’s domain by choosing $n_0$ sufficiently large. (For example, $\log(n)$ is undefined at $n = 0$.)


In typical usage the O notation is asymptotical, that is, it refers to very large x. In this setting, the contribution of the terms that grow “most quickly” will eventually make the other ones irrelevant. As a result, the following simplification rules can be applied:

  • If f(x) is a sum of several terms, if there is one with largest growth rate, it can be kept, and all others omitted.
  • If f(x) is a product of several factors, any constants (terms in the product that do not depend on x) can be omitted.

For example, let $f(x) = 6x^4 - 2x^3 + 5$, and suppose we wish to simplify this function, using O notation, to describe its growth rate as $x$ approaches infinity. This function is the sum of three terms: $6x^4$, $-2x^3$, and $5$. Of these three terms, the one with the highest growth rate is the one with the largest exponent as a function of $x$, namely $6x^4$. Now one may apply the second rule: $6x^4$ is a product of $6$ and $x^4$ in which the first factor does not depend on $x$. Omitting this factor results in the simplified form $x^4$. Thus, we say that $f(x)$ is a “big O” of $x^4$. Mathematically, we can write $f(x) = O(x^4)$. One may confirm this calculation using the formal definition: let $f(x) = 6x^4 - 2x^3 + 5$ and $g(x) = x^4$. Applying the formal definition from above, the statement that $f(x) = O(x^4)$ is equivalent to its expansion,

$$|f(x)| \leq M x^4$$

for some suitable choice of $x_0$ and $M$ and for all $x > x_0$. To prove this, let $x_0 = 1$ and $M = 13$. Then, for all $x > x_0$:

$$\begin{aligned}
|6x^4 - 2x^3 + 5| &\leq 6x^4 + |2x^3| + 5\\
&\leq 6x^4 + 2x^4 + 5x^4\\
&= 13x^4
\end{aligned}$$

so

$$|6x^4 - 2x^3 + 5| \leq 13x^4.$$
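The choice of witnesses $M = 13$ and $x_0 = 1$ can be sanity-checked numerically. The sketch below only samples integer points, so it is a check rather than a proof; the helper name `bound_holds` and the sample range are assumptions made for illustration.

```python
def f(x):
    return 6 * x**4 - 2 * x**3 + 5


def g(x):
    return x**4


def bound_holds(M, x0, samples):
    """Check |f(x)| <= M * g(x) at every sampled x > x0 (a sanity check, not a proof)."""
    return all(abs(f(x)) <= M * g(x) for x in samples if x > x0)
```

Over the integers from 1 to 10,000, `bound_holds(13, 1, range(1, 10_001))` succeeds, whereas a constant that is too small, such as `M = 5`, already fails at `x = 2`, where `f(2) = 85` exceeds `5 * g(2) = 80`.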


Big O notation has two main areas of application: in mathematics, describing the error term of an approximation to a function, and in computer science, analyzing the efficiency of algorithms.

In both applications, the function $g(x)$ appearing within the $O(\ldots)$ is typically chosen to be as simple as possible, omitting constant factors and lower order terms.

There are two formally close, but noticeably different, usages of this notation: infinite asymptotics and infinitesimal asymptotics.

This distinction is only in application and not in principle, however: the formal definition for the “big O” is the same for both cases, only with different limits for the function argument.

Infinite asymptotics

Graphs of functions commonly used in the analysis of algorithms, showing the number of operations N versus input size n for each function

Big O notation is useful when analyzing algorithms for efficiency. For example, the time (or the number of steps) it takes to complete a problem of size $n$ might be found to be $T(n) = 4n^2 - 2n + 2$. As $n$ grows large, the $n^2$ term will come to dominate, so that all other terms can be neglected; for instance, when $n = 500$, the term $4n^2$ is 1000 times as large as the $2n$ term. Ignoring the latter would have negligible effect on the expression’s value for most purposes. Further, the coefficients become irrelevant if we compare to any other order of expression, such as an expression containing a term $n^3$ or $n^4$. Even if $T(n) = 1{,}000{,}000\,n^2$, if $U(n) = n^3$, the latter will always exceed the former once $n$ grows larger than 1,000,000 ($T(1{,}000{,}000) = 1{,}000{,}000^3 = U(1{,}000{,}000)$). Additionally, the number of steps depends on the details of the machine model on which the algorithm runs, but different types of machines typically vary by only a constant factor in the number of steps needed to execute an algorithm. So the big O notation captures what remains: we write either

$$T(n) = O(n^2)$$

or

$$T(n) \in O(n^2)$$

and say that the algorithm has order of $n^2$ time complexity. The sign “=” is not meant to express “is equal to” in its normal mathematical sense, but rather a more colloquial “is”, so the second expression is sometimes considered more accurate (see the “Equals sign” discussion below) while the first is considered by some as an abuse of notation.[6]
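The two numeric claims in this passage can be reproduced with exact integer arithmetic. This is a small sketch; the function names are hypothetical labels for the terms discussed above.

```python
def quadratic_term(n):
    return 4 * n**2


def linear_term(n):
    return 2 * n


def million_n_squared(n):
    return 1_000_000 * n**2


def n_cubed(n):
    return n**3
```

At `n = 500` the quadratic term is exactly 1000 times the linear term, and `million_n_squared` and `n_cubed` meet at `n = 1_000_000`, after which the cubic function is always larger.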

Infinitesimal asymptotics

Big O can also be used to describe the error term in an approximation to a mathematical function. The most significant terms are written explicitly, and then the least-significant terms are summarized in a single big O term. Consider, for example, the exponential series and two expressions of it that are valid when $x$ is small:

$$\begin{aligned}
e^x &= 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \dotsb & \text{for all } x\\
&= 1 + x + \frac{x^2}{2} + O(x^3) & \text{as } x \to 0\\
&= 1 + x + O(x^2) & \text{as } x \to 0
\end{aligned}$$

The second expression (the one with $O(x^3)$) means the absolute value of the error $e^x - (1 + x + x^2/2)$ is at most some constant times $|x^3|$ when $x$ is close enough to 0.
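The $O(x^3)$ error term can be observed numerically by dividing the error by $|x|^3$ and watching the ratio stay bounded as $x$ shrinks. The helper name `error_ratio` is hypothetical, and the sample points are kept moderately small so that floating-point rounding does not dominate the result.

```python
import math


def error_ratio(x):
    """|e^x - (1 + x + x^2/2)| divided by |x|^3."""
    approximation = 1 + x + x**2 / 2
    return abs(math.exp(x) - approximation) / abs(x) ** 3
```

For `x` equal to 0.1, 0.01 and 0.001 the ratio stays close to 1/6, the coefficient of the first omitted term $x^3/3!$, which is consistent with the error being bounded by a constant times $|x^3|$.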


If the function f can be written as a finite sum of other functions, then the fastest growing one determines the order of f(n). For example,

$$f(n) = 9\log n + 5(\log n)^4 + 3n^2 + 2n^3 = O(n^3) \qquad \text{as } n \to \infty.$$

In particular, if a function may be bounded by a polynomial in n, then as n tends to infinity, one may disregard lower-order terms of the polynomial. The sets $O(n^c)$ and $O(c^n)$ are very different. If c is greater than one, then the latter grows much faster. A function that grows faster than $n^c$ for any c is called superpolynomial. One that grows more slowly than any exponential function of the form $c^n$ is called subexponential. An algorithm can require time that is both superpolynomial and subexponential; examples of this include the fastest known algorithms for integer factorization and the function $n^{\log n}$.

We may ignore any powers of n inside of the logarithms. The set $O(\log n)$ is exactly the same as $O(\log(n^c))$. The logarithms differ only by a constant factor (since $\log(n^c) = c \log n$) and thus the big O notation ignores that. Similarly, logs with different constant bases are equivalent. On the other hand, exponentials with different bases are not of the same order. For example, $2^n$ and $3^n$ are not of the same order.
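The contrast between logarithms and exponentials with different bases can be seen in a short Python sketch (the helper names `log_ratio` and `exp_ratio` are hypothetical, introduced only for this illustration). The ratio of two logarithms with different bases is the same constant for every n, while the ratio of 3^n to 2^n keeps growing.

```python
import math


def log_ratio(n: float, base_a: float = 2.0, base_b: float = 10.0) -> float:
    """log_a(n) / log_b(n); equals ln(base_b) / ln(base_a) for every n > 1."""
    return math.log(n, base_a) / math.log(n, base_b)


def exp_ratio(n: int) -> float:
    """3^n / 2^n = 1.5^n; grows without bound as n increases."""
    return 3**n / 2**n


if __name__ == "__main__":
    for n in (10, 1_000, 1_000_000):
        print(n, log_ratio(n))   # same constant each time
    for n in (10, 20, 40):
        print(n, exp_ratio(n))   # strictly increasing
```

Because the first ratio is constant, O(log_2 n) and O(log_10 n) describe the same set; because the second ratio is unbounded, 3^n is not O(2^n).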

Changing units may or may not affect the order of the resulting algorithm. Changing units is equivalent to multiplying the appropriate variable by a constant wherever it appears. For example, if an algorithm runs in the order of $n^2$, replacing n by cn means the algorithm runs in the order of $c^2 n^2$, and the big O notation ignores the constant $c^2$. This can be written as $c^2 n^2 = O(n^2)$. If, however, an algorithm runs in the order of $2^n$, replacing n with cn gives $2^{cn} = (2^c)^n$. This is not equivalent to $2^n$ in general. Changing variables may also affect the order of the resulting algorithm. For example, if an algorithm’s run time is O(n) when measured in terms of the number n of digits of an input number x, then its run time is O(log x) when measured as a function of the input number x itself, because n = O(log x).


$$f_1 = O(g_1) \text{ and } f_2 = O(g_2) \Rightarrow f_1 f_2 = O(g_1 g_2)$$

$$f \cdot O(g) = O(fg)$$


If $f_1 = O(g_1)$ and $f_2 = O(g_2)$ then $f_1 + f_2 = O(\max(g_1, g_2))$. It follows that if $f_1 = O(g)$ and $f_2 = O(g)$ then $f_1 + f_2 \in O(g)$. In other words, this second statement says that $O(g)$ is a convex cone.

Multiplication by a constant

Let k be constant. Then $O(|k| \cdot g) = O(g)$ if k is nonzero. In other words, if $f = O(g)$, then $k \cdot f = O(g)$.

Multiple variables

Big O (and little o, Ω, etc.) can also be used with multiple variables. To define big O formally for multiple variables, suppose $f$ and $g$ are two functions defined on some subset of $\mathbb{R}^n$. We say

$$f(\mathbf{x}) \text{ is } O(g(\mathbf{x})) \quad \text{as } \mathbf{x} \to \infty$$

if and only if[7]

$$\exists M\, \exists C > 0 \text{ such that for all } \mathbf{x} \text{ with } x_i \geq M \text{ for some } i,\ |f(\mathbf{x})| \leq C |g(\mathbf{x})|.$$

Equivalently, the condition that $x_i \geq M$ for some $i$ can be replaced with the condition that $\|\mathbf{x}\|_\infty \geq M$, where $\|\mathbf{x}\|_\infty$ denotes the Chebyshev norm. For example, the statement

$$f(n, m) = n^2 + m^3 + O(n + m) \quad \text{as } n, m \to \infty$$

asserts that there exist constants C and M such that

$$\forall\, \|(n, m)\|_\infty \geq M: \quad |g(n, m)| \leq C |n + m|$$

where g(n,m) is defined by

$$f(n, m) = n^2 + m^3 + g(n, m).$$

This definition allows all of the coordinates of $\mathbf{x}$ to increase to infinity. In particular, the statement

$$f(n, m) = O(n^m) \quad \text{as } n, m \to \infty$$

(i.e., $\exists C\, \exists M\, \forall n\, \forall m \dots$) is quite different from

$$\forall m\colon\ f(n, m) = O(n^m) \quad \text{as } n \to \infty$$

(i.e., $\forall m\, \exists C\, \exists M\, \forall n \dots$).

Under this definition, the subset on which a function is defined is significant when generalizing statements from the univariate setting to the multivariate setting. For example, if $f(n, m) = 1$ and $g(n, m) = n$, then $f(n, m) = O(g(n, m))$ if we restrict $f$ and $g$ to $[1, \infty)^2$, but not if they are defined on $[0, \infty)^2$.

This is not the only generalization of big O to multivariate functions, and in practice, there is some inconsistency in the choice of definition.[8]

Matters of notation

Equals sign

The statement “f(x) is O(g(x))” as defined above is usually written as f(x) = O(g(x)). Some consider this to be an abuse of notation, since the use of the equals sign could be misleading as it suggests a symmetry that this statement does not have. As de Bruijn says, $O(x) = O(x^2)$ is true but $O(x^2) = O(x)$ is not.[9] Knuth describes such statements as “one-way equalities”, since if the sides could be reversed, “we could deduce ridiculous things like $n = n^2$ from the identities $n = O(n^2)$ and $n^2 = O(n^2)$.”[10]

For these reasons, it would be more precise to use set notation and write f(x) ∈ O(g(x)) (read as: “f(x) is an element of O(g(x))”, or “f(x) is in the set O(g(x))”), thinking of O(g(x)) as the class of all functions h(x) such that |h(x)| ≤ C|g(x)| for some constant C.[10] However, the use of the equals sign is customary.[citation needed]

Other arithmetic operators

Big O notation can also be used in conjunction with other arithmetic operators in more complicated equations. For example, h(x) + O(f(x)) denotes the collection of functions having the growth of h(x) plus a part whose growth is limited to that of f(x). Thus,

$$g(x) = h(x) + O(f(x))$$

expresses the same as

$$g(x) - h(x) = O(f(x)).$$


Suppose an algorithm is being developed to operate on a set of n elements. Its developers are interested in finding a function T(n) that will express how long the algorithm will take to run (in some arbitrary measurement of time) in terms of the number of elements in the input set. The algorithm works by first calling a subroutine to sort the elements in the set and then performing its own operations. The sort has a known time complexity of $O(n^2)$, and after the subroutine runs the algorithm must take an additional $55n^3 + 2n + 10$ steps before it terminates. Thus the overall time complexity of the algorithm can be expressed as $T(n) = 55n^3 + O(n^2)$. Here the terms 2n + 10 are subsumed within the faster-growing $O(n^2)$. Again, this usage disregards some of the formal meaning of the “=” symbol, but it does allow one to use the big O notation as a kind of convenient placeholder.

Multiple uses

In more complicated usage, O(…) can appear in different places in an equation, even several times on each side. For example, the following are true for $n \to \infty$:

$$\begin{aligned}
(n + 1)^2 &= n^2 + O(n), \\
(n + O(n^{1/2})) \cdot (n + O(\log n))^2 &= n^3 + O(n^{5/2}), \\
n^{O(1)} &= O(e^n).
\end{aligned}$$

The meaning of such statements is as follows: for any functions which satisfy each O(…) on the left side, there are some functions satisfying each O(…) on the right side, such that substituting all these functions into the equation makes the two sides equal. For example, the third equation above means: “For any function f(n) = O(1), there is some function $g(n) = O(e^n)$ such that $n^{f(n)} = g(n)$.” In terms of the “set notation” above, the meaning is that the class of functions represented by the left side is a subset of the class of functions represented by the right side. In this use the “=” is a formal symbol that, unlike the usual use of “=”, is not a symmetric relation. Thus, for example, $n^{O(1)} = O(e^n)$ does not imply the false statement $O(e^n) = n^{O(1)}$.


Big O is typeset as an italicized uppercase “O”, as in the following example: $O(n^2)$.[11][12] In TeX, it is produced by simply typing O inside math mode. Unlike Greek-named Bachmann–Landau notations, it needs no special symbol. Yet, some authors use the calligraphic variant $\mathcal{O}$ instead.[13][14]

Orders of common functions

Further information: Time complexity § Table of common time complexities

Here is a list of classes of functions that are commonly encountered when analyzing the running time of an algorithm. In each case, c is a positive constant and n increases without bound. The slower-growing functions are generally listed first.

| Notation | Name | Example |
|---|---|---|
| $O(1)$ | constant | Determining if a binary number is even or odd; calculating $(-1)^n$; using a constant-size lookup table |
| $O(\log \log n)$ | double logarithmic | Number of comparisons spent finding an item using interpolation search in a sorted array of uniformly distributed values |
| $O(\log n)$ | logarithmic | Finding an item in a sorted array with a binary search or a balanced search tree, as well as all operations in a binomial heap |
| $O((\log n)^c)$, $c > 1$ | polylogarithmic | Matrix chain ordering can be solved in polylogarithmic time on a parallel random-access machine. |
| $O(n^c)$, $0 < c < 1$ | fractional power | Searching in a k-d tree |
| $O(n)$ | linear | Finding an item in an unsorted list or in an unsorted array; adding two n-bit integers by ripple carry |
| $O(n \log^* n)$ | log-star n | Performing triangulation of a simple polygon using Seidel’s algorithm, or the union–find algorithm. Note that $\log^*(n) = \begin{cases} 0, & \text{if } n \leq 1 \\ 1 + \log^*(\log n), & \text{if } n > 1 \end{cases}$ |
| $O(n \log n) = O(\log n!)$ | linearithmic, loglinear, quasilinear, or “n log n” | Performing a fast Fourier transform; fastest possible comparison sort; heapsort and merge sort |
| $O(n^2)$ | quadratic | Multiplying two n-digit numbers by a simple algorithm; simple sorting algorithms, such as bubble sort, selection sort and insertion sort; (worst case) bound on some usually faster sorting algorithms such as quicksort, Shellsort, and tree sort |
| $O(n^c)$ | polynomial or algebraic | Tree-adjoining grammar parsing; maximum matching for bipartite graphs; finding the determinant with LU decomposition |
| $L_n[\alpha, c] = e^{(c + o(1)) (\ln n)^{\alpha} (\ln \ln n)^{1 - \alpha}}$, $0 < \alpha < 1$ | L-notation or sub-exponential | Factoring a number using the quadratic sieve or number field sieve |
| $O(c^n)$, $c > 1$ | exponential | Finding the (exact) solution to the travelling salesman problem using dynamic programming; determining if two logical statements are equivalent using brute-force search |
| $O(n!)$ | factorial | Solving the travelling salesman problem via brute-force search; generating all unrestricted permutations of a poset; finding the determinant with Laplace expansion; enumerating all partitions of a set |
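To make two of the classes listed above concrete, here is a minimal Python sketch that counts loop iterations for a linear scan (O(n)) and for a binary search (O(log n)) over a sorted list. The function names and the step-counting approach are assumptions made for this illustration; they are not taken from the source.

```python
from typing import Sequence


def linear_search_steps(items: Sequence[int], target: int) -> int:
    """Number of elements inspected by a left-to-right scan."""
    steps = 0
    for value in items:
        steps += 1
        if value == target:
            break
    return steps


def binary_search_steps(items: Sequence[int], target: int) -> int:
    """Number of probes made by binary search on a sorted sequence."""
    lo, hi = 0, len(items) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if items[mid] == target:
            break
        if items[mid] < target:
            lo = mid + 1   # target can only be in the upper half
        else:
            hi = mid - 1   # target can only be in the lower half
    return steps


if __name__ == "__main__":
    data = list(range(1024))
    print(linear_search_steps(data, 1023))   # inspects every element
    print(binary_search_steps(data, 1023))   # about log2(1024) probes
```

Doubling the input size doubles the worst-case work of the scan but adds only about one probe to the binary search, which is the practical meaning of the gap between the linear and logarithmic rows.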

The statement $f(n) = O(n!)$ is sometimes weakened to $f(n) = O(n^n)$ to derive simpler formulas for asymptotic complexity. For any $k > 0$ and $c > 0$, $O(n^c (\log n)^k)$ is a subset of $O(n^{c + \varepsilon})$ for any $\varepsilon > 0$, so may be considered as a polynomial with some bigger order.

Related asymptotic notations

Big O is widely used in computer science. Together with some other related notations it forms the family of Bachmann–Landau notations.[citation needed]

Little-o notation


Intuitively, the assertion “f(x) is o(g(x))” (read “f(x) is little-o of g(x)”) means that g(x) grows much faster than f(x). Let as before f be a real or complex valued function and g a real valued function, both defined on some unbounded subset of the positive real numbers, such that g(x) is strictly positive for all large enough values of x. One writes

$$f(x) = o(g(x)) \quad \text{as } x \to \infty$$

if for every positive constant ε there exists a constant N such that

$$|f(x)| \leq \varepsilon g(x) \quad \text{for all } x \geq N.$$[15]

For example, one has $2x = o(x^2)$ and $1/x = o(1)$.

The difference between the earlier definition for the big-O notation and the present definition of little-o is that while the former has to be true for at least one constant M, the latter must hold for every positive constant ε, however small.[16] In this way, little-o notation makes a stronger statement than the corresponding big-O notation: every function that is little-o of g is also big-O of g, but not every function that is big-O of g is also little-o of g. For example, $2x^2 = O(x^2)$ but $2x^2 \neq o(x^2)$.

As g(x) is nonzero, or at least becomes nonzero beyond a certain point, the relation $f(x) = o(g(x))$ is equivalent to

$$\lim_{x \to \infty} \frac{f(x)}{g(x)} = 0$$

(and this is in fact how Landau[15] originally defined the little-o notation).
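The "for every ε there exists N" structure of the little-o definition can be illustrated with the example 2x = o(x^2). In the Python sketch below (the names `threshold` and `ratio` are hypothetical), the choice N = 2/ε works because 2x ≤ εx^2 is equivalent to x ≥ 2/ε for positive x. By contrast, the ratio for 2x^2 against x^2 stays at 2 and never drops below a small ε.

```python
from typing import Callable


def threshold(eps: float) -> float:
    """An N such that 2x <= eps * x^2 for all x >= N (namely N = 2 / eps)."""
    return 2.0 / eps


def ratio(f: Callable[[float], float], g: Callable[[float], float], x: float) -> float:
    """|f(x)| / g(x), assuming g(x) > 0."""
    return abs(f(x)) / g(x)


if __name__ == "__main__":
    for eps in (1.0, 0.01, 0.0001):
        n_big = threshold(eps)
        x = 10.0 * n_big  # a point comfortably beyond the threshold
        print(eps, n_big, ratio(lambda t: 2 * t, lambda t: t * t, x) <= eps)
    # 2x^2 is O(x^2) but not o(x^2): the ratio is the constant 2.
    print(ratio(lambda t: 2 * t * t, lambda t: t * t, 1e6))
```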

Little-o respects a number of arithmetic operations. For example, if c is a nonzero constant and $f = o(g)$ then $c \cdot f = o(g)$, and if $f = o(F)$ and $g = o(G)$ then $f \cdot g = o(F \cdot G)$.

It also satisfies a transitivity relation: if $f = o(g)$ and $g = o(h)$ then $f = o(h)$.

Big Omega notation

Another asymptotic notation is $\Omega$, read “big omega”.[17] There are two widespread and incompatible definitions of the statement $f(x) = \Omega(g(x))$ as $x \to a$,

where a is some real number, ∞, or −∞, where f and g are real functions defined in a neighbourhood of a, and where g is positive in this neighbourhood.

The Hardy–Littlewood definition is used mainly in analytic number theory, and the Knuth definition mainly in computational complexity theory; the definitions are not equivalent.

The Hardy–Littlewood definition

In 1914 Godfrey Harold Hardy and John Edensor Littlewood introduced the new symbol $\Omega$,[18] which is defined as follows: $f(x) = \Omega(g(x))$ as $x \to \infty$ if

$$\limsup_{x \to \infty} \left| \frac{f(x)}{g(x)} \right| > 0.$$

Thus $f(x) = \Omega(g(x))$ is the negation of $f(x) = o(g(x))$.

In 1916 the same authors introduced the two new symbols $\Omega_R$ and $\Omega_L$, defined as:[19]

$f(x) = \Omega_R(g(x))$ as $x \to \infty$ if $\limsup_{x \to \infty} \frac{f(x)}{g(x)} > 0$;

$f(x) = \Omega_L(g(x))$ as $x \to \infty$ if $\liminf_{x \to \infty} \frac{f(x)}{g(x)} < 0$.

These symbols were used by Edmund Landau, with the same meanings, in 1924.[20] After Landau, the notations were never used again exactly thus; $\Omega_R$ became $\Omega_+$ and $\Omega_L$ became $\Omega_-$.[citation needed]

These three symbols $\Omega, \Omega_+, \Omega_-$, as well as $f(x) = \Omega_\pm(g(x))$ (meaning that $f(x) = \Omega_+(g(x))$ and $f(x) = \Omega_-(g(x))$ are both satisfied), are currently used in analytic number theory.[21][22]

Simple examples

We have $\sin x = \Omega(1)$ as $x \to \infty$,

and more precisely $\sin x = \Omega_\pm(1)$ as $x \to \infty$.

We have $\sin x + 1 = \Omega(1)$ as $x \to \infty$,

and more precisely $\sin x + 1 = \Omega_+(1)$ as $x \to \infty$;

however $\sin x + 1 \neq \Omega_-(1)$ as $x \to \infty$.

The Knuth definition

In 1976 Donald Knuth published a paper to justify his use of the $\Omega$-symbol to describe a stronger property.[23] Knuth wrote: “For all the applications I have seen so far in computer science, a stronger requirement … is much more appropriate”. He defined

$$f(x) = \Omega(g(x)) \Leftrightarrow g(x) = O(f(x))$$

with the comment: “Although I have changed Hardy and Littlewood’s definition of $\Omega$, I feel justified in doing so because their definition is by no means in wide use, and because there are other ways to say what they want to say in the comparatively rare cases when their definition applies.”[23]

Family of Bachmann–Landau notations

| Notation | Name[23] | Description | Formal definition | Limit definition[24][25][26][23][18] |
|---|---|---|---|---|
| $f(n) = O(g(n))$ | Big O; Big Oh; Big Omicron | $\lvert f \rvert$ is bounded above by g (up to constant factor) asymptotically | $\exists k > 0\, \exists n_0\, \forall n > n_0\colon \lvert f(n) \rvert \leq k \cdot g(n)$ | $\limsup_{n \to \infty} \frac{\lvert f(n) \rvert}{g(n)} < \infty$ |
| $f(n) = \Theta(g(n))$ | Big Theta | f is bounded both above and below by g asymptotically | $\exists k_1 > 0\, \exists k_2 > 0\, \exists n_0\, \forall n > n_0\colon k_1 \cdot g(n) \leq f(n) \leq k_2 \cdot g(n)$ | $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$ (Knuth version) |
| $f(n) = \Omega(g(n))$ | Big Omega in complexity theory (Knuth) | f is bounded below by g asymptotically | $\exists k > 0\, \exists n_0\, \forall n > n_0\colon f(n) \geq k \cdot g(n)$ | $\liminf_{n \to \infty} \frac{f(n)}{g(n)} > 0$ |
| $f(n) = o(g(n))$ | Small O; Small Oh | f is dominated by g asymptotically | $\forall k > 0\, \exists n_0\, \forall n > n_0\colon \lvert f(n) \rvert < k \cdot g(n)$ | $\lim_{n \to \infty} \frac{\lvert f(n) \rvert}{g(n)} = 0$ |
| $f(n) \sim g(n)$ | On the order of | f is equal to g asymptotically | $\forall \varepsilon > 0\, \exists n_0\, \forall n > n_0\colon \left\lvert \frac{f(n)}{g(n)} - 1 \right\rvert < \varepsilon$ | $\lim_{n \to \infty} \frac{f(n)}{g(n)} = 1$ |
| $f(n) = \omega(g(n))$ | Small Omega | f dominates g asymptotically | $\forall k > 0\, \exists n_0\, \forall n > n_0\colon \lvert f(n) \rvert > k \cdot \lvert g(n) \rvert$ | $\lim_{n \to \infty} \frac{\lvert f(n) \rvert}{g(n)} = \infty$ |
| $f(n) = \Omega(g(n))$ | Big Omega in number theory (Hardy–Littlewood) | $\lvert f \rvert$ is not dominated by g asymptotically | $\exists k > 0\, \forall n_0\, \exists n > n_0\colon \lvert f(n) \rvert \geq k \cdot g(n)$ | $\limsup_{n \to \infty} \frac{\lvert f(n) \rvert}{g(n)} > 0$ |

The table is (partly) sorted from smallest to largest, in the sense that $o, O, \Theta, \sim,$ (Knuth’s version of) $\Omega, \omega$ on functions correspond to $<, \leq, \approx, =, \geq, >$ on the real line[26] (the Hardy–Littlewood version of $\Omega$, however, doesn’t correspond to any such description).

Computer science uses the big $O$, big Theta $\Theta$, little $o$, little omega $\omega$ and Knuth’s big Omega $\Omega$ notations.[27] Analytic number theory often uses the big $O$, small $o$, Hardy–Littlewood’s big Omega $\Omega$ (with or without the +, − or ± subscripts) and $\sim$ notations.[21] The small omega $\omega$ notation is not used as often in analysis.[28]

Use in computer science

Further information: Analysis of algorithms

Informally, especially in computer science, the big O notation often can be used somewhat differently to describe an asymptotic tight bound where using big Theta Θ notation might be more factually appropriate in a given context.[citation needed] For example, when considering a function $T(n) = 73n^3 + 22n^2 + 58$, all of the following are generally acceptable, but tighter bounds (such as numbers 2 and 3 below) are usually strongly preferred over looser bounds (such as number 1 below).

  1. $T(n) = O(n^{100})$
  2. $T(n) = O(n^3)$
  3. $T(n) = \Theta(n^3)$

The equivalent English statements are respectively:

  1. T(n) grows asymptotically no faster than $n^{100}$
  2. T(n) grows asymptotically no faster than $n^3$
  3. T(n) grows asymptotically as fast as $n^3$.

So while all three statements are true, progressively more information is contained in each. In some fields, however, the big O notation (number 2 in the lists above) would be used more commonly than the big Theta notation (items numbered 3 in the lists above). For example, if T(n) represents the running time of a newly developed algorithm for input size n, the inventors and users of the algorithm might be more inclined to put an upper asymptotic bound on how long it will take to run without making an explicit statement about the lower asymptotic bound.
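The three statements about T(n) = 73n^3 + 22n^2 + 58 can be spot-checked with explicit constants. The Python sketch below is illustrative only: the witness constants 73 and 153 and the helper names are assumptions chosen for this example (153 = 73 + 22 + 58 works because n^2 ≤ n^3 and 1 ≤ n^3 for n ≥ 1), and a finite check is evidence for, not a proof of, the asymptotic claim.

```python
def T(n: int) -> int:
    """Example function from the text: T(n) = 73n^3 + 22n^2 + 58."""
    return 73 * n**3 + 22 * n**2 + 58


def theta_n3_holds(n_max: int, k1: int = 73, k2: int = 153) -> bool:
    """Check k1 * n^3 <= T(n) <= k2 * n^3 for 1 <= n <= n_max."""
    return all(k1 * n**3 <= T(n) <= k2 * n**3 for n in range(1, n_max + 1))


def loose_bound_holds(n_max: int) -> bool:
    """Check the much looser bound T(n) <= n^100 for 2 <= n <= n_max."""
    return all(T(n) <= n**100 for n in range(2, n_max + 1))


if __name__ == "__main__":
    print(theta_n3_holds(2000))     # two-sided bound: Theta(n^3)
    print(loose_bound_holds(2000))  # true, but far less informative
```

Both checks pass, yet only the first pins down the growth rate from both sides, which is why the Θ statement carries the most information.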

Other notation

In their book Introduction to Algorithms, Cormen, Leiserson, Rivest and Stein consider the set of functions f which satisfy

$$f(n) = O(g(n)) \quad (n \to \infty).$$

In a correct notation this set can, for instance, be called O(g), where

$$O(g) = \{ f : \text{there exist positive constants } c \text{ and } n_0 \text{ such that } 0 \leq f(n) \leq c\,g(n) \text{ for all } n \geq n_0 \}.$$[29]

The authors state that the use of equality operator (=) to denote set membership rather than the set membership operator (∈) is an abuse of notation, but that doing so has advantages.[6] Inside an equation or inequality, the use of asymptotic notation stands for an anonymous function in the set O(g), which eliminates lower-order terms, and helps to reduce inessential clutter in equations, for example:[30]

$$2n^2 + 3n + 1 = 2n^2 + O(n).$$

Extensions to the Bachmann–Landau notations

Another notation sometimes used in computer science is Õ (read soft-O): f(n) = Õ(g(n)) is shorthand for $f(n) = O(g(n) \log^k g(n))$ for some k.[31] Essentially, it is big O notation, ignoring logarithmic factors because the growth-rate effects of some other super-logarithmic function indicate a growth-rate explosion for large-sized input parameters that is more important to predicting bad run-time performance than the finer-point effects contributed by the logarithmic-growth factor(s). This notation is often used to obviate the “nitpicking” within growth-rates that are stated as too tightly bounded for the matters at hand (since $\log^k n$ is always $o(n^\varepsilon)$ for any constant k and any ε > 0).

Also the L notation, defined as

$$L_n[\alpha, c] = e^{(c + o(1)) (\ln n)^{\alpha} (\ln \ln n)^{1 - \alpha}}$$

is convenient for functions that are between polynomial and exponential in terms of $\ln n$.

Generalizations and related usages

The generalization to functions taking values in any normed vector space is straightforward (replacing absolute values by norms), where f and g need not take their values in the same space. A generalization to functions g taking values in any topological group is also possible[citation needed]. The “limiting process” $x \to x_o$ can also be generalized by introducing an arbitrary filter base, i.e. to directed nets f and g. The o notation can be used to define derivatives and differentiability in quite general spaces, and also (asymptotical) equivalence of functions,

$$f \sim g \iff (f - g) \in o(g)$$

which is an equivalence relation and a more restrictive notion than the relationship “f is Θ(g)” from above. (It reduces to lim f / g = 1 if f and g are positive real valued functions.) For example, 2x is Θ(x), but 2x − x is not o(x).

History (Bachmann–Landau, Hardy, and Vinogradov notations)

The symbol O was first introduced by number theorist Paul Bachmann in 1894, in the second volume of his book Analytische Zahlentheorie (“analytic number theory”).[1] The number theorist Edmund Landau adopted it, and was thus inspired to introduce in 1909 the notation o;[2] hence both are now called Landau symbols. These notations were used in applied mathematics during the 1950s for asymptotic analysis.[32] The symbol $\Omega$ (in the sense “is not an o of”) was introduced in 1914 by Hardy and Littlewood.[18] Hardy and Littlewood also introduced in 1916 the symbols $\Omega_R$ (“right”) and $\Omega_L$ (“left”),[19] precursors of the modern symbols $\Omega_+$ (“is not smaller than a small o of”) and $\Omega_-$ (“is not larger than a small o of”). Thus the Omega symbols (with their original meanings) are sometimes also referred to as “Landau symbols”. This notation $\Omega$ became commonly used in number theory at least since the 1950s.[33] In the 1970s the big O was popularized in computer science by Donald Knuth, who introduced the related Theta notation, and proposed a different definition for the Omega notation.[23]

Landau never used the big Theta and small omega symbols.

Hardy’s symbols were (in terms of the modern O notation)

$$f \preccurlyeq g \iff f \in O(g) \quad \text{and} \quad f \prec g \iff f \in o(g);$$

(Hardy however never defined or used the notation $\prec\!\!\prec$, nor $\ll$, as it has been sometimes reported). Hardy introduced the symbols $\preccurlyeq$ and $\prec$ (as well as some other symbols) in his 1910 tract “Orders of Infinity”, and made use of them only in three papers (1910–1913). In his nearly 400 remaining papers and books he consistently used the Landau symbols O and o.

Hardy’s notation is not used anymore. On the other hand, in the 1930s,[34] the Russian number theorist Ivan Matveyevich Vinogradov introduced his notation $\ll$, which has been increasingly used in number theory instead of the $O$ notation. We have

$$f \ll g \iff f \in O(g),$$

and frequently both notations are used in the same paper.

The big-O originally stands for “order of” (“Ordnung”, Bachmann 1894), and is thus a Latin letter. Neither Bachmann nor Landau ever called it “Omicron”. The symbol was much later (1976) viewed by Knuth as a capital omicron,[23] probably in reference to his definition of the symbol Omega. The digit zero should not be used.

References and notes

  1. a b Bachmann, Paul (1894). Analytische Zahlentheorie [Analytic Number Theory] (in German). 2. Leipzig: Teubner.
  2. a b Landau, Edmund (1909). Handbuch der Lehre von der Verteilung der Primzahlen [Handbook on the theory of the distribution of the primes] (in German). Leipzig: B. G. Teubner. p. 883.
  3. ^ Mohr, Austin. “Quantum Computing in Complexity Theory and Theory of Computation” (PDF). p. 2. Retrieved 7 June 2014.
  4. ^ Landau, Edmund (1909). Handbuch der Lehre von der Verteilung der Primzahlen [Handbook on the theory of the distribution of the primes] (in German). Leipzig: B.G. Teubner. p. 31.
  5. ^ Michael Sipser (1997). Introduction to the Theory of Computation. Boston/MA: PWS Publishing Co. Here: Def.7.2, p.227
  6. a b Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L. (2009). Introduction to Algorithms (3rd ed.). Cambridge/MA: MIT Press. p. 45. ISBN 978-0-262-53305-8. “Because Θ(g(n)) is a set, we could write ‘f(n) ∈ Θ(g(n))’ to indicate that f(n) is a member of Θ(g(n)). Instead, we will usually write f(n) = Θ(g(n)) to express the same notion. You might be confused because we abuse equality in this way, but we shall see later in this section that doing so has its advantages.”
  7. ^ Cormen, Thomas; Leiserson, Charles; Rivest, Ronald; Stein, Clifford (2009). Introduction to Algorithms (Third ed.). MIT. p. 53.
  8. ^ Howell, Rodney. “On Asymptotic Notation with Multiple Variables” (PDF). Retrieved 2015-04-23.
  9. ^ N. G. de Bruijn (1958). Asymptotic Methods in Analysis. Amsterdam: North-Holland. pp. 5–7. ISBN 978-0-486-64221-5.
  10. a b Graham, Ronald; Knuth, Donald; Patashnik, Oren (1994). Concrete Mathematics (2 ed.). Reading, Massachusetts: Addison–Wesley. p. 446. ISBN 978-0-201-55802-9.
  11. ^ Donald E. Knuth, The art of computer programming. Vol. 1. Fundamental algorithms, third edition, Addison Wesley Longman, 1997. Section
  12. ^ Ronald L. Graham, Donald E. Knuth, and Oren Patashnik, Concrete Mathematics: A Foundation for Computer Science (2nd ed.), Addison-Wesley, 1994. Section 9.2, p. 443.
  13. ^ Sivaram Ambikasaran and Eric Darve, An $\mathcal{O}(N \log N)$ Fast Direct Solver for Partial Hierarchically Semi-Separable Matrices, J. Scientific Computing 57 (2013), no. 3, 477–501.
  14. ^ Saket Saurabh and Meirav Zehavi, $(k, n-k)$-Max-Cut: An $\mathcal{O}^{*}(2^{p})$-Time Algorithm and a Polynomial Kernel, Algorithmica 80 (2018), no. 12, 3844–3860.
  15. a b Landau, Edmund (1909). Handbuch der Lehre von der Verteilung der Primzahlen [Handbook on the theory of the distribution of the primes] (in German). Leipzig: B. G. Teubner. p. 61.
  16. ^ Thomas H. Cormen et al., 2001, Introduction to Algorithms, Second Edition, Ch. 3.1
  17. ^ Cormen TH, Leiserson CE, Rivest RL, Stein C (2009). Introduction to algorithms (3rd ed.). Cambridge, Mass.: MIT Press. p. 48. ISBN 978-0-262-27083-0. OCLC 676697295.
  18. a b c Hardy, G. H.; Littlewood, J. E. (1914). “Some problems of diophantine approximation: Part II. The trigonometrical series associated with the elliptic ϑ-functions”. Acta Mathematica. 37: 225. doi:10.1007/BF02401834.
  19. a b G. H. Hardy and J. E. Littlewood, « Contribution to the theory of the Riemann zeta-function and the theory of the distribution of primes », Acta Mathematica, vol. 41, 1916.
  20. ^ E. Landau, “Über die Anzahl der Gitterpunkte in gewissen Bereichen. IV.” Nachr. Gesell. Wiss. Gött. Math-phys. Kl. 1924, 137–150.
  21. a b Aleksandar Ivić. The Riemann zeta-function, chapter 9. John Wiley & Sons 1985.
  22. ^ Gérald Tenenbaum, Introduction to analytic and probabilistic number theory, Chapter I.5. American Mathematical Society, Providence RI, 2015.
  23. a b c d e f Knuth, Donald (April–June 1976). “Big Omicron and big Omega and big Theta” (PDF). SIGACT News: 18–24.
  24. ^ Balcázar, José L.; Gabarró, Joaquim. “Nonuniform complexity classes specified by lower and upper bounds” (PDF). RAIRO – Theoretical Informatics and Applications – Informatique Théorique et Applications. 23 (2): 180. ISSN 0988-3754. Retrieved 14 March 2017.
  25. ^ Cucker, Felipe; Bürgisser, Peter (2013). “A.1 Big Oh, Little Oh, and Other Comparisons”. Condition: The Geometry of Numerical Algorithms. Berlin, Heidelberg: Springer. pp. 467–468. doi:10.1007/978-3-642-38896-5. ISBN 978-3-642-38896-5.
  26. a b Vitányi, Paul; Meertens, Lambert (April 1985). “Big Omega versus the wild functions” (PDF). ACM SIGACT News. 16 (4): 56–59. CiteSeerX
  27. ^ Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001) [1990]. Introduction to Algorithms (2nd ed.). MIT Press and McGraw-Hill. pp. 41–50. ISBN 0-262-03293-7.
  28. ^ for example it is omitted in: Hildebrand, A.J. “Asymptotic Notations” (PDF). Department of Mathematics. Asymptotic Methods in Analysis. Math 595, Fall 2009. Urbana, IL: University of Illinois. Retrieved 14 March 2017.
  29. ^ Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L. (2009). Introduction to Algorithms (3rd ed.). Cambridge/MA: MIT Press. p. 47. ISBN 978-0-262-53305-8. “When we have only an asymptotic upper bound, we use O-notation. For a given function g(n), we denote by O(g(n)) (pronounced ‘big-oh of g of n’ or sometimes just ‘oh of g of n’) the set of functions O(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 ≤ f(n) ≤ cg(n) for all n ≥ n0 }”
  30. ^ Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L. (2009). Introduction to Algorithms (3rd ed.). Cambridge/MA: MIT Press. p. 49. ISBN 978-0-262-53305-8. “When the asymptotic notation stands alone (that is, not within a larger formula) on the right-hand side of an equation (or inequality), as in n = O(n²), we have already defined the equal sign to mean set membership: n ∈ O(n²). In general, however, when asymptotic notation appears in a formula, we interpret it as standing for some anonymous function that we do not care to name. For example, the formula 2n² + 3n + 1 = 2n² + Θ(n) means that 2n² + 3n + 1 = 2n² + f(n), where f(n) is some function in the set Θ(n). In this case, we let f(n) = 3n + 1, which is indeed in Θ(n). Using asymptotic notation in this manner can help eliminate inessential detail and clutter in an equation.”
  31. ^ Introduction to algorithms. Cormen, Thomas H. (Third ed.). Cambridge, Mass.: MIT Press. 2009. p. 63. ISBN 978-0-262-27083-0. OCLC 676697295.
  32. ^ Erdelyi, A. (1956). Asymptotic Expansions. ISBN 978-0-486-60318-6.
  33. ^ E. C. Titchmarsh, The Theory of the Riemann Zeta-Function (Oxford; Clarendon Press, 1951)
  34. ^ See for instance “A new estimate for G(n) in Waring’s problem” (Russian). Doklady Akademii Nauk SSSR 5, No 5-6 (1934), 249–253. Translated in English in: Selected works / Ivan Matveevič Vinogradov; prepared by the Steklov Mathematical Institute of the Academy of Sciences of the USSR on the occasion of his 90th birthday. Springer-Verlag, 1985.


” (WP)


Fair Use Sources:

Software Engineering

Higher-order function

Not to be confused with Functor (category theory).

” (WP)

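In mathematics and computer science, a higher-order function is a function that does at least one of the following:

  • takes one or more functions as arguments (i.e. procedural parameters);
  • returns a function as its result.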

All other functions are first-order functions. In mathematics higher-order functions are also termed operators or functionals. The differential operator in calculus is a common example, since it maps a function to its derivative, also a function. Higher-order functions should not be confused with other uses of the word “functor” throughout mathematics, see Functor (disambiguation).

In the untyped lambda calculus, all functions are higher-order; in a typed lambda calculus, from which most functional programming languages are derived, higher-order functions that take one function as argument are values with types of the form $(\tau_{1} \to \tau_{2}) \to \tau_{3}$.
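As a rough illustration of such a type, the sketch below uses Python type hints, with all three type variables instantiated to int. The name apply_to_seven is hypothetical and chosen only for this example.

```python
from typing import Callable

# A function of type (int -> int) -> int: it takes a function and returns a value.
def apply_to_seven(f: Callable[[int], int]) -> int:
    return f(7)

# The annotation spells out the higher-order type explicitly.
hof: Callable[[Callable[[int], int]], int] = apply_to_seven

result = hof(lambda i: i + 3)
print(result)  # 10
```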

General examples

  • map function, found in many functional programming languages, is one example of a higher-order function. It takes as arguments a function f and a collection of elements, and as the result, returns a new collection with f applied to each element from the collection.
  • Sorting functions, which take a comparison function as a parameter, allowing the programmer to separate the sorting algorithm from the comparisons of the items being sorted. The C standard function qsort is an example of this.
  • filter
  • fold
  • apply
  • Function composition
  • Integration
  • Callback
  • Tree traversal
  • Montague grammar, a semantic theory of natural language, uses higher-order functions
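
A few of the entries above can be sketched in Python. This is only a minimal illustration: my_map is a hypothetical hand-written stand-in for a library map function, while filter, functools.reduce (a fold), and sorted with a key function come from the standard library.

```python
from functools import reduce

def my_map(f, items):
    """Return a new list with f applied to each element of items."""
    return [f(x) for x in items]

numbers = [3, 1, 2]

squares = my_map(lambda x: x * x, numbers)           # map: apply f to every element
evens = list(filter(lambda x: x % 2 == 0, numbers))  # filter: keep elements passing a test
total = reduce(lambda acc, x: acc + x, numbers, 0)   # fold: combine elements with a function
descending = sorted(numbers, key=lambda x: -x)       # sorting driven by a caller-supplied function

print(squares, evens, total, descending)  # [9, 1, 4] [2] 6 [3, 2, 1]
```

In each call the caller supplies the behavior as a function argument, so the traversal logic is written once and reused.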

Support in programming languages

Direct support

The examples are not intended to compare and contrast programming languages, but to serve as examples of higher-order function syntax.

In the following examples, the higher-order function twice takes a function, and applies the function to some value twice. If twice has to be applied several times for the same f it preferably should return a function rather than a value. This is in line with the “don’t repeat yourself” principle.


Further information: APL (programming language)

      twice←{⍺⍺ ⍺⍺ ⍵}

      plusthree←{⍵+3}

      g←{plusthree twice ⍵}
      g 7

Or in a tacit manner:



      g←plusthree twice
      g 7


Further information: C++

Using std::function in C++11:

#include <iostream>
#include <functional>

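auto twice = [](const std::function<int(int)>& f)
{
    // Capture f by value: the std::function argument may be a temporary,
    // so a reference capture could dangle once twice() returns.
    return [f](int x) {
        return f(f(x));
    };
};

auto plus_three = [](int i)
{
    return i + 3;
};

int main()
{
    auto g = twice(plus_three);

    std::cout << g(7) << '\n'; // 13
}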

Or, with generic lambdas provided by C++14:

#include <iostream>

auto twice = [](const auto& f)
{
    // Capture f by value so the returned closure does not depend on f's lifetime.
    return [f](int x) {
        return f(f(x));
    };
};

auto plus_three = [](int i)
{
    return i + 3;
};

int main()
{
    auto g = twice(plus_three);

    std::cout << g(7) << '\n'; // 13
}


Further information: C Sharp (programming language)

Using just delegates:

using System;

public class Program
{
    public static void Main(string[] args)
    {
        Func<Func<int, int>, Func<int, int>> twice = f => x => f(f(x));

        Func<int, int> plusThree = i => i + 3;

        var g = twice(plusThree);

        Console.WriteLine(g(7)); // 13
    }
}

Or equivalently, with static methods:

using System;

public class Program
{
    private static Func<int, int> Twice(Func<int, int> f)
    {
        return x => f(f(x));
    }

    private static int PlusThree(int i) => i + 3;

    public static void Main(string[] args)
    {
        var g = Twice(PlusThree);

        Console.WriteLine(g(7)); // 13
    }
}


Further information: Clojure

(defn twice [f]
  (fn [x] (f (f x))))

(defn plus-three [i]
  (+ i 3))

(def g (twice plus-three))

(println (g 7)) ; 13

ColdFusion Markup Language (CFML)

Further information: ColdFusion Markup Language

twice = function(f) {
    return function(x) {
        return f(f(x));
    };
};

plusThree = function(i) {
    return i + 3;
};

g = twice(plusThree);

writeOutput(g(7)); // 13


Further information: D (programming language)

import std.stdio : writeln;

alias twice = (f) => (int x) => f(f(x));

alias plusThree = (int i) => i + 3;

void main()
{
    auto g = twice(plusThree);

    writeln(g(7)); // 13
}


Further information: Elixir (programming language)

In Elixir, you can mix module definitions and anonymous functions

defmodule Hof do
    def twice(f) do
        fn(x) -> f.(f.(x)) end
    end
end

plus_three = fn(i) -> 3 + i end

g = Hof.twice(plus_three)

IO.puts g.(7) # 13

Alternatively, we can also compose using pure anonymous functions.

twice = fn(f) ->
    fn(x) -> f.(f.(x)) end
end

plus_three = fn(i) -> 3 + i end

g = twice.(plus_three)

IO.puts g.(7) # 13


Further information: Erlang (programming language)

or_else([], _) -> false;
or_else([F | Fs], X) -> or_else(Fs, X, F(X)).

or_else(Fs, X, false) -> or_else(Fs, X);
or_else(Fs, _, {false, Y}) -> or_else(Fs, Y);
or_else(_, _, R) -> R.

or_else([fun erlang:is_integer/1, fun erlang:is_atom/1, fun erlang:is_list/1], 3.23).

In this Erlang example, the higher-order function or_else/2 takes a list of functions (Fs) and an argument (X). It evaluates the function F with X as its argument. If F returns false, the next function in Fs is evaluated. If F returns {false, Y}, the next function in Fs is evaluated with the argument Y. If F returns R, or_else/2 returns R. Note that X, Y, and R can be functions. The example returns false.


Further information: F Sharp (programming language)

let twice f = f >> f

let plus_three = (+) 3

let g = twice plus_three

g 7 |> printf "%A" // 13


Further information: Go (programming language)

package main

import "fmt"

func twice(f func(int) int) func(int) int {
	return func(x int) int {
		return f(f(x))
	}
}

func main() {
	plusThree := func(i int) int {
		return i + 3
	}

	g := twice(plusThree)

	fmt.Println(g(7)) // 13
}

Notice that a function can be declared with an identifier (twice) or defined anonymously as a function literal (here assigned to the variable plusThree).


Further information: Haskell (programming language)

twice :: (Int -> Int) -> (Int -> Int)
twice f = f . f

plusThree :: Int -> Int
plusThree = (+3)

main :: IO ()
main = print (g 7) -- 13
  where
    g = twice plusThree


Further information: J (programming language)


   twice=.     adverb : 'u u y'

   plusthree=. verb   : 'y + 3'
   g=. plusthree twice
   g 7

or tacitly,

   twice=. ^:2

   plusthree=. +&3
   g=. plusthree twice
   g 7

Java (1.8+)

Further information: Java (programming language) and Java version history

Using just functional interfaces:

import java.util.function.*;

class Main {
    public static void main(String[] args) {
        Function<IntUnaryOperator, IntUnaryOperator> twice = f -> f.andThen(f);

        IntUnaryOperator plusThree = i -> i + 3;

        // Explicit type instead of var, which would require Java 10 or later.
        IntUnaryOperator g = twice.apply(plusThree);

        System.out.println(g.applyAsInt(7)); // 13
    }
}

Or equivalently, with static methods:

import java.util.function.*;

class Main {
    private static IntUnaryOperator twice(IntUnaryOperator f) {
        return f.andThen(f);
    }

    private static int plusThree(int i) {
        return i + 3;
    }

    public static void main(String[] args) {
        // Explicit type instead of var, which would require Java 10 or later.
        IntUnaryOperator g = twice(Main::plusThree);

        System.out.println(g.applyAsInt(7)); // 13
    }
}


Further information: JavaScript

"use strict";

const twice = f => x => f(f(x));

const plusThree = i => i + 3;

const g = twice(plusThree);

console.log(g(7)); // 13


Further information: Julia (programming language)

julia> function twice(f)
           function result(x)
               return f(f(x))
           end
           return result
       end
twice (generic function with 1 method)

julia> plusthree(i) = i + 3
plusthree (generic function with 1 method)

julia> g = twice(plusthree)
(::var"#result#3"{typeof(plusthree)}) (generic function with 1 method)

julia> g(7)
13


Further information: Kotlin (programming language)

fun twice(f: (Int) -> Int): (Int) -> Int {
    return { f(f(it)) }
}

fun plusThree(i: Int) = i + 3

fun main() {
    val g = twice(::plusThree)

    println(g(7)) // 13
}


Further information: Lua (programming language)

local function twice(f)
  return function (x)
    return f(f(x))
  end
end

local function plusThree(i)
  return i + 3
end

local g = twice(plusThree)

print(g(7)) -- 13


Further information: MATLAB

function result = twice(f)
    result = @inner

    function val = inner(x)
        val = f(f(x));
    end
end

plusthree = @(i) i + 3;

g = twice(plusthree)

disp(g(7)); % 13


Further information: OCaml (programming language)

let twice f x =
  f (f x)

let plus_three =
  (+) 3

let () =
  let g = twice plus_three in

  print_int (g 7); (* 13 *)
  print_newline ()


Further information: PHP



function twice(callable $f): Closure {
    return function (int $x) use ($f): int {
        return $f($f($x));
    };
}

function plusThree(int $i): int {
    return $i + 3;
}

$g = twice('plusThree');

echo $g(7), "\n"; // 13

or with all functions in variables:



$twice = fn(callable $f): Closure => fn(int $x): int => $f($f($x));

$plusThree = fn(int $i): int => $i + 3;

$g = $twice($plusThree);

echo $g(7), "\n"; // 13

Note that arrow functions implicitly capture any variables that come from the parent scope,[1] whereas anonymous functions require the use keyword to do the same.


Further information: Pascal (programming language)

{$mode objfpc}

type fun = function(x: Integer): Integer;

function twice(f: fun; x: Integer): Integer;
begin
  result := f(f(x));
end;

function plusThree(i: Integer): Integer;
begin
  result := i + 3;
end;

begin
  writeln(twice(@plusThree, 7)); { 13 }
end.


Further information: Perl

use strict;
use warnings;

sub twice {
    my ($f) = @_;
    sub {
        $f->($f->(@_));
    };
}

sub plusThree {
    my ($i) = @_;
    $i + 3;
}

my $g = twice(\&plusThree);

print $g->(7), "\n"; # 13

or with all functions in variables:

use strict;
use warnings;

my $twice = sub {
    my ($f) = @_;
    sub {
        $f->($f->(@_));
    };
};

my $plusThree = sub {
    my ($x) = @_;
    $x + 3;
};

my $g = $twice->($plusThree);

print $g->(7), "\n"; # 13


Further information: Python (programming language)

>>> def twice(f):
...     def result(x):
...         return f(f(x))
...     return result

>>> plusthree = lambda i: i + 3

>>> g = twice(plusthree)
>>> g(7)
13

Python decorator syntax is often used to replace a function with the result of passing that function through a higher-order function. E.g., the function g could be implemented equivalently:

>>> @twice
... def g(i):
...     return i + 3

>>> g(7)
13


Further information: R (programming language)

twice <- function(f) {
  return(function(x) {
    f(f(x))
  })
}

plusThree <- function(i) {
  return(i + 3)
}

g <- twice(plusThree)

> print(g(7))
[1] 13


Further information: Raku (programming language)

sub twice(Callable:D $f) {
    return sub { $f($f($^x)) };
}

sub plusThree(Int:D $i) {
    return $i + 3;
}

my $g = twice(&plusThree);

say $g(7); # 13

In Raku, all code objects are closures and therefore can reference inner “lexical” variables from an outer scope because the lexical variable is “closed” inside of the function. Raku also supports “pointy block” syntax for lambda expressions which can be assigned to a variable or invoked anonymously.


Further information: Ruby (programming language)

def twice(f)
  ->(x) { f.call(f.call(x)) }
end

plus_three = ->(i) { i + 3 }

g = twice(plus_three)

puts g.call(7) # 13


Further information: Rust (programming language)

fn twice(f: impl Fn(i32) -> i32) -> impl Fn(i32) -> i32 {
    move |x| f(f(x))
}

fn plus_three(i: i32) -> i32 {
    i + 3
}

fn main() {
    let g = twice(plus_three);

    println!("{}", g(7)) // 13
}


Further information: Scala (programming language)

object Main {
  def twice(f: Int => Int): Int => Int =
    f compose f

  def plusThree(i: Int): Int =
    i + 3

  def main(args: Array[String]): Unit = {
    val g = twice(plusThree)

    print(g(7)) // 13
  }
}


Further information: Scheme (programming language)

(define (add x y) (+ x y))
(define (f x)
  (lambda (y) (+ x y)))
(display ((f 3) 7))
(display (add 3 7))

In this Scheme example, the higher-order function (f x) is used to implement currying. It takes a single argument and returns a function. The evaluation of the expression ((f 3) 7) first returns a function after evaluating (f 3). The returned function is (lambda (y) (+ 3 y)). Then, it evaluates the returned function with 7 as the argument, returning 10. This is equivalent to the expression (add 3 7), since (f x) is equivalent to the curried form of (add x y).


Further information: Swift (programming language)

func twice(_ f: @escaping (Int) -> Int) -> (Int) -> Int {
    return { f(f($0)) }
}

let plusThree = { $0 + 3 }

let g = twice(plusThree)

print(g(7)) // 13


Further information: Tcl

set twice {{f x} {apply $f [apply $f $x]}}
set plusThree {{i} {return [expr $i + 3]}}

# result: 13
puts [apply $twice $plusThree 7]

Tcl uses the apply command to apply an anonymous function (since 8.5).


Further information: XACML

The XACML standard defines higher-order functions that apply a function to multiple values of attribute bags.

rule allowEntry{
    condition anyOfAny(function[stringEqual], citizenships, allowedCitizenships)
}

The XACML specification lists the available higher-order functions.


Further information: XQuery

declare function local:twice($f, $x) {
  $f($f($x))
};

declare function local:plusthree($i) {
  $i + 3
};

local:twice(local:plusthree#1, 7) (: 13 :)


Function pointers

Function pointers in languages such as C and C++ allow programmers to pass around references to functions. The following C code computes an approximation of the integral of an arbitrary function:

#include <stdio.h>

double square(double x)
{
    return x * x;
}

double cube(double x)
{
    return x * x * x;
}

/* Compute the integral of f() within the interval [a,b] using the midpoint rule */
double integral(double f(double x), double a, double b, int n)
{
    int i;
    double sum = 0;
    double dt = (b - a) / n;
    for (i = 0;  i < n;  ++i) {
        sum += f(a + (i + 0.5) * dt);
    }
    return sum * dt;
}

int main()
{
    printf("%g\n", integral(square, 0, 1, 100));
    printf("%g\n", integral(cube, 0, 1, 100));
    return 0;
}

The qsort function from the C standard library uses a function pointer to emulate the behavior of a higher-order function.


Macros can also be used to achieve some of the effects of higher-order functions. However, macros cannot easily avoid the problem of variable capture; they may also result in large amounts of duplicated code, which can be more difficult for a compiler to optimize. Macros are generally not strongly typed, although they may produce strongly typed code.

Dynamic code evaluation

In other imperative programming languages, it is possible to achieve some of the same algorithmic results as are obtained via higher-order functions by dynamically executing code (sometimes called Eval or Execute operations) in the scope of evaluation. There can be significant drawbacks to this approach:

  • The argument code to be executed is usually not statically typed; these languages generally rely on dynamic typing to determine the well-formedness and safety of the code to be executed.
  • The argument is usually provided as a string, the value of which may not be known until run-time. This string must either be compiled during program execution (using just-in-time compilation) or evaluated by interpretation, causing some added overhead at run-time, and usually generating less efficient code.
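
The contrast can be sketched in Python, which offers both a built-in eval and first-class functions. The helper names apply_twice_eval and apply_twice are hypothetical and exist only for this illustration.

```python
def apply_twice_eval(expr: str, x: int) -> int:
    """Apply the expression in expr (written in terms of a variable x) twice, via eval."""
    once = eval(expr, {}, {"x": x})      # the string is parsed and evaluated at run time
    return eval(expr, {}, {"x": once})   # and parsed again for the second application

def apply_twice(f, x):
    """The higher-order equivalent: f is an ordinary function value."""
    return f(f(x))

result_eval = apply_twice_eval("x + 3", 7)
result_hof = apply_twice(lambda x: x + 3, 7)

print(result_eval, result_hof)  # 13 13
```

A malformed string such as "x +" is only rejected when eval runs, whereas the lambda version is checked for syntax when the program is loaded.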


In object-oriented programming languages that do not support higher-order functions, objects can be an effective substitute. An object’s methods act in essence like functions, and a method may accept objects as parameters and produce objects as return values. Objects often carry added run-time overhead compared to pure functions, however, and added boilerplate code for defining and instantiating an object and its method(s). Languages that permit stack-based (versus heap-based) objects or structs can provide more flexibility with this method.
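
The object-based substitute can be sketched as follows. Python does support higher-order functions, so this is only an illustration of the pattern; the class names PlusThree and Twice and the method name apply are assumptions made for this example.

```python
class PlusThree:
    """An object standing in for the function i -> i + 3."""
    def apply(self, i: int) -> int:
        return i + 3

class Twice:
    """An object standing in for the function returned by twice(f)."""
    def __init__(self, inner):
        self.inner = inner  # any object with an apply(x) method

    def apply(self, x: int) -> int:
        return self.inner.apply(self.inner.apply(x))

g = Twice(PlusThree())
result = g.apply(7)
print(result)  # 13
```

Each function becomes a class with one method, which shows the extra boilerplate the paragraph above mentions.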

An example of using a simple stack based record in Free Pascal with a function that returns a function:

program example;

type
  int = integer;
  Txy = record x, y: int; end;
  Tf = function (xy: Txy): int;

function f(xy: Txy): int;
begin
  Result := xy.y + xy.x;
end;

function g(func: Tf): Tf;
begin
  result := func;
end;

var
  a: Tf;
  xy: Txy = (x: 3; y: 7);

begin
  a := g(@f);     // return a function to "a"
  writeln(a(xy)); // prints 10
end.

The function a() takes a Txy record as input and returns the integer value of the sum of the record’s x and y fields (3 + 7).


Defunctionalization can be used to implement higher-order functions in languages that lack first-class functions:

// Defunctionalized function data structures
template<typename T> struct Add { T value; };
template<typename T> struct DivBy { T value; };
template<typename F, typename G> struct Composition { F f; G g; };

// Defunctionalized function application implementations
template<typename F, typename G, typename X>
auto apply(Composition<F, G> f, X arg) {
    return apply(f.f, apply(f.g, arg));
}

template<typename T, typename X>
auto apply(Add<T> f, X arg) {
    return arg  + f.value;
}

template<typename T, typename X>
auto apply(DivBy<T> f, X arg) {
    return arg / f.value;
}

// Higher-order compose function
template<typename F, typename G>
Composition<F, G> compose(F f, G g) {
    return Composition<F, G> {f, g};
}

int main(int argc, const char* argv[]) {
    auto f = compose(DivBy<float>{ 2.0f }, Add<int>{ 5 });
    apply(f, 3); // 4.0f
    apply(f, 9); // 7.0f
    return 0;
}

In this case, different types are used to trigger different functions via function overloading. The overloaded function in this example has the signature auto apply.
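
The same idea can be sketched in Python, where a single apply function dispatches on the type of a plain data record instead of relying on overloading. This is a loose analogue of the example above under that assumption, not a translation of it.

```python
from dataclasses import dataclass

# Defunctionalized functions: plain data records, no callables stored.
@dataclass(frozen=True)
class Add:
    value: float

@dataclass(frozen=True)
class DivBy:
    value: float

@dataclass(frozen=True)
class Composition:
    f: object
    g: object

def apply(fn, arg):
    """Interpret a defunctionalized function by dispatching on its type."""
    if isinstance(fn, Add):
        return arg + fn.value
    if isinstance(fn, DivBy):
        return arg / fn.value
    if isinstance(fn, Composition):
        return apply(fn.f, apply(fn.g, arg))
    raise TypeError(f"not a defunctionalized function: {fn!r}")

def compose(f, g):
    """Higher-order compose, returning data rather than a closure."""
    return Composition(f, g)

h = compose(DivBy(2.0), Add(5))
results = [apply(h, 3), apply(h, 9)]
print(results)  # [4.0, 7.0]
```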


  1. ^ “PHP: Arrow Functions – Manual” Retrieved 2021-03-01.


” (WP)


Fair Use Sources:

Cloud History Software Engineering

List of pioneers in computer science

See also: Timeline of the History of Computers

” (WP)

This article presents a list of individuals who made transformative breakthroughs in the creation, development and imagining of what computers could do.


830~Al-KhwarizmiThe term “algorithm” is derived from the algorism, the technique of performing arithmetic with Hindu–Arabic numerals popularised by al-Khwarizmi in his book On the Calculation with Hindu Numerals.[1][2][3]
1944Aiken, HowardConceived and codesigned the Harvard Mark I.
1970, 1989Allen, Frances E.Developed bit vector notation and program control-flow graphs. Became the first female IBM Fellow in 1989. In 2006, she became the first female recipient of the ACM’s Turing Award.
1939Atanasoff, JohnBuilt the first electronic digital computer, the Atanasoff–Berry Computer, though it was neither programmable nor Turing-complete.
1822, 1837Babbage, CharlesOriginated the concept of a programmable general-purpose computer. Designed the Analytical Engine and built a prototype for a less powerful mechanical calculator.
1954, 1963Backus, JohnLed the team that created FORTRAN (Formula Translation), the first practical high-level programming language, and he formulated the Backus–Naur form that described the formal language syntax.
1964Baran, PaulOne of two independent inventors of the concept of digital packet switching used in modern computer networking including the Internet.[4][5] Baran published a series of briefings and papers about dividing information into “message blocks” and sending it over distributed networks between 1960 and 1964.[6][7]
1874Baudot, ÉmileA French telegraphic engineer patents the Baudot code, the first means of digital communication.[8] The modem speed unit baud is named after him.
1989, 1990Berners-Lee, TimInvented World Wide Web. With Robert Cailliau, sent first HTTP communication between client and server.
1966Böhm, CorradoTheorized of the concept of structured programming.
1847, 1854Boole, GeorgeFormalized Boolean algebra, the basis for digital logic and computer science.
1947Booth, KathleenInvented the first assembly language.
1969, 1978Brinch Hansen, PerDeveloped the RC 4000 multiprogramming system which introduced the concept of an operating system kernel and the separation of policy and mechanism, effectively the first microkernel architecture.[9] Co-developed the monitor with Tony Hoare, and created the first monitor implementation.[10] Implemented the first form of remote procedure call in the RC 4000,[9] and was first to propose remote procedure calls as a structuring concept for distributed computing.[11]
1959, 1995Brooks, FredManager of IBM System/360 and OS/360 projects; author of The Mythical Man-Month.
1908Brouwer, Luitzen Egbertus JanFounded intuitionistic logic which later came to prevalent use in proof assistants.
1930Bush, VannevarAnalogue computing pioneer. Originator of the Memex concept, which led to the development of Hypertext.
1951Caminer, DavidWith John Pinkerton, developed the LEO computer, the first business computer, for J. Lyons and Co
1978Cerf, VintWith Bob Kahn, designed the Transmission Control Protocol and Internet Protocol (TCP/IP), the primary data communication protocols of the Internet and other computer networks.
1956Chomsky, NoamMade contributions to computer science with his work in linguistics. He developed Chomsky hierarchy, a discovery which has directly impacted programming language theory and other branches of computer science.
1936Church, AlonzoMade fundamental contributions to theoretical computer science, specifically in the development of computability theory in the form of lambda calculus. Independently of Alan Turing, he formulated what is now known as Church-Turing Thesis and proved that first-order logic is undecidable.
1962Clark, Wesley A.Designed LINC, the first functional computer scaled down and priced for the individual user. Put in service in 1963, many of its features are seen as prototypes of what were to be essential elements of personal computers.
1981Clarke, Edmund M.Developed model checking and formal verification of software and hardware together with E. Allen Emerson.
1970Codd, Edgar F.Proposed and formalized the relational model of data management, the theoretical basis of relational databases.
1971Conway, LynnSuperscalar architecture with multiple-issue out-of-order dynamic instruction scheduling.
1967Cook, StephenFormalized the notion of NP-completeness, inspiring a great deal of research in computational complexity theory.
1965Cooley, JamesWith John W. Tukey, created the fast Fourier transform.
1965Davies, DonaldOne of two independent inventors of the concept of digital packet switching used in modern computer networking including the Internet.[4][12] Davies conceived of and named the concept of packet switching in data communication networks in 1965 and 1966.[13][14]
1962Dahl, Ole-JohanWith Kristen Nygaard, invented the proto-object oriented language SIMULA.
1968Dijkstra, EdsgerMade advances in algorithms, pioneered and coined the term structured programming, invented the semaphore, and famously suggested that the GOTO statement should be considered harmful.
1918Eccles, William and Jordan, Frank WilfredBritish physicists patent the Eccles–Jordan trigger circuit.[15] The so-called bistable flip-flop, this circuit is a building block of all digital memory cells. Built from Vacuum tubes, their concept was essential for the success of the Colossus codebreaking computer.
1943, 1951Eckert, J. PresperWith John Mauchly, designed and built the ENIAC, the first modern (all electronic, Turing-complete) computer, and the UNIVAC I, the first commercially available computer.
1981Emerson, E. AllenDeveloped model checking and formal verification of software and hardware together with Edmund M. Clarke.
1963Engelbart, DouglasBest known for inventing the computer mouse (in a joint effort with Bill English); as a pioneer of human–computer interaction whose Augment team developed hypertextnetworked computers, and precursors to GUIs.
1973Thacker, Charles P.Pioneering design and realization of the Xerox Alto, the first modern personal computer, and in addition for his contributions to the Ethernet and the Tablet PC.
1971Faggin, FedericoDesigned the first commercial microprocessor (Intel 4004).
1974Feinler, ElizabethHer team defined a simple text file format for Internet host names. The list evolved into the Domain Name System and her group became the naming authority for the top-level domains of .mil, .gov, .edu, .org, and .com.
1943Flowers, TommyDesigned and built the Mark 1 and the ten improved Mark 2 Colossus computers, the world’s first programmable, digital, electronic, computing devices.
1994Floyd, SallyFounded the field of Active Queue Management and co-invented Random Early Detection which is used in almost all Internet routers.
1879Frege, GottlobExtended Aristotelian logic with first-order predicate calculus, independently of Charles Sanders Peirce, a crucial precursor in computability theory. Also relevant to early work on artificial intelligence and logic programming.
1880, 1898Sanders Peirce, CharlesProved the functional completeness of the NOR gate. Proposed the implementation of logic via electrical circuits, decades before Claude Shannon. Extended Aristotelian logic with first-order predicate calculus, independently of Gottlob Frege, a crucial precursor in computability theory. Also relevant to early work on artificial intelligence and logic programming.
1985Furber, Stephen and Wilson, SophieKnown for their work on creating the ARM 32-bit RISC microprocessor.[16]
1958, 1961, 1967Ginsburg, SeymourProved “don’t-care” circuit minimization does not necessarily yield optimal results, proved that the ALGOL programming language is context-free (thus linking formal language theory to the problem of compiler writing), and invented AFL Theory.
1931Gödel, KurtProved that Peano arithmetic could not be both logically consistent and complete in first-order predicate calculus. Church, Kleene, and Turing developed the foundations of computation theory based on corollaries to Gödel’s work.
1989Goldwasser, ShafiZero-knowledge proofs invented by Goldwasser, Micali and Rackoff. Goldwasser and Micali awarded the Turing Award in 2012 for this and other work.
2011Graham, Susan L.Awarded the 2009 IEEE John von Neumann Medal for “contributions to programming language design and implementation and for exemplary service to the discipline of computer science”.
1953Gray, FrankPhysicist and researcher at Bell Labs, developed the reflected binary code (RBC) or Gray code.[17] Gray’s methodologies are used for error detection and correction in digital communication systems, such as QAM in digital subscriber line networks.
1974, 2005Gray, JimInnovator in database systems and transaction processing implementation.
1986, 1990Grosz, BarbaraCreated the first computational model of discourse, which established the field of research and influenced language-processing technologies. Also developed the SharedPlans model for collaboration in multi-agent systems.
1988, 2015Gustafson, JohnProved the viability of parallel computing experimentally and theoretically (Gustafson’s Law). Developed high-efficiency formats for representing real numbers (Unum and Posit).
1971Hamilton, MargaretDeveloped the concepts of asynchronous software, priority scheduling, end-to-end testing, and human-in-the-loop decision capability, such as priority displays which then became the foundation for ultra reliable software design.
1950Hamming, RichardCreated the mathematical field of error-correcting codes, the Hamming code, the Hamming matrix, the Hamming window, Hamming numbers, sphere-packing (or Hamming bound), and the Hamming distance.[18][19] He established the concept of a perfect code.[20][21]
1972, 1973Thi, André Truong Trong and François GernelleInvention of the Micral N, the earliest commercial, non-kit personal computer based on a microprocessor.
1981, 1995, 1999Hejlsberg, AndersAuthor of Turbo Pascal while at Borland, the chief architect of Delphi, and designer and lead architect of C# at Microsoft.
2008, 2012, 2018Hinton, GeoffreyPopularized and enabled the use of artificial neural networks and deep learning, which rank among the most successful tools in modern artificial intelligence efforts. Received the Turing Award in 2018 for conceptual and engineering breakthroughs that have made deep neural networks a critical component of computing.[22]
1961, 1969, 1978, 1980Hoare, C.A.R.Developed the formal language Communicating Sequential Processes (CSP), Hoare logic for verifying program correctness, and Quicksort. Fundamental contributions to the definition and design of programming languages.
1968Holberton, BettyWrote the first mainframe sort merge on the Univac
1889Hollerith, HermanWidely regarded as the father of modern machine data processing. His invention of the punched card tabulating machine marks the beginning of the era of semiautomatic data processing systems.
1952Hopper, GracePioneered work on the necessity for high-level programming languages, which she termed automatic programming, and wrote the A-0 compiler, which heavily influenced the COBOL language.
1997Hsu Feng-hsiungHis work led to the creation of the Deep Thought chess computer; he was the architect and principal designer of the IBM Deep Blue chess computer, which defeated the reigning World Chess Champion, Garry Kasparov, in 1997.
1952Hurd, CuthbertHelped the International Business Machines Corporation develop its first general-purpose computer, the IBM 701.
1945, 1953Huskey, HarryEarly computer design including contributions to the ENIAC, EDVAC, Pilot ACE, SEAC, SWAC, and Bendix G-15 computers. The G-15 has been described as the first personal computer, being operable by one person.
1954, 1962Iverson, KennethAssisted in establishing the first graduate course in computer science (at Harvard) and taught that course; invented the APL programming language and made contributions to interactive computing.
1801Jacquard, Joseph MarieBuilt and demonstrated the Jacquard loom, a programmable mechanized loom controlled by a tape constructed from punched cards.
1206Al-JazariInvented programmable machines, including programmable humanoid robots,[23] and the castle clock, an astronomical clock considered the first programmable analog computer.[24]
1953Spärck Jones, KarenOne of the pioneers of information retrieval and natural language processing.
1970, 1990Karnaugh, MauriceInventor of the Karnaugh map, used for logic function minimization.
1973Karpinski, JacekDeveloped the first differential analyzer that used transistors, and developed one of the first machine learning algorithms for character and image recognition. Also was the inventor of one of the first minicomputers, the K-202.
1970~Kay, AlanPioneered many of the ideas at the root of object-oriented programming languages, led the team that developed Smalltalk, and made fundamental contributions to personal computing.
1957Kirsch, Russell GrayWhile working for the National Bureau of Standards (NBS), Kirsch used a recently developed image scanner to scan and store the first digital photograph.[25] His scanned photo of his three-month-old son was deemed by Life magazine as one of the “100 Photographs That Changed The World.”
1936Kleene, Stephen ColePioneered work with Alonzo Church on the Lambda Calculus that first laid down the foundations of computation theory.
1968, 1989Knuth, DonaldWrote The Art of Computer Programming and created TeX. Coined the term “analysis of algorithms” and made major contributions to that field, including popularizing Big O notation.
1974, 1978Lamport, LeslieFormulated algorithms to solve many fundamental problems in distributed systems (e.g. the bakery algorithm). Developed the concept of a logical clock, enabling synchronization between distributed entities based on the events through which they communicate. Created LaTeX.
1951Lebedev, Sergei AlekseyevichIndependently designed the first electronic computer in the Soviet Union, MESM, in Kiev, Ukraine.
1670~Leibniz, GottfriedMade advances in symbolic logic, such as the Calculus ratiocinator, that were heavily influential on Gottlob Frege. He anticipated later developments in first-order predicate calculus, which were crucial for the theoretical foundations of computer science.
1960Licklider, J. C. R.Began the investigation of human–computer interaction, leading to many advances in computer interfaces as well as in cybernetics and artificial intelligence.
1987Liskov, BarbaraDeveloped the Liskov substitution principle, which guarantees semantic interoperability of data types in a hierarchy.
1300~Llull, RamonDesigned multiple symbolic representation machines, and pioneered notions of symbolic representation and manipulation to produce knowledge, both of which were major influences on Leibniz.
1852Lovelace, AdaAn English mathematician and writer, chiefly known for her work on Charles Babbage’s proposed mechanical general-purpose computer, the Analytical Engine. She was the first to recognize that the machine had applications beyond pure calculation, and created the first algorithm intended to be carried out by such a machine. As a result, she is often regarded as the first to recognize the full potential of a “computing machine” and the first computer programmer.
1909Ludgate, PercyCharles Babbage in 1843 and Percy Ludgate in 1909 designed the first two Analytical Engines in history. Ludgate’s engine used multiplication as its basis (using his own discrete “Irish logarithms”), had the first multiplier-accumulator (MAC), was first to exploit a MAC to perform division, stored numbers as displacements of rods in shuttles, and had several other novel features, including for program control.
1971Martin-Löf, PerPublished an early draft on the type theory that many proof assistants build on.
1943, 1951Mauchly, JohnWith J. Presper Eckert, designed and built the ENIAC, the first modern (all electronic, Turing-complete) computer, and the UNIVAC I, the first commercially available computer. Also worked on BINAC (1949), EDVAC (1949), and UNIVAC (1951) with Grace Hopper and Jean Bartik, to develop early stored-program computers.
1958McCarthy, JohnInvented LISP, a functional programming language.
1956, 2012McCluskey, Edward J.Fundamental contributions that shaped the design and testing of digital systems, including the first algorithm for digital logic synthesis, the Quine-McCluskey logic minimization method.
1986Meyer, BertrandDeveloped design by contract in the guise of the Eiffel programming language.
1963Minsky, MarvinCo-founder of Artificial Intelligence Lab at Massachusetts Institute of Technology, author of several texts on AI and philosophy. Critic of the perceptron.
850~Banū MūsāThe Banū Mūsā brothers wrote the Book of Ingenious Devices, where they described what appears to be the first programmable machine, an automatic flute player.[26]
1950, 1960Nakamatsu YoshirōInvented the first floppy disk at Tokyo Imperial University in 1950,[27][28] receiving a 1952 Japanese patent[29][30] and 1958 US patent for his floppy magnetic disk sheet invention,[31] and licensed to Nippon Columbia in 1960[32] and IBM in the 1970s.[29][27]
2008Nakamoto, SatoshiThe anonymous creator or creators of Bitcoin, the first peer-to-peer digital currency. Nakamoto’s 2008 white-paper introduced the concept of the blockchain, a database structure that allows full trust in the decentralized and distributed public transaction ledger of the cryptocurrency.[33]
1934, 1938Nakashima AkiraNEC engineer introduced switching circuit theory in papers from 1934 to 1936, laying the foundations for digital circuit design, in digital computers and other areas of modern technology.
1960Naur, PeterEdited the ALGOL 60 Revised Report, introducing Backus-Naur form
1945Neumann, John vonFormulated the von Neumann architecture upon which most modern computers are based.
1956Newell, AllenTogether with J. C. Shaw[34] and Herbert Simon, co-wrote the Logic Theorist, the first true AI program, in the first list-processing language, which influenced LISP.
1943Newman, MaxInstigated the production of the Colossus computers at Bletchley Park. After the war he established the Computing Machine Laboratory at the University of Manchester where he created the project that built the world’s first stored-program computer, the Manchester Baby.
1962Nygaard, KristenWith Ole-Johan Dahl, invented the proto-object oriented language SIMULA.
500 BC ~PāṇiniAshtadhyayi Sanskrit grammar was systematised and technical, using metarules, transformations, and recursions, a forerunner to formal language theory and basis for Panini-Backus form used to describe programming languages.
1642Pascal, BlaiseInvented the mechanical calculator.
1952Perlis, AlanOn Project Whirlwind, member of the team that developed the ALGOL programming language, and the first recipient of the Turing Award
1985Perlman, RadiaInvented the Spanning Tree Protocol (STP), which is fundamental to the operation of network bridges, while working for Digital Equipment Corporation. Has done extensive and innovative research, particularly on encryption and networking. She received the USENIX Lifetime Achievement Award in 2007, among numerous others.
1964Perotto, Pier GiorgioComputer designer for Olivetti, designed one of the first electronic programmable calculators, the Programma 101.[35][36][37]
1932Péter, RózsaPublished a series of papers grounding recursion theory as a separate area of mathematical research, setting the foundation for theoretical computer science.
1995Picard, RosalindFounded Affective Computing, and laid the foundations for giving computers skills of emotional intelligence.
1936Post, Emil L.Developed the Post machine as a model of computation, independently of Turing. Known also for developing truth tables, the Post correspondence problem used in recursion theory as well as proving what is known as Post’s theorem.
1967, 2011Ritchie, DennisWith Ken Thompson, pioneered the C programming language and the Unix computer operating system at Bell Labs.
1958–1960Rosen, SaulDesigned the software of the first transistor-based computer. Also influenced the ALGOL programming language.
1910Russell, BertrandMade contributions to computer science with his work on mathematical logic (example: truth function). Introduced the notion of type theory. He also introduced type system (along with Alfred North Whitehead) in his work, Principia Mathematica.
1975Salton, GerardA pioneer of automatic information retrieval, who proposed the vector space model and the inverted index.
1962Sammet, Jean E.Developed the FORMAC programming language. She was also the first to write extensively about the history and categorization of programming languages in 1969, and became the first female president of the Association for Computing Machinery in 1974.
1963, 1973Sasaki TadashiSharp engineer who conceived a single-chip microprocessor CPU, presenting the idea to Busicom and Intel in 1968. This influenced the first commercial microprocessor, the Intel 4004; before Busicom, Intel was a memory manufacturer. Tadashi Sasaki also developed LCD calculators at Sharp.[38]
1937, 1948Shannon, ClaudeFounded information theory, and laid foundations for practical digital circuit design.
1968, 1980Shima MasatoshiDesigned the Intel 4004, the first commercial microprocessor,[39][40] as well as the Intel 8080, Zilog Z80 and Zilog Z8000 microprocessors, and the Intel 8259, 8255, 8253, 8257 and 8251 chips.[41]
1956, 1957Simon, Herbert A.A political scientist and economist who pioneered artificial intelligence. Co-creator of the Logic Theory Machine and the General Problem Solver programs.
1972Stallman, RichardStallman launched the GNU Project in September 1983 to create a Unix-like computer operating system composed entirely of free software. With this, he also launched the free software movement.
1982Stonebraker, MichaelResearcher at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) who revolutionized the field of database management systems (DBMSs) and founded multiple successful database companies
1979Stroustrup, BjarneInvented C++ at Bell Labs
1963Sutherland, IvanAuthor of Sketchpad, the ancestor of modern computer-aided drafting (CAD) programs and one of the early examples of object-oriented programming.
1967Thompson, KenCreated the Unix operating system, the B programming language, the Plan 9 operating system, the first machine to achieve a Master rating in chess, and the UTF-8 encoding at Bell Labs, and the Go programming language at Google.
1993Toh Chai KeongCreated mobile ad hoc networking; Implemented the first working wireless ad hoc network of laptop computers in 1998 using Linux OS, Lucent WaveLan 802.11 radios, and a new distributed routing protocol transparent to TCP/UDP/IP.
1991Torvalds, LinusCreated the first version of the Linux kernel.
1912, 1914, 1920Torres Quevedo, LeonardoIn 1912, Leonardo Torres Quevedo built El Ajedrecista (the chess player), one of the first autonomous machines capable of playing chess. As opposed to the human-operated The Turk and Ajeeb, El Ajedrecista was a true automaton built to play chess without human guidance. It played an endgame with three chess pieces, automatically moving a white king and a rook to checkmate the black king moved by a human opponent. In his work Essays on Automatics, published in 1914, Torres Quevedo formulates what will be a new branch of engineering: automation. This work also included floating-point arithmetic. In 1920, Torres Quevedo was the first in history to build an early electromechanical version of the Analytical Engine.
1965Tukey, John W.With James Cooley, created the fast Fourier transform. He invented the term “bit”.[42]
1936Turing, AlanMade several fundamental contributions to theoretical computer science, including the Turing machine computational model, the stored-program concept, and the design of the high-speed ACE. Independently of Alonzo Church, he formulated the Church-Turing thesis and proved that first-order logic is undecidable. He also explored the philosophical issues concerning artificial intelligence, proposing what is now known as the Turing test.
1950~Wang AnMade key contributions to the development of magnetic core memory.
1955, 1960s, 1974Ware, WillisCo-designer of JOHNNIAC. Chaired committee that developed the Code of Fair Information Practice and led to the Privacy Act of 1974. Vice-chair of the Privacy Protection Study Commission.
1968Wijngaarden, Adriaan vanDeveloper of the W-grammar first used in the definition of ALGOL 68
1949Wilkes, MauriceBuilt the first practical stored-program computer (EDSAC) to be completed, and is credited with the ideas behind several high-level programming language constructs.
1970, 1978Wirth, NiklausDesigned the Pascal, Modula-2 and Oberon programming languages.
1875, 1875Verea, RamónDesigned and patented the Verea Direct Multiplier, the first mechanical direct multiplier.
1938, 1945Zuse, KonradBuilt the first digital freely programmable computer, the Z1. Built the first functional program-controlled computer, the Z3.[43] The Z3 was proven to be Turing-complete in 1998. Produced the world’s first commercial computer, the Z4. Designed the first high-level programming language, Plankalkül.
1970Wilkinson, James H.Research in numerical analysis to facilitate the use of the high-speed digital computer, having received special recognition for his work in computations in linear algebra and “backward” error analysis.[44]
1973Bachman, CharlesOutstanding contributions to database technology.[45]
1976Rabin, Michael O.The joint paper “Finite Automata and Their Decision Problems,”[46] which introduced the idea of nondeterministic machines, which has proved to be an enormously valuable concept. Their (Scott & Rabin) classic paper has been a continuous source of inspiration for subsequent work in this field.[47][48]
1976Scott, DanaThe joint paper “Finite Automata and Their Decision Problems,”[46] which introduced the idea of nondeterministic machines, which has proved to be an enormously valuable concept. Their (Scott & Rabin) classic paper has been a continuous source of inspiration for subsequent work in this field.[47][48]
1978Floyd, Robert W.Having a clear influence on methodologies for the creation of efficient and reliable software, and helping to found the following important subfields of computer science: the theory of parsing, the semantics of programming languages, automatic program verification, automatic program synthesis, and analysis of algorithms.[49]
1985Karp, Richard M.Contributions to the theory of algorithms including the development of efficient algorithms for network flow and other combinatorial optimization problems, the identification of polynomial-time computability with the intuitive notion of algorithmic efficiency, and, most notably, contributions to the theory of NP-completeness.
1986Hopcroft, JohnFundamental achievements in the design and analysis of algorithms and data structures.
1986Tarjan, RobertFundamental achievements in the design and analysis of algorithms and data structures.
1987Cocke, JohnSignificant contributions in the design and theory of compilers, the architecture of large systems and the development of reduced instruction set computers (RISC).
1989Kahan, WilliamFundamental contributions to numerical analysis. One of the foremost experts on floating-point computations. Kahan has dedicated himself to “making the world safe for numerical computations.”
1989Corbató, Fernando J.Pioneering work organizing the concepts and leading the development of the general-purpose, large-scale, time-sharing and resource-sharing computer systems, CTSS and Multics.
1991Milner, Robin1) LCF, the mechanization of Scott’s Logic of Computable Functions, probably the first theoretically based yet practical tool for machine assisted proof construction; 2) ML, the first language to include polymorphic type inference together with a type-safe exception-handling mechanism; 3) CCS, a general theory of concurrency. In addition, he formulated and strongly advanced full abstraction, the study of the relationship between operational and denotational semantics.[50]
1992Lampson, Butler W.Development of distributed, personal computing environments and the technology for their implementation: workstations, networks, operating systems, programming systems, displays, security and document publishing.
1993Hartmanis, JurisFoundations for the field of computational complexity theory.[51]
1993Stearns, Richard E.Foundations for the field of computational complexity theory.[51]
1994Feigenbaum, EdwardPioneering the design and construction of large scale artificial intelligence systems, demonstrating the practical importance and potential commercial impact of artificial intelligence technology.[52]
1994Reddy, RajPioneering the design and construction of large scale artificial intelligence systems, demonstrating the practical importance and potential commercial impact of artificial intelligence technology.[52]
1995Blum, ManuelContributions to the foundations of computational complexity theory and its application to cryptography and program checking.[53]
1996Pnueli, AmirIntroducing temporal logic into computing science and for outstanding contributions to program and systems verification.[54]
2000Yao, AndrewFundamental contributions to the theory of computation, including the complexity-based theory of pseudorandom number generation, cryptography, and communication complexity.
1977Rivest, RonIngenious contribution and making public-key cryptography useful in practice.
1977Shamir, AdiIngenious contribution and making public-key cryptography useful in practice.
1977Adleman, LeonardIngenious contribution and making public-key cryptography useful in practice.
1978Kahn, BobDesigned the Transmission Control Protocol and Internet Protocol (TCP/IP), the primary data communication protocols of the Internet and other computer networks.
2007Sifakis, JosephDeveloping model checking into a highly effective verification technology, widely adopted in the hardware and software industries.[55]
2010Valiant, LeslieTransformative contributions to the theory of computation, including the theory of probably approximately correct (PAC) learning, the complexity of enumeration and of algebraic computation, and the theory of parallel and distributed computing.
2011Pearl, JudeaFundamental contributions to artificial intelligence through the development of a calculus for probabilistic and causal reasoning.[56]
1976Hellman, MartinFundamental contributions to modern cryptography. Diffie and Hellman’s groundbreaking 1976 paper, “New Directions in Cryptography,”[57] introduced the ideas of public-key cryptography and digital signatures, which are the foundation for most regularly-used security protocols on the Internet today.[58]
1976Diffie, WhitfieldFundamental contributions to modern cryptography. Diffie and Hellman’s groundbreaking 1976 paper, “New Directions in Cryptography,”[57] introduced the ideas of public-key cryptography and digital signatures, which are the foundation for most regularly-used security protocols on the Internet today.[59]
2018Bengio, Yoshua; Hinton, Geoffrey; LeCun, YannConceptual and engineering breakthroughs that have made deep neural networks a critical component of computing.[22]
2012Silvio MicaliFor transformative work that laid the complexity-theoretic foundations for the science of cryptography and in the process pioneered new methods for efficient verification of mathematical proofs in complexity theory.
2017John L. HennessyFor pioneering a systematic, quantitative approach to the design and evaluation of computer architectures with enduring impact on the microprocessor industry.
2017David PattersonFor pioneering a systematic, quantitative approach to the design and evaluation of computer architectures with enduring impact on the microprocessor industry.
2019Edwin CatmullFor fundamental contributions to 3-D computer graphics, and the revolutionary impact of these techniques on computer-generated imagery (CGI) in filmmaking and other applications
2019Pat HanrahanFor fundamental contributions to 3-D computer graphics, and the revolutionary impact of these techniques on computer-generated imagery (CGI) in filmmaking and other applications

~ Items marked with a tilde are circa dates.


  1. ^ Mario Tokoro, ed. (2010). “9”. e: From Understanding Principles to Solving Problems. pp. 223–224. ISBN 978-1-60750-468-9.
  2. ^ Cristopher Moore; Stephan Mertens (2011). The Nature of Computation. Oxford University Press. p. 36. ISBN 978-0-19-162080-5.
  3. ^ A. P. Ershov, Donald Ervin Knuth, ed. (1981). Algorithms in modern mathematics and computer science: proceedings, Urgench, Uzbek SSR, September 16–22, 1979. Springer. ISBN 978-3-540-11157-3.
  4. a b “The real story of how the Internet became so vulnerable”. Washington Post. May 30, 2015. Archived from the original on 2015-05-30. Retrieved 2020-02-18. Historians credit seminal insights to Welsh scientist Donald W. Davies and American engineer Paul Baran
  5. ^ “Inductee Details – Paul Baran”. National Inventors Hall of Fame. Archived from the original on 6 September 2017. Retrieved 6 September 2017.
  6. ^ Baran, Paul (2002). “The beginnings of packet switching: some underlying concepts” (PDF). IEEE Communications Magazine. 40 (7): 42–48. doi:10.1109/MCOM.2002.1018006. ISSN 0163-6804. Essentially all the work was defined by 1961, and fleshed out and put into formal written form in 1962. The idea of hot potato routing dates from late 1960.
  7. ^ RAND Corporation. “Paul Baran and the Origins of the Internet”. Retrieved 2020-02-15.
  8. ^ “Jean-Maurice- Emile Baudot. Système de télégraphie rapide, June 1874. Brevet 103,898; Source: Archives Institut National de la Propriété Industrielle (INPI)”.
  9. a b “Per Brinch Hansen • IEEE Computer Society” Retrieved 2015-12-15.
  10. ^ Brinch Hansen, Per (April 1993). “Monitors and Concurrent Pascal: a personal history” (PDF). 2nd ACM Conference on the History of Programming Languages.
  11. ^ Brinch Hansen, Per (November 1978). “Distributed processes: a concurrent programming concept” (PDF). Communications of the ACM. 21 (11): 934–941. CiteSeerX 11610744.
  12. ^ “Inductee Details – Donald Watts Davies”. National Inventors Hall of Fame. Archived from the original on 6 September 2017. Retrieved 6 September 2017.
  13. ^ Roberts, Dr. Lawrence G. (November 1978). “The Evolution of Packet Switching”. Archived from the original on March 24, 2016. Retrieved 5 September 2017. Almost immediately after the 1965 meeting, Donald Davies conceived of the details of a store-and-forward packet switching system; Roberts, Dr. Lawrence G. (May 1995). “The ARPANET & Computer Networks”. Archived from the original on March 24, 2016. Retrieved 13 April 2016. Then in June 1966, Davies wrote a second internal paper, “Proposal for a Digital Communication Network” In which he coined the word packet,- a small sub part of the message the user wants to send, and also introduced the concept of an “Interface computer” to sit between the user equipment and the packet network.
  14. ^ Donald Davies (2001), “A Historical Study of the Beginnings of Packet Switching”, Computer Journal, British Computer Society
  15. ^ William Henry Eccles and Frank Wilfred Jordan, “Improvements in ionic relays” British patent number: GB 148582 (filed: 21 June 1918; published: 5 August 1920).
  16. ^ “Computer History Museum | Fellow Awards – Steve Furber”. Archived from the original on 2013-04-02.
  17. ^ Gray, Frank (1953-03-17). “Pulse code communication” (PDF). U.S. patent no. 2,632,058
  18. ^ Morgan 1998, pp. 973–975.
  19. ^ Hamming 1950, pp. 147–160.
  20. ^ Ling & Xing 2004, pp. 82–88.
  21. ^ Pless 1982, pp. 21–24.
  22. a b Fathers of the Deep Learning Revolution Receive ACM A.M. Turing Award
  23. ^ “articles58” 29 June 2007. Archived from the original on 29 June 2007. Retrieved 25 October 2017.
  24. ^ “Ancient Discoveries, Episode 11: Ancient Robots”. History Channel. Retrieved 2008-09-06.
  25. ^ Kirsch, Russell A., “Earliest Image Processing”, NISTS Museum; SEAC and the Start of Image Processing at the National Bureau of Standards, National Institute of Standards and Technology, archived from the original on 2014-07-19
  26. ^ Koetsier, Teun (2001). “On the prehistory of programmable machines: musical automata, looms, calculators”. Mechanism and Machine Theory. 36 (5): 589–603. doi:10.1016/S0094-114X(01)00005-2.
  27. a b G. W. A. Dummer (1997), Electronic Inventions and Discoveries, page 164, Institute of Physics
  28. ^ Valerie-Anne Giscard d’Estaing (1990), The Book of Inventions and Discoveries, page 124, Queen Anne Press
  29. a b Lazarus, David (April 10, 1995). “‘Japan’s Edison’ Is Country’s Gadget King : Japanese Inventor Holds Record for Patent”. The New York Times. Retrieved 2010-12-21.
  30. ^ YOSHIRO NAKAMATSU – THE THOMAS EDISON OF JAPAN, Stellarix Consultancy Services, 2015
  31. ^ Magnetic record sheet, Patent US3131937
  32. ^ Graphic Arts Japan, Volume 2 (1960), pages 20–22
  33. ^ Nakamoto, Satoshi (24 May 2009). “Bitcoin: A Peer-to-Peer Electronic Cash System” (PDF).
  34. ^ Fred Joseph Gruenberger, The History of the JOHNNIAC, RAND Memorandum 5654
  35. ^ “Olivetti Programma 101 Electronic Calculator”. The Old Calculator Web Museum. technically, the machine was a programmable calculator, not a computer.
  36. ^ “2008/107/1 Computer, Programma 101, and documents (3), plastic / metal / paper / electronic components, hardware architect Pier Giorgio Perotto, designed by Mario Bellini, made by Olivetti, Italy, 1965–1971” Retrieved 2016-03-20.
  37. ^ “Olivetti Programma 101 Electronic Calculator”. The Old Calculator Web Museum. It appears that the Mathatronics Mathatron calculator preceeded [sic] the Programma 101 to market.
  38. ^ Aspray, William (1994-05-25). “Oral-History: Tadashi Sasaki”. Interview #211 for the Center for the History of Electrical Engineering. The Institute of Electrical and Electronics Engineers, Inc. Retrieved 2013-01-02.
  39. ^ Nigel Tout. “The Busicom 141-PF calculator and the Intel 4004 microprocessor”. Retrieved November 15, 2009.
  40. ^ Federico Faggin, The Making of the First Microprocessor, IEEE Solid-State Circuits Magazine, Winter 2009, IEEE Xplore
  41. ^ Japan, Information Processing Society of. “Shima Masatoshi-Computer Museum” Retrieved 25 October 2017.
  42. ^ Claude Shannon (1948). “Bell System Technical Journal”. Bell System Technical Journal.
  43. ^ Copeland, B. Jack (25 October 2017). Zalta, Edward N. (ed.). The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University. Retrieved 25 October 2017 – via Stanford Encyclopedia of Philosophy.
  44. ^ Wilkinson, J. H. (1971). “Some Comments from a Numerical Analyst”. Journal of the ACM. 18 (2): 137–147. doi:10.1145/321637.321638. S2CID 37748083.
  45. ^ Bachman, C. W. (1973). “The programmer as navigator”. Communications of the ACM. 16 (11): 653–658. doi:10.1145/355611.362534.
  46. a b Rabin, M. O.; Scott, D. (1959). “Finite Automata and Their Decision Problems”. IBM Journal of Research and Development. 3 (2): 114. doi:10.1147/rd.32.0114. S2CID 3160330.
  47. a b Rabin, M. O. (1977). “Complexity of computations”. Communications of the ACM. 20 (9): 625–633. doi:10.1145/359810.359816.
  48. a b Scott, D. S. (1977). “Logic and programming languages”. Communications of the ACM. 20 (9): 634–641. doi:10.1145/359810.359826.
  49. ^ Floyd, R. W. (1979). “The paradigms of programming”. Communications of the ACM. 22 (8): 455–460. doi:10.1145/359138.359140.
  50. ^ Milner, R. (1993). “Elements of interaction: Turing award lecture”. Communications of the ACM. 36: 78–89. doi:10.1145/151233.151240.
  51. a b Stearns, R. E. (1994). “Turing Award lecture: It’s time to reconsider time”. Communications of the ACM. 37 (11): 95–99. doi:10.1145/188280.188379.
  52. a b Reddy, R. (1996). “To dream the possible dream”. Communications of the ACM. 39 (5): 105–112. doi:10.1145/229459.233436.
  53. ^ “A.M. Turing Award Laureate – Manuel Blum” Retrieved 4 November 2018.
  54. ^ “A.M. Turing Award Laureate – Amir Pnueli” Retrieved 4 November 2018.
  55. ^ 2007 Turing Award Winners Announced
  56. ^ “Judea Pearl”. ACM.
  57. a b Diffie, W.; Hellman, M. (1976). “New directions in cryptography” (PDF). IEEE Transactions on Information Theory. 22 (6): 644–654. CiteSeerX
  58. ^ “Cryptography Pioneers Receive 2015 ACM A.M. Turing Award”. ACM.
  59. ^ “Cryptography Pioneers Receive 2015 ACM A.M. Turing Award”. ACM.




” (WP)


Fair Use Sources:

DevOps Kubernetes Software Engineering

Argo CD – Declarative, GitOps continuous delivery tool for Kubernetes

See also Kubernetes, Kubernetes and Containerization Bibliography

” (WP)

What Is Argo CD?

Argo CD is a declarative, GitOps continuous delivery tool for Kubernetes.

Argo CD UI

Why Argo CD?

Application definitions, configurations, and environments should be declarative and version controlled. Application deployment and lifecycle management should be automated, auditable, and easy to understand.

Getting Started

Quick Start

kubectl create namespace argocd
kubectl apply -n argocd -f

Follow our getting started guide. Further user-oriented documentation is provided for additional features. If you are looking to upgrade Argo CD, see the upgrade guide. Developer-oriented documentation is available for people interested in building third-party integrations.

How it works

Argo CD follows the GitOps pattern of using Git repositories as the source of truth for defining the desired application state. Kubernetes manifests can be specified in several ways:

  • kustomize applications
  • helm charts
  • ksonnet applications
  • jsonnet files
  • Plain directory of YAML/json manifests
  • Any custom config management tool configured as a config management plugin

Argo CD automates the deployment of the desired application states in the specified target environments. Application deployments can track updates to branches or tags, or be pinned to a specific version of manifests at a Git commit. See tracking strategies for additional details about the different tracking strategies available.

For a quick 10-minute overview of Argo CD, check out the demo presented to the SIG Apps community meeting:

Argo CD Overview Demo


Argo CD Architecture

Argo CD is implemented as a Kubernetes controller which continuously monitors running applications and compares the current, live state against the desired target state (as specified in the Git repo). A deployed application whose live state deviates from the target state is considered OutOfSync. Argo CD reports and visualizes the differences, while providing facilities to automatically or manually sync the live state back to the desired target state. Any modifications made to the desired target state in the Git repo can be automatically applied and reflected in the specified target environments.

For additional details, see architecture overview.
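
The comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Argo CD's implementation: the in-memory dictionaries, the function names (diff_state, sync_status, sync) and the labels "missing", "extra" and "modified" are assumptions made for the example. Only the OutOfSync status name comes from the text above; "Synced" is used here as its counterpart.

```python
import copy
from typing import Any, Dict

# Hypothetical stand-in for a set of manifests: resource name -> spec.
State = Dict[str, Dict[str, Any]]


def diff_state(desired: State, live: State) -> Dict[str, str]:
    """Classify every resource whose live state deviates from the desired state."""
    report: Dict[str, str] = {}
    for name in sorted(set(desired) | set(live)):
        if name not in live:
            report[name] = "missing"    # declared in Git, absent from the cluster
        elif name not in desired:
            report[name] = "extra"      # running in the cluster, not declared in Git
        elif desired[name] != live[name]:
            report[name] = "modified"   # present on both sides, but the specs differ
    return report


def sync_status(desired: State, live: State) -> str:
    """An application is OutOfSync as soon as any resource deviates."""
    return "OutOfSync" if diff_state(desired, live) else "Synced"


def sync(desired: State) -> State:
    """Toy sync: the new live state is simply a copy of the desired state."""
    return copy.deepcopy(desired)


desired = {"web": {"replicas": 3}, "svc": {"port": 80}}
live = {"web": {"replicas": 2}, "old-job": {"image": "job:1"}}

report = diff_state(desired, live)
status_before = sync_status(desired, live)
live = sync(desired)
status_after = sync_status(desired, live)
```

A real controller has to do considerably more, for example normalizing fields that the cluster fills in by itself and deciding whether resources classified as "extra" should be deleted at all; the sketch deliberately ignores both.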


Features

  • Automated deployment of applications to specified target environments
  • Support for multiple config management/templating tools (Kustomize, Helm, Ksonnet, Jsonnet, plain-YAML)
  • Ability to manage and deploy to multiple clusters
  • SSO Integration (OIDC, OAuth2, LDAP, SAML 2.0, GitHub, GitLab, Microsoft, LinkedIn)
  • Multi-tenancy and RBAC policies for authorization
  • Rollback/Roll-anywhere to any application configuration committed in Git repository
  • Health status analysis of application resources
  • Automated configuration drift detection and visualization
  • Automated or manual syncing of applications to their desired state
  • Web UI which provides real-time view of application activity
  • CLI for automation and CI integration
  • Webhook integration (GitHub, BitBucket, GitLab)
  • Access tokens for automation
  • PreSync, Sync, PostSync hooks to support complex application rollouts (e.g. blue/green & canary upgrades)
  • Audit trails for application events and API calls
  • Prometheus metrics
  • Parameter overrides for overriding ksonnet/helm parameters in Git

Development Status

Argo CD is being actively developed by the community. Our releases can be found here.


Organizations who have officially adopted Argo CD can be found here.

” (WP)


Fair Use Sources:

Bibliography Cloud DevOps Kubernetes Software Engineering

Kubernetes and Containerization Bibliography

See also Kubernetes

Artificial Intelligence Cloud Data Science - Big Data Networking Software Engineering

AIoT – Artificial intelligence of Things

” (WP)

The Artificial Intelligence of Things (AIoT) is the combination of artificial intelligence (AI) technologies with the Internet of things (IoT) infrastructure to achieve more efficient IoT operations, improve human-machine interactions, and enhance data management and analytics.[1][2][3]



  1. ^ Ghosh, Iman (12 August 2020). “AIoT: When Artificial Intelligence Meets the Internet of Things”. Visual Capitalist. Retrieved 22 September 2020.
  2. ^ Lin, Yu-Jin; Chuang, Chen-Wei; Yen, Chun-Yueh; Huang, Sheng-Hsin; Huang, Peng-Wei; Chen, Ju-Yi; Lee, Shuenn-Yuh (March 2019). “Artificial Intelligence of Things Wearable System for Cardiac Disease Detection”. 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS): 67–70. doi:10.1109/AICAS.2019.8771630. Retrieved 22 September 2020.
  3. ^ Chu, William Cheng-Chung; Shih, Chihhsiong; Chou, Wen-Yi; Ahamed, Sheikh Iqbal; Hsiung, Pao-Ann (November 2019). “Artificial Intelligence of Things in Sports Science: Weight Training as an Example”. Computer. 52 (11): 52–61. doi:10.1109/MC.2019.2933772. ISSN 1558-0814. Retrieved 22 September 2020.


” (WP)


Fair Use Sources:

Data Science - Big Data Software Engineering

Ubiquitous computing – Ubicomp

” (WP)

Ubiquitous computing (or “ubicomp”) is a concept in software engineering and computer science where computing is made to appear anytime and everywhere. In contrast to desktop computing, ubiquitous computing can occur using any device, in any location, and in any format. A user interacts with the computer, which can exist in many different forms, including laptop computers, tablets and terminals in everyday objects such as a refrigerator or a pair of glasses. The underlying technologies to support ubiquitous computing include the Internet, advanced middleware, operating systems, mobile code, sensors, microprocessors, new I/O and user interfaces, computer networks, mobile protocols, location and positioning, and new materials.

This paradigm is also described as pervasive computing,[1] ambient intelligence,[2] or “everyware”.[3] Each term emphasizes slightly different aspects. When primarily concerning the objects involved, it is also known as physical computing, the Internet of Things (IoT), haptic computing,[4] and “things that think”. Rather than propose a single definition for ubiquitous computing and for these related terms, a taxonomy of properties for ubiquitous computing has been proposed, from which different kinds or flavors of ubiquitous systems and applications can be described.[5]

Ubiquitous computing touches on distributed computing, mobile computing, location computing, mobile networking, sensor networks, human–computer interaction, context-aware smart home technologies, and artificial intelligence.

Core concepts

Ubiquitous computing is the concept of using small, inexpensive, internet-connected computers to help with everyday functions in an automated fashion. For example, a domestic ubiquitous computing environment might interconnect lighting and environmental controls with personal biometric monitors woven into clothing so that illumination and heating conditions in a room might be modulated, continuously and imperceptibly. Another common scenario posits refrigerators “aware” of their suitably tagged contents, able to both plan a variety of menus from the food actually on hand, and warn users of stale or spoiled food.[6]

Ubiquitous computing presents challenges across computer science: in systems design and engineering, in systems modelling, and in user interface design. Contemporary human-computer interaction models, whether command-line, menu-driven, or GUI-based, are inappropriate and inadequate to the ubiquitous case. This suggests that the “natural” interaction paradigm appropriate to fully robust ubiquitous computing has yet to emerge, although there is also recognition in the field that in many ways we are already living in a ubicomp world (see also the main article on natural user interfaces). Contemporary devices that lend some support to this latter idea include mobile phones, digital audio players, radio-frequency identification tags, GPS, and interactive whiteboards.

Mark Weiser proposed three basic forms for ubiquitous computing devices:[7]

  • Tabs: a wearable device that is approximately a centimeter in size
  • Pads: a hand-held device that is approximately a decimeter in size
  • Boards: an interactive larger display device that is approximately a meter in size

Ubiquitous computing devices proposed by Mark Weiser are all based around flat devices of different sizes with a visual display.[8] Expanding beyond those concepts there is a large array of other ubiquitous computing devices that could exist. Some of the additional forms that have been conceptualized are:[5]

  • Dust: miniaturized devices can be without visual output displays, e.g. micro electro-mechanical systems (MEMS), ranging from nanometres through micrometers to millimetres. See also Smart dust.
  • Skin: fabrics based upon light emitting and conductive polymers, organic computer devices, can be formed into more flexible non-planar display surfaces and products such as clothes and curtains, see OLED display. MEMS device can also be painted onto various surfaces so that a variety of physical world structures can act as networked surfaces of MEMS.
  • Clay: ensembles of MEMS can be formed into arbitrary three dimensional shapes as artefacts resembling many different kinds of physical object (see also tangible interface).

In Manuel Castells’ book The Rise of the Network Society, Castells puts forth the concept that there is going to be a continuous evolution of computing devices. He states we will progress from stand-alone microcomputers and decentralized mainframes towards pervasive computing. Castells’ model of a pervasive computing system uses the example of the Internet as the start of a pervasive computing system. The logical progression from that paradigm is a system where that networking logic becomes applicable in every realm of daily activity, in every location and every context. Castells envisages a system where billions of miniature, ubiquitous inter-communication devices will be spread worldwide, “like pigment in the wall paint”.

Ubiquitous computing may be seen to consist of many layers, each with their own roles, which together form a single system:

  • Layer 1: Task management layer
    • Monitors the user’s task, context and index
    • Maps the user’s task to the need for services in the environment
    • Manages complex dependencies
  • Layer 2: Environment management layer
    • Monitors a resource and its capabilities
    • Maps service needs to user-level states of specific capabilities
  • Layer 3: Environment layer
    • Monitors a relevant resource
    • Manages the reliability of the resources


History

Mark Weiser coined the phrase “ubiquitous computing” around 1988, during his tenure as Chief Technologist of the Xerox Palo Alto Research Center (PARC). Both alone and with PARC Director and Chief Scientist John Seely Brown, Weiser wrote some of the earliest papers on the subject, largely defining it and sketching out its major concerns.[7][9][10]

Recognizing the effects of extending processing power

Recognizing that the extension of processing power into everyday scenarios would necessitate understandings of social, cultural and psychological phenomena beyond its proper ambit, Weiser was influenced by many fields outside computer science, including “philosophy, phenomenology, anthropology, psychology, and sociology of science”. He was explicit about “the humanistic origins of the ‘invisible ideal’”,[10] referencing as well the ironically dystopian Philip K. Dick novel Ubik.

Andy Hopper of Cambridge University, UK, proposed and demonstrated the concept of “Teleporting”, where applications follow the user wherever they move.

Roy Want, while a researcher and student working under Andy Hopper at Cambridge University, worked on the “Active Badge System”, an advanced location computing system in which personal mobility is merged with computing.

Bill Schilit (now at Google) also did some earlier work on this topic, and participated in the early Mobile Computing workshop held in Santa Cruz in 1996.

Ken Sakamura of the University of Tokyo, Japan, leads the Ubiquitous Networking Laboratory (UNL), Tokyo, as well as the T-Engine Forum. The joint goal of Sakamura’s Ubiquitous Networking specification and the T-Engine Forum is to enable any everyday device to broadcast and receive information.[11][12]

MIT has also contributed significant research in this field, notably the Things That Think consortium (directed by Hiroshi Ishii, Joseph A. Paradiso and Rosalind Picard) at the Media Lab[13] and the CSAIL effort known as Project Oxygen.[14] Other major contributors include University of Washington’s Ubicomp Lab (directed by Shwetak Patel), Dartmouth College’s DartNets Lab, Georgia Tech’s College of Computing, Cornell University’s People Aware Computing Lab, NYU’s Interactive Telecommunications Program, UC Irvine’s Department of Informatics, Microsoft Research, Intel Research and Equator,[15] and Ajou University UCRi & CUS.[16]


Examples

One of the earliest ubiquitous systems was artist Natalie Jeremijenko’s “Live Wire”, also known as “Dangling String”, installed at Xerox PARC during Mark Weiser’s time there.[17] This was a piece of string attached to a stepper motor and controlled by a LAN connection; network activity caused the string to twitch, yielding a peripherally noticeable indication of traffic. Weiser called this an example of calm technology.[18]
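
As a rough illustration of how network activity might be turned into string movement, the Python sketch below maps an observed traffic rate to a number of stepper-motor steps. Everything in it is an assumption made for the example: the function name twitch_steps, the linear mapping, and the max_rate and max_steps parameters are hypothetical and do not describe how the original installation was built.

```python
def twitch_steps(packets_per_second: float,
                 max_rate: float = 1000.0,
                 max_steps: int = 20) -> int:
    """Map a traffic rate to a number of motor steps (hypothetical linear mapping).

    Rates at or above max_rate saturate at max_steps, so a busy network
    produces a strong but bounded twitch; an idle network produces none.
    """
    if max_rate <= 0:
        raise ValueError("max_rate must be positive")
    # Clamp the normalized load into the range [0, 1].
    load = min(max(packets_per_second, 0.0) / max_rate, 1.0)
    return round(load * max_steps)


quiet = twitch_steps(0)        # idle network
moderate = twitch_steps(500)   # half of the assumed maximum rate
busy = twitch_steps(5000)      # far above the assumed maximum rate
```

The point of the design is that the output is ambient: the magnitude of the movement carries the information, and nothing demands the viewer's direct attention.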

A present manifestation of this trend is the widespread diffusion of mobile phones. Many mobile phones support high speed data transmission, video services, and other services with powerful computational ability. Although these mobile devices are not necessarily manifestations of ubiquitous computing, there are examples, such as Japan’s Yaoyorozu (“Eight Million Gods”) Project, in which mobile devices coupled with radio frequency identification tags demonstrate that ubiquitous computing is already present in some form.[19]

Ambient Devices has produced an “orb”, a “dashboard”, and a “weather beacon”: these decorative devices receive data from a wireless network and report current events, such as stock prices and the weather, like the Nabaztag produced by Violet Snowden.

The Australian futurist Mark Pesce has produced a highly configurable 52-LED, LAMP-enabled lamp that uses Wi-Fi, named MooresCloud after Gordon Moore.[20]

The Unified Computer Intelligence Corporation launched a device called Ubi – The Ubiquitous Computer designed to allow voice interaction with the home and provide constant access to information.[21]

Ubiquitous computing research has focused on building an environment in which computers allow humans to focus attention on select aspects of the environment and operate in supervisory and policy-making roles. Ubiquitous computing emphasizes the creation of a human computer interface that can interpret and support a user’s intentions. For example, MIT’s Project Oxygen seeks to create a system in which computation is as pervasive as air:

In the future, computation will be human centered. It will be freely available everywhere, like batteries and power sockets, or oxygen in the air we breathe…We will not need to carry our own devices around with us. Instead, configurable generic devices, either handheld or embedded in the environment, will bring computation to us, whenever we need it and wherever we might be. As we interact with these “anonymous” devices, they will adopt our information personalities. They will respect our desires for privacy and security. We won’t have to type, click, or learn new computer jargon. Instead, we’ll communicate naturally, using speech and gestures that describe our intent…[22]

This is a fundamental transition that does not seek to escape the physical world and “enter some metallic, gigabyte-infested cyberspace” but rather brings computers and communications to us, making them “synonymous with the useful tasks they perform”.[19]

Network robots link ubiquitous networks with robots, contributing to the creation of new lifestyles and solutions to address a variety of social problems including the aging of population and nursing care.[23]


Issues

Privacy is easily the most often-cited criticism of ubiquitous computing (ubicomp), and may be the greatest barrier to its long-term success.[24]

Public policy problems are often “preceded by long shadows, long trains of activity”, emerging slowly, over decades or even the course of a century. There is a need for a long-term view to guide policy decision making, as this will assist in identifying long-term problems or opportunities related to the ubiquitous computing environment. This information can reduce uncertainty and guide the decisions of both policy makers and those directly involved in system development (Wedemeyer et al. 2001). One important consideration is the degree to which different opinions form around a single problem. Some issues may have strong consensus about their importance, even if there are great differences in opinion regarding the cause or solution. For example, few people will differ in their assessment of a highly tangible problem with physical impact such as terrorists using new weapons of mass destruction to destroy human life. The problem statements outlined above that address the future evolution of the human species or challenges to identity have clear cultural or religious implications and are likely to have greater variance in opinion about them.[19]

Ubiquitous computing research centres

This is a list of notable institutions that claim to have a focus on ubiquitous computing, sorted by country:

Canada

Topological Media Lab, Concordia University, Canada

Finland

Community Imaging Group, University of Oulu, Finland

Germany

Telecooperation Office (TECO), Karlsruhe Institute of Technology, Germany

India

Ubiquitous Computing Research Resource Centre (UCRC), Centre for Development of Advanced Computing[25]

Pakistan

Centre for Research in Ubiquitous Computing (CRUC), Karachi, Pakistan

Sweden

Mobile Life Centre, Stockholm University

United Kingdom

Mixed Reality Lab, University of Nottingham



  1. ^ Nieuwdorp, E. (2007). “The pervasive discourse”. Computers in Entertainment. 5 (2): 13. doi:10.1145/1279540.1279553. S2CID 17759896.
  2. ^ Hansmann, Uwe (2003). Pervasive Computing: The Mobile World. Springer. ISBN 978-3-540-00218-5.
  3. ^ Greenfield, Adam (2006). Everyware: The Dawning Age of Ubiquitous Computing. New Riders. pp. 11–12. ISBN 978-0-321-38401-0.
  4. ^ “World Haptics Conferences”. Haptics Technical Committee. Archived from the original on 16 November 2011.
  5. a b Poslad, Stefan (2009). Ubiquitous Computing Smart Devices, Smart Environments and Smart Interaction (PDF). Wiley. ISBN 978-0-470-03560-3.
  6. ^ Kang, Byeong-Ho (January 2007). “Ubiquitous Computing Environment Threats and Defensive Measures”. International Journal of Multimedia and Ubiquitous Engineering. 2 (1): 47–60. Retrieved 2019-03-22.
  7. a b Weiser, Mark (1991). “The Computer for the 21st Century”. Archived from the original on 22 October 2014.
  8. ^ Weiser, Mark (March 23, 1993). “Some Computer Science Issues in Ubiquitous Computing”. CACM. Retrieved May 28, 2019.
  9. ^ Weiser, M.; Gold, R.; Brown, J.S. (1999-05-11). “Ubiquitous computing”. Archived from the original on 10 March 2009.
  10. a b Weiser, Mark (17 March 1996). “Ubiquitous computing”. Archived from the original on 2 June 2018.
  11. ^ Krikke, J (2005). “T-Engine: Japan’s ubiquitous computing architecture is ready for prime time”. IEEE Pervasive Computing. 4 (2): 4–9. doi:10.1109/MPRV.2005.40. S2CID 11365911.
  12. ^ “T-Engine Forum Summary”. Archived from the original on 21 October 2018. Retrieved 25 August 2011.
  13. ^ “MIT Media Lab – Things That Think Consortium”. MIT. Retrieved 2007-11-03.
  14. ^ “MIT Project Oxygen: Overview”. MIT. Retrieved 2007-11-03.
  15. ^ “Equator”. UCL. Retrieved 2009-11-19.
  16. ^ “Center of excellence for Ubiquitous System” (in Korean). CUS. Archived from the original on 2 October 2011.
  17. ^ Weiser, Mark (2017-05-03). “Designing Calm Technology”. Retrieved May 27, 2019.
  18. ^ Weiser, Mark; Gold, Rich; Brown, John Seely (1999). “The Origins of Ubiquitous Computing Research at PARC in the Late 1980s”. IBM Systems Journal. 38 (4): 693. doi:10.1147/sj.384.0693. S2CID 38805890.
  19. a b c Winter, Jenifer (December 2008). “Emerging Policy Problems Related to Ubiquitous Computing: Negotiating Stakeholders’ Visions of the Future”. Knowledge, Technology & Policy. 21 (4): 191–203. doi:10.1007/s12130-008-9058-4. hdl:10125/63534. S2CID 109339320.
  20. ^ Fingas, Jon (13 October 2012). “MooresCloud Light runs Linux, puts LAMP on your lamp (video)”. Retrieved 22 March 2019.
  21. ^ “Ubi Cloud”. Archived from the original on 2 January 2015.
  22. ^ “MIT Project Oxygen: Overview”. Archived from the original on July 5, 2004.
  23. ^ “Network Robot Forum”. Archived from the original on October 24, 2007.
  24. ^ Hong, Jason I.; Landay, James A. (June 2004). “An architecture for privacy-sensitive ubiquitous computing” (PDF). Proceedings of the 2nd international conference on Mobile systems, applications, and services – MobiSYS ’04. pp. 177–189. doi:10.1145/990064.990087. ISBN 1581137931. S2CID 3776760.
  25. ^ “Ubiquitous Computing Projects”. Department of Electronics & Information Technology (DeitY). Ministry of Communications & IT, Government of India. Archived from the original on 2015-07-07. Retrieved 2015-07-07.


” (WP)


Fair Use Sources:

Artificial Intelligence Cloud Data Science - Big Data Software Engineering

AI – Artificial intelligence

See also: Artificial Intelligence (AI) Coined – 1955 AD

” (WP)

Artificial intelligence (AI) is intelligence demonstrated by machines, unlike the natural intelligence displayed by humans and animals, which involves consciousness and emotionality. The distinction between the former and the latter categories is often revealed by the acronym chosen. ‘Strong’ AI is usually labelled as artificial general intelligence (AGI) while attempts to emulate ‘natural’ intelligence have been called artificial biological intelligence (ABI). Leading AI textbooks define the field as the study of “intelligent agents”: any device that perceives its environment and takes actions that maximize its chance of successfully achieving its goals.[3] Colloquially, the term “artificial intelligence” is often used to describe machines that mimic “cognitive” functions that humans associate with the human mind, such as “learning” and “problem solving”.[4]
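
The “intelligent agent” definition can be made concrete with a very small Python sketch: an agent receives a percept and picks the action whose expected utility is highest. The thermostat scenario, the names choose_action, EFFECTS and TARGET, and the utility function are all hypothetical choices made for this illustration; textbook agents are considerably richer.

```python
from typing import Callable, Iterable, TypeVar

P = TypeVar("P")  # type of percepts
A = TypeVar("A")  # type of actions


def choose_action(percept: P,
                  actions: Iterable[A],
                  expected_utility: Callable[[P, A], float]) -> A:
    """Pick the action that maximizes expected utility for the current percept."""
    return max(actions, key=lambda action: expected_utility(percept, action))


# Hypothetical thermostat agent: the percept is the room temperature in °C,
# and the goal is to end up as close as possible to TARGET.
TARGET = 21.0
EFFECTS = {"heat": +1.0, "cool": -1.0, "idle": 0.0}  # assumed effect of each action


def utility(temperature: float, action: str) -> float:
    # Higher is better: negative distance from the target after acting.
    return -abs(temperature + EFFECTS[action] - TARGET)


cold_choice = choose_action(18.0, EFFECTS, utility)
hot_choice = choose_action(25.0, EFFECTS, utility)
ok_choice = choose_action(21.0, EFFECTS, utility)
```

Here the agent heats a cold room, cools a hot one, and idles when the temperature already matches the goal.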

As machines become increasingly capable, tasks considered to require “intelligence” are often removed from the definition of AI, a phenomenon known as the AI effect.[5] A quip in Tesler’s Theorem says “AI is whatever hasn’t been done yet.”[6] For instance, optical character recognition is frequently excluded from things considered to be AI,[7] having become a routine technology.[8] Modern machine capabilities generally classified as AI include successfully understanding human speech,[9] competing at the highest level in strategic game systems (such as chess and Go),[10] and also imperfect-information games like poker,[11] self-driving cars, intelligent routing in content delivery networks, and military simulations.[12]

Artificial intelligence was founded as an academic discipline in 1955, and in the years since has experienced several waves of optimism,[13][14] followed by disappointment and the loss of funding (known as an “AI winter”),[15][16] followed by new approaches, success and renewed funding.[14][17] After AlphaGo successfully defeated a professional Go player in 2015, artificial intelligence once again attracted widespread global attention.[18] For most of its history, AI research has been divided into sub-fields that often fail to communicate with each other.[19] These sub-fields are based on technical considerations, such as particular goals (e.g. “robotics” or “machine learning”),[20] the use of particular tools (“logic” or artificial neural networks), or deep philosophical differences.[23][24][25] Sub-fields have also been based on social factors (particular institutions or the work of particular researchers).[19]

The traditional problems (or goals) of AI research include reasoning, knowledge representation, planning, learning, natural language processing, perception and the ability to move and manipulate objects.[20] AGI is among the field’s long-term goals.[26] Approaches include statistical methods, computational intelligence, and traditional symbolic AI. Many tools are used in AI, including versions of search and mathematical optimization, artificial neural networks, and methods based on statistics, probability and economics. The AI field draws upon computer science, information engineering, mathematics, psychology, linguistics, philosophy, and many other fields.

The field was founded on the assumption that human intelligence “can be so precisely described that a machine can be made to simulate it”.[27] This raises philosophical arguments about the mind and the ethics of creating artificial beings endowed with human-like intelligence. These issues have been explored by myth, fiction and philosophy since antiquity.[32] Some people also consider AI to be a danger to humanity if it progresses unabated.[33][34] Others believe that AI, unlike previous technological revolutions, will create a risk of mass unemployment.[35]

In the twenty-first century, AI techniques have experienced a resurgence following concurrent advances in computer power, large amounts of data, and theoretical understanding; and AI techniques have become an essential part of the technology industry, helping to solve many challenging problems in computer science, software engineering and operations research.[36][17]


History

Main articles: History of artificial intelligence and Timeline of artificial intelligence

Figure: Silver didrachma from Crete depicting Talos, an ancient mythical automaton with artificial intelligence

Thought-capable artificial beings appeared as storytelling devices in antiquity,[37] and have been common in fiction, as in Mary Shelley’s Frankenstein or Karel Čapek’s R.U.R.[38] These characters and their fates raised many of the same issues now discussed in the ethics of artificial intelligence.[32]

The study of mechanical or “formal” reasoning began with philosophers and mathematicians in antiquity. The study of mathematical logic led directly to Alan Turing’s theory of computation, which suggested that a machine, by shuffling symbols as simple as “0” and “1”, could simulate any conceivable act of mathematical deduction. This insight, that digital computers can simulate any process of formal reasoning, is known as the Church–Turing thesis.[39] Along with concurrent discoveries in neurobiology, information theory and cybernetics, this led researchers to consider the possibility of building an electronic brain. Turing proposed changing the question from whether a machine was intelligent, to “whether or not it is possible for machinery to show intelligent behaviour”.[40] The first work that is now generally recognized as AI was McCulloch and Pitts’ 1943 formal design for Turing-complete “artificial neurons”.[41]
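
To give a feel for what such an “artificial neuron” computes, the following Python sketch implements a simple binary threshold unit: it outputs 1 when the weighted sum of its binary inputs reaches a threshold, and 0 otherwise. This is a common textbook simplification with signed weights, not a reproduction of the 1943 formulation, and the names threshold_unit, and_gate, or_gate and not_gate are chosen for this example only.

```python
from typing import Sequence


def threshold_unit(inputs: Sequence[int],
                   weights: Sequence[int],
                   threshold: int) -> int:
    """Fire (return 1) when the weighted sum of binary inputs reaches the threshold."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0


def and_gate(a: int, b: int) -> int:
    # Both inputs must be active to reach a threshold of 2.
    return threshold_unit([a, b], [1, 1], threshold=2)


def or_gate(a: int, b: int) -> int:
    # A single active input is enough to reach a threshold of 1.
    return threshold_unit([a, b], [1, 1], threshold=1)


def not_gate(a: int) -> int:
    # A negative weight makes an active input suppress firing.
    return threshold_unit([a], [-1], threshold=0)


and_table = [and_gate(a, b) for a in (0, 1) for b in (0, 1)]
or_table = [or_gate(a, b) for a in (0, 1) for b in (0, 1)]
not_table = [not_gate(a) for a in (0, 1)]
```

Because AND, OR and NOT can all be expressed this way, networks of such units can represent any Boolean function, which is part of why the design attracted attention.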

The field of AI research was born at a workshop at Dartmouth College in 1956,[42] where the term “Artificial Intelligence” was coined by John McCarthy to distinguish the field from cybernetics and escape the influence of the cyberneticist Norbert Wiener.[43] Attendees Allen Newell (CMU), Herbert Simon (CMU), John McCarthy (MIT), Marvin Minsky (MIT) and Arthur Samuel (IBM) became the founders and leaders of AI research.[44] They and their students produced programs that the press described as “astonishing”:[45] computers were learning checkers strategies (c. 1954)[46] (and by 1959 were reportedly playing better than the average human),[47] solving word problems in algebra, proving logical theorems (Logic Theorist, first run c. 1956) and speaking English.[48] By the middle of the 1960s, research in the U.S. was heavily funded by the Department of Defense[49] and laboratories had been established around the world.[50] AI’s founders were optimistic about the future: Herbert Simon predicted, “machines will be capable, within twenty years, of doing any work a man can do”. Marvin Minsky agreed, writing, “within a generation … the problem of creating ‘artificial intelligence’ will substantially be solved”.[13]

They failed to recognize the difficulty of some of the remaining tasks. Progress slowed and in 1974, in response to the criticism of Sir James Lighthill[51] and ongoing pressure from the US Congress to fund more productive projects, both the U.S. and British governments cut off exploratory research in AI. The next few years would later be called an “AI winter”,[15] a period when obtaining funding for AI projects was difficult.

In the early 1980s, AI research was revived by the commercial success of expert systems,[52] a form of AI program that simulated the knowledge and analytical skills of human experts. By 1985, the market for AI had reached over a billion dollars. At the same time, Japan’s fifth generation computer project inspired the U.S. and British governments to restore funding for academic research.[14] However, beginning with the collapse of the Lisp Machine market in 1987, AI once again fell into disrepute, and a second, longer-lasting hiatus began.[16]

The development of metal–oxide–semiconductor (MOS) very-large-scale integration (VLSI), in the form of complementary MOS (CMOS) transistor technology, enabled the development of practical artificial neural network (ANN) technology in the 1980s. A landmark publication in the field was the 1989 book Analog VLSI Implementation of Neural Systems by Carver A. Mead and Mohammed Ismail.[53]

In the late 1990s and early 21st century, AI began to be used for logistics, data mining, medical diagnosis and other areas.[36] The success was due to increasing computational power (see Moore’s law and transistor count), greater emphasis on solving specific problems, new ties between AI and other fields (such as statistics, economics and mathematics), and a commitment by researchers to mathematical methods and scientific standards.[54] Deep Blue became the first computer chess-playing system to beat a reigning world chess champion, Garry Kasparov, on 11 May 1997.[55]

In 2011, in a Jeopardy! quiz show exhibition match, IBM’s question answering system, Watson, defeated the two greatest Jeopardy! champions, Brad Rutter and Ken Jennings, by a significant margin.[56] Faster computers, algorithmic improvements, and access to large amounts of data enabled advances in machine learning and perception; data-hungry deep learning methods started to dominate accuracy benchmarks around 2012.[57] The Kinect, which provides a 3D body–motion interface for the Xbox 360 and the Xbox One, uses algorithms that emerged from lengthy AI research,[58] as do intelligent personal assistants in smartphones.[59] In March 2016, AlphaGo won 4 out of 5 games of Go in a match with Go champion Lee Sedol, becoming the first computer Go-playing system to beat a professional Go player without handicaps.[10][60] In the 2017 Future of Go Summit, AlphaGo won a three-game match with Ke Jie,[61] who at the time had continuously held the world No. 1 ranking for two years.[62][63] Deep Blue’s Murray Campbell called AlphaGo’s victory “the end of an era… board games are more or less done[64] and it’s time to move on.”[65] This marked the completion of a significant milestone in the development of artificial intelligence, as Go is a relatively complex game, more so than chess. AlphaGo was later improved and generalized to other games such as chess with AlphaZero,[66] and then with MuZero[67] to play many different video games, which were previously handled separately,[68] in addition to board games. Other programs handle imperfect-information games; in poker, the bots Pluribus[69] and Cepheus[11] play at a superhuman level. See: General game playing.

According to Bloomberg’s Jack Clark, 2015 was a landmark year for artificial intelligence, with the number of software projects that use AI within Google increasing from “sporadic usage” in 2012 to more than 2,700 projects. Clark also presents factual data indicating the improvements of AI since 2012, supported by lower error rates in image processing tasks.[70] He attributes this to an increase in affordable neural networks, due to a rise in cloud computing infrastructure and to an increase in research tools and datasets.[17] Other cited examples include Microsoft’s development of a Skype system that can automatically translate from one language to another and Facebook’s system that can describe images to blind people.[70] In a 2017 survey, one in five companies reported they had “incorporated AI in some offerings or processes”.[71][72] Around 2016, China greatly accelerated its government funding; given its large supply of data and its rapidly increasing research output, some observers believe it may be on track to becoming an “AI superpower”.[73][74]

By 2020, Natural Language Processing systems such as the enormous GPT-3 (then by far the largest artificial neural network) were matching human performance on pre-existing benchmarks, albeit without the system attaining commonsense understanding of the contents of the benchmarks.[75] DeepMind’s AlphaFold 2 (2020) demonstrated the ability to determine, in hours rather than months, the 3D structure of a protein. Facial recognition advanced to where, under some circumstances, some systems claim to have a 99% accuracy rate.[76]


Computer science defines AI research as the study of “intelligent agents”: any device that perceives its environment and takes actions that maximize its chance of successfully achieving its goals.[3] A more elaborate definition characterizes AI as “a system’s ability to correctly interpret external data, to learn from such data, and to use those learnings to achieve specific goals and tasks through flexible adaptation.”[77]

A typical AI analyzes its environment and takes actions that maximize its chance of success.[3] An AI’s intended utility function (or goal) can be simple (“1 if the AI wins a game of Go, 0 otherwise”) or complex (“Perform actions mathematically similar to ones that succeeded in the past”). Goals can be explicitly defined or induced. If the AI is programmed for “reinforcement learning”, goals can be implicitly induced by rewarding some types of behavior or punishing others.[a] Alternatively, an evolutionary system can induce goals by using a “fitness function” to mutate and preferentially replicate high-scoring AI systems, similar to how animals evolved to innately desire certain goals such as finding food.[78] Some AI systems, such as nearest-neighbor, instead reason by analogy; these systems are not generally given goals, except to the degree that goals are implicit in their training data.[79] Such systems can still be benchmarked if the non-goal system is framed as a system whose “goal” is to successfully accomplish its narrow classification task.[80]

AI often revolves around the use of algorithms. An algorithm is a set of unambiguous instructions that a mechanical computer can execute.[b] A complex algorithm is often built on top of other, simpler, algorithms. A simple example of an algorithm is the following (optimal for first player) recipe for play at tic-tac-toe:[81]

  1. If someone has a “threat” (that is, two in a row), take the remaining square. Otherwise,
  2. if a move “forks” to create two threats at once, play that move. Otherwise,
  3. take the center square if it is free. Otherwise,
  4. if your opponent has played in a corner, take the opposite corner. Otherwise,
  5. take an empty corner if one exists. Otherwise,
  6. take any empty square.
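The six rules above can be written down almost line for line. The following is a minimal Python sketch, not a reference implementation: the board layout (a list of nine cells in row-major order), the names `threats` and `choose_move`, and the choice to complete one's own threat before blocking the opponent's are all assumptions made for illustration.

```python
from typing import List, Optional

Board = List[Optional[str]]  # nine cells, row-major; each is "X", "O", or None

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals
OPPOSITE_CORNER = {0: 8, 2: 6, 6: 2, 8: 0}


def threats(board: Board, player: str) -> List[int]:
    """Return the empty squares that would complete three in a row for `player`."""
    squares = []
    for line in LINES:
        marks = [board[i] for i in line]
        if marks.count(player) == 2 and marks.count(None) == 1:
            squares.append(line[marks.index(None)])
    return squares


def choose_move(board: Board, me: str) -> int:
    """Pick a square for `me` by applying the six rules in order."""
    opponent = "O" if me == "X" else "X"
    empty = [i for i in range(9) if board[i] is None]
    if not empty:
        raise ValueError("the board is full")
    # Rule 1: take the remaining square of a threat (own threat first: an assumption).
    for player in (me, opponent):
        open_threats = threats(board, player)
        if open_threats:
            return open_threats[0]
    # Rule 2: play a move that creates two threats at once (a fork).
    for square in empty:
        trial = board.copy()
        trial[square] = me
        if len(set(threats(trial, me))) >= 2:
            return square
    # Rule 3: take the center if it is free.
    if board[4] is None:
        return 4
    # Rule 4: answer an opponent's corner with the opposite corner.
    for corner, opposite in OPPOSITE_CORNER.items():
        if board[corner] == opponent and board[opposite] is None:
            return opposite
    # Rule 5: take any empty corner.
    for corner in OPPOSITE_CORNER:
        if board[corner] is None:
            return corner
    # Rule 6: take any empty square.
    return empty[0]
```

Calling `choose_move([None] * 9, "X")` falls through to rule 3 and picks the center square, index 4.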

Many AI algorithms are capable of learning from data; they can enhance themselves by learning new heuristics (strategies, or “rules of thumb”, that have worked well in the past), or can themselves write other algorithms. Some of the “learners” described below, including Bayesian networks, decision trees, and nearest-neighbor, could theoretically (given infinite data, time, and memory) learn to approximate any function, including which combination of mathematical functions would best describe the world.[citation needed] These learners could therefore derive all possible knowledge, by considering every possible hypothesis and matching them against the data. In practice, it is seldom possible to consider every possibility, because of the phenomenon of “combinatorial explosion”, where the time needed to solve a problem grows exponentially. Much of AI research involves figuring out how to identify and avoid considering a broad range of possibilities unlikely to be beneficial.[82][83] For example, when viewing a map and looking for the shortest driving route from Denver to New York in the East, one can in most cases skip looking at any path through San Francisco or other areas far to the West; thus, an AI wielding a pathfinding algorithm like A* can avoid the combinatorial explosion that would ensue if every possible route had to be ponderously considered.[84]
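To make the pruning effect of A* concrete, here is a small Python sketch on a grid map. The grid encoding (0 for a free cell, 1 for a wall), the function name `a_star`, and the `use_heuristic` switch are assumptions for illustration; with the switch off the search degenerates into an uninformed uniform-cost search, which is the "consider everything" baseline.

```python
import heapq


def a_star(grid, start, goal, use_heuristic=True):
    """Return (path_cost, cells_expanded) on a 4-connected grid, or None if unreachable.

    grid[r][c] == 0 means free, 1 means wall; every step costs 1.
    """
    rows, cols = len(grid), len(grid[0])

    def estimate(cell):
        # Manhattan distance never overestimates the remaining cost on this grid.
        if not use_heuristic:
            return 0
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    best_cost = {start: 0}
    frontier = [(estimate(start), 0, start)]  # (cost so far + estimate, cost so far, cell)
    expanded = 0
    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cost > best_cost[cell]:
            continue  # a cheaper route to this cell was already processed
        expanded += 1
        if cell == goal:
            return cost, expanded
        row, col = cell
        for neighbor in ((row + 1, col), (row - 1, col), (row, col + 1), (row, col - 1)):
            r, c = neighbor
            if 0 <= r < rows and 0 <= c < cols and grid[r][c] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get(neighbor, float("inf")):
                    best_cost[neighbor] = new_cost
                    heapq.heappush(frontier, (new_cost + estimate(neighbor), new_cost, neighbor))
    return None


open_map = [[0] * 7 for _ in range(7)]
guided = a_star(open_map, (3, 0), (3, 6))
unguided = a_star(open_map, (3, 0), (3, 6), use_heuristic=False)
```

Both searches find the same shortest route, but on this open map the guided search expands only the cells along the straight line to the goal, while the unguided one fans out in every direction first.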

The earliest (and easiest to understand) approach to AI was symbolism (such as formal logic): “If an otherwise healthy adult has a fever, then they may have influenza”. A second, more general, approach is Bayesian inference: “If the current patient has a fever, adjust the probability they have influenza in such-and-such way”. The third major approach, extremely popular in routine business AI applications, is the use of analogizers such as SVM and nearest-neighbor: “After examining the records of known past patients whose temperature, symptoms, age, and other factors mostly match the current patient, X% of those patients turned out to have influenza”. A fourth approach is harder to intuitively understand, but is inspired by how the brain’s machinery works: the artificial neural network approach uses artificial “neurons” that can learn by comparing the network’s output to the desired output and altering the strengths of the connections between its internal neurons to “reinforce” connections that seemed to be useful. These four main approaches can overlap with each other and with evolutionary systems; for example, neural nets can learn to make inferences, to generalize, and to make analogies. Some systems implicitly or explicitly use multiple of these approaches, alongside many other AI and non-AI algorithms; the best approach is often different depending on the problem.[85][86]
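The analogizer idea in the influenza example can be sketched as a nearest-neighbor lookup. The patient records, the two features (temperature in °C and days of symptoms), and the function name below are invented for illustration; a real system would use many more features and would rescale them so that no single one dominates the distance.

```python
import math

# Hypothetical past patients: ((temperature_c, days_of_symptoms), had_influenza)
PAST_PATIENTS = [
    ((39.1, 2), True),
    ((38.8, 3), True),
    ((38.9, 1), False),
    ((36.8, 1), False),
    ((37.0, 4), False),
    ((36.9, 2), False),
]


def influenza_rate_among_neighbors(records, patient, k=3):
    """Fraction of the k most similar past patients who had influenza."""
    if not records:
        raise ValueError("at least one past record is required")
    # Sort past patients by Euclidean distance to the current patient.
    nearest = sorted(records, key=lambda record: math.dist(record[0], patient))[:k]
    return sum(1 for _, had_flu in nearest if had_flu) / len(nearest)


rate = influenza_rate_among_neighbors(PAST_PATIENTS, (39.0, 2))
```

Here `rate` plays the role of the "X%" in the quoted sentence: two of the three most similar records had influenza.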

Learning algorithms work on the basis that strategies, algorithms, and inferences that worked well in the past are likely to continue working well in the future. These inferences can be obvious, such as “since the sun rose every morning for the last 10,000 days, it will probably rise tomorrow morning as well”. They can be nuanced, such as “X% of families have geographically separate species with color variants, so there is a Y% chance that undiscovered black swans exist”. Learners also work on the basis of “Occam’s razor”: The simplest theory that explains the data is the likeliest. Therefore, according to Occam’s razor principle, a learner must be designed such that it prefers simpler theories to complex theories, except in cases where the complex theory is proven substantially better.

Settling on a bad, overly complex theory gerrymandered to fit all the past training data is known as overfitting. Many systems attempt to reduce overfitting by rewarding a theory in accordance with how well it fits the data, but penalizing the theory in accordance with how complex the theory is.[87] Besides classic overfitting, learners can also disappoint by “learning the wrong lesson”. A toy example is that an image classifier trained only on pictures of brown horses and black cats might conclude that all brown patches are likely to be horses.[88] A real-world example is that, unlike humans, current image classifiers often don’t primarily make judgments from the spatial relationship between components of the picture, and they learn relationships between pixels that humans are oblivious to, but that still correlate with images of certain types of real objects. Modifying these patterns on a legitimate image can result in “adversarial” images that the system misclassifies.[c][89][90]

A self-driving car system may use a neural network to determine which parts of the picture seem to match previous training images of pedestrians, and then model those areas as slow-moving but somewhat unpredictable rectangular prisms that must be avoided.
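The trade-off between fit and complexity described for overfitting can be illustrated with a tiny model-selection sketch. Everything here is an assumption chosen for illustration: the three candidate models, the use of a parameter count as the complexity measure, the penalty weight, and the data points.

```python
def fit_constant(xs, ys):
    """Simplest theory: always predict the mean. One parameter."""
    mean = sum(ys) / len(ys)
    return (lambda x: mean), 1


def fit_line(xs, ys):
    """Least-squares straight line. Two parameters."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return (lambda x: slope * x + intercept), 2


def fit_lookup_table(xs, ys):
    """Memorize every training point. One parameter per point."""
    table = dict(zip(xs, ys))
    mean = sum(ys) / len(ys)
    return (lambda x: table.get(x, mean)), len(table)


def select_model(xs, ys, penalty=0.05):
    """Score each candidate as mean squared error plus penalty * parameter count."""
    candidates = {"constant": fit_constant, "line": fit_line, "lookup table": fit_lookup_table}
    scores = {}
    for name, fit in candidates.items():
        model, n_params = fit(xs, ys)
        error = sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
        scores[name] = error + penalty * n_params
    return min(scores, key=scores.get), scores


xs = [0, 1, 2, 3, 4, 5]
ys = [0.1, 0.9, 2.2, 2.8, 4.1, 5.0]   # roughly y = x plus noise
chosen, scores = select_model(xs, ys)
```

With no penalty the lookup table wins because it reproduces the training data exactly; with a modest penalty the straight line is preferred, which is the behavior the penalty is meant to encourage.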

Compared with humans, existing AI lacks several features of human “commonsense reasoning”; most notably, humans have powerful mechanisms for reasoning about “naïve physics” such as space, time, and physical interactions. This enables even young children to easily make inferences like “If I roll this pen off a table, it will fall on the floor”. Humans also have a powerful mechanism of “folk psychology” that helps them to interpret natural-language sentences such as “The city councilmen refused the demonstrators a permit because they advocated violence” (A generic AI has difficulty discerning whether the ones alleged to be advocating violence are the councilmen or the demonstrators[91][92][93]). This lack of “common knowledge” means that AI often makes different mistakes than humans make, in ways that can seem incomprehensible. For example, existing self-driving cars cannot reason about the location or the intentions of pedestrians in the exact way that humans do, and instead must use non-human modes of reasoning to avoid accidents.[94][95][96]


The cognitive capabilities of current architectures are very limited, using only a simplified version of what intelligence is really capable of. For instance, the human mind has come up with ways to reason beyond measure and to find logical explanations for different occurrences in life. A problem that is straightforward for the human mind may be challenging to solve computationally. This gives rise to two classes of models: structuralist and functionalist. The structural models aim to loosely mimic the basic intelligence operations of the mind such as reasoning and logic. The functional model refers to the correlating of data to its computed counterpart.[97]

The overall research goal of artificial intelligence is to create technology that allows computers and machines to function in an intelligent manner. The general problem of simulating (or creating) intelligence has been broken down into sub-problems. These consist of particular traits or capabilities that researchers expect an intelligent system to display. The traits described below have received the most attention.[20]

Reasoning, problem solving

Early researchers developed algorithms that imitated step-by-step reasoning that humans use when they solve puzzles or make logical deductions.[98] By the late 1980s and 1990s, AI research had developed methods for dealing with uncertain or incomplete information, employing concepts from probability and economics.[99]

These algorithms proved to be insufficient for solving large reasoning problems because they experienced a “combinatorial explosion”: they became exponentially slower as the problems grew larger.[82] Even humans rarely use the step-by-step deduction that early AI research could model. They solve most of their problems using fast, intuitive judgments.[100]

Knowledge representation

Main articles: Knowledge representation and Commonsense knowledge

An ontology represents knowledge as a set of concepts within a domain and the relationships between those concepts.

Knowledge representation[101] and knowledge engineering[102] are central to classical AI research. Some “expert systems” attempt to gather explicit knowledge possessed by experts in some narrow domain. In addition, some projects attempt to gather the “commonsense knowledge” known to the average person into a database containing extensive knowledge about the world. Among the things a comprehensive commonsense knowledge base would contain are: objects, properties, categories and relations between objects;[103] situations, events, states and time;[104] causes and effects;[105] knowledge about knowledge (what we know about what other people know);[106] and many other, less well researched domains. A representation of “what exists” is an ontology: the set of objects, relations, concepts, and properties formally described so that software agents can interpret them. The semantics of these are captured as description logic concepts, roles, and individuals, and typically implemented as classes, properties, and individuals in the Web Ontology Language.[107] The most general ontologies are called upper ontologies, which attempt to provide a foundation for all other knowledge[108] by acting as mediators between domain ontologies that cover specific knowledge about a particular knowledge domain (field of interest or area of concern). Such formal knowledge representations can be used in content-based indexing and retrieval,[109] scene interpretation,[110] clinical decision support,[111] knowledge discovery (mining “interesting” and actionable inferences from large databases),[112] and other areas.[113]

Among the most difficult problems in knowledge representation are:

Default reasoning and the qualification problem

Many of the things people know take the form of “working assumptions”. For example, if a bird comes up in conversation, people typically picture a fist-sized animal that sings and flies. None of these things are true about all birds. John McCarthy identified this problem in 1969[114] as the qualification problem: for any commonsense rule that AI researchers care to represent, there tend to be a huge number of exceptions. Almost nothing is simply true or false in the way that abstract logic requires. AI research has explored a number of solutions to this problem.[115]

Breadth of commonsense knowledge

The number of atomic facts that the average person knows is very large. Research projects that attempt to build a complete knowledge base of commonsense knowledge (e.g., Cyc) require enormous amounts of laborious ontological engineering; they must be built, by hand, one complicated concept at a time.[116]

Subsymbolic form of some commonsense knowledge

Much of what people know is not represented as “facts” or “statements” that they could express verbally. For example, a chess master will avoid a particular chess position because it “feels too exposed”[117] or an art critic can take one look at a statue and realize that it is a fake.[118] These are non-conscious and sub-symbolic intuitions or tendencies in the human brain.[119] Knowledge like this informs, supports and provides a context for symbolic, conscious knowledge. As with the related problem of sub-symbolic reasoning, it is hoped that situated AI, computational intelligence, or statistical AI will provide ways to represent this knowledge.[119]
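The bird example from the qualification problem can be made concrete with a toy knowledge base in which specific concepts override the defaults of more general ones. The concept names, the two dictionaries, and the `lookup` function are hypothetical and chosen only to illustrate default reasoning; this is not how any particular system such as Cyc is built.

```python
# "is-a" links from a concept to its more general parent (hypothetical taxonomy).
PARENT = {"canary": "bird", "penguin": "bird", "bird": "animal"}

# Working assumptions attached to each concept; more specific entries act as exceptions.
DEFAULTS = {
    "animal": {"breathes": True},
    "bird": {"flies": True, "sings": True},
    "penguin": {"flies": False},
}


def lookup(concept, prop):
    """Walk up the is-a chain; the most specific statement about `prop` wins.

    Returns None when nothing is known about the property.
    """
    current = concept
    while current is not None:
        if prop in DEFAULTS.get(current, {}):
            return DEFAULTS[current][prop]
        current = PARENT.get(current)
    return None
```

`lookup("penguin", "flies")` returns `False` because the exception was entered by hand, but `lookup("penguin", "sings")` still returns the inherited default `True`. Each such gap needs yet another hand-written exception, which is the pattern the qualification problem describes.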


Main article: Automated planning and scheduling

A hierarchical control system is a form of control system in which a set of devices and governing software is arranged in a hierarchy.

Intelligent agents must be able to set goals and achieve them.[120] They need a way to visualize the future (a representation of the state of the world, with predictions about how their actions will change it) and to make choices that maximize the utility (or “value”) of available choices.[121]
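Choosing the action that maximizes utility can be sketched in a few lines once the agent has a model of what each action might lead to. The actions, probabilities, and utility values below are invented for illustration, and `expected_utility` and `best_action` are hypothetical helper names.

```python
def expected_utility(outcomes):
    """Probability-weighted average utility of one action's possible outcomes."""
    return sum(probability * utility for probability, utility in outcomes)


def best_action(model):
    """Pick the action whose predicted outcomes have the highest expected utility."""
    return max(model, key=lambda action: expected_utility(model[action]))


# Hypothetical world model: action -> list of (probability, utility) pairs.
route_model = {
    "take highway": [(0.8, 10), (0.2, -5)],   # usually fast, sometimes jammed
    "take side streets": [(1.0, 6)],          # slower but predictable
}
choice = best_action(route_model)
```

In this model the highway has the higher expected utility despite its risk, so it is chosen; changing the probabilities or utilities changes the decision.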

In classical planning problems, the agent can assume that it is the only system acting in the world, allowing the agent to be certain of the consequences of its actions.[122] However, if the agent is not the only actor, then the agent must be able to reason under uncertainty. This calls for an agent that can not only assess its environment and make predictions but also evaluate its predictions and adapt based on its assessment.[123]

Multi-agent planning uses the cooperation and competition of many agents to achieve a given goal. Emergent behavior such as this is used by evolutionary algorithms and swarm intelligence.[124]


Main article: Machine learning

For this project the AI had to find the typical patterns in the colors and brushstrokes of Renaissance painter Raphael. The portrait shows the face of the actress Ornella Muti, “painted” by AI in the style of Raphael.

Machine learning (ML), a fundamental concept of AI research since the field’s inception,[d] is the study of computer algorithms that improve automatically through experience.[e][127]

Unsupervised learning is the ability to find patterns in a stream of input, without requiring a human to label the inputs first. Supervised learning includes both classification and numerical regression, which requires a human to label the input data first. Classification is used to determine what category something belongs in, and occurs after a program sees a number of examples of things from several categories. Regression is the attempt to produce a function that describes the relationship between inputs and outputs and predicts how the outputs should change as the inputs change.[127] Both classifiers and regression learners can be viewed as “function approximators” trying to learn an unknown (possibly implicit) function; for example, a spam classifier can be viewed as learning a function that maps from the text of an email to one of two categories, “spam” or “not spam”. Computational learning theory can assess learners by computational complexity, by sample complexity (how much data is required), or by other notions of optimization.[128] In reinforcement learning[129] the agent is rewarded for good responses and punished for bad ones. The agent uses this sequence of rewards and punishments to form a strategy for operating in its problem space.
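The view of a spam classifier as a learned function from email text to one of two categories can be sketched with a small naive Bayes learner. This technique is swapped in purely for illustration (the text does not name a method), and the training messages and the names `train` and `classify` are invented.

```python
import math
from collections import Counter

LABELS = ("spam", "not spam")


def train(examples):
    """Count words per label from (text, label) pairs labeled by a human."""
    word_counts = {label: Counter() for label in LABELS}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocabulary = set(word_counts["spam"]) | set(word_counts["not spam"])
    return word_counts, label_counts, vocabulary


def classify(model, text):
    """Map email text to the label with the highest naive Bayes log-score."""
    word_counts, label_counts, vocabulary = model
    total_examples = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        words_in_label = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_examples)
        for word in text.lower().split():
            if word in vocabulary:  # words never seen in training are ignored
                # Add-one smoothing keeps unseen word/label pairs from zeroing the score.
                score += math.log((word_counts[label][word] + 1)
                                  / (words_in_label + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


model = train([
    ("win cash prize now", "spam"),
    ("cheap prize click now", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("lunch on monday", "not spam"),
])
```

After training, `classify(model, text)` is the approximated function: it was never given a rule for what spam is, only labeled examples.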

Natural language processing

Main article: Natural language processing

A parse tree represents the syntactic structure of a sentence according to some formal grammar.

Natural language processing[130] (NLP) allows machines to read and understand human language. A sufficiently powerful natural language processing system would enable natural-language user interfaces and the acquisition of knowledge directly from human-written sources, such as newswire texts. Some straightforward applications of natural language processing include information retrieval, text mining, question answering[131] and machine translation.[132] Many current approaches use word co-occurrence frequencies to construct syntactic representations of text. “Keyword spotting” strategies for search are popular and scalable but dumb; a search query for “dog” might only match documents with the literal word “dog” and miss a document with the word “poodle”. “Lexical affinity” strategies use the occurrence of words such as “accident” to assess the sentiment of a document. Modern statistical NLP approaches can combine all these strategies as well as others, and often achieve acceptable accuracy at the page or paragraph level. Beyond semantic NLP, the ultimate goal of “narrative” NLP is to embody a full understanding of commonsense reasoning.[133] By 2019, transformer-based deep learning architectures could generate coherent text.[134]
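The "dog" versus "poodle" limitation of keyword spotting is easy to reproduce. In the sketch below, the documents, the hand-made `RELATED_TERMS` table, and both function names are assumptions for illustration; real systems derive such relations statistically or from a lexical resource rather than from a literal table.

```python
DOCUMENTS = [
    "My dog barks at the mail carrier",
    "The poodle won a prize",
    "Cats sleep all day",
]

# Hypothetical hand-made table of more specific terms for a query word.
RELATED_TERMS = {"dog": {"dog", "poodle", "terrier"}}


def keyword_search(query, documents):
    """Keyword spotting: match only documents containing the literal query word."""
    return [doc for doc in documents if query in doc.lower().split()]


def expanded_search(query, documents):
    """Also match documents containing a term listed as related to the query."""
    terms = RELATED_TERMS.get(query, {query})
    return [doc for doc in documents if terms & set(doc.lower().split())]


literal_hits = keyword_search("dog", DOCUMENTS)
expanded_hits = expanded_search("dog", DOCUMENTS)
```

The literal search returns only the first document, while the expanded search also finds the one about the poodle.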


Main articles: Machine perception, Computer vision, and Speech recognition

Feature detection (pictured: edge detection) helps AI compose informative abstract structures out of raw data.

Machine perception[135] is the ability to use input from sensors (such as cameras (visible spectrum or infrared), microphones, wireless signals, and active lidar, sonar, radar, and tactile sensors) to deduce aspects of the world. Applications include speech recognition,[136] facial recognition, and object recognition.[137] Computer vision is the ability to analyze visual input. Such input is usually ambiguous; a giant, fifty-meter-tall pedestrian far away may produce the same pixels as a nearby normal-sized pedestrian, requiring the AI to judge the relative likelihood and reasonableness of different interpretations, for example by using its “object model” to assess that fifty-meter pedestrians do not exist.[138]

Motion and manipulation

Main article: Robotics

AI is heavily used in robotics.[139] Advanced robotic arms and other industrial robots, widely used in modern factories, can learn from experience how to move efficiently despite the presence of friction and gear slippage.[140] A modern mobile robot, when given a small, static, and visible environment, can easily determine its location and map its environment; however, dynamic environments, such as (in endoscopy) the interior of a patient’s breathing body, pose a greater challenge. Motion planning is the process of breaking down a movement task into “primitives” such as individual joint movements. Such movement often involves compliant motion, a process where movement requires maintaining physical contact with an object.[141][142][143] Moravec’s paradox generalizes that low-level sensorimotor skills that humans take for granted are, counterintuitively, difficult to program into a robot; the paradox is named after Hans Moravec, who stated in 1988 that “it is comparatively easy to make computers exhibit adult level performance on intelligence tests or playing checkers, and difficult or impossible to give them the skills of a one-year-old when it comes to perception and mobility”.[144][145] This is attributed to the fact that, unlike checkers, physical dexterity has been a direct target of natural selection for millions of years.[146]

Social intelligence

Main article: Affective computing

Kismet, a robot with rudimentary social skills[147]

Moravec’s paradox can be extended to many forms of social intelligence.[148][149] Distributed multi-agent coordination of autonomous vehicles remains a difficult problem.[150] Affective computing is an interdisciplinary umbrella that comprises systems which recognize, interpret, process, or simulate human affects.[151][152][153] Moderate successes related to affective computing include textual sentiment analysis and, more recently, multimodal affect analysis (see multimodal sentiment analysis), wherein AI classifies the affects displayed by a videotaped subject.[154]

In the long run, social skills and an understanding of human emotion and game theory would be valuable to a social agent. The ability to predict the actions of others by understanding their motives and emotional states would allow an agent to make better decisions. Some computer systems mimic human emotion and expressions to appear more sensitive to the emotional dynamics of human interaction, or to otherwise facilitate human–computer interaction.[155] Similarly, some virtual assistants are programmed to speak conversationally or even to banter humorously; this tends to give naïve users an unrealistic conception of how intelligent existing computer agents actually are.[156]

General intelligence

Main articles: Artificial general intelligence and AI-complete

Historically, projects such as the Cyc knowledge base (1984–) and the massive Japanese Fifth Generation Computer Systems initiative (1982–1992) attempted to cover the breadth of human cognition. These early projects failed to escape the limitations of non-quantitative symbolic logic models and, in retrospect, greatly underestimated the difficulty of cross-domain AI. Nowadays, most current AI researchers work instead on tractable “narrow AI” applications (such as medical diagnosis or automobile navigation).[157] Many researchers predict that such “narrow AI” work in different individual domains will eventually be incorporated into a machine with artificial general intelligence (AGI), combining most of the narrow skills mentioned in this article and at some point even exceeding human ability in most or all these areas.[26][158] Many advances have general, cross-domain significance. One high-profile example is that DeepMind in the 2010s developed a “generalized artificial intelligence” that could learn many diverse Atari games on its own, and later developed a variant of the system which succeeds at sequential learning.[159][160][161] Besides transfer learning,[162] hypothetical AGI breakthroughs could include the development of reflective architectures that can engage in decision-theoretic metareasoning, and figuring out how to “slurp up” a comprehensive knowledge base from the entire unstructured Web.[163] Some argue that some kind of (currently-undiscovered) conceptually straightforward, but mathematically difficult, “Master Algorithm” could lead to AGI.[164] Finally, a few “emergent” approaches look to simulating human intelligence extremely closely, and believe that anthropomorphic features like an artificial brain or simulated child development may someday reach a critical point where general intelligence emerges.[165][166]

Many of the problems in this article may also require general intelligence, if machines are to solve the problems as well as people do. For example, even specific straightforward tasks, like machine translation, require that a machine read and write in both languages (NLP), follow the author’s argument (reason), know what is being talked about (knowledge), and faithfully reproduce the author’s original intent (social intelligence). A problem like machine translation is considered “AI-complete”, because all of these problems need to be solved simultaneously in order to reach human-level machine performance.


No established unifying theory or paradigm guides AI research. Researchers disagree about many issues.[f] A few of the most long-standing questions that have remained unanswered are these: should artificial intelligence simulate natural intelligence by studying psychology or neurobiology? Or is human biology as irrelevant to AI research as bird biology is to aeronautical engineering?[23] Can intelligent behavior be described using simple, elegant principles (such as logic or optimization)? Or does it necessarily require solving a large number of unrelated problems?[24]

Cybernetics and brain simulation

Main articles: Cybernetics and Computational neuroscience

In the 1940s and 1950s, a number of researchers explored the connection between neurobiology, information theory, and cybernetics. Some of them built machines that used electronic networks to exhibit rudimentary intelligence, such as W. Grey Walter’s turtles and the Johns Hopkins Beast. Many of these researchers gathered for meetings of the Teleological Society at Princeton University and the Ratio Club in England.[168] By 1960, this approach was largely abandoned, although elements of it would be revived in the 1980s.


Main article: Symbolic AI

When access to digital computers became possible in the mid-1950s, AI research began to explore the possibility that human intelligence could be reduced to symbol manipulation. The research was centered in three institutions: Carnegie Mellon University, Stanford, and MIT, and as described below, each one developed its own style of research. John Haugeland named these symbolic approaches to AI “good old fashioned AI” or “GOFAI”.[169] During the 1960s, symbolic approaches had achieved great success at simulating high-level “thinking” in small demonstration programs. Approaches based on cybernetics or artificial neural networks were abandoned or pushed into the background.[g] Researchers in the 1960s and the 1970s were convinced that symbolic approaches would eventually succeed in creating a machine with artificial general intelligence and considered this the goal of their field.

Cognitive simulation

Economist Herbert Simon and Allen Newell studied human problem-solving skills and attempted to formalize them, and their work laid the foundations of the field of artificial intelligence, as well as cognitive science, operations research and management science. Their research team used the results of psychological experiments to develop programs that simulated the techniques that people used to solve problems. This tradition, centered at Carnegie Mellon University, would eventually culminate in the development of the Soar architecture in the middle 1980s.[170][171]


Unlike Simon and Newell, John McCarthy felt that machines did not need to simulate human thought, but should instead try to find the essence of abstract reasoning and problem-solving, regardless of whether people used the same algorithms.[23] His laboratory at Stanford (SAIL) focused on using formal logic to solve a wide variety of problems, including knowledge representation, planning and learning.[172] Logic was also the focus of the work at the University of Edinburgh and elsewhere in Europe which led to the development of the programming language Prolog and the science of logic programming.[173]

Anti-logic or scruffy

Researchers at MIT (such as Marvin Minsky and Seymour Papert)[174] found that solving difficult problems in vision and natural language processing required ad hoc solutions; they argued that no simple and general principle (like logic) would capture all the aspects of intelligent behavior. Roger Schank described their “anti-logic” approaches as “scruffy” (as opposed to the “neat” paradigms at CMU and Stanford).[24] Commonsense knowledge bases (such as Doug Lenat’s Cyc) are an example of “scruffy” AI, since they must be built by hand, one complicated concept at a time.[175]


When computers with large memories became available around 1970, researchers from all three traditions began to build knowledge into AI applications.[176] This “knowledge revolution” led to the development and deployment of expert systems (introduced by Edward Feigenbaum), the first truly successful form of AI software.[52] A key component of the system architecture for all expert systems is the knowledge base, which stores facts and rules that illustrate AI.[177] The knowledge revolution was also driven by the realization that enormous amounts of knowledge would be required by many simple AI applications.
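The idea of a knowledge base that stores facts and rules can be made concrete with a small sketch. The following is a minimal, illustrative forward-chaining loop, not the design of any particular expert system; the rule format, the function name `forward_chain`, and the toy facts are all assumptions introduced here for illustration.

```python
from typing import FrozenSet, List, Set, Tuple

# A rule is (premises, conclusion): if every premise is a known fact,
# the conclusion may be added to the set of known facts.
Rule = Tuple[FrozenSet[str], str]


def forward_chain(facts: Set[str], rules: List[Rule]) -> Set[str]:
    """Fire rules repeatedly until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical toy knowledge base (illustrative only).
RULES: List[Rule] = [
    (frozenset({"has_fever", "has_cough"}), "may_have_flu"),
    (frozenset({"may_have_flu"}), "recommend_rest"),
]

if __name__ == "__main__":
    derived = forward_chain({"has_fever", "has_cough"}, RULES)
    print(sorted(derived))
```

In this sketch the knowledge (the rules) is kept separate from the inference loop, so new knowledge can be added without changing the program logic.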


By the 1980s, progress in symbolic AI seemed to stall and many believed that symbolic systems would never be able to imitate all the processes of human cognition, especially perception, robotics, learning and pattern recognition. A number of researchers began to look into “sub-symbolic” approaches to specific AI problems.[25] Sub-symbolic methods manage to approach intelligence without specific representations of knowledge.

Embodied intelligence

This includes embodied, situated, behavior-based, and nouvelle AI. Researchers from the related field of robotics, such as Rodney Brooks, rejected symbolic AI and focused on the basic engineering problems that would allow robots to move and survive.[178] Their work revived the non-symbolic point of view of the early cybernetics researchers of the 1950s and reintroduced the use of control theory in AI. This coincided with the development of the embodied mind thesis in the related field of cognitive science: the idea that aspects of the body (such as movement, perception and visualization) are required for higher intelligence.

Within developmental robotics, developmental learning approaches are elaborated upon to allow robots to accumulate repertoires of novel skills through autonomous self-exploration, social interaction with human teachers, and the use of guidance mechanisms (active learning, maturation, motor synergies, etc.).[179][180][181][182]

Computational intelligence and soft computing

Interest in neural networks and “connectionism” was revived by David Rumelhart and others in the middle of the 1980s.[183] Artificial neural networks are an example of soft computing: they are solutions to problems which cannot be solved with complete logical certainty, and where an approximate solution is often sufficient. Other soft computing approaches to AI include fuzzy systems, Grey system theory, evolutionary computation and many statistical tools. The application of soft computing to AI is studied collectively by the emerging discipline of computational intelligence.[184]
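As a small illustration of how a fuzzy system replaces a strict true/false answer with a degree of truth, the sketch below computes a triangular membership value. The function name `triangular` and the temperature thresholds are hypothetical choices made for this example, and the sketch assumes `left < peak < right`.

```python
def triangular(x: float, left: float, peak: float, right: float) -> float:
    """Degree (0.0 to 1.0) to which x belongs to a triangular fuzzy set.

    Assumes left < peak < right.
    """
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        # Rising edge: 0 at `left`, 1 at `peak`.
        return (x - left) / (peak - left)
    # Falling edge: 1 at `peak`, 0 at `right`.
    return (right - x) / (right - peak)


if __name__ == "__main__":
    # Hypothetical fuzzy set "comfortable temperature" in degrees Celsius.
    for temperature in (15.0, 17.5, 20.0, 22.5, 30.0):
        print(temperature, triangular(temperature, 15.0, 20.0, 25.0))
```

A temperature of 17.5 is "comfortable" to degree 0.5 here rather than simply comfortable or not, which is the kind of approximate answer soft computing accepts.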


Much of traditional GOFAI got bogged down on ad hoc patches to symbolic computation that worked on their own toy models but failed to generalize to real-world results. However, around the 1990s, AI researchers adopted sophisticated mathematical tools, such as hidden Markov models (HMM), information theory, and normative Bayesian decision theory to compare or to unify competing architectures. The shared mathematical language permitted a high level of collaboration with more established fields (like mathematics, economics or operations research).[h] Compared with GOFAI, new “statistical learning” techniques such as HMM and neural networks were gaining higher levels of accuracy in many practical domains such as data mining, without necessarily acquiring a semantic understanding of the datasets. The increased successes with real-world data led to increasing emphasis on comparing different approaches against shared test data to see which approach performed best in a broader context than that provided by idiosyncratic toy models; AI research was becoming more scientific. Nowadays results of experiments are often rigorously measurable, and are sometimes (with difficulty) reproducible.[54][185] Different statistical learning techniques have different limitations; for example, basic HMM cannot model the infinite possible combinations of natural language.[186] Critics note that the shift from GOFAI to statistical learning is often also a shift away from explainable AI. In AGI research, some scholars caution against over-reliance on statistical learning, and argue that continuing research into GOFAI will still be necessary to attain general intelligence.[187][188]

Integrating the approaches

Intelligent agent paradigm: An intelligent agent is a system that perceives its environment and takes actions that maximize its chances of success. The simplest intelligent agents are programs that solve specific problems. More complicated agents include human beings and organizations of human beings (such as firms). The paradigm allows researchers to directly compare or even combine different approaches to isolated problems, by asking which agent is best at maximizing a given “goal function”. An agent that solves a specific problem can use any approach that works: some agents are symbolic and logical, some are sub-symbolic artificial neural networks and others may use new approaches. The paradigm also gives researchers a common language to communicate with other fields, such as decision theory and economics, that also use concepts of abstract agents. Building a complete agent requires researchers to address realistic problems of integration; for example, because sensory systems give uncertain information about the environment, planning systems must be able to function in the presence of uncertainty. The intelligent agent paradigm became widely accepted during the 1990s.[189]

Agent architectures and cognitive architectures: Researchers have designed systems to build intelligent systems out of interacting intelligent agents in a multi-agent system.[190] A hierarchical control system provides a bridge between sub-symbolic AI at its lowest, reactive levels and traditional symbolic AI at its highest levels, where relaxed time constraints permit planning and world modeling.[191] Some cognitive architectures are custom-built to solve a narrow problem; others, such as Soar, are designed to mimic human cognition and to provide insight into general intelligence. Modern extensions of Soar are hybrid intelligent systems that include both symbolic and sub-symbolic components.[97][192]
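The core of the intelligent agent paradigm, choosing the action that maximizes a goal function given a percept, can be sketched in a few lines. This is a deliberately simplified illustration under stated assumptions: the thermostat scenario, the names `choose_action` and `thermostat_goal`, and the one-degree effect model are hypothetical and not drawn from any cited system.

```python
from typing import Callable, Dict, Sequence

Percept = Dict[str, float]
GoalFunction = Callable[[Percept, str], float]


def choose_action(percept: Percept,
                  actions: Sequence[str],
                  goal_function: GoalFunction) -> str:
    """Return the action with the highest goal-function score."""
    return max(actions, key=lambda action: goal_function(percept, action))


def thermostat_goal(percept: Percept, action: str) -> float:
    """Hypothetical goal: stay as close to the target temperature as possible."""
    # Assumed effect of each action on temperature, in degrees.
    effect = {"heat": 1.0, "cool": -1.0, "idle": 0.0}[action]
    predicted = percept["temperature"] + effect
    # Smaller distance from the target means a higher (less negative) score.
    return -abs(predicted - percept["target"])


ACTIONS = ["heat", "cool", "idle"]

if __name__ == "__main__":
    for temperature in (18.0, 21.0, 25.0):
        percept = {"temperature": temperature, "target": 21.0}
        print(temperature, choose_action(percept, ACTIONS, thermostat_goal))
```

Because `choose_action` only depends on the goal function's scores, a symbolic scorer and a learned scorer could be swapped in and compared on the same task, which is the kind of comparison the paradigm is meant to allow.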


Main article: Computational tools for artificial intelligence


Main article: Applications of artificial intelligence

AI is relevant to any intellectual task.[193] Modern artificial intelligence techniques are pervasive[194] and are too numerous to list here. Frequently, when a technique reaches mainstream use, it is no longer considered artificial intelligence; this phenomenon is described as the AI effect.[195]

High-profile examples of AI include autonomous vehicles (such as drones and self-driving cars), medical diagnosis, creating art (such as poetry), proving mathematical theorems, playing games (such as Chess or Go), search engines (such as Google Search), online assistants (such as Siri), image recognition in photographs, spam filtering, predicting flight delays,[196] prediction of judicial decisions,[197] targeting online advertisements,[193][198][199] and energy storage.[200]

With social media sites overtaking TV as a source for news for young people and news organizations increasingly reliant on social media platforms for generating distribution,[201] major publishers now use artificial intelligence (AI) technology to post stories more effectively and generate higher volumes of traffic.[202]

AI can also produce deepfakes, a content-altering technology. ZDNet reports, “It presents something that did not actually occur.” Though 88% of Americans believe deepfakes can cause more harm than good, only 47% of them believe they can be targeted. Election years also expose public discourse to the threat of falsified videos of politicians.[203]

Philosophy and ethics

Main articles: Philosophy of artificial intelligence and Ethics of artificial intelligence

There are three philosophical questions related to AI:[204]

  1. Whether artificial general intelligence is possible; whether a machine can solve any problem that a human being can solve using intelligence, or if there are hard limits to what a machine can accomplish.
  2. Whether intelligent machines are dangerous; how humans can ensure that machines behave ethically and that they are used ethically.
  3. Whether a machine can have a mind, consciousness and mental states in the same sense that human beings do; if a machine can be sentient, and thus deserve certain rights; and if a machine can intentionally cause harm.

The limits of artificial general intelligence

Main articles: Philosophy of artificial intelligence, Turing test, Physical symbol systems hypothesis, Dreyfus’ critique of artificial intelligence, The Emperor’s New Mind, and AI effect

Alan Turing’s “polite convention”: One need not decide if a machine can “think”; one need only decide if a machine can act as intelligently as a human being. This approach to the philosophical problems associated with artificial intelligence forms the basis of the Turing test.[205]

The Dartmouth proposal: “Every aspect of learning or any other feature of intelligence can be so precisely described that a machine can be made to simulate it.” This conjecture was printed in the proposal for the Dartmouth Conference of 1956.[206]

Newell and Simon’s physical symbol system hypothesis: “A physical symbol system has the necessary and sufficient means of general intelligent action.” Newell and Simon argue that intelligence consists of formal operations on symbols.[207] Hubert Dreyfus argues that, on the contrary, human expertise depends on unconscious instinct rather than conscious symbol manipulation, and on having a “feel” for the situation, rather than explicit symbolic knowledge. (See Dreyfus’ critique of AI.)[i][209]

Gödelian arguments: Gödel himself,[210] John Lucas (in 1961) and Roger Penrose (in a more detailed argument from 1989 onwards) made highly technical arguments that human mathematicians can consistently see the truth of their own “Gödel statements” and therefore have computational abilities beyond that of mechanical Turing machines.[211] However, some people do not agree with the “Gödelian arguments”.[212][213][214]

The artificial brain argument: An argument asserting that the brain can be simulated by machines and, because brains exhibit intelligence, these simulated brains must also exhibit intelligence; ergo, machines can be intelligent. Hans Moravec, Ray Kurzweil and others have argued that it is technologically feasible to copy the brain directly into hardware and software, and that such a simulation will be essentially identical to the original.[165]

The AI effect: A hypothesis claiming that machines are already intelligent, but observers have failed to recognize it. For example, when Deep Blue beat Garry Kasparov in chess, the machine could be described as exhibiting intelligence. However, onlookers commonly discount the behavior of an artificial intelligence program by arguing that it is not “real” intelligence, with “real” intelligence being in effect defined as whatever behavior machines cannot do.

Ethical machines

Machines with intelligence have the potential to use their intelligence to prevent harm and minimize the risks; they may have the ability to use ethical reasoning to better choose their actions in the world. As such, there is a need for policy making to devise policies for and regulate artificial intelligence and robotics.[215] Research in this area includes machine ethics, artificial moral agents, and friendly AI; building a human rights framework is also under discussion.[216]

Joseph Weizenbaum in Computer Power and Human Reason wrote that AI applications cannot, by definition, successfully simulate genuine human empathy and that the use of AI technology in fields such as customer service or psychotherapy[j] was deeply misguided. Weizenbaum was also bothered that AI researchers (and some philosophers) were willing to view the human mind as nothing more than a computer program (a position now known as computationalism). To Weizenbaum these points suggest that AI research devalues human life.[218]

Artificial moral agents

Wendell Wallach introduced the concept of artificial moral agents (AMA) in his book Moral Machines.[219] For Wallach, AMAs have become a part of the research landscape of artificial intelligence as guided by its two central questions, which he identifies as “Does Humanity Want Computers Making Moral Decisions”[220] and “Can (Ro)bots Really Be Moral”.[221] For Wallach, the question is not centered on the issue of whether machines can demonstrate the equivalent of moral behavior, unlike the constraints which society may place on the development of AMAs.[222]

Machine ethics

Main article: Machine ethics

The field of machine ethics is concerned with giving machines ethical principles, or a procedure for discovering a way to resolve the ethical dilemmas they might encounter, enabling them to function in an ethically responsible manner through their own ethical decision making.[223] The field was delineated in the AAAI Fall 2005 Symposium on Machine Ethics: “Past research concerning the relationship between technology and ethics has largely focused on responsible and irresponsible use of technology by human beings, with a few people being interested in how human beings ought to treat machines. In all cases, only human beings have engaged in ethical reasoning. The time has come for adding an ethical dimension to at least some machines. Recognition of the ethical ramifications of behavior involving machines, as well as recent and potential developments in machine autonomy, necessitate this. In contrast to computer hacking, software property issues, privacy issues and other topics normally ascribed to computer ethics, machine ethics is concerned with the behavior of machines towards human users and other machines. Research in machine ethics is key to alleviating concerns with autonomous systems—it could be argued that the notion of autonomous machines without such a dimension is at the root of all fear concerning machine intelligence. Further, investigation of machine ethics could enable the discovery of problems with current ethical theories, advancing our thinking about Ethics.”[224] Machine ethics is sometimes referred to as machine morality, computational ethics or computational morality. A variety of perspectives of this nascent field can be found in the collected edition “Machine Ethics”[223] that stems from the AAAI Fall 2005 Symposium on Machine Ethics.[224]

Malevolent and friendly AI

Main article: Friendly artificial intelligence

Political scientist Charles T. Rubin believes that AI can be neither designed nor guaranteed to be benevolent.[225] He argues that “any sufficiently advanced benevolence may be indistinguishable from malevolence.” Humans should not assume machines or robots would treat us favorably because there is no a priori reason to believe that they would be sympathetic to our system of morality, which has evolved along with our particular biology (which AIs would not share). Hyper-intelligent software may not necessarily decide to support the continued existence of humanity and would be extremely difficult to stop. This topic has also recently begun to be discussed in academic publications as a real source of risks to civilization, humans, and planet Earth.

One proposal to deal with this is to ensure that the first generally intelligent AI is “Friendly AI” and will be able to control subsequently developed AIs. Some question whether this kind of check could actually remain in place.

Leading AI researcher Rodney Brooks writes, “I think it is a mistake to be worrying about us developing malevolent AI anytime in the next few hundred years. I think the worry stems from a fundamental error in not distinguishing the difference between the very real recent advances in a particular aspect of AI and the enormity and complexity of building sentient volitional intelligence.”[226]

Lethal autonomous weapons are of concern. Currently, 50+ countries are researching battlefield robots, including the United States, China, Russia, and the United Kingdom. Many people concerned about risk from superintelligent AI also want to limit the use of artificial soldiers and drones.[227]

Machine consciousness, sentience and mind

Main article: Artificial consciousness

If an AI system replicates all key aspects of human intelligence, will that system also be sentient—will it have a mind which has conscious experiences? This question is closely related to the philosophical problem as to the nature of human consciousness, generally referred to as the hard problem of consciousness.


Main articles: Hard problem of consciousness and Theory of mind

David Chalmers identified two problems in understanding the mind, which he named the “hard” and “easy” problems of consciousness.[228] The easy problem is understanding how the brain processes signals, makes plans and controls behavior. The hard problem is explaining how this feels or why it should feel like anything at all. Human information processing is easy to explain; human subjective experience, however, is difficult to explain.

For example, consider what happens when a person is shown a color swatch and identifies it, saying “it’s red”. The easy problem only requires understanding the machinery in the brain that makes it possible for a person to know that the color swatch is red. The hard problem is that people also know something else—they also know what red looks like. (Consider that a person born blind can know that something is red without knowing what red looks like.)[k] Everyone knows subjective experience exists, because they do it every day (e.g., all sighted people know what red looks like). The hard problem is explaining how the brain creates it, why it exists, and how it is different from knowledge and other aspects of the brain.

Computationalism and functionalism

Main articles: Computationalism and Functionalism (philosophy of mind)

Computationalism is the position in the philosophy of mind that the human mind or the human brain (or both) is an information processing system and that thinking is a form of computing.[229] Computationalism argues that the relationship between mind and body is similar or identical to the relationship between software and hardware and thus may be a solution to the mind-body problem. This philosophical position was inspired by the work of AI researchers and cognitive scientists in the 1960s and was originally proposed by philosophers Jerry Fodor and Hilary Putnam.

Strong AI hypothesis

Main article: Chinese room

The philosophical position that John Searle has named “strong AI” states: “The appropriately programmed computer with the right inputs and outputs would thereby have a mind in exactly the same sense human beings have minds.”[l] Searle counters this assertion with his Chinese room argument, which asks us to look inside the computer and try to find where the “mind” might be.[231]

Robot rights

Main article: Robot rights

If a machine can be created that has intelligence, could it also feel? If it can feel, does it have the same rights as a human? This issue, now known as “robot rights”, is currently being considered by, for example, California’s Institute for the Future, although many critics believe that the discussion is premature.[232][233] Some critics of transhumanism argue that any hypothetical robot rights would lie on a spectrum with animal rights and human rights.[234] The subject is discussed in depth in the 2010 documentary film Plug & Pray,[235] and in much science fiction media, such as Star Trek: The Next Generation, with the character of Commander Data, who fought being disassembled for research and wanted to “become human”, and the robotic holograms in Star Trek: Voyager.


Main article: Superintelligence

Are there limits to how intelligent machines—or human-machine hybrids—can be? A superintelligence, hyperintelligence, or superhuman intelligence is a hypothetical agent that would possess intelligence far surpassing that of the brightest and most gifted human mind. Superintelligence may also refer to the form or degree of intelligence possessed by such an agent.[158]

Technological singularity

Main articles: Technological singularity and Moore’s law

If research into Strong AI produced sufficiently intelligent software, it might be able to reprogram and improve itself. The improved software would be even better at improving itself, leading to recursive self-improvement.[236] The new intelligence could thus increase exponentially and dramatically surpass humans. Science fiction writer Vernor Vinge named this scenario “singularity”.[237] Technological singularity is when accelerating progress in technologies will cause a runaway effect wherein artificial intelligence will exceed human intellectual capacity and control, thus radically changing or even ending civilization. Because the capabilities of such an intelligence may be impossible to comprehend, the technological singularity is an occurrence beyond which events are unpredictable or even unfathomable.[237][158]

Inventor Ray Kurzweil has used Moore’s law (which describes the relentless exponential improvement in digital technology) to calculate that desktop computers will have the same processing power as human brains by the year 2029 and predicts that the singularity will occur in 2045.[237]


Main article: Transhumanism

Robot designer Hans Moravec, cyberneticist Kevin Warwick, and inventor Ray Kurzweil have predicted that humans and machines will merge in the future into cyborgs that are more capable and powerful than either.[238] This idea, called transhumanism, has roots in Aldous Huxley and Robert Ettinger.

Edward Fredkin argues that “artificial intelligence is the next stage in evolution”, an idea first proposed by Samuel Butler’s “Darwin among the Machines” as far back as 1863, and expanded upon by George Dyson in his book of the same name in 1998.[239]


The long-term economic effects of AI are uncertain. A survey of economists showed disagreement about whether the increasing use of robots and AI will cause a substantial increase in long-term unemployment, but they generally agree that it could be a net benefit, if productivity gains are redistributed.[240] A 2017 study by PricewaterhouseCoopers sees the People’s Republic of China gaining economically the most out of AI, with 26.1% of GDP until 2030.[241] A February 2020 European Union white paper on artificial intelligence advocated for artificial intelligence for economic benefits, including “improving healthcare (e.g. making diagnosis more precise, enabling better prevention of diseases), increasing the efficiency of farming, contributing to climate change mitigation and adaptation, [and] improving the efficiency of production systems through predictive maintenance”, while acknowledging potential risks.[194]

The relationship between automation and employment is complicated. While automation eliminates old jobs, it also creates new jobs through micro-economic and macro-economic effects.[242] Unlike previous waves of automation, many middle-class jobs may be eliminated by artificial intelligence; The Economist states that “the worry that AI could do to white-collar jobs what steam power did to blue-collar ones during the Industrial Revolution” is “worth taking seriously”.[243] Subjective estimates of the risk vary widely; for example, Michael Osborne and Carl Benedikt Frey estimate 47% of U.S. jobs are at “high risk” of potential automation, while an OECD report classifies only 9% of U.S. jobs as “high risk”.[244][245][246] Jobs at extreme risk range from paralegals to fast food cooks, while job demand is likely to increase for care-related professions ranging from personal healthcare to the clergy.[247] Author Martin Ford and others go further and argue that many jobs are routine, repetitive and (to an AI) predictable; Ford warns that these jobs may be automated in the next couple of decades, and that many of the new jobs may not be “accessible to people with average capability”, even with retraining. Economists point out that in the past technology has tended to increase rather than reduce total employment, but acknowledge that “we’re in uncharted territory” with AI.[35]

The potential negative effects of AI and automation were a major issue for Andrew Yang’s 2020 presidential campaign in the United States.[248] Irakli Beridze, Head of the Centre for Artificial Intelligence and Robotics at UNICRI, United Nations, has said: “I think the dangerous applications for AI, from my point of view, would be criminals or large terrorist organizations using it to disrupt large processes or simply do pure harm. [Terrorists could cause harm] via digital warfare, or it could be a combination of robotics, drones, with AI and other things as well that could be really dangerous. And, of course, other risks come from things like job losses. If we have massive numbers of people losing jobs and don’t find a solution, it will be extremely dangerous. Things like lethal autonomous weapons systems should be properly governed — otherwise there’s massive potential of misuse.”[249]

Risks of narrow AI

Main article: Workplace impact of artificial intelligence

Widespread use of artificial intelligence could have unintended consequences that are dangerous or undesirable. Scientists from the Future of Life Institute, among others, described some short-term research goals to see how AI influences the economy, the laws and ethics that are involved with AI and how to minimize AI security risks. In the long-term, the scientists have proposed to continue optimizing function while minimizing possible security risks that come along with new technologies.[250]

Some are concerned about algorithmic bias, that AI programs may unintentionally become biased after processing data that exhibits bias.[251] Algorithms already have numerous applications in legal systems. An example of this is COMPAS, a commercial program widely used by U.S. courts to assess the likelihood of a defendant becoming a recidivist. ProPublica claims that the average COMPAS-assigned recidivism risk level of black defendants is significantly higher than the average COMPAS-assigned risk level of white defendants.[252]

Risks of general AI

Main article: Existential risk from artificial general intelligence

Physicist Stephen Hawking, Microsoft founder Bill Gates, history professor Yuval Noah Harari, and SpaceX founder Elon Musk have expressed concerns about the possibility that AI could evolve to the point that humans could not control it, with Hawking theorizing that this could “spell the end of the human race”.[253][254][255][256]

The development of full artificial intelligence could spell the end of the human race. Once humans develop artificial intelligence, it will take off on its own and redesign itself at an ever-increasing rate. Humans, who are limited by slow biological evolution, couldn’t compete and would be superseded.

  • Stephen Hawking[257]

In his book Superintelligence, philosopher Nick Bostrom provides an argument that artificial intelligence will pose a threat to humankind. He argues that sufficiently intelligent AI, if it chooses actions based on achieving some goal, will exhibit convergent behavior such as acquiring resources or protecting itself from being shut down. If this AI’s goals do not fully reflect humanity’s—one example is an AI told to compute as many digits of pi as possible—it might harm humanity in order to acquire more resources or prevent itself from being shut down, ultimately to better achieve its goal. Bostrom also emphasizes the difficulty of fully conveying humanity’s values to an advanced AI. He uses the hypothetical example of giving an AI the goal to make humans smile to illustrate a misguided attempt. If the AI in that scenario were to become superintelligent, Bostrom argues, it may resort to methods that most humans would find horrifying, such as inserting “electrodes into the facial muscles of humans to cause constant, beaming grins” because that would be an efficient way to achieve its goal of making humans smile.[258] In his book Human Compatible, AI researcher Stuart J. Russell echoes some of Bostrom’s concerns while also proposing an approach to developing provably beneficial machines focused on uncertainty and deference to humans,[259]:173 possibly involving inverse reinforcement learning.[259]:191–193

Concern over risk from artificial intelligence has led to some high-profile donations and investments. A group of prominent tech titans including Peter Thiel, Amazon Web Services and Musk have committed $1 billion to OpenAI, a nonprofit company aimed at championing responsible AI development.[260] The opinion of experts within the field of artificial intelligence is mixed, with sizable fractions both concerned and unconcerned by risk from eventual superhumanly-capable AI.[261] Other technology industry leaders believe that artificial intelligence is helpful in its current form and will continue to assist humans. Oracle CEO Mark Hurd has stated that AI “will actually create more jobs, not less jobs” as humans will be needed to manage AI systems.[262] Facebook CEO Mark Zuckerberg believes AI will “unlock a huge amount of positive things,” such as curing disease and increasing the safety of autonomous cars.[263] In January 2015, Musk donated $10 million to the Future of Life Institute to fund research on understanding AI decision making. The goal of the institute is to “grow wisdom with which we manage” the growing power of technology. Musk also funds companies developing artificial intelligence such as DeepMind and Vicarious to “just keep an eye on what’s going on with artificial intelligence.[264] I think there is potentially a dangerous outcome there.”[265][266]

For the danger of uncontrolled advanced AI to be realized, the hypothetical AI would have to overpower or out-think all of humanity, which a minority of experts argue is a possibility far enough in the future to not be worth researching.[267][268] Other counterarguments revolve around humans being either intrinsically or convergently valuable from the perspective of an artificial intelligence.[269]


Main articles: Regulation of artificial intelligence and Regulation of algorithms

The regulation of artificial intelligence is the development of public sector policies and laws for promoting and regulating artificial intelligence (AI);[270][271] it is therefore related to the broader regulation of algorithms. The regulatory and policy landscape for AI is an emerging issue in jurisdictions globally, including in the European Union.[272] Regulation is considered necessary to both encourage AI and manage associated risks.[273][274] Regulation of AI through mechanisms such as review boards can also be seen as social means to approach the AI control problem.[275]

In fiction

Main article: Artificial intelligence in fiction

The word “robot” itself was coined by Karel Čapek in his 1921 play R.U.R., the title standing for “Rossum’s Universal Robots”.

Thought-capable artificial beings appeared as storytelling devices since antiquity,[37] and have been a persistent theme in science fiction.

A common trope in these works began with Mary Shelley’s Frankenstein, where a human creation becomes a threat to its masters. This includes such works as Arthur C. Clarke’s and Stanley Kubrick’s 2001: A Space Odyssey (both 1968), with HAL 9000, the murderous computer in charge of the Discovery One spaceship, as well as The Terminator (1984) and The Matrix (1999). In contrast, the rare loyal robots such as Gort from The Day the Earth Stood Still (1951) and Bishop from Aliens (1986) are less prominent in popular culture.[276]

Isaac Asimov introduced the Three Laws of Robotics in many books and stories, most notably the “Multivac” series about a super-intelligent computer of the same name. Asimov’s laws are often brought up during lay discussions of machine ethics;[277] while almost all artificial intelligence researchers are familiar with Asimov’s laws through popular culture, they generally consider the laws useless for many reasons, one of which is their ambiguity.[278]

Transhumanism (the merging of humans and machines) is explored in the manga Ghost in the Shell and the science-fiction series Dune. In the 1980s, artist Hajime Sorayama’s Sexy Robots series was painted and published in Japan, depicting the organic human form with lifelike, muscular metallic skins. His later book “the Gynoids” was used by, or influenced, movie makers including George Lucas and other creatives. Sorayama never considered these organic robots to be a real part of nature but always an unnatural product of the human mind, a fantasy existing in the mind even when realized in actual form.

Several works use AI to force us to confront the fundamental question of what makes us human, showing us artificial beings that have the ability to feel, and thus to suffer. This appears in Karel Čapek’s R.U.R., the films A.I. Artificial Intelligence and Ex Machina, as well as the novel Do Androids Dream of Electric Sheep?, by Philip K. Dick. Dick considers the idea that our understanding of human subjectivity is altered by technology created with artificial intelligence.[279]

Explanatory notes

  1. ^ The act of doling out rewards can itself be formalized or automated into a “reward function”.
  2. ^ Terminology varies; see algorithm characterizations.
  3. ^ Adversarial vulnerabilities can also result in nonlinear systems, or from non-pattern perturbations. Some systems are so brittle that changing a single adversarial pixel predictably induces misclassification.
  4. ^ Alan Turing discussed the centrality of learning as early as 1950, in his classic paper “Computing Machinery and Intelligence”.[125] In 1956, at the original Dartmouth AI summer conference, Ray Solomonoff wrote a report on unsupervised probabilistic machine learning: “An Inductive Inference Machine”.[126]
  5. ^ This is a form of Tom Mitchell’s widely quoted definition of machine learning: “A computer program is said to learn from an experience E with respect to some task T and some performance measure P if its performance on T as measured by P improves with experience E.”
  6. ^ Nils Nilsson writes: “Simply put, there is wide disagreement in the field about what AI is all about.”[167]
  7. ^ The most dramatic case of sub-symbolic AI being pushed into the background was the devastating critique of perceptrons by Marvin Minsky and Seymour Papert in 1969. See History of AI, AI winter, or Frank Rosenblatt.
  8. ^ While such a “victory of the neats” may be a consequence of the field becoming more mature, AIMA states that in practice both neat and scruffy approaches continue to be necessary in AI research.
  9. ^ Dreyfus criticized the necessary condition of the physical symbol system hypothesis, which he called the “psychological assumption”: “The mind can be viewed as a device operating on bits of information according to formal rules.”[208]
  10. ^ In the early 1970s, Kenneth Colby presented a version of Weizenbaum’s ELIZA known as DOCTOR which he promoted as a serious therapeutic tool.[217]
  11. ^ This is based on Mary’s Room, a thought experiment first proposed by Frank Jackson in 1982.
  12. ^ This version is from Searle (1999), and is also quoted in Dennett 1991, p. 435. Searle’s original formulation was “The appropriately programmed computer really is a mind, in the sense that computers given the right programs can be literally said to understand and have other cognitive states.”[230] Strong AI is defined similarly by Russell & Norvig (2003, p. 947): “The assertion that machines could possibly act intelligently (or, perhaps better, act as if they were intelligent) is called the ‘weak AI’ hypothesis by philosophers, and the assertion that machines that do so are actually thinking (as opposed to simulating thinking) is called the ‘strong AI’ hypothesis.”


  1. ^ Poole, Mackworth & Goebel 1998, p. 1.
  2. ^ Russell & Norvig 2003, p. 55.
  3. a b c Definition of AI as the study of intelligent agents:
  4. ^ Russell & Norvig 2009, p. 2.
  5. ^ McCorduck 2004, p. 204.
  6. ^ Maloof, Mark. “Artificial Intelligence: An Introduction, p. 37” (PDF). georgetown.edu. Archived (PDF) from the original on 25 August 2018.
  7. ^ “How AI Is Getting Groundbreaking Changes In Talent Management And HR Tech”. Hackernoon. Archived from the original on 11 September 2019. Retrieved 14 February 2020.
  8. ^ Schank, Roger C. (1991). “Where’s the AI”. AI magazine. Vol. 12 no. 4. p. 38.
  9. ^ Russell & Norvig 2009.
  10. a b “AlphaGo – Google DeepMind”. Archived from the original on 10 March 2016.
  11. a b Bowling, Michael; Burch, Neil; Johanson, Michael; Tammelin, Oskari (9 January 2015). “Heads-up limit hold’em poker is solved”. Science. 347 (6218): 145–149. doi:10.1126/science.1259433. ISSN 0036-8075. PMID 25574016.
  12. ^ Allen, Gregory (April 2020). “Department of Defense Joint AI Center – Understanding AI Technology” (PDF). The official site of the Department of Defense Joint Artificial Intelligence Center. Archived (PDF) from the original on 21 April 2020. Retrieved 25 April 2020.
  13. a b Optimism of early AI: * Herbert Simon quote: Simon 1965, p. 96 quoted in Crevier 1993, p. 109. * Marvin Minsky quote: Minsky 1967, p. 2 quoted in Crevier 1993, p. 109.
  14. a b c Boom of the 1980s: rise of expert systems, Fifth Generation Project, Alvey, MCC, SCI: * McCorduck 2004, pp. 426–441 * Crevier 1993, pp. 161–162, 197–203, 211, 240 * Russell & Norvig 2003, p. 24 * NRC 1999, pp. 210–211 * Newquist 1994, pp. 235–248
  15. a b First AI Winter, Mansfield Amendment, Lighthill report: * Crevier 1993, pp. 115–117 * Russell & Norvig 2003, p. 22 * NRC 1999, pp. 212–213 * Howe 1994 * Newquist 1994, pp. 189–201
  16. a b Second AI winter: * McCorduck 2004, pp. 430–435 * Crevier 1993, pp. 209–210 * NRC 1999, pp. 214–216 * Newquist 1994, pp. 301–318
  17. a b c AI becomes hugely successful in the early 21st century: * Clark 2015b
  18. ^ Haenlein, Michael; Kaplan, Andreas (2019). “A Brief History of Artificial Intelligence: On the Past, Present, and Future of Artificial Intelligence”. California Management Review. 61 (4): 5–14. doi:10.1177/0008125619864925. ISSN 0008-1256. S2CID 199866730.
  19. a b Pamela McCorduck (2004, p. 424) writes of “the rough shattering of AI in subfields—vision, natural language, decision theory, genetic algorithms, robotics … and these with own sub-subfield—that would hardly have anything to say to each other.”
  20. a b c This list of intelligent traits is based on the topics covered by the major AI textbooks, including: * Russell & Norvig 2003 * Luger & Stubblefield 2004 * Poole, Mackworth & Goebel 1998 * Nilsson 1998
  21. ^ Kolata 1982.
  22. ^ Maker 2006.
  23. a b c Biological intelligence vs. intelligence in general:
    • Russell & Norvig 2003, pp. 2–3, who make the analogy with aeronautical engineering.
    • McCorduck 2004, pp. 100–101, who writes that there are “two major branches of artificial intelligence: one aimed at producing intelligent behavior regardless of how it was accomplished, and the other aimed at modeling intelligent processes found in nature, particularly human ones.”
    • Kolata 1982, a paper in Science, which describes McCarthy’s indifference to biological models. Kolata quotes McCarthy as writing: “This is AI, so we don’t care if it’s psychologically real”.[21] McCarthy recently reiterated his position at the AI@50 conference where he said “Artificial intelligence is not, by definition, simulation of human intelligence”.[22]
  24. a b c Neats vs. scruffies: * McCorduck 2004, pp. 421–424, 486–489 * Crevier 1993, p. 168 * Nilsson 1983, pp. 10–11
  25. a b Symbolic vs. sub-symbolic AI: * Nilsson (1998, p. 7), who uses the term “sub-symbolic”.
  26. a b General intelligence (strong AI) is discussed in popular introductions to AI: * Kurzweil 1999 and Kurzweil 2005
  27. ^ See the Dartmouth proposal, under Philosophy, below.
  28. ^ McCorduck 2004, p. 34.
  29. ^ McCorduck 2004, p. xviii.
  30. ^ McCorduck 2004, p. 3.
  31. ^ McCorduck 2004, pp. 340–400.
  32. a b This is a central idea of Pamela McCorduck’s Machines Who Think. She writes:
    • “I like to think of artificial intelligence as the scientific apotheosis of a venerable cultural tradition.”[28]
    • “Artificial intelligence in one form or another is an idea that has pervaded Western intellectual history, a dream in urgent need of being realized.”[29]
    • “Our history is full of attempts—nutty, eerie, comical, earnest, legendary and real—to make artificial intelligences, to reproduce what is the essential us—bypassing the ordinary means. Back and forth between myth and reality, our imaginations supplying what our workshops couldn’t, we have engaged for a long time in this odd form of self-reproduction.”[30]
    She traces the desire back to its Hellenistic roots and calls it the urge to “forge the Gods.”[31]
  33. ^ “Stephen Hawking believes AI could be mankind’s last accomplishment”. BetaNews. 21 October 2016. Archived from the original on 28 August 2017.
  34. ^ Lombardo P, Boehm I, Nairz K (2020). “RadioComics – Santa Claus and the future of radiology”. Eur J Radiol. 122 (1): 108771. doi:10.1016/j.ejrad.2019.108771. PMID 31835078.
  35. a b Ford, Martin; Colvin, Geoff (6 September 2015). “Will robots create more jobs than they destroy?”. The Guardian. Archived from the original on 16 June 2018. Retrieved 13 January 2018.
  36. a b AI applications widely used behind the scenes: * Russell & Norvig 2003, p. 28 * Kurzweil 2005, p. 265 * NRC 1999, pp. 216–222 * Newquist 1994, pp. 189–201
  37. a b AI in myth: * McCorduck 2004, pp. 4–5 * Russell & Norvig 2003, p. 939
  38. ^ AI in early science fiction: * McCorduck 2004, pp. 17–25
  39. ^ Formal reasoning: * Berlinski, David (2000). The Advent of the Algorithm. Harcourt Books. ISBN 978-0-15-601391-8. OCLC 46890682. Archived from the original on 26 July 2020. Retrieved 22 August 2020.
  40. ^ Turing, Alan (1948), “Machine Intelligence”, in Copeland, B. Jack (ed.), The Essential Turing: The ideas that gave birth to the computer age, Oxford: Oxford University Press, p. 412, ISBN 978-0-19-825080-7
  41. ^ Russell & Norvig 2009, p. 16.
  42. ^ Dartmouth conference: * McCorduck 2004, pp. 111–136 * Crevier 1993, pp. 47–49, who writes “the conference is generally recognized as the official birthdate of the new science.” * Russell & Norvig 2003, p. 17, who call the conference “the birth of artificial intelligence.” * NRC 1999, pp. 200–201
  43. ^ McCarthy, John (1988). “Review of The Question of Artificial Intelligence”. Annals of the History of Computing. 10 (3): 224–229, collected in McCarthy, John (1996). “10. Review of The Question of Artificial Intelligence”. Defending AI Research: A Collection of Essays and Reviews. CSLI, p. 73: “[O]ne of the reasons for inventing the term “artificial intelligence” was to escape association with “cybernetics”. Its concentration on analog feedback seemed misguided, and I wished to avoid having either to accept Norbert (not Robert) Wiener as a guru or having to argue with him.”
  44. ^ Hegemony of the Dartmouth conference attendees: * Russell & Norvig 2003, p. 17, who write “for the next 20 years the field would be dominated by these people and their students.” * McCorduck 2004, pp. 129–130
  45. ^ Russell & Norvig 2003, p. 18: “it was astonishing whenever a computer did anything kind of smartish”
  46. ^ Schaeffer J. (2009) Didn’t Samuel Solve That Game? In: One Jump Ahead. Springer, Boston, MA.
  47. ^ Samuel, A. L. (July 1959). “Some Studies in Machine Learning Using the Game of Checkers”. IBM Journal of Research and Development. 3 (3): 210–229. CiteSeerX
  48. ^ “Golden years” of AI (successful symbolic reasoning programs 1956–1973): * McCorduck 2004, pp. 243–252 * Crevier 1993, pp. 52–107 * Moravec 1988, p. 9 * Russell & Norvig 2003, pp. 18–21 The programs described are Arthur Samuel’s checkers program for the IBM 701, Daniel Bobrow’s STUDENT, Newell and Simon’s Logic Theorist and Terry Winograd’s SHRDLU.
  49. ^ DARPA pours money into undirected pure research into AI during the 1960s: * McCorduck 2004, p. 131 * Crevier 1993, pp. 51, 64–65 * NRC 1999, pp. 204–205
  50. ^ AI in England: * Howe 1994
  51. ^ Lighthill 1973.
  52. a b Expert systems: * ACM 1998, I.2.1 * Russell & Norvig 2003, pp. 22–24 * Luger & Stubblefield 2004, pp. 227–331 * Nilsson 1998, chpt. 17.4 * McCorduck 2004, pp. 327–335, 434–435 * Crevier 1993, pp. 145–62, 197–203 * Newquist 1994, pp. 155–183
  53. ^ Mead, Carver A.; Ismail, Mohammed (8 May 1989). Analog VLSI Implementation of Neural Systems (PDF). The Kluwer International Series in Engineering and Computer Science. 80. Norwell, MA: Kluwer Academic Publishers. doi:10.1007/978-1-4613-1639-8. ISBN 978-1-4613-1639-8. Archived from the original (PDF) on 6 November 2019. Retrieved 24 January 2020.
  54. a b Formal methods are now preferred (“Victory of the neats”): * Russell & Norvig 2003, pp. 25–26 * McCorduck 2004, pp. 486–487
  55. ^ McCorduck 2004, pp. 480–483.
  56. ^ Markoff 2011.
  57. ^ “Ask the AI experts: What’s driving today’s progress in AI?”. McKinsey & Company. Archived from the original on 13 April 2018. Retrieved 13 April 2018.
  58. ^ Fairhead, Harry (26 March 2011) [Update 30 March 2011]. “Kinect’s AI breakthrough explained”. I Programmer. Archived from the original on 1 February 2016.
  59. ^ Rowinski, Dan (15 January 2013). “Virtual Personal Assistants & The Future Of Your Smartphone [Infographic]”. ReadWrite. Archived from the original on 22 December 2015.
  60. ^ “Artificial intelligence: Google’s AlphaGo beats Go master Lee Se-dol”. BBC News. 12 March 2016. Archived from the original on 26 August 2016. Retrieved 1 October 2016.
  61. ^ Metz, Cade (27 May 2017). “After Win in China, AlphaGo’s Designers Explore New AI”. Wired. Archived from the original on 2 June 2017.
  62. ^ “World’s Go Player Ratings”. May 2017. Archived from the original on 1 April 2017.
  63. ^ “柯洁迎19岁生日 雄踞人类世界排名第一已两年” (in Chinese). May 2017. Archived from the original on 11 August 2017.
  64. ^ “MuZero: Mastering Go, chess, shogi and Atari without rules”. Deepmind. Retrieved 1 March 2021.
  65. ^ Steven Borowiec; Tracey Lien (12 March 2016). “AlphaGo beats human Go champ in milestone for artificial intelligence”. Los Angeles Times. Retrieved 13 March 2016.
  66. ^ Silver, David; Hubert, Thomas; Schrittwieser, Julian; Antonoglou, Ioannis; Lai, Matthew; Guez, Arthur; Lanctot, Marc; Sifre, Laurent; Kumaran, Dharshan; Graepel, Thore; Lillicrap, Timothy; Simonyan, Karen; Hassabis, Demis (7 December 2018). “A general reinforcement learning algorithm that masters chess, shogi, and go through self-play”. Science. 362 (6419): 1140–1144. Bibcode:2018Sci...362.1140S. doi:10.1126/science.aar6404. PMID 30523106.
  67. ^ Schrittwieser, Julian; Antonoglou, Ioannis; Hubert, Thomas; Simonyan, Karen; Sifre, Laurent; Schmitt, Simon; Guez, Arthur; Lockhart, Edward; Hassabis, Demis; Graepel, Thore; Lillicrap, Timothy (23 December 2020). “Mastering Atari, Go, chess and shogi by planning with a learned model”. Nature. 588 (7839): 604–609. arXiv:1911.08265. doi:10.1038/s41586-020-03051-4. ISSN 1476-4687.
  68. ^ Tung, Liam. “Google’s DeepMind artificial intelligence aces Atari gaming challenge”. ZDNet. Retrieved 1 March 2021.
  69. ^ Solly, Meilan. “This Poker-Playing A.I. Knows When to Hold ’Em and When to Fold ’Em”. Smithsonian. Pluribus has bested poker pros in a series of six-player no-limit Texas Hold’em games, reaching a milestone in artificial intelligence research. It is the first bot to beat humans in a complex multiplayer competition.
  70. a b Clark 2015b. “After a half-decade of quiet breakthroughs in artificial intelligence, 2015 has been a landmark year. Computers are smarter and learning faster than ever.”
  71. ^ “Reshaping Business With Artificial Intelligence”. MIT Sloan Management Review. Archived from the original on 19 May 2018. Retrieved 2 May 2018.
  72. ^ Lorica, Ben (18 December 2017). “The state of AI adoption”. O’Reilly Media. Archived from the original on 2 May 2018. Retrieved 2 May 2018.
  73. ^ Allen, Gregory (6 February 2019). “Understanding China’s AI Strategy”. Center for a New American Security. Archived from the original on 17 March 2019.
  74. ^ “Review | How two AI superpowers – the U.S. and China – battle for supremacy in the field”. The Washington Post. 2 November 2018. Archived from the original on 4 November 2018. Retrieved 4 November 2018.
  75. ^ Anadiotis, George (1 October 2020). “The state of AI in 2020: Democratization, industrialization, and the way to artificial general intelligence”. ZDNet. Retrieved 1 March 2021.
  76. ^ Heath, Nick (11 December 2020). “What is AI? Everything you need to know about Artificial Intelligence”. ZDNet. Retrieved 1 March 2021.
  77. ^ Kaplan, Andreas; Haenlein, Michael (1 January 2019). “Siri, Siri, in my hand: Who’s the fairest in the land? On the interpretations, illustrations, and implications of artificial intelligence”. Business Horizons. 62 (1): 15–25. doi:10.1016/j.bushor.2018.08.004.
  78. ^ Domingos 2015, Chapter 5.
  79. ^ Domingos 2015, Chapter 7.
  80. ^ Lindenbaum, M., Markovitch, S., & Rusakov, D. (2004). Selective sampling for nearest neighbor classifiers. Machine learning, 54(2), 125–152.
  81. ^ Domingos 2015, Chapter 1.
  82. a b Intractability and efficiency and the combinatorial explosion: * Russell & Norvig 2003, pp. 9, 21–22
  83. ^ Domingos 2015, Chapter 2, Chapter 3.
  84. ^ Hart, P. E.; Nilsson, N. J.; Raphael, B. (1972). “Correction to “A Formal Basis for the Heuristic Determination of Minimum Cost Paths””. SIGART Newsletter (37): 28–29. doi:10.1145/1056777.1056779. S2CID 6386648.
  85. ^ Domingos 2015, Chapter 2, Chapter 4, Chapter 6.
  86. ^ “Can neural network computers learn from experience, and if so, could they ever become what we would call ‘smart’?”. Scientific American. 2018. Archived from the original on 25 March 2018. Retrieved 24 March 2018.
  87. ^ Domingos 2015, Chapter 6, Chapter 7.
  88. ^ Domingos 2015, p. 286.
  89. ^ “Single pixel change fools AI programs”. BBC News. 3 November 2017. Archived from the original on 22 March 2018. Retrieved 12 March 2018.
  90. ^ “AI Has a Hallucination Problem That’s Proving Tough to Fix”. WIRED. 2018. Archived from the original on 12 March 2018. Retrieved 12 March 2018.
  91. ^ “Cultivating Common Sense”. Discover Magazine. 2017. Archived from the original on 25 March 2018. Retrieved 24 March 2018.
  92. ^ Davis, Ernest; Marcus, Gary (24 August 2015). “Commonsense reasoning and commonsense knowledge in artificial intelligence”. Communications of the ACM. 58 (9): 92–103. doi:10.1145/2701413. S2CID 13583137. Archived from the original on 22 August 2020. Retrieved 6 April 2020.
  93. ^ Winograd, Terry (January 1972). “Understanding natural language”. Cognitive Psychology. 3 (1): 1–191. doi:10.1016/0010-0285(72)90002-3.
  94. ^ “Don’t worry: Autonomous cars aren’t coming tomorrow (or next year)”. Autoweek. 2016. Archived from the original on 25 March 2018. Retrieved 24 March 2018.
  95. ^ Knight, Will (2017). “Boston may be famous for bad drivers, but it’s the testing ground for a smarter self-driving car”. MIT Technology Review. Archived from the original on 22 August 2020. Retrieved 27 March 2018.
  96. ^ Prakken, Henry (31 August 2017). “On the problem of making autonomous vehicles conform to traffic law”. Artificial Intelligence and Law. 25 (3): 341–363. doi:10.1007/s10506-017-9210-0.
  97. a b Lieto, Antonio; Lebiere, Christian; Oltramari, Alessandro (May 2018). “The knowledge level in cognitive architectures: Current limitations and possible developments”. Cognitive Systems Research. 48: 39–55. doi:10.1016/j.cogsys.2017.05.001. hdl:2318/1665207. S2CID 206868967.
  98. ^ Problem solving, puzzle solving, game playing and deduction: * Russell & Norvig 2003, chpt. 3–9, * Poole, Mackworth & Goebel 1998, chpt. 2,3,7,9, * Luger & Stubblefield 2004, chpt. 3,4,6,8, * Nilsson 1998, chpt. 7–12
  99. ^ Uncertain reasoning: * Russell & Norvig 2003, pp. 452–644, * Poole, Mackworth & Goebel 1998, pp. 345–395, * Luger & Stubblefield 2004, pp. 333–381, * Nilsson 1998, chpt. 19
  100. ^ Psychological evidence of sub-symbolic reasoning: * Wason & Shapiro (1966) showed that people do poorly on completely abstract problems, but if the problem is restated to allow the use of intuitive social intelligence, performance dramatically improves. (See Wason selection task) * Kahneman, Slovic & Tversky (1982) have shown that people are terrible at elementary problems that involve uncertain reasoning. (See list of cognitive biases for several examples). * Lakoff & Núñez (2000) have controversially argued that even our skills at mathematics depend on knowledge and skills that come from “the body”, i.e. sensorimotor and perceptual skills. (See Where Mathematics Comes From)
  101. ^ Knowledge representation: * ACM 1998, I.2.4, * Russell & Norvig 2003, pp. 320–363, * Poole, Mackworth & Goebel 1998, pp. 23–46, 69–81, 169–196, 235–277, 281–298, 319–345, * Luger & Stubblefield 2004, pp. 227–243, * Nilsson 1998, chpt. 18
  102. ^ Knowledge engineering: * Russell & Norvig 2003, pp. 260–266, * Poole, Mackworth & Goebel 1998, pp. 199–233, * Nilsson 1998, chpt. ≈17.1–17.4
  103. ^ Representing categories and relations: Semantic networks, description logics, inheritance (including frames and scripts): * Russell & Norvig 2003, pp. 349–354, * Poole, Mackworth & Goebel 1998, pp. 174–177, * Luger & Stubblefield 2004, pp. 248–258, * Nilsson 1998, chpt. 18.3
  104. ^ Representing events and time: Situation calculus, event calculus, fluent calculus (including solving the frame problem): * Russell & Norvig 2003, pp. 328–341, * Poole, Mackworth & Goebel 1998, pp. 281–298, * Nilsson 1998, chpt. 18.2
  105. ^ Causal calculus: * Poole, Mackworth & Goebel 1998, pp. 335–337
  106. ^ Representing knowledge about knowledge: Belief calculus, modal logics: * Russell & Norvig 2003, pp. 341–344, * Poole, Mackworth & Goebel 1998, pp. 275–277
  107. ^ Sikos, Leslie F. (June 2017). Description Logics in Multimedia Reasoning. Cham: Springer. doi:10.1007/978-3-319-54066-5. ISBN 978-3-319-54066-5. S2CID 3180114. Archived from the original on 29 August 2017.
  108. ^ Ontology: * Russell & Norvig 2003, pp. 320–328
  109. ^ Smoliar, Stephen W.; Zhang, HongJiang (1994). “Content based video indexing and retrieval”. IEEE Multimedia. 1 (2): 62–72. doi:10.1109/93.311653. S2CID 32710913.
  110. ^ Neumann, Bernd; Möller, Ralf (January 2008). “On scene interpretation with description logics”. Image and Vision Computing. 26 (1): 82–101. doi:10.1016/j.imavis.2007.08.013.
  111. ^ Kuperman, G. J.; Reichley, R. M.; Bailey, T. C. (1 July 2006). “Using Commercial Knowledge Bases for Clinical Decision Support: Opportunities, Hurdles, and Recommendations”. Journal of the American Medical Informatics Association. 13 (4): 369–371. doi:10.1197/jamia.M2055. PMC 1513681. PMID 16622160.
  112. ^ McGarry, Ken (1 December 2005). “A survey of interestingness measures for knowledge discovery”. The Knowledge Engineering Review. 20 (1): 39–61. doi:10.1017/S0269888905000408. S2CID 14987656.
  113. ^ Bertini, M; Del Bimbo, A; Torniai, C (2006). “Automatic annotation and semantic retrieval of video sequences using multimedia ontologies”. MM ’06 Proceedings of the 14th ACM international conference on Multimedia. 14th ACM international conference on Multimedia. Santa Barbara: ACM. pp. 679–682.
  114. ^ Qualification problem: * McCarthy & Hayes 1969 * Russell & Norvig 2003. While McCarthy was primarily concerned with issues in the logical representation of actions, Russell & Norvig 2003 apply the term to the more general issue of default reasoning in the vast network of assumptions underlying all our commonsense knowledge.
  115. ^ Default reasoning and default logic, non-monotonic logics, circumscription, closed world assumption, abduction (Poole et al. places abduction under “default reasoning”. Luger et al. places this under “uncertain reasoning”): * Russell & Norvig 2003, pp. 354–360, * Poole, Mackworth & Goebel 1998, pp. 248–256, 323–335, * Luger & Stubblefield 2004, pp. 335–363, * Nilsson 1998, ~18.3.3
  116. ^ Breadth of commonsense knowledge: * Russell & Norvig 2003, p. 21, * Crevier 1993, pp. 113–114, * Moravec 1988, p. 13, * Lenat & Guha 1989 (Introduction)
  117. ^ Dreyfus & Dreyfus 1986.
  118. ^ Gladwell 2005.
  119. a b Expert knowledge as embodied intuition: * Dreyfus & Dreyfus 1986 (Hubert Dreyfus is a philosopher and critic of AI who was among the first to argue that most useful human knowledge was encoded sub-symbolically. See Dreyfus’ critique of AI) * Gladwell 2005 (Gladwell’s Blink is a popular introduction to sub-symbolic reasoning and knowledge.) * Hawkins & Blakeslee 2005 (Hawkins argues that sub-symbolic knowledge should be the primary focus of AI research.)
  120. ^ Planning: * ACM 1998, ~I.2.8, * Russell & Norvig 2003, pp. 375–459, * Poole, Mackworth & Goebel 1998, pp. 281–316, * Luger & Stubblefield 2004, pp. 314–329, * Nilsson 1998, chpt. 10.1–2, 22
  121. ^ Information value theory: * Russell & Norvig 2003, pp. 600–604
  122. ^ Classical planning: * Russell & Norvig 2003, pp. 375–430, * Poole, Mackworth & Goebel 1998, pp. 281–315, * Luger & Stubblefield 2004, pp. 314–329, * Nilsson 1998, chpt. 10.1–2, 22
  123. ^ Planning and acting in non-deterministic domains: conditional planning, execution monitoring, replanning and continuous planning: * Russell & Norvig 2003, pp. 430–449
  124. ^ Multi-agent planning and emergent behavior: * Russell & Norvig 2003, pp. 449–455
  125. ^ Turing 1950.
  126. ^ Solomonoff 1956.
  127. a b Learning: * ACM 1998, I.2.6, * Russell & Norvig 2003, pp. 649–788, * Poole, Mackworth & Goebel 1998, pp. 397–438, * Luger & Stubblefield 2004, pp. 385–542, * Nilsson 1998, chpt. 3.3, 10.3, 17.5, 20
  128. ^ Jordan, M. I.; Mitchell, T. M. (16 July 2015). “Machine learning: Trends, perspectives, and prospects”. Science. 349 (6245): 255–260. Bibcode:2015Sci...349..255J. doi:10.1126/science.aaa8415. PMID 26185243. S2CID 677218.
  129. ^ Reinforcement learning: * Russell & Norvig 2003, pp. 763–788 * Luger & Stubblefield 2004, pp. 442–449
  130. ^ Natural language processing: * ACM 1998, I.2.7 * Russell & Norvig 2003, pp. 790–831 * Poole, Mackworth & Goebel 1998, pp. 91–104 * Luger & Stubblefield 2004, pp. 591–632
  131. ^ “Versatile question answering systems: seeing in synthesis” Archived 1 February 2016 at the Wayback Machine, Mittal et al., IJIIDS, 5(2), 119–142, 2011
  132. ^ Applications of natural language processing, including information retrieval (i.e. text mining) and machine translation: * Russell & Norvig 2003, pp. 840–857, * Luger & Stubblefield 2004, pp. 623–630
  133. ^ Cambria, Erik; White, Bebo (May 2014). “Jumping NLP Curves: A Review of Natural Language Processing Research [Review Article]”. IEEE Computational Intelligence Magazine. 9 (2): 48–57. doi:10.1109/MCI.2014.2307227. S2CID 206451986.
  134. ^ Vincent, James (7 November 2019). “OpenAI has published the text-generating AI it said was too dangerous to share”. The Verge. Archived from the original on 11 June 2020. Retrieved 11 June 2020.
  135. ^ Machine perception: * Russell & Norvig 2003, pp. 537–581, 863–898 * Nilsson 1998, ~chpt. 6
  136. ^ Speech recognition: * ACM 1998, ~I.2.7 * Russell & Norvig 2003, pp. 568–578
  137. ^ Object recognition: * Russell & Norvig 2003, pp. 885–892
  138. ^ Computer vision: * ACM 1998, I.2.10 * Russell & Norvig 2003, pp. 863–898 * Nilsson 1998, chpt. 6
  139. ^ Robotics: * ACM 1998, I.2.9, * Russell & Norvig 2003, pp. 901–942, * Poole, Mackworth & Goebel 1998, pp. 443–460
  140. ^ Moving and configuration space: * Russell & Norvig 2003, pp. 916–932
  141. ^ Tecuci 2012.
  142. ^ Robotic mapping (localization, etc): * Russell & Norvig 2003, pp. 908–915
  143. ^ Cadena, Cesar; Carlone, Luca; Carrillo, Henry; Latif, Yasir; Scaramuzza, Davide; Neira, Jose; Reid, Ian; Leonard, John J. (December 2016). “Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age”. IEEE Transactions on Robotics. 32 (6): 1309–1332. arXiv:1606.05830. Bibcode:2016arXiv160605830C. doi:10.1109/TRO.2016.2624754. S2CID 2596787.
  144. ^ Moravec 1988, p. 15.
  145. ^ Chan, Szu Ping (15 November 2015). “This is what will happen when robots take over the world”. Archived from the original on 24 April 2018. Retrieved 23 April 2018.
  146. ^ “IKEA furniture and the limits of AI”. The Economist. 2018. Archived from the original on 24 April 2018. Retrieved 24 April 2018.
  147. ^ “Kismet”. MIT Artificial Intelligence Laboratory, Humanoid Robotics Group. Archived from the original on 17 October 2014. Retrieved 25 October 2014.
  148. ^ Thompson, Derek (2018). “What Jobs Will the Robots Take?”. The Atlantic. Archived from the original on 24 April 2018. Retrieved 24 April 2018.
  149. ^ Scassellati, Brian (2002). “Theory of mind for a humanoid robot”. Autonomous Robots. 12 (1): 13–24. doi:10.1023/A:1013298507114. S2CID 1979315.
  150. ^ Cao, Yongcan; Yu, Wenwu; Ren, Wei; Chen, Guanrong (February 2013). “An Overview of Recent Progress in the Study of Distributed Multi-Agent Coordination”. IEEE Transactions on Industrial Informatics. 9 (1): 427–438. arXiv:1207.3231. doi:10.1109/TII.2012.2219061. S2CID 9588126.
  151. ^ Thro 1993.
  152. ^ Edelson 1991.
  153. ^ Tao & Tan 2005.
  154. ^ Poria, Soujanya; Cambria, Erik; Bajpai, Rajiv; Hussain, Amir (September 2017). “A review of affective computing: From unimodal analysis to multimodal fusion”. Information Fusion. 37: 98–125. doi:10.1016/j.inffus.2017.02.003. hdl:1893/25490.
  155. ^ Emotion and affective computing: * Minsky 2006
  156. ^ Waddell, Kaveh (2018). “Chatbots Have Entered the Uncanny Valley”. The Atlantic. Archived from the original on 24 April 2018. Retrieved 24 April 2018.
  157. ^ Pennachin, C.; Goertzel, B. (2007). “Contemporary Approaches to Artificial General Intelligence”. Artificial General Intelligence. Cognitive Technologies. Berlin, Heidelberg: Springer. doi:10.1007/978-3-540-68677-4_1. ISBN 978-3-540-23733-4.
  158. a b c Roberts, Jacob (2016). “Thinking Machines: The Search for Artificial Intelligence”. Distillations. Vol. 2 no. 2. pp. 14–23. Archived from the original on 19 August 2018. Retrieved 20 March 2018.
  159. ^ “The superhero of artificial intelligence: can this genius keep it in check?”. The Guardian. 16 February 2016. Archived from the original on 23 April 2018. Retrieved 26 April 2018.
  160. ^ Mnih, Volodymyr; Kavukcuoglu, Koray; Silver, David; Rusu, Andrei A.; Veness, Joel; Bellemare, Marc G.; Graves, Alex; Riedmiller, Martin; Fidjeland, Andreas K.; Ostrovski, Georg; Petersen, Stig; Beattie, Charles; Sadik, Amir; Antonoglou, Ioannis; King, Helen; Kumaran, Dharshan; Wierstra, Daan; Legg, Shane; Hassabis, Demis (26 February 2015). “Human-level control through deep reinforcement learning”. Nature. 518 (7540): 529–533. Bibcode:2015Natur.518..529M. doi:10.1038/nature14236. PMID 25719670. S2CID 205242740.
  161. ^ Sample, Ian (14 March 2017). “Google’s DeepMind makes AI program that can learn like a human”. The Guardian. Archived from the original on 26 April 2018. Retrieved 26 April 2018.
  162. ^ “From not working to neural networking”. The Economist. 2016. Archived from the original on 31 December 2016. Retrieved 26 April 2018.
  163. ^