diff --git a/1-Intro/README.md b/1-Intro/README.md index d9e4564546dfada01507dda36925a47689b7cd93..d0cf425ee0f7bb269de85ae368e3689f0c4c818c 100644 --- a/1-Intro/README.md +++ b/1-Intro/README.md @@ -10,7 +10,7 @@ Originally, computers were invented by [Charles Babbage](https://en.wikipedia.org/wiki/Charles_Babbage) to operate on numbers following a well-defined procedure - an algorithm. Modern computers, even though significantly more advanced than the original model proposed in the 19th century, still follow the same idea of controlled computations. Thus it is possible to program a computer to do something if we know the exact sequence of steps that we need to do in order to achieve the goal. - + > Photo by [Vickie Soshnikova](http://twitter.com/vickievalerie) @@ -40,7 +40,7 @@ To see the ambiguity of a term *intelligence*, try answering a question: "Is a c --- -When speaking about AGI we need to have some way to tell if we have created a truly intelligent system. [Alan Turing](https://en.wikipedia.org/wiki/Alan_Turing) proposed a way called a **[Turing Test](https://en.wikipedia.org/wiki/Turing_test)**, which also acts like a definition of intelligence. The test compares a given system to something inherently intelligent - a real human being. And because any automatic comparison can be bypassed by a computer program, we use a human interrogator. So, if a human being is unable to distinguish between a real person and a computer system in text-based dialogue - the system is considered intelligent. +When speaking about AGI we need to have some way to tell if we have created a truly intelligent system. [Alan Turing](https://en.wikipedia.org/wiki/Alan_Turing) proposed a way called a **[Turing Test](https://en.wikipedia.org/wiki/Turing_test)**, which also acts like a definition of intelligence. 
The test compares a given system to something inherently intelligent - a real human being, and because any automatic comparison can be bypassed by a computer program, we use a human interrogator. So, if a human being is unable to distinguish between a real person and a computer system in text-based dialogue - the system is considered intelligent. > A chat-bot called [Eugene Goostman](https://en.wikipedia.org/wiki/Eugene_Goostman), developed in St.Petersburg, came close to passing the Turing test in 2014 by using a clever personality trick. It announced up front that it was a 13-year old Ukrainian boy, which would explain the lack of knowledge and some discrepancies in the text. The bot convinced 30% of the judges that it was human after a 5 minute dialogue, a metric that Turing believed a machine would be able to pass by 2000. However, one should understand that this does not indicate that we have created an intelligent system, or that a computer system has fooled the human interrogator - the system didn't fool the humans, but rather the bot creators did! @@ -68,22 +68,21 @@ We will consider those approaches later in the course, but right now we will foc ### The Top-Down Approach -In a **top-down approach**, we try to model our reasoning. Because we can follow our thoughts when we reason, we can try to formalize this process and program it inside the computer. This is called **symbolic reasoning**. +In a **top-down approach**, we try to model our reasoning. Because we can follow our thoughts when we reason, we can try to formalize this process and program it inside the computer. This is called **symbolic reasoning**. -People tend to have some rules in their head that guide their decision making processes. For example, when a doctor is diagnosing a patient, he or she may realize that a person has a fever, and thus there might be some inflammation going on inside the body. 
By applying a large set of rules to a specific problem a doctor may be able to come up with the final diagnosis. +People tend to have some rules in their head that guide their decision making processes. For example, when a doctor is diagnosing a patient, he or she may realize that a person has a fever, and thus there might be some inflammation going on inside the body. By applying a large set of rules to a specific problem a doctor may be able to come up with the final diagnosis. This approach relies heavily on **knowledge representation** and **reasoning**. Extracting knowledge from a human expert might be the most difficult part, because a doctor in many cases would not know exactly why he or she is coming up with a particular diagnosis. Sometimes the solution just comes up in his or her head without explicit thinking. Some tasks, such as determining the age of a person from a photograph, cannot be at all reduced to manipulating knowledge. ### Bottom-Up Approach -Alternately, we can try to model the simplest elements inside our brain – a neuron. We can construct a so-called **artificial neural network** inside a computer, and then try to teach it to solve problems by giving it examples. This process is similar to how a newborn child learns about his or her surroundings by making observations. +Alternately, we can try to model the simplest elements inside our brain – a neuron. We can construct a so-called **artificial neural network** inside a computer, and then try to teach it to solve problems by giving it examples. This process is similar to how a newborn child learns about his or her surroundings by making observations. ✅ Do a little research on how babies learn. What are the basic elements of a baby's brain? > | What about ML? | | > |--------------|-----------| -> | Part of Artificial Intelligence that is based on computer learning to solve a problem based on some data is called **Machine Learning**. 
We will not consider classical machine learning in this course - we refer you to a separate [Machine Learning for Beginners](http://aka.ms/ml-beginners) curriculum. |  | - +> | Part of Artificial Intelligence that is based on computer learning to solve a problem based on some data is called **Machine Learning**. We will not consider classical machine learning in this course - we refer you to a separate [Machine Learning for Beginners](http://aka.ms/ml-beginners) curriculum. |  | ## A Brief History of AI @@ -127,7 +126,7 @@ Since then, Neural Networks demonstrated very successful behaviour in many tasks --- -Year | Human Parity achieved +Year | Human Parity achieved -----|-------- 2015 | [Image Classification](https://doi.org/10.1109/ICCV.2015.123) 2016 | [Conversational Speech Recognition](https://arxiv.org/abs/1610.05256) diff --git a/1-Intro/assignment.md b/1-Intro/assignment.md index d08f14b26b969f01f9e0bf5305a88b6a33516c98..99aca7662bdcd19c46f3ff5dfed7c792331a3d84 100644 --- a/1-Intro/assignment.md +++ b/1-Intro/assignment.md @@ -1,3 +1,3 @@ # Game Jam -Games are an area that have been heavily influence by developments in AI and ML. In this assignment, write a short paper on a game that you like that has been influenced by the evolution of AI. It should be an old enough game to have been influenced by several types of computer processing systems. A good example is Chess or Go, but also take a look at video games like pong or Pac-Man. Write an essay that discusses the game's past, present, and AI future. \ No newline at end of file +Games are an area that have been heavily influence by developments in AI and ML. In this assignment, write a short paper on a game that you like that has been influenced by the evolution of AI. It should be an old enough game to have been influenced by several types of computer processing systems. A good example is Chess or Go, but also take a look at video games like pong or Pac-Man. 
Write an essay that discusses the game's past, present, and AI future. diff --git a/2-Symbolic/README.md b/2-Symbolic/README.md index 739ce3593434cc2e1935ad39ca7ba9753cab3d5a..c9e38f710aaac2a82d33c97d4d6709d07dc6119b 100644 --- a/2-Symbolic/README.md +++ b/2-Symbolic/README.md @@ -4,7 +4,7 @@ > Sketchnote by [Tomomi Imura](https://twitter.com/girlie_mac) -The quest for artificial intelligence is based on a search for knowledge, to make sense of the world similar to how humans do. But how can you go about doing this? +The quest for artificial intelligence is based on a search for knowledge, to make sense of the world similar to how humans do. But how can you go about doing this? ## [Pre-lecture quiz](https://black-ground-0cc93280f.1.azurestaticapps.net/quiz/3) @@ -49,7 +49,7 @@ We can classify different computer knowledge representation methods in the follo 1. **Object-Attribute-Value triplets** or **attribute-value pairs**. Since a graph can be represented inside a computer as a list of nodes and edges, we can represent a semantic network by a list of triplets, containing objects, attributes, and values. For example, we build the following triplets about programming languages: -Object | Attribute | Value +Object | Attribute | Value -------|-----------|------ Python | is | Untyped-Language Python | invented-by | Guido van Rossum @@ -60,8 +60,8 @@ Untyped-Language | doesn't have | type definitions 2. **Hierarchical representations** emphasize the fact that we often create a hierarchy of objects inside our head. For example, we know that canary is a bird, and all birds have wings. We also have some idea about what colour canary usually is, and what is their flight speed. - - **Frame representation** is based on representing each object or class of objects as a **frame** which contains **slots**. Slots have possible default values, value restrictions, or stored procedures that can be called to obtain the value of a slot. 
All frames form a hierarchy similar to an object hierarchy in object-oriented programming languages. - - **Scenarios** are special kind of frames that represent complex situations that can unfold in time. + - **Frame representation** is based on representing each object or class of objects as a **frame** which contains **slots**. Slots have possible default values, value restrictions, or stored procedures that can be called to obtain the value of a slot. All frames form a hierarchy similar to an object hierarchy in object-oriented programming languages. + - **Scenarios** are special kind of frames that represent complex situations that can unfold in time. **Python** Slot | Value | Default value | Interval | @@ -76,19 +76,20 @@ Block Syntax | Indent | | | - Production rules are if-then statements that allow us to draw conclusions. For example, a doctor can have a rule saying that **IF** a patient has high fever **OR** high level of C-reactive protein in blood test **THEN** he has an inflammation. Once we encounter one of the conditions, we can make a conclusion about inflammation, and then use it in further reasoning. - Algorithms can be considered another form of procedural representation, although they are almost never used directly in knowledge-based systems. -4. **Logic** was originally proposed by Aristotle as a way to represent universal human knowledge +4. **Logic** was originally proposed by Aristotle as a way to represent universal human knowledge. - Predicate Logic as a mathematical theory is too rich to be computable, therefore some subset of it is normally used, such as Horn clauses used in Prolog. - - Descriptive Logic is a family of logical systems used to represent and reason about hierarchies of objects distributed knowledge representations such as *semantic web*. + - Descriptive Logic is a family of logical systems used to represent and reason about hierarchies of objects distributed knowledge representations such as *semantic web*. 
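The object-attribute-value triplets from the first item of the list above can be sketched as a plain list of tuples. This is only a minimal illustration: the helper names `query` and `inherited_facts` are hypothetical and not part of the curriculum code, and the data is limited to the rows visible in the table.

```python
# Hypothetical sketch: a semantic network stored as (object, attribute, value) triplets.
triplets = [
    ("Python", "is", "Untyped-Language"),
    ("Python", "invented-by", "Guido van Rossum"),
    ("Untyped-Language", "doesn't have", "type definitions"),
]


def query(facts, obj=None, attr=None, value=None):
    """Return all triplets matching the given fields; None acts as a wildcard."""
    return [
        (o, a, v)
        for (o, a, v) in facts
        if (obj is None or o == obj)
        and (attr is None or a == attr)
        and (value is None or v == value)
    ]


def inherited_facts(facts, obj):
    """Collect facts about obj and about every class reachable through 'is' links."""
    collected = []
    seen = set()
    frontier = [obj]
    while frontier:
        current = frontier.pop()
        if current in seen:
            continue  # guard against cycles in the graph
        seen.add(current)
        for (o, a, v) in query(facts, obj=current):
            collected.append((o, a, v))
            if a == "is":
                frontier.append(v)  # follow the hierarchy upwards
    return collected


print(query(triplets, obj="Python", attr="invented-by"))
print(inherited_facts(triplets, "Python"))
```

Following the `is` links shows how a flat list of triplets can still carry the hierarchy that frames make explicit: the fact about `Untyped-Language` is reached when asking about `Python`.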
## Expert Systems -One of the early successes of symbolic AI were so-called **expert systems** - computer systems that were designed to act as an expert in some limited problem domain. There were based on a **knowledge base** extracted from one or more human experts, and they contained an **inference engine** that performed some reasoning on top of it. +One of the early successes of symbolic AI were so-called **expert systems** - computer systems that were designed to act as an expert in some limited problem domain. They were based on a **knowledge base** extracted from one or more human experts, and they contained an **inference engine** that performed some reasoning on top of it.  |  ---------------------------------------------|------------------------------------------------ Simplified structure of a human neural system | Architecture of a knowledge-based system Expert systems are built like the human reasoning system, which contains **short-term memory** and **long-term memory**. Similarly, in knowledge-based systems we distinguish the following components: + * **Problem memory**: contains the knowledge about the problem being currently solved, i.e. the temperature or blood pressure of a patient, whether he has inflammation or not, etc. This knowledge is also called **static knowledge**, because it contains a snapshot of what we currently know about the problem - the so-called *problem state*. * **Knowledge base**: represents long-term knowledge about a problem domain. It is extracted manually from human experts, and does not change from consultation to consultation. Because it allows us to navigate from one problem state to another, it is also called **dynamic knowledge**. * **Inference engine**: orchestrates the whole process of searching in the problem state space, asking questions of the user when necessary. It is also responsible for finding the right rules to be applied to each state. 
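The three components above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the implementation used in the course notebooks: the rule format (a list of OR-ed conditions plus one conclusion), the attribute names, and the function `infer` are all hypothetical. The single rule encodes the fever / C-reactive protein example mentioned earlier.

```python
# Problem memory (static knowledge): what is currently known about one patient.
problem_memory = {"fever": "high"}

# Knowledge base (dynamic knowledge): each rule is (any_of_conditions, conclusion).
# A rule fires when at least one (attribute, value) condition holds.
knowledge_base = [
    (
        [("fever", "high"), ("c_reactive_protein", "high")],
        ("inflammation", "yes"),
    ),
]


def infer(memory, rules):
    """Toy inference engine: apply rules until no new fact can be added."""
    state = dict(memory)  # work on a copy so the caller's memory is untouched
    changed = True
    while changed:
        changed = False
        for any_of, (attr, value) in rules:
            satisfied = any(state.get(a) == v for a, v in any_of)
            # Only add attributes that are not known yet, so the loop terminates.
            if satisfied and attr not in state:
                state[attr] = value
                changed = True
    return state


print(infer(problem_memory, knowledge_base))
```

Keeping `knowledge_base` as plain data, separate from `infer`, mirrors the separation described above: the rules could be edited without touching the engine.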
@@ -100,6 +101,7 @@ As an example, let's consider the following expert system of determining an anim > Image by [Dmitry Soshnikov](http://soshnikov.com) This diagram is called an **AND-OR tree**, and it is a graphical representation of a set of production rules. Drawing a tree is useful at the beginning of extracting knowledge from the expert. To represent the knowledge inside the computer it is more convenient to use rules: + ``` IF the animal eats meat OR (animal has sharp teeth @@ -108,6 +110,7 @@ OR (animal has sharp teeth ) THEN the animal is a carnivore ``` + You can notice that each condition on the left-hand-side of the rule and the action are essentially object-attribute-value (OAV) triplets. **Working memory** contains the set of OAV triplets that correspond to the problem currently being solved. A **rules engine** looks for rules for which a condition is satisfied and applies them, adding another triplet to the working memory. > ✅ Write your own AND-OR tree on a topic you like! @@ -128,6 +131,7 @@ The process described above is called **forward inference**. It starts with some However, in some cases we might want to start with an empty knowledge about the problem, and ask questions that will help us arrive to the conclusion. For example, when doing medical diagnosis, we usually do not perform all medical analyses in advance before starting diagnosing the patient. We rather want to perform analyses when a decision needs to be made. This process can be modeled using **backward inference**. It is driven by the **goal** - the attribute value that we are looking to find: + 1. Select all rules that can give us the value of a goal (i.e. with the goal on the RHS ("right-hand-side")) - a conflict set 1. If there are no rules for this attribute, or there is a rule saying that we should ask the value from the user - ask for it, otherwise: 1. 
Use conflict resolution strategy to select one rule that we will use as *hypothesis* - we will try to prove it @@ -139,23 +143,25 @@ This process can be modeled using **backward inference**. It is driven by the ** ### Implementing Expert Systems Expert systems can be implemented using different tools: + * Programming them directly in some high level programming language. This is not the best idea, because the main advantage of a knowledge-based system is that knowledge is separated from inference, and potentially a problem domain expert should be able to write rules without understanding the details of the inference process -* Using **expert systems shell**, i.e. a system specifically designed to be populated by knowledge using some knowledge representation language. +* Using **expert systems shell**, i.e. a system specifically designed to be populated by knowledge using some knowledge representation language. ## ✍️ Exercise: Animal Inference See [Animals.ipynb](Animals.ipynb) for an example of implementing forward and backward inference expert system. -> **Note**: This example is rather simple, and only gives the idea of how an expert system looks like. Once you start creating such a system, you will only notice some *intelligent* behaviour from it once you reach certain number of rules, around 200+. At some point, rules become too complex to keep all of them in mind, and at this point you may start wondering why a system makes certain decisions. However, the important characteristics of knowledge-based systems is that you can always *explain* exactly any of the decisions were made. +> **Note**: This example is rather simple, and only gives the idea of how an expert system looks like. Once you start creating such a system, you will only notice some *intelligent* behaviour from it once you reach certain number of rules, around 200+. 
At some point, rules become too complex to keep all of them in mind, and at this point you may start wondering why a system makes certain decisions. However, the important characteristics of knowledge-based systems is that you can always *explain* exactly how any of the decisions were made. ## Ontologies and the Semantic Web At the end of 20th century there was an initiative to use knowledge representation to annotate Internet resources, so that it would be possible to find resources that correspond to very specific queries. This motion was called **Semantic Web**, and it relied on several concepts: -- A special knowledge representation based on **[description logics](https://en.wikipedia.org/wiki/Description_logic)** (DL). It is similar to frame knowledge representation, because it builds a hierarchy of objects with properties, but it has formal logical semantics and inference. There is a whole family of DLs which balance between expressiveness and algorithmic complexity of inference. + +- A special knowledge representation based on **[description logics](https://en.wikipedia.org/wiki/Description_logic)** (DL). It is similar to frame knowledge representation, because it builds a hierarchy of objects with properties, but it has formal logical semantics and inference. There is a whole family of DLs which balance between expressiveness and algorithmic complexity of inference. - Distributed knowledge representation, where all concepts are represented by a global URI identifier, making it possible to create knowledge hierarchies that span the internet. - A family of XML-based languages for knowledge description: RDF (Resource Description Framework), RDFS (RDF Schema), OWL (Ontology Web Language). -A core concept in the Semantic Web is a concept of **Ontology**. It refers to a explicit specification of a problem domain using some formal knowledge representation. 
The simplest ontology can be just a hierarchy of objects in a problem domain, but more complex ontologies will include rules that can be used for inference. +A core concept in the Semantic Web is a concept of **Ontology**. It refers to a explicit specification of a problem domain using some formal knowledge representation. The simplest ontology can be just a hierarchy of objects in a problem domain, but more complex ontologies will include rules that can be used for inference. In the semantic web, all representations are based on triplets. Each object and each relation are uniquely identified by the URI. For example, if we want to state the fact that this AI Curriculum has been developed by Dmitry Soshnikov on Jan 1st, 2022 - here are the triplets we can use: @@ -165,7 +171,8 @@ In the semantic web, all representations are based on triplets. Each object and http://github.com/microsoft/ai-for-beginners http://www.example.com/terms/creation-date “Jan 13, 2007” http://github.com/microsoft/ai-for-beginners http://purl.org/dc/elements/1.1/creator http://soshnikov.com ``` -> > ✅ Here `http://www.example.com/terms/creation-date` and `http://purl.org/dc/elements/1.1/creator` are some well-known and universally accepted URIs to express the concepts of *creator* and *creation date*. + +> ✅ Here `http://www.example.com/terms/creation-date` and `http://purl.org/dc/elements/1.1/creator` are some well-known and universally accepted URIs to express the concepts of *creator* and *creation date*. In a more complex case, if we want to define a list of creators, we can use some data structures defined in RDF. @@ -173,8 +180,10 @@ In a more complex case, if we want to define a list of creators, we can use some > Diagrams above by [Dmitry Soshnikov](http://soshnikov.com) -The progress of building the Semantic Web was somehow slowed down by the success of search engines and natural language processing techniques, which allow extracting structured data from text. 
However, in some areas there are still significant efforts to maintain ontologies and knowledgebases. A few projects worth noting: +The progress of building the Semantic Web was somehow slowed down by the success of search engines and natural language processing techniques, which allow extracting structured data from text. However, in some areas there are still significant efforts to maintain ontologies and knowledge bases. A few projects worth noting: + * [WikiData](https://wikidata.org/) is a collection of machine readable knowledge bases associated with Wikipedia. Most of the data is mined from Wikipedia *InfoBoxes*, pieces of structured content inside Wikipedia pages. You can [query](https://query.wikidata.org/) wikidata in SPARQL, a special query language for Semantic Web. Here is a sample query that displays most popular eye colors among humans: + ```sparql #defaultView:BubbleChart SELECT ?eyeColorLabel (COUNT(?human) AS ?count) @@ -186,6 +195,7 @@ WHERE } GROUP BY ?eyeColorLabel ``` + * [DBpedia](https://www.dbpedia.org/) is another effort similar to WikiData. > ✅ If you want to experiment with building your own ontologies, or opening existing ones, there is a great visual ontology editor called [Protégé](https://protege.stanford.edu/). Download it, or use it online. @@ -200,7 +210,7 @@ See [FamilyOntology.ipynb](FamilyOntology.ipynb) for an example of using Semanti ## Microsoft Concept Graph -In most of the cases, ontologies are carefully created by hand. However, it is also possible to **mine** ontologies from unstructured data, for example, from natural language texts. +In most of the cases, ontologies are carefully created by hand. However, it is also possible to **mine** ontologies from unstructured data, for example, from natural language texts. 
One such attempt was done by Microsoft Research, and resulted in [Microsoft Concept Graph](https://blogs.microsoft.com/ai/microsoft-researchers-release-graph-that-helps-machines-conceptualize/?WT.mc_id=academic-57639-dmitryso). diff --git a/2-Symbolic/assignment.md b/2-Symbolic/assignment.md index 343941752929d5966cf02bc5ae46fe07e7bb9b23..b30745005040bf30565c6bcb900449ad091dd828 100644 --- a/2-Symbolic/assignment.md +++ b/2-Symbolic/assignment.md @@ -1,3 +1,3 @@ # Build an Ontology -Building a knowledge base is all about categorizing a model representing facts about a topic. Pick a topic - like a person, a place, or a thing - and then build a model of that topic. Use some of the techniques and model-building strategies described in this lesson. An example would be creating an ontology of a living room with furniture, lights, and so on. How does the living room differ from the kitchen? The bathroom? How do you know it's a living room and not a dining room? Use [Protégé](https://protege.stanford.edu/) to build your ontology. \ No newline at end of file +Building a knowledge base is all about categorizing a model representing facts about a topic. Pick a topic - like a person, a place, or a thing - and then build a model of that topic. Use some of the techniques and model-building strategies described in this lesson. An example would be creating an ontology of a living room with furniture, lights, and so on. How does the living room differ from the kitchen? The bathroom? How do you know it's a living room and not a dining room? Use [Protégé](https://protege.stanford.edu/) to build your ontology. 
diff --git a/3-NeuralNetworks/03-Perceptron/README.md b/3-NeuralNetworks/03-Perceptron/README.md index 72551d0e6356620359235c2b1d894f03899b04df..4b37fb89217ab93880127eeb969ea225d1652cd9 100644 --- a/3-NeuralNetworks/03-Perceptron/README.md +++ b/3-NeuralNetworks/03-Perceptron/README.md @@ -4,7 +4,7 @@ One of the first attempts to implement something similar to a modern neural network was done by Frank Rosenblatt from Cornell Aeronautical Laboratory in 1957. It was a hardware implementation called "Mark-1", designed to recognize primitive geometric figures, such as triangles, squares and circles. -| | | +| | | |--------------|-----------| |<img src='images/Rosenblatt-wikipedia.jpg' alt='Frank Rosenblatt'/> | <img src='images/Mark_I_perceptron_wikipedia.jpg' alt='The Mark 1 Perceptron' />| @@ -14,8 +14,7 @@ An input image was represented by 20x20 photocell array, so the neural network h > ✅ A potentiometer is a device that allows the user to adjust the resistance of a circuit. -> The New York Times wrote about perceptron at that time: -> *the embryo of an electronic computer that [the Navy] expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence.* +> The New York Times wrote about perceptron at that time: *the embryo of an electronic computer that [the Navy] expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence.* ## Perceptron Model @@ -37,7 +36,7 @@ E(w) = -∑w<sup>T</sup>x<sub>i</sub>t<sub>i</sub> where: * the sum is taken on those training data points i that result in the wrong classification -* x<sub>i</sub> is the input data, and t<sub>i</sub> is either -1 or +1 for negative and positive examples accordingly. +* x<sub>i</sub> is the input data, and t<sub>i</sub> is either -1 or +1 for negative and positive examples accordingly. This criteria is considered as a function of weights w, and we need to minimize it. 
Often, a method called **gradient descent** is used, in which we start with some initial weights w<sup>(0)</sup>, and then at each step update the weights according to the formula: @@ -71,7 +70,7 @@ def train(positive_examples, negative_examples, num_iterations = 100, eta = 1): ## Conclusion -In this lesson, you learned about a perceptron, which is a binary classification model, and how to training it by using a weights vector. +In this lesson, you learned about a perceptron, which is a binary classification model, and how to train it by using a weights vector. ## 🚀 Challenge diff --git a/3-NeuralNetworks/README.md b/3-NeuralNetworks/README.md index 9391e1b4bef2512965589308878df4c86ec4a7a6..144ea748c86fcee44453fd06ebad763195db2cec 100644 --- a/3-NeuralNetworks/README.md +++ b/3-NeuralNetworks/README.md @@ -2,7 +2,7 @@  -As we discussed in the introduction, one of the ways to achieve intelligence is to train a **computer model** or an **artificial brain**. Since the middle of 20th century, researchers tried different mathematical models, until in recent years this direction proved to by hugely successful. Such mathematical models of the brain are called **neural networks**. +As we discussed in the introduction, one of the ways to achieve intelligence is to train a **computer model** or an **artificial brain**. Since the middle of 20th century, researchers tried different mathematical models, until in recent years this direction proved to by hugely successful. Such mathematical models of the brain are called **neural networks**. > Sometimes neural networks are called *Artificial Neural Networks*, ANNs, in order to indicate that we are talking about models, not real networks of neurons. @@ -15,7 +15,8 @@ Neural Networks are a part of a larger discipline called **Machine Learning**, w In Machine Learning, we assume that we have some dataset of examples **X**, and corresponding output values **Y**. 
Examples are often N-dimensional vectors that consist of **features**, and outputs are called **labels**. We will consider the two most common machine learning problems: -* **Classification**, where we need to classify an input object into two or more classes. + +* **Classification**, where we need to classify an input object into two or more classes. * **Regression**, where we need to predict a numerical number for each of the input samples. > When representing inputs and outputs as tensors, the input dataset is a matrix of size M×N, where M is number of samples and N is the number of features. Output labels Y is the vector of size M. @@ -38,7 +39,6 @@ where f is some non-linear **activation function**. > Early models of neuron were described in the classical paper [A logical calculus of the ideas immanent in nervous activity](http://www.springerlink.com/content/61446605110620kg/fulltext.pdf) by Warren McCullock and Walter Pitts in 1943. Donald Hebb in his book "[The Organization of Behavior: A Neuropsychological Theory](https://books.google.com/books?id=VNetYrB8EBoC)" proposed the way those networks can be trained. - ## In this Section In this section we will learn about: