Commit
Merge branch 'microsoft:main' into main
BethanyJep authored Jul 11, 2023
2 parents ba35158 + 30090b4 commit a3c6b81
Showing 9 changed files with 43 additions and 15 deletions.
2 changes: 1 addition & 1 deletion .devcontainer/requirements.txt
@@ -19,4 +19,4 @@ tensorboard==2.8.0
tokenizers==0.10.3
torchinfo==0.0.8
tqdm==4.62.3
transformers==4.3.3
transformers==4.30.0
14 changes: 7 additions & 7 deletions README.md
@@ -28,7 +28,7 @@ In this curriculum, you will learn:
What we will not cover in this curriculum:

* Business cases of using **AI in Business**. Consider taking [Introduction to AI for business users](https://docs.microsoft.com/learn/paths/introduction-ai-for-business-users/?WT.mc_id=academic-77998-cacaste) learning path on Microsoft Learn, or [AI Business School](https://www.microsoft.com/ai/ai-business-school/?WT.mc_id=academic-77998-cacaste), developed in cooperation with [INSEAD](https://www.insead.edu/).
* **Classic Machine Learning**, which is well described in our [Machine Learning for Beginners Curriculum](https://github.com/Microsoft/ML-for-Beginners)
* **Classic Machine Learning**, which is well described in our [Machine Learning for Beginners Curriculum](https://github.com/Microsoft/ML-for-Beginners).
* Practical AI applications built using **[Cognitive Services](https://azure.microsoft.com/services/cognitive-services/?WT.mc_id=academic-77998-cacaste)**. For this, we recommend that you start with the Microsoft Learn modules for [vision](https://docs.microsoft.com/learn/paths/create-computer-vision-solutions-azure-cognitive-services/?WT.mc_id=academic-77998-cacaste), [natural language processing](https://docs.microsoft.com/learn/paths/explore-natural-language-processing/?WT.mc_id=academic-77998-cacaste), **[Generative AI with Azure OpenAI Service](https://learn.microsoft.com/en-us/training/paths/develop-ai-solutions-azure-openai/?WT.mc_id=academic-77998-bethanycheum)**, and others.
* Specific ML **Cloud Frameworks**, such as [Azure Machine Learning](https://azure.microsoft.com/services/machine-learning/?WT.mc_id=academic-77998-cacaste), [Microsoft Fabric](https://learn.microsoft.com/en-us/training/paths/get-started-fabric/?WT.mc_id=academic-77998-bethanycheum), or [Azure Databricks](https://docs.microsoft.com/learn/paths/data-engineer-azure-databricks?WT.mc_id=academic-77998-cacaste). Consider using [Build and operate machine learning solutions with Azure Machine Learning](https://docs.microsoft.com/learn/paths/build-ai-solutions-with-azure-ml-service/?WT.mc_id=academic-77998-cacaste) and [Build and Operate Machine Learning Solutions with Azure Databricks](https://docs.microsoft.com/learn/paths/build-operate-machine-learning-solutions-azure-databricks/?WT.mc_id=academic-77998-cacaste) learning paths.
* **Conversational AI** and **Chat Bots**. There is a separate [Create conversational AI solutions](https://docs.microsoft.com/learn/paths/create-conversational-ai-solutions/?WT.mc_id=academic-77998-cacaste) learning path, and you can also refer to [this blog post](https://soshnikov.com/azure/hello-bot-conversational-ai-on-microsoft-platform/) for more detail.
@@ -118,12 +118,12 @@ Get started with the following resources:
However, if you would like to take the course as a self-study project, we suggest that you fork the entire repo to your own GitHub account and complete the exercises on your own or with a group:

- Start with a pre-lecture quiz
- Read the intro text for the lecture
- If the lecture has additional notebooks, go through them, reading and executing the code. If both TensorFlow and PyTorch notebooks are provided, you can focus on one of them - choose your favorite framework
- Notebooks often contain some of the challenges that require you to tweak the code a little bit to experiment
- Take the post-lecture quiz
- If there is a lab attached to the module - complete the assignment
- Start with a pre-lecture quiz.
- Read the intro text for the lecture.
- If the lecture has additional notebooks, go through them, reading and executing the code. If both TensorFlow and PyTorch notebooks are provided, you can focus on one of them - choose your favorite framework.
- Notebooks often contain some of the challenges that require you to tweak the code a little bit to experiment.
- Take the post-lecture quiz.
- If there is a lab attached to the module - complete the assignment.
- Visit the [Discussion board](https://github.com/microsoft/AI-For-Beginners/discussions) to "learn out loud".


2 changes: 1 addition & 1 deletion binder/requirements.txt
@@ -19,4 +19,4 @@ tensorboard==2.8.0
tokenizers==0.10.3
torchinfo==0.0.8
tqdm==4.62.3
transformers==4.3.3
transformers==4.30.0
1 change: 1 addition & 0 deletions lessons/2-Symbolic/README.md
@@ -64,6 +64,7 @@ Untyped-Language | doesn't have | type definitions
- **Scenarios** are a special kind of frame that represents complex situations that can unfold in time.

**Python**

Slot | Value | Default value | Interval |
-----|-------|---------------|----------|
Name | Python | | |
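For illustration, a frame like the one in this table can be rendered as a plain Python data structure. This is a hypothetical sketch, not code from the lesson; the slot layout mirrors the table's columns (value, default value, interval):

```python
# Hypothetical sketch of the "Python" frame from the table above.
# Each slot carries a value, a default value, and an interval, mirroring the table columns.
python_frame = {
    "Name": {"value": "Python", "default": None, "interval": None},
}

def slot_value(frame, slot):
    """Return a slot's value, or its default value when the value is unset."""
    entry = frame.get(slot, {})
    return entry["value"] if entry.get("value") is not None else entry.get("default")

print(slot_value(python_frame, "Name"))  # -> Python
```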
2 changes: 1 addition & 1 deletion lessons/3-NeuralNetworks/03-Perceptron/Perceptron.ipynb
@@ -213,7 +213,7 @@
" * $t_{n} \\in \\{-1, +1\\}$ for negative and positive training samples, respectively\n",
" * $\\mathcal{M}$ - a set of wrongly classified examples\n",
" \n",
"We will use the process of **graident descent**. Starting with some initial random weights $\\mathbf{w}^{(0)}$, we will adjust weights on each step of the training using the gradient of $E$:\n",
"We will use the process of **gradient descent**. Starting with some initial random weights $\\mathbf{w}^{(0)}$, we will adjust weights on each step of the training using the gradient of $E$:\n",
"\n",
"$$\\mathbf{w}^{\\tau + 1}=\\mathbf{w}^{\\tau} - \\eta \\nabla E(\\mathbf{w}) = \\mathbf{w}^{\\tau} + \\eta \\mathbf{x}_{n} t_{n}$$\n",
"\n",
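For context, the update rule in the corrected cell above can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic, linearly separable data, not the notebook's actual training code:

```python
import numpy as np

# Minimal sketch of the perceptron update w <- w + eta * x_n * t_n,
# applied to one misclassified sample per step.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                      # feature vectors x_n
t = np.where(X @ np.array([1.0, 1.0]) > 0, 1, -1)  # labels t_n in {-1, +1}

w = rng.normal(size=2)  # initial random weights w^(0)
eta = 0.1               # learning rate

for _ in range(100):
    misclassified = np.flatnonzero((X @ w) * t <= 0)  # the set M
    if misclassified.size == 0:
        break
    n = misclassified[0]
    w = w + eta * X[n] * t[n]  # one gradient-descent step on E

print("learned weights:", w)
```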
19 changes: 18 additions & 1 deletion lessons/3-NeuralNetworks/05-Frameworks/IntroKeras.ipynb
@@ -1,6 +1,7 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "En2vX4FuwHlu"
@@ -16,6 +17,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "8cACQoFMwHl3"
@@ -65,6 +67,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "6tp2xGV7wHl4"
@@ -83,6 +86,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "A10prCPowHl7"
@@ -185,6 +189,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -206,6 +211,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "SjPlpf2-wHl8"
@@ -251,6 +257,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -292,6 +299,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -313,14 +321,15 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"After compiling the model, we can do the actual training by calling `fit` method. The most important parameters are:\n",
"* `x` and `y` specify training data, features and labels respectively\n",
"* If we want validation to be performed on each epoch, we can specify `validation_data` parameter, which would be a tuple of features and labels\n",
"* `epochs` specified the number of epochs\n",
"* If we want training to happen in minibatches, we can speficu `batch_size` parameter. You can also pre-batch the data manually before passing it to `x`/`y`/`validation_data`, in which case you do not need `batch_size`"
"* If we want training to happen in minibatches, we can specify `batch_size` parameter. You can also pre-batch the data manually before passing it to `x`/`y`/`validation_data`, in which case you do not need `batch_size`"
]
},
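For reference, the `fit` parameters described in this cell can be exercised end to end. A minimal sketch on synthetic data (the model shape and data here are hypothetical, not the notebook's):

```python
import numpy as np
import tensorflow as tf

# Synthetic data standing in for the notebook's dataset.
x_train = np.random.rand(256, 4).astype("float32")
y_train = np.random.randint(0, 2, size=(256,)).astype("float32")
x_val = np.random.rand(64, 4).astype("float32")
y_val = np.random.randint(0, 2, size=(64,)).astype("float32")

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

model.fit(
    x_train, y_train,                # x and y: training features and labels
    validation_data=(x_val, y_val),  # validated at the end of each epoch
    epochs=5,                        # number of training epochs
    batch_size=32,                   # minibatch size; omit if data is pre-batched
)
```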
{
@@ -370,6 +379,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "s4_Atvn5K4K9"
@@ -423,6 +433,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "dvAiaj_JndyP"
@@ -508,6 +519,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -578,6 +590,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -634,6 +647,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -647,6 +661,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "BmHNhUU8bqEX"
@@ -669,6 +684,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "gZ-kWx84bMDH"
@@ -683,6 +699,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "yX6hqiafwHl9"
14 changes: 12 additions & 2 deletions lessons/5-NLP/14-Embeddings/EmbeddingsTF.ipynb
@@ -292,6 +292,16 @@
"> **Note:** When you first create word vectors, downloading them can take some time!"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"import gensim.downloader as api\n",
"w2v = api.load('word2vec-google-news-300')"
]
},
{
"cell_type": "code",
"execution_count": 12,
@@ -409,7 +419,7 @@
"d = np.sum((w2v.vectors-qvec)**2,axis=1)\n",
"min_idx = np.argmin(d)\n",
"# find the corresponding word\n",
"w2v.index2word[min_idx]"
"w2v.index_to_key[min_idx]"
]
},
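The `index2word` to `index_to_key` rename above tracks the KeyedVectors API change in gensim 4.x. The same kind of nearest-vector search is also available as a built-in call; a brief sketch, assuming `w2v` holds the vectors loaded earlier (the word triple is illustrative, not necessarily the notebook's query):

```python
# Built-in analogy query equivalent to the manual distance search above (gensim >= 4.0).
print(w2v.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
```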
{
Expand All @@ -432,7 +442,7 @@
"\n",
"**FastText** tries to overcome the second limitation, and builds on Word2Vec by learning vector representations for each word and the charachter n-grams found within each word. The values of the representations are then averaged into one vector at each training step. While this adds a lot of additional computation to pretraining, it enables word embeddings to encode sub-word information.\n",
"\n",
"Another method, **GloVe**, uses a different approach to word embeddings, based on the factorization of the word-context matrix. First, it builds a large matrix that counts the number of word occurences in different contexts, and then it tries to represent this matrix in lower dimensions in a way that minimizes reconstruction loss.\n",
"Another method, **GloVe**, uses a different approach to word embeddings, based on the factorization of the word-context matrix. First, it builds a large matrix that counts the number of word occurrences in different contexts, and then it tries to represent this matrix in lower dimensions in a way that minimizes reconstruction loss.\n",
"\n",
"The gensim library supports those word embeddings, and you can experiment with them by changing the model loading code above."
]
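As the closing sentence suggests, trying FastText or GloVe is a matter of changing the loading call. A small sketch using names from gensim's downloader catalogue (run `api.info()` to see what is available):

```python
import gensim.downloader as api

# Swap in GloVe vectors; 'glove-wiki-gigaword-100' and
# 'fasttext-wiki-news-subwords-300' are published gensim-data model names.
glove = api.load("glove-wiki-gigaword-100")
print(glove.most_similar("king", topn=3))
```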
2 changes: 1 addition & 1 deletion lessons/5-NLP/requirements-pytorch.txt
@@ -12,4 +12,4 @@ torchaudio==0.8.1
torchinfo==0.0.8
torchtext==0.9.1
torchvision==0.9.1
transformers==4.3.3
transformers==4.30.0
2 changes: 1 addition & 1 deletion lessons/5-NLP/requirements-tf.txt
@@ -10,4 +10,4 @@ scipy
TensorFlow
TensorFlow_datasets
TensorFlow_text
transformers==4.3.3
transformers==4.30.0
