Hsuan-Tien Lin (林軒田): Machine Learning Foundations, an essential introduction to machine learning

Lecture 1: The Learning Problem

what machine learning is and its connection to applications and other fields

Course Introduction

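What is "learning"? Learning is how humans acquire a skill or ability through observation and accumulated experience. Learning to recognize letters or Chinese characters as a child is one example of this process. Machine learning, as the name suggests, lets a machine (a computer) do what humans do: by observing large amounts of data and training on it, the machine discovers underlying patterns and gains the ability to analyze and solve certain problems. Lecture handout (PDF): https://www.csie.ntu.edu.tw/~htlin/course/mlfound18fall/doc/01_handout.pdf

What Is Machine Learning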

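When should machine learning be used to solve a problem? Machine learning is now applied very widely and shows up in almost every setting. The situations where it applies can be summarized by three conditions:
1. The problem has some underlying pattern to be learned.
2. The problem is hard to solve with conventional programming.
3. A large number of data samples is available.

Applications of Machine Learning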

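Machine learning is widely used in clothing, food, housing, transportation, education, entertainment, and many other areas of daily life. For example, when a user opens a shopping site, the site automatically recommends products the user may like; a movie channel recommends films to different users based on their browsing and viewing history.

Components of Machine Learning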

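This course uses a few basic terms for a machine learning problem:
1. Input x.
2. Output y.
3. Target function f: the pattern that best matches the true distribution of the samples.
4. Training samples (data).
5. Hypothesis: a machine learning model corresponds to many different hypotheses. The algorithm A selects the best hypothesis, and the corresponding function is called g. g is the best available representation of the underlying pattern and is the final model we want to obtain.

Machine Learning and Other Fields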

This lecture introduces what machine learning is and when it can be used to solve a problem, then shows the whole learning process as a flowchart, and finally compares machine learning with three related fields: data mining, artificial intelligence, and statistics.

Lecture 2: Learning to Answer Yes/No

your first learning algorithm (and the world's first!) that "draws the line" between yes and no by adaptively searching for a good line based on data
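
The "draws the line" idea can be sketched as a perceptron-style update loop. This is a minimal sketch, not the lecture's own code: the function names, the `max_passes` cutoff, and the four toy points are assumptions made for illustration, and the loop only terminates cleanly when the data are linearly separable.

```python
def sign(value):
    """Return +1 for positive scores and -1 otherwise."""
    return 1 if value > 0 else -1


def pla(points, labels, max_passes=1000):
    """Search for weights w such that sign(w . (1, x)) matches every label.

    w[0] plays the role of the threshold; each input is augmented with x0 = 1.
    max_passes is an illustrative safety cutoff, not part of the algorithm.
    """
    w = [0.0] * (len(points[0]) + 1)
    for _ in range(max_passes):
        mistakes = 0
        for x, y in zip(points, labels):
            xa = [1.0] + list(x)
            score = sum(wi * xi for wi, xi in zip(w, xa))
            if sign(score) != y:
                # Correct the mistake by moving w toward y * x.
                w = [wi + y * xi for wi, xi in zip(w, xa)]
                mistakes += 1
        if mistakes == 0:
            return w
    raise RuntimeError("no separating line found within max_passes")


# Toy, linearly separable data (made up for this example).
points = [(2.0, 3.0), (3.0, 4.0), (-1.0, -2.0), (-2.0, -1.0)]
labels = [1, 1, -1, -1]
w = pla(points, labels)
predictions = [sign(w[0] + w[1] * x1 + w[2] * x2) for x1, x2 in points]
```

Each update only uses one misclassified example, which is what makes the search adaptive: the line moves just enough to fix the current mistake.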

Lecture 3: Types of Learning

learning comes with many possibilities in different applications, with our focus being binary classification or regression from a batch of supervised data with concrete features

Lecture 4: Feasibility of Learning

learning can be "probably approximately correct" when given enough statistical data and a finite number of hypotheses
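
One common way to state this guarantee is Hoeffding's inequality combined with a union bound over the hypotheses. The symbols below are assumptions for this note: $M$ is the number of hypotheses, $N$ the number of examples, $\epsilon$ the tolerance, and $E_{\text{in}}$, $E_{\text{out}}$ the in-sample and out-of-sample errors of the chosen hypothesis $g$.

```latex
\mathbb{P}\bigl[\,\lvert E_{\text{in}}(g) - E_{\text{out}}(g) \rvert > \epsilon \,\bigr]
\;\le\; 2 M \exp\!\left(-2 \epsilon^{2} N\right)
```

For fixed $M$ and $\epsilon$, the right-hand side shrinks exponentially in $N$, which is the sense in which enough data makes learning "probably approximately correct".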

Lecture 5: Training versus Testing

what we pay in choosing hypotheses during training: the growth function for representing effective number of choices

Lecture 6: Theory of Generalization

test error can approximate training error if there is enough data and growth function does not grow too fast

Lecture 7: The VC Dimension

learning happens if there is finite model complexity (called VC dimension), enough data, and low training error
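
The three conditions can be read off one commonly quoted form of the VC bound. The notation is an assumption for this note: $d_{\text{vc}}$ is the VC dimension, $N$ the number of examples, and $\delta$ the allowed failure probability. With probability at least $1 - \delta$:

```latex
E_{\text{out}}(g) \;\le\; E_{\text{in}}(g)
  + \sqrt{\frac{8}{N} \ln\!\left(\frac{4\,(2N)^{d_{\text{vc}}}}{\delta}\right)}
```

A finite $d_{\text{vc}}$ and a large $N$ keep the square-root term small, and a low training error $E_{\text{in}}(g)$ keeps the first term small.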

Lecture 8: Noise and Error

learning can still happen within a noisy environment and different error measures

Lecture 9: Linear Regression

weight vector for linear hypotheses and squared error instantly calculated by analytic solution
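
As a rough illustration of the analytic solution, the sketch below solves the normal equations for the simplest case: one input feature plus an intercept. The function name `fit_line` and the toy data are assumptions made for this example. The general case is usually written $w = (X^{T}X)^{-1}X^{T}y$ and would normally be computed with a linear algebra library.

```python
def fit_line(xs, ys):
    """Least-squares fit of y ~ w0 + w1 * x via the 2x2 normal equations."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Determinant of X^T X = [[n, sx], [sx, sxx]].
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("inputs are degenerate; X^T X is not invertible")
    w0 = (sxx * sy - sx * sxy) / det
    w1 = (n * sxy - sx * sy) / det
    return w0, w1


def squared_error(w0, w1, xs, ys):
    """Average squared error of the fitted line on the given data."""
    return sum((w0 + w1 * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy data lying exactly on y = 1 + 2x (made up for this example).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w0, w1 = fit_line(xs, ys)
err = squared_error(w0, w1, xs, ys)
```

No iteration is involved: the weights come out of a single closed-form computation, which is what "instantly calculated" refers to.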

Lecture 10: Logistic Regression

gradient descent on cross-entropy error to get good logistic hypothesis
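
A minimal sketch of fixed-step gradient descent on the cross-entropy error, assuming labels in {-1, +1} and inputs already augmented with a leading 1. The step size `eta`, the step count, and the toy data are illustrative assumptions, not values from the course.

```python
import math


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def theta(s):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-s))


def cross_entropy_error(w, data):
    """Average of ln(1 + exp(-y * w.x)) over the data."""
    return sum(math.log(1.0 + math.exp(-y * dot(w, x))) for x, y in data) / len(data)


def gradient(w, data):
    """Gradient of the cross-entropy error: mean of theta(-y w.x) * (-y x)."""
    grad = [0.0] * len(w)
    for x, y in data:
        scale = theta(-y * dot(w, x))
        for i, xi in enumerate(x):
            grad[i] += scale * (-y * xi)
    return [g / len(data) for g in grad]


def gradient_descent(data, eta=0.1, steps=500):
    """Take `steps` fixed-size steps against the gradient, starting from w = 0."""
    w = [0.0] * len(data[0][0])
    for _ in range(steps):
        w = [wi - eta * gi for wi, gi in zip(w, gradient(w, data))]
    return w


# Toy data: (augmented input, label), made up for this example.
data = [([1.0, 2.0], 1), ([1.0, 1.5], 1), ([1.0, -1.0], -1), ([1.0, -2.0], -1)]
initial_error = cross_entropy_error([0.0, 0.0], data)
w = gradient_descent(data)
final_error = cross_entropy_error(w, data)
```

The resulting hypothesis is `theta(dot(w, x))`, read as an estimate of the probability that the label is +1.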

Lecture 11: Linear Models for Classification

binary classification via (logistic) regression; multiclass classification via OVA/OVO decomposition

Lecture 12: Nonlinear Transformation

nonlinear model via nonlinear feature transform + linear model, at the price of model complexity
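
To make the idea concrete, the sketch below applies a second-order polynomial transform to 2-D inputs and then classifies with an ordinary linear rule in the transformed space. The names `phi2` and `classify` and the hand-picked weights are assumptions for illustration; in practice the weights would be learned by a linear algorithm.

```python
def phi2(x1, x2):
    """Second-order polynomial feature transform of a 2-D input."""
    return [1.0, x1, x2, x1 * x1, x1 * x2, x2 * x2]


def classify(w, z):
    """Linear classifier applied in the transformed space."""
    return 1 if sum(wi * zi for wi, zi in zip(w, z)) > 0 else -1


# Hand-picked weights (illustrative): the sign of 0.6 - x1^2 - x2^2,
# i.e. +1 inside a circle around the origin and -1 outside it.
w = [0.6, 0.0, 0.0, -1.0, 0.0, -1.0]

inside = classify(w, phi2(0.1, 0.2))
outside = classify(w, phi2(1.0, 1.0))
```

The boundary is a circle in the original space but a plain hyperplane in the 6-dimensional transformed space. The extra dimensions are the price: more weights to fit means higher model complexity.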

Lecture 13: Hazard of Overfitting

overfitting happens with excessive power, stochastic/deterministic noise and limited data

Lecture 14: Regularization

minimize augmented error, where the added regularizer effectively limits model complexity
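
As one example, assuming the common weight-decay (L2) regularizer with strength $\lambda \ge 0$ and $N$ training examples, the augmented error and its closed-form minimizer for linear regression on transformed inputs $Z$ can be written as:

```latex
E_{\text{aug}}(w) = E_{\text{in}}(w) + \frac{\lambda}{N}\, w^{T} w,
\qquad
w_{\text{reg}} = \left(Z^{T} Z + \lambda I\right)^{-1} Z^{T} y
```

Setting $\lambda = 0$ recovers the unregularized solution; larger $\lambda$ pulls the weights toward zero, which is how the regularizer limits effective model complexity.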

Lecture 15: Validation

(crossly) reserve validation data to simulate testing procedure for model selection
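
A minimal sketch of k-fold cross validation, assuming contiguous folds and a deliberately trivial "model" that predicts the mean of its training labels. The helper names and the toy labels are assumptions for illustration; for model selection one would compute this error for each candidate model and keep the smallest.

```python
def k_fold_splits(n, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    if not 2 <= k <= n:
        raise ValueError("need 2 <= k <= n")
    # Spread the remainder over the first n % k folds.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, val
        start += size


def cross_validation_error(ys, k):
    """Average validation squared error of a mean-predicting model."""
    total = 0.0
    for train, val in k_fold_splits(len(ys), k):
        # "Train" on the training fold only.
        mean = sum(ys[i] for i in train) / len(train)
        # Evaluate on the held-out fold, which simulates testing.
        total += sum((ys[i] - mean) ** 2 for i in val) / len(val)
    return total / k


# Toy labels (made up for this example).
ys = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
splits = list(k_fold_splits(len(ys), 3))
e_cv = cross_validation_error(ys, 3)
```

Every example is used for validation exactly once and is never seen by the model that is evaluated on it.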

Lecture 16: Three Learning Principles

be aware of model complexity, data goodness and your professionalism