
Why not set the test_size of train_test_split to a smaller value? - Cupoy


2019/06/27 08:17 AM
Machine Learning Co-Learning Discussion Board
黃稚翔
Views: 58
Answers: 2
Bookmarks: 0
ml100-2
ml100-2-d46

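A question for the instructors: the example sets test_size to 0.25, which means train_size is 0.75.

Is the reason for not making test_size smaller to guard against overfitting?

(I tried it myself: with test_size set to 0.1, acc reaches 0.977...)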

Answers

  • 2019/06/27 12:21 PM
    Jimmy
    Upvotes: 1
    Downvotes: 0
    Comments: 0

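    Hi 稚翔!


    The right test size depends on how much data you have in total. If the dataset is very large, holding out 10% for testing still leaves plenty of test samples. But if the dataset is small, say only 50 samples, a 10% split leaves just five, and a test set that small makes it hard to judge how stable the model really is.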
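
To put rough numbers on this point, here is a minimal sketch that treats each test prediction as an independent coin flip and computes the binomial standard error of the measured accuracy. The helper name `accuracy_standard_error`, the assumed true accuracy of 0.9, and the dataset size of 50,000 are illustrative assumptions, not values from the thread.

```python
import math


def accuracy_standard_error(accuracy, n_test):
    """Binomial standard error of an accuracy measured on n_test samples.

    Assumes each test prediction is an independent trial that is correct
    with probability `accuracy` (a simplification used for illustration).
    """
    if n_test <= 0:
        raise ValueError("n_test must be positive")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return math.sqrt(accuracy * (1.0 - accuracy) / n_test)


ASSUMED_ACCURACY = 0.9  # hypothetical true accuracy of the model
TEST_FRACTION = 0.1     # the 10% hold-out discussed above

for n_total in (50, 50_000):  # small vs. large dataset (illustrative sizes)
    n_test = round(n_total * TEST_FRACTION)
    se = accuracy_standard_error(ASSUMED_ACCURACY, n_test)
    print(f"total={n_total:>6}  test samples={n_test:>5}  std. error={se:.4f}")
```

Under these assumptions, five test samples give a standard error of about 0.13, while 5,000 give about 0.004. With only five samples the measured accuracy can also only take the values 0, 0.2, 0.4, 0.6, 0.8, or 1.0, so a single high score on a small test set says little on its own.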


        

  • 2019/07/15 10:22 AM
    張維元 (WeiYuan)
    Upvotes: 0
    Downvotes: 0
    Comments: 0

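    There are two approaches to evaluation:


    1. Split the data into training and testing sets. This avoids building and judging the model on a single set of data: the testing set is used to validate the model fitted on the training set.

    2. Cross-validation, which trains over several rounds and averages the results, so the outcome is not tied too closely to one particular training-testing split.


    I think 2. is closer to the main way of dealing with overfitting.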
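
The second approach can be sketched in plain Python as follows. This is only an illustration of the k-fold idea, not the course's code: `k_fold_indices`, `cross_validate`, and the `score_fn` callback are hypothetical names introduced here. In practice scikit-learn's `KFold` and `cross_val_score` cover the same ground.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at position i,
    # so fold sizes differ by at most one sample.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def cross_validate(score_fn, n_samples, k=5, seed=0):
    """Average score_fn(train_idx, test_idx) over k folds.

    score_fn is a caller-supplied function (an assumption of this sketch)
    that fits a model on train_idx and returns its score on test_idx.
    """
    scores = [score_fn(tr, te) for tr, te in k_fold_indices(n_samples, k, seed)]
    return mean(scores), scores
```

Every sample is used for testing exactly once, so the averaged score depends less on which samples happened to land in a single test split. With the 50-sample case mentioned earlier and `k=5`, each round trains on 40 samples and tests on 10.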