Day 10 Day 11

2019/04/26 01:15 AM

Machine Learning Co-learning Discussion Board

Patrick Liou


ml100-2

Syntax

ml100-2-d10

ml100-2-d11

Day 10

1. plt.plot(app_train['EXT_SOURCE_3'], np.log10(app_train['TARGET']), '.')
   plt.xlabel('EXT_SOURCE_3')
   plt.ylabel('loc_TARGET')
   plt.show()

   corr = np.corrcoef(app_train['EXT_SOURCE_3'] , np.log10(app_train['TARGET'])

   print("Correlation: %.4f" % (corr[0][1]))

Finally, I wanted to print the correlation coefficient (np.corrcoef) as in the example, but the code will not run. Where is the error?

2. for col in app_train:
       if app_train[col].dtype == 'object':
           # If the categorical column has only two distinct values
           if len(list(app_train[col].unique())) <= 2:    -> What does unique() mean?
               # Apply the Label Encoder so the column can be included in the correlation check
               app_train[col] = le.fit_transform(app_train[col])

Day 11

1. bin_cut = [i for i in range(20, 70, (70-20) // 10)] + [70] -> What does // mean?

bin_cut = 10 -> Can't I just use 10 here to split the bins?

2. for i in range(len(year_group_sorted)):
       sns.distplot(age_data.loc[(age_data['YEARS_BINNED'] == year_group_sorted[i]) & \
                                 (age_data['TARGET'] == 0), 'YEARS_BIRTH'], label = str(year_group_sorted[i]))
       sns.distplot(age_data.loc[(age_data['YEARS_BINNED'] == year_group_sorted[i]) & \
                                 (age_data['TARGET'] == 1), 'YEARS_BIRTH'], label = str(year_group_sorted[i]))

-> What do & and \ mean here?

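3. np.random.normal(0, 10, 1000) -> generates 1000 normally distributed values within the range

np.random.randint(0, 50, 1000) -> generates 1000 random values within the range

np.random.randn(10) -> generates 10 normally distributed values; no range can be set

Is my understanding above correct?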

Answers

2019/04/26 01:56 PM

張維元 (WeiYuan)


1. plt.plot(app_train['EXT_SOURCE_3'], np.log10(app_train['TARGET']), '.')
   plt.xlabel('EXT_SOURCE_3')
   plt.ylabel('loc_TARGET')
   plt.show()

   corr = np.corrcoef(app_train['EXT_SOURCE_3'] , np.log10(app_train['TARGET'])

   print("Correlation: %.4f" % (corr[0][1]))

Finally, I wanted to print the correlation coefficient (np.corrcoef) as in the example, but the code will not run. Where is the error?

=> Could you attach the error message?
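
Without the error message one can only guess, but three things in the quoted snippet are likely to cause trouble: the np.corrcoef( call is missing its closing parenthesis (a SyntaxError), np.log10 of a 0/1 TARGET column yields -inf for every 0, and np.corrcoef returns nan when an input contains NaN. The sketch below illustrates these points on a small made-up frame; the values are hypothetical, and the presence of NaN in EXT_SOURCE_3 is an assumption about the real data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for app_train (made-up values, not the course data)
app_train = pd.DataFrame({
    "EXT_SOURCE_3": [0.1, 0.5, np.nan, 0.7, 0.3, 0.9],
    "TARGET": [1, 0, 0, 0, 1, 0],
})

# log10(0) is -inf, so logging a 0/1 column makes the correlation meaningless
with np.errstate(divide="ignore"):
    logged = np.log10(app_train["TARGET"])
print(logged.tolist())

# Drop rows with missing values first; np.corrcoef does not skip NaN
clean = app_train[["EXT_SOURCE_3", "TARGET"]].dropna()

# Note the closing parenthesis that the original call was missing
corr = np.corrcoef(clean["EXT_SOURCE_3"], clean["TARGET"])
print("Correlation: %.4f" % corr[0][1])
```

Here TARGET is used as-is rather than logged, since a log transform of a binary column has no finite value for the 0 class.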

2019/04/26 01:56 PM

張維元 (WeiYuan)


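2. for col in app_train:
       if app_train[col].dtype == 'object':
           # If the categorical column has only two distinct values
           if len(list(app_train[col].unique())) <= 2:    -> What does unique() mean?
               # Apply the Label Encoder so the column can be included in the correlation check
               app_train[col] = le.fit_transform(app_train[col])

=> It returns the distinct (non-duplicated) values!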
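
As a small illustration of that answer, the sketch below calls unique() on two made-up Series (the values are hypothetical) and shows how the `<= 2` test in the loop separates a two-valued column from one with more categories.

```python
import pandas as pd

# Hypothetical categorical columns
contract = pd.Series(["Cash", "Revolving", "Cash", "Cash"])
gender = pd.Series(["F", "M", "XNA", "F"])

# unique() returns each distinct value once, in order of first appearance
distinct_contract = list(contract.unique())
distinct_gender = list(gender.unique())
print(distinct_contract, len(distinct_contract))
print(distinct_gender, len(distinct_gender))

# The same test the loop uses to decide whether to label-encode a column
encode_contract = len(distinct_contract) <= 2   # two values: would be encoded
encode_gender = len(distinct_gender) <= 2       # three values: would be skipped
```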

2019/04/26 01:57 PM

張維元 (WeiYuan)


1. bin_cut = [i for i in range(20, 70, (70-20) // 10)] + [70] -> What does // mean?

=> Which part is unclear? Or tell me how far your understanding goes here.
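
For reference, // is Python's floor-division operator: it divides and rounds down, so (70-20) // 10 gives the integer 5 that range() needs as a step. The sketch below walks through the expression and, assuming bin_cut is later passed to pd.cut (the thread does not show that call), compares explicit edges with a plain integer. The ages are made-up values.

```python
import pandas as pd

step = (70 - 20) // 10           # floor division keeps the result an int
print(step, (70 - 20) / 10)      # 5 5.0
print(7 // 2, -7 // 2)           # 3 -4 (always rounds toward negative infinity)

# range() stops before 70, so the last edge is appended by hand
bin_cut = [i for i in range(20, 70, step)] + [70]
print(bin_cut)                   # [20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70]

ages = pd.Series([21.0, 33.5, 47.2, 58.9, 69.0])   # hypothetical ages

# Explicit edges: bins are fixed at 20, 25, ..., 70 regardless of the data
explicit = pd.cut(ages, bins=bin_cut)

# An integer: pd.cut builds 10 equal-width bins over the data's own min..max
auto = pd.cut(ages, bins=10)

print(explicit.cat.categories)
print(auto.cat.categories)
```

So passing 10 does work with pd.cut, but the edges then depend on the minimum and maximum of the data instead of landing on round ages such as 20, 25, 30.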

2019/04/26 01:58 PM

張維元 (WeiYuan)


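2. for i in range(len(year_group_sorted)):
       sns.distplot(age_data.loc[(age_data['YEARS_BINNED'] == year_group_sorted[i]) & \
                                 (age_data['TARGET'] == 0), 'YEARS_BIRTH'], label = str(year_group_sorted[i]))
       sns.distplot(age_data.loc[(age_data['YEARS_BINNED'] == year_group_sorted[i]) & \
                                 (age_data['TARGET'] == 1), 'YEARS_BIRTH'], label = str(year_group_sorted[i]))

-> What do & and \ mean here?

=> & is the bitwise AND operator; applied to two pandas boolean Series it combines them element by element, so a row is kept only when both conditions hold. \ is a line continuation: it tells Python that the statement continues on the next line, so the two lines are treated as one.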
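
A minimal sketch of both points, using a made-up age_data frame (the bin labels "young" and "old" are hypothetical placeholders for the real YEARS_BINNED intervals):

```python
import pandas as pd

# Hypothetical stand-in for age_data
age_data = pd.DataFrame({
    "YEARS_BIRTH": [25.0, 28.0, 61.0, 64.0],
    "YEARS_BINNED": ["young", "young", "old", "old"],
    "TARGET": [0, 1, 0, 1],
})

# Each comparison produces a boolean Series; & combines them row by row
in_bin = age_data["YEARS_BINNED"] == "young"
repaid = age_data["TARGET"] == 0
mask = in_bin & repaid
print(mask.tolist())

# Same selection written as in the thread, with \ joining the two lines.
# The parentheses around each comparison are required because & binds
# more tightly than ==.
selected = age_data.loc[(age_data["YEARS_BINNED"] == "young") & \
                        (age_data["TARGET"] == 0), "YEARS_BIRTH"]

# Inside brackets Python already continues the line, so \ can be omitted
same = age_data.loc[(age_data["YEARS_BINNED"] == "young")
                    & (age_data["TARGET"] == 0), "YEARS_BIRTH"]
print(selected.tolist())
```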

2019/04/26 02:01 PM

張維元 (WeiYuan)


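3. np.random.normal(0, 10, 1000) -> generates 1000 normally distributed values within the range

np.random.randint(0, 50, 1000) -> generates 1000 random values within the range

np.random.randn(10) -> generates 10 normally distributed values; no range can be set

Is my understanding above correct?

=> Hi, np.random.normal takes a mean and a standard deviation, not a range.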
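
To make the difference concrete, here is a short sketch of the three calls from the question. The seed and the rescaling constants (5 and 2) are arbitrary choices for illustration.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only so the run is repeatable

# normal(loc, scale, size): mean 0, standard deviation 10, 1000 samples.
# The values are not confined to [0, 10]; they can be negative or exceed 10.
normal = np.random.normal(0, 10, 1000)
print(normal.min(), normal.max(), normal.mean(), normal.std())

# randint(low, high, size): integers drawn uniformly from [0, 50); 50 is excluded
ints = np.random.randint(0, 50, 1000)
print(ints.min(), ints.max())

# randn(n): standard normal (mean 0, std 1); it takes only the shape
std_normal = np.random.randn(10)

# A different mean or spread is obtained by rescaling by hand
shifted = 5 + 2 * np.random.randn(10)   # mean 5, std 2
```

So the first statement in the question needs correcting, the second holds with the caveats that the values are integers and the upper bound is excluded, and the third is right: randn has no location or scale arguments.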