
Day 23 - Cupoy


categorical values only, control group, ml100-2, ml100-2-d23


2019/05/09 05:34 PM
Machine Learning Beginners Forum
Patrick Liou

# Keep only the categorical (object) columns and store their names in object_features
object_features = []
for dtype, feature in zip(df.dtypes, df.columns):
    if dtype == 'object':
        object_features.append(feature)
print(f'{len(object_features)} Object Features : {object_features}\n')
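For comparison, here is a minimal sketch of the same column selection done with `DataFrame.select_dtypes`. The `toy` frame and its column names are made up for illustration and are not the course data; the object dtype is set explicitly so the result does not depend on how a particular pandas version infers string columns.

```python
import pandas as pd

# Hypothetical toy frame (not the course data); string columns are
# declared as object dtype explicitly.
toy = pd.DataFrame({
    'Sex': pd.Series(['male', 'female', None], dtype=object),
    'Age': [22.0, 38.0, 26.0],
    'Embarked': pd.Series(['S', 'C', 'S'], dtype=object),
})

# Loop version, equivalent to the code in the post
loop_features = [col for dtype, col in zip(toy.dtypes, toy.columns)
                 if dtype == 'object']

# Same selection using select_dtypes
selected_features = toy.select_dtypes(include='object').columns.tolist()

print(loop_features)
print(selected_features)
```

Both lists should contain the same column names, so either form can feed `df[object_features]`.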


# Keep only the categorical columns
df = df[object_features]
df = df.fillna('None')
train_num = train_Y.shape[0]  # ---> I don't quite understand the purpose of this line
df.head()
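One plausible reading of `train_num`, sketched under an assumption the post does not show: that `df` was built earlier by stacking the training features on top of the test features. In that case `train_Y.shape[0]` is the number of labelled training rows, and it marks where the training part of `df` ends. The names `df_train` and `df_test` and all values below are hypothetical.

```python
import pandas as pd

# Hypothetical stand-ins for data loaded earlier in the notebook
# (that step is not shown in the post).
df_train = pd.DataFrame(
    {'Sex': pd.Series(['male', 'female', 'female', 'male'], dtype=object)})
df_test = pd.DataFrame(
    {'Sex': pd.Series(['female', 'male'], dtype=object)})
train_Y = pd.Series([0, 1, 1, 0])  # labels exist only for the training rows

# Assumed earlier step: training rows first, test rows after them.
df = pd.concat([df_train, df_test], ignore_index=True)

# Number of labelled rows = number of training rows at the top of df.
train_num = train_Y.shape[0]

train_part = df[:train_num]   # rows that have a label in train_Y
test_part = df[train_num:]    # rows without labels

print(df.shape, train_part.shape, test_part.shape)
```

If `df` in the notebook holds only training rows, `train_num` equals `len(df)` and the slice changes nothing.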


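# Control group: label encoding + logistic regression
import time
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import LabelEncoder

df_temp = pd.DataFrame()
for c in df.columns:
    df_temp[c] = LabelEncoder().fit_transform(df[c])
train_X = df_temp[:train_num]  # ---> Why add [:train_num]? Would writing just df_temp make any difference?
estimator = LogisticRegression()
start = time.time()
print(f'shape : {train_X.shape}')
print(f'score : {cross_val_score(estimator, train_X, train_Y, cv=5).mean()}')
print(f'time : {time.time() - start} sec')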
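To see whether `[:train_num]` matters, here is a small sketch on made-up data. It again assumes the encoded frame contains unlabelled test rows after the training rows, which the post does not confirm. Under that assumption, `cross_val_score` needs `X` and `y` with the same number of rows, so passing the full frame raises a `ValueError`, while the sliced frame works. All names and values are hypothetical.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import LabelEncoder

# Hypothetical data: 10 labelled training rows followed by 4 unlabelled test rows.
colors = ['red', 'blue'] * 5 + ['red', 'green', 'blue', 'green']
df_all = pd.DataFrame({'color': pd.Series(colors, dtype=object)})
train_Y = pd.Series([1, 0] * 5)
train_num = train_Y.shape[0]

# Encode over all rows, so a category that appears only in the
# test part ('green') still receives a code.
encoded = pd.DataFrame(
    {c: LabelEncoder().fit_transform(df_all[c]) for c in df_all.columns})

# Sliced to the labelled rows: X and y have the same length.
train_X = encoded[:train_num]
scores = cross_val_score(LogisticRegression(), train_X, train_Y, cv=2)
print(f'shape : {train_X.shape}, folds scored : {len(scores)}')

# Unsliced: 14 feature rows against 10 labels is rejected.
try:
    cross_val_score(LogisticRegression(), encoded, train_Y, cv=2)
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc
print(f'unsliced call failed : {mismatch_error is not None}')
```

Encoding before slicing is a design choice in this sketch: it keeps the category codes consistent between the training and test parts.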