
K-Fold Cross-Validation in Python

1. Two-Fold Cross-Validation

import numpy as np
from sklearn.model_selection import KFold

X = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])

# 2-fold cross-validation: without shuffling, the data is split into a front half
# and a back half, and each half serves as the test set once
kf = KFold(n_splits=2)
for train_index, test_index in kf.split(X):
    print('train_index', train_index, 'test_index', test_index)
    # train_index and test_index are arrays of row indices into X
    train_X = X[train_index]
    test_X = X[test_index]
# These prints sit outside the loop, so they show only the last fold's split
print("train_X", train_X)
print("test_X", test_X)

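Note: because this is two-fold cross-validation, the dataset is split into two blocks, D1 and D2. Each block serves as the test set once while the other serves as the training set.

Output: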

train_index [2 3] test_index [0 1]
train_index [0 1] test_index [2 3]
train_X [[1 2]
 [3 4]]
test_X [[1 2]
 [3 4]]
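
To make the splitting rule concrete, here is a minimal pure-Python sketch of how contiguous, unshuffled folds can be generated by hand. The function name manual_kfold_indices is hypothetical and is not part of scikit-learn; it only mimics the default behaviour of KFold, where the first n_samples % n_splits folds receive one extra sample.

```python
def manual_kfold_indices(n_samples, n_splits):
    """Yield (train_index, test_index) lists for contiguous, unshuffled folds.

    Hypothetical helper for illustration only; it is not a scikit-learn function.
    """
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for k in range(n_splits):
        # The first `extra` folds get one additional sample
        size = base + (1 if k < extra else 0)
        stop = start + size
        test_index = list(range(start, stop))
        # Everything outside the test block is used for training
        train_index = list(range(0, start)) + list(range(stop, n_samples))
        yield train_index, test_index
        start = stop


splits = list(manual_kfold_indices(4, 2))
for train_index, test_index in splits:
    print('train_index', train_index, 'test_index', test_index)
```

For 4 samples and 2 folds this yields the same index pairs as the KFold output above: first test on [0, 1], then test on [2, 3].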

 

2. Three-Fold Cross-Validation

import numpy as np
from sklearn.model_selection import KFold

Y = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]])

# 3-fold cross-validation: the data is split into three consecutive parts,
# and each part serves as the test set once
kf = KFold(n_splits=3)
for i, (train_index, test_index) in enumerate(kf.split(Y), start=1):
    print(i)  # fold number
    print('train_index', train_index, 'test_index', test_index)
    # train_index and test_index are arrays of row indices into Y
    train_Y = Y[train_index]
    test_Y = Y[test_index]

    print("train_Y", train_Y)
    print("test_Y", test_Y)

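Note: three-fold cross-validation splits the whole dataset into three parts, D1, D2, and D3. Each part serves as the test set once while the other two form the training set.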

Output, with the printed fold number replaced by a comment describing each round:

# Round 1: D2 and D3 form the training set, D1 is the test set
train_index [2 3 4 5] test_index [0 1]
train_Y [[ 5  6]
 [ 7  8]
 [ 9 10]
 [11 12]]
test_Y [[1 2]
 [3 4]]
# Round 2: D1 and D3 form the training set, D2 is the test set
train_index [0 1 4 5] test_index [2 3]
train_Y [[ 1  2]
 [ 3  4]
 [ 9 10]
 [11 12]]
test_Y [[5 6]
 [7 8]]
# Round 3: D1 and D2 form the training set, D3 is the test set
train_index [0 1 2 3] test_index [4 5]
train_Y [[1 2]
 [3 4]
 [5 6]
 [7 8]]
test_Y [[ 9 10]
 [11 12]]
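
The splits above are normally used to fit a model on each training part and score it on the matching test part, then average the scores. The following is a minimal sketch of that loop, not something taken from the original example: the targets array is a made-up label vector (the sum of the two features), and LinearRegression is just one possible choice of model.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

# Same feature matrix as in the three-fold example
Y = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]])
# Hypothetical targets (an assumption for illustration): sum of the two features
targets = Y.sum(axis=1)

kf = KFold(n_splits=3)
fold_errors = []
for fold, (train_index, test_index) in enumerate(kf.split(Y), start=1):
    # Fit on the two training parts, evaluate on the held-out part
    model = LinearRegression()
    model.fit(Y[train_index], targets[train_index])
    predictions = model.predict(Y[test_index])
    mse = float(np.mean((predictions - targets[test_index]) ** 2))
    fold_errors.append(mse)
    print('fold', fold, 'test_index', test_index, 'mse', mse)

# The cross-validation estimate is the mean error over all folds
mean_error = float(np.mean(fold_errors))
print('mean mse over folds', mean_error)
```

Because the made-up targets are an exact linear function of the features, the error in every fold should be close to zero. On real data, passing shuffle=True together with a random_state to KFold is often preferable, so that the folds do not depend on the original row order.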