Cactus Identification fastai v1.0.46 ensemble

Author: https://www.kaggle.com/mnpinto

From: https://www.kaggle.com/mnpinto/cactus-identification-fastai-v1-0-46-ensemble

License: Apache 2.0

Score: 0.9996

Cactus Identification fastai baseline

In [1]:

import fastai
from fastai.vision import *
from sklearn.model_selection import KFold

In [2]:

# Copy pretrained model weights to the default path
!mkdir '/tmp/.torch'
!mkdir '/tmp/.torch/models/'
#!cp '../input/resnet18/resnet18.pth' '/tmp/.torch/models/resnet18-5c106cde.pth'
#!cp '../input/densenet121/densenet121.pth' '/tmp/.torch/models/densenet121-a639ec97.pth'
!cp '../input/densenet201/densenet201.pth' '/tmp/.torch/models/densenet201-c1103571.pth'

In [3]:

fastai.__version__

Out[3]:

'1.0.46'

In [4]:

data_path = Path('../input/aerial-cactus-identification')
df = pd.read_csv(data_path/'train.csv')
df.head()

Out[4]:

	id	has_cactus
0	0004be2cfeaba1c0361d39e2b000257b.jpg	1
---	---	---
1	000c8a36845c0208e833c79c1bffedd1.jpg	1
---	---	---
2	000d1e9a533f62e55c289303b072733d.jpg	1
---	---	---
3	0011485b40695e9138e92d0b3fb55128.jpg	1
---	---	---
4	0014d7a11e90b62848904c1418fc8cf2.jpg	1
---	---	---

In [5]:

sub_csv = pd.read_csv(data_path/'sample_submission.csv')
sub_csv.head()

Out[5]:

	id	has_cactus
0	000940378805c44108d287872b2f04ce.jpg	0.5
---	---	---
1	0017242f54ececa4512b4d7937d1e21e.jpg	0.5
---	---	---
2	001ee6d8564003107853118ab87df407.jpg	0.5
---	---	---
3	002e175c3c1e060769475f52182583d0.jpg	0.5
---	---	---
4	0036e44a7e8f7218e9bc7bf8137e4943.jpg	0.5
---	---	---

In [6]:

def create_databunch(valid_idx):
    test = ImageList.from_df(sub_csv, path=data_path/'test', folder='test')
    data = (ImageList.from_df(df, path=data_path/'train', folder='train')
            .split_by_idx(valid_idx)
            .label_from_df()
            .add_test(test)
            .transform(get_transforms(flip_vert=True, max_rotate=20.0), size=128)
            .databunch(path='.', bs=64)
            .normalize(imagenet_stats)
           )
    return data

5 fold ensemble

In [7]:

kf = KFold(n_splits=5, random_state=379)
epochs = 6
lr = 1e-2
preds = []
for train_idx, valid_idx in kf.split(df):
    data = create_databunch(valid_idx)
    learn = create_cnn(data, models.densenet201, metrics=[accuracy])
    learn.fit_one_cycle(epochs, slice(lr))
    learn.unfreeze()
    learn.fit_one_cycle(epochs, slice(lr/400, lr/4))
    learn.fit_one_cycle(epochs, slice(lr/800, lr/8))
    preds.append(learn.get_preds(ds_type=DatasetType.Test))

Total time: 06:07

epoch	train_loss	valid_loss	accuracy	time
1	0.056767	0.018588	0.994000	01:08
2	0.022490	0.010378	0.995429	01:00
3	0.025892	0.003561	0.998857	01:00
4	0.012079	0.038481	0.996571	00:59
5	0.006535	0.002324	0.999143	00:57
6	0.003622	0.002164	0.999143	01:01

66.67% [4/6 04:47<02:23]

epoch	train_loss	valid_loss	accuracy	time
1	0.007291	0.003572	0.998286	01:12
2	0.014519	0.005793	0.998286	01:11
3	0.008674	0.003707	0.998286	01:10
4	0.007575	0.001763	0.999143	01:13

91.28% [199/218 00:58<00:05 0.0037]In [8]:

ens = torch.cat([preds[i][0][:,1].view(-1, 1) for i in range(5)], dim=1)
ens  = (ens.mean(1)>0.5).long(); ens[:10]

Out[8]:

tensor([1, 1, 0, 0, 1, 1, 1, 1, 1, 0])

In [9]:

sub_csv['has_cactus'] = ens

In [10]:

sub_csv.to_csv('submission.csv', index=False)

我们一直在努力

apachecn/AiLearning