PyTorch [Live] 2019 County Agricultural Brain AI Challenge: Getting Started (1), Image Tiling

Competition page: https://tianchi.aliyun.com/competition/entrance/231717/introduction

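The images provided for this competition are very large, roughly 50,000 x 50,000 pixels, so they must be cut into tiles before training. Typical tile sizes are 512x512 or 1024x1024.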

After cutting at 512x512, the result looks like this:

[Figure: tiles produced by the 512x512 cut]

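A few points to keep in mind when tiling:

Prebuilt binary wheels for gdal can be downloaded from https://www.lfd.uci.edu/~gohlke/pythonlibs/

The images have 4 channels. The first three are RGB; the fourth is the alpha (transparency) channel and is discarded.

Large parts of each image are blank. These regions should be filtered out and not processed.

Tiles should be cut with some redundancy, that is, neighbouring tiles should overlap.

Do not cut at exactly 512, 1024, and so on. Cut somewhat larger tiles, because the data will be augmented later (for example with slight scaling and rotation).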
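The overlap point can be illustrated with a little arithmetic. The following is a minimal sketch, not the code used in this post: `tile_origins` is a hypothetical helper that lists the start offsets of tiles along one axis when the step is smaller than the tile size. As an added assumption, it also places one last tile flush with the far edge so the border strip is not lost.

```python
def tile_origins(length, tile, step):
    """Start offsets of tiles of size `tile` along an axis of `length` pixels.

    Consecutive tiles overlap by (tile - step) pixels. One extra tile is
    aligned to the far edge when the regular grid does not reach it.
    """
    if tile > length:
        return []
    origins = list(range(0, length - tile + 1, step))
    if origins[-1] != length - tile:
        origins.append(length - tile)  # extra tile flush with the edge
    return origins


if __name__ == "__main__":
    # 50000 is the approximate image side mentioned above; 1500 and 900
    # are the tile size and step used later in this post.
    xs = tile_origins(50000, 1500, 900)
    print(len(xs), xs[:3], xs[-1])  # 55 [0, 900, 1800] 48500
```

With a tile of 1500 and a step of 900, each tile shares 600 pixels with its neighbour, so objects that straddle a tile boundary appear whole in at least one tile.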

Here is the code:

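from osgeo import gdal
from PIL import Image
import os


if __name__ == '__main__':
    name = input("Enter the number of the image to tile (1 or 2): ")
    imagepath = './data/image_{}.png'.format(name)
    n = os.path.basename(imagepath)[:-4]
    labelname = './data/' + n + '_label.png'
    ds = gdal.Open(imagepath)
    dslb = gdal.Open(labelname)
    if ds is None or dslb is None:
        raise SystemExit('could not open {} or {}'.format(imagepath, labelname))
    wx = ds.RasterXSize
    wy = ds.RasterYSize

    # The output folders must exist before tiles can be saved.
    imgdir = './data/train/data1500/'
    lbldir = './data/train/label1500/'
    os.makedirs(imgdir, exist_ok=True)
    os.makedirs(lbldir, exist_ok=True)

    step = 900       # stride between tiles; smaller than outsize, so tiles overlap
    outsize = 1500   # tile edge length in pixels
    # Skip a tile when more than 70% of its pixels are 0 in the first channel.
    nullthresh = outsize * outsize * 0.7

    cy = 0
    while cy + outsize <= wy:
        cx = 0
        while cx + outsize <= wx:
            # ReadAsArray returns (bands, rows, cols); keep RGB and drop alpha.
            img = ds.ReadAsArray(cx, cy, outsize, outsize)
            img2 = img[:3, :, :].transpose(1, 2, 0)
            if (img2[:, :, 0] == 0).sum() > nullthresh:
                print('blank tile, skipped:', cx, cy)
                cx += step
                continue

            img2 = Image.fromarray(img2)
            img2.save(imgdir + n + '_{}_{}.bmp'.format(cx, cy))
            # Save the matching label tile as 8-bit grayscale.
            img = dslb.ReadAsArray(cx, cy, outsize, outsize)
            img = Image.fromarray(img).convert('L')
            img.save(lbldir + n + '_{}_{}.bmp'.format(cx, cy))

            cx += step
        cy += step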

Adjust the paths to match your setup and the script is ready to use.
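The channel handling and blank-tile filter can also be tried without GDAL. The sketch below runs on a synthetic NumPy array laid out as (bands, rows, cols), which is the layout the script slices with `img[:3,:,:]`. The helper names `to_rgb` and `is_mostly_blank` are hypothetical. Unlike the script, which tests only the first channel, this variant counts a pixel as blank only when all three RGB values are 0; that is an assumption about what "blank" should mean, not something the competition data guarantees.

```python
import numpy as np


def to_rgb(chw):
    """Convert a (bands, H, W) array to an (H, W, 3) RGB array, dropping alpha."""
    return np.ascontiguousarray(chw[:3].transpose(1, 2, 0))


def is_mostly_blank(rgb, max_blank_fraction=0.7):
    """True if more than `max_blank_fraction` of the pixels are 0 in all channels."""
    blank = (rgb == 0).all(axis=2)
    return blank.mean() > max_blank_fraction


if __name__ == "__main__":
    # Synthetic 4-band tile: the left 80% is empty, the right 20% is filled.
    tile = np.zeros((4, 10, 10), dtype=np.uint8)
    tile[:, :, 8:] = 200
    rgb = to_rgb(tile)
    print(rgb.shape, is_mostly_blank(rgb))  # (10, 10, 3) True
```

Expressing the threshold as a fraction rather than a pixel count keeps the check valid if the tile size changes.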

Here I cut at 1500x1500 and plan to train at 1024.
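One way the extra margin might be used at training time is to take a random 1024x1024 window from each 1500x1500 tile and rotate it. The following is only a sketch under that assumption: `random_crop_pair` is a hypothetical helper, rotation is limited to multiples of 90 degrees for simplicity, and scaling is left out. The important part is that the image and its label receive exactly the same crop and rotation.

```python
import numpy as np


def random_crop_pair(image, label, size, rng):
    """Cut the same random size x size window from an image and its label,
    then rotate both by the same random multiple of 90 degrees."""
    h, w = label.shape[:2]
    top = int(rng.integers(0, h - size + 1))
    left = int(rng.integers(0, w - size + 1))
    k = int(rng.integers(0, 4))  # number of 90-degree rotations
    img = image[top:top + size, left:left + size]
    lab = label[top:top + size, left:left + size]
    return np.rot90(img, k), np.rot90(lab, k)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = np.zeros((1500, 1500, 3), dtype=np.uint8)  # stand-in for an RGB tile
    label = np.zeros((1500, 1500), dtype=np.uint8)     # stand-in for its label
    img, lab = random_crop_pair(image, label, 1024, rng)
    print(img.shape, lab.shape)  # (1024, 1024, 3) (1024, 1024)
```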

With that, the tiled data is ready, as shown below:

[Figure: the prepared 1500x1500 tiles]