Image Data Augmentation with torchvision in PyTorch

Updated: 2022-05-06 11:48:08 Author: 峽谷的小魚

This article shows how to augment image data with torchvision in PyTorch, with example code for each transform.

Using torchvision for image data augmentation

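Data augmentation expands an existing dataset so that it covers more variation. For images, this means changing properties such as color and shape. Common operations include:

Horizontal flip: suitable for most datasets;
Vertical flip: unsuitable for some datasets;
Cropping: cut a region of a given shape out of the image, with

random aspect ratio (e.g. [3/4, 4/3])
random size (e.g. [8%, 100%] of the original area)
random position

Changing the image's color

Change hue, saturation, and brightness (e.g. by a factor in [0.5, 1.5])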

1. Reading the image

Load the required packages.

import torch
import torchvision
import matplotlib

from torch import nn
from torchvision import transforms
from PIL import Image
from IPython import display
from matplotlib import pyplot as plt

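Define two plotting helpers, then load a picture of a dog as the example:

def set_figsize(figsize=(3.5, 2.5)):
    # Render figures as SVG in the notebook. Newer IPython releases deprecate
    # display.set_matplotlib_formats in favor of the matplotlib_inline package.
    display.set_matplotlib_formats('svg')
    plt.rcParams['figure.figsize'] = figsize

def show_images(imgs, num_rows, num_cols, titles=None, scale=1.5):
    r"""
    Display a list of images in a grid.
    imgs: list of Image objects (or tensors)
    """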
    figsize = (num_cols * scale, num_rows * scale)
    fig, axes = plt.subplots(num_rows, num_cols, figsize=figsize)

    axes = axes.flatten()
    
    for i, (ax, img) in enumerate(zip(axes, imgs)):
        if torch.is_tensor(img):
            ax.imshow(img.numpy())
        else:
            ax.imshow(img)
            
        ax.get_xaxis().set_visible(False)
        ax.get_yaxis().set_visible(False)
        ax.get_xaxis().set_label('x')
        if titles:
            ax.set_title(titles[i])
    
    return axes

set_figsize()
img = Image.open('img/dog1.jpg')
plt.imshow(img);

2. Image augmentation

def apply(img, aug, num_rows=2, num_cols=4, scale=1.5):
    # Apply an augmentation to an image several times and show the results
    # img: Image object
    # aug: augmentation to apply
    Y = [aug(img) for _ in range(num_cols * num_rows)]
    show_images(Y, num_rows, num_cols, scale=scale)

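2.1 Horizontal flip

class RandomHorizontalFlip(torch.nn.modules.module.Module):
    r'''
    RandomHorizontalFlip(p=0.5)
    Flip the image horizontally with a given probability. If the input is a Tensor, it must have shape [..., H, W].
    Args:
        p: float, probability of flipping the image, default 0.5
    '''
    def __init__(self, p=0.5):
        pass

Example. As the output shows, the image is flipped horizontally about half the time.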

aug = transforms.RandomHorizontalFlip(0.5)
apply(img, aug)
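
To make the role of p concrete, here is a minimal pure-Python sketch of the same idea on a tiny image stored as a list of rows. The helpers hflip and random_hflip are hypothetical names used only for illustration; they are not part of torchvision, and the real transform works on PIL images and tensors.

```python
import random

def hflip(image):
    """Mirror a row-major image (list of rows) left to right."""
    return [row[::-1] for row in image]

def random_hflip(image, p=0.5, rng=random):
    """Flip with probability p, otherwise return the image unchanged."""
    return hflip(image) if rng.random() < p else image

image = [[1, 2, 3],
         [4, 5, 6]]
rng = random.Random(0)  # seeded so the run is repeatable
flips = sum(random_hflip(image, 0.5, rng) != image for _ in range(1000))
print(f"flipped {flips} of 1000 draws")
```

With p=0.5 the count should land near 500; p=1.0 always flips and p=0.0 never does.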

2.2 Vertical flip

class RandomVerticalFlip(torch.nn.modules.module.Module):
    r'''
    RandomVerticalFlip(p=0.5)
    Flip the image vertically with a given probability. If the input is a Tensor, it must have shape [..., H, W].
    Args:
        p: float, probability of flipping the image, default 0.5
    '''
    def __init__(self, p=0.5):
        pass

Example. As the output shows, the image is flipped vertically about half the time.

aug = transforms.RandomVerticalFlip(0.5)
apply(img, aug)

2.3 Rotation

class RandomRotation(torch.nn.modules.module.Module):
    r'''
    Rotate the image by a random angle.
    '''
    def __init__(self,
                 degrees,
                 interpolation=transforms.InterpolationMode.NEAREST,
                 expand=False,
                 center=None,
                 fill=0):
        r"""
        Args:
            degrees: number or sequence, range of angles (min, max) to sample from;
                        a single number means the range (-degrees, +degrees)
            interpolation: Default is ``InterpolationMode.NEAREST``.
            expand: bool, if True, enlarge the output so it holds the whole rotated image;
                          if False, the output has the same size as the input.
            center: sequence, center of rotation with the origin at the top-left corner; defaults to the image center.
            fill: sequence or number, pixel value for the area outside the rotated image, default 0.
        """
        pass

    def forward(self, img):
        r"""
        Args:
            img: PIL Image or Tensor, the image to rotate.
        Return:
            PIL Image or Tensor: the rotated image.
        """
        pass

Usage example:

aug = transforms.RandomRotation(degrees=(-90, 90), fill=128)
apply(img, aug)
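
The degrees and expand arguments can be illustrated with a little geometry. The sketch below is an assumption-laden illustration, not torchvision's code: angle_range and expanded_size are hypothetical helpers, and the library's exact rounding of the expanded size may differ by a pixel.

```python
import math
import random

def angle_range(degrees):
    """Turn a single number d into (-d, d); pass a (min, max) pair through."""
    if isinstance(degrees, (int, float)):
        return (-float(degrees), float(degrees))
    lo, hi = degrees
    return (float(lo), float(hi))

def expanded_size(width, height, angle_deg):
    """Bounding box of a width x height rectangle rotated by angle_deg."""
    theta = math.radians(angle_deg)
    c, s = abs(math.cos(theta)), abs(math.sin(theta))
    return (round(width * c + height * s), round(width * s + height * c))

lo, hi = angle_range((-90, 90))
angle = random.Random(0).uniform(lo, hi)  # one angle drawn from the range
print(angle_range(30))              # (-30.0, 30.0)
print(expanded_size(300, 200, 90))  # (200, 300)
```

With expand=False the output keeps the input size, so the corners of the rotated image are cut off and the uncovered area is filled with the fill value.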

2.4 Center crop

class CenterCrop(torch.nn.modules.module.Module):
    r'''
    Crop the image at its center.

    '''
    def __init__(self, size):
        r"""
        Args:
           size: sequence or int, crop size (H, W); an int means (size, size)
        """
        pass

    def forward(self, img):
        r"""
        Args:
            img: PIL Image or Tensor, the image to crop.
        Return:
            PIL Image or Tensor: the cropped image.
        """
        pass

Example:

aug = transforms.CenterCrop((200, 300))
apply(img, aug)
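
A center crop amounts to choosing the top-left corner so that the margins on opposite sides are equal. The following is a rough sketch on a list-of-rows image; center_crop_box and crop are hypothetical helpers, and the sketch assumes the crop fits inside the image.

```python
def center_crop_box(height, width, size):
    """Return (top, left, crop_h, crop_w) for a crop centered in the image."""
    crop_h, crop_w = (size, size) if isinstance(size, int) else size
    if crop_h > height or crop_w > width:
        raise ValueError("this sketch assumes the crop fits inside the image")
    top = round((height - crop_h) / 2)
    left = round((width - crop_w) / 2)
    return top, left, crop_h, crop_w

def crop(image, top, left, crop_h, crop_w):
    """Slice a row-major image (list of rows)."""
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]

# A 4 x 6 image whose pixel value encodes its position: 10 * row + column.
image = [[10 * r + c for c in range(6)] for r in range(4)]
box = center_crop_box(4, 6, (2, 2))
print(box)                # (1, 2, 2, 2)
print(crop(image, *box))  # [[12, 13], [22, 23]]
```

Because the box depends only on the image and crop sizes, every call returns the same region, which is why all eight panels in the example above look identical.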

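2.5 Random crop

class RandomCrop(torch.nn.modules.module.Module):
    r'''
    Crop the image at a random position.

    '''
    def __init__(self, size, padding=None, pad_if_needed=False, fill=0, padding_mode='constant'):
        r"""
        Args:
           size: sequence or int, crop size (H, W); an int means (size, size)
           padding: sequence or int, amount of padding:
                a single value a pads all four sides by a pixels;
                (a, b) pads left/right by a and top/bottom by b;
                (a, b, c, d) pads left, top, right, and bottom in that order
           pad_if_needed: bool, pad the image if it is smaller than the crop size
           fill: number or str or tuple, pixel value used for padding
           padding_mode: str, type of padding.
                   `constant`: pad with the value of fill
                   `edge`: pad with the last value at the edge of the image
                   `reflect`: pad with a mirror reflection of the image
        """
        pass

    def forward(self, img):
        r"""
        Args:
            img: PIL Image or Tensor, the image to crop.
        Return:
            PIL Image or Tensor: the cropped image.
        """
        pass

Example: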

aug = transforms.RandomCrop((200, 300))
apply(img, aug)
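
The padding forms described in the docstring can be sketched as follows. normalize_padding and random_crop_box are hypothetical helpers written for illustration, and the 400 x 500 image size is an assumed value, since the article does not state the size of the dog picture.

```python
import random

def normalize_padding(padding):
    """Expand padding to (left, top, right, bottom)."""
    if isinstance(padding, int):
        return (padding,) * 4
    if len(padding) == 2:
        a, b = padding          # a: left/right, b: top/bottom
        return (a, b, a, b)
    if len(padding) == 4:
        return tuple(padding)
    raise ValueError("padding must be an int or a sequence of length 2 or 4")

def random_crop_box(height, width, size, padding=0, rng=random):
    """Sample (top, left, crop_h, crop_w) inside the padded image."""
    crop_h, crop_w = (size, size) if isinstance(size, int) else size
    left_pad, top_pad, right_pad, bottom_pad = normalize_padding(padding)
    padded_h = height + top_pad + bottom_pad
    padded_w = width + left_pad + right_pad
    if crop_h > padded_h or crop_w > padded_w:
        raise ValueError("crop is larger than the padded image")
    top = rng.randint(0, padded_h - crop_h)    # randint includes both ends
    left = rng.randint(0, padded_w - crop_w)
    return top, left, crop_h, crop_w

print(normalize_padding((1, 3)))  # (1, 3, 1, 3)
print(random_crop_box(400, 500, (200, 300), rng=random.Random(0)))
```

The returned coordinates are relative to the padded image, so a crop can include padded pixels when padding is nonzero.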

2.6 Random crop and resize

class RandomResizedCrop(torch.nn.modules.module.Module):
    r'''
    Crop a random region of the image, then resize it.

    '''
    def __init__(self, size, scale=(0.08, 1.0), ratio=(0.75, 1.3333333333333333)):
        r"""
        Args:
           size: sequence or int, output size (H, W); an int means (size, size)
           scale: tuple of float, range of the crop area as a fraction of the original image area
           ratio: tuple of float, range of the crop's aspect ratio before resizing
        """
        pass

    def forward(self, img):
        r"""
        Args:
            img: PIL Image or Tensor, the image to crop.
        Return:
            PIL Image or Tensor: the output image.
        """
        pass

Example:

aug = transforms.RandomResizedCrop((200, 200), scale=(0.2, 1))
apply(img, aug)
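
How scale and ratio interact is easier to see in code. This is a simplified sketch under stated assumptions, not torchvision's implementation: resized_crop_box is a hypothetical helper, the 400 x 500 image size is assumed, and the fallback here simply returns the whole image.

```python
import math
import random

def resized_crop_box(height, width, scale=(0.08, 1.0), ratio=(3 / 4, 4 / 3),
                     rng=random, attempts=10):
    """Sample (top, left, crop_h, crop_w); the crop is resized afterwards."""
    area = height * width
    log_ratio = (math.log(ratio[0]), math.log(ratio[1]))
    for _ in range(attempts):
        target_area = area * rng.uniform(*scale)
        # Sample the aspect ratio (width / height) log-uniformly so that
        # r and 1/r are equally likely.
        aspect = math.exp(rng.uniform(*log_ratio))
        crop_w = round(math.sqrt(target_area * aspect))
        crop_h = round(math.sqrt(target_area / aspect))
        if 0 < crop_w <= width and 0 < crop_h <= height:
            top = rng.randint(0, height - crop_h)
            left = rng.randint(0, width - crop_w)
            return top, left, crop_h, crop_w
    # Fallback for this sketch only: use the whole image.
    return 0, 0, height, width

print(resized_crop_box(400, 500, scale=(0.2, 1.0), rng=random.Random(0)))
```

Whatever box is sampled, the region is then resized to the requested output size, so every augmented image has the same shape.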

2.7 Changing image color

class ColorJitter(torch.nn.modules.module.Module):
    r'''
    Randomly change the colors of the image.

    '''
    def __init__(self, brightness=0, contrast=0, saturation=0, hue=0):
        r"""
        Args:
           brightness: float or tuple of float (min, max), how much to jitter brightness; the factor is drawn from [max(0, 1 - brightness), 1 + brightness]
           contrast: float or tuple of float (min, max), how much to jitter contrast; the factor is drawn from [max(0, 1 - contrast), 1 + contrast]
           saturation: float or tuple of float (min, max), how much to jitter saturation; the factor is drawn from [max(0, 1 - saturation), 1 + saturation]
           hue: float or tuple of float (min, max), how much to jitter hue; the shift is drawn from [-hue, hue], with 0 <= hue <= 0.5
        """
        pass

    def forward(self, img):
        r"""
        Args:
            img: PIL Image or Tensor, the input image.
        Return:
            PIL Image or Tensor: the output image.
        """
        pass

Example:

aug = transforms.ColorJitter(brightness=0.5, contrast=0.5, saturation=0.5, hue=0.5)
apply(img, aug)
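
The ranges in the docstring translate directly into code. The sketch below uses hypothetical helpers (factor_range, hue_range, adjust_brightness) and treats brightness as a plain per-channel multiplication with clipping, which is an approximation of what the library does; its exact rounding may differ.

```python
import random

def factor_range(value):
    """Range for brightness, contrast, or saturation given a single number."""
    return (max(0.0, 1.0 - value), 1.0 + value)

def hue_range(value):
    """Range for the hue shift given a single number in [0, 0.5]."""
    if not 0 <= value <= 0.5:
        raise ValueError("hue must be in [0, 0.5]")
    return (-value, value)

def adjust_brightness(pixel, factor):
    """Scale one 8-bit RGB pixel and clip each channel to [0, 255]."""
    return tuple(min(255, max(0, round(c * factor))) for c in pixel)

rng = random.Random(0)
factor = rng.uniform(*factor_range(0.5))       # somewhere in [0.5, 1.5]
print(factor_range(0.5))                       # (0.5, 1.5)
print(adjust_brightness((200, 100, 50), 1.5))  # (255, 150, 75)
```

A factor of 1.0 leaves the image unchanged, values below 1.0 darken it, and values above 1.0 brighten it until channels saturate at 255.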

3. Loading the training dataset

Compose the augmentations into a single transform and pass it to the dataset:

train_augs = transforms.Compose([transforms.RandomHorizontalFlip(),
                                 transforms.ToTensor()])
dataset = torchvision.datasets.CIFAR10(root="../data", train=True,
                                       transform=train_augs, download=True)
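
Conceptually, Compose just calls each transform in order and feeds the result to the next one. Here is a minimal stand-in to show that behavior; this Compose class and the mirror and to_floats functions are illustrative assumptions, not torchvision's code.

```python
class Compose:
    """Apply a list of callables in order, feeding each output to the next."""

    def __init__(self, transforms):
        self.transforms = list(transforms)

    def __call__(self, x):
        for transform in self.transforms:
            x = transform(x)
        return x

# Hypothetical stand-ins for real transforms, chosen so the order is visible.
def mirror(row):
    return row[::-1]

def to_floats(row):
    return [value / 255 for value in row]

pipeline = Compose([mirror, to_floats])
print(pipeline([0, 51, 255]))  # [1.0, 0.2, 0.0]
```

Order matters in a real pipeline too: augmentations meant for PIL images go before ToTensor, which converts the image to a tensor for training.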
