PyTorch CNN: recognizing handwritten digits with a self-built image dataset
Updated: 2018-05-20 17:03:26  Author: 瓦力冫
This article shows how to train a CNN in PyTorch to recognize handwritten digits from a self-built image dataset (images plus a text file of labels). The full code is as follows:
# library
# standard library
import os
# third-party library
import torch
import torch.nn as nn
from torch.autograd import Variable
from torch.utils.data import Dataset, DataLoader
import torchvision
import matplotlib.pyplot as plt
from PIL import Image
import numpy as np
# torch.manual_seed(1) # reproducible
# Hyper Parameters
EPOCH = 1 # train the training data n times, to save time, we just train 1 epoch
BATCH_SIZE = 50
LR = 0.001 # learning rate
root = "./mnist/raw/"
def default_loader(path):
    # return Image.open(path).convert('RGB')
    return Image.open(path)
class MyDataset(Dataset):
    def __init__(self, txt, transform=None, target_transform=None, loader=default_loader):
        fh = open(txt, 'r')
        imgs = []
        for line in fh:
            line = line.strip('\n')
            line = line.rstrip()
            words = line.split()
            imgs.append((words[0], int(words[1])))
        self.imgs = imgs
        self.transform = transform
        self.target_transform = target_transform
        self.loader = loader
        fh.close()

    def __getitem__(self, index):
        fn, label = self.imgs[index]
        img = self.loader(fn)
        img = Image.fromarray(np.array(img), mode='L')
        if self.transform is not None:
            img = self.transform(img)
        return img, label

    def __len__(self):
        return len(self.imgs)
train_data = MyDataset(txt=root + 'train.txt', transform=torchvision.transforms.ToTensor())
train_loader = DataLoader(dataset=train_data, batch_size=BATCH_SIZE, shuffle=True)
test_data = MyDataset(txt=root + 'test.txt', transform=torchvision.transforms.ToTensor())
test_loader = DataLoader(dataset=test_data, batch_size=BATCH_SIZE)
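MyDataset expects each line of train.txt and test.txt to hold an image path and an integer label separated by whitespace, because __init__ does words = line.split() and reads words[0] as the path and words[1] as the label. A minimal illustration of the expected format (these file names are made up; use whatever paths your conversion script actually produced):
./mnist/raw/train/00000_5.png 5
./mnist/raw/train/00001_0.png 0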
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Sequential(          # input shape (1, 28, 28)
            nn.Conv2d(
                in_channels=1,               # input channels
                out_channels=16,             # n_filters
                kernel_size=5,               # filter size
                stride=1,                    # filter movement/step
                padding=2,                   # to keep the same width and height after Conv2d: padding = (kernel_size - 1) / 2 when stride = 1
            ),                               # output shape (16, 28, 28)
            nn.ReLU(),                       # activation
            nn.MaxPool2d(kernel_size=2),     # take the max value in each 2x2 area, output shape (16, 14, 14)
        )
        self.conv2 = nn.Sequential(          # input shape (16, 14, 14)
            nn.Conv2d(16, 32, 5, 1, 2),      # output shape (32, 14, 14)
            nn.ReLU(),                       # activation
            nn.MaxPool2d(2),                 # output shape (32, 7, 7)
        )
        self.out = nn.Linear(32 * 7 * 7, 10) # fully connected layer, output 10 classes

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = x.view(x.size(0), -1)            # flatten the output of conv2 to (batch_size, 32 * 7 * 7)
        output = self.out(x)
        return output, x                     # also return x for visualization
cnn = CNN()
print(cnn) # net architecture
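Before training, a quick sanity check can confirm that a dummy 28x28 input produces the shapes noted in the comments above; a minimal sketch:
dummy = Variable(torch.zeros(1, 1, 28, 28))  # one fake grayscale image with a batch dimension
logits, flat = cnn(dummy)
print(logits.size())                         # torch.Size([1, 10])
print(flat.size())                           # torch.Size([1, 1568]), i.e. 32 * 7 * 7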
optimizer = torch.optim.Adam(cnn.parameters(), lr=LR) # optimize all cnn parameters
loss_func = nn.CrossEntropyLoss() # the target label is not one-hotted
# training and testing
for epoch in range(EPOCH):
    for step, (x, y) in enumerate(train_loader):   # gives batch data; x is normalized by ToTensor
        b_x = Variable(x)                          # batch x
        b_y = Variable(y)                          # batch y

        output = cnn(b_x)[0]                       # cnn output
        loss = loss_func(output, b_y)              # cross entropy loss
        optimizer.zero_grad()                      # clear gradients for this training step
        loss.backward()                            # backpropagation, compute gradients
        optimizer.step()                           # apply gradients

        if step % 50 == 0:
            cnn.eval()
            eval_loss = 0.
            eval_acc = 0.
            for i, (tx, ty) in enumerate(test_loader):
                t_x = Variable(tx)
                t_y = Variable(ty)
                output = cnn(t_x)[0]
                loss = loss_func(output, t_y)
                eval_loss += loss.item()           # loss.data[0] on PyTorch < 0.4
                pred = torch.max(output, 1)[1]
                num_correct = (pred == t_y).sum()
                eval_acc += float(num_correct.item())
            acc_rate = eval_acc / float(len(test_data))
            print('Test Loss: {:.6f}, Acc: {:.6f}'.format(eval_loss / (len(test_data)), acc_rate))
            cnn.train()                            # switch back to training mode after evaluation
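After training, you will usually want to keep the weights and try the model on a single sample. A minimal sketch (the file name cnn_mnist.pkl is arbitrary):
torch.save(cnn.state_dict(), 'cnn_mnist.pkl')   # save only the learned parameters

cnn.eval()
img, label = test_data[0]                       # a (1, 28, 28) tensor and its integer label
logits = cnn(Variable(img.unsqueeze(0)))[0]     # add the batch dimension -> (1, 1, 28, 28)
pred = torch.max(logits, 1)[1]
print('predicted: %d, actual: %d' % (int(pred[0]), label))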
The images and the label txt files are produced in the previous article, “pytorch 把MNIST數(shù)據(jù)集轉(zhuǎn)換成圖片和txt” (converting the MNIST dataset into images and a txt file).
The result is as follows: [screenshot of the printed test loss and accuracy omitted]
That is all for this article; I hope it is helpful for your study.