Importing a Java List into a Database in Batches: A Worked Solution

Updated: 2023-08-24 08:35:15  Author: Dream_飛翔

This article walks through how to import the data held in a Java List into a database in batches, with a code example for each approach. It may be a useful reference for anyone learning or working with Java.

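I. Project scenario:

One requirement at work called for creating a new table whose initial data had to be gathered by joining several existing tables and then inserted into the new table. Because the production data set runs to millions of rows, the inserts into MySQL have to be done in batches. I wrote three methods for the import and ended up using the third; this article records how each one was implemented.

Throughout the article, User serves as the example entity.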

II. Solutions:

1. Importing with the built-in MyBatis-Plus method

// The collection to insert into the database; assume it is very large
List<User> list = new ArrayList<>();
// Insert the data into MySQL
userService.saveBatch(list);

2. Importing the List in chunks

(1) Import method in the UserServiceImpl class

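import java.util.ArrayList;
import java.util.List;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

import com.baomidou.mybatisplus.extension.service.impl.ServiceImpl;

@Service
public class UserServiceImpl extends ServiceImpl<UserMapper, User> implements IUserService {
    @Autowired
    private UserMapper userMapper;

    @Override
    public void insert() {
        // The collection to insert is produced by earlier steps; an empty list stands in for it here
        List<User> list = new ArrayList<>();
        // Number of rows per insert
        int batchSize = 1000;
        // Number of full batches
        int batch = list.size() / batchSize;
        // Size of the final, partial batch (0 if the list divides evenly)
        int lastSize = list.size() % batchSize;
        // Insert the full batches; i is the exclusive end index of the current batch
        for (int i = batchSize; i <= batch * batchSize; i = i + batchSize) {
            // Take the rows for this batch
            List<User> insertList = list.subList(i - batchSize, i);
            // Insert this batch into the database
            userMapper.batchInsert(insertList);
        }
        // Are there any rows left over after the full batches?
        if (lastSize != 0) {
            // If so, insert all remaining rows as one final sublist
            List<User> lastList = list.subList(batchSize * batch, list.size());
            // Insert the remaining rows into the database
            userMapper.batchInsert(lastList);
        }
    }
}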

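Code walkthrough:

The list is first split into sublists of 1000 elements each, and each sublist is inserted with the custom batchInsert() method. If the list size is not a multiple of 1000, the remaining elements are inserted in one final call. In detail:

First define the number of rows per batch, batchSize, then compute the number of full batches, batch, and the size of the final partial batch, lastSize (the remainder when the total number of elements is not a multiple of 1000).
The for loop walks the list in steps of batchSize and uses subList() to take the elements for the current batch.
Each sublist is inserted with the custom batchInsert() method.
Finally, check whether lastSize is 0; if it is not, use subList() to take all remaining elements as one sublist and insert them in a single call.

Why does the last batch start at index batchSize * batch (the batch size times the number of full batches)?

The for loop has already inserted the first batchSize * batch elements, which occupy indexes 0 through batchSize * batch - 1, so the first element not yet inserted sits at index batchSize * batch. Starting there avoids inserting any element twice. For example, with 2500 elements and batchSize = 1000, batch is 2 and lastSize is 500, so the last sublist is subList(2000, 2500).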
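The index arithmetic described above can be tried without a database. The following is a minimal sketch, not the author's code: the class name BatchPartitionSketch and the helper partition() are hypothetical, and a list of integers stands in for the User rows. It reuses the same batch, lastSize and subList() logic and returns the sublists instead of inserting them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, introduced only to illustrate the index arithmetic.
final class BatchPartitionSketch {

    /** Splits source into consecutive subList views of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> source, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batch = source.size() / batchSize;     // number of full batches
        int lastSize = source.size() % batchSize;  // leftover elements, 0 if none
        List<List<T>> batches = new ArrayList<>();
        // i is the exclusive end index of the current full batch
        for (int i = batchSize; i <= batch * batchSize; i += batchSize) {
            batches.add(source.subList(i - batchSize, i));
        }
        // The leftover batch starts right after the last full batch
        if (lastSize != 0) {
            batches.add(source.subList(batch * batchSize, source.size()));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            ids.add(i);
        }
        for (List<Integer> b : partition(ids, 1000)) {
            System.out.println(b.size() + " rows, first id " + b.get(0));
        }
    }
}
```

With 2500 elements the sketch yields batches of 1000, 1000 and 500 elements, starting at indexes 0, 1000 and 2000. Note that subList() returns views backed by the original list, so the source list should not be structurally modified while the batches are still in use.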

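(2) UserMapper persistence interface

The collection is passed to the mapper layer as a parameter:

import java.util.List;

import org.apache.ibatis.annotations.Param;

import com.baomidou.mybatisplus.core.mapper.BaseMapper;

/**
 * Persistence interface for user data
 *
 * @author Dream_飛翔
 * @since 2023/5/16
 */
public interface UserMapper extends BaseMapper<User> {
    /**
     * Inserts every element of the given collection into the database
     *
     * @param insertList the rows to insert
     * @return the number of affected rows
     */
    Integer batchInsert(@Param("insertList") List<User> insertList);
}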

(3) UserMapper.xml mapping file

<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE mapper PUBLIC "-//mybatis.org//DTD Mapper 3.0//EN"
        "http://mybatis.org/dtd/mybatis-3-mapper.dtd" >
<mapper namespace="com.zrkizzy.data.mapper.UserMapper">
    <!-- Insert a batch of rows with a single multi-row INSERT statement -->
    <insert id="batchInsert">
        INSERT INTO tb_user (id, username, password)
        VALUES
        <!-- collection must match the @Param name on UserMapper.batchInsert -->
        <foreach collection="insertList" item="user" separator=",">
            (#{user.id}, #{user.username}, #{user.password})
        </foreach>
    </insert>
</mapper>
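To make the effect of the foreach element concrete, the sketch below approximates the statement text that results for a given number of rows: one parenthesized group of placeholders per element, joined by commas. This is an illustration under assumptions, not MyBatis output: the class MultiRowInsertSketch and the method buildInsertSql() are hypothetical, and the whitespace MyBatis actually emits will differ.

```java
import java.util.Collections;

// Hypothetical helper, used only to show the shape of a multi-row INSERT.
final class MultiRowInsertSketch {

    /** Builds the approximate statement text for rowCount rows of (id, username, password). */
    static String buildInsertSql(int rowCount) {
        if (rowCount <= 0) {
            // With no rows there is nothing after VALUES, which is not valid SQL
            throw new IllegalArgumentException("rowCount must be positive");
        }
        // One "(?, ?, ?)" group per row, separated by commas like separator="," in the mapper
        String groups = String.join(", ", Collections.nCopies(rowCount, "(?, ?, ?)"));
        return "INSERT INTO tb_user (id, username, password) VALUES " + groups;
    }

    public static void main(String[] args) {
        // prints: INSERT INTO tb_user (id, username, password) VALUES (?, ?, ?), (?, ?, ?)
        System.out.println(buildInsertSql(2));
    }
}
```

The statement grows with every row in the batch, which is one reason the batch size is capped at 1000 rather than sending the whole list at once. An empty list would leave nothing after VALUES, so batchInsert() should only be called with a non-empty list, as the service code in this article does.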

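3. Multithreaded batch insert

The second method imports the data in batches, but when the data set is very large a single thread still has to run every insert one after another, so spreading the batches across multiple threads is a better fit.

(1) Import method in the UserServiceImpl class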

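import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

import com.baomidou.mybatisplus.extension.service.impl.ServiceImpl;

@Service
public class UserServiceImpl extends ServiceImpl<UserMapper, User> implements IUserService {
    @Autowired
    private UserMapper userMapper;

    @Override
    public void insert() {
        // The collection to insert is produced by earlier steps; an empty list stands in for it here
        List<User> list = new ArrayList<>();
        // Number of processors available to the JVM
        int availableProcessors = Runtime.getRuntime().availableProcessors();
        // Total number of rows to insert
        int total = list.size();
        // Number of rows per insert
        int batchSize = 1000;
        // Number of batches, rounded up (ceiling division)
        int totalBatch = (total + batchSize - 1) / batchSize;
        // Create the thread pool by hand
        ExecutorService executor = new ThreadPoolExecutor(
                // Core pool size
                availableProcessors,
                // Maximum pool size
                availableProcessors + 1000,
                // Keep-alive time for idle threads above the core size
                1000,
                // Unit of the keep-alive time
                TimeUnit.MILLISECONDS,
                // Bounded queue that holds tasks waiting for a thread
                new ArrayBlockingQueue<>(100),
                // Rejection policy: run the task on the submitting thread
                new ThreadPoolExecutor.CallerRunsPolicy());
        // Submit one insert task per batch
        for (int batchIndex = 0; batchIndex < totalBatch; batchIndex++) {
            // Start index of this batch (inclusive)
            int startIndex = batchIndex * batchSize;
            // End index of this batch (exclusive), capped at the list size
            int endIndex = Math.min((batchIndex + 1) * batchSize, total);
            // Take the rows for this batch
            List<User> insertList = list.subList(startIndex, endIndex);
            // Wrap the insert for this batch in a Runnable
            Runnable task = () -> {
                // Insert this batch into the database
                userMapper.batchInsert(insertList);
            };
            // Submit the insert task
            executor.submit(task);
        }
        // Stop accepting new tasks; batches already submitted still run to completion
        executor.shutdown();
    }
}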
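The method above returns as soon as the tasks are submitted, and an exception thrown inside a batch stays hidden in the Future that submit() returns. A caller that needs to know when the import has finished, or whether a batch failed, can keep the Future objects and wait on them. The following is a minimal sketch of that variation, not the author's implementation: ConcurrentBatchSketch and insertInBatches() are hypothetical names, a Consumer stands in for userMapper.batchInsert(), and a fixed-size pool from Executors.newFixedThreadPool() is swapped in for the hand-built ThreadPoolExecutor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical helper class; batchInsert stands in for userMapper.batchInsert().
final class ConcurrentBatchSketch {

    /** Runs one insert task per batch and blocks until all have finished; returns the batch count. */
    static <T> int insertInBatches(List<T> rows, int batchSize, int threads,
                                   Consumer<List<T>> batchInsert) {
        int total = rows.size();
        int totalBatch = (total + batchSize - 1) / batchSize; // ceiling division
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int batchIndex = 0; batchIndex < totalBatch; batchIndex++) {
                int start = batchIndex * batchSize;
                int end = Math.min(start + batchSize, total);
                List<T> batch = rows.subList(start, end);
                futures.add(executor.submit(() -> batchInsert.accept(batch)));
            }
            for (Future<?> future : futures) {
                future.get(); // waits for the batch and surfaces its failure, if any
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for batches", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A batch insert failed", e.getCause());
        } finally {
            executor.shutdown();
        }
        return totalBatch;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            rows.add(i);
        }
        AtomicInteger inserted = new AtomicInteger();
        int batches = insertInBatches(rows, 1000, 4, batch -> inserted.addAndGet(batch.size()));
        // prints: 3 batches, 2500 rows
        System.out.println(batches + " batches, " + inserted.get() + " rows");
    }
}
```

The fixed pool size is a design choice for the sketch: each running batch holds a database connection, so the thread count is probably best kept close to what the connection pool can serve. Each batch also runs outside any transaction of the calling thread, so a failed batch does not roll back the batches that already succeeded.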
