How can I quickly diff two arrays of data?

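Requirement: there is a 1 GB+ JSON file that stores a list. The file changes over time and is always rewritten in full. After detecting a change to the file, I need to know which records were added, removed, or changed between the old and new versions.
The list data has the following format (the id field is unique; the name field may change):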

[{
  "id": "id1",
  "name": "name1"
}, {
  "id": "id2",
  "name": "name2"
}]

The rough idea is as follows:

let oldList = [
  {"id": "id1", "name": "name1"},
  {"id": "id2", "name": "name2"}
];

let newList = [
  {"id": "id1", "name": "name4"},
  {"id": "id3", "name": "name3"}
];

// getDiffData is the algorithm this approach needs to implement
let {addList, removeList, changeList} = getDiffData(oldList, newList, 'id');

console.log(addList);
// [{"id": "id3", "name": "name3"}]

console.log(removeList);
// [{"id": "id2", "name": "name2"}]

console.log(changeList);
// [{"id": "id1", "name": "name4"}] or [{"id": "id1", "name": "name1"}]

Problem: the data to compare is large (several million records), so I would like the computation to take as little time as possible.

Answers

Try this library: https://www.npmjs.com/package...

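function getDiffData(oldList, newList, key = 'id') {
  // Index the old list by its unique key so each lookup is O(1) on average.
  // A Map avoids clashes with inherited object properties such as "__proto__".
  const oldData = new Map(oldList.map(item => [item[key], item]));
  const seen = new Set();
  const addList = [];
  const changeList = [];
  for (const item of newList) {
    const oldItem = oldData.get(item[key]);
    if (oldItem === undefined) {
      addList.push(item);
    } else {
      if (oldItem.name !== item.name) {
        changeList.push(item);
      }
      // Mark the key as still present in the new list.
      seen.add(item[key]);
    }
  }
  // Anything in the old list that never appeared in the new list was removed.
  const removeList = oldList.filter(item => !seen.has(item[key]));
  return {
    addList,
    changeList,
    removeList,
  };
}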

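When there is a unique field, try converting one of the arrays into a lookup keyed by that field before comparing. Each lookup then takes roughly constant time, so the total cost grows with the combined length of the two arrays rather than with their product, which improves efficiency considerably.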
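If records may later carry more fields than name, the same index-then-scan idea can be sketched with a pluggable equality check. This is only an illustration under assumptions: diffByKey and shallowEqual are hypothetical names, not part of the question or of any library, keys are assumed to be unique within each list, and field values are assumed to be primitives.

```javascript
// Hypothetical helper: shallow equality over own enumerable fields.
function shallowEqual(a, b) {
  const keysA = Object.keys(a);
  const keysB = Object.keys(b);
  if (keysA.length !== keysB.length) return false;
  return keysA.every(
    k => Object.prototype.hasOwnProperty.call(b, k) && a[k] === b[k]
  );
}

// Hypothetical generalization: `key` names the unique field,
// `isEqual` decides whether two records with the same key differ.
function diffByKey(oldList, newList, key = 'id', isEqual = shallowEqual) {
  // Index the old list once; Map lookups are O(1) on average.
  const oldByKey = new Map(oldList.map(item => [item[key], item]));
  const addList = [];
  const changeList = [];
  for (const item of newList) {
    const k = item[key];
    if (!oldByKey.has(k)) {
      addList.push(item);
    } else {
      if (!isEqual(oldByKey.get(k), item)) changeList.push(item);
      // Entries left in the map after the loop never appeared in newList.
      oldByKey.delete(k);
    }
  }
  return { addList, changeList, removeList: [...oldByKey.values()] };
}

const result = diffByKey(
  [{ id: 'id1', name: 'name1' }, { id: 'id2', name: 'name2' }],
  [{ id: 'id1', name: 'name4' }, { id: 'id3', name: 'name3' }]
);
console.log(result.addList);    // the id3 record
console.log(result.removeList); // the id2 record
console.log(result.changeList); // the id1 record with its new name
```

Deleting matched keys from the Map means no second pass over the old list is needed, and because a Map keeps insertion order, removeList comes out in the old list's order.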

// Build two test arrays with sequential ids and random 4-digit hex names.
const oldList = Array.from({length: 300000}, (v, id) => ({
  id,
  name: (((1 + Math.random()) * 0x10000) | 0).toString(16).substring(1),
}));

const newList = Array.from({length: 350000}, (v, id) => ({
  id,
  name: (((1 + Math.random()) * 0x10000) | 0).toString(16).substring(1),
}));

// Time only the diff itself, not the data generation.
console.time('getDiffData');
const {addList, changeList, removeList} = getDiffData(oldList, newList);
console.timeEnd('getDiffData');

console.log(`addList: ${addList.length}`, `changeList: ${changeList.length}`, `removeList: ${removeList.length}`);

I wrote a test that builds arrays of 300,000 and 350,000 items and compares them; it takes only about 40 ms.
