【Java】一致性 hash 算法理解与实现

一致性 hash 算法理解与实现

cartoon发布于 7 分钟前

前言

近段时间在了解分布式时,经常绕不开一个算法: 一致性哈希算法。于是在了解并实践这个算法后,就有了此文章。

算法间的对比

在分布式分片中,存在着几种算法: 取模,分段,一致性 hash。

取模分段一致性哈希
上层是否感知
迁移成本低,只涉及相邻节点
单点故障影响低,只影响相邻节点
算法复杂度
热点数据存在存在存在

一致性哈希主要解决问题

从上述对比可知,一致性哈希主要降低节点上下线中带来的数据迁移成本,同时节点数量的变化与分片原则于上层无感,使上层更专注于领域内逻辑的编写,使整体架构更加灵活。

一致性 hash 原理

  1. 基本数据结构

​ 基本数据类型因人而已,常规的哈希取模采用大多采用将数据 hash 到 2^32 - 1的空间中,一致性哈希通常在此基础上将空间变成一个环。如下图所示。

【Java】一致性 hash 算法理解与实现

​ 本次实现采用的是 key 按大小排列的哈希表。原打算使用数组承接数据,但排序成本随 key 的增多而加大,遂放弃。

  1. 数据存储

    数据存储与哈希取模算法大致相同,都是通过计算存入数据的哈希值放入对应的哈希槽上。但一致性哈希差异之处在于当计算 hash 不在环上,数据存入首个 hash 槽中。

    场景假设: 现已上线 4 节点(server1 ~ 4),对应 hash 值为 hash1 ~ 4。现有5个数据(hash1 ~ 5)于存入节点中,结果如下图所示。

    【Java】一致性 hash 算法理解与实现

​ 本次实现采用的思路是

1. 计算存入数据的 hash 值

2. 寻找最近的(比数据 hash 值大的最小的节点 hash)节点并写入

3. 若 2 中未能寻找服务器,则写入第一个(hash 最小)节点中

  1. 节点上线

​ 新节点加入一致性哈希环中,原理是通过计算节点所代表的 hash 值,并根据计算值将节点映射在环上,最后迁移相邻节点数据到新节点上。

​ 场景假设: 现已上线 4 台服务器(server1 ~ 4),对应 hash 值为 hash1 ~ 4。现有一个新节点(hash5)节点上线到环上。结果如下图所示。

【Java】一致性 hash 算法理解与实现

​ 本次实现采用的思路是

1. 计算上线节点 hash 值

2. 计算上线节点所新增的虚拟节点的 hash 值(若初始化指定虚拟节点数量)

3. 寻找最近的(比上线节点与虚拟节点 hash 值大的最小的节点 hash)节点,取出节点数据

4. 将1 2点节点加入到 hash 环中

5. 将 3 中取出的数据重新放入到 hash 环上

  1. 节点下线

​ 已有节点下线,原理是通过计算节点所代表的 hash 值,取出节点所含数据,下线节点,将取出数据重新放入 hash 环上。

​ 场景假设: 现已上线 5 台服务器(server1 ~ 5),对应 hash 值为 hash1 ~ 5。现节点 server4 下线。结果如下图所示。

【Java】一致性 hash 算法理解与实现

​ 本次实现采用的思路是

1. 计算下线节点 hash 值

2. 取出下线节点以及虚拟节点(若初始化指定虚拟节点数量)存储数据

3. 将下线节点以及虚拟节点(若初始化指定虚拟节点数量)从 hash 环上移除

4. 将 2 中数据重新放入到环上

代码实现

一致性哈希分为两个方案: 不带虚拟节点与带虚拟节点。而两个方案实现类似,所以本次实现将两种方案合在一起实现。实现如下。

package org.CommonAlgorithms.ConsistentHash;

import org.CommonAlgorithms.HashAlgorithm.HashService;

import java.util.List;

/**

* consistentHashing interface

* @author cartoon

* @since 10/01/2021

* @version 1.1

*/

public interface ConsistentHashing {

/**

* put data to hash loop

* @param data data list

* @return put result

*/

boolean putData(List<String> data);

/**

* put data to hash loop

* @param data data

* @return put result

*/

boolean putData(String data);

/**

* remove node from hash loop

* @param node removing node

* @return remove result

*/

boolean removeNode(String node);

/**

* add node to hash loop

* @param node adding node

* @return add result

*/

boolean addNode(String node);

/**

* inject hash method to hash loop

* @param hashService hash method

* @throws UnsupportedOperationException if loop already has node

*/

void setHashMethod(HashService hashService);

/**

* print all data in loop according ascending hash value with nodes

*/

void printAllData();

}

package org.CommonAlgorithms.ConsistentHash;

import org.CommonAlgorithms.HashAlgorithm.HashService;

import org.slf4j.Logger;

import org.slf4j.LoggerFactory;

import java.util.*;

/**

* consistent hash achieve

* @author cartoon

* @since 2021/01/17

*/

public class ConsistentHashingImpl implements ConsistentHashing {

private static final Logger log = LoggerFactory.getLogger(ConsistentHashingImpl.class);

/**

* virtual node name template

*/

private static final String virtualNodeFormat = "%s&&%d";

/**

* real node and its virtual node mapping

*/

private SortedMap<String, List<String>> realNodeToVirtualNode;

/**

* hash and its node mapping

*/

private SortedMap<Integer, String> hashToNodes;

/**

* node and its data mapping

*/

private Map<String, List<String>> nodeToData;

/**

* determine virtual node's number of each node

*/

private int virtualNodeNum;

/**

* inject hash method, if null, use loop default hash method

*/

private HashService hashService;

public ConsistentHashingImpl() {

this(0, new String[0]);

}

public ConsistentHashingImpl(String... nodes) {

this(0, nodes);

}

public ConsistentHashingImpl(int virtualNodeNum) {

this(virtualNodeNum, new String[0]);

}

public ConsistentHashingImpl(int virtualNodeNum, String... nodes) {

//1. intercept virtual num smaller than 0

if(virtualNodeNum < 0){

log.error("virtual num is not allow smaller than 0");

throw new IllegalArgumentException();

}

//2. initialize loop member attributes

this.virtualNodeNum = virtualNodeNum;

realNodeToVirtualNode = new TreeMap<>();

hashToNodes = new TreeMap<>();

nodeToData = new HashMap<>();

for(String server : nodes){

hashToNodes.put(getHash(server), server);

nodeToData.put(server, new LinkedList<>());

}

//3. if virtual node number bigger than 0, add virtual node

if(virtualNodeNum > 0){

for(String server : nodes){

addVirtualNode(server);

}

}

}

@Override

public boolean putData(List<String> data) {

//1. circulate call put data method to add data to loop

for(String incomingData : data){

if(!putData(incomingData)){

return false;

}

}

return true;

}

@Override

public boolean putData(String data) {

if(hashToNodes.isEmpty()){

log.error("put data, usable server is empty");

return false;

}

//1. calculate data's hash value

int currentHash = getHash(data);

//2. get usual node(node's hash value is bigger than data's hash value), if usual node list is empty, get first node in loop

SortedMap<Integer, String> usableNodes = hashToNodes.tailMap(currentHash);

String node = usableNodes.isEmpty() ? hashToNodes.get(hashToNodes.firstKey()) : usableNodes.get(usableNodes.firstKey());

//3. add data to node

List<String> dataList = nodeToData.get(node);

dataList.add(data);

log.info("put data, data {} is placed to server {}, hash: {}", data, node, currentHash);

return true;

}

@Override

public boolean removeNode(String node) {

//1. calculate hash value of removing node

int removeServerHash = getHash(node);

if(!hashToNodes.containsKey(removeServerHash)){

log.error("remove server, current server is not in server list, please check server ip");

return false;

}

//2. get data from removing node

List<String> removeServerData = nodeToData.get(node);

//3. get removing node's virtual node data, remove all virtual node with removing node

if(virtualNodeNum != 0){

for(String virtualNode : realNodeToVirtualNode.get(node)){

removeServerData.addAll(nodeToData.get(virtualNode));

hashToNodes.remove(getHash(virtualNode));

nodeToData.remove(virtualNode);

}

}

//4. remove node from hash loop

hashToNodes.remove(removeServerHash);

nodeToData.remove(node);

if(hashToNodes.size() == 0){

log.info("remove server, after remove, server list is empty");

return true;

}

//5. put data to loop by call put data method

putData(removeServerData);

log.info("remove server, remove server {} success", node);

return true;

}

@Override

public boolean addNode(String node) {

//1, calculate adding node's hash value

int addServerHash = getHash(node);

//2. add node and migrate data

if(hashToNodes.isEmpty()){

//2.1 add node and its virtual node to loop directly when current loop is empty

hashToNodes.put(addServerHash, node);

nodeToData.put(node, new LinkedList<>());

if(virtualNodeNum > 0){

addVirtualNode(node);

}

} else{

//2.2.1 get data to be migrated from loop

SortedMap<Integer, String> greatServers = hashToNodes.tailMap(addServerHash);

String greatServer = greatServers.isEmpty() ? hashToNodes.get(hashToNodes.firstKey()) : greatServers.get(greatServers.firstKey());

List<String> firstGreatServerData = new LinkedList<>(nodeToData.get(greatServer));

//2.2.2 add node and its virtual node to loop

hashToNodes.put(addServerHash, node);

nodeToData.put(greatServer, new LinkedList<>());

nodeToData.put(node, new LinkedList<>());

if(virtualNodeNum != 0){

addVirtualNode(node);

}

//2.2.3 migrate 2.2.1 data to loop by call put data method

putData(firstGreatServerData);

}

log.info("add server, server {} has been added", node);

return true;

}

@Override

public void printAllData() {

nodeToData.forEach((server, data) -> log.info("server {} contains data {}", server, data));

}

@Override

public void setHashMethod(HashService hashService) {

if(!hashToNodes.isEmpty()){

throw new UnsupportedOperationException();

}

this.hashService = hashService;

}

private void addVirtualNode(String realNode){

if(virtualNodeNum > 0){

List<String> virtualNodeList = new LinkedList<>();

for(int cnt = 0; cnt < this.virtualNodeNum; cnt++){

//1. generate virtual node name by default format

String virtualNodeName = String.format(virtualNodeFormat, realNode, cnt);

//2. calculate each virtual node's hash value

int virtualNodeHash = getHash(virtualNodeName);

//3. current node already exist in loop, continue

if(hashToNodes.containsKey(virtualNodeHash)){

continue;

}

//4. add node to loop

virtualNodeList.add(virtualNodeName);

hashToNodes.put(virtualNodeHash, virtualNodeName);

nodeToData.put(virtualNodeName, new LinkedList<>());

}

//5. map virtual node to real node

realNodeToVirtualNode.put(realNode, virtualNodeList);

}

}

private int getHash(String data){

return hashService == null ? defaultGetHash(data) : hashService.getHash(data);

}

private int defaultGetHash(String data){

int res = 0;

for(char tempChar : data.toCharArray()){

if(tempChar >= '0' && tempChar <= '9'){

res += tempChar;

}

}

return res;

}

}

测试结果

不带虚拟节点的一致性哈希

测试代码

package ConsistentHash;

import org.CommonAlgorithms.ConsistentHash.ConsistentHashing;

import org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl;

import org.junit.Assert;

import org.junit.Before;

import org.junit.Test;

import org.slf4j.Logger;

import org.slf4j.LoggerFactory;

/**

* @author cartoon

* @date 2020/12/27

*/

public class ConsistentHashingWithoutVirtualNodeTest {

private static final Logger log = LoggerFactory.getLogger(ConsistentHashingWithoutVirtualNodeTest.class);

private ConsistentHashing consistentHashing;

private String[] servers;

private String[] data;

@Before

public void before(){

servers = new String[]{"000", "111", "222", "333", "555"};

consistentHashing = new ConsistentHashingImpl(servers);

data = new String[]{"000", "111", "222", "333", "555"};

}

@Test

public void testConsistentHashing(){

for(String str : data){

Assert.assertTrue(consistentHashing.putData(str));

}

consistentHashing.removeNode("333");

consistentHashing.addNode("444");

consistentHashing.putData("444");

consistentHashing.printAllData();

}

}

测试结果

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 000 is placed to server 000, hash: 144

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 111 is placed to server 111, hash: 147

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 222 is placed to server 222, hash: 150

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 333, hash: 153

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555 is placed to server 555, hash: 159

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 555, hash: 153

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - remove server, remove server 333 success

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555 is placed to server 555, hash: 159

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 444, hash: 153

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - add server, server 444 has been added

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 444 is placed to server 444, hash: 156

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000 contains data [000]

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111 contains data [111]

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222 contains data [222]

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444 contains data [333, 444]

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555 contains data [555]

含虚拟节点的一致性哈希测试

测试代码

package ConsistentHash;

import org.CommonAlgorithms.ConsistentHash.ConsistentHashing;

import org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl;

import org.junit.Assert;

import org.junit.Before;

import org.junit.Test;

import org.slf4j.Logger;

import org.slf4j.LoggerFactory;

/**

* @author cartoon

* @date 2021/01/17

*/

public class ConsistentHashingWithVirtualNodeTest {

private static final Logger log = LoggerFactory.getLogger(ConsistentHashingWithVirtualNodeTest.class);

private ConsistentHashing consistentHashing;

private String[] servers;

private String[] data;

@Before

public void before(){

servers = new String[]{"000", "111", "222", "333", "555"};

consistentHashing = new ConsistentHashingImpl(3, servers);

data = new String[]{"000", "111", "222", "333", "555"};

}

@Test

public void testConsistentHashing(){

for(String str : data){

Assert.assertTrue(consistentHashing.putData(str));

}

consistentHashing.removeNode("333");

consistentHashing.addNode("444");

consistentHashing.putData("444");

consistentHashing.putData("555&&0");

consistentHashing.printAllData();

}

}

测试结果

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 000 is placed to server 000, hash: 144

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 111 is placed to server 111, hash: 147

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 222 is placed to server 222, hash: 150

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 333, hash: 153

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555 is placed to server 555, hash: 159

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 555, hash: 153

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - remove server, remove server 333 success

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555 is placed to server 555, hash: 159

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 444, hash: 153

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - add server, server 444 has been added

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 444 is placed to server 444, hash: 156

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555&&0 is placed to server 555&&0, hash: 207

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000&&2 contains data []

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000&&1 contains data []

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000&&0 contains data []

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111&&1 contains data []

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111&&2 contains data []

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555&&1 contains data []

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555&&2 contains data []

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222&&2 contains data []

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444&&0 contains data []

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444&&1 contains data []

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444&&2 contains data []

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555&&0 contains data [555&&0]

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000 contains data [000]

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111 contains data [111]

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222 contains data [222]

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222&&0 contains data []

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444 contains data [333, 444]

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555 contains data [555]

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222&&1 contains data []

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111&&0 contains data []

实现存在的缺陷

1. 哈希算法过于简单,哈希冲突概率较大

2. 真实节点含有虚拟节点的数量不均

3. 节点上线时真实节点与已存在的虚拟节点的顺序冲突尚未解决

后记

本次实现的所有代码已全部上传到 github,项目主要包含一些常用的算法,如排序算法,限流算法的简单实现,欢迎提 issue。

java一致性哈希算法

阅读 10发布于 7 分钟前

本作品系原创,采用《署名-非商业性使用-禁止演绎 4.0 国际》许可协议

avatar

cartoon

do what I like,love who I love

197 声望

7 粉丝

0 条评论

得票时间

avatar

cartoon

do what I like,love who I love

197 声望

7 粉丝

宣传栏

前言

近段时间在了解分布式时,经常绕不开一个算法: 一致性哈希算法。于是在了解并实践这个算法后,就有了此文章。

算法间的对比

在分布式分片中,存在着几种算法: 取模,分段,一致性 hash。

取模分段一致性哈希
上层是否感知
迁移成本低,只涉及相邻节点
单点故障影响低,只影响相邻节点
算法复杂度
热点数据存在存在存在

一致性哈希主要解决问题

从上述对比可知,一致性哈希主要降低节点上下线中带来的数据迁移成本,同时节点数量的变化与分片原则于上层无感,使上层更专注于领域内逻辑的编写,使整体架构更加灵活。

一致性 hash 原理

  1. 基本数据结构

​ 基本数据类型因人而已,常规的哈希取模采用大多采用将数据 hash 到 2^32 - 1的空间中,一致性哈希通常在此基础上将空间变成一个环。如下图所示。

【Java】一致性 hash 算法理解与实现

​ 本次实现采用的是 key 按大小排列的哈希表。原打算使用数组承接数据,但排序成本随 key 的增多而加大,遂放弃。

  1. 数据存储

    数据存储与哈希取模算法大致相同,都是通过计算存入数据的哈希值放入对应的哈希槽上。但一致性哈希差异之处在于当计算 hash 不在环上,数据存入首个 hash 槽中。

    场景假设: 现已上线 4 节点(server1 ~ 4),对应 hash 值为 hash1 ~ 4。现有5个数据(hash1 ~ 5)于存入节点中,结果如下图所示。

    【Java】一致性 hash 算法理解与实现

​ 本次实现采用的思路是

1. 计算存入数据的 hash 值

2. 寻找最近的(比数据 hash 值大的最小的节点 hash)节点并写入

3. 若 2 中未能寻找服务器,则写入第一个(hash 最小)节点中

  1. 节点上线

​ 新节点加入一致性哈希环中,原理是通过计算节点所代表的 hash 值,并根据计算值将节点映射在环上,最后迁移相邻节点数据到新节点上。

​ 场景假设: 现已上线 4 台服务器(server1 ~ 4),对应 hash 值为 hash1 ~ 4。现有一个新节点(hash5)节点上线到环上。结果如下图所示。

【Java】一致性 hash 算法理解与实现

​ 本次实现采用的思路是

1. 计算上线节点 hash 值

2. 计算上线节点所新增的虚拟节点的 hash 值(若初始化指定虚拟节点数量)

3. 寻找最近的(比上线节点与虚拟节点 hash 值大的最小的节点 hash)节点,取出节点数据

4. 将1 2点节点加入到 hash 环中

5. 将 3 中取出的数据重新放入到 hash 环上

  1. 节点下线

​ 已有节点下线,原理是通过计算节点所代表的 hash 值,取出节点所含数据,下线节点,将取出数据重新放入 hash 环上。

​ 场景假设: 现已上线 5 台服务器(server1 ~ 5),对应 hash 值为 hash1 ~ 5。现节点 server4 下线。结果如下图所示。

【Java】一致性 hash 算法理解与实现

​ 本次实现采用的思路是

1. 计算下线节点 hash 值

2. 取出下线节点以及虚拟节点(若初始化指定虚拟节点数量)存储数据

3. 将下线节点以及虚拟节点(若初始化指定虚拟节点数量)从 hash 环上移除

4. 将 2 中数据重新放入到环上

代码实现

一致性哈希分为两个方案: 不带虚拟节点与带虚拟节点。而两个方案实现类似,所以本次实现将两种方案合在一起实现。实现如下。

package org.CommonAlgorithms.ConsistentHash;

import org.CommonAlgorithms.HashAlgorithm.HashService;

import java.util.List;

/**

* consistentHashing interface

* @author cartoon

* @since 10/01/2021

* @version 1.1

*/

public interface ConsistentHashing {

/**

* put data to hash loop

* @param data data list

* @return put result

*/

boolean putData(List<String> data);

/**

* put data to hash loop

* @param data data

* @return put result

*/

boolean putData(String data);

/**

* remove node from hash loop

* @param node removing node

* @return remove result

*/

boolean removeNode(String node);

/**

* add node to hash loop

* @param node adding node

* @return add result

*/

boolean addNode(String node);

/**

* inject hash method to hash loop

* @param hashService hash method

* @throws UnsupportedOperationException if loop already has node

*/

void setHashMethod(HashService hashService);

/**

* print all data in loop according ascending hash value with nodes

*/

void printAllData();

}

package org.CommonAlgorithms.ConsistentHash;

import org.CommonAlgorithms.HashAlgorithm.HashService;

import org.slf4j.Logger;

import org.slf4j.LoggerFactory;

import java.util.*;

/**

* consistent hash achieve

* @author cartoon

* @since 2021/01/17

*/

public class ConsistentHashingImpl implements ConsistentHashing {

private static final Logger log = LoggerFactory.getLogger(ConsistentHashingImpl.class);

/**

* virtual node name template

*/

private static final String virtualNodeFormat = "%s&&%d";

/**

* real node and its virtual node mapping

*/

private SortedMap<String, List<String>> realNodeToVirtualNode;

/**

* hash and its node mapping

*/

private SortedMap<Integer, String> hashToNodes;

/**

* node and its data mapping

*/

private Map<String, List<String>> nodeToData;

/**

* determine virtual node's number of each node

*/

private int virtualNodeNum;

/**

* inject hash method, if null, use loop default hash method

*/

private HashService hashService;

public ConsistentHashingImpl() {

this(0, new String[0]);

}

public ConsistentHashingImpl(String... nodes) {

this(0, nodes);

}

public ConsistentHashingImpl(int virtualNodeNum) {

this(virtualNodeNum, new String[0]);

}

public ConsistentHashingImpl(int virtualNodeNum, String... nodes) {

//1. intercept virtual num smaller than 0

if(virtualNodeNum < 0){

log.error("virtual num is not allow smaller than 0");

throw new IllegalArgumentException();

}

//2. initialize loop member attributes

this.virtualNodeNum = virtualNodeNum;

realNodeToVirtualNode = new TreeMap<>();

hashToNodes = new TreeMap<>();

nodeToData = new HashMap<>();

for(String server : nodes){

hashToNodes.put(getHash(server), server);

nodeToData.put(server, new LinkedList<>());

}

//3. if virtual node number bigger than 0, add virtual node

if(virtualNodeNum > 0){

for(String server : nodes){

addVirtualNode(server);

}

}

}

@Override

public boolean putData(List<String> data) {

//1. circulate call put data method to add data to loop

for(String incomingData : data){

if(!putData(incomingData)){

return false;

}

}

return true;

}

@Override

public boolean putData(String data) {

if(hashToNodes.isEmpty()){

log.error("put data, usable server is empty");

return false;

}

//1. calculate data's hash value

int currentHash = getHash(data);

//2. get usual node(node's hash value is bigger than data's hash value), if usual node list is empty, get first node in loop

SortedMap<Integer, String> usableNodes = hashToNodes.tailMap(currentHash);

String node = usableNodes.isEmpty() ? hashToNodes.get(hashToNodes.firstKey()) : usableNodes.get(usableNodes.firstKey());

//3. add data to node

List<String> dataList = nodeToData.get(node);

dataList.add(data);

log.info("put data, data {} is placed to server {}, hash: {}", data, node, currentHash);

return true;

}

@Override

public boolean removeNode(String node) {

//1. calculate hash value of removing node

int removeServerHash = getHash(node);

if(!hashToNodes.containsKey(removeServerHash)){

log.error("remove server, current server is not in server list, please check server ip");

return false;

}

//2. get data from removing node

List<String> removeServerData = nodeToData.get(node);

//3. get removing node's virtual node data, remove all virtual node with removing node

if(virtualNodeNum != 0){

for(String virtualNode : realNodeToVirtualNode.get(node)){

removeServerData.addAll(nodeToData.get(virtualNode));

hashToNodes.remove(getHash(virtualNode));

nodeToData.remove(virtualNode);

}

}

//4. remove node from hash loop

hashToNodes.remove(removeServerHash);

nodeToData.remove(node);

if(hashToNodes.size() == 0){

log.info("remove server, after remove, server list is empty");

return true;

}

//5. put data to loop by call put data method

putData(removeServerData);

log.info("remove server, remove server {} success", node);

return true;

}

@Override

public boolean addNode(String node) {

//1, calculate adding node's hash value

int addServerHash = getHash(node);

//2. add node and migrate data

if(hashToNodes.isEmpty()){

//2.1 add node and its virtual node to loop directly when current loop is empty

hashToNodes.put(addServerHash, node);

nodeToData.put(node, new LinkedList<>());

if(virtualNodeNum > 0){

addVirtualNode(node);

}

} else{

//2.2.1 get data to be migrated from loop

SortedMap<Integer, String> greatServers = hashToNodes.tailMap(addServerHash);

String greatServer = greatServers.isEmpty() ? hashToNodes.get(hashToNodes.firstKey()) : greatServers.get(greatServers.firstKey());

List<String> firstGreatServerData = new LinkedList<>(nodeToData.get(greatServer));

//2.2.2 add node and its virtual node to loop

hashToNodes.put(addServerHash, node);

nodeToData.put(greatServer, new LinkedList<>());

nodeToData.put(node, new LinkedList<>());

if(virtualNodeNum != 0){

addVirtualNode(node);

}

//2.2.3 migrate 2.2.1 data to loop by call put data method

putData(firstGreatServerData);

}

log.info("add server, server {} has been added", node);

return true;

}

@Override

public void printAllData() {

nodeToData.forEach((server, data) -> log.info("server {} contains data {}", server, data));

}

@Override

public void setHashMethod(HashService hashService) {

if(!hashToNodes.isEmpty()){

throw new UnsupportedOperationException();

}

this.hashService = hashService;

}

private void addVirtualNode(String realNode){

if(virtualNodeNum > 0){

List<String> virtualNodeList = new LinkedList<>();

for(int cnt = 0; cnt < this.virtualNodeNum; cnt++){

//1. generate virtual node name by default format

String virtualNodeName = String.format(virtualNodeFormat, realNode, cnt);

//2. calculate each virtual node's hash value

int virtualNodeHash = getHash(virtualNodeName);

//3. current node already exist in loop, continue

if(hashToNodes.containsKey(virtualNodeHash)){

continue;

}

//4. add node to loop

virtualNodeList.add(virtualNodeName);

hashToNodes.put(virtualNodeHash, virtualNodeName);

nodeToData.put(virtualNodeName, new LinkedList<>());

}

//5. map virtual node to real node

realNodeToVirtualNode.put(realNode, virtualNodeList);

}

}

private int getHash(String data){

return hashService == null ? defaultGetHash(data) : hashService.getHash(data);

}

private int defaultGetHash(String data){

int res = 0;

for(char tempChar : data.toCharArray()){

if(tempChar >= '0' && tempChar <= '9'){

res += tempChar;

}

}

return res;

}

}

测试结果

不带虚拟节点的一致性哈希

测试代码

package ConsistentHash;

import org.CommonAlgorithms.ConsistentHash.ConsistentHashing;

import org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl;

import org.junit.Assert;

import org.junit.Before;

import org.junit.Test;

import org.slf4j.Logger;

import org.slf4j.LoggerFactory;

/**

* @author cartoon

* @date 2020/12/27

*/

public class ConsistentHashingWithoutVirtualNodeTest {

private static final Logger log = LoggerFactory.getLogger(ConsistentHashingWithoutVirtualNodeTest.class);

private ConsistentHashing consistentHashing;

private String[] servers;

private String[] data;

@Before

public void before(){

servers = new String[]{"000", "111", "222", "333", "555"};

consistentHashing = new ConsistentHashingImpl(servers);

data = new String[]{"000", "111", "222", "333", "555"};

}

@Test

public void testConsistentHashing(){

for(String str : data){

Assert.assertTrue(consistentHashing.putData(str));

}

consistentHashing.removeNode("333");

consistentHashing.addNode("444");

consistentHashing.putData("444");

consistentHashing.printAllData();

}

}

测试结果

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 000 is placed to server 000, hash: 144

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 111 is placed to server 111, hash: 147

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 222 is placed to server 222, hash: 150

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 333, hash: 153

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555 is placed to server 555, hash: 159

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 555, hash: 153

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - remove server, remove server 333 success

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555 is placed to server 555, hash: 159

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 444, hash: 153

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - add server, server 444 has been added

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 444 is placed to server 444, hash: 156

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000 contains data [000]

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111 contains data [111]

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222 contains data [222]

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444 contains data [333, 444]

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555 contains data [555]

含虚拟节点的一致性哈希测试

测试代码

package ConsistentHash;

import org.CommonAlgorithms.ConsistentHash.ConsistentHashing;

import org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl;

import org.junit.Assert;

import org.junit.Before;

import org.junit.Test;

import org.slf4j.Logger;

import org.slf4j.LoggerFactory;

/**

* @author cartoon

* @date 2021/01/17

*/

public class ConsistentHashingWithVirtualNodeTest {

private static final Logger log = LoggerFactory.getLogger(ConsistentHashingWithVirtualNodeTest.class);

private ConsistentHashing consistentHashing;

private String[] servers;

private String[] data;

@Before

public void before(){

servers = new String[]{"000", "111", "222", "333", "555"};

consistentHashing = new ConsistentHashingImpl(3, servers);

data = new String[]{"000", "111", "222", "333", "555"};

}

@Test

public void testConsistentHashing(){

for(String str : data){

Assert.assertTrue(consistentHashing.putData(str));

}

consistentHashing.removeNode("333");

consistentHashing.addNode("444");

consistentHashing.putData("444");

consistentHashing.putData("555&&0");

consistentHashing.printAllData();

}

}

测试结果

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 000 is placed to server 000, hash: 144

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 111 is placed to server 111, hash: 147

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 222 is placed to server 222, hash: 150

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 333, hash: 153

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555 is placed to server 555, hash: 159

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 555, hash: 153

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - remove server, remove server 333 success

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555 is placed to server 555, hash: 159

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 444, hash: 153

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - add server, server 444 has been added

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 444 is placed to server 444, hash: 156

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555&&0 is placed to server 555&&0, hash: 207

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000&&2 contains data []

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000&&1 contains data []

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000&&0 contains data []

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111&&1 contains data []

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111&&2 contains data []

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555&&1 contains data []

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555&&2 contains data []

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222&&2 contains data []

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444&&0 contains data []

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444&&1 contains data []

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444&&2 contains data []

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555&&0 contains data [555&&0]

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000 contains data [000]

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111 contains data [111]

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222 contains data [222]

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222&&0 contains data []

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444 contains data [333, 444]

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555 contains data [555]

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222&&1 contains data []

[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111&&0 contains data []

实现存在的缺陷

1. 哈希算法过于简单,哈希冲突概率较大

2. 真实节点含有虚拟节点的数量不均

3. 节点上线时真实节点与已存在的虚拟节点的顺序冲突尚未解决

后记

本次实现的所有代码已全部上传到 github,项目主要包含一些常用的算法,如排序算法,限流算法的简单实现,欢迎提 issue。

以上是 【Java】一致性 hash 算法理解与实现 的全部内容, 来源链接: utcz.com/a/112720.html

回到顶部