10. Spring Cloud, Chapter 10 (Advanced): Service Degradation, Circuit Breaking, and Real-Time Monitoring with Hystrix
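I. Hystrix Overview
1. Service avalanche
Service avalanche: suppose the services call one another in a chain, with Service A calling Service B and Service B calling Service C. Traffic to Service A fluctuates heavily and often spikes without warning. Even if Service A can absorb such a spike, Service B and Service C may not.
If Service C buckles under the load and becomes unavailable, Service B's requests to it block and gradually exhaust Service B's threads, so Service B becomes unavailable as well. Calls to Service A then tie up more and more resources, until the whole system collapses.
This situation, in which the failure of one service brings down every service on the call chain, is called a service avalanche.
Service degradation and circuit breaking are two ways to guard against it.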
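2. Circuit breaking
When a downstream service suddenly becomes unavailable or responds too slowly, the upstream service protects its own availability by no longer calling the target service: it returns immediately and releases its resources quickly. Once the target service recovers, calls resume.
There are many popular circuit breakers today, for example Alibaba's Sentinel (to be covered in a later post) and Hystrix, the most widely used.
The main Hystrix settings are: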
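## Minimum number of requests in the rolling statistics window before the breaker can trip; default 20
circuitBreaker.requestVolumeThreshold
## How long the breaker stays open before it lets a trial request through, in milliseconds; default 5000 (5 s)
circuitBreaker.sleepWindowInMilliseconds
## Error percentage at or above which the breaker opens; default 50
circuitBreaker.errorThresholdPercentage
With the defaults, the breaker opens once the rolling window (10 s by default) has seen at least 20 requests and at least 50% of them have failed. While it is open, calls to the service fail immediately (or go to the fallback) without reaching the remote service. After 5 s the breaker lets a single trial request through: if it succeeds the breaker closes again, otherwise it stays open for another 5 s.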
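In short:
Think of a fuse. Once the service is hit with more than it can take, the breaker trips and further calls are rejected outright; the degradation (fallback) method is then invoked and returns a friendly message.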
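The open, half-open, and closed behavior described above can be sketched in plain Java. This is only an illustration under simplifying assumptions, not Hystrix's implementation: the class name `SimpleCircuitBreaker` is hypothetical, the counters are cumulative since the last reset rather than kept in a rolling time window, the caller passes the current time explicitly, and calls are serialized by a single lock.

```java
import java.util.function.Supplier;

/** Hypothetical, simplified circuit breaker; Hystrix itself counts over a rolling time window. */
class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int requestVolumeThreshold;   // minimum calls before the error rate is evaluated
    private final int errorThresholdPercentage; // error rate (%) at which the breaker opens
    private final long sleepWindowMillis;       // time to stay open before one trial call

    private State state = State.CLOSED;
    private int total;
    private int failures;
    private long openedAtMillis;

    SimpleCircuitBreaker(int requestVolumeThreshold, int errorThresholdPercentage, long sleepWindowMillis) {
        this.requestVolumeThreshold = requestVolumeThreshold;
        this.errorThresholdPercentage = errorThresholdPercentage;
        this.sleepWindowMillis = sleepWindowMillis;
    }

    synchronized State state() {
        return state;
    }

    /** Runs the remote call, or returns the fallback without calling while the breaker is open. */
    synchronized <T> T call(Supplier<T> remote, Supplier<T> fallback, long nowMillis) {
        if (state == State.OPEN) {
            if (nowMillis - openedAtMillis < sleepWindowMillis) {
                return fallback.get();      // short-circuit: the remote service is not called
            }
            state = State.HALF_OPEN;        // sleep window elapsed: allow one trial call
        }
        try {
            T result = remote.get();
            record(true, nowMillis);
            return result;
        } catch (RuntimeException e) {
            record(false, nowMillis);
            return fallback.get();
        }
    }

    private void record(boolean success, long nowMillis) {
        if (state == State.HALF_OPEN) {
            if (success) {                  // trial call succeeded: close and start counting afresh
                state = State.CLOSED;
                total = 0;
                failures = 0;
            } else {                        // trial call failed: open again for another sleep window
                open(nowMillis);
            }
            return;
        }
        total++;
        if (!success) {
            failures++;
        }
        if (total >= requestVolumeThreshold && failures * 100 >= errorThresholdPercentage * total) {
            open(nowMillis);
        }
    }

    private void open(long nowMillis) {
        state = State.OPEN;
        openedAtMillis = nowMillis;
    }
}
```

Passing the clock in as a parameter keeps the sketch deterministic; a real breaker would read the system clock and expire old samples.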
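3. Service degradation
Two scenarios:
a. When a downstream service responds too slowly for some reason, it proactively shuts off some of its less important functions to free server resources and respond faster.
b. When a downstream service is unavailable for some reason, the upstream service runs local fallback logic instead, so that it does not hang and can answer the user quickly.
In short:
"The server is busy, please try again later": instead of keeping the client waiting, return a friendly message (the fallback) immediately.
4. Service rate limiting
For flash sales and other high-concurrency operations, requests must not flood in all at once; they queue up and are processed in order, N per second.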
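As a rough illustration of "N per second", here is a minimal fixed-window limiter in plain Java. It is a sketch under stated assumptions: the class name `FixedWindowRateLimiter` is hypothetical, the caller supplies the current time, and excess requests are rejected rather than queued as the text describes.

```java
/** Hypothetical fixed-window limiter: at most permitsPerSecond requests per 1000 ms window. */
class FixedWindowRateLimiter {
    private final int permitsPerSecond;
    private long windowStartMillis;
    private int usedInWindow;

    FixedWindowRateLimiter(int permitsPerSecond) {
        this.permitsPerSecond = permitsPerSecond;
    }

    /** Returns true if the request may proceed, false if this window's quota is used up. */
    synchronized boolean tryAcquire(long nowMillis) {
        if (nowMillis - windowStartMillis >= 1000) {
            windowStartMillis = nowMillis;  // start a new one-second window
            usedInWindow = 0;
        }
        if (usedInWindow < permitsPerSecond) {
            usedInWindow++;
            return true;
        }
        return false;
    }
}
```

A caller that wants queueing instead of rejection could retry after a short delay whenever `tryAcquire` returns false.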
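5. Differences between degradation and circuit breaking
Similarities:
Same goal: both are motivated by availability and reliability, and both aim to keep the system from collapsing.
Similar user experience: in both cases the user ends up seeing that some function is temporarily unavailable.
Differences:
Different triggers: circuit breaking is usually triggered by the failure of a particular (downstream) service, whereas degradation is usually decided from the overall load.
Different management scope: circuit breaking is handled at the framework level and every microservice needs it, with no hierarchy involved; degradation usually ranks services by business importance (for example, it typically starts with the outermost services).
Different implementation: degradation is intrusive in the code (done by the controller, or triggered automatically), while circuit breaking is usually described as self-tripping.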
II. Example
1. Build cloud-provider-hystrix-payment-8001
POM
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<parent>
<artifactId>cloud_2020</artifactId>
<groupId>com.lee.springcloud</groupId>
<version>1.0-SNAPSHOT</version>
</parent>
<modelVersion>4.0.0</modelVersion>
<artifactId>cloud-provider-hystrix-payment-8001</artifactId>
<dependencies>
<!--hystrix-->
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
</dependency>
<!--eureka client-->
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-netflix-eureka-client</artifactId>
</dependency>
<dependency>
<groupId>com.lee.springcloud</groupId>
<artifactId>cloud-api-common</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<!-- monitoring -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<!-- hot reload -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-devtools</artifactId>
<scope>runtime</scope>
<optional>true</optional>
</dependency>
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<optional>true</optional>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>
</dependencies>
</project>
application.yml
server:
  port: 8001
spring:
  application:
    name: cloud-provider-hystrix-payment
eureka:
  client:
    register-with-eureka: true
    fetch-registry: true
    service-url:
      defaultZone: http://eureka7001.com:7001/eureka
Main application class:
@SpringBootApplication
@EnableEurekaClient
public class PaymentHystrixMain8001 {
    public static void main(String[] args) {
        SpringApplication.run(PaymentHystrixMain8001.class, args);
    }
}
service
@Service
public class PaymentService {
    // Normal request
    public String paymentInfo_ok(Integer id) {
        return "thread:" + Thread.currentThread().getName() + " payment ok id : " + id + " ^_^";
    }

    // Slow request (simulated with a 3 s sleep)
    public String paymentInfo_timeout(Integer id) throws InterruptedException {
        int timeNumber = 3;
        TimeUnit.SECONDS.sleep(timeNumber);
        return "thread:" + Thread.currentThread().getName() + " payment timeout id : " + id + " ╥﹏╥";
    }
}
controller
@RestController
@Slf4j
public class PaymentController {
    @Resource
    private PaymentService paymentService;

    @Value("${server.port}")
    private String servicePort;

    // Normal request
    @GetMapping("/payment/hystrix/ok/{id}")
    public String paymentInfo_OK(@PathVariable("id") Integer id) {
        String result = paymentService.paymentInfo_ok(id);
        return result;
    }

    // Slow request
    @GetMapping("/payment/hystrix/timeout/{id}")
    public String paymentInfo_TimeOut(@PathVariable("id") Integer id) throws InterruptedException {
        String result = paymentService.paymentInfo_timeout(id);
        return result;
    }
}
Test:
1. Start eureka-service-7001
2. Start hystrix-payment-8001
3. Open http://localhost:8001/payment/hystrix/ok/1
The result comes back immediately.
4. Open http://localhost:8001/payment/hystrix/timeout/1
The result comes back after a 3 s wait.
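JMeter load test
1. In JMeter, configure a thread group with 200 or 2000 threads, looping 100 times
2. Point JMeter at http://localhost:8001/payment/hystrix/timeout/1
3. In the browser, open http://localhost:8001/payment/hystrix/ok/1
The page spins for a long time before the result appears.
Cause:
While JMeter is hammering the timeout endpoint, Tomcat's default pool of worker threads is fully occupied, so no thread is left over to serve the request to the ok endpoint.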
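The starvation effect can be reproduced without Tomcat. The sketch below uses a small fixed thread pool as a stand-in for Tomcat's worker threads; the class and method names are hypothetical, and the 200 ms wait is an arbitrary choice for the demonstration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class WorkerPoolSaturation {

    /**
     * Returns true when a fast request cannot finish within 200 ms because
     * slow requests are holding every worker thread.
     */
    static boolean fastRequestIsStarved(int workerThreads, int slowRequests) {
        ExecutorService workers = Executors.newFixedThreadPool(workerThreads);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < slowRequests; i++) {
                // A "slow" request parks its worker until the latch is released.
                workers.submit(() -> {
                    release.await();
                    return "slow";
                });
            }
            // The fast request waits in the queue if no worker is free.
            Future<String> fast = workers.submit(() -> "ok");
            try {
                fast.get(200, TimeUnit.MILLISECONDS);
                return false;
            } catch (TimeoutException e) {
                return true;
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            release.countDown();    // let the slow requests finish
            workers.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("2 workers, 2 slow requests: starved = " + fastRequestIsStarved(2, 2));
        System.out.println("3 workers, 2 slow requests: starved = " + fastRequestIsStarved(3, 2));
    }
}
```

With as many slow requests as workers, the fast request never gets a thread; one spare worker is enough for it to complete, which is the idea behind isolating slow calls in their own pool.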
2. Build cloud-consumer-feign-hystrix-order-80
POM
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<parent>
<artifactId>cloud_2020</artifactId>
<groupId>com.lee.springcloud</groupId>
<version>1.0-SNAPSHOT</version>
</parent>
<modelVersion>4.0.0</modelVersion>
<artifactId>cloud-consumer-feign-hystrix-order-80</artifactId>
<dependencies>
<!--openfeign-->
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-openfeign</artifactId>
</dependency>
<!--eureka client-->
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-netflix-eureka-client</artifactId>
</dependency>
<dependency>
<groupId>com.lee.springcloud</groupId>
<artifactId>cloud-api-common</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<!-- monitoring -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<!-- hot reload -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-devtools</artifactId>
<scope>runtime</scope>
<optional>true</optional>
</dependency>
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<optional>true</optional>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>
</dependencies>
</project>
application.yml
server:
  port: 80
eureka:
  client:
    register-with-eureka: false
    fetch-registry: true
    service-url:
      defaultZone: http://eureka7001.com:7001/eureka
Main application class:
@SpringBootApplication
@EnableEurekaClient
@EnableFeignClients
public class OrderHystrixMain80 {
    public static void main(String[] args) {
        SpringApplication.run(OrderHystrixMain80.class, args);
    }
}
service
@Component
@FeignClient(value = "CLOUD-PROVIDER-HYSTRIX-PAYMENT")
public interface PaymentHystrixService {
    // Normal request
    @GetMapping("/payment/hystrix/ok/{id}")
    public String paymentInfo_OK(@PathVariable("id") Integer id);

    // Slow request
    @GetMapping("/payment/hystrix/timeout/{id}")
    public String paymentInfo_TimeOut(@PathVariable("id") Integer id);
}
controller
@RestController
@Slf4j
public class OrderHyrixController {
    @Autowired
    private PaymentHystrixService paymentHystrixService;

    @GetMapping("/consumer/payment/hystrix/ok/{id}")
    public String paymentInfo_OK(@PathVariable("id") Integer id) {
        return paymentHystrixService.paymentInfo_OK(id);
    }

    @GetMapping("/consumer/payment/hystrix/timeout/{id}")
    public String paymentInfo_TimeOut(@PathVariable("id") Integer id) {
        return paymentHystrixService.paymentInfo_TimeOut(id);
    }
}
Test:
1. Start eureka-7001
2. Start hystrix-payment-8001
3. Start hystrix-order-80
4. Open http://localhost/consumer/payment/hystrix/ok/2
Load test:
1. In JMeter, configure a thread group with 200 or 2000 threads, looping 100 times
2. Point JMeter at http://localhost:8001/payment/hystrix/timeout/2
3. In the browser, open http://localhost/consumer/payment/hystrix/ok/2
The page spins for a long time before the result appears, or fails outright with:
Read timed out executing GET http://CLOUD-PROVIDER-HYSTRIX-PAYMENT/payment/hystrix/ok/2
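3. How to fix it
3.1. Service degradation
3.1.1. Degradation on the provider side
Degradation is configured with @HystrixCommand.
8001 first looks for problems on its own side: it sets an upper limit on how long a call may take. Within the limit the call runs normally; beyond it, the degradation fallback is used.
Make the following changes to cloud-provider-hystrix-payment-8001:
Main application class: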
@SpringBootApplication
@EnableEurekaClient
@EnableHystrix // has the same effect as @EnableCircuitBreaker
public class PaymentHystrixMain8001 {
    public static void main(String[] args) {
        SpringApplication.run(PaymentHystrixMain8001.class, args);
    }
}
service:
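import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import com.netflix.hystrix.contrib.javanica.annotation.HystrixProperty;
import org.springframework.stereotype.Service;
import java.util.concurrent.TimeUnit;

@Service
public class PaymentService {
    // Normal request
    public String paymentInfo_ok(Integer id) {
        return "thread:" + Thread.currentThread().getName() + " payment ok id : " + id + " ^_^";
    }

    // Slow request: falls back if the call takes longer than 3000 ms or throws
    @HystrixCommand(fallbackMethod = "paymentInfo_TimeOut_handler", commandProperties = {
            @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "3000")
    })
    public String paymentInfo_timeout(Integer id) throws InterruptedException {
        // int a = 100/0;  // uncomment to test the exception path instead of the timeout
        int timeNumber = 5;
        TimeUnit.SECONDS.sleep(timeNumber);
        return "thread:" + Thread.currentThread().getName() + " payment timeout id : " + id + " ╥﹏╥";
    }

    // Fallback method used for degradation
    public String paymentInfo_TimeOut_handler(Integer id) {
        // Message text: "service call timed out or threw an exception"
        return "调用服务接口超时or异常 " + Thread.currentThread().getName();
    }
}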
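Test:
1. Start eureka-7001
2. Start provider-hystrix-payment-8001
3. Open http://localhost:8001/payment/hystrix/timeout/1
Result: 调用服务接口超时or异常 HystrixTimer-1 (the fallback message, "service call timed out or threw an exception", followed by the name of the thread that ran the fallback)
The @HystrixCommand limit is 3 s but the method sleeps for 5 s (timeNumber), so once the call has run for more than 3 s Hystrix invokes the fallback method directly.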
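The timeout-then-fallback behavior seen in this test can be approximated in plain Java with a `Future` and a deadline. This is a conceptual sketch only, not how Hystrix is implemented; the class `TimeoutFallback` and its method are hypothetical names.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

class TimeoutFallback {
    // Daemon threads so that an abandoned slow call never keeps the JVM alive.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "timeout-fallback-demo");
        t.setDaemon(true);
        return t;
    });

    /** Runs primary on another thread; returns the fallback value on timeout or failure. */
    static <T> T callWithFallback(Callable<T> primary, Supplier<T> fallback, long timeoutMillis) {
        Future<T> future = POOL.submit(primary);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);                    // interrupt the slow call and give up on it
            return fallback.get();
        } catch (ExecutionException e) {
            return fallback.get();                  // the call itself threw an exception
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();     // preserve the interrupt for the caller
            return fallback.get();
        }
    }
}
```

For example, `TimeoutFallback.callWithFallback(() -> { Thread.sleep(5000); return "paid"; }, () -> "busy, try again later", 3000)` mirrors the 5 s sleep against the 3 s limit and yields the fallback string.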
3.1.2. Degradation on the consumer side
Make the following changes to cloud-consumer-feign-hystrix-order-80:
Add to the POM:
<!--hystrix-->
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
</dependency>
Add to application.yml:
feign:
  hystrix:
    enabled: true
Add to the main application class:
@EnableHystrix // has the same effect as @EnableCircuitBreaker
Underlying source of @EnableHystrix:
@Target({ElementType.TYPE})
@Retention(RetentionPolicy.RUNTIME)
@Documented
@Inherited
@EnableCircuitBreaker
public @interface EnableHystrix {
}
controller:
@RestController
@Slf4j
public class OrderHyrixController {
    @Autowired
    private PaymentHystrixService paymentHystrixService;

    @GetMapping("/consumer/payment/hystrix/ok/{id}")
    public String paymentInfo_OK(@PathVariable("id") Integer id) {
        return paymentHystrixService.paymentInfo_OK(id);
    }

    // Falls back if the call takes longer than 1500 ms or throws
    @GetMapping("/consumer/payment/hystrix/timeout/{id}")
    @HystrixCommand(fallbackMethod = "paymentInfo_TimeOut_fallback_method", commandProperties = {
            @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "1500")
    })
    public String paymentInfo_TimeOut(@PathVariable("id") Integer id) {
        return paymentHystrixService.paymentInfo_TimeOut(id);
    }

    // Message text: "the payment service failed or timed out; running the consumer's own fallback method"
    public String paymentInfo_TimeOut_fallback_method(@PathVariable("id") Integer id) {
        return "this is 80 port, consumer ,对方支付接口异常or超时,此时在执行自己的fallback服务降级方法" + id;
    }
}
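Test:
1. Start eureka-7001
2. Start provider-hystrix-payment-8001
3. Start consumer-feign-hystrix-order-80
4. Open http://localhost/consumer/payment/hystrix/timeout/1
Result: this is 80 port, consumer ,对方支付接口异常or超时,此时在执行自己的fallback服务降级方法1
(the consumer's fallback message, roughly "the payment service failed or timed out; running the consumer's own fallback method", followed by the id)
The @HystrixCommand limit on port 80 is 1.5 s and the one on 8001 is 3 s. The call exceeds 80's 1.5 s first, so 80 runs its own fallback method.
At that point 8001 has not yet reached the step of calling its own fallback.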
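3.1.3. The code-bloat problem
Writing a separate fallback for every method bloats the code. Most Hystrix handling is done on the consumer side, so we modify cloud-consumer-feign-hystrix-order-80.
Modified controller: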
@RestController
@Slf4j
@DefaultProperties(defaultFallback = "paymentInfo_TimeOut_global_fallback_method")
public class OrderHyrixController {
    @Autowired
    private PaymentHystrixService paymentHystrixService;

    @GetMapping("/consumer/payment/hystrix/ok/{id}")
    public String paymentInfo_OK(@PathVariable("id") Integer id) {
        return paymentHystrixService.paymentInfo_OK(id);
    }

    @GetMapping("/consumer/payment/hystrix/timeout/{id}")
    @HystrixCommand // can be added to any method that needs degradation; all of them share the fallback declared on the controller
    public String paymentInfo_TimeOut(@PathVariable("id") Integer id) {
        return paymentHystrixService.paymentInfo_TimeOut(id);
    }

    public String paymentInfo_TimeOut_fallback_method(@PathVariable("id") Integer id) {
        return "this is 80 port, consumer ,对方支付接口异常or超时,此时在执行自己的fallback服务降级方法" + id;
    }

    // A global fallback is shared by methods with different signatures, so it must not take the original method's parameters
    public String paymentInfo_TimeOut_global_fallback_method() {
        return "this is global 80 port, consumer ,对方支付接口异常or超时,此时在执行自己的fallback服务降级方法";
    }
}
Test:
Same as in 3.1.2, plus one more test with the provider shut down.
Result:
this is global 80 port, consumer ,对方支付接口异常or超时,此时在执行自己的fallback服务降级方法
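3.1.4. Fallbacks mixed in with business logic
The fallback methods above sit in the controller alongside the business logic, so the layering is unclear.
Solution: service 80 uses Feign, so for each Feign service interface we can create a fallback class that implements it. Here, PaymentFallbackService implements PaymentHystrixService.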
PaymentHystrixService
@Component
//@FeignClient(value = "CLOUD-PROVIDER-HYSTRIX-PAYMENT")
@FeignClient(value = "CLOUD-PROVIDER-HYSTRIX-PAYMENT", fallback = PaymentFallbackService.class)
public interface PaymentHystrixService {
    // Normal request
    @GetMapping("/payment/hystrix/ok/{id}")
    public String paymentInfo_OK(@PathVariable("id") Integer id);

    // Slow request
    @GetMapping("/payment/hystrix/timeout/{id}")
    public String paymentInfo_TimeOut(@PathVariable("id") Integer id);
}
PaymentFallbackService
@Component
public class PaymentFallbackService implements PaymentHystrixService {
    @Override
    public String paymentInfo_OK(Integer id) {
        return "----------------->paymentInfo_Ok_fallback_method";
    }

    @Override
    public String paymentInfo_TimeOut(Integer id) {
        return "----------------->paymentInfo_TimeOut_fallback_method";
    }
}
controller
Remove both @DefaultProperties(defaultFallback = "paymentInfo_TimeOut_global_fallback_method") and @HystrixCommand.
Test:
Same as in 3.1.2, plus one more test with the provider shut down.
Result:
----------------->paymentInfo_TimeOut_fallback_method
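Conceptually, a fallback class is a second implementation of the same interface that is consulted when the primary call fails. The sketch below imitates that idea with a JDK dynamic proxy. It is an illustration only and makes no claim about how Feign or Hystrix wire the fallback internally; `PaymentClient` and `FallbackProxy` are hypothetical names.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

/** Hypothetical stand-in for a Feign client interface. */
interface PaymentClient {
    String paymentInfo(Integer id);
}

class FallbackProxy {
    /** Wraps primary so that any method that throws is retried on the fallback implementation. */
    static <T> T withFallback(Class<T> api, T primary, T fallback) {
        Object proxy = Proxy.newProxyInstance(
                api.getClassLoader(),
                new Class<?>[] { api },
                (self, method, args) -> {
                    try {
                        return method.invoke(primary, args);
                    } catch (InvocationTargetException e) {
                        // The primary implementation threw: call the same method on the fallback.
                        return method.invoke(fallback, args);
                    }
                });
        return api.cast(proxy);
    }
}
```

Because the fallback lives in its own class, the controller stays free of degradation code, which is the separation this section is after.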
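3.2. Circuit breaking
Circuit breaking is a call-chain protection mechanism that guards microservices against the avalanche effect. When a microservice on a fan-out path is unavailable or takes too long to respond, calls to it are degraded and then short-circuited, so that an error response is returned quickly. Once calls to that microservice are found to be responding normally again (in Hystrix, when a trial request succeeds after the sleep window), normal calls resume.
In Spring Cloud, circuit breaking is implemented by Hystrix, which monitors the calls between microservices.
When the failure rate reaches a threshold, the breaker trips. By default that means at least 20 calls within the 10 s rolling window, with at least 50% of them failing. The annotation used is @HystrixCommand.
Modify cloud-provider-hystrix-payment-8001.
Add the following to PaymentService: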
// Circuit breaker
@HystrixCommand(fallbackMethod = "paymentInfo_circuitBreaker_handler", commandProperties = {
        @HystrixProperty(name = "circuitBreaker.enabled", value = "true"), // enable the circuit breaker
        @HystrixProperty(name = "circuitBreaker.requestVolumeThreshold", value = "10"), // minimum number of requests in the rolling window
        @HystrixProperty(name = "circuitBreaker.sleepWindowInMilliseconds", value = "10000"), // how long the breaker stays open before a trial request
        @HystrixProperty(name = "circuitBreaker.errorThresholdPercentage", value = "60") // failure rate (%) at which the breaker trips
})
public String paymentInfo_circuitBreaker(Integer id) {
    if (id < 0) {
        throw new RuntimeException("----->id 不能为负数."); // "id must not be negative"
    }
    String serialNumber = UUID.randomUUID().toString();
    return "paymentInfo_circuitBreaker 调用成功,流水号" + serialNumber; // "call succeeded, serial number ..."
}

// Fallback method for the circuit breaker
public String paymentInfo_circuitBreaker_handler(Integer id) {
    return "ID 不能为负数,请稍后再试......." + id; // "ID must not be negative, please try again later"
}
Add the following to the controller:
// Circuit breaker
@GetMapping("/payment/hystrix/circuitBreaker/{id}")
public String paymentInfo_circuitBreaker(@PathVariable("id") Integer id) {
    String result = paymentService.paymentInfo_circuitBreaker(id);
    log.info("----->" + result);
    return result;
}
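Self-test:
1. Start eureka-7001
2. Start cloud-provider-hystrix-payment-8001
3. Open http://localhost:8001/payment/hystrix/circuitBreaker/1
Result: paymentInfo_circuitBreaker 调用成功,流水号ae6598ff-34f5-4d36-baa2-c8125bf1722a (the success message with a serial number)
4. Then open http://localhost:8001/payment/hystrix/circuitBreaker/-1
Result: ID 不能为负数,请稍后再试.......-1 (the fallback message, "ID must not be negative, please try again later", followed by the id)
5. Open http://localhost:8001/payment/hystrix/circuitBreaker/-1 many times in quick succession,
then open http://localhost:8001/payment/hystrix/circuitBreaker/1
Even the valid request now gets the fallback: ID 不能为负数,请稍后再试.......1
After opening http://localhost:8001/payment/hystrix/circuitBreaker/1 several more times:
Result: paymentInfo_circuitBreaker 调用成功,流水号472fc9d5-e171-4b65-a541-b81acc8eeb5a
Success, then failure, then success again: the outcome depends on the number and ratio of failures and successes within the configured time window, and it shows the circuit breaker doing its job.
Notes: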
// The parameters that @HystrixCommand needs for the circuit breaker are all defined in the HystrixCommandProperties class
public abstract class HystrixCommandProperties {
    private static final Logger logger = LoggerFactory.getLogger(HystrixCommandProperties.class);
    static final Integer default_metricsRollingStatisticalWindow = 10000;
    private static final Integer default_metricsRollingStatisticalWindowBuckets = 10;
    private static final Integer default_circuitBreakerRequestVolumeThreshold = 20;
    private static final Integer default_circuitBreakerSleepWindowInMilliseconds = 5000;
    private static final Integer default_circuitBreakerErrorThresholdPercentage = 50;
    private static final Boolean default_circuitBreakerForceOpen = false;
    static final Boolean default_circuitBreakerForceClosed = false;
    //......
}
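The first two defaults above describe a 10000 ms statistics window split into 10 buckets, that is, 1000 ms per bucket. The sketch below shows one way such a bucketed rolling window can yield the request count and error percentage that the breaker thresholds are compared against. It is a simplified illustration with hypothetical names, not Hystrix's actual metrics code, and it takes the current time as a parameter.

```java
import java.util.Arrays;

/** Hypothetical bucketed rolling counter: windowMillis split into equal time buckets. */
class RollingErrorRate {
    private final long bucketMillis;
    private final long windowMillis;
    private final long[] bucketStart;   // start time of the data held in each slot, -1 if empty
    private final int[] successes;
    private final int[] failures;

    RollingErrorRate(long windowMillis, int buckets) {
        this.bucketMillis = windowMillis / buckets;
        this.windowMillis = bucketMillis * buckets;
        this.bucketStart = new long[buckets];
        this.successes = new int[buckets];
        this.failures = new int[buckets];
        Arrays.fill(bucketStart, -1L);
    }

    void record(boolean success, long nowMillis) {
        long start = nowMillis - nowMillis % bucketMillis;
        int slot = (int) ((nowMillis / bucketMillis) % bucketStart.length);
        if (bucketStart[slot] != start) {   // slot still holds an older bucket: recycle it
            bucketStart[slot] = start;
            successes[slot] = 0;
            failures[slot] = 0;
        }
        if (success) {
            successes[slot]++;
        } else {
            failures[slot]++;
        }
    }

    int totalRequests(long nowMillis) {
        return sum(successes, nowMillis) + sum(failures, nowMillis);
    }

    int errorPercentage(long nowMillis) {
        int total = totalRequests(nowMillis);
        return total == 0 ? 0 : sum(failures, nowMillis) * 100 / total;
    }

    // Adds up only the buckets that still fall inside the window ending at nowMillis.
    private int sum(int[] counts, long nowMillis) {
        long currentStart = nowMillis - nowMillis % bucketMillis;
        int sum = 0;
        for (int i = 0; i < counts.length; i++) {
            boolean live = bucketStart[i] >= 0 && currentStart - bucketStart[i] < windowMillis;
            if (live) {
                sum += counts[i];
            }
        }
        return sum;
    }
}
```

Old failures drop out of the statistics one bucket at a time as the window slides forward, so a burst of errors stops counting against the service once it is more than one window in the past.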
3.3. Service rate limiting
To be covered later, when the Spring Cloud Alibaba posts reach Sentinel.