并发编程

Z时代
2024-01-10
分类：综合

python

并发编程： https://www.processon.com/mindmap/5f636bac0791295dccc46f28

from concurrent.futures import ThreadPoolExecutor,ProcessPoolExecutor
import requests
import os
def get_page(url):
print("<进程%s> get %s" %(os.getpid(),url))
        respone=requests.get(url)
if respone.status_code == 200:
return {"url":url,"text":respone.text}
def parse_page(res):
        res=res.result()
print("<进程%s> parse %s" %(os.getpid(),res["url"]))
        parse_res="url:<%s> size:[%s]
" %(res["url"],len(res["text"]))
        with open("db.txt","a") as f:
            f.write(parse_res)
if__name__ == "__main__":
        urls=[
"https://www.baidu.com",
"https://www.python.org",
"https://www.openstack.org",
"https://help.github.com/",
"http://www.sina.com.cn/"
        ]
        p=ProcessPoolExecutor(3)
for url in urls:
            p.submit(get_page,url).add_done_callback(parse_page) #parse_page拿到的是一个future对象obj，需要用obj.result()拿到结果

并发爬虫小程序

#### 有那几种IO模型

blocking IO 阻塞IO

nonblocking IO 非阻塞IO

IO multiplexing 多路复用IO 也叫事件驱动IO（event driven IO）

signal driven IO 信号驱动IO

asynchronous IO 异步IO

由signal driven IO（信号驱动IO）在实际中并不常用，所以主要介绍其余四种IO Model。

#### IO模型的区别是在那两个阶段上？

1）wait for data 等待数据准备 (Waiting for the data to be ready)

2）copy data 将数据从内核拷贝到进程中(Copying the data from the kernel to the process)

blocking IO：在IO执行的两个阶段（等待数据和拷贝数据两个阶段）都被block了。

在非阻塞式IO：在等待数据阶段是非阻塞的，用户进程需要不断的主动询问kernel（内核）数据准备好了没有。需要注意，拷贝数据整个过程，进程仍然是属于阻塞的状态。

非阻塞IO模型绝不被推荐。

多路复用IO: 优势在于可以处理多个连接，不适用于单个连接

当用户进程调用了select，那么整个进程会被block，而同时，kernel会“监视”所有select负责的socket，当任何一个socket中的数据准备好了，select就会返回。这个时候用户进程再调用read操作，将数据从kernel拷贝到用户进程。

对于每一个socket，一般都设置成为non-blocking，但是，整个用户的process其实是一直被block的。只不过process是被select这个函数block，

而不是被socket IO给block。

异步IO：都不阻塞。当进程发起IO 操作之后，就直接返回再也不理睬了，直到kernel发送一个信号，告诉进程说IO完成。在这整个过程中，进程完全没有被block。

IO模型

以上是并发编程的全部内容，来源链接： utcz.com/z/530356.html

并发编程

其他人也看了：