Download the HTML of a web page as a UTF-8 string
I want to download the inner HTML of a web page, but when I do, characters such as šđčćž are replaced by ć¡ and the like.
The code I'm using:
Dim sourceString As String = New System.Net.WebClient().DownloadString("SomeWebPage")
TextBox1.Text = sourceString
Answer:
You can download the raw bytes and then use the Encoding class to decode them as a UTF-8 string:
Async Function GetHtmlString(address As String) As Task(Of String)
    Using client As New WebClient
        ' Download the raw bytes, then decode them explicitly as UTF-8
        Dim bytes = Await client.DownloadDataTaskAsync(address)
        Dim s = Encoding.UTF8.GetString(bytes)
        Return s
    End Using
End Function
A simpler way, thanks to @Dave's comment:
Async Function GetHtmlString(address As String) As Task(Of String)
    Using client As New WebClient
        ' Tell WebClient to decode the response as UTF-8
        client.Encoding = Encoding.UTF8
        Dim s = Await client.DownloadStringTaskAsync(address)
        Return s
    End Using
End Function
Usage example:
Imports System.Net
Imports System.Text
Public Class Form1
    Private Async Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
        Dim s = Await GetHtmlString("http://www.radiomerkury.pl/")
    End Sub
    Async Function GetHtmlString(address As String) As Task(Of String)
        Using client As New WebClient
            client.Encoding = Encoding.UTF8
            Dim s = Await client.DownloadStringTaskAsync(address)
            Return s
        End Using
    End Function
End Class
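To see why setting the encoding matters, here is a minimal console sketch that needs no network access. It assumes the page is served as UTF-8 and that the garbling in the question comes from decoding those bytes with a single-byte encoding; ISO-8859-1 is used here only as a stand-in, since the encoding WebClient actually falls back to depends on the machine and on the response headers. The module name MojibakeDemo is made up for illustration.

```vb
Imports System
Imports System.Text

Module MojibakeDemo
    Sub Main()
        Dim original As String = "šđčćž"

        ' These are the bytes a UTF-8 page would send for the text above.
        Dim utf8Bytes As Byte() = Encoding.UTF8.GetBytes(original)

        ' Assumption: decoding with a single-byte encoding (ISO-8859-1 as a
        ' stand-in) turns every byte into its own character, which garbles
        ' the multi-byte UTF-8 sequences.
        Dim garbled As String = Encoding.GetEncoding("iso-8859-1").GetString(utf8Bytes)

        ' Decoding the same bytes as UTF-8 restores the original text.
        Dim decoded As String = Encoding.UTF8.GetString(utf8Bytes)

        Console.WriteLine("Characters in original: " & original.Length)
        Console.WriteLine("Characters when misdecoded: " & garbled.Length)
        Console.WriteLine("UTF-8 round trip matches: " & (decoded = original))
    End Sub
End Module
```

Each of the five characters takes two bytes in UTF-8, so the misdecoded string comes out twice as long as the original, while the UTF-8 round trip matches exactly.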
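Answer:
Kibi, I think you're way, way, way off here. I don't see how VB.NET helps you with this kind of thing. Below is a simple, intuitive Excel VBA solution. I hope it helps you achieve your goal.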
Sub DumpData()
    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = True
    URL = "http://finance.yahoo.com/q?s=sbux&ql=1"
    IE.Navigate2 URL
    'Wait for the site to fully load (ReadyState 4 = complete)
    Do While IE.Busy Or IE.ReadyState <> 4
        DoEvents
    Loop
    With Sheets("Sheet1")
        .Cells.ClearContents
        RowCount = 1
        'Write one row per element: tag name, id, class name and the first 1024 characters of its text
        For Each itm In IE.document.all
            .Range("A" & RowCount) = itm.tagname
            .Range("B" & RowCount) = itm.ID
            .Range("C" & RowCount) = itm.classname
            .Range("D" & RowCount) = Left(itm.innertext, 1024)
            RowCount = RowCount + 1
        Next itm
    End With
End Sub