Download a web page's HTML as a UTF-8 string

I want to download the HTML of a web page, but when I do, characters such as šđčćž are replaced by sequences like ć¡.

The code I am using:

Dim sourceString As String = New System.Net.WebClient().DownloadString("SomeWebPage") 

TextBox1.Text = sourceString

Answer:

You could download the raw bytes and then use the Encoding class to decode them as UTF-8:

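Async Function GetHtmlString(address As String) As Task(Of String)
    ' Requires Imports System.Net and Imports System.Text.
    Using client As New WebClient
        ' Download the raw bytes so that no default encoding is applied.
        Dim bytes = Await client.DownloadDataTaskAsync(address)
        ' Decode the bytes explicitly as UTF-8.
        Dim s = Encoding.UTF8.GetString(bytes)
        Return s
    End Using
End Function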
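To see why the explicit decoding step matters, here is a minimal sketch that needs no network access. It encodes the characters from the question as UTF-8 and then decodes the same bytes twice. The module name MojibakeDemo is made up for illustration, and ISO-8859-1 is only an assumed stand-in for whatever non-UTF-8 encoding was applied on the asker's machine; the actual encoding there is not known.

```vbnet
Imports System
Imports System.Text

Module MojibakeDemo
    Sub Main()
        Dim original As String = "šđčćž"

        ' The bytes a server would send for this text in a UTF-8 encoded page.
        Dim bytes As Byte() = Encoding.UTF8.GetBytes(original)

        ' Assumption: ISO-8859-1 stands in for the wrong (non-UTF-8) encoding.
        ' Each byte becomes one character, so every two-byte letter turns into two symbols.
        Dim garbled As String = Encoding.GetEncoding("iso-8859-1").GetString(bytes)

        ' Decoding the same bytes as UTF-8 restores the original text.
        Dim decoded As String = Encoding.UTF8.GetString(bytes)

        Console.WriteLine(garbled = original)  ' False
        Console.WriteLine(decoded = original)  ' True
    End Sub
End Module
```

The bytes are identical in both cases; only the encoding used to interpret them differs, which is why downloading the bytes and decoding them yourself avoids the problem.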

A simpler way, thanks to @Dave's comment, is to set the client's Encoding property before downloading the string:

Async Function GetHtmlString(address As String) As Task(Of String)
    ' Requires Imports System.Net and Imports System.Text.
    Using client As New WebClient
        ' Tell WebClient to decode the response as UTF-8.
        client.Encoding = Encoding.UTF8
        Dim s = Await client.DownloadStringTaskAsync(address)
        Return s
    End Using
End Function

Usage example:

Imports System.Net
Imports System.Text

Public Class Form1

    Private Async Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
        Dim s = Await GetHtmlString("http://www.radiomerkury.pl/")
    End Sub

    Async Function GetHtmlString(address As String) As Task(Of String)
        Using client As New WebClient
            client.Encoding = Encoding.UTF8
            Dim s = Await client.DownloadStringTaskAsync(address)
            Return s
        End Using
    End Function

End Class
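Both versions above assume the page really is UTF-8. If that is not guaranteed, one possible refinement is to read the charset from the Content-Type response header and fall back to UTF-8 when none is given. The sketch below is only an illustration of that idea: EncodingFromContentType and CharsetHelper are hypothetical names, not part of WebClient, and the sample header values are invented.

```vbnet
Imports System
Imports System.Net.Mime
Imports System.Text

Module CharsetHelper
    ' Hypothetical helper: picks an Encoding from a Content-Type header value,
    ' falling back to UTF-8 when the header is missing, malformed, or names an unknown charset.
    Function EncodingFromContentType(headerValue As String) As Encoding
        If String.IsNullOrWhiteSpace(headerValue) Then Return Encoding.UTF8
        Try
            Dim parsed As New ContentType(headerValue)
            If String.IsNullOrEmpty(parsed.CharSet) Then Return Encoding.UTF8
            Return Encoding.GetEncoding(parsed.CharSet)
        Catch ex As FormatException
            ' The header could not be parsed as a content type.
            Return Encoding.UTF8
        Catch ex As ArgumentException
            ' The charset name is not a known encoding.
            Return Encoding.UTF8
        End Try
    End Function

    Sub Main()
        ' Invented sample header values, for illustration only.
        Console.WriteLine(EncodingFromContentType("text/html; charset=utf-8").WebName)
        Console.WriteLine(EncodingFromContentType("text/html").WebName)
    End Sub
End Module
```

With the byte-based version of GetHtmlString, the header value would come from client.ResponseHeaders("Content-Type") after the download completes, and the returned encoding would replace the hard-coded Encoding.UTF8 in the GetString call.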

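Answer:

Kibi, I think you are way off track here. I don't see how VB.NET helps you with this kind of thing. Below is a simple and intuitive Excel VBA solution. I hope it helps you reach your goal.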

Sub DumpData()
    Dim IE As Object, itm As Object
    Dim URL As String, RowCount As Long

    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = True
    URL = "http://finance.yahoo.com/q?s=sbux&ql=1"
    IE.Navigate2 URL

    'Wait for site to fully load
    Do While IE.Busy Or IE.readyState <> 4 '4 = READYSTATE_COMPLETE
        DoEvents
    Loop

    With Sheets("Sheet1")
        .Cells.ClearContents
        RowCount = 1
        'One row per element: tag name, id, class name, first 1024 characters of inner text
        For Each itm In IE.document.all
            .Range("A" & RowCount) = itm.tagname
            .Range("B" & RowCount) = itm.ID
            .Range("C" & RowCount) = itm.classname
            .Range("D" & RowCount) = Left(itm.innertext, 1024)
            RowCount = RowCount + 1
        Next itm
    End With
End Sub
