Reading HTML content with VB

Use the WebBrowser control.

See the following example.

' Note: this program was obtained from the CSDN forum.
Dim dWinFolder As New ShellWindows
Dim WithEvents eventIE As WebBrowser_V1

Private Sub Command1_Click()
    Dim objIE As Object
    ' Find the browser window whose URL matches the selected list entry
    For Each objIE In dWinFolder
        If objIE.LocationURL = List1.List(List1.ListIndex) Then
            ' Hook the window so its navigation events reach this form
            Set eventIE = objIE
            Command1.Enabled = False
            List1.Enabled = False
            Text1.Text = ""
            Exit For
        End If
    Next
End Sub

Private Sub eventIE_NavigateComplete(ByVal URL As String)
    ' Append each URL the hooked window navigates to
    Text1.Text = Text1.Text & vbCrLf & URL
End Sub

Before running the program, choose Project | References from the menu and, in the Available References list, check Microsoft Internet Controls to add the Internet object reference to the project.

Private Sub Form_Load()
    Dim objIE As Object
    ' List the URL of every open Internet Explorer window
    For Each objIE In dWinFolder
        If InStr(1, objIE.FullName, "iexplore.exe", vbTextCompare) <> 0 Then
            List1.AddItem objIE.LocationURL
        End If
    Next
    Command1.Caption = "Body text"
End Sub

Private Sub Form_Unload(Cancel As Integer)
    Set dWinFolder = Nothing
End Sub

Private Sub List1_Click()
    Dim objDoc As Object
    Dim objIE As Object
    Dim i As Long
    For Each objIE In dWinFolder
        If objIE.LocationURL = List1.List(List1.ListIndex) Then
            Set objDoc = objIE.Document
            ' The all collection is zero-based
            For i = 0 To objDoc.All.Length - 1
                ' tagName is normally reported in upper case, so compare without regard to case
                If StrComp(objDoc.All(i).tagName, "body", vbTextCompare) = 0 Then
                    Text1.Text = objDoc.All(i).innerText
                End If
            Next
            Exit For
        End If
    Next
End Sub
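
The listing above attaches to Internet Explorer windows that are already open. If the page is instead loaded in a WebBrowser control placed on the form itself, a minimal sketch might look like the following. It assumes a form with a WebBrowser control named WebBrowser1 and a TextBox named Text1; the URL is only a placeholder.

```vb
' Sketch only: WebBrowser1, Text1 and the URL are assumptions,
' not part of the original program.
Private Sub Form_Load()
    WebBrowser1.Navigate "http://www.example.com/"
End Sub

Private Sub WebBrowser1_DocumentComplete(ByVal pDisp As Object, URL As Variant)
    ' DocumentComplete also fires for each frame; only react
    ' when the top-level document has finished loading.
    If pDisp Is WebBrowser1.Object Then
        Text1.Text = WebBrowser1.Document.body.innerText
    End If
End Sub
```

Reading the text in DocumentComplete rather than straight after Navigate matters because Navigate returns before the page has loaded.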

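The file uses UTF-8 or some other encoding, so its strings are displayed incorrectly. There are two ways to fix this:

1. Open the file in Notepad, choose Save As, and select the ANSI character set. The file will then open correctly.

2. Load the whole file into a byte array and convert it with the string conversion function StrConv. This only works if you know which encoding the file uses.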
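
The second approach can be sketched roughly as follows, assuming the file is stored in the system's default ANSI code page. ReadAnsiFile is a hypothetical helper name chosen for illustration. StrConv with vbUnicode decodes using the system code page, so this sketch does not handle UTF-8 files.

```vb
' Hypothetical helper: reads a file as raw bytes and decodes it
' with the system's default ANSI code page.
Public Function ReadAnsiFile(ByVal sFile As String) As String
    Dim iFile As Integer
    Dim bData() As Byte

    ' Return an empty string for a missing or empty file
    If Len(sFile) = 0 Then Exit Function
    If Len(Dir$(sFile)) = 0 Then Exit Function
    If FileLen(sFile) = 0 Then Exit Function

    ' Read the whole file into a byte array
    iFile = FreeFile
    Open sFile For Binary Access Read As #iFile
    ReDim bData(0 To LOF(iFile) - 1)
    Get #iFile, , bData
    Close #iFile

    ' Convert the ANSI bytes to a VB (Unicode) string
    ReadAnsiFile = StrConv(bData, vbUnicode)
End Function
```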

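UTF-8 encoding can be handled with the following function:

Pass in a file name and it returns the file's contents. If the file is UTF-8, the returned text is the converted string.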

Public Function ReadUTF8(ByVal sUTF8File As String) As String
    ' Return an empty string if no name was given or the file does not exist
    If Len(sUTF8File) = 0 Or Dir(sUTF8File) = vbNullString Then Exit Function
    Dim ados As Object
    Set ados = CreateObject("ADODB.Stream")
    With ados
        .Charset = "utf-8"
        .Type = 2 ' adTypeText
        .Open
        .LoadFromFile sUTF8File
        ReadUTF8 = .ReadText
        .Close
    End With
    Set ados = Nothing
End Function