
Reading content from a URL in Scala

argcv, 594 reads

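As mentioned in the previous post, Scala's Source class supports a variety of inputs: besides files, it can also read data from a URL.

The simplest way is Source.fromURL: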


scala> scala.io.Source.fromURL("http://argcv.com").mkString
res1: String =
<!DOCTYPE html>
<html lang="en-US">
<head>
	<title>argcv | enjoy code, enjoy life.</title>
....
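One detail the one-liner above glosses over is that the Source it creates is never closed. A minimal sketch of a variant that always closes it follows; the names `readAll` and `readAllDemo` are hypothetical, and the demo reads a temporary file through a file: URL only so that it can run without network access.

```scala
import java.net.URL
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import scala.io.{Codec, Source}

// Reads everything behind `url` and always closes the Source afterwards.
def readAll(url: URL): String = {
  val source = Source.fromURL(url)(Codec.UTF8)
  try source.mkString
  finally source.close()
}

// Hypothetical offline demo: a temp file exposed through a file: URL.
def readAllDemo(): String = {
  val path = Files.createTempFile("demo", ".html")
  try {
    Files.write(path, "<title>demo</title>".getBytes(StandardCharsets.UTF_8))
    readAll(path.toUri.toURL)
  } finally Files.delete(path)
}
```

Note that this only tidies up resource handling; it does not add any timeout.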

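However, this is not very safe. We once ran into a problem where a server was misbehaving: one of our backends kept trying to connect to it and simply hung, until all of our connections were used up. We should add a timeout.

After tweaking the code a little, I ended up with the following simple function: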
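The hang is consistent with how java.net.URLConnection treats timeouts: its Javadoc states that a timeout of 0 is interpreted as infinite, and a connection whose timeouts were never set should normally report 0 for both. The sketch below (with the hypothetical helper names `timeoutsOf` and `configured`) only inspects and sets these values. `openConnection()` creates the connection object without contacting the server, so nothing is sent over the network.

```scala
import java.net.{URL, URLConnection}

// Returns (connectTimeout, readTimeout) in milliseconds; 0 means "wait forever".
def timeoutsOf(conn: URLConnection): (Int, Int) =
  (conn.getConnectTimeout, conn.getReadTimeout)

// Creates a connection object with both timeouts set, without connecting yet.
def configured(url: String, timeoutMs: Int): URLConnection = {
  val conn = new URL(url).openConnection()
  conn.setConnectTimeout(timeoutMs) // limit for establishing the connection
  conn.setReadTimeout(timeoutMs)    // limit for each blocking read
  conn
}
```

For example, `timeoutsOf(configured("http://argcv.com", 1500))` evaluates to `(1500, 1500)`.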


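  def fromUrlWithTimeout(url: String, timeout: Int = 1500): String = {
    import java.net.URL
    import scala.io.Source
    import scala.util.control.NonFatal
    val conn = new URL(url).openConnection()
    conn.setConnectTimeout(timeout) // milliseconds allowed for connecting
    conn.setReadTimeout(timeout)    // milliseconds allowed for each read
    // Failures while opening the stream (e.g. a connect timeout) propagate to the caller.
    val stream = conn.getInputStream()
    try {
      Source.fromInputStream(stream).mkString
    } catch {
      case NonFatal(_) => "" // a failure while reading the body yields an empty string
    } finally {
      stream.close() // always release the stream, even if reading failed
    }
  }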
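Returning an empty string makes a failed read indistinguishable from an empty page. As a possible alternative, not the post's own method, here is a sketch that returns a `Try[String]` so the caller can see what went wrong. It assumes Scala 2.13 or later for `scala.util.Using`, and the name `tryFromUrl` is hypothetical.

```scala
import java.net.URL
import scala.io.{Codec, Source}
import scala.util.{Try, Using}

// Success(body) on success; Failure(exception) for a malformed URL,
// a timeout, or any other non-fatal error. The Source is always closed.
def tryFromUrl(url: String, timeoutMs: Int = 1500): Try[String] =
  Try {
    val conn = new URL(url).openConnection()
    conn.setConnectTimeout(timeoutMs)
    conn.setReadTimeout(timeoutMs)
    conn
  }.flatMap { conn =>
    // Using closes the Source (and the underlying stream) when done.
    Using(Source.fromInputStream(conn.getInputStream())(Codec.UTF8))(_.mkString)
  }
```

A caller could then write `tryFromUrl("http://argcv.com", 3000).getOrElse("")` to recover the original behavior, or pattern match on `Failure` to log the cause.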

Using it is just as simple:


scala> fromUrlWithTimeout("http://argcv.com",3000)
res2: String =
<!DOCTYPE html>
<html lang="en-US">
<head>
	<title>argcv | enjoy code, enjoy life.</title>
...

Or, with a very small timeout, the result looks like this:


scala> fromUrlWithTimeout("http://argcv.com",100)
java.net.SocketTimeoutException: connect timed out
  at java.net.PlainSocketImpl.socketConnect(Native Method)
  at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
....
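As the trace shows, a timeout surfaces as a java.net.SocketTimeoutException, so the caller decides what to do with it. The sketch below shows one way to fall back to a default value. The names `fetch`, `fetchOrElse` and `timeoutDemo` are hypothetical, and the demo uses a local socket that never answers (an assumption made so the example runs offline) to trigger a read timeout rather than a connect timeout.

```scala
import java.net.{ServerSocket, SocketTimeoutException, URL}
import scala.io.Source

// Same shape as the helper above, repeated so this block stands alone;
// every failure, including a timeout, is thrown to the caller.
def fetch(url: String, timeoutMs: Int): String = {
  val conn = new URL(url).openConnection()
  conn.setConnectTimeout(timeoutMs)
  conn.setReadTimeout(timeoutMs)
  val stream = conn.getInputStream()
  try Source.fromInputStream(stream).mkString
  finally stream.close()
}

// Returns `fallback` when the server is too slow; other errors still propagate.
def fetchOrElse(url: String, timeoutMs: Int, fallback: String): String =
  try fetch(url, timeoutMs)
  catch { case _: SocketTimeoutException => fallback }

// Demo: a local server socket that accepts connections but never responds.
def timeoutDemo(): String = {
  val silent = new ServerSocket(0) // port 0 lets the OS pick a free port
  try fetchOrElse(s"http://127.0.0.1:${silent.getLocalPort}/", 200, "<timed out>")
  finally silent.close()
}
```

Catching only SocketTimeoutException keeps other problems, such as an unknown host, visible instead of silently hiding them.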

-----------------------
To download a file, see here.

Author: argcv
enjoy code, enjoy life.
Original post: Reading content from a URL in Scala. Thanks to the original author for sharing.