Merge branch 'develop', version 0.9.2
cyfdecyf committed Jul 22, 2014
2 parents 5092cd8 + 3ccadb1 commit 97d7925
Showing 19 changed files with 574 additions and 139 deletions.
2 changes: 1 addition & 1 deletion .travis.yml
@@ -1,6 +1,6 @@
language: go
go:
- 1.1
- 1.3
env:
- TRAVIS="yes"
install:
6 changes: 6 additions & 0 deletions CHANGELOG
@@ -1,3 +1,9 @@
0.9.2 (2014-07-23)
* Reduce the possibility of encountering too many open file error
* New connection latency based load balancing
* Fix auto load plist for OS X
* Identify blocked site by HTTP error code

0.9.1 (2013-12-20)
* Fix can't save site stat bug
* Improve install and startup script
96 changes: 96 additions & 0 deletions README-en.md
@@ -0,0 +1,96 @@
# COW (Climb Over the Wall) proxy

COW is an HTTP proxy that simplifies bypassing the Great Firewall. It tries to identify blocked websites automatically and uses a parent proxy only for those sites.

Current version: 0.9.1 [CHANGELOG](CHANGELOG)
[![Build Status](https://travis-ci.org/cyfdecyf/cow.png?branch=develop)](https://travis-ci.org/cyfdecyf/cow)

## Features

- As an HTTP proxy, it can be used by mobile devices
- Supports HTTP, SOCKS5, [shadowsocks](https://github.com/clowwindy/shadowsocks/wiki/Shadowsocks-%E4%BD%BF%E7%94%A8%E8%AF%B4%E6%98%8E) and COW itself as parent proxy
- Supports simple load balancing between multiple parent proxies
- Automatically identifies blocked websites and uses a parent proxy only for those sites
- Generates and serves a PAC file so the browser can bypass COW for best performance
- The PAC file contains domains that can be accessed directly (recorded according to your visit history)

# Quickstart

Install:

- **OS X, Linux (x86, ARM):** Run the following command (also for update)

curl -L git.io/cow | bash

- **Windows:** [download](http://dl.chenyufei.info/cow/)
- If you are familiar with Go, run `go get github.com/cyfdecyf/cow` to install from source.

Edit the configuration file `~/.cow/rc` (Linux) or `rc.txt` (Windows). Here's an example with the most important options:

# Line starting with # is comment and will be ignored
# Local proxy listen address
listen = http://127.0.0.1:7777

# SOCKS5 parent proxy
proxy = socks5://127.0.0.1:1080
# HTTP parent proxy
proxy = http://127.0.0.1:8080
proxy = http://user:password@127.0.0.1:8080
# shadowsocks parent proxy
proxy = ss://aes-128-cfb:password@1.2.3.4:8388
# cow parent proxy
proxy = cow://aes-128-cfb:password@1.2.3.4:8388

The PAC file can be accessed at `http://<listen>/pac`, for the above example: `http://127.0.0.1:7777/pac`.

Command line options override options in the configuration file. For more details, see the output of `cow -h`.
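
The `key = value` lines in the example configuration can be parsed with a few lines of Go. The following is a minimal sketch, not COW's actual parser: `parseLine` is a hypothetical helper name, and the password containing `=` is invented for illustration. It splits on the first `=` only (the same `strings.SplitN(line, "=", 2)` call that `config.go` switches to in this commit), so a value that itself contains `=` is kept intact.

```go
package main

import (
	"fmt"
	"strings"
)

// parseLine splits one "key = value" configuration line.
// It returns ok=false for blank lines, comments, and lines without "=".
// Hypothetical helper for illustration only.
func parseLine(line string) (key, val string, ok bool) {
	line = strings.TrimSpace(line)
	if line == "" || strings.HasPrefix(line, "#") {
		return "", "", false
	}
	// Split on the first "=" only, so the value may contain "=" itself.
	kv := strings.SplitN(line, "=", 2)
	if len(kv) != 2 {
		return "", "", false
	}
	return strings.TrimSpace(kv[0]), strings.TrimSpace(kv[1]), true
}

func main() {
	lines := []string{
		"# Local proxy listen address",
		"listen = http://127.0.0.1:7777",
		"proxy = ss://aes-128-cfb:pass=word@1.2.3.4:8388", // invented password
	}
	for _, l := range lines {
		if k, v, ok := parseLine(l); ok {
			fmt.Printf("%s -> %s\n", k, v)
		}
	}
}
```

With a plain `strings.Split`, the last line would yield three fields and be rejected as a syntax error.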

## Blocked and directly accessible sites list

Ideally, you would not need to specify which sites are blocked and which are not, but COW hasn't reached that goal yet. You may need to specify this manually when COW makes the wrong judgement.

- `~/.cow/blocked` for blocked sites
- `~/.cow/direct` for directly accessible sites
- One line for each domain
- `google.com` means `*.google.com`
- You can use domains like `google.com.hk`
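
The rule that `google.com` stands for `*.google.com` amounts to a suffix match on a label boundary. Below is a minimal sketch under stated assumptions: `matchDomain` is a hypothetical helper, not COW's code, and it assumes that the listed domain itself also matches.

```go
package main

import (
	"fmt"
	"strings"
)

// matchDomain reports whether host is the listed domain or one of its
// subdomains. Hypothetical helper; treating the bare domain as a match
// is an assumption.
func matchDomain(host, domain string) bool {
	host = strings.ToLower(host)
	domain = strings.ToLower(domain)
	// Require a "." before the suffix so "notgoogle.com" does not match.
	return host == domain || strings.HasSuffix(host, "."+domain)
}

func main() {
	fmt.Println(matchDomain("plus.google.com", "google.com"))
	fmt.Println(matchDomain("notgoogle.com", "google.com"))
	fmt.Println(matchDomain("www.google.com.hk", "google.com.hk"))
}
```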

# Technical details

## Visited site recording

COW records all visited hosts and their visit counts in `~/.cow/stat`, which is a JSON file.

- **For an unknown site, COW first tries direct access and uses the parent proxy upon failure. After 2 minutes, it tries direct access again**
- A built-in list of [commonly blocked sites](site_blocked.go) reduces the time needed to discover blockage and switch to the parent proxy
- A host is put into the PAC after a few successful direct visits
- A host always uses the parent proxy after direct access has failed a few times
- To avoid mistakes, COW still tries direct access with some probability
- A host is deleted if it has not been visited for a few days
- Hosts under builtin/manually specified blocked and direct domains will not appear in `stat`
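
The counting rules above can be pictured as a small per-host record. This is only a sketch under loud assumptions: the type `visitCount`, its method names, both thresholds of 3 (the README only says "a few times"), and the counter resets are all invented for illustration and are not taken from COW's source.

```go
package main

import "fmt"

// Hypothetical thresholds: the README only says "a few times".
const (
	directThreshold  = 3 // successful direct visits before a host enters the PAC
	blockedThreshold = 3 // failed direct visits before a host always uses the parent proxy
)

// visitCount is an illustrative per-host record, not COW's actual type.
type visitCount struct {
	Direct  int
	Blocked int
}

// DirectOK records a successful direct visit.
func (v *visitCount) DirectOK() {
	v.Direct++
	v.Blocked = 0 // simplification: one success clears the failure count
}

// DirectFailed records a failed direct visit.
func (v *visitCount) DirectFailed() {
	v.Blocked++
	v.Direct = 0 // simplification: one failure clears the success count
}

// InPAC reports whether the host would be listed in the PAC file.
func (v *visitCount) InPAC() bool { return v.Direct >= directThreshold }

// AlwaysParent reports whether the host would always use the parent proxy.
func (v *visitCount) AlwaysParent() bool { return v.Blocked >= blockedThreshold }

func main() {
	v := &visitCount{}
	for i := 0; i < directThreshold; i++ {
		v.DirectOK()
	}
	fmt.Println("in PAC:", v.InPAC(), "always parent:", v.AlwaysParent())
}
```

The sketch leaves out the probabilistic retry of direct access and the expiry of hosts that have not been visited for a few days.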

## How does COW detect blocked sites

A domain is considered blocked when one of the following errors occurs:

- Server connection reset
- Connection to server timeout
- Read from server timeout

COW retries the HTTP request upon these errors. But if some data has already been sent back to the client, the connection with the client is dropped to signal the error.

A server connection reset is usually a reliable sign of a blocked site, but a timeout is not. COW re-estimates the timeout value every 30 seconds to avoid treating normal sites as blocked when network conditions are bad. Reverting to direct access two minutes after the first blockage is also meant to avoid mistakes.

If the automatic timeout retry causes problems for you, try changing `readTimeout` and `dialTimeout` in the configuration.

# Limitations

- No caching, COW just passes traffic between clients and web servers
- For web browsing, browsers have their own cache
- Blocked site detection is not always reliable

# Acknowledgements

Refer to [README.md](README.md).
12 changes: 7 additions & 5 deletions README.md
@@ -2,7 +2,9 @@

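COW is an HTTP proxy server that simplifies getting over the wall. It automatically detects blocked websites and uses a parent proxy only for those sites.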

Current version: 0.9.1 [CHANGELOG](CHANGELOG)
[English README](README-en.md).

Current version: 0.9.2 [CHANGELOG](CHANGELOG)
[![Build Status](https://travis-ci.org/cyfdecyf/cow.png?branch=master)](https://travis-ci.org/cyfdecyf/cow)

**Development on the develop branch and pull requests are welcome :)**
@@ -26,6 +28,7 @@ COW's design goal is automation; ideally users need not care which websites

curl -L git.io/cow | bash

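- The `COW_INSTALLDIR` environment variable specifies the install path; if it is not a directory, the user is asked
- **Windows:** [download here](http://dl.chenyufei.info/cow/)
- Users familiar with Go can install from source with `go get github.com/cyfdecyf/cow`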

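@@ -59,7 +62,7 @@ COW's design goal is automation; ideally users need not care which websites

Start COW:

- On Unix systems, run `cow &` on the command line
- On Unix systems, run `cow &` on the command line (if COW is not in a directory on `PATH`, run `./cow &`)
- [Linux startup script](doc/init.d/cow); see its comments for usage (tested on Debian, should also work on other Linux distributions)
- Windows
- Double-click `cow-taskbar.exe` to run it hidden in the system tray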
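@@ -83,8 +86,6 @@ The PAC URL is `http://<listen address>/pac`; you can also set the browser's HTTP/HTTPS
- Third-level domains under second-level domains such as `com.hk` and `edu.cn` are treated as second-level domains. For example, `google.com.hk` is equivalent to `*.google.com.hk`
- Other domains and host names at the third level or deeper are matched exactly, for example `plus.google.com`

Note: for private IPv4 addresses and simple host names, COW always connects directly, and the generated PAC also makes the browser access them directly. (So access to localhost and machines on the LAN bypasses COW.)

# Technical details

## Visited site recording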
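@@ -126,10 +127,11 @@ With the default configuration, after detecting blockage COW tries direct access again two minutes later, which is also for
- @tevino: http parent proxy basic authentication
- @xupefei: provided cow-hide.exe to run cow.exe in the background on Windows
- @sunteya: improved the startup and install scripts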
- @fzerorubigd: identify blocked site by HTTP error code

Bug reporter:

- GitHub users: glacjay, trawor, Blaskyy, lucifer9, zellux, xream, hieixu, fantasticfears, perrywky, JayXon, graminc, WingGao, polong, dallascao
- GitHub users: glacjay, trawor, Blaskyy, lucifer9, zellux, xream, hieixu, fantasticfears, perrywky, JayXon, graminc, WingGao, polong, dallascao, luosheng
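- Twitter users: special thanks to @shao222 for repeatedly helping test new versions and reporting many bugs, @xixitalk

@glacjay suggested making COW 0.3 more automatic, which led me to reconsider COW's design goals and change it to the way it has worked since version 0.5.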
34 changes: 22 additions & 12 deletions config.go
@@ -15,7 +15,7 @@ import (
)

const (
- version = "0.9.1"
+ version = "0.9.2"
defaultListenAddr = "127.0.0.1:7777"
)

@@ -24,6 +24,7 @@ type LoadBalanceMode byte
const (
loadBalanceBackup LoadBalanceMode = iota
loadBalanceHash
+ loadBalanceLatency
)

// allow the same tunnel ports as polipo
@@ -59,6 +60,8 @@ type Config struct {
Core int
DetectSSLErr bool

+ HttpErrorCode int
+
// not configurable in config file
PrintVer bool
EstimateTimeout bool // if run estimateTimeout()
@@ -173,7 +176,7 @@ func (p proxyParser) ProxySocks5(val string) {
if err := checkServerAddr(val); err != nil {
Fatal("parent socks server", err)
}
- addParentProxy(newSocksParent(val))
+ parentProxy.add(newSocksParent(val))
}

func (pp proxyParser) ProxyHttp(val string) {
@@ -197,7 +200,7 @@ func (pp proxyParser) ProxyHttp(val string) {

parent := newHttpParent(server)
parent.initAuth(userPasswd)
- addParentProxy(parent)
+ parentProxy.add(parent)
}

// Parse method:passwd@server:port
@@ -235,7 +238,7 @@ func (pp proxyParser) ProxySs(val string) {
}
parent := newShadowsocksParent(server)
parent.initCipher(method, passwd)
- addParentProxy(parent)
+ parentProxy.add(parent)
}

func (pp proxyParser) ProxyCow(val string) {
@@ -258,7 +261,7 @@ func (pp proxyParser) ProxyCow(val string) {
}
config.saveReqLine = true
parent := newCowParent(server, arr[0], arr[1])
- addParentProxy(parent)
+ parentProxy.add(parent)
}

// listenParser provides functions to parse different types of listen addresses
@@ -418,7 +421,7 @@ func (p configParser) ParseHttpParent(val string) {
}
config.saveReqLine = true
http.parent = newHttpParent(val)
- addParentProxy(http.parent)
+ parentProxy.add(http.parent)
http.serverCnt++
configNeedUpgrade = true
}
@@ -444,6 +447,8 @@ func (p configParser) ParseLoadBalance(val string) {
config.LoadBalance = loadBalanceBackup
case "hash":
config.LoadBalance = loadBalanceHash
+ case "latency":
+ config.LoadBalance = loadBalanceLatency
default:
Fatalf("invalid loadBalance mode: %s\n", val)
}
@@ -480,7 +485,7 @@ func (p configParser) ParseShadowSocks(val string) {
Fatal("shadowsocks server", err)
}
shadow.parent = newShadowsocksParent(val)
- addParentProxy(shadow.parent)
+ parentProxy.add(shadow.parent)
shadow.serverCnt++
configNeedUpgrade = true
}
@@ -548,6 +553,10 @@ func (p configParser) ParseCore(val string) {
config.Core = parseInt(val, "core")
}

+ func (p configParser) ParseHttpErrorCode(val string) {
+ config.HttpErrorCode = parseInt(val, "httpErrorCode")
+ }
+
func (p configParser) ParseReadTimeout(val string) {
config.ReadTimeout = parseDuration(val, "readTimeout")
}
@@ -592,7 +601,7 @@ func parseConfig(rc string, override *Config) {
continue
}

- v := strings.Split(line, "=")
+ v := strings.SplitN(line, "=", 2)
if len(v) != 2 {
Fatal("config syntax error on line", n)
}
@@ -653,7 +662,11 @@ func upgradeConfig(rc string, lines []string) {
// comment out original
w.WriteString("#" + line + newLine)
case "httpParent", "shadowSocks", "socksParent":
- parent := parentProxy[proxyId]
+ backPool, ok := parentProxy.(*backupParentPool)
+ if !ok {
+ panic("initial parent pool should be backup pool")
+ }
+ parent := backPool.parent[proxyId]
proxyId++
w.WriteString(parent.genConfig() + newLine)
// comment out original
@@ -716,9 +729,6 @@ func checkConfig() {
if listenProxy == nil {
listenProxy = []Proxy{newHttpProxy(defaultListenAddr, "")}
}
- if len(parentProxy) <= 1 {
- config.LoadBalance = loadBalanceBackup
- }
}

func mkConfigDir() (err error) {
14 changes: 8 additions & 6 deletions config_test.go
@@ -53,15 +53,17 @@ func TestTunnelAllowedPort(t *testing.T) {
}

func TestParseProxy(t *testing.T) {
- parentProxy = nil
- var ok bool
+ pool, ok := parentProxy.(*backupParentPool)
+ if !ok {
+ t.Fatal("parentPool by default should be backup pool")
+ }
cnt := -1

var parser configParser
parser.ParseProxy("http://127.0.0.1:8080")
cnt++

- hp, ok := parentProxy[cnt].proxyConnector.(*httpParent)
+ hp, ok := pool.parent[cnt].ParentProxy.(*httpParent)
if !ok {
t.Fatal("1st http proxy parsed not as httpParent")
}
@@ -71,7 +73,7 @@ func TestParseProxy(t *testing.T) {

parser.ParseProxy("http://user:passwd@127.0.0.2:9090")
cnt++
- hp, ok = parentProxy[cnt].proxyConnector.(*httpParent)
+ hp, ok = pool.parent[cnt].ParentProxy.(*httpParent)
if !ok {
t.Fatal("2nd http proxy parsed not as httpParent")
}
@@ -84,7 +86,7 @@ func TestParseProxy(t *testing.T) {

parser.ParseProxy("socks5://127.0.0.1:1080")
cnt++
- sp, ok := parentProxy[cnt].proxyConnector.(*socksParent)
+ sp, ok := pool.parent[cnt].ParentProxy.(*socksParent)
if !ok {
t.Fatal("socks proxy parsed not as socksParent")
}
@@ -94,7 +96,7 @@ func TestParseProxy(t *testing.T) {

parser.ParseProxy("ss://aes-256-cfb:foobar!@127.0.0.1:1080")
cnt++
- _, ok = parentProxy[cnt].proxyConnector.(*shadowsocksParent)
+ _, ok = pool.parent[cnt].ParentProxy.(*shadowsocksParent)
if !ok {
t.Fatal("shadowsocks proxy parsed not as shadowsocksParent")
}