接口限流常用算法实践

发表于 2020-10-20 更新于 2022-04-15 分类于 PHP

背景

为了更好的说明几种算法，我们举个例子，下文就这个例子分别几种算法实现。

如何限制每分钟访问 /api/books 接口不能超过 120 次？

我们先定义一个接口：

interface RateLimiter
{
    public function access();
}

固定时间窗口算法

固定时间窗口算法又叫计数器算法，逻辑就是对固定时间段内的访问次数进行计数，如果计数结果超过次数限制，则拒绝访问。

固定时间窗口算法的劣势就在于其只关心时间段内的总访问次数，而忽略了瞬间集中请求的问题，换而言之，这种统计方法的粒度太粗了，然而我们无法保证请求在时间段内的分布是平均的。

举个例子，如果 A 用户访问接口的时间分布如下：

时间段	访问次数
00:00 ~ 00:30	20
00:30 ~ 01:00	100
01:00 ~ 01:30	100
01:30 ~ 02:00	20

显然在第一分钟，我们有 120 次请求，第二分钟也是 120 次请求，但是 00:30 ~ 01:30 这一分钟时间内，显然是请求书超过 120 次的，所以，这种情况虽然实现了需求，但是很勉强，粒度不够细。

PHP实现

class FixedWindowRateLimiter implements RateLimiter 
{
    protected $prefix = "ratelimiter:";
    protected $key    = null;
    protected $window = null;
    protected $limit  = null;

    /**
     * @param string  $key      计数KEY
     * @param integer $window   窗口时间（秒）
     * @param integer $limit    次数限制
     */
    public function __construct(string $key, int $window, int $limit)
    {
        $this->key    = $this->prefix . $key;
        $this->window = $window;
        $this->limit  = $limit;
    }

    public function access()
    {
        $count = Redis::get($this->key);
        //首次访问
        if (!$count) {
            Redis::pipeline(function ($pipe) {
                $pipe->incr($this->key);
                $pipe->expire($this->key, $this->window);
            });
            return true;
        }

        //判断计数
        if ($count >= $this->limit) {
            return false;
        }
        Redis::incr($this->key);
        return true;
    }
}

调用：

$rateLimiter = new FixedWindowRateLimiter("api:books", 60, 120);
if (!$rateLimiter->access()) {
    abort(404);
}

Lua实现

--- 资源唯一标识
local key = KEYS[1]
--- 时间窗最大并发数
local max_window_concurrency = tonumber(ARGV[1])  
--- 时间窗
local window = tonumber(ARGV[2])   
--- 时间窗内当前并发数
local curr_window_concurrency = tonumber(redis.call('get', key) or 0)  
if current + 1 > limit then
    return false
else
    redis.call("INCRBY", key,1)    
    if window > -1 then
        redis.call("expire", key,window)    
    end
    return true
end

滑动时间窗口算法

固定时间窗口算法的问题是统计区间太大，限流不够精确，而且在第二个统计区间 01:00 ~ 02:00 时没有考虑与前一个统计区间的关系与影响（第一个区间后半段 + 第二个区间前半段也是一分钟）。

为了解决上面我们提到的临界问题，我们试图把每个统计区间分为更小的统计区间，更精确的统计计数。

如上图所示，我们把每分钟分为三份，每个小格是20秒，对每个小格都会分别计数：

时间段	访问次数
00:00 ~ 00:20	2
00:20 ~ 00:40	2
00:40 ~ 01:00	116
01:00 ~ 01:20	116

第一次统计，我们先看 01:00 ~ 02:00，计数 120 次，没有超过流量限制，然而在下一个 20 秒中，流量请求集中，在 00:20 ~ 01:20 这一分钟的统计范围内，流量超过了限制，则可以限制访问。

计数器算法其实就是滑动窗口算法。只是它没有对时间窗口做进一步地划分，所以只有1格；由此可见，当滑动窗口的格子划分的越多，那么滑动窗口的滚动就越平滑，限流的统计就会越精确。

PHP实现v1

class SlidingWindowRateLimiter implements RateLimiter 
{
    public    $remaining = null;
    protected $prefix    = "ratelimiter:sliding:";
    protected $key       = null;
    protected $window    = null;
    protected $limit     = null;
    protected $blocks    = null;

    /**
     * @param string  $key      计数KEY
     * @param integer $window   窗口时间（秒）
     * @param integer $limit    次数限制
     * @param integer $blocks   窗口分块数量
     */
    public function __construct(string $key, int $window, int $limit, int $blocks)
    {
        $this->key    = $this->prefix . $key;
        $this->window = $window;
        $this->limit  = $limit;
        $this->blocks = $blocks;
    }

    public function access()
    {
        $time           = time();
        $lastBlockTime  = $time - $time % (ceil($this->window / $this->blocks));
        $firstBlockTime = $lastBlockTime - $this->window;

        //移除已经过期的 blocks
        Redis::zremrangebylex($this->key, '[0', '('.$firstBlockTime);
        $blocks    = Redis::zrange($this->key, 0, -1, 'WITHSCORES');
        $scores    = array_sum($blocks);
        $remaining = $this->limit - $scores;
        if ($remaining <= 0) {
            return false;
        }

        Redis::zincrby($this->key, 1, $lastBlockTime);
        $this->remaining = $remaining - 1;
        return true;
    }
}

滑动窗口真的可以解决临界问题么？

我们再来看一个例子，在下面的例子中，流量主要集中于滑动窗口的 last block 和窗口滑动后的 next block。

同样的请求情况，我们先把一个滑动窗口分为三格，窗口滑动前后都是符合限流要求的；然后我们把原来的每一格都分为两格，如图2，那么此时，窗口滑动前后的状况就不同了。

由此可见，滑动窗口算法很难完全符合限流需求，除非我们把统计区间变得更小，更甚至细化到每个请求，随之带来的问题是我们需要消耗更多存储空间。

PHP实现v2

class SlidingWindowRateLimiter implements RateLimiter 
{
    public    $remaining = null;
    protected $prefix    = "ratelimiter:sliding:";
    protected $key       = null;
    protected $window    = null;
    protected $limit     = null;
    protected $blocks    = null;

    /**
     * @param string  $key      计数KEY
     * @param integer $window   窗口时间（秒）
     * @param integer $limit    次数限制
     * @param integer $blocks   窗口分块数量
     */
    public function __construct(string $key, int $window, int $limit, int $blocks)
    {
        $this->key    = $this->prefix . $key;
        $this->window = $window;
        $this->limit  = $limit;
        $this->blocks = $blocks;
    }

    public function access()
    {
        $now = microtime(true);
        $result = Redis::pipeline(function ($pipe) use ($now) {
            $pipe->zrangebyscore($this->key, 0, $now - $this->window);
            $pipe->zrange($this->key, 0, -1);
            $pipe->zadd($this->key, $now, $now);
            $pipe->expire($this->key, $this->window);
        });

        // The second command inside the transaction was ZRANGE,
        // which returns a list of timestamps within the last hour.
        $timestamps = $result[1];

        $this->remaining = max(0, $this->limit - count($timestamps));

        return $this->remaining > 0;
    }
}

Lua实现v2

local token = KEYS[1]
local now = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local limit = tonumber(ARGV[3])

local clearBefore = now - window
redis.call('ZREMRANGEBYSCORE', token, 0, clearBefore)

local amount = redis.call('ZCARD', token)
if amount < limit then
    redis.call('ZADD', token, now, now)
end
redis.call('EXPIRE', token, window)

return limit - amount

漏桶算法

漏桶算法（Leaky Bucket）是什么呢？大家都用过水龙头，打开龙头开关水就会流下滴到水桶里，而漏桶指的是水桶下面有个漏洞可以出水。如果水龙头开的特别大那么水流速就会过大，这样就可能导致水桶的水满了然后溢出。

而我们讨论的漏桶算法的思路也很简单，水龙头打开后流下的水（请求）以一定的速率流到漏桶里（限流容器），漏桶以一定的速度出水（接口响应速率），如果水流速度过大（请求过多）就可能会导致漏桶的水溢出（访问频率超过接口响应速率），这时候我们需要关掉水龙头（拒绝请求），下面是经典的漏桶算法图示：

PHP实现v3

class LeakyBucketRateLimiter implements RateLimiter 
{
    protected $prefix   = "ratelimiter:leaky:";
    protected $key      = null;
    protected $rate     = null;
    protected $capacity = null;

    /**
     * @param string  $key      计数KEY
     * @param integer $rate     请求处理速率（个/秒）
     * @param int     $capacity 漏桶容量（最大队列长度）
     */
    public function __construct(string $key, int $rate, int $capacity)
    {
        $this->capacity = $capacity;
        $this->key      = $this->prefix . $key;
        $this->rate     = $rate;
    }

    public function access()
    {
        //当前时间
        $time = time();
        [$lastTime, $tokensPend] = Redis::hmget($this->key, 'last_time', 'tokens_pend');
        //推算已经处理过的数量
        $tokensHandled = ($time - $lastTime) * $rate;
        //当前剩余 tokens 数量（队列待处理）
        $tokensPend = max(0, $tokensPend - $tokensHandled);
        //更新上次请求的时间
        Redis::hSet($this->key, 'last_time', $time);
        //判断漏桶是否满了
        if ($tokensPend < $this->capacity) {
            //可以加入队列等待处理
            Redis::hSet($this->key, 'tokens_pend', $tokensPend + 1);
            return true;
        } else {
            //队列已经满了
            return false;
        }
    }

}

备注：

代码中对 redis 的操作存在并发问题，代码仅供参考程序逻辑，使用 redis+lua 脚本才是更靠谱的方案。
代码仅处理了漏桶满时溢出、漏桶不满时允许进入队列，但是实际请求的处理，还需要后续使用队列或延时处理保证漏桶流出的速率。

调用：

$mobile      = "13888888888";
$rateLimiter = new LeakyBucketRateLimiter(sprintf("api:sms:send:%s", $mobile), 2, 5);
if (!$rateLimiter->access()) {
    abort(404);
}
Queue::smsSend($mobile, $sms);

漏桶算法中必须保证请求处理速率是恒定的，不然限流就只能做到『漏桶满时溢出』的程度了。
因此，在我看来，这种限流处理后置的算法，可能更适合异步任务或者离线处理。

Lua实现v3

local token = KEYS[1]
local now = tonumber(ARGV[1])

local rate = tonumber(ARGV[2])
local capacity = tonumber(ARGV[3])

local lastTime = tonumber(redis.call('HGET', token, 'last_time') or 0)
local tokensPend = tonumber(redis.call('HGET', token, 'tokens_pend') or 0)

tokensPend = math.max(0, tokensPend - (now - lastTime) * rate)
redis.call('HSET', token, 'last_time', now)
if tokensPend < capacity then
    redis.call('HSET', token, 'tokens_pend', tokensPend + 1)
    return true
end

return false

漏桶算法有以下特点：

漏桶容量固定，漏洞出水速率是固定的（请求处理速度固定）
流量进入漏桶的速率是没有限制的（但不会立即处理，需要排队等待经过漏洞）
桶满了以后会溢出（拒绝新的请求）

漏桶算法限制的关键『请求处理速度』，即使遇到突发流量，我们的处理速度依然不变，因此会有一部分请求在排队，会等待延迟处理，但结果输出始终不会超出『请求处理速度』的限制。

令牌桶算法

令牌桶算法用来控制发送到网络上的数据的数目，并允许突发数据的发送。

令牌桶算法和漏桶算法的方向刚好是相反的，我们有一个固定的桶，桶里存放着令牌（token）。一开始桶是空的，系统按固定的时间（rate）往桶里添加令牌，直到桶里的令牌数满，多余的请求会被丢弃。当请求来的时候，从桶里移除一个令牌，如果桶是空的则拒绝请求或者阻塞。

PHP实现v4

class tokenBucketRateLimiter implements RateLimiter 
{
    protected $prefix   = "ratelimiter:tokenbucket:";
    protected $key      = null;
    protected $rate     = null;
    protected $capacity = null;

    /**
     * @param string  $key      计数KEY
     * @param integer $rate     产生令牌的速率（个/秒）
     * @param int     $capacity 漏桶容量
     */
    public function __construct(string $key, int $rate, int $capacity)
    {
        $this->capacity = $capacity;
        $this->key      = $this->prefix . $key;
        $this->rate     = $rate;
    }

    public function access()
    {
        //当前时间
        $time = time();
        [$lastTime, $tokensLeft] = Predis::hmget($this->key, 'last_time', 'tokens_left');
        //推算新产生的token数量
        $tokensAppend = ($time - $lastTime) * $rate;
        //当前剩余 tokens 数量
        $tokensLeft = min($this->capacity, $tokensLeft + $tokensAppend);
        //更新上次请求的时间
        Predis::hSet($this->key, 'last_time', $time);
        //判断token是否足够
        if ($tokensLeft < 1) {
            return false;
        } else {
            Predis::hSet($this->key, 'tokens_left', $tokensPend - 1);
            return true;
        }
    }
}

Lua实现v4

local token = KEYS[1]
local now = tonumber(ARGV[1])

local rate = tonumber(ARGV[2])
local capacity = tonumber(ARGV[3])

local lastTime = tonumber(redis.call('HGET', token, 'last_time') or 0)
local tokensLeft = tonumber(redis.call('HGET', token, 'tokens_left') or 0)

tokensLeft = math.min(capacity, tokensLeft + (now - lastTime) * rate)
redis.call('HSET', token, 'last_time', now)
if tokensLeft >= 1 then
    redis.call('HSET', token, 'tokens_left', tokensLeft - 1)
    return true
end

return false

往令牌桶中以固定速率增加令牌
令牌桶的容量有限，如果桶满了，新令牌溢出丢弃
如果桶中可用令牌不足时，该请求会被限流

令牌桶算法中流量的流入速率是不限制的，流出速率也是没有强制限制的（令牌消耗瞬间峰值是桶的大小），但是令牌的产生和积攒需要时间，因此既支持一定程度突发流量，又能满足总体限流的目标。

对比漏桶算法和令牌桶算法，我们发现，漏桶算法能够强行限制数据的传输速率，而令牌桶算法在能够限制数据的平均传输速率外，还允许某种程度的突发传输。