In this blog post, you will learn how to write rate-limit protection for a Spring Boot server that runs in multiple instances, using bucket4j and Redis. Rate limiting means that there is an upper threshold for how many calls per timeframe a client may make against all server instances combined. If the rate at which a client calls the servers is too high, the calls are rejected.

Rate limiting can be implemented in different scenarios: in the infrastructure (for example, in an API gateway) or in the application itself, and for a single server instance or for several instances at once.

For the remainder of this text, we assume that we want to solve rate limiting at the application level and that we have multiple server instances.

The idea

We use a token bucket. That means we have a bucket per calling client, which we refill regularly by adding a fixed number of tokens to the remaining ones. For each request against any server instance, we remove one token. If no tokens are left, all servers return an HTTP 429 until tokens become available again. This idea has already been implemented by bucket4j. The library is widely used, but keep in mind that by using it you add a dependency that is the work of only two contributors and that has changed its major version several times. The good news is that it provides integrations with several distributed storage providers. The distributed storage is needed to share between the server instances how many requests a client has already made. For example, bucket4j allows you to use JDBC-based remote storage such as PostgreSQL, MySQL, or MariaDB.
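
To make the token-bucket idea concrete before any distributed storage enters the picture, here is a minimal, single-instance sketch using a purely local bucket4j bucket; class and variable names are illustrative:

import io.github.bucket4j.Bandwidth;
import io.github.bucket4j.Bucket;

import java.time.Duration;

public class LocalBucketSketch {
    public static void main(String[] args) {
        // A bucket holding at most 200 tokens, refilled with 200 tokens per minute.
        Bucket bucket = Bucket.builder()
            .addLimit(Bandwidth.simple(200L, Duration.ofMinutes(1L)))
            .build();

        // Each request costs one token; once the bucket is empty, callers are rejected.
        if (bucket.tryConsume(1)) {
            System.out.println("request allowed");
        } else {
            System.out.println("request rejected (would be an HTTP 429)");
        }
    }
}

With multiple server instances, the missing piece is sharing this bucket state, which is exactly what the distributed storage integrations provide.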

In this blog post, instead of a JDBC storage, we use Redis, for which bucket4j supports several clients: Redisson, Jedis, and Lettuce. We use Lettuce.

So we have a Redis server and several instances of the same Spring Boot server with a REST endpoint that needs rate-limit protection. For each client calling our Spring Boot application, we create a bucket using a bucket configuration that refills the bucket with 200 tokens per minute, and a ProxyManager that creates a proxy through which the bucket is shared between the server instances via the Redis database. A servlet filter intercepts all calls to any endpoint of our Spring Boot server and checks that the rate limit is kept. Only then is the request forwarded to the Controller; otherwise, an HTTP 429 is returned. If you need to check the rate limit only on certain endpoints, you can use an aspect and an annotation instead of a filter; check out this blog post, which shows the details of that implementation.

Starting Point: A plain Controller

Let’s assume we have a Spring Boot Server with the following endpoint:

package com.innoq.test;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api")
public class MyRestController {
    @Autowired
    MyService myService;

    @PostMapping
    public MyResponse processRequest(@RequestBody final MyRequest request) {
        return myService.process(request);
    }
}

Let’s assume MyRequest and MyResponse are simple JavaBeans. Spring Boot uses Jackson’s ObjectMapper to translate the incoming JSON into an instance of MyRequest, and after the call, the returned MyResponse is translated back to JSON. The content of myService.process() is not relevant here; it only matters that it returns a MyResponse.
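
For completeness, here is a hypothetical sketch of what these two beans could look like; the question and answer fields are made up for this example (the question field matches the curl example further below):

package com.innoq.test;

// MyRequest.java: a simple JavaBean that Jackson can populate from the incoming JSON.
public class MyRequest {
    private String question;

    public String getQuestion() { return question; }
    public void setQuestion(String question) { this.question = question; }
}

package com.innoq.test;

// MyResponse.java: a simple JavaBean that Jackson serializes back to JSON.
public class MyResponse {
    private String answer;

    public MyResponse() {}
    public MyResponse(String answer) { this.answer = answer; }

    public String getAnswer() { return answer; }
    public void setAnswer(String answer) { this.answer = answer; }
}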

bucket4j

Add the following dependencies:

<dependency>
    <groupId>com.bucket4j</groupId>
    <artifactId>bucket4j-core</artifactId>
    <version>8.9.0</version>
</dependency>
<dependency>
    <groupId>com.bucket4j</groupId>
    <artifactId>bucket4j-redis</artifactId>
    <version>8.9.0</version>
</dependency>
<dependency>
    <groupId>io.lettuce</groupId>
    <artifactId>lettuce-core</artifactId>
    <version>6.3.1.RELEASE</version>
</dependency>

Run Redis locally:

docker pull redis:latest
docker run -d -p 6379:6379 --name rate-limit redis
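
To verify that the Redis server is up, you can ping it through the container:

docker exec -it rate-limit redis-cli ping
# prints: PONG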

Configure bucket4j to connect to Redis via Lettuce and to define the refill rate of the buckets:

package com.innoq.test;

import io.github.bucket4j.Bandwidth;
import io.github.bucket4j.BucketConfiguration;
import io.github.bucket4j.distributed.ExpirationAfterWriteStrategy;
import io.github.bucket4j.distributed.proxy.ProxyManager;
import io.github.bucket4j.redis.lettuce.cas.LettuceBasedProxyManager;
import io.lettuce.core.RedisClient;
import io.lettuce.core.RedisURI;
import io.lettuce.core.api.StatefulRedisConnection;
import io.lettuce.core.codec.ByteArrayCodec;
import io.lettuce.core.codec.RedisCodec;
import io.lettuce.core.codec.StringCodec;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import java.time.Duration;
import java.util.function.Supplier;

@Configuration
public class RedisConfig {
    private RedisClient redisClient() {
        return RedisClient.create(RedisURI.builder()
            .withHost("localhost")
            .withPort(6379)
            .withSsl(false)
            .build());
    }

    @Bean
    public ProxyManager<String> lettuceBasedProxyManager() {
        RedisClient redisClient = redisClient();
        StatefulRedisConnection<String, byte[]> redisConnection = redisClient
            .connect(RedisCodec.of(StringCodec.UTF8, ByteArrayCodec.INSTANCE));

        // Share bucket state between instances via Redis; keys expire once the
        // bucket would be full again, so stale entries are cleaned up automatically.
        return LettuceBasedProxyManager.builderFor(redisConnection)
            .withExpirationStrategy(
                ExpirationAfterWriteStrategy.basedOnTimeForRefillingBucketUpToMax(Duration.ofMinutes(1L)))
            .build();
    }

    @Bean
    public Supplier<BucketConfiguration> bucketConfiguration() {
        // Refill 200 tokens per minute; each request consumes one token in the filter.
        return () -> BucketConfiguration.builder()
            .addLimit(Bandwidth.simple(200L, Duration.ofMinutes(1L)))
            .build();
    }
}

The filter

Now we write a filter that every call to any endpoint of this Spring Boot application has to pass before the actual endpoint is invoked. We use the client’s remote address as the bucket key. The build() method of the bucket4j library only creates a new bucket if none exists yet for the given key.

package com.innoq.test;

import io.github.bucket4j.Bucket;
import io.github.bucket4j.BucketConfiguration;
import io.github.bucket4j.ConsumptionProbe;
import io.github.bucket4j.distributed.proxy.ProxyManager;
import jakarta.servlet.Filter;
import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.ServletRequest;
import jakarta.servlet.ServletResponse;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.core.annotation.Order;
import org.springframework.stereotype.Component;

import java.io.IOException;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

@Component
@Order(1)
public class RateLimitFilter implements Filter {

    private final Logger LOG = LoggerFactory.getLogger(getClass());

    @Autowired
    Supplier<BucketConfiguration> bucketConfiguration;

    @Autowired
    ProxyManager<String> proxyManager;

    @Override
    public void doFilter(ServletRequest servletRequest, ServletResponse servletResponse, FilterChain filterChain) throws IOException, ServletException {
        HttpServletRequest httpRequest = (HttpServletRequest) servletRequest;
        // The remote address identifies the client; behind a reverse proxy you would
        // typically derive the key from a forwarded header instead.
        String key = httpRequest.getRemoteAddr();
        // build() only creates a new bucket if none exists yet for this key.
        Bucket bucket = proxyManager.builder().build(key, bucketConfiguration);

        ConsumptionProbe probe = bucket.tryConsumeAndReturnRemaining(1);
        LOG.debug(">>>>>>>>remainingTokens: {}", probe.getRemainingTokens());
        if (probe.isConsumed()) {
            filterChain.doFilter(servletRequest, servletResponse);
        } else {
            HttpServletResponse httpResponse = (HttpServletResponse) servletResponse;
            httpResponse.setContentType("text/plain");
            // Tell the client how many seconds to wait until tokens are available again.
            httpResponse.setHeader("X-Rate-Limit-Retry-After-Seconds",
                String.valueOf(TimeUnit.NANOSECONDS.toSeconds(probe.getNanosToWaitForRefill())));
            httpResponse.setStatus(429);
            httpResponse.getWriter().append("Too many requests");
        }
    }
}

Execution

To run a local test with multiple instances, create an application.properties with

server.port=8080

and an application-dev.properties with

server.port=8081

If you start the Spring Boot application twice, once with the JVM argument -Dspring.profiles.active=dev and once without it, you will have two instances of your server running, one on port 8080 and the other on port 8081. Then you can use curl on the command line against port 8080 or 8081.
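
For example, assuming your build produces an executable jar (the file name below is made up), the two instances can be started like this:

java -jar target/rate-limit-demo.jar
java -Dspring.profiles.active=dev -jar target/rate-limit-demo.jar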

curl --location --request POST 'http://localhost:8080/api' \
--header 'Content-Type: application/json' \
--data-raw '{
    "question": "How do you do?"
}'

If you call the curl command too often, you will get an HTTP 429. Even if you switch the port to 8081, you will still get an HTTP 429 until tokens become available again, since both server instances use the same bucket shared via Redis. Once the duration specified in the bucket configuration has passed, you should get an HTTP 200 again.
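
To trigger the limit quickly, you can fire more than 200 requests in a loop and watch the returned status codes flip from 200 to 429:

for i in $(seq 1 210); do
  curl -s -o /dev/null -w "%{http_code}\n" \
    --request POST 'http://localhost:8080/api' \
    --header 'Content-Type: application/json' \
    --data-raw '{ "question": "How do you do?" }'
done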

Note that servlet filters like the one used in this solution are executed sequentially and synchronously. If you use reactive programming or prefer an asynchronous solution, it is worth checking out the bucket4j documentation on asynchronous usage; the LettuceBasedProxyManager used here supports an asynchronous API as well. A solution based on JCache, however, is necessarily synchronous. The next section presents how to use a JCache-based solution with bucket4j.
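
As a rough sketch of the asynchronous variant (note: the exact async builder signature may differ between bucket4j versions, so consult the documentation for the version you use), the consumption step could look like this:

import io.github.bucket4j.BucketConfiguration;
import io.github.bucket4j.distributed.AsyncBucketProxy;
import io.github.bucket4j.distributed.proxy.ProxyManager;

import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

public class AsyncRateLimitSketch {
    // Assumption: proxyManager and bucketConfiguration are wired up as in the filter above.
    void tryConsumeAsync(ProxyManager<String> proxyManager,
                         Supplier<BucketConfiguration> bucketConfiguration,
                         String key) {
        AsyncBucketProxy asyncBucket = proxyManager.asAsync().builder()
            .build(key, () -> CompletableFuture.completedFuture(bucketConfiguration.get()));

        // tryConsume returns a CompletableFuture instead of blocking the caller.
        asyncBucket.tryConsume(1).thenAccept(consumed -> {
            if (!consumed) {
                // reject the request with an HTTP 429 here
            }
        });
    }
}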

Alternative: Blazing Cache

bucket4j supports any JCache (JSR 107) solution and provides a CompatibilityTest that allows you to verify that a chosen JCache provider is compatible with bucket4j. As suggested by our reader Jacopo, let’s see how to adapt the above solution so that it uses the compatible Blazing Cache instead of Redis.

<dependency>
    <groupId>com.bucket4j</groupId>
    <artifactId>bucket4j-jcache</artifactId>
    <version>8.9.0</version>
</dependency>
<dependency>
    <groupId>javax.cache</groupId>
    <artifactId>cache-api</artifactId>
    <version>1.1.1</version>
</dependency>
<dependency>
    <groupId>org.blazingcache</groupId>
    <artifactId>blazingcache-jcache</artifactId>
    <version>3.3.0</version>
</dependency>

Then replace the RedisConfig above with the following JCache-based configuration:

package com.innoq.test;

import io.github.bucket4j.Bandwidth;
import io.github.bucket4j.BucketConfiguration;
import io.github.bucket4j.Refill;
import io.github.bucket4j.distributed.proxy.ProxyManager;
import io.github.bucket4j.grid.jcache.JCacheProxyManager;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import javax.cache.Cache;
import javax.cache.CacheManager;
import javax.cache.Caching;
import javax.cache.configuration.MutableConfiguration;
import javax.cache.spi.CachingProvider;
import java.time.Duration;
import java.util.Properties;
import java.util.function.Supplier;

@Configuration
public class JCacheConfig {

    @Bean
    public Cache cache() {
        CachingProvider provider = Caching.getCachingProvider();
        Properties properties = new Properties();
        properties.put("blazingcache.mode","static");
        properties.put("blazingcache.jmx","false"); //default value
        properties.put("blazingcache.server.host","localhost"); // default value
        properties.put("blazingcache.server.port","1025"); // default value
        properties.put("blazingcache.server.ssl","false"); // default value
        CacheManager cacheManager = provider.getCacheManager(provider.getDefaultURI(), provider.getDefaultClassLoader(), properties);
        MutableConfiguration<String, String> cacheConfiguration = new MutableConfiguration<>();
        final Cache alreadyExistingCache = cacheManager.getCache("rate-limit-bucket");
        return alreadyExistingCache != null ? alreadyExistingCache : cacheManager.createCache("rate-limit-bucket", cacheConfiguration);
    }

    @Bean
    public ProxyManager<String> jCacheProxyManager(Cache cache) {
        return new JCacheProxyManager<>(cache);
    }

    @Bean
    public Supplier<BucketConfiguration> bucketConfiguration() {
        Refill refill = Refill.intervally(200, Duration.ofMinutes(1));
        return () -> BucketConfiguration.builder()
            .addLimit(Bandwidth.classic(200L, refill))
            .build();
    }
}

At the time of writing, getting a Blazing Cache server running was a little tricky. The documentation says to download the BlazingCache package, but this could not be found. However, one can download the sources, run mvn install -Dmaven.test.skip=true in the subdirectory blazingcache-service, and then obtain the zip file referred to in the documentation, i.e. blazingcache-3.3.0.zip, from the target directory of the build.
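
Under the assumption that the sources live in the diennea/blazingcache repository on GitHub (please double-check the URL before use), the steps look roughly like this:

git clone https://github.com/diennea/blazingcache.git
cd blazingcache/blazingcache-service
mvn install -Dmaven.test.skip=true
# the zip referred to in the documentation is now in the target directory:
ls target/blazingcache-3.3.0.zip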

Conclusion

This blog post presented a solution for implementing rate limiting across multiple instances of a Spring Boot server. Depending on the given project requirements and constraints, readers may choose between different valid approaches. This blog post used bucket4j and showed how it can be combined with Redis or with Blazing Cache as a JCache provider. Based on that, it should be easy for readers to adapt the solution presented here if another distributed storage provider is preferred. Other interesting alternatives to consider are Resilience4j or the even simpler solution presented in this other blog post.