记一次 SpringBoot 中文乱码问题调查
作者:mmseoamin日期:2023-12-18

现象

  • 现象是请求中的中文保存到数据库后会乱码。
  • 乱码的字符并不是什么特殊字符。
  • 删除了乱码字符前面的字符后,乱码的地方会向后偏移。

    调查过程

    第一反应是数据库字段的字符集设置导致的,但修改成 utf8mb4 字符集后问题依旧。

    通过本地调试发现,直接请求接口的字符串并没有乱码。

    通过测试环境日志发现,Controller 接收到的参数中字符串已经乱码了。

    测试环境和开发环境的区别是其请求是通过网关转发的。

    调查网关后,发现其中有一个 Filter 曾对请求内容进行了转码处理。具体代码如下:

    java复制代码import java.nio.charset.StandardCharsets;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty;
    import org.springframework.cloud.gateway.filter.GatewayFilterChain;
    import org.springframework.cloud.gateway.filter.GlobalFilter;
    import org.springframework.core.Ordered;
    import org.springframework.core.io.buffer.DataBuffer;
    import org.springframework.core.io.buffer.DataBufferUtils;
    import org.springframework.core.io.buffer.NettyDataBufferFactory;
    import org.springframework.http.HttpHeaders;
    import org.springframework.http.HttpMethod;
    import org.springframework.http.MediaType;
    import org.springframework.http.server.reactive.ServerHttpRequest;
    import org.springframework.http.server.reactive.ServerHttpRequestDecorator;
    import org.springframework.stereotype.Component;
    import org.springframework.web.server.ServerWebExchange;
    import io.netty.buffer.ByteBufAllocator;
    import reactor.core.publisher.Flux;
    import reactor.core.publisher.Mono;
    /**
     * 跨站脚本过滤器
     */
    @Component
    @ConditionalOnProperty(value = "security.xss.enabled", havingValue = "true")
    public class XssFilter implements GlobalFilter, Ordered
    {
        // 跨站脚本的 xss 配置,nacos自行添加
        @Autowired
        private XssProperties xss;
        @Override
        public Mono filter(ServerWebExchange exchange, GatewayFilterChain chain)
        {
            ServerHttpRequest request = exchange.getRequest();
            // GET DELETE 不过滤
            HttpMethod method = request.getMethod();
            if (method == null || method.matches("GET") || method.matches("DELETE"))
            {
                return chain.filter(exchange);
            }
            // 非json类型,不过滤
            if (!isJsonRequest(exchange))
            {
                return chain.filter(exchange);
            }
            // excludeUrls 不过滤
            String url = request.getURI().getPath();
            if (StringUtils.matches(url, xss.getExcludeUrls()))
            {
                return chain.filter(exchange);
            }
            ServerHttpRequestDecorator httpRequestDecorator = requestDecorator(exchange);
            return chain.filter(exchange.mutate().request(httpRequestDecorator).build());
        }
        private ServerHttpRequestDecorator requestDecorator(ServerWebExchange exchange)
        {
            ServerHttpRequestDecorator serverHttpRequestDecorator = new ServerHttpRequestDecorator(exchange.getRequest())
            {
                @Override
                public Flux getBody()
                {
                    Flux body = super.getBody();
                    return body.map(dataBuffer -> {
                        byte[] content = new byte[dataBuffer.readableByteCount()];
                        dataBuffer.read(content);
                        DataBufferUtils.release(dataBuffer);
                        String bodyStr = new String(content, StandardCharsets.UTF_8);
                        // 防xss攻击过滤
                        bodyStr = EscapeUtil.clean(bodyStr);
                        // 转成字节
                        byte[] bytes = bodyStr.getBytes();
                        NettyDataBufferFactory nettyDataBufferFactory = new NettyDataBufferFactory(ByteBufAllocator.DEFAULT);
                        DataBuffer buffer = nettyDataBufferFactory.allocateBuffer(bytes.length);
                        buffer.write(bytes);
                        return buffer;
                    });
                }
                @Override
                public HttpHeaders getHeaders()
                {
                    HttpHeaders httpHeaders = new HttpHeaders();
                    httpHeaders.putAll(super.getHeaders());
                    // 由于修改了请求体的body,导致content-length长度不确定,因此需要删除原先的content-length
                    httpHeaders.remove(HttpHeaders.CONTENT_LENGTH);
                    httpHeaders.set(HttpHeaders.TRANSFER_ENCODING, "chunked");
                    return httpHeaders;
                }
            };
            return serverHttpRequestDecorator;
        }
        /**
         * 是否是Json请求
         *
         * @param
         */
        public boolean isJsonRequest(ServerWebExchange exchange)
        {
            String header = exchange.getRequest().getHeaders().getFirst(HttpHeaders.CONTENT_TYPE);
            return StringUtils.startsWithIgnoreCase(header, MediaType.APPLICATION_JSON_VALUE);
        }
        @Override
        public int getOrder()
        {
            return -100;
        }
    }
    

    本地调试网关服务时发现出问题的是 getBody() 方法。 当请求body较长时,byte[] content = new byte[dataBuffer.readableByteCount()]; 变量中获取到的并不是全部的请求体,而只是其中的部分内容。 这就解释了为什么会固定在一定长度的时候会乱码。

    java复制代码@Override
    public Flux getBody()
    {
        Flux body = super.getBody();
        return body.map(dataBuffer -> {
            byte[] content = new byte[dataBuffer.readableByteCount()];
            dataBuffer.read(content);
            DataBufferUtils.release(dataBuffer);
            String bodyStr = new String(content, StandardCharsets.UTF_8);
            // 防xss攻击过滤
            bodyStr = EscapeUtil.clean(bodyStr);
            // 转成字节
            byte[] bytes = bodyStr.getBytes();
            NettyDataBufferFactory nettyDataBufferFactory = new NettyDataBufferFactory(ByteBufAllocator.DEFAULT);
            DataBuffer buffer = nettyDataBufferFactory.allocateBuffer(bytes.length);
            buffer.write(bytes);
            return buffer;
        });
    }
    

    在这篇博客中博主遇到了同样的问题,并给出了一种解决方案。

    java复制代码.route("route1",
        r -> r.method(HttpMethod.POST)
                .and().readBody(String.class, requestBody -> {
            // 这里不对body做判断处理
            return true;
        }).and().path(SERVICE1)
        .filters(f -> {
            f.filters(list);
            return f;
        }).uri(URI));
    

    但是这种方法需要修改代码,而现在项目中网关的路由配置一般都是放在配置中心上的。 查看 readBody() 方法的代码,发现这里指定的是一个 Predicate (断言)。

    修改后网关路由配置如下:

    yaml复制代码spring:
      cloud:
        gateway:
          routes:
            - id: some-system
              uri: lb://some-system
              filters:
                - StripPrefix=1
              predicates:
                - Path=/some-system/**
                - name: ReadBodyPredicateFactory
                  args:
                    inClass: '#{T(String)}'
    

    之后启动网关服务时报了如下错误:

    Unable to find RoutePredicateFactory with name ReadBodyPredicateFactory

    完整错误消息堆栈如下:

    java复制代码14:38:40.771 [boundedElastic-27] ERROR o.s.c.g.r.CachingRouteLocator - [handleRefreshError,94] - Refresh routes error !!!
    java.lang.IllegalArgumentException: Unable to find RoutePredicateFactory with name ReadBodyPredicateFactory
    	at org.springframework.cloud.gateway.route.RouteDefinitionRouteLocator.lookup(RouteDefinitionRouteLocator.java:203)
    	at org.springframework.cloud.gateway.route.RouteDefinitionRouteLocator.combinePredicates(RouteDefinitionRouteLocator.java:192)
    	at org.springframework.cloud.gateway.route.RouteDefinitionRouteLocator.convertToRoute(RouteDefinitionRouteLocator.java:116)
    	at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:106)
    	at reactor.core.publisher.FluxFlatMap$FlatMapMain.tryEmitScalar(FluxFlatMap.java:488)
    	at reactor.core.publisher.FluxFlatMap$FlatMapMain.onNext(FluxFlatMap.java:421)
    	at reactor.core.publisher.FluxMergeSequential$MergeSequentialMain.drain(FluxMergeSequential.java:432)
    	at reactor.core.publisher.FluxMergeSequential$MergeSequentialMain.innerComplete(FluxMergeSequential.java:328)
    	at reactor.core.publisher.FluxMergeSequential$MergeSequentialInner.onComplete(FluxMergeSequential.java:584)
    	at reactor.core.publisher.FluxMap$MapSubscriber.onComplete(FluxMap.java:142)
    	at reactor.core.publisher.FluxFilter$FilterSubscriber.onComplete(FluxFilter.java:166)
    	at reactor.core.publisher.FluxMap$MapConditionalSubscriber.onComplete(FluxMap.java:269)
    	at reactor.core.publisher.FluxFilter$FilterConditionalSubscriber.onComplete(FluxFilter.java:300)
    	at reactor.core.publisher.FluxFlatMap$FlatMapMain.checkTerminated(FluxFlatMap.java:846)
    	at reactor.core.publisher.FluxFlatMap$FlatMapMain.drainLoop(FluxFlatMap.java:608)
    	at reactor.core.publisher.FluxFlatMap$FlatMapMain.innerComplete(FluxFlatMap.java:894)
    	at reactor.core.publisher.FluxFlatMap$FlatMapInner.onComplete(FluxFlatMap.java:997)
    	at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1799)
    	at reactor.core.publisher.MonoCollectList$MonoCollectListSubscriber.onComplete(MonoCollectList.java:128)
    	at org.springframework.cloud.commons.publisher.FluxFirstNonEmptyEmitting$FirstNonEmptyEmittingSubscriber.onComplete(FluxFirstNonEmptyEmitting.java:325)
    	at reactor.core.publisher.FluxSubscribeOn$SubscribeOnSubscriber.onComplete(FluxSubscribeOn.java:166)
    	at reactor.core.publisher.FluxIterable$IterableSubscription.fastPath(FluxIterable.java:360)
    	at reactor.core.publisher.FluxIterable$IterableSubscription.request(FluxIterable.java:225)
    	at reactor.core.publisher.FluxSubscribeOn$SubscribeOnSubscriber.requestUpstream(FluxSubscribeOn.java:131)
    	at reactor.core.publisher.FluxSubscribeOn$SubscribeOnSubscriber.onSubscribe(FluxSubscribeOn.java:124)
    	at reactor.core.publisher.FluxIterable.subscribe(FluxIterable.java:164)
    	at reactor.core.publisher.FluxIterable.subscribe(FluxIterable.java:86)
    	at reactor.core.publisher.Flux.subscribe(Flux.java:8402)
    	at reactor.core.publisher.FluxFlatMap.trySubscribeScalarMap(FluxFlatMap.java:200)
    	at reactor.core.publisher.MonoFlatMapMany.subscribeOrReturn(MonoFlatMapMany.java:49)
    	at reactor.core.publisher.FluxFromMonoOperator.subscribe(FluxFromMonoOperator.java:76)
    	at reactor.core.publisher.FluxSubscribeOn$SubscribeOnSubscriber.run(FluxSubscribeOn.java:194)
    	at reactor.core.scheduler.WorkerTask.call(WorkerTask.java:84)
    	at reactor.core.scheduler.WorkerTask.call(WorkerTask.java:37)
    	at java.base/java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:264)
    	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java)
    	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
    	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
    	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
    	at java.base/java.lang.Thread.run(Thread.java:832)
    

    报错的方法源码如下:

    java复制代码@SuppressWarnings("unchecked")
    private AsyncPredicate lookup(RouteDefinition route, PredicateDefinition predicate) {
        RoutePredicateFactory factory = this.predicates.get(predicate.getName());
        if (factory == null) {
            throw new IllegalArgumentException("Unable to find RoutePredicateFactory with name " + predicate.getName());
        }
        if (logger.isDebugEnabled()) {
            logger.debug("RouteDefinition " + route.getId() + " applying " + predicate.getArgs() + " to "
                    + predicate.getName());
        }
        // @formatter:off
        Object config = this.configurationService.with(factory)
                .name(predicate.getName())
                .properties(predicate.getArgs())
                .eventFunction((bound, properties) -> new PredicateArgsEvent(
                        RouteDefinitionRouteLocator.this, route.getId(), properties))
                .bind();
        // @formatter:on
        return factory.applyAsync(config);
    }
     
    

    Debug 发现 this.predicates 的内容如下:

    java复制代码predicates = {LinkedHashMap@11025}  size = 13
     "After" -> {AfterRoutePredicateFactory@14744} "[AfterRoutePredicateFactory@7ea8224b configClass = AfterRoutePredicateFactory.Config]"
     "Before" -> {BeforeRoutePredicateFactory@14746} "[BeforeRoutePredicateFactory@5a010eec configClass = BeforeRoutePredicateFactory.Config]"
     "Between" -> {BetweenRoutePredicateFactory@14748} "[BetweenRoutePredicateFactory@623ded82 configClass = BetweenRoutePredicateFactory.Config]"
     "Cookie" -> {CookieRoutePredicateFactory@14750} "[CookieRoutePredicateFactory@180e33b0 configClass = CookieRoutePredicateFactory.Config]"
     "Header" -> {HeaderRoutePredicateFactory@14752} "[HeaderRoutePredicateFactory@270be080 configClass = HeaderRoutePredicateFactory.Config]"
     "Host" -> {HostRoutePredicateFactory@14754} "[HostRoutePredicateFactory@752ffce3 configClass = HostRoutePredicateFactory.Config]"
     "Method" -> {MethodRoutePredicateFactory@14756} "[MethodRoutePredicateFactory@78f35e39 configClass = MethodRoutePredicateFactory.Config]"
     "Path" -> {PathRoutePredicateFactory@14758} "[PathRoutePredicateFactory@11896124 configClass = PathRoutePredicateFactory.Config]"
     "Query" -> {QueryRoutePredicateFactory@14760} "[QueryRoutePredicateFactory@69fe8c75 configClass = QueryRoutePredicateFactory.Config]"
     "ReadBody" -> {ReadBodyRoutePredicateFactory@14762} "[ReadBodyRoutePredicateFactory@633cad4d configClass = ReadBodyRoutePredicateFactory.Config]"
     "RemoteAddr" -> {RemoteAddrRoutePredicateFactory@14764} "[RemoteAddrRoutePredicateFactory@15c3585 configClass = RemoteAddrRoutePredicateFactory.Config]"
     "Weight" -> {WeightRoutePredicateFactory@14766} "[WeightRoutePredicateFactory@5b86f4cb configClass = WeightConfig]"
     "CloudFoundryRouteService" -> {CloudFoundryRouteServiceRoutePredicateFactory@14768} "[CloudFoundryRouteServiceRoutePredicateFactory@468646ea configClass = Object]"
    

    也就是说配置文件中 predicate.name 应该为 ReadBody 而不是 ReadBodyPredicateFactory

    this.predicates 赋值相关的代码如下,从中可以看到在 NameUtils.normalizeRoutePredicateName 方法中去除了后缀 RoutePredicateFactory 。

    估计这是由于项目中使用的 Gateway 版本比较新(spring-cloud-gateway-server:3.0.3)导致的。

    RouteDefinitionRouteLocator.java

    java复制代码private void initFactories(List predicates) {
        predicates.forEach(factory -> {
            String key = factory.name();
            if (this.predicates.containsKey(key)) {
                this.logger.warn("A RoutePredicateFactory named " + key + " already exists, class: "
                        + this.predicates.get(key) + ". It will be overwritten.");
            }
            this.predicates.put(key, factory);
            if (logger.isInfoEnabled()) {
                logger.info("Loaded RoutePredicateFactory [" + key + "]");
            }
        });
    }
    

    RoutePredicateFactory.java

    java复制代码default String name() {
        return NameUtils.normalizeRoutePredicateName(getClass());
    }
    

    NameUtils.java

    java复制代码public static String normalizeRoutePredicateName(Class clazz) {
        return removeGarbage(clazz.getSimpleName().replace(RoutePredicateFactory.class.getSimpleName(), ""));
    }
    private static String removeGarbage(String s) {
        int garbageIdx = s.indexOf("$Mockito");
        if (garbageIdx > 0) {
            return s.substring(0, garbageIdx);
        }
        return s;
    }
    

    初次之外还需要配置一个 Predicate 类型的 Bean 并 自定义个 RoutePredicateFactory (从 ReadBodyRoutePredicateFactory 复制修改)。 否则会出现 NPE 和 404 NOT FOUND 错误。

    NPE 错误如下:

    java复制代码16:13:00.409 [reactor-http-epoll-3] ERROR o.s.c.g.h.RoutePredicateHandlerMapping - [lambda$null,132] - Error applying predicate for route: launch-tencent
    java.lang.NullPointerException: null
        at org.springframework.cloud.gateway.handler.predicate.ReadBodyRoutePredicateFactory.lambda$null(ReadBodyRoutePredicateFactory.java:96)
        at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:106)
        at reactor.core.publisher.FluxPeek$PeekSubscriber.onNext(FluxPeek.java:200)
    

    报错的代码是最后一行 config.getPredicate().test(objectValue) ,由于没有配置 config.getPredicate() 导致了 NPE 。

    java复制代码return ServerWebExchangeUtils.cacheRequestBodyAndRequest(exchange,
        (serverHttpRequest) -> ServerRequest
            .create(exchange.mutate().request(serverHttpRequest).build(), messageReaders)
            .bodyToMono(inClass).doOnNext(objectValue -> exchange.getAttributes()
                .put(CACHE_REQUEST_BODY_OBJECT_KEY, objectValue))
            .map(objectValue -> config.getPredicate().test(objectValue)));
    

    添加一个自定义的断言 alwaysTruePredicate ,并在配置文件的 args 中指定使用这个断言 predicate: '#{@alwaysTruePredicate}' 。

    java复制代码@Bean
    public Predicate alwaysTruePredicate(){
        return new Predicate() {
            @Override
            public boolean test(Object o) {
                return true;
            }
        };
    }
    

    404 NOT FOUND 错误需要在 ReadBodyRoutePredicateFactory 的基础上,在 .map(objectValue -> config.getPredicate().test(objectValue)) 的后面加一段 .thenReturn(true) 处理,使其永远返回 true 。

    对策

    1. 添加工厂类 GwReadBodyRoutePredicateFactory
    java复制代码import lombok.extern.slf4j.Slf4j;
    import org.reactivestreams.Publisher;
    import org.springframework.cloud.gateway.handler.AsyncPredicate;
    import org.springframework.cloud.gateway.handler.predicate.AbstractRoutePredicateFactory;
    import org.springframework.cloud.gateway.support.ServerWebExchangeUtils;
    import org.springframework.http.codec.HttpMessageReader;
    import org.springframework.stereotype.Component;
    import org.springframework.web.reactive.function.server.HandlerStrategies;
    import org.springframework.web.reactive.function.server.ServerRequest;
    import org.springframework.web.server.ServerWebExchange;
    import reactor.core.publisher.Mono;
    import java.util.List;
    import java.util.Map;
    import java.util.function.Predicate;
    /**
     * Predicate that reads the body and applies a user provided predicate to run on the body.
     * The body is cached in memory so that possible subsequent calls to the predicate do not
     * need to deserialize again.
     */
    @Component
    @Slf4j
    public class GwReadBodyRoutePredicateFactory extends
            AbstractRoutePredicateFactory {
        private static final String TEST_ATTRIBUTE = "read_body_predicate_test_attribute";
        private static final String CACHE_REQUEST_BODY_OBJECT_KEY = "cachedRequestBodyObject";
        private final List> messageReaders;
        public GwReadBodyRoutePredicateFactory() {
            super(GwReadBodyRoutePredicateFactory.Config.class);
            this.messageReaders = HandlerStrategies.withDefaults().messageReaders();
        }
        public GwReadBodyRoutePredicateFactory(List> messageReaders) {
            super(GwReadBodyRoutePredicateFactory.Config.class);
            this.messageReaders = messageReaders;
        }
        @Override
        @SuppressWarnings("unchecked")
        public AsyncPredicate applyAsync(
                GwReadBodyRoutePredicateFactory.Config config
        ) {
            return new AsyncPredicate() {
                @Override
                public Publisher apply(ServerWebExchange exchange) {
                    Class inClass = config.getInClass();
                    Object cachedBody = exchange.getAttribute(CACHE_REQUEST_BODY_OBJECT_KEY);
                    Mono modifiedBody;
                    // We can only read the body from the request once, once that happens if
                    // we try to read the body again an exception will be thrown. The below
                    // if/else caches the body object as a request attribute in the
                    // ServerWebExchange so if this filter is run more than once (due to more
                    // than one route using it) we do not try to read the request body
                    // multiple times
                    if (cachedBody != null) {
                        try {
                            boolean test = config.predicate.test(cachedBody);
                            exchange.getAttributes().put(TEST_ATTRIBUTE, test);
                            return Mono.just(test);
                        } catch (ClassCastException e) {
                            if (log.isDebugEnabled()) {
                                log.debug("Predicate test failed because class in predicate "
                                        + "does not match the cached body object", e);
                            }
                        }
                        return Mono.just(false);
                    } else {
                        return ServerWebExchangeUtils.cacheRequestBodyAndRequest(exchange,
                                (serverHttpRequest) -> ServerRequest.create(
                                                exchange.mutate().request(serverHttpRequest).build(), messageReaders)
                                        .bodyToMono(inClass)
                                        .doOnNext(objectValue -> exchange.getAttributes()
                                                .put(CACHE_REQUEST_BODY_OBJECT_KEY, objectValue))
                                        .map(objectValue -> config.getPredicate().test(objectValue))
                                        .thenReturn(true));
                    }
                }
                @Override
                public String toString() {
                    return String.format("ReadBody: %s", config.getInClass());
                }
            };
        }
        @Override
        @SuppressWarnings("unchecked")
        public Predicate apply(
                GwReadBodyRoutePredicateFactory.Config config
        ) {
            throw new UnsupportedOperationException("GwReadBodyPredicateFactory is only async.");
        }
        public static class Config {
            private Class inClass;
            private Predicate predicate;
            private Map hints;
            public Class getInClass() {
                return inClass;
            }
            public GwReadBodyRoutePredicateFactory.Config setInClass(Class inClass) {
                this.inClass = inClass;
                return this;
            }
            public Predicate getPredicate() {
                return predicate;
            }
            public GwReadBodyRoutePredicateFactory.Config setPredicate(Predicate predicate) {
                this.predicate = predicate;
                return this;
            }
            public  GwReadBodyRoutePredicateFactory.Config setPredicate(Class inClass, Predicate predicate) {
                setInClass(inClass);
                this.predicate = predicate;
                return this;
            }
            public Map getHints() {
                return hints;
            }
            public GwReadBodyRoutePredicateFactory.Config setHints(Map hints) {
                this.hints = hints;
                return this;
            }
        }
    }
    

    2.添加一个 GwReadBody 用的断言 Bean alwaysTruePredicate

    java复制代码@Bean
    public Predicate alwaysTruePredicate(){
        return new Predicate() {
            @Override
            public boolean test(Object o) {
                return true;
            }
        };
    }
    

    3.在网关的路由配置中添加 GwReadBody 断言配置,将接收类型指定为 String,并将 predicate 参数指定上一步中配置的 alwaysTruePredicate

    yaml复制代码spring:
      cloud:
        gateway:
          routes:
            - id: some-system
              uri: lb://some-system
              filters:
                - StripPrefix=1
              predicates:
                - Path=/some-system/**
                - name: GwReadBody
                  args:
                    inClass: '#{T(String)}'
                    predicate: '#{@alwaysTruePredicate}'
    

    4.重启网关服务后,再次在 XssFilter 中通过 getBody() 获取请求内容时,获得的已经是完整的请求体了。