GT's Blog

SpringFox Swagger with Groovy - fixing slow startup / heap space error

Story of how we reduced app startup time by 20x and heap usage by 7x

A few months ago, we started developing a Spring Boot app in Groovy. It is a typical app with a ton of CRUD logic and a few REST endpoints. It also had a Swagger UI configured with SpringFox Swagger. Although it took ~35 seconds to start and used ~700MB of heap space, no one on the team really cared about the slowness or memory usage of the application.

As our application grew, we added more DB tables and endpoints, which slowed startup further and required more heap space. At its worst, startup took ~3 minutes with almost 1.4GB of heap. Again, no one really bothered to fix it... until the application started randomly failing to start and crashing with OutOfMemoryError.

Problem Identification: Thread Dump using IntelliJ

The IntelliJ thread-dump showing the main thread always stuck at springfox

I started the application in debug mode in IntelliJ and took a thread dump several times to see what the main thread was doing. In every thread dump, the main thread was stuck around the springfox.documentation.spring.web.readers.parameter.ModelAttributeParameterExpander.expand() method.
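The same kind of sampling can also be done from code. The following is only a minimal sketch using the standard Thread.getAllStackTraces() call; the class name MainThreadSampler and the helper topFrameOf are made up for illustration and are not part of the original investigation.

```java
import java.util.Map;

public class MainThreadSampler {

    // Returns the top stack frame of the live thread with the given name,
    // or null if no such thread exists or it currently has no frames.
    public static String topFrameOf(String threadName) {
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            StackTraceElement[] frames = entry.getValue();
            if (entry.getKey().getName().equals(threadName) && frames.length > 0) {
                return frames[0].toString();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Sample the current thread; in a real app you would sample "main" from another thread.
        System.out.println(topFrameOf(Thread.currentThread().getName()));
    }
}
```

If several samples taken a few seconds apart keep showing the same class near the top, that class is a good place to start digging.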

So, I dug into the ModelAttributeParameterExpander#expand() method and was able to quickly confirm that it was trying to expand the fields of the @ModelAttribute class to render them in the Swagger UI, which reads them from a JSON endpoint at host:port/v3/api-docs.

Once the application started (after about 3 minutes), I opened the host:port/v3/api-docs endpoint locally and was amazed by the size of the JSON response. It WAS 1240MB! That explained why the Swagger UI took so long to load.

Problem Isolation:

To isolate the problem, I commented out all the endpoints that used @ModelAttribute and, just as I expected, the app started within 9 seconds and took about 200MB of heap, the Swagger UI loaded within one second, and the host:port/v3/api-docs response was about 20KB.

@GetMapping("/test")
void test(@ModelAttribute A a) { // commented out all endpoints like this one to isolate the problem
}

Then I dug through the code inside ModelAttributeParameterExpander#expand() and found that it collects all the fields and properties of the @ModelAttribute class and recursively calls expand() for all the sub-fields and properties.

public List<Compatibility<Parameter, RequestParameter>> expand(ExpansionContext context) {
    List<Compatibility<Parameter, RequestParameter>> parameters = new ArrayList();
    Set<PropertyDescriptor> propertyDescriptors = this.propertyDescriptors(context.getParamType().getErasedType());
    Map<Method, PropertyDescriptor> propertyLookupByGetter = this.propertyDescriptorsByMethod(context.getParamType().getErasedType(), propertyDescriptors);
    Iterable<ResolvedMethod> getters = (Iterable)this.accessors.in(context.getParamType()).stream().filter(this.onlyValidGetters(propertyLookupByGetter.keySet())).collect(Collectors.toList());
    ... 
    Stream<ModelAttributeField> collectionTypes = attributes.stream().filter(this.isCollection().and(this.recursiveCollectionItemType(context.getParamType()).negate()));
    collectionTypes.forEachOrdered((each) -> {
         ....
         parameters.addAll(this.expand(childContext)); //nested

I dug further and found that it was grabbing the getMetaClass method generated by Groovy and recursively looking at all the get* methods and the parameters inside the MetaClass class.

public static class MyClass implements GroovyObject {
    //..fields

    @Generated
    @Internal
    @Transient
    public MetaClass getMetaClass() {  //this was treated as class property

MetaClass contains several get* methods. All of them were processed by Swagger:

public interface MetaClass extends MetaObjectProtocol {
    Object invokeMethod(Class var1, Object var2, String var3, Object[] var4, boolean var5, boolean var6);

    Object getProperty(Class var1, Object var2, String var3, boolean var4, boolean var5);

    void setProperty(Class var1, Object var2, String var3, Object var4, boolean var5, boolean var6);

    Object invokeMissingMethod(Object var1, String var2, Object[] var3);

    Object invokeMissingProperty(Object var1, String var2, Object var3, boolean var4);

    Object getAttribute(Class var1, Object var2, String var3, boolean var4);

    void setAttribute(Class var1, Object var2, String var3, Object var4, boolean var5, boolean var6);

    void initialize();

    List<MetaProperty> getProperties();

    List<MetaMethod> getMethods();

    ClassNode getClassNode();

    List<MetaMethod> getMetaMethods();

    int selectConstructorAndTransformArguments(int var1, Object[] var2);

    MetaMethod pickMethod(String var1, Class[] var2);
}
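To see why a generated getter ends up being treated as a property, here is a minimal sketch in plain Java. It assumes the expansion relies on JavaBeans-style introspection (any public getX() method becomes a property named x). The classes A and FakeMetaClass are hypothetical stand-ins, not the real Groovy or springfox types.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.List;

public class MetaClassPropertyDemo {

    // Hypothetical stand-in for groovy.lang.MetaClass: a type with getters of its own.
    public static class FakeMetaClass {
        public List<String> getProperties() { return new ArrayList<>(); }
        public List<String> getMethods() { return new ArrayList<>(); }
    }

    // Hypothetical stand-in for a compiled Groovy class: one real field plus the generated getter.
    public static class A {
        private String a;
        public String getA() { return a; }
        public void setA(String a) { this.a = a; }
        public FakeMetaClass getMetaClass() { return new FakeMetaClass(); }
    }

    // Lists the JavaBeans property names of a class, excluding those inherited from Object.
    public static List<String> propertyNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        try {
            for (PropertyDescriptor pd : Introspector.getBeanInfo(type, Object.class).getPropertyDescriptors()) {
                names.add(pd.getName());
            }
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(propertyNames(A.class));             // includes "a" and "metaClass"
        System.out.println(propertyNames(FakeMetaClass.class)); // includes "methods" and "properties"
    }
}
```

Because metaClass shows up next to the real field, a recursive expander that follows every property will walk into the MetaClass type as well.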

Solution:

I found that there were already a few discussions in the springfox GitHub project, and a solution was discussed there too. The solution is to configure the Docket object to ignore the MetaClass class using the ignoredParameterTypes() method:

@Bean
Docket docket() {
    new Docket(DocumentationType.OAS_30)
            .ignoredParameterTypes(MetaClass)
}

Now the application takes about 9 seconds to start and takes about 200MB heap space. That was a great reduction from 3min/1.4GB.
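The effect of ignoring a type can be illustrated with a toy recursive expander. This is only a sketch under stated assumptions, not springfox's implementation: the classes A, B and Meta and the method countLeaves are hypothetical, and the "ignored" set merely plays the role that ignoredParameterTypes() plays for the Docket.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Set;

public class IgnoredTypeExpansionDemo {

    // Hypothetical stand-in for MetaClass: every class below exposes one.
    public static class Meta {
        public String getName() { return "meta"; }
        public String getOwner() { return "owner"; }
        public String getVersion() { return "1"; }
    }

    public static class B {
        public String getB() { return "b"; }
        public Meta getMetaClass() { return new Meta(); }
    }

    public static class A {
        public String getA() { return "a"; }
        public B getB1() { return new B(); }
        public B getB2() { return new B(); }
        public Meta getMetaClass() { return new Meta(); }
    }

    // Counts the String leaf properties reachable from 'type',
    // skipping any property whose type is in the ignored set.
    public static int countLeaves(Class<?> type, Set<Class<?>> ignored) {
        int count = 0;
        try {
            for (PropertyDescriptor pd : Introspector.getBeanInfo(type, Object.class).getPropertyDescriptors()) {
                Class<?> propertyType = pd.getPropertyType();
                if (ignored.contains(propertyType)) {
                    continue;                                   // ignored type: do not expand it
                }
                if (propertyType == String.class) {
                    count++;                                    // leaf parameter
                } else {
                    count += countLeaves(propertyType, ignored); // nested expansion
                }
            }
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countLeaves(A.class, Set.<Class<?>>of()));           // prints 12
        System.out.println(countLeaves(A.class, Set.<Class<?>>of(Meta.class))); // prints 3
    }
}
```

Even in this tiny model the ignored type accounts for most of the expanded parameters, and the real MetaClass has a far larger graph of getters behind it.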

Reproducible example:

The following simple app took 10 seconds to start, and http://localhost:8080/v3/api-docs returned a 58.5MB JSON response. Once I commented out the test method and reran it, it took just 1.7 seconds, and http://localhost:8080/v3/api-docs returned a 5KB JSON response.

package gt.swagger.demo

import org.springframework.boot.SpringApplication
import org.springframework.boot.autoconfigure.SpringBootApplication
import org.springframework.web.bind.annotation.*
import springfox.documentation.swagger2.annotations.EnableSwagger2

@SpringBootApplication
@EnableSwagger2
class SwaggerTestApp {
    static void main(String[] args) {
        SpringApplication.run(SwaggerTestApp, args)
    }
}

@RestController
class Controller {

    @GetMapping("/test")
    void test(@ModelAttribute A a) {
    }

    static class A {
        String a
    }
}

When I add more endpoints with @ModelAttribute, or when I nest class B inside class A, the application takes much longer to start and uses more heap space:

static class A {
    String a
    B b1, b2, b3, b4, b5, b6, b7, b8, b9
}

static class B {
    String b
}

The problem can be solved by configuring the following bean:

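import org.springframework.context.annotation.Bean
import org.springframework.context.annotation.Configuration
import springfox.documentation.spi.DocumentationType
import springfox.documentation.spring.web.plugins.Docket

@Configuration
class SwaggerConfig {

    @Bean
    Docket docket() {
        // MetaClass is groovy.lang.MetaClass, which Groovy imports by default
        new Docket(DocumentationType.OAS_30)
                .ignoredParameterTypes(MetaClass)
    }

}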

POM.xml:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.5.2</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>gt.swagg</groupId>
    <artifactId>sb25-swagger3</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>jar</packaging>

    <properties>
        <java.version>11</java.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>io.springfox</groupId>
            <artifactId>springfox-boot-starter</artifactId>
            <version>3.0.0</version>
        </dependency>
        <dependency>
            <groupId>org.codehaus.groovy</groupId>
            <artifactId>groovy</artifactId>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
            <plugin>
                <groupId>org.codehaus.gmavenplus</groupId>
                <artifactId>gmavenplus-plugin</artifactId>
                <version>1.11.0</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>addSources</goal>
                            <goal>addTestSources</goal>
                            <goal>generateStubs</goal>
                            <goal>compile</goal>
                            <goal>generateTestStubs</goal>
                            <goal>compileTests</goal>
                            <goal>removeStubs</goal>
                            <goal>removeTestStubs</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>

Lesson learned:

- Pay attention to why your application is slow to start. Even a small misconfiguration like this can use up lots of resources

- Developer hours cost $$$. Make sure developers are not burning $$$ on avoidable things like a slow application start

- Focus on how to increase developer productivity. I have seen devs compiling the whole project after every change to test it. Watch/educate developers to see if they are using the tools correctly

- Java/JVM is fast. As with every other programming language, developers can misuse/misconfigure it and make it slow

- etc

Git - write useful commit messages

What's harder than choosing a variable name? IMO it's composing a perfect code comment and commit message. Today we will discuss some bad commit messages, why they are bad, and what we can do to improve them.

You made all the code updates, wrote tests, and now it's time to commit them with a commit message. Is "Code Updated" sufficient? What about "Issue fixed"? Just imagine yourself reading the commit message some time later. Just like a code comment, a commit message should add context about why the code update was necessary.

A good commit message helps a future developer (it might be you) re-establish the context around a piece of code whenever they need to dig through the history to find out why the code was changed. It also tells the code reviewer what to expect in the code.

Unless your team uses squash merges (where all the individual commits in the PR are combined and the person merging the PR can write a single, meaningful commit message), you should pay extra attention when choosing a commit message. Commit messages like the following add no value. A good developer should never write these kinds of commit messages.

Bad commit messages(taken from real life examples):

issue fix
more changes
logic updated
updated code again
added import function
updated dialog box UI
deleted column

Good practices:

Describe why the code was updated

"Issue fix" doesn't indicate which issue number or what caused the issue. "Logic updated" is super obvious and serves no purpose. "updated dialog box UI" doesn't indicate which dialog box or why it was updated. "added import function" doesn't indicate which data type it imports.

Reference the Issue/Jira/Story number

Every code change should be backed by a Jira/Story created by someone. No one should just go ahead and make an arbitrary code change. Most git providers (GitHub, Bitbucket, GitLab etc) support hyperlinking a Jira/Issue number from the commit message. eg: UI_APP:4555 Fixed broken button on IE9 caused by unsupported JS function
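If you want to enforce this automatically (for example from a commit hook or a CI step), a check can be as small as one regular expression. The sketch below is an assumption-laden example: the KEY:NUMBER format is modeled only on the sample message above, and the class CommitMessageCheck and method referencesIssue are hypothetical names.

```java
import java.util.regex.Pattern;

public class CommitMessageCheck {

    // Assumed convention, modeled on the example above:
    // PROJECT_KEY:NUMBER, one space, then a non-empty description.
    private static final Pattern WITH_ISSUE =
            Pattern.compile("^[A-Z][A-Z0-9_]*:\\d+ \\S.*");

    // Returns true if the subject line starts with an issue reference.
    public static boolean referencesIssue(String subjectLine) {
        return subjectLine != null && WITH_ISSUE.matcher(subjectLine).matches();
    }

    public static void main(String[] args) {
        System.out.println(referencesIssue(
                "UI_APP:4555 Fixed broken button on IE9 caused by unsupported JS function")); // true
        System.out.println(referencesIssue("issue fix"));                                      // false
    }
}
```

A check like this only proves that an issue is referenced; whether the description explains the "why" still needs a human reviewer.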

Commit frequency: per logical change

It's good to make a new commit per logical change, e.g. commit once each functional piece is working:

new column 'purchase_amount' added on table
updated the backend to save new purchase_amount column
updated UI to display new column

For a bigger update, use a heading and bullet points. The heading can be short.

Commit frequency: commit often to have a periodic checkpoint

Imagine you are working on a BIG feature where you modified several files. At some point you get a few things working (point X). But as you work on additional changes, the things that worked earlier start breaking. Later you find yourself in a position where you don't know how to revert the new changes to get back to point X, where a few things were functional. If you had committed your changes at point X, it would have been super easy to revert your local changes to that point.

Few good examples:

dialog box UI updated to support multiple callback functions on both Cancel and Save event –> instead of updated dialog box ui
removed the employer contribution period field as we're no longer using it –> instead of removed employer contribution period field
add member import function for the employer user –> instead of added import function

KeyCloak docker-compose import realm file

An example docker-compose file for Keycloak 18 that can import a realm file:

It assumes the realm-export.json file is in /PATH_OF_realm-export.json_FILE/

docker-compose.yml:

version: '3'
services:
    keycloak:
        image: 'quay.io/keycloak/keycloak:18.0.2'
        container_name: keycloak
        command: [ 'start-dev --import-realm' ]
        environment:
            - KC_DB=dev-file
            - KEYCLOAK_ADMIN=admin
            - KEYCLOAK_ADMIN_PASSWORD=admin
            - KC_FEATURES=scripts
            - KC_HTTP_PORT=8080
        volumes:
            - /PATH_OF_realm-export.json_FILE/:/opt/keycloak/data/import
        ports:
            - 8082:8080

Kafka - a working docker-compose

Here's a working docker-compose file that runs zookeeper, kafka and a Kafka UI (Kouncil) in the same network.

version: '3.8'
services:

    zookeeper:
        container_name: zookeeper
        image: confluentinc/cp-zookeeper:latest
        hostname: zookeeper
        restart: always
        environment:
            ZOOKEEPER_SERVER_ID: 1
            ZOOKEEPER_CLIENT_PORT: 2181
            ZOOKEEPER_TICK_TIME: 2000
            ZOOKEEPER_INIT_LIMIT: 5
            ZOOKEEPER_SYNC_LIMIT: 2
        ports:
            - 2181:2181

    kafka:
        container_name: kafka
        image: confluentinc/cp-kafka:latest
        depends_on:
            - zookeeper
        restart: always
        environment:
            KAFKA_BROKER_ID: 1
            KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
            KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
            KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092,RMOFF_DOCKER_HACK://kafka:9093
            KAFKA_LISTENERS: PLAINTEXT://:9092,RMOFF_DOCKER_HACK://:9093
            KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,RMOFF_DOCKER_HACK:PLAINTEXT
            KAFKA_INTER_BROKER_LISTENER_NAME: RMOFF_DOCKER_HACK
        ports:
            - 9092:9092

    kouncil:
        container_name: kouncil
        image: consdata/kouncil:latest
        restart: always
        depends_on:
            - kafka
        environment:
            bootstrapServers: kafka:9093
            BOOTSTRAPSERVERS: kafka:9093
            BOOTSTRAP_SERVERS: kafka:9093
        ports:
            - 8888:8080

Spring Boot - shutdown and restart application programmatically

Today we will discuss how to shut down and restart a Spring application programmatically.

1) New Project

First, let's create a simple Spring Boot app with the web dependency so that we can set up an endpoint to trigger the shutdown and restart. Head over to https://start.spring.io/ and choose web and any other dependencies that you need.

2) Few words about ApplicationContext and ConfigurableApplicationContext

ApplicationContext is the central interface that represents the Spring IoC container and is responsible for instantiating, configuring, and assembling the beans. ConfigurableApplicationContext is another interface that extends ApplicationContext and provides facilities to configure refresh, startup & shutdown of the application and allows registration of custom listeners to respond to those events.

3) Let's code!

3.1) Simple App

Run the application and store ConfigurableApplicationContext in a static variable appCtx so that we can use it later

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.ConfigurableApplicationContext;

@SpringBootApplication
public class App {

    static ConfigurableApplicationContext appCtx;

    public static void main(String[] args) {
        appCtx = SpringApplication.run(App.class, args);
    }
}

3.2) Shutdown - call ConfigurableApplicationContext.close()

This closes the application context, releasing all resources and locks that the implementation might hold. This includes destroying all cached singleton beans.

static void shutdown() {
    appCtx.close();
}

3.3) Restart

We know how to stop and start the application. To stop, we can simply do:

appCtx.close();

And to start, we can simply call the SpringApplication.run(..)

appCtx = SpringApplication.run(App.class, args);

But the catch is that appCtx.close() can cause the application to exit before SpringApplication.run(..) gets a chance to run.

So, a solution is to do the shutdown and start on a separate non-daemon thread. A non-daemon (user) thread runs independently of the thread that handles the request and keeps the JVM alive until it finishes, which is why the code calls setDaemon(false).

Thread t = new Thread(() -> {
    appCtx.close();
    appCtx = SpringApplication.run(...);
});

t.setDaemon(false);
t.start();
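The pattern can be tried without Spring at all. The following is a minimal sketch, assuming a hypothetical FakeContext class in place of ConfigurableApplicationContext and a plain constructor call in place of SpringApplication.run(...); only the threading part mirrors the snippet above.

```java
import java.util.concurrent.atomic.AtomicReference;

public class RestartThreadDemo {

    // Hypothetical stand-in for ConfigurableApplicationContext.
    public static class FakeContext {
        private volatile boolean active = true;
        public void close() { active = false; }
        public boolean isActive() { return active; }
    }

    private static final AtomicReference<FakeContext> CTX =
            new AtomicReference<>(new FakeContext());

    public static FakeContext current() {
        return CTX.get();
    }

    // Closes the current context and creates a new one on a separate non-daemon thread.
    public static Thread restart() {
        Thread t = new Thread(() -> {
            CTX.get().close();
            CTX.set(new FakeContext()); // stands in for SpringApplication.run(...)
        });
        t.setDaemon(false);             // a user thread keeps the JVM alive until it finishes
        t.start();
        return t;
    }

    // Convenience for demos and tests: restart and block until the new context is in place.
    public static void restartAndWait() {
        Thread t = restart();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        FakeContext old = current();
        restartAndWait();
        System.out.println(old.isActive());       // false
        System.out.println(current().isActive()); // true
    }
}
```

In the real application nothing waits for the restart thread; the join here exists only so the demo can print the result.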

3.4) Passing the original program arguments

We may want to use the original arguments that were passed to run method when we restart the application.

 SpringApplication.run(App.class, args);

We can do that either by storing String[] args in a static variable or by using the ApplicationArguments bean that Spring Boot creates to hold the arguments.

var args = appCtx.getBean(ApplicationArguments.class);      //retrieve
appCtx = SpringApplication.run(App.class, args.getSourceArgs());  //pass

3.5) Web endpoints to trigger the shutdown and restart

@RestController
class Ctrl {

    @GetMapping("/shutdown")
    void shutdown() {
        App.shutdown();
    }

    @GetMapping("/restart")
    void restart() {
        App.restart();
    }
}

4) Complete Example:

package gt.restart;

import org.springframework.boot.ApplicationArguments;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.ConfigurableApplicationContext;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@SpringBootApplication
public class App {

    static ConfigurableApplicationContext appCtx;

    public static void main(String[] args) {
        appCtx = SpringApplication.run(App.class, args);
    }

    static void shutdown() {
        if (appCtx.isActive()) {
            appCtx.close();
        }
    }

    static void restart() {
        var args = appCtx.getBean(ApplicationArguments.class);

        Thread thread = new Thread(() -> {
            appCtx.close();
            appCtx = SpringApplication.run(App.class, args.getSourceArgs());
        });

        thread.setDaemon(false);
        thread.start();
    }

}

@RestController
class Ctrl {

    @GetMapping("/shutdown")
    void shutdown() {
        App.shutdown();
    }

    @GetMapping("/restart")
    void restart() {
        App.restart();
    }
}

5) Test

Call the endpoints

GET http://localhost:8080/restart

GET http://localhost:8080/shutdown

Read Request parameter String from URL in Spring/HttpServlet

Read entire query string(request parameter) in Spring Boot/HttpServlet

In Spring Boot and other Java web frameworks, we usually capture the request parameters (aka query parameters) and map them to individual variables or to a Map<String,String>.

But if you want to read the entire query string in its raw form (eg: to read name=Ganesh&country=Nepal&ageLt=30&page=1 from the request http://localhost:8080/search?name=Ganesh&country=Nepal&ageLt=30&page=1), you can use HttpServletRequest.getQueryString().

Example:

@GetMapping("/test1")
void endpoint1(HttpServletRequest req) {
    var qs = req.getQueryString();   // the entire raw query string, or null if the URL has none
    if (qs != null) {
        var pairs = qs.split("&");   // split to get the individual name=value pairs
    }
}

Note: Spring Boot automatically resolves the HttpServletRequest argument (it's the real holder of the request data in an HTTP Servlet environment).
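Once you have the raw string, turning it into name/value pairs takes only a few lines. Here is a minimal sketch using just the JDK (Java 10+ for this URLDecoder overload); the class QueryStringParser and its parse method are hypothetical helpers, and repeated parameter names simply keep the last value.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class QueryStringParser {

    // Parses a raw query string such as "a=1&b=2" into an insertion-ordered map.
    // A null or empty input yields an empty map; a name without '=' maps to "".
    public static Map<String, String> parse(String rawQuery) {
        Map<String, String> params = new LinkedHashMap<>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        for (String pair : rawQuery.split("&")) {
            if (pair.isEmpty()) {
                continue; // tolerate "a=1&&b=2"
            }
            String[] kv = pair.split("=", 2);
            String name = URLDecoder.decode(kv[0], StandardCharsets.UTF_8);
            String value = kv.length > 1 ? URLDecoder.decode(kv[1], StandardCharsets.UTF_8) : "";
            params.put(name, value); // a repeated name overwrites the earlier value
        }
        return params;
    }

    public static void main(String[] args) {
        System.out.println(parse("name=Ganesh&country=Nepal&ageLt=30&page=1"));
        // prints {name=Ganesh, country=Nepal, ageLt=30, page=1}
    }
}
```

Inside the endpoint you would call it as QueryStringParser.parse(req.getQueryString()).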

GIT delete merged branches

How to delete merged branches from local copy:

When you work on a project for a while, there is a good chance that you will have a list of old/already merged branches showing on your local git.

When you run git branch, it shows all the local branches you have created since you started working on the project, including the ones that are no longer active.

$git branch
    PROJX-1024-add-new-button
    PROJX-1026-add-another-button
    PROJX-1027-titleupdates
    build-fix
    develop
    release
    master

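Here's how you can find the merged branches. It lists only the local branches whose commits are already contained in the branch you currently have checked out (i.e. branches that have been merged into it), so run it from your main integration branch.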

$git branch --merged
    PROJX-1024-add-new-button
    build-fix
    develop
    release

Here's how you delete the merged branches except develop and release

$git branch --merged | egrep -v "(^\*|develop|release)" | xargs git branch -d

You can add any number of branches to the skip list separated by pipe |

$git branch --merged | egrep -v "(^\*|develop|release|another_branch|yet_another_branch)" | xargs git branch -d
