
Story of a Java application in the cloud on Heroku


Starting with a monolithic application is not uncommon. But when demand arises, it is important to have a plan or path to go distributed, either as a big-bang change or as a phased approach. I took the phased approach, and the phases sort of happened naturally (without my even knowing the right technical terms, BUT the concept and vision were clear). I will try to tell the story in this post.

Although I will say "sample app", and the tutorials I prepared for this use a "sample app", I faced these scenarios in real life a few years ago and learned a thing or two. I am using Heroku for this "sample app", but this can also be implemented on AWS or Azure.

I am sure there's always a better way of doing it, but this is how I approached it.

Firstly, let's set some functional specifications for our "sample app":
  1. The app will take requests from users via a browser (there's no restriction on how many users can hit the app in a given second).
  2. The app will show the response/result of the request via dynamic web pages.
  3. The app will then do some heavy computing, during which it may call an external IO for read purposes (not shown in the design, as the read operation is assumed to be fast enough).
  4. The app will then write to a slow IO.


Let the story begin.

Back story:
I had an idea for a web-based application to solve a certain business problem. I wasn't sure whether the application would be adopted or whether I would use it long term. But I wanted to build something quickly and start using it, to see if it actually solved the problem or became useful to other people or to me.

Really, most new ideas start like this and evolve from here. To build the perfect solution there are just too many things to consider, it takes too long to build, there are too many scenarios and edge cases to cover, and it's too much effort to start something that may or may not last long. Or worse, someone will do it before you.


Phase-1:
So I did the terrible thing and built a monolithic Java application. I used Spring Boot with Thymeleaf as the templating engine (do not ask why, I just did it). The "sample app" looked something like this:


Let's call this "The Monolith" and dissect a little bit what's really happening here:
  • The app is on Heroku ecosystem.
  • The app's source code is on Git (Bitbucket / Github).
  • There's a deployment pipeline to automate the deploy onto Heroku.
  • Auto scaling is set up (for the Dynos to auto-scale) based on the default p95 response time (min 1, max 2). Reference here: https://blog.heroku.com/heroku-autoscaling
  • Dyno type was Standard-1x.
Now let's dissect how the "sample app" is working here:
  • Spring Boot based Java app. Thymeleaf is the templating engine. Tomcat is the web server.
  • By design/nature, Tomcat will use one thread per request. This will cause some issues, which I will point out in the next section.
  • When a request is received, the "sample app" reads a BIG file that contains JSON data and does some heavy, CPU-intensive computing.
  • Once the computation is finished, the "sample app" writes to our SLOW IO.
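To make the flow concrete, here is a minimal sketch of that request path. The class, method, and file names are illustrative assumptions, not the actual app's code:

```java
import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;

@Controller
public class ProcessController {

    @PostMapping("/process")
    public String process(@RequestParam String input, Model model) {
        String data = readBigJsonFile();            // 1. IO read (assumed fast enough)
        String result = heavyCompute(data, input);  // 2. CPU-heavy work, still on Tomcat's request thread
        writeToSlowIo(result);                      // 3. SLOW IO write -- the request thread blocks here too
        model.addAttribute("result", result);
        return "result";                            // rendered by Thymeleaf
    }

    private String readBigJsonFile() { /* read + parse the big json file */ return "..."; }
    private String heavyCompute(String data, String input) { /* cpu-intensive computing */ return "..."; }
    private void writeToSlowIo(String result) { /* write to the slow external IO */ }
}
```

The thing to notice is that everything happens on the one Tomcat request thread, which is exactly what goes wrong below.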
Let's discuss the issues that will happen (and have happened) in this scenario:
  • The app is running on Standard-1x with a scale of max 2, so it is restricted to 512 threads only.
  • The app will take some time during the heavy processing (let's assume the IO read is fast enough to be negligible; also, for simplicity, let's not worry about memory consumption for now).
  • The app will take even longer for the IO write, because the IO response is slow.
Things are about to go terribly wrong when our "sample app" starts gaining popularity (meaning more users making more requests).

Everything that can go wrong will go wrong:
  • Due to slow processing (read + compute), each request thread takes longer to finish.
  • This cascades and causes the app to hit the 512 max thread limit.
  • To make matters even worse, due to the slow IO write the threads take even longer, and eventually many (most likely at random) will hit Heroku's 30-second request timeout (the randomness depends on how the IO service performs as an increasing number of simultaneous write operations are called for).

Phase-2:
So I read a few blogs here and there and took the next quick action (ignore the money/cost side for now; let's say I had a money-growing tree in my backyard). I won't say whether it was good or bad. But in a real-life scenario it served me OK.

Monolith to threading


Here's what I did:
  • Added threading capability (on top of Tomcat's request threads), so the "sample app" creates a background thread per request (depending on whether the request calls for heavy compute and/or an IO write).
  • Scaled Dynos vertically. I upgraded the Web Dynos to the Performance-M type. Reference here: https://devcenter.heroku.com/articles/dyno-types
  • Made some optimisations to the slow IO write calls.
Why I did it:
  • Serving the user request is priority #1. The app does not have to return the computed result right away, but it should give the requesting user some indication right away that it is working on the request. Simply waiting for a response causes frustration (and when users are frustrated they do interesting things, like hitting the refresh button and resubmitting the request, which makes things even worse :).. but it's fair).
  • A nested/child thread works OK in this context. All the request thread has to do is determine whether the processing is going to be long (in this context, heavy computing and/or a slow IO write) and, if the answer is yes, create a new thread and offload the heavy task to the child thread, then immediately respond to the user's request saying "Trust me, I am working on it; check back in a moment or I will let you know when I am done."
  • I did not go crazy and create child threads without any limit. I used a fixed-size thread pool (Java's ThreadPoolExecutor) matched to the number of cores/compute shares (11x compute). A minimal sketch of this follows below.
  • The vertical scaling made sure that the "sample app" and its threads had enough memory (2.5G) and enough CPU (11x) in a dedicated environment (probably c-2-mid in AWS terms).
  • I did a few things right when I was building the app:
    • I made sure that the application maintains strict layers of services, and that services are constrained by their domain.
    • All the service classes are stateless, which inherently made them somewhat thread safe.
    • Session variables and/or shared vars were kept to a bare minimum, and only in the controller space, so none of these vars flowed through to the services.
    • When something was said to be a POJO, I made sure the object remained as dumb as it can be.
    • Used some patterns (e.g. flyweight) to keep memory consumption low during read+compute.
    • Separation of concerns was strictly maintained. The read was isolated from the write, and so was the compute. Controllers only handle GET/POST and nothing creepy.
All of the above made it easy to decouple layers and switch to threading, so making it multi-threaded was a relatively fast job.
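Here is a minimal sketch of that offloading, assuming a plain fixed-size ExecutorService sized to the available cores; again, the names are illustrative, not the actual app's code:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;

@Controller
public class ProcessController {

    // One bounded pool for the whole app, sized to the dyno's cores/compute
    // shares, so background threads never grow without limit.
    private static final ExecutorService HEAVY_TASK_POOL =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

    @PostMapping("/process")
    public String process(@RequestParam String input, Model model) {
        if (isHeavy(input)) {
            // Offload read + compute + slow IO write to a child thread...
            HEAVY_TASK_POOL.submit(() -> {
                String result = heavyCompute(readBigJsonFile(), input);
                writeToSlowIo(result);
            });
            // ...and respond immediately: "Trust me, I am working on it."
            return "processing";
        }
        model.addAttribute("result", heavyCompute(readBigJsonFile(), input));
        return "result";
    }

    private boolean isHeavy(String input) { /* will this request run long? */ return true; }
    private String readBigJsonFile() { return "..."; }
    private String heavyCompute(String data, String input) { return "..."; }
    private void writeToSlowIo(String result) { /* slow external write */ }
}
```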

The Result:
The issues I was facing went away. Requests and responses were smooth enough. Things started to look good.

BUT things still weren't really up to the mark:
  • Cost (billing) was through the roof. Of course, I did not have a money-growing tree in any yard where I live. At first it was OK, but the Dynos were spiking to the MAX (6) more and more often as the "sample app" gained popularity, and the number of regular users started to increase exponentially. The cost was adding up.
  • The slow IO got even worse. As the number of threads increased (with each Dyno scale-up), it could not handle the load and started to time out and/or reject requests. So I had to write some creepy retry code, which made my code ugly.
  • The slow IO is the main culprit here: its write threads were getting stuck while the request threads were creating more threads (per scaled Dyno), which made the IO writes even slower.
  • The performance of the read+compute side was relatively ok. 

It worked OK, but it was not sustainable. I started to foresee the things that would eventually blow up.

Phase-3:
I finally listened to what Heroku had been telling me repeatedly all along.

Monolith to Threading to micro-services

Here's what I did:
  • Divided the app into 2: Web and Worker.
  • Web does the web processing.
    • A thread processes the fast read+compute operation as before, but does no IO write anymore.
    • Instead, it writes the task (with its payload) to the queue (see the sketch after this list).
    • I suffered enough with the inconsistent behavior of Heroku's session affinity, so I switched to Redis for sessions. My session was already lean, so this wasn't an issue at all. In fact, it went more smoothly than I anticipated.
  • The Worker picks tasks from the queue and performs the slow IO operation.
  • I had to refactor things a little bit (e.g. dividing the project into 2, making a commons lib shared between both to avoid serialization/deserialization issues (check the videos, you will know), parent-child POMs, etc.). But it was worth it.
  • I decreased the Web Dynos' autoscale max down to 3. But I have noticed that the "sample app" rarely reaches 3; at the same peak as before it mostly reaches 2 Dynos. Note to future self: decrease the max to 2.
  • Each Dyno spins up 3 workers, so the "sample app" at its max would run at most 9 workers. Yes, they would perhaps run longer, but that's acceptable (I measured the average processing time; it was within reasonable parameters).
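Here is a rough sketch of the Web-to-Worker handoff, assuming spring-boot-starter-amqp and an illustrative queue name; the real app's names and payloads differ. (The session move to Redis is mostly configuration via Spring Session and isn't shown.)

```java
import java.io.Serializable;
import org.springframework.amqp.rabbit.annotation.RabbitListener;
import org.springframework.amqp.rabbit.core.RabbitTemplate;
import org.springframework.stereotype.Component;
import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;

// Shared commons lib: the one payload class both apps compile against,
// which avoids the serialization/deserialization mismatch mentioned above.
class WriteTask implements Serializable {
    final String payload;
    WriteTask(String payload) { this.payload = payload; }
}

// Web app: read + compute stays in-process; the slow IO write becomes a queued task.
@Controller
class ProcessController {

    private final RabbitTemplate rabbitTemplate;

    ProcessController(RabbitTemplate rabbitTemplate) { this.rabbitTemplate = rabbitTemplate; }

    @PostMapping("/process")
    public String process(@RequestParam String input, Model model) {
        String result = heavyCompute(readBigJsonFile(), input);
        rabbitTemplate.convertAndSend("slow-io-writes", new WriteTask(result)); // enqueue, don't block
        model.addAttribute("status", "processing");
        return "processing";
    }

    private String readBigJsonFile() { return "..."; }
    private String heavyCompute(String data, String input) { return "..."; }
}

// Worker app: the only place that touches the slow IO now.
@Component
class SlowIoWriteWorker {

    @RabbitListener(queues = "slow-io-writes")
    public void handle(WriteTask task) {
        writeToSlowIo(task.payload);
    }

    private void writeToSlowIo(String payload) { /* slow external write */ }
}
```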
Result:
  • Cost was slashed in half, which is always good (more money at beer-o-clock).
  • User experience improved by 10x. Users would still wait with a "processing" gif on their page, and due to the nature of the application this is normal and standard. But the thread response was faster, so the average "result produced" time decreased significantly.
  • The slow IO was also happy, as it was getting far fewer writes at a given time. The rejections stopped. It would still occasionally time out, but rather than handling that with creepy code (I removed the creepy code), the job/task would simply go back to the queue to be processed again. I did not have to do anything extra for it; it was by design with RabbitMQ and Spring AMQP.
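That requeue-on-failure behavior comes from Spring AMQP's defaults rather than from any code of mine. The property below is already the default in Spring Boot; it is shown explicitly only to make the behavior visible:

```properties
# A message whose listener throws (e.g. the slow IO timed out) is rejected
# and requeued rather than dropped, so the task is simply retried later.
spring.rabbitmq.listener.simple.default-requeue-rejected=true
```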

Things are finally looking up. This is where I paused scaling my real life application.

But I have been thinking that I can do even better.

Phase-4:
This is a future phase, and in theory it should work a lot better from both a cost and a performance perspective. Sure, it will need further refactoring, but that will be worth it.
Monolith to Distributed

Here's what I am thinking of doing:
  • Remove the multi-threading from the web application by not creating any child threads from request threads.
  • Make the request-response process super dumb. All the web app should do is take the request from the user and raise a task/job in the queue for someone else to process (a sketch follows this list).
  • Divide the web application further into 2, making the "sample app" a collection of 3 small apps:
    • One for request handling
    • One for reading+computing
    • One for writing to the slow IO
  • Completely separate the code bases and have different pipelines, as these really are 3 different apps. So the total would possibly be 4 code bases (3 apps + 1 commons lib).
  • Refactor to check statuses from the queue rather than waiting on a CompletableFuture result from a thread. I should even be able to enhance the user experience (note to future self: do the cool stuff here).
  • Decrease the Dyno size vertically, back to Standard-1x.
  • Increase the number of workers per Dyno (I should be able to do this by creating a clever pipeline... a post/experiment for another day).
  • Utilise Rails Autoscale to scale the worker processes up and down. Reference here: https://devcenter.heroku.com/articles/rails-autoscale
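Here is a hypothetical sketch of how dumb the request-handling app could become; the queue name, the status store, and the polling endpoint are all assumptions at this point:

```java
import java.util.UUID;
import org.springframework.amqp.rabbit.core.RabbitTemplate;
import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.ResponseBody;

@Controller
public class RequestController {

    private final RabbitTemplate rabbitTemplate;

    public RequestController(RabbitTemplate rabbitTemplate) { this.rabbitTemplate = rabbitTemplate; }

    // No child threads, no computing: just raise a task/job on the queue.
    @PostMapping("/process")
    public String process(@RequestParam String input, Model model) {
        String jobId = UUID.randomUUID().toString();
        rabbitTemplate.convertAndSend("compute-tasks", jobId + "|" + input);
        model.addAttribute("jobId", jobId);
        return "processing"; // the page polls /status/{jobId} for progress
    }

    // Status comes from a shared store (e.g. Redis) that the workers update.
    @GetMapping("/status/{jobId}")
    @ResponseBody
    public String status(@PathVariable String jobId) {
        return lookupStatus(jobId); // "queued" | "computing" | "done"
    }

    private String lookupStatus(String jobId) { /* e.g. read from Redis */ return "queued"; }
}
```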

Benefits:
  • Cost will go down even more (possibly significantly).
  • I could possibly even decrease the Web autoscaling max down to only 2, because all the web app will do is take a request and add a task/job to the queue.
  • A Standard-1x Dyno is going to run all the time anyway, and the number of process types is unlimited. I should take advantage of this by creating more worker processes and offloading the heavy lifting to them.

Live coding of the "Sample App":
Coming soon... (check back in 3 days' time)

