Cracking The Coding Interview Together – Problem 1.6: String Compression
Welcome back to our Cracking the Coding Interview — Together series.
If you're following this journey every Sunday, you already know the drill: we take one question from the book, sit down as if we were solving it in front of a real interviewer, ask the right questions, think out loud, try a few approaches, refine them, write clean code, and understand why the code works.
Today’s problem is String Compression — a deceptively simple one at first glance, but depending on how you think about edge cases, can trip you up in an interview.
We always follow the format as follows
- Understand the problem statement together
- Ask any questions, that you would like to ask the interviewer
- I would urge you to take your time here and try to come up with your own solution before going forward
- Think how you would go about solving the problem(you would be talking out loud in front of the actual interviewer)
- Write code and understand how it works and explain it to your interviewer
Problem statement
String Compression: Implement a method to perform basic string compression using the counts of repeated characters. For example, the stringaabcccccaaawould becomea2blc5a3, If the "compressed" string would not become smaller than the original string, your method should return the original string. You can assume the string has only uppercase and lowercase letters (a - z).
Analyse the problem statement together
We need to write a method that compresses a string by replacing repeated characters with character + count
However, there’s a twist:
If the compressed string is not shorter than the original, return the original string.
There is nothing more to it, it is a simple programming problem where we basically keep track of the current character and its frequency and then follow through.
Additional constraints:
- Input string contains only uppercase and lowercase letters (a–z, A–Z).
- The compression is basic — we’re not doing any advanced algorithms like entropy encoding or Huffman trees. Just run-length encoding.
- Characters are case sensitive (
aandAare different).
Take some time and try to dig deep to see if there is anything unclear to you.
Questions for the interviewer
Every good interview solution starts with the right questions. If I were sitting in front of an interviewer, I’d ask:
1. Are we guaranteed the string contains only letters?
Yes — the problem states uppercase/lowercase letters only.
2. Should counts be appended even when the character appears once?
Yes — b becomes b1.
3. Are we allowed to use extra space? Or do we need to compress in-place?
Typically:
- In Java, strings are immutable, so you must use extra space.
- If you want bonus points, ask if they expect an approach using
StringBuilderinstead of string concatenation (they do).
4. What about extremely large counts?
Not relevant in this version, but clarifying doesn’t hurt.
5. Should whitespace be treated as a character?
Not needed — input has only letters.
These questions show clarity of thought and prevent misunderstandings.
Think about the Solution
Like always, I recommend you pause here and actually think through how you'd solve it.
This is where interviewers pay attention.
Let’s reason through it as if you're walking them through your thought process:
Observation 1
We’re effectively doing run-length encoding: counting how many times the current character repeats consecutively.
Observation 2
We need to keep track of:
- The current character.
- How many times it has repeated.
- When the character changes → append the previous character + count.
Observation 3
We must check if the compressed string is shorter.
Otherwise, return the original. So either:
- build compressed string and compare lengths
OR - compute compressed length first (optimization)
Observation 4 — Brute Force Approach
Let’s outline a brute-force first, since interviewers love seeing your progression.
Brute Force Idea:
- Loop through the string.
- For each character, count consecutive repetitions.
- Append to a result string using string concatenation (
result = result + char + count). - At the end, return whichever is shorter.
Why is this brute-force?
Because:
- Using string concatenation repeatedly creates new string objects each time → O(n²) time.
- This reveals whether you understand immutability in Java.
Still — it’s simple and works.
Observation 5 — Optimal Approach
To optimize:
- Use
StringBuilder(O(n) time). - Iterate once.
- Append compressed segments.
- Compare lengths.
The key here is: string concatenation vs. StringBuilder — interviewers want to see if you know this difference.
As always, I encourage you to stop here and think:
- How would you count repetitions?
- How do you handle the final sequence at the end of the loop?
- What are tricky edge cases?
- “a” → "a1" (but original is shorter → return "a")
- “aaa” → “a3”
- “ab” → “a1b1” → original shorter → return “ab”
- Case sensitivity: “aAaa” → treat individually.
Take 2–3 minutes and sketch a solution.
Write code and explain
Since you have the complete understanding of the solution you explain the solution and write the code. Implementation could be as follows:
public static String compressString(String s) {
if (s == null || s.length() == 0) {
return s;
}
StringBuilder compressed = new StringBuilder();
int count = 1;
for (int i = 1; i < s.length(); i++) {
if (s.charAt(i) == s.charAt(i - 1)) {
count++;
} else {
// append previous sequence
compressed.append(s.charAt(i - 1)).append(count);
count = 1;
}
}
// handle final sequence
compressed.append(s.charAt(s.length() - 1)).append(count);
// return shorter between original and compressed
return compressed.length() < s.length() ? compressed.toString() : s;
}Walkthrough of How the Code Works
example
aabcccccaaa1. Initialize
compressed = ""count = 1- Start looping from index 1.
2. Compare each char with previous
- i=1:
a == a→ count = 2 - i=2:
b != a→ append"a2", reset count to 1 - i=3:
c != b→ append"b1", reset count to 1 - i=4:
c == c→ count = 2 - i=5:
c == c→ count = 3 - i=6:
c == c→ count = 4 - i=7:
c == c→ count = 5 - i=8:
a != c→ append"c5" - i=9:
a == a→ count = 2 - i=10:
a == a→ count = 3
3. Append the last group
- append
"a3"
Final compressed string
a2b1c5a3Compare lengths
- original length = 11
- compressed length = 9 → return compressed
Time & Space Complexity
Time Complexity: O(n)
- We iterate through the string once.
- Appends are amortized O(1).
Space Complexity: O(n)
- Worst-case: compressed string is slightly longer than original (e.g., "abc" → "a1b1c1").
Final Thoughts
String compression problems test string manipulation, pattern recognition, and attention to detail.
This one in particular is a great interview warm-up question because:
- It tests understanding of run-length encoding.
- It checks whether you know how to build strings efficiently in Java.
- It forces you to handle edge cases.
- It’s simple enough to code under pressure, but deep enough to evaluate thinking.
And with that, we have completed Problem 1.6 together.
If you’re following this series, you know the routine — next Sunday, we’ll tackle the next challenge from Cracking the Coding Interview, line by line, thought by thought, solution by solution.
Let me know if you want a visual diagram or step-through animation — happy to include one next time.
Happy Coding!
Member discussion