Thursday, March 29, 2018

How to Ace Code Interview
  • The recruiter is your friend too! Before any interview, feel free to ask the recruiter for the format of the interview, expectations, preparation material, general tips, etc. This will really help you focus your attention to specific things because otherwise CS is a vast area to tackle.
  1. Understand the problem - ASK QUESTIONS!!! Check your understanding by repeating back what you think you’re hearing: “If I understand correctly, you’re asking…” or “Is that like…?” Interviewers can give you more information or clarify something you may not understand.
  2. Define the algorithm with a block diagram or a list of steps and check it with the interviewer.
  3. Define use cases for the function to understand the code that must be written. Think of the corner conditions.
  4. Slowly, methodically start explaining and writing code. Start off by defining the function with a return type (void, bool, int, string, etc) and inputs needed.
    1. Remember it’s “pseudocode” so don’t get caught up on syntax.  The more you talk, the easier it will be for the interviewer to understand your thoughts behind the code you’re writing and re-steer you/guide you if you’re slightly off
    2. Once code is written, run an example (test cases)
    3. Any bugs, find them before your interviewer does
Your whiteboard code is merely a code sample from which your interviewer will derive information. The information that I derive here is that you probably don’t use the built-in linked list class much. Do I care? Not even a little bit.
What about code like this?
int getMax(int[] array) {
 int max = array[0];
 for (int i = 0; i <= array.length; i++) {
if (array[i] < max) {
 max = array[i];
 return max;
We have a few issues here.
  1. This code will crash at line 2 if the array is null or empty.
  2. The for loop starts at 0 when there’s no need for it to. It should start at 1.
  3. The for loop goes through array.length. This will cause an exception every time.
  4. The comparison on line 4 is backwards.
int getMax(int[] array) {
 int max = array[0];
 for (int i = 0; i < array.length — 1; i++) {
if (array[i] > array[i + 1]) {
 max = array[i];
 return max;
This isn’t just carelessness. The code doesn’t even make sense algorithmically. I’m very concerned about that.
How You Identify and Fix Your Bugs
Many candidates panic when they find a bug. A particular test case reveals a bug, and then they make a quick fix. The “fix” resolves it for that test case, but perhaps doesn’t fix the true issue.
For example, consider this code to locate all instances of a string s within a string b:
int countSubstrings(String s, String b) {
 int count = 0;
 for (int i = 0; i < b.length() — s.length(); i++) {
String bSubstring = b.substring(i, i + s.length());
if (bSubstring.equals(s)) {
 return count;
At first glance, the code basically looks correct, but it’s not. (Did you notice the bug yet?)
Many candidates spot the bug when they throw in a test case like a = “xyz” and b = “xyz”. That’s when they notice that the for loop never gets executed at all.
When I say slow programmer, I’m not saying someone is slow in typing. More generally, I mean people who are slow in coming up the perfect code.

In a 45min interview session, you will have few minutes for introduction in the beginning and few minutes for questions at the end. As a result, you only get ~35min for coding questions!
Generally, interviewers will expect you to solve 2 questions per interview. Depending on the difficulty, you may need to write solid code for at least one questions.
Therefore, honestly ask yourself whether you can complete 2 coding questions within 35min. If the answer is no, you should definitely need to speed up.
It’s worth to note that some interviewer will only ask one coding question but with a bunch of follow-ups. If you are trying to evaluate whether you are too slow in an interview, don’t forget to take the difficulty of questions into consideration.

Also, remember that interviewers won’t let you know if you are slow. If you used up all the time for his first question, usually he would just say he finished all his questions although the second one never got a chance to ask.

#1 Write code before having a clear mind
This is the most common mistakes I’ve seen. A lot of people like to start coding immediately after they have a vague idea, which is extremely terrible.
I can easily predict what’s gonna happen next. The candidate will start coding and get stuck very soon as he needs to figure out a bunch of details. He may keep fixing his code for a while. At one point, he notices that the whole approach doesn’t work and he has to erase everything from the whiteboard.
This wastes tons of time although it looks like he moves fast! Spending few minutes to discuss your solution with interviewers is definitely worth the time.
#2 Over optimization
In other words, some people tend to complicate the problem. They will consider a bunch of production issues that is not necessary for simplified coding questions. For example, they may consider things like security issue, dead lock, integer overflow etc..
I’m not saying that you shouldn’t consider these issues. But it’s better to postpone this discussion after you’ve finished the simple solution.
#3 Reinvent the wheel

Sometimes you don’t actually need to implement everything. For instance, some common sorting algorithms or binary search are not required to code from scratch.

Do ask interviewers whether you really need to implement them. If the question is not focused on these algorithms, most likely you can just use them for granted.

Identify your bottleneck

Usually there are two parts of a coding interview – coming up with the right approach and writing solid code, it’s always better to figure out which is your bottleneck before the optimization.
If you are slow to provide a solution, it’s very likely that you don’t have enough practice. Once you have worked on tons of coding questions, you’ll come up with the right approach within minutes. In this case, I don’t have any better suggestion than practicing as much as you can.
If you find yourself slow to finish coding, you should really write solid code for every question you practice with. I’ve seen so many candidates who came up with an approach quickly but failed to finish the code within the whole session. Do write solid code (NO PSUEDO CODE!) while practicing. It’s totally different from “solving in your mind”.

Practice with a timer

if you are slow when practicing, there’s no chance you’ll be faster in a real interview.
Therefore, it’s highly recommended to put a timer aside when you practice with coding questions. You’ll realize how different it is for sure.
You may feel nervous or maybe excited. Either case is likely to slow you down. The point is that if you can mimic the same environment as a real interview when practicing, you should get a better chance

Naive solution first

Never hold back your solution when you think it’s too naive.
What is highly recommended is to tell your solution even if it’s not concrete yet. For many interview questions, it won’t be hard to come up with a basic solution like brute force. If you are concerned about whether it’s too naive, you can say something like “I know this is definitely not the optimal solution, but I’d like to mention that…”.
The biggest advantage is that you successfully proved that you can solve the problem at least with some approached. Also, the naive solution acts as a starting point, which helps you to keep optimizing it.
In addition, communication plays an important role in coding interviews. Even if the idea is vague, thinking loudly can be helpful to form more concrete ideas and interviewers are also likely to discuss more about it.
If you are familiar with Python, do consider use it in coding interviews.
The simple syntax sometimes can save you a lot of time, especially compared to Java.
Again, this is optional so that you definitely don’t need to learn Python from beginning just for coding interviews. Using a language you are not familiar with does more harm than good.

Typing/writing fast won’t save you much time. What’s more, bad handwriting may confuse both interviewers and yourself.
Also. never sacrifice your time of thinking and discussion as we said before. Moving slowly but steady is the best way to speed up.

#1 Big-O analysis

#2 Input validation

It’s a great habit to validate all your input, which is true for both production code and interviews. However, many people assume inputs are all validated without even asking.
As the robustness principle said “Code that sends commands or data to other machines should conform completely to the specifications, but code that receives input should accept non-conformant input as long as the meaning is clear.”
So a practical suggestion is whenever you’ve written down the function definition, put input validation right after that immediately.
Another suggestion is to ask interviewers for clarification, which can save you from unnecessary checkings. For instance, you may ask if you can assume all inputs are integers. If the answer is yes, then you can skip the integer checking in your code.

#3 Check corner cases

In an interview, most people are a little bit nervous and want to finish the code as soon as possible. So it’s very natural that those solutions don’t take all situations into consideration.
A general tip is that you should always consider corner inputs when writing your solution, which is a great habit for experienced engineers.
Secondly, after you finish your code, it’s highly recommended to use few example inputs to test your solution and you can include several corner inputs.
Writing robust code is a skill that is developed through experience and real life projects. That’s why good engineers can write well-rounded solutions without even thinking about it.

#4 Clean code

First, you should have good handwriting on a whiteboard. Many people tend to write very fast in hopes of saving time, however they may waste more time to explain what they have written.
Not only does clear writing make it easier to communicate with interviewers, but it also makes the reader comfortable and happy.
Second, you should care about your coding style. I suggest everyone pay attention to this point as it’s both a low hanging fruit for interview preparation and a great habit for your work.

#5 Prepare questions at the end of the interview

At the end of each interview, you will get a chance to ask any question to the interviewer. In fact, it’s a great opportunity to further impress the interviewer.
It’s unlikely to happen that someone who failed to solve a single problem but got hired due to good questions. However, good questions can still make you stand out to some extent and give you some advantages, let alone that it doesn’t require a lot of time to prepare.
Questions to Ask At The End of an Interview has a detailed discussion about this. In a nutshell, you can ask questions about product, company culture or technical questions if you don’t have any idea now and you should prepare well before the interview.

#6 Be confident in communication

A lot of people are so nervous in an interview that they are even afraid of making a single mistake. As a result, they tend to be very unconfident in communication.
For example, even if they are quite sure about the time complexity, they could still say “I guess the time complexity might be O(n), I may be wrong though”. From the interviewer’s perspective, it seems that the candidate is not very clear about big-O analysis although he gave the correct answer.
However, I’m not encouraging you to be confident when you are not. The correct interpretation of this point is to be certain about things you know. Of course everyone can make a mistake and you don’t need to be afraid of that.
Also you will keep discussing with the interviewer so that you may still figure out the correct answer later.

#7 Smile

Don’t take an interview as a painful exam. Instead you should think of it as a chance to discuss interesting problems with someone else.
A lot of people look pretty “unhappy” from the beginning of the interview. They might be too nervous or they have struggled a lot to solve the problem.
However, as is known to all that your mode not only can affect your performance, but your interviewer’s feedback as well.

Wednesday, March 28, 2018

Redis Misc 2
At the same time, we also need to understand that, clustering implies some limitations on the way we use Redis keys (these limitations are very logical though).
  1. Transaction cannot be performed on keys which are part of different range of hash slots.
  2. Multi key operations cannot be performed on keys which are part of different range of hash slots.
For example, suppose there are two keys "key1" and "key2". key1 is mapped to hash slot 5500, thus, is stored in Node A. key2 is mapped to hash slot 5501, thus, is stored in Node B.
So, we cannot perform transaction on those keys. Nor we can perform multi key operations on key1 and key2. Multi key operation like "mget key1 key2" will throw exception.

A simple answer is, by ensuring that the keys on which we perform multi-key operation or transaction, are part of same hash slot range. And we ensure this by "Hash Tagging" Redis keys.
Hash Tags are a way to ensure that multiple keys are allocated in the same hash slot. There is an exception in the computation of hash slots which is used in implementing Hash Tags.
In order to implement hash tags, the hash slot for a key is computed in a slightly different way in certain conditions. If the key contains a "{...}" pattern, only the substring between { and } is hashed in order to obtain the hash slot. However, since it is possible that there are multiple occurrences of { or }, the algorithm is well specified by the following rules:
  • IF the key contains a { character.
  • AND IF there is a } character to the right of {
  • AND IF there are one or more characters between the first occurrence of { and the first occurrence of }
Then instead of hashing the key, only what is between the first occurrence of { and the following first occurrence of } is hashed. Let's have a look at the following examples:
  1. The two keys {user1000}.following and {user1000}.followers will hash to the same hash slot since only the substring user1000 will be hashed in order to compute the hash slot.
  2. For the key foo{}{bar}, the whole key will be hashed as usually since the first occurrence of { is followed by } on the right without characters in the middle.
  3. For the key foo{{bar}}zap the substring {bar will be hashed, because it is the substring between the first occurrence of { and the first occurrence of } on its right.
  4. For the key foo{bar}{zap} the substring bar will be hashed, since the algorithm stops at the first valid or invalid (without bytes inside) match of { and }.
  5. What follows from the algorithm is that if the key starts with {}, it is guaranteed to be hashed as a whole. This is useful when using binary data as key names.
Query routing means that you can send your query to a random instance, and the instance will make sure to forward your query to the right node.
Redis Cluster implements an hybrid form of query routing, with the help of the client (the request is not directly forwarded from a Redis instance to another, but the client gets redirected to the right node).
Redis cluster divides keyspace onto 16384 hash slots (CRC 16 mod 16384) and assign range to every node, so every node in a Redis Cluster is responsible of a subset of the hash slots. Move of the hash slot is automatic.
Redis Cluster supports multiple key operations as long as all the keys involved into a single command execution (or whole transaction, or Lua script execution) all belong to the same hash slot. The user can force multiple keys to be part of the same hash slot by using a concept called hash tags.
Master and slave roles are pre-configured, also this is possible to configure replica to serve particular master.
As well as for Sentinel, slave may become a master. In this case part of the changes will be lost due to eventual consistency.
In a Redis Cluster replica may be assigned to the master automatically and automatically migrated.
Split-brain is prevented by majority principle.
Until fail this will be a strong (linear, due to single master and single thread) view consistency for master and eventual consistency for replicas. In case of failover - eventual consistency with data loss.
There is no rack-aware replication inside Redis, but Redis Labs declared rack-awareness (as well as HA) in their product, Redis Labs Enterprise Cluster, which is based on Redis.
Redis provides asynchronous persistence to files:
  • Snapshot (RDB)
  • AOF (Append-only file) operation log
RDB file is a compact snapshot of the storage which is created automatically and good for backup, disaster recovery and quick restart.
Save of snapshot may be configured by timeout or by changes count.
In the same time, creating of the snapshot is a heavy operation (full dump) which should not be invoked too frequently, so in case of node fail, RDF can be old.
AOF is a command log of every write operation received by the server, to be replayed on start-up, reconstructing original dataset.
Commands are logged using the same format as the Redis protocol itself, in an append-only fashion.
Redis is able to rewrite the log on background when it gets too big. Nevertheless this means that re-play may take significantly more time than load from RDF.
Also in case of failure, AOF file may be incomplete and data after last filesync may be damaged and ignored.
AOF is updated on every write operation, but file sync is triggered depending on the setting:
  • fsync every time a new command is appended to the AOF. Very very slow, very safe.
  • fsync every second. Fast enough (in 2.4 likely to be as fast as snapshotting), and means that you can lose 1 second of data if there is a disaster.
  • Never fsync, just put your data in the hands of the Operating System. The faster and less safe method.
So, as well as in almost all NoSQL solutions, persistence is asynchronous (tunable, but dramatically affects performance). But in other NoSQL solution this is compensated by synchronous replication.
This means that Redis can’t guarantee durability under any cluster schemas (Sentinel or cluster).
Redis does not support encryption. In order to implement setups where trusted parties can access a Redis instance over the internet or other untrusted networks, an additional layer of protection should be implemented, such as an SSL proxy or Slipped.
Redis is designed to be accessed by trusted clients inside trusted environments. This means that usually it is not a good idea to expose the Redis instance directly to the internet.
In the same time Redis:
  • can bind specific interface
  • have tiny layer of authentication (password, pre-defined in redis.conf configured file)
  • can be configured to disable or rename particular commands in redis.conf
  1. Client writes to master B.
  2. Master B replies OK to the client.
  3. Master B propagates the writes to its slave B1.
There is now acknowledgement from B1, before master B gives its OK to the client. In the worst case the following happens:
  • B accepts a write from the client and gives its OK.
  • B crashes before the write is replicated to the slave B1.
In this case the write is lost forever. This is very similar to what happens with most databases that flush data to disk every second.

Redis Cluster is a mix between query routing and client side partitioning.

Twemproxy supports automatic partitioning among multiple Redis instances, with optional node ejection if a node is not available (this will change the keys-instances map, so you should use this feature only if you are using Redis as a cache).
It is not a single point of failure since you can start multiple proxies and instruct your clients to connect to the first that accepts the connection.


We learned that a problem with partitioning is that, unless we are using Redis as a cache, to add and remove nodes can be tricky, and it is much simpler to use a fixed keys-instances map.
However the data storage needs may vary over the time. Today I can live with 10 Redis nodes (instances), but tomorrow I may need 50 nodes.
Since Redis is extremely small footprint and lightweight (a spare instance uses 1 MB of memory), a simple approach to this problem is to start with a lot of instances since the start. Even if you start with just one server, you can decide to live in a distributed world since your first day, and run multiple Redis instances in your single server, using partitioning.
And you can select this number of instances to be quite big since the start. For example, 32 or 64 instances could do the trick for most users, and will provide enough room for growth.
In this way as your data storage needs increase and you need more Redis servers, what to do is to simply move instances from one server to another. Once you add the first additional server, you will need to move half of the Redis instances from the first server to the second, and so forth.

ZREVRANGE key min max

Executing batches of commands using redis cli
./redis-cli < temp.redisCmds


Review (572) System Design (334) System Design - Review (198) Java (189) Coding (75) Interview-System Design (65) Interview (63) Book Notes (59) Coding - Review (59) to-do (45) Linux (43) Knowledge (39) Interview-Java (35) Knowledge - Review (32) Database (31) Design Patterns (31) Big Data (29) Product Architecture (28) MultiThread (27) Soft Skills (27) Concurrency (26) Cracking Code Interview (26) Miscs (25) Distributed (24) OOD Design (24) Google (23) Career (22) Interview - Review (21) Java - Code (21) Operating System (21) Interview Q&A (20) System Design - Practice (20) Tips (19) Algorithm (17) Company - Facebook (17) Security (17) How to Ace Interview (16) Brain Teaser (14) Linux - Shell (14) Redis (14) Testing (14) Tools (14) Code Quality (13) Search (13) Spark (13) Spring (13) Company - LinkedIn (12) How to (12) Interview-Database (12) Interview-Operating System (12) Solr (12) Architecture Principles (11) Resource (10) Amazon (9) Cache (9) Git (9) Interview - MultiThread (9) Scalability (9) Trouble Shooting (9) Web Dev (9) Architecture Model (8) Better Programmer (8) Cassandra (8) Company - Uber (8) Java67 (8) Math (8) OO Design principles (8) SOLID (8) Design (7) Interview Corner (7) JVM (7) Java Basics (7) Kafka (7) Mac (7) Machine Learning (7) NoSQL (7) C++ (6) Chrome (6) File System (6) Highscalability (6) How to Better (6) Network (6) Restful (6) CareerCup (5) Code Review (5) Hash (5) How to Interview (5) JDK Source Code (5) JavaScript (5) Leetcode (5) Must Known (5) Python (5)

Popular Posts