Timeout & Retries – learnings

  1. As an API developer, my first goal is to provide a robust client-side implementation of the request timeout, which is different from the connection timeout. Setting it too high or too low is generally bad; it should be based on your promised latency SLA. You may eventually end up tuning the connection timeout and other underlying platform-level configurations to values that are optimal for your use case.
  2. Retries are my clients' unintentional punitive actions against me for violating or not improving the SLA. The outcome becomes severe when the service starts failing under load and retries increase exponentially, even though there is no point in beating a dead horse; hence backoff.
  3. Another important point is cumulative retries from a chain of services. Your timeouts can become completely useless when a dependency further down the chain does not even honour your minimum timeout.
  4. Exponential backoff is good for reducing contention but does nothing to distribute the incoming traffic into equal time slices; this is why the leaky bucket, fixed window, and token bucket algorithms perform poorly on edge cases. Jittered backoff, on the other hand, helps shape traffic using randomness, an idea also seen in sliding window rate limiting.
  5. Be extremely careful not to degrade the customer experience while using jitter: you do not want to lose customers to delays, but rather give them a frustration-free, simple in-and-out experience.
  6. Another important API property to pay attention to is idempotency. If you are using an identifier-based approach to detect duplicate requests, be sure to include that identifier in every retry request.
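To make points 2 and 4 concrete, here is a minimal sketch (the class and parameter names are my own) of a retry loop with "full jitter" exponential backoff, where each failed attempt sleeps a random duration drawn from a window that doubles per attempt:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;

public class RetryWithJitter {
    // Retry a call, sleeping a random time in [0, base * 2^attempt] between attempts.
    static <T> T callWithRetries(Callable<T> call, int maxAttempts, long baseDelayMillis)
            throws Exception {
        for (int attempt = 0; ; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                if (attempt + 1 >= maxAttempts) {
                    throw e; // give up and propagate the last failure
                }
                long cap = baseDelayMillis * (1L << attempt); // exponential window
                long sleep = ThreadLocalRandom.current().nextLong(cap + 1); // jitter
                Thread.sleep(sleep);
            }
        }
    }
}
```

For point 6, the same idempotency key (however your API models it) should be captured inside the `Callable` so every retry carries it unchanged.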

Software engineering Best practices – Configuration

  1. Make configuration explicit yet simple at the same time, to avoid client surprises. The goal should be: if it is read by multiple audiences, it should convey the same meaning regardless of the reading engineer's experience.
  2. In modern distributed systems, which are divided into a control plane and a data plane, any configuration of an API or service is a control plane operation and is better suited to control plane APIs; it should not be confused with data plane API input filtering.
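As a tiny illustration of the first point (the type and field names here are hypothetical), spelling out units and defaults keeps a configuration reading the same to every engineer:

```java
import java.time.Duration;

public class ClientConfig {
    // Explicit types and names: no reader has to guess whether "timeout" means seconds or millis.
    private final Duration connectTimeout;
    private final Duration requestTimeout;
    private final int maxRetries;

    private ClientConfig(Duration connectTimeout, Duration requestTimeout, int maxRetries) {
        this.connectTimeout = connectTimeout;
        this.requestTimeout = requestTimeout;
        this.maxRetries = maxRetries;
    }

    // Defaults live in one obvious place instead of being buried in call sites.
    public static ClientConfig defaults() {
        return new ClientConfig(Duration.ofSeconds(2), Duration.ofSeconds(5), 3);
    }

    public Duration connectTimeout() { return connectTimeout; }
    public Duration requestTimeout() { return requestTimeout; }
    public int maxRetries() { return maxRetries; }
}
```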

Control plane vs Data plane

In my experience, this distinction exists because it helps you set the right design and development goals for scalability, latency and other properties of each plane. The control plane is a collection of services used to create and (re)configure environments and runtimes, whereas the data plane is responsible for delivering the service(s) that get/read or set/write/update the data required to perform an end-user action. Put simply, control plane services are used to commission/decommission and manage data plane services.

  1. Usually the APIs/ports exposed by data plane services for this purpose are separate from the data plane service APIs, as they are really control APIs; e.g. these APIs are sidecar implementations of control plane pieces living alongside the data plane service.
  2. Intuitively, control plane processes should call the data plane control ports/APIs. It can be tempting to have data plane services pull configuration rather than have control plane services push it, but that loses the meaning of the word 'control' with respect to how the control plane should dictate configuration push, the load on data plane services, and controlled configuration rollout.
  3. On the other hand, this inversion of control makes the data plane services more independent in choosing the 'when' part of the equation of adopting a change.

memcached and distribution or assignment

I have been using Memcached in a lot of different types of caching solutions for web services and thought of sharing some basic facts.

  1. Memcached servers are just very efficient key-value stores providing APIs to get and set values over the network.
  2. The client is what makes the solution distributed: the client has the logic to select the destination host for a get or set operation using consistent hashing.
  3. The memcached client library supports a number of different distribution algorithms that are used in multi-server configurations. Please read the protocol.
  4. The cache key is used in the consistent hashing to determine the distribution, so the same server is selected during both set and get operations. The algorithms are Ketama and Wheel in code.
  5. The following diagram is for simplicity in understanding the process.

[Diagram: how the client maps a cache key to a memcached server]
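A minimal sketch of the idea (my own illustration, not the actual client library code) of how a client can pick a server with a Ketama-style hash ring, where each server occupies many points on the ring and a key is routed to the next point clockwise:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

public class HashRing {
    private final SortedMap<Long, String> ring = new TreeMap<>();

    // Place each server at several points on the ring to even out the distribution.
    public HashRing(List<String> servers, int pointsPerServer) {
        for (String server : servers) {
            for (int i = 0; i < pointsPerServer; i++) {
                ring.put(hash(server + "#" + i), server);
            }
        }
    }

    // The same key always maps to the same server, for both get and set.
    public String serverFor(String key) {
        long h = hash(key);
        SortedMap<Long, String> tail = ring.tailMap(h);
        // Wrap around to the first point if we ran off the end of the ring.
        return tail.isEmpty() ? ring.get(ring.firstKey()) : tail.get(tail.firstKey());
    }

    private static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
            // Fold the first 8 bytes of the MD5 digest into a long.
            long h = 0;
            for (int i = 0; i < 8; i++) h = (h << 8) | (d[i] & 0xff);
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError(e);
        }
    }
}
```

Adding or removing a server only remaps the keys whose ring points fall on that server, which is the whole appeal of consistent hashing over a plain modulo scheme.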

In the next post, I will show how to use Guava and Memcached together to implement a multi-layered caching solution.

0/1 Knapsack Java Code

    /**
     *
     * @param w weights of the items
     * @param v values of the items
     * @param i index of the item under consideration
     * @param W the max weight capacity
     * @return the maximum total value that fits within capacity W
     */
    static int knapsack(int[] w, int[] v, int i, int W) {
        if (i < 0) return 0;
        if (w[i] > W) {
            // item i does not fit, skip it
            return knapsack(w, v, i - 1, W);
        } else {
            // take the better of skipping item i or taking it
            return Math.max(knapsack(w, v, i - 1, W),
                            knapsack(w, v, i - 1, W - w[i]) + v[i]);
        }
    }
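The plain recursion above recomputes overlapping subproblems; a memoized variant (a sketch of the standard top-down DP, with the original recursion repeated only to keep this block self-contained) brings the cost down to O(n·W):

```java
public class Knapsack {
    static int knapsack(int[] w, int[] v, int i, int W) {
        if (i < 0) return 0;
        if (w[i] > W) return knapsack(w, v, i - 1, W);
        return Math.max(knapsack(w, v, i - 1, W),
                        knapsack(w, v, i - 1, W - w[i]) + v[i]);
    }

    // Memoized variant: cache the answer per (item index, remaining capacity).
    static int knapsackMemo(int[] w, int[] v, int i, int W, Integer[][] memo) {
        if (i < 0) return 0;
        if (memo[i][W] != null) return memo[i][W];
        int best = (w[i] > W)
                ? knapsackMemo(w, v, i - 1, W, memo)
                : Math.max(knapsackMemo(w, v, i - 1, W, memo),
                           knapsackMemo(w, v, i - 1, W - w[i], memo) + v[i]);
        memo[i][W] = best;
        return best;
    }
}
```

Both are called with `i = n - 1` and a `memo` of dimensions `[n][W + 1]`.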

Throttling web service / application

There are different throttling techniques one can use to throttle a network application; some well-established and simple ones used in network applications are the leaky bucket, fixed window, sliding window and token bucket algorithms.

Out of these, the Token Bucket suits web services very well, for the following reasons:

  • Failing fast on incoming requests when no tokens are available. That helps the caller strategize immediately on what to do when it runs out of its limit.
  • Failing fast also helps in not building up a long task queue on the caller's side.
  • New milk is better than old milk (the milk-vs-wine policy): keeping an old request that has no remaining relevance to the caller is just a waste of resources, if relevance is what you care about.

In my coming post I shall implement a simple throttler that uses the Token Bucket algorithm in Java.
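As a quick preview of the idea, a minimal sketch (the class name and refill policy are my own choices) of a token bucket that fails fast when the tokens run out:

```java
public class TokenBucket {
    private final long capacity;        // maximum tokens the bucket can hold
    private final double refillPerNano; // tokens added per nanosecond
    private double tokens;              // current token count
    private long lastRefillNanos;

    public TokenBucket(long capacity, double refillPerSecond) {
        this.capacity = capacity;
        this.refillPerNano = refillPerSecond / 1_000_000_000.0;
        this.tokens = capacity; // start full
        this.lastRefillNanos = System.nanoTime();
    }

    // Fail fast: either a token is available right now, or the caller is throttled.
    public synchronized boolean tryAcquire() {
        refill();
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    // Lazily add tokens for the time elapsed since the last call, capped at capacity.
    private void refill() {
        long now = System.nanoTime();
        tokens = Math.min(capacity, tokens + (now - lastRefillNanos) * refillPerNano);
        lastRefillNanos = now;
    }
}
```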

Least Recently Used or LRU Cache : a simple implementation in Java

A friend of mine asked me about LRU caches and, while explaining, I could not locate a simple implementation just by googling for one.

The Java util package provides a beautiful implementation through LinkedHashMap, which is a hash table and linked list implementation of the Map interface with predictable iteration order. In order to use it as an LRU cache, do the following:

  • use the constructor LinkedHashMap(int initialCapacity, float loadFactor, boolean accessOrder) and set accessOrder to true, and
  • override the implementation of removeEldestEntry(Map.Entry eldest)
  • please keep in mind that this is NOT a thread-safe implementation; to get thread safety for the LinkedHashMap you can do something like Map m = Collections.synchronizedMap(new LinkedHashMap(…)) if multiple threads may use the map.
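Put together, the first two bullets look like this (initial capacity, load factor and maxSize here are arbitrary):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LinkedHashLRU<K, V> extends LinkedHashMap<K, V> {
    private final int maxSize;

    public LinkedHashLRU(int maxSize) {
        // accessOrder = true makes iteration order least- to most-recently used
        super(16, 0.75f, true);
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // evict the least recently used entry once the map exceeds maxSize
        return size() > maxSize;
    }
}
```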

You can of course look at the implementation of LinkedHashMap for reference (it is really well implemented), but I thought I would write my own LRU cache with minimal functionality. So the following provides an LRU cache implementation in Java using a HashMap and a doubly linked list.

package com.das.yadab.java;

import java.util.HashMap;
import java.util.Map;

public class LRUCache<K, V> {
    public static final int MAX_SIZE = 100; // Default max size for the cache to grow.
    private Entry<K, V> head;
    private HashMap<K, Entry> internalCache;
    private int maxSize;

    public LRUCache() {
        this.head = new Entry(null, null);
        head.before = head.after = head;
        internalCache = new HashMap<K, Entry>();
        maxSize = MAX_SIZE;
    }

    public LRUCache(int maxSize) {
        this.head = new Entry(null, null);
        head.before = head.after = head;
        internalCache = new HashMap<K, Entry>();
        this.maxSize = maxSize;
    }

    /**
     * Gets value associated with the specified key from cache.
     *
     * @param key
     * @return value or null if not found
     */
    public V get(Object key) {
        Entry<K, V> e = internalCache.get(key);
        if (e == null)
            return null;
        e.recordAccess(this);
        return e.value;
    }

    /**
     * Associates the key and value in the cache; a null value is not allowed.
     * If the cache has grown beyond the allowed capacity, the eldest entry will be evicted.
     *
     * @param key
     * @param value
     * @return the value just associated with the key
     */
    public V put(K key, V value) {
        if (value == null) {
            throw new IllegalArgumentException("Null value association is not helpful in caching");
        }
        if (internalCache.containsKey(key)) {
            Entry<K, V> e = internalCache.get(key);
            e.setValue(value);
            e.recordAccess(this);
            return value;
        }
        if (shouldRemoveEldestEntry()) {
            Entry eldest = head.after;
            removeEntry(eldest);
        }
        Entry entry = new Entry(key, value);
        entry.addBefore(head);
        internalCache.put(key, entry);
        return value;
    }

    private void removeEntry(Entry eldest) {
        // since we do not allow null value associations, this is safe.
        if (internalCache.remove(eldest.key) != null) {
            eldest.recordRemoval(this);
        }
    }

    /*
    Returns true if this cache should remove its eldest entry.
    If maxSize is not set during initialization, the default MAX_SIZE applies.
     */
    public boolean shouldRemoveEldestEntry() {
        return internalCache.size() > maxSize;
    }

    static class Entry<K, V> implements Map.Entry<K, V> {
        private Entry<K, V> before, after;
        private K key;
        private V value;

        Entry(K key, V value) {
            this.key = key;
            this.value = value;
        }

        @Override
        public final K getKey() {
            return key;
        }

        public final V getValue() {
            return value;
        }

        public V setValue(V newValue) {
            V oldValue = value;
            value = newValue;
            return oldValue;
        }

        private void addBefore(Entry<K, V> existingEntry) {
            after = existingEntry;
            before = existingEntry.before;
            before.after = this;
            after.before = this;
        }

        void recordAccess(LRUCache<K, V> cache) {
            remove();
            addBefore(cache.head);
        }

        final void recordRemoval(LRUCache&amp;lt;K, V&amp;gt; m) {
            remove();
        }

        private void remove() {
            before.after = after;
            after.before = before;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Map.Entry))
                return false;
            Map.Entry e = (Map.Entry) o;
            Object k1 = getKey();
            Object k2 = e.getKey();
            if (k1 == k2 || (k1 != null && k1.equals(k2))) {
                Object v1 = getValue();
                Object v2 = e.getValue();
                if (v1 == v2 || (v1 != null && v1.equals(v2)))
                    return true;
            }
            return false;
        }

        @Override
        public int hashCode() {
            return (key == null ? 0 : key.hashCode()) ^
                    (value == null ? 0 : value.hashCode());
        }

        @Override
        public String toString() {
            return "Entry{key=" + key +
                    ", value=" + value +
                    '}';
        }
    }

    @Override
    public String toString() {
        return "LRUCache{" +
                "internalCache=" + internalCache +
                ", head=" + head +
                ", maxSize=" + maxSize +
                '}';
    }
}

An example usage: since the loop only ever accesses the key “0”, the cache should retain “0” even though it was added at the beginning.

LRUCache cache = new LRUCache(5);
for (int i = 0; i < 10; i++) {
    cache.put(Integer.toString(i), i);
    cache.get(Integer.toString(0));
}

Java : difference between Polymorphism and Inheritance

Since the definitions of the two have some common ground, the differences between these two paradigms can sometimes be confusing. Definitions from the wiki, which you might have already read:

  • Polymorphism is the provision of a single interface to entities of different types. Java basically supports subtype polymorphism (using inheritance in Java) and parametric polymorphism (using generics in Java).
  • Inheritance (OOP) is when an object or class is based on another object (prototypal inheritance) or class (class-based inheritance), using the same implementation (inheriting from an object or class) specifying implementation to maintain the same behavior (realizing an interface; inheriting behavior).

So from the above definition,

  1. Inheritance, or subclassing, in Java is one of the techniques used to achieve polymorphism.
  2. Inheritance refers to using the structure and behavior of a superclass in a subclass.
  3. Inheritance helps code reuse while at the same time achieving polymorphism.
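A small illustration (the class names are mine) of both ideas together: subtype polymorphism through inheritance, and parametric polymorphism through generics:

```java
import java.util.List;

public class Shapes {
    // A single interface ...
    static abstract class Shape {
        abstract double area();
    }

    // ... realized by different types (subtype polymorphism via inheritance).
    static class Square extends Shape {
        final double side;
        Square(double side) { this.side = side; }
        @Override double area() { return side * side; }
    }

    static class Rectangle extends Shape {
        final double w, h;
        Rectangle(double w, double h) { this.w = w; this.h = h; }
        @Override double area() { return w * h; }
    }

    // Parametric polymorphism via generics: works for any list of shapes.
    static double totalArea(List<? extends Shape> shapes) {
        double sum = 0;
        for (Shape s : shapes) sum += s.area(); // dynamic dispatch picks the right area()
        return sum;
    }
}
```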

Java : Asynchronous Task sequencing using Observer pattern

The other day I was answering a question on Stack Overflow about what a simple implementation of sequencing or ordering asynchronous tasks in Java could look like; more detail is in the question itself. Basically, the task sequencing problem is: you have a list of tasks, say {A, B, C, D}, and you want to execute them in some order, say {A} -> {B, C} -> {D} -> {X} -> {A}.

I preferred the Observer pattern because it is more of a reactive pattern, so the design is reactive in nature rather than controlling, and more loosely coupled.

import java.util.HashSet;
import java.util.Observable;
import java.util.Observer;
import java.util.Set;
import java.util.concurrent.locks.ReentrantLock;

public abstract class Task extends Observable implements Runnable, Observer {
    private final ReentrantLock lock = new ReentrantLock();
    private final String taskId;
    private final Set<String> completedTasks;
    private final Set<String> shouldCompletedTasksBeforeStart;

    public Task(final String taskId) {
        this.taskId = taskId;
        this.completedTasks = new HashSet<>();
        this.shouldCompletedTasksBeforeStart = new HashSet<>();
    }

    public String getTaskId() {
        return this.taskId;
    }

    @Override
    public void run() {
        while (true) {
            this.lock.lock();
            if (this.completedTasks.equals(this.shouldCompletedTasksBeforeStart)) {
                doWork();
                setChanged();
                notifyObservers(this.taskId);
                // reset
                this.completedTasks.clear();
            }
            this.lock.unlock();
            try {
                // just some sleep, change it to whatever fits you
                Thread.sleep(1000);
            } catch (final InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    @Override
    public void update(final Observable observable, final Object arg) {
        this.lock.lock();
        this.completedTasks.add((String) arg);
        this.lock.unlock();
    }

    public void addPredecessorTask(final Task task) {
        if (this.taskId.equals(task.taskId)) {
            return;
        }
        this.lock.lock();
        // Notice: this little bit of logic makes the predecessor/successor relationship work
        task.addObserver(this);
        this.shouldCompletedTasksBeforeStart.add(task.taskId);
        this.lock.unlock();
    }

    protected abstract void doWork();
}

//HelloTask.java
public class HelloTask extends Task {
    public HelloTask(final String taskId) {
        super(taskId);
    }

    @Override
    protected void doWork() {
        System.out.println("Hello from " + getTaskId() + "!");
    }
}

//Main.java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class Main {
    public static void main(final String[] args) {
        final HelloTask helloTaskA = new HelloTask("A");
        final HelloTask helloTaskB = new HelloTask("B");
        final HelloTask helloTaskC = new HelloTask("C");

        helloTaskA.addPredecessorTask(helloTaskB);
        helloTaskC.addPredecessorTask(helloTaskB);

        final ExecutorService pool = Executors.newFixedThreadPool(10);
        pool.execute(helloTaskC);
        pool.execute(helloTaskA);
        pool.execute(helloTaskB);

    }
}

Artificial Brain and cryptanalysis

I was reading about different Blue Brain or artificial brain projects and thought of writing a simple network myself. All I could afford was one compute machine from Amazon, which allows running only about 3000 threads if I am not doing much in each neuron. Each thread is a neuron, and after adding a little filtering logic I came down to a much lower number. I felt like, what am I doing? So to simulate this in a cloud framework, I need the following:

  • A good messaging platform
  • A very lightweight service framework to make it work as neuron

And being able to use this network to do cryptanalysis is not far off, but the brain I produced is the size of an ant's, or smaller. I need a bigger brain, which means a lot of machines => a lot of money ??