Friday, July 3, 2015

Interview - Deadlock Misc



https://mp.weixin.qq.com/s/BVGtDDCa7yjtfJJPNKOC_g
When using the intrinsic lock provided by the synchronized keyword, a thread that fails to acquire the lock will wait forever. The Lock interface, however, provides boolean tryLock(long time, TimeUnit unit) throws InterruptedException, which waits for the lock only up to the given time. After the attempt times out, the thread can proactively release all the locks it already holds, which is also an effective way to avoid deadlock.
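For example, a minimal sketch of timing out on a lock attempt (the class and method names below are illustrative; ReentrantLock and tryLock are the actual JDK APIs):

import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class TryLockExample {
    // Try to take the lock for at most one second instead of waiting forever;
    // on timeout the caller can release any locks it already holds and retry.
    static boolean doWork(ReentrantLock lock) throws InterruptedException {
        if (!lock.tryLock(1, TimeUnit.SECONDS)) {
            return false; // gave up: avoid waiting forever
        }
        try {
            // ... work with the protected resource ...
            return true;
        } finally {
            lock.unlock();
        }
    }
}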
Recall the definition of deadlock: "A deadlock is a situation in which two or more processes are blocked during execution because they compete for resources or wait on each other's communication; without external intervention, none of them can make progress."
The contended resources in this definition can be threads in a thread pool, connections in a connection pool, locks provided by a database engine, or anything else that can be competed for.

1. Thread pool deadlock

An example shows what this kind of deadlock looks like:
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreadPoolDeadlock {
    public static void main(String[] args) {
        // Single worker thread: task 1 occupies it while blocking on task 2's
        // result, so task 2 can never be scheduled and task 1 never finishes.
        final ExecutorService executorService = Executors.newSingleThreadExecutor();
        Future<Long> f1 = executorService.submit(new Callable<Long>() {
            @Override
            public Long call() throws Exception {
                System.out.println("start f1");
                Thread.sleep(1000); // delay so the pool's only thread stays busy
                Future<Long> f2 = executorService.submit(new Callable<Long>() {
                    @Override
                    public Long call() throws Exception {
                        System.out.println("start f2");
                        return -1L;
                    }
                });
                System.out.println("result " + f2.get()); // blocks forever: f2 never runs
                System.out.println("end f1"); // never reached
                return -1L;
            }
        });
    }
}
In this example, task 1 in the thread pool depends on the result of task 2, but the pool has only one thread, so task 2 can never run until task 1 finishes, and task 1 cannot finish without task 2's result; hence the deadlock.
Solutions: enlarge the thread pool, or remove the dependencies between task results.

2. Connection pool deadlock

Deadlock can likewise occur with connection pools. Suppose there are two threads A and B and two database connection pools N1 and N2, each of size 1. If thread A acquires connections in the order N1 then N2, while thread B acquires them in the order N2 then N1, and neither thread releases the connection it already holds before finishing, a deadlock results.

5. How to prevent deadlocks?
Lock Ordering

Deadlock occurs when multiple threads need the same locks but obtain them in a different order.

If you make sure that all locks are always taken in the same order by any thread, deadlocks cannot occur. 

If a thread needs several locks, it must take them in the decided order. It cannot take a lock later in the sequence until it has obtained all of the earlier locks.


Lock ordering is a simple yet effective deadlock prevention mechanism. However, it can only be used if you know about all locks needed ahead of taking any of the locks. This is not always the case.
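A minimal sketch of lock ordering in Java (the lock names and method are illustrative): every thread that needs both locks must take LOCK_A before LOCK_B, so no circular wait can form.

import java.util.concurrent.locks.ReentrantLock;

public class LockOrdering {
    private static final ReentrantLock LOCK_A = new ReentrantLock();
    private static final ReentrantLock LOCK_B = new ReentrantLock();

    static void transfer() {
        LOCK_A.lock();          // always taken first
        try {
            LOCK_B.lock();      // always taken second
            try {
                // ... critical section using both resources ...
            } finally {
                LOCK_B.unlock();
            }
        } finally {
            LOCK_A.unlock();
        }
    }
}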


Lock Timeout
Another deadlock prevention mechanism is to put a timeout on lock attempts, meaning a thread trying to obtain a lock will only try for so long before giving up. If a thread does not succeed in taking all necessary locks within the given timeout, it will back up, free all locks taken, wait for a random amount of time, and then retry.
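A minimal sketch of this back-off policy (the class and method below are illustrative; Lock.tryLock and ThreadLocalRandom are real JDK APIs):

import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;

public class BackOffAcquire {
    // Keep trying to take both locks within a timeout; on failure release
    // everything already taken, wait a random amount of time, and retry.
    static void acquireBoth(Lock first, Lock second) throws InterruptedException {
        while (true) {
            if (first.tryLock(50, TimeUnit.MILLISECONDS)) {
                if (second.tryLock(50, TimeUnit.MILLISECONDS)) {
                    return; // caller now holds both locks and must unlock them later
                }
                first.unlock(); // back up: free what we already hold
            }
            // random back-off before retrying, so competing threads de-synchronize
            Thread.sleep(ThreadLocalRandom.current().nextLong(10, 100));
        }
    }
}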

Operating System | Process Management | Deadlock Introduction
Deadlock is a situation where a set of processes are blocked because each process is holding a resource and waiting for another resource acquired by some other process. It arises in operating systems when two or more processes hold some resources and wait for resources held by others. For example, Process 1 holds Resource 1 and waits for Resource 2, which is held by Process 2, while Process 2 waits for Resource 1.
Deadlock can arise if the following four conditions hold simultaneously (the necessary, or Coffman, conditions); a minimal Java sketch that exhibits all four follows the list.
Mutual Exclusion: at least one resource is non-sharable (only one process can use it at a time).
Hold and Wait: a process is holding at least one resource and waiting for additional resources held by other processes.
No Preemption: a resource cannot be taken from a process unless the process releases it.
Circular Wait: a set of processes wait for each other in a circular chain, each holding a resource the next one needs.
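The sketch promised above: two threads take the same two monitor locks in opposite order, so each can end up holding one lock and waiting forever for the other (class and field names are illustrative).

public class TwoLockDeadlock {
    private static final Object RESOURCE_1 = new Object();
    private static final Object RESOURCE_2 = new Object();

    public static void main(String[] args) {
        new Thread(() -> {
            synchronized (RESOURCE_1) {
                sleep(100); // give the other thread time to take RESOURCE_2
                synchronized (RESOURCE_2) {
                    System.out.println("thread 1 got both");
                }
            }
        }).start();

        new Thread(() -> {
            synchronized (RESOURCE_2) {
                sleep(100); // give the other thread time to take RESOURCE_1
                synchronized (RESOURCE_1) {
                    System.out.println("thread 2 got both");
                }
            }
        }).start();
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}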

Methods for handling deadlock
There are three ways to handle deadlock:
1) Deadlock prevention or avoidance: do not let the system enter a deadlock state.
2) Deadlock detection and recovery: let deadlock occur, then use preemption to handle it once it has occurred.
3) Ignore the problem altogether: if deadlock is very rare, let it happen and reboot the system. This is the approach that both Windows and UNIX take.

http://geeksquiz.com/deadlock-prevention/
Deadlock Prevention And Avoidance
Eliminate Mutual Exclusion 
It is not possible to break mutual exclusion because some resources, such as tape drives and printers, are inherently non-shareable.
Eliminate Hold and wait
1. Allocate all required resources to a process before it starts executing; this eliminates the hold-and-wait condition but leads to low device utilization.
Eliminate Circular Wait
Each resource is assigned a number, and a process may request resources only in increasing order of that numbering.
For example, if process P1 has been allocated resource R5 and later asks for R4 or R3 (numbered lower than R5), the request will not be granted; only requests for resources numbered higher than R5 will be granted.
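A minimal sketch of enforcing this numbering rule (the allocator below is illustrative, not from the quoted source): a request is granted only when the requested resource's number is higher than anything the process already holds.

import java.util.HashMap;
import java.util.Map;

public class OrderedResourceAllocator {
    // Highest resource number each process currently holds.
    private final Map<String, Integer> highestHeld = new HashMap<>();

    public synchronized boolean request(String process, int resourceNumber) {
        int max = highestHeld.getOrDefault(process, Integer.MIN_VALUE);
        if (resourceNumber <= max) {
            return false; // out-of-order request denied, so no circular wait can form
        }
        highestHeld.put(process, resourceNumber);
        // ... actually grant the resource here ...
        return true;
    }
}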

http://buttercola.blogspot.com/2014/11/interview-knowledge-based-questions-1.html
Every time a thread takes a lock it is noted in a data structure (map, graph etc.) of threads and locks. Additionally, whenever a thread requests a lock this is also noted in this data structure.

When a thread requests a lock but the request is denied, the thread can traverse the lock graph to check for deadlocks.

One possible action is to release all locks, back up, wait a random amount of time and then retry. This is similar to the simpler lock timeout mechanism, except that threads only back up when a deadlock has actually occurred, not just because their lock requests timed out. However, if a lot of threads are competing for the same locks, they may repeatedly end up in a deadlock even if they back up and wait.

A better option is to determine or assign a priority of the threads so that only one (or a few) thread backs up. The rest of the threads continue taking the locks they need as if no deadlock had occurred. If the priority assigned to the threads is fixed, the same threads will always be given higher priority. To avoid this you may assign the priority randomly whenever a deadlock is detected.
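A minimal sketch of such bookkeeping (illustrative, not a production lock manager): record which thread holds each lock and which lock each thread is waiting for, then walk the chain before blocking to see whether granting the wait would close a cycle.

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class DeadlockDetector {
    private final Map<Object, Thread> holders = new HashMap<>();     // lock -> owning thread
    private final Map<Thread, Object> waitingFor = new HashMap<>();  // thread -> lock it waits on

    // Follow holder -> waited-for lock -> holder ...; if the chain reaches the
    // requesting thread (or repeats), blocking on this lock would form a cycle.
    public synchronized boolean wouldDeadlock(Thread requester, Object lock) {
        Set<Thread> visited = new HashSet<>();
        Object current = lock;
        while (current != null) {
            Thread owner = holders.get(current);
            if (owner == null) {
                return false; // lock is free, no cycle possible
            }
            if (owner.equals(requester) || !visited.add(owner)) {
                return true;  // cycle detected
            }
            current = waitingFor.get(owner);
        }
        return false;
    }

    public synchronized void recordWaiting(Thread t, Object lock) {
        waitingFor.put(t, lock);
    }

    public synchronized void recordAcquired(Thread t, Object lock) {
        holders.put(lock, t);
        waitingFor.remove(t);
    }
}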
http://www.shuatiblog.com/blog/2014/09/01/Multithreading-deadlock-prevention/
Preventing any one of the four conditions will prevent deadlock:
Removing the mutual exclusion condition is rarely possible.

The hold and wait conditions may be removed by requiring processes to request all the resources they will need before starting up.

The no preemption condition may also be difficult or impossible to avoid as a process has to be able to have a resource for a certain amount of time, or the processing outcome may be inconsistent or thrashing may occur.

The final condition is the circular wait condition.
Answer: assign an order to the locks (require that locks are always acquired in that order). This prevents two threads from each waiting for a resource held by the other.
https://hellosmallworld123.wordpress.com/2014/05/21/threads-and-locks/
In order to prevent deadlock, we usually focus on preemption and circular wait. For preemption, we can set a timeout and release the lock when it is hit. For circular wait, we can enforce an ordering on lock acquisition so that each thread must acquire locks in a specific order; that way no thread ends up holding one lock while waiting for another held in the reverse order.
https://en.wikipedia.org/wiki/Deadlock
Avoiding database deadlocks
An effective way to avoid database deadlocks is to follow this approach from the Oracle Locking Survival Guide:
Application developers can eliminate all risk of enqueue deadlocks by ensuring that transactions requiring multiple resources always lock them in the same order.

Livelock
A livelock is similar to a deadlock, except that the states of the processes involved in the livelock constantly change with regard to one another, none progressing.
Livelock is a special case of resource starvation; the general definition only states that a specific process is not progressing.
A real-world example of livelock occurs when two people meet in a narrow corridor, and each tries to be polite by moving aside to let the other pass, but they end up swaying from side to side without making any progress because they both repeatedly move the same way at the same time.
Livelock is a risk with some algorithms that detect and recover from deadlock. If more than one process takes action, the deadlock detection algorithm can be repeatedly triggered. This can be avoided by ensuring that only one process (chosen arbitrarily or by priority) takes action.

Distributed deadlock
Distributed deadlocks can occur in distributed systems when distributed transactions or concurrency control is being used. Distributed deadlocks can be detected either by constructing a global wait-for graph from local wait-for graphs at a deadlock detector or by a distributed algorithm like edge chasing.
Phantom deadlocks are deadlocks that are falsely detected in a distributed system due to system internal delays but don't actually exist.
http://geeksquiz.com/deadlock-detection-recovery/
Deadlock Detection
1. If each resource has a single instance:
In this case, deadlock can be detected by running an algorithm that checks for a cycle in the resource-allocation graph; the presence of a cycle is a sufficient condition for deadlock.
For example, if Resource 1 and Resource 2 each have a single instance and the graph contains the cycle R1 -> P1 -> R2 -> P2 -> R1, deadlock is confirmed.
2. If there are multiple instances of resources:
Detection of a cycle is a necessary but not a sufficient condition for deadlock; the system may or may not be in deadlock, depending on the situation.
Deadlock Recovery
Traditional operating systems such as Windows do not deal with deadlock recovery, as it is a time- and space-consuming process. Real-time operating systems do use deadlock recovery.
Recovery methods
1. Killing processes
Either kill all the processes involved in the deadlock, or kill them one by one, checking for deadlock again after each kill and repeating until the system recovers.
2. Resource preemption
Resources are preempted from the processes involved in the deadlock and allocated to other processes, so that the system has a chance to recover. In this case, the preempted processes may go into starvation.

Quiz: http://geeksquiz.com/deadlock/
http://www.geeksforgeeks.org/operating-systems-set-16/
http://www2.latech.edu/~box/os/ch07.pdf
http://www.cs.uic.edu/~jbell/CourseNotes/OperatingSystems/7_Deadlocks.html
http://users.cs.cf.ac.uk/O.F.Rana/os/lectureos12/lectureos12.html
