Wednesday, March 23, 2016

YARN - Yet Another Resource Negotiator



YARN Essentials
YARN is a generic resource platform to manage resources in a typical cluster.
YARN enables multiple applications to run simultaneously on the same shared cluster and allows applications to negotiate resources based on need. Resource allocation and management are central to YARN.

In Hadoop 1.x, a single JobTracker service was overloaded with many responsibilities: cluster resource management, job scheduling, restarting failed tasks, monitoring TaskTrackers, and so on.

Locality awareness
Multitenancy
Support for more data processing models

YARN separates the two major functions of the JobTracker, resource management and job scheduling/monitoring, into separate daemons: a cluster-level ResourceManager (RM) and an application-specific ApplicationMaster (AM).

Master-slave architecture: the ResourceManager is the master, and a NodeManager (NM) runs as the slave on each node.
The ResourceManager is the supervisor component that manages resources among the applications in the whole system. The per-application ApplicationMaster is the application-specific daemon that negotiates resources from the ResourceManager and works hand in hand with the NodeManagers to execute and monitor the application's tasks.

The application-level ApplicationMaster is responsible for negotiating resources, such as memory, CPU, and disk, from the ResourceManager on application submission. It is also responsible for tracking the application's status and monitoring application processes in coordination with the NodeManager.
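The negotiation described above can be pictured with a small toy model. This is only a sketch under simplifying assumptions, not the Hadoop API: the class names (ToyResourceManager, ToyApplicationMaster, Node), the preferred_node parameter, and the memory-only resource model are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for one machine managed by a NodeManager."""
    name: str
    free_mb: int
    containers: list = field(default_factory=list)


class ToyResourceManager:
    """Hypothetical stand-in for the ResourceManager: owns the cluster view."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def allocate(self, app_id, memory_mb, preferred_node=None):
        """Grant one container, trying the preferred node first (locality)."""
        # Stable sort: the preferred node (key False) moves to the front.
        candidates = sorted(self.nodes, key=lambda n: n.name != preferred_node)
        for node in candidates:
            if node.free_mb >= memory_mb:
                node.free_mb -= memory_mb
                node.containers.append((app_id, memory_mb))
                return node.name
        return None  # nothing fits right now; the caller may retry later


class ToyApplicationMaster:
    """Hypothetical stand-in for a per-application ApplicationMaster."""

    def __init__(self, app_id, rm):
        self.app_id = app_id
        self.rm = rm
        self.placements = []

    def request_containers(self, count, memory_mb, preferred_node=None):
        """Ask the RM for `count` containers and record where they landed."""
        for _ in range(count):
            node = self.rm.allocate(self.app_id, memory_mb, preferred_node)
            if node is not None:
                self.placements.append(node)
        return self.placements


rm = ToyResourceManager([Node("node1", 4096), Node("node2", 2048)])
am = ToyApplicationMaster("app_1", rm)
placements = am.request_containers(count=3, memory_mb=2048, preferred_node="node2")
print(placements)  # ['node2', 'node1', 'node1']
```

The first container lands on the preferred node; once that node is full, the remaining requests fall back to another node with free memory.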

ResourceManager
ApplicationMaster
NodeManager
NodeManager acts as a per-machine agent and is responsible for managing the life cycle of containers and for monitoring their resource usage.
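A rough sketch of what "managing the life cycle and monitoring resource usage" could look like, assuming a memory-only limit per container. The ToyNodeManager class, its method names, and the state names are hypothetical and are not taken from Hadoop.

```python
class ToyNodeManager:
    """Hypothetical per-machine agent tracking containers and their limits."""

    def __init__(self):
        # container_id -> {"limit_mb": int, "state": str}
        self.containers = {}

    def launch(self, container_id, limit_mb):
        """Start tracking a container with a memory limit."""
        self.containers[container_id] = {"limit_mb": limit_mb, "state": "RUNNING"}

    def monitor(self, usage_mb):
        """Compare observed usage with limits; kill containers over their limit."""
        killed = []
        for container_id, used in usage_mb.items():
            container = self.containers.get(container_id)
            if container is None or container["state"] != "RUNNING":
                continue
            if used > container["limit_mb"]:
                container["state"] = "KILLED"
                killed.append(container_id)
        return killed

    def complete(self, container_id):
        """Mark a running container as finished normally."""
        if self.containers[container_id]["state"] == "RUNNING":
            self.containers[container_id]["state"] = "COMPLETED"


nm = ToyNodeManager()
nm.launch("c1", limit_mb=1024)
nm.launch("c2", limit_mb=2048)
killed = nm.monitor({"c1": 1500, "c2": 1000})  # c1 exceeds its limit
nm.complete("c2")
print(killed, nm.containers["c2"]["state"])
```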

Scheduler policies
The FIFO scheduler
The Fair scheduler
The Capacity scheduler
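To make the difference between the three policies concrete, here is a toy comparison that hands out a fixed number of identical containers. It is a simplified sketch, not how the real schedulers are implemented: the function names are made up, the fair policy is plain max-min sharing, and the capacity policy assumes fixed queue fractions with no borrowing of idle capacity between queues.

```python
def fifo_allocate(capacity, demands):
    """Serve applications strictly in submission order."""
    grants = {}
    for app, demand in demands:
        grant = min(demand, capacity)
        grants[app] = grant
        capacity -= grant
    return grants


def fair_allocate(capacity, demands):
    """Max-min sharing: split evenly, then redistribute what small apps leave."""
    grants = {app: 0 for app, _ in demands}
    remaining = dict(demands)
    while capacity > 0 and remaining:
        share = capacity // len(remaining)
        if share == 0:
            break
        for app in list(remaining):
            grant = min(share, remaining[app])
            grants[app] += grant
            capacity -= grant
            remaining[app] -= grant
            if remaining[app] == 0:
                del remaining[app]
    return grants


def capacity_allocate(capacity, queue_shares, demands):
    """Give each queue a fixed fraction of the cluster, FIFO inside a queue."""
    grants = {}
    for queue, fraction in queue_shares.items():
        queue_capacity = int(capacity * fraction)
        queue_demands = [(app, d) for app, (q, d) in demands.items() if q == queue]
        grants.update(fifo_allocate(queue_capacity, queue_demands))
    return grants


demands = [("A", 10), ("B", 2), ("C", 2)]  # app name, containers wanted
fifo = fifo_allocate(12, demands)
fair = fair_allocate(12, demands)
cap = capacity_allocate(
    12,
    {"prod": 0.75, "dev": 0.25},
    {"A": ("prod", 10), "B": ("dev", 2), "C": ("dev", 2)},
)
print(fifo)  # {'A': 10, 'B': 2, 'C': 0}
print(fair)  # {'A': 8, 'B': 2, 'C': 2}
print(cap)   # {'A': 9, 'B': 2, 'C': 1}
```

With FIFO, the large early application starves the last one; with fair sharing, the small applications get everything they ask for and the rest goes to the large one; with fixed queue capacities, each queue is confined to its own slice.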
