/var/log/mesos/
https://mesosphere.github.io/dcos-commons/
https://gist.github.com/benclarkwood/77dea418bcac21a91d4a28e59b489af1
https://docs.mesosphere.com/1.7/usage/tutorials/wordpress-mysql/
https://www.youtube.com/watch?v=kDVBRh9J2Ys
http://xialingsc.github.io/home//mesos/How-to-install-Mesos-On-Mac/
/var/data/mesos
http://mesos.apache.org/documentation/latest/sandbox/#where-is-it
Mesos refers to the “sandbox” as a temporary directory that holds files specific to a single executor. Each time an executor is run, the executor is given its own sandbox and the executor’s working directory is set to the sandbox.
The sandbox is located on the agent, inside a directory tree like the following:
root ('--work_dir')
|-- slaves
| |-- latest (symlink)
| |-- <agent ID>
| |-- frameworks
| |-- <framework ID>
| |-- executors
| |-- <executor ID>
| |-- runs
| |-- latest (symlink)
| |-- <container ID> (Sandbox!)
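Given that layout, the current sandbox can be located by following the two latest symlinks. A minimal sketch, assuming a single framework and a single executor on the agent (the helper name is ours, not a Mesos API):

```python
import os

def latest_sandbox(work_dir):
    """Follow the 'latest' symlinks down to the newest executor's sandbox."""
    agent = os.path.realpath(os.path.join(work_dir, "slaves", "latest"))
    frameworks = os.path.join(agent, "frameworks")
    # With a single framework/executor this walks straight down the tree;
    # a real agent may host many frameworks and executors side by side.
    framework = os.path.join(frameworks, os.listdir(frameworks)[0])
    executors = os.path.join(framework, "executors")
    executor = os.path.join(executors, os.listdir(executors)[0])
    return os.path.realpath(os.path.join(executor, "runs", "latest"))
```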
https://github.com/mesosphere/dcos-commons
https://github.com/dcos/examples.git
Install dcos using vagrant
https://yurisubach.com/2016/07/06/dcos-local-deployment/
https://github.com/dcos/dcos-vagrant
- Access the GUI http://m1.dcos/
https://github.com/dcos/dcos-cli/issues/679
If you are using Vagrant, the --user=vagrant switch works. Note it still asks for a password, I entered the password as vagrant and I got in.
https://github.com/dcos/examples/tree/master/redis/1.10
dcos package install mr-redis
dcos node ssh --master-proxy --leader --user=vagrant
To manage instances you will need to use the API available on mrredis.mesos:5656 from within the DC/OS cluster. The following will create a Redis instance with the given name, a memory capacity of 100 MB, and 3 Redis slaves:
curl -X POST mrredis.mesos:5656/v1/CREATE/test3/100/1/3
curl -s mrredis.mesos:5656/v1/STATUS/test | jq
docker run -it --rm redis:4-alpine redis-cli -h 10.0.1.5 -p 6380
curl -X DELETE mrredis.mesos:5656/v1/DELETE/test
dcos package uninstall mr-redis
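The endpoint URLs above follow a positional path scheme. A small helper sketch for building them (the meaning of the third CREATE segment is inferred from the example and is an unverified assumption):

```python
BASE = "http://mrredis.mesos:5656/v1"  # reachable only from inside the DC/OS cluster

def create_url(name, mem_mb, masters, slaves):
    # Path layout inferred from the curl example above; the third numeric
    # segment is presumably the master count (assumption, not verified).
    return f"{BASE}/CREATE/{name}/{mem_mb}/{masters}/{slaves}"

def status_url(name):
    return f"{BASE}/STATUS/{name}"

def delete_url(name):
    return f"{BASE}/DELETE/{name}"
```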
https://github.com/dcos/examples/tree/master/elasticsearch/1.10
dcos package install elasticsearch
In the following we will use the DC/OS Admin Router to provide access to the Elasticsearch UI: use the URL http://$DCOS_DASHBOARD/service/elasticsearch/, replacing $DCOS_DASHBOARD with the URL of your DC/OS UI:
curl -XPUT 192.168.65.111:1025/customer?pretty
curl 192.168.65.111:1025/_cat/indices?v
curl -XPUT 192.168.65.111:1025/customer/external/1?pretty -d '
{
"name": "Dave C Os"
}'
dcos package uninstall elasticsearch
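The curl calls above can also be scripted. A standard-library sketch that builds (but does not send) the same PUT request; the coordinator address is the one from these notes and only works from inside the cluster:

```python
import json
import urllib.request

ES = "http://192.168.65.111:1025"  # coordinator address from the curl examples above

def index_doc(index, doc_type, doc_id, body):
    """Build the PUT request equivalent to the curl -XPUT line above."""
    return urllib.request.Request(
        url=f"{ES}/{index}/{doc_type}/{doc_id}?pretty",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )

# Inside the cluster: urllib.request.urlopen(index_doc("customer", "external", 1, {"name": "Dave C Os"}))
```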
http://blog.dataart.com/getting-started-with-packages-in-dcos/
https://docs.mesosphere.com/1.11/tutorials/dcos-101/
https://mesosphere.github.io/marathon/docs/persistent-volumes.html
You can create a stateful application by specifying a local persistent volume. Local volumes enable stateful tasks because tasks can be restarted without data loss.
When you specify a local volume or volumes, tasks and their associated data are "pinned" to the node they are first launched on and will be relaunched on that node if they terminate. The resources the application requires are also reserved. Marathon implicitly reserves an appropriate amount of disk space (as declared in the volume via persistent.size) in addition to the sandbox disk size you specify as part of your application definition.

{
  "containerPath": "data",
  "mode": "RW",
  "persistent": {
    "type": "root",
    "size": 10,
    "constraints": []
  }
}
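The volume stanza above plugs into a full app definition. A minimal, hypothetical Marathon app wrapping it might look like this (the id, cmd, and sizing are illustrative; `residency` is what Marathon requires for apps with local persistent volumes):

```python
import json

# Hypothetical app definition for illustration only; the volume stanza
# matches the persistent-volumes docs quoted above.
app = {
    "id": "/stateful-demo",
    "cmd": "while true; do date >> data/log; sleep 5; done",
    "cpus": 0.1,
    "mem": 32,
    "instances": 1,
    "residency": {"taskLostBehavior": "WAIT_FOREVER"},
    "container": {
        "type": "MESOS",
        "volumes": [{
            "containerPath": "data",
            "mode": "RW",
            "persistent": {"type": "root", "size": 10, "constraints": []},
        }],
    },
}
print(json.dumps(app, indent=2))  # POST this to Marathon / `dcos marathon app add`
```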
https://docs.mesosphere.com/services/elastic/2.2.0-5.6.5/install/
http://mesos.apache.org/documentation/latest/deploy-scripts/
One particularly useful setting is LIBPROCESS_IP, which tells the master and agent binaries which IP address to bind to; in some installations, the default interface that the hostname resolves to is not the machine's external IP address, so you can set the right IP through this variable.

https://github.com/uzyexe/mesos-marathon-demo/blob/master/docker-compose.yml - works
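To find the address worth putting in LIBPROCESS_IP, one common trick is to connect a UDP socket toward a public address and read back the chosen source address. A sketch (the 8.8.8.8 target is arbitrary; connecting a UDP socket sends no packets):

```python
import socket

def external_ip():
    """Return the IP of the interface holding the default route."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.connect(("8.8.8.8", 80))  # selects a source address; nothing is sent
        return s.getsockname()[0]
    except OSError:
        return "127.0.0.1"  # no default route; fall back to loopback
    finally:
        s.close()

# e.g. LIBPROCESS_IP=$(python3 external_ip.py) mesos-master ...
```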
http://mesos.apache.org/documentation/latest/endpoints/
/slaves
/master/slaves
/health
/master/health

https://hub.docker.com/r/mesosphere/mesos-slave-dind
--privileged=true - provides access to cgroups

Recommended Environment Variables
- MESOS_CONTAINERIZERS - include docker to enable running tasks as Docker containers. Ex: docker,mesos
- MESOS_RESOURCES - specify resources to avoid oversubscribing via auto-detecting host resources. Ex: cpus:4;mem:1280;disk:25600;ports:[21000-21099]
- DOCKER_NETWORK_OFFSET - an IP offset to give each mesos-slave-dind container (default: 0.0.1.0). Ex: 0.0.1.0 (slave A), 0.0.2.0 (slave B)
- DOCKER_NETWORK_SIZE - a CIDR range to apply to the above offset (default: 24)
- VAR_LIB_DOCKER_SIZE - the max size (in GB) of the loop device mounted at /var/lib/docker (default: 5). Only used if OverlayFS is not supported by the kernel or the parent Docker is configured to use AUFS.
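One way to picture OFFSET and SIZE together: treat the offset as a dotted-quad added to some base network, then apply the prefix length. This interpretation (and the 172.17.0.0 base) is an assumption of ours, not taken from the image docs:

```python
import ipaddress

def slave_subnet(base_cidr, offset, size):
    """Add a dotted-quad DOCKER_NETWORK_OFFSET to the base network address
    and apply DOCKER_NETWORK_SIZE as the prefix length. How the image
    actually combines these values is an assumption; verify against the
    mesos-slave-dind documentation."""
    base = ipaddress.ip_network(base_cidr).network_address
    off = int(ipaddress.ip_address(offset))
    return ipaddress.ip_network((int(base) + off, size))
```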
https://github.com/dcos/dcos-docker#network-routing-docker-for-mac
HyperKit (the hypervisor used by Docker for Mac) does not currently support IP routing on Mac.
Use one of the following alternative solutions instead:
- docker-mac-network sets up a VPN running in containers and uses a VPN client to route traffic to other containers.
- Docker for Mac - Host Bridge uses a kernel extension to add a new network interface and Docker network bridge.
https://docs.mesosphere.com/1.11/deploying-services/expose-service/
Create a Marathon app definition with the required "acceptedResourceRoles":["slave_public"] parameter specified.

All other users: you can use Marathon-LB, a rapid proxy and load balancer based on HAProxy.
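A minimal app definition carrying that parameter might look like the following sketch (the id, image, and ports are illustrative, not taken from the docs):

```json
{
  "id": "/nginx-public",
  "acceptedResourceRoles": ["slave_public"],
  "instances": 1,
  "cpus": 0.1,
  "mem": 64,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "nginx",
      "network": "BRIDGE",
      "portMappings": [{ "containerPort": 80, "hostPort": 80 }]
    }
  }
}
```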
https://mesosphere.github.io/marathon/docs/persistent-volumes.html
https://docs.mesosphere.com/1.7/usage/tutorials/marathon/stateful-services/
You’ll notice that we’re creating a volume for postgres to use for its data. Even if the task dies and restarts, it will get that volume back. Next, add this service to your cluster:
dcos marathon app add /1.7/usage/tutorials/marathon/stateful-services/postgres.marathon.json
Once the service has been scheduled and the Docker container has downloaded, postgres will become healthy and be ready to use. You can see this by checking what tasks are running on your cluster:
dcos marathon task list
dcos marathon app stop postgres
dcos marathon app start postgres
To restore the state of your cluster as it was before installing the stateful service, delete the service:
dcos marathon app remove postgres
https://www.reddit.com/r/devops/comments/6vft4j/main_differences_between_apache_mesos_vs/
http://mesosframeworks.com/
https://github.com/mesos/elasticsearch
https://github.com/adamtulinius/mesos-solrcloud
https://www.digitalocean.com/community/tutorials/an-introduction-to-mesosphere
- Master daemon: runs on a master node and manages slave daemons
- Slave daemon: runs on a slave node and runs tasks that belong to frameworks
- Framework: also known as a Mesos application, is composed of a scheduler, which registers with the master to receive resource offers, and one or more executors, which launches tasks on slaves. Examples of Mesos frameworks include Marathon, Chronos, and Hadoop
- Offer: a list of a slave node's available CPU and memory resources. All slave nodes send offers to the master, and the master provides offers to registered frameworks
- Task: a unit of work that is scheduled by a framework, and is executed on a slave node. A task can be anything from a bash command or script, to an SQL query, to a Hadoop job
- Apache ZooKeeper: software that is used to coordinate the master nodes
https://mesosphere.github.io/marathon/docs/application-basics.html
http://blog.csdn.net/pelick/article/details/45652117
Marathon depends on ZooKeeper and Mesos; if you don't have a Mesos cluster you can run in local mode.
First download the ZooKeeper package, modify conf/zoo.cfg, and then start a single zk service on localhost:2181. For Marathon, zk is used for leader election among the multiple replicas of the same app, so that after an app fails Marathon can restart it on a new Mesos slave. zk is also used for the Mesos cluster's HA mode; that is, zk handles HA for both Mesos and Marathon, but under separate node paths, as the startup parameters below show.
I likewise started a master and a slave locally, after which the Mesos master UI is visible at localhost:5050.
Download the Marathon package; I used version 0.8.0, which supports Mesos 0.20.0+. Once downloaded, just extract it and use it. https://mesosphere.github.io/marathon/
The command below connects to the locally started Mesos master and zk; the Marathon UI is at localhost:8080.
MESOS_NATIVE_JAVA_LIBRARY=/usr/local/Cellar/mesos/1.4.1/lib/libmesos.dylib
./marathon --master 127.0.0.1:5050 --zk zk://localhost:2181/marathon
Compared with Apache Aurora, another service-scheduling framework on Mesos, Marathon is easier to get started with. It is written in Scala and, like Mesos itself, feels lightweight; it mainly provides google-scale capability and convenient app-management services.
In practice you also need to give the current user ownership of /var/lib/mesos, otherwise you will see "/var/lib/mesos/replicated_log/LOCK: Permission denied Failed to recover the log". Change the ownership with:
sudo chown `whoami` /var/lib/mesos
https://medium.com/@gargar454/deploy-a-mesos-cluster-with-7-commands-using-docker-57951e020586
https://github.com/sekka1/mesosphere-docker
HOST_IP=$(docker-machine ip default)
https://mesosphere.com/blog/installing-mesos-on-your-mac-with-homebrew/
brew update
brew install mesos
/usr/local/Cellar/mesos/0.19.0: 83 files, 24M, built in 17.4 minutes
/usr/local/sbin/mesos-master --registry=in_memory --ip=127.0.0.1
A Mesos cluster needs at least one Mesos Master to coordinate and dispatch tasks onto Mesos Slaves. When experimenting on your laptop, a single master is all you need. Full production clusters, such as those you might run in a public cloud or in a private datacenter, will usually run Mesos in High Availability Mode. A highly-available Mesos cluster (designed for fault-tolerance with no single point of failure) will often have three or more masters running.
Once your Mesos Master has started, you can visit its management console: http://localhost:5050
Since a Mesos Master needs slaves onto which it will dispatch jobs, you might also want to run some of those. Mesos Slaves can be started by running the following command for each slave you wish to launch:
/usr/local/sbin/mesos-slave --master=127.0.0.1:5050 --work_dir=~/mesos/slave1
https://github.com/mesosphere/dcos-commons
https://www.usenix.org/legacy/events/nsdi11/tech/full_papers/Ghodsi.pdf
https://medium.com/@GetLevvel/how-to-get-started-with-apache-mesos-marathon-and-zookeeper-24fb72d76cf9
http://mesos.readthedocs.io/en/stable/getting-started/
https://platform9.com/blog/compare-kubernetes-vs-mesos/
https://stackoverflow.com/questions/26705201/whats-the-difference-between-apaches-mesos-and-googles-kubernetes
Apache Mesos abstracts CPU, memory, and disk resources in a way that allows datacenters to function as if they were one large machine
It has built-in support for isolating processes using containers, such as Linux control groups (cgroups) and Docker, allowing multiple applications to run alongside each other on a single machine.
resource offers, two-tier scheduling, and resource isolation
uses resource offers to advertise resources to frameworks
resource scheduling is the responsibility of the Mesos master’s allocation module and the framework’s scheduler, a concept known as two-tier scheduling
Dominant Resource Fairness (DRF)
DRF seeks to maximize the minimum dominant share across all users. For example, if user A runs CPU-heavy tasks and user B runs memory-heavy tasks, DRF attempts to equalize user A’s share of CPUs with user B’s share of memory
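The DRF idea above can be sketched as a tiny allocator: each user's dominant share is its largest fractional share of any single resource, and the next offer goes to whoever has the smallest dominant share. A toy model, not Mesos's actual allocation module:

```python
def dominant_share(used, total):
    """A user's dominant share is the max fraction it holds of any resource."""
    return max(used[r] / total[r] for r in total)

def next_user(users, total):
    """Offer resources to the user with the smallest dominant share."""
    return min(users, key=lambda u: dominant_share(users[u], total))

total = {"cpus": 9, "mem": 18}
users = {
    "A": {"cpus": 3, "mem": 1},  # CPU-heavy: dominant share 3/9 ≈ 0.33
    "B": {"cpus": 1, "mem": 4},  # memory-heavy: dominant share 4/18 ≈ 0.22
}
# B has the smaller dominant share, so DRF offers to B next.
```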
Resource isolation
Using Linux cgroups or Docker containers to isolate processes, Mesos allows for multitenancy, or for multiple processes to be executed on a single Mesos slave.
When using cgroups, any packages or libraries that the tasks might depend on must be already present on the host operating system.
The leading master is responsible for deciding which resources to offer to a particular framework using a pluggable allocation module, or scheduling algorithm, to distribute resource offers to the various schedulers. The scheduler can then either accept or reject the offer based on whether it has any work to be performed at that time.
--attributes='datacenter:pdx1;rack:1-1;os:rhel7'
--resources='cpu:24;mem:24576;disk:409600'
a framework is the term given to any Mesos application that’s responsible for scheduling and executing tasks on a cluster. A framework is made up of two components: a scheduler and an executor.
A scheduler is typically a long-running service responsible for connecting to a Mesos master and accepting or rejecting resource offers. Mesos delegates the responsibility of scheduling to the framework
An executor is a process launched on a Mesos slave that runs a framework’s tasks on a slave
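The accept/reject decision described above can be sketched as a toy scheduler loop (the Offer and Task shapes are simplified stand-ins, not the real Mesos framework API):

```python
from collections import namedtuple

Offer = namedtuple("Offer", "id agent cpus mem")
Task = namedtuple("Task", "name cpus mem")

def schedule(offers, pending):
    """Accept each offer that fits the next pending task; decline the rest.
    A real scheduler would then ask Mesos to launch the task via an executor."""
    launched, declined = [], []
    for offer in offers:
        if pending and pending[0].cpus <= offer.cpus and pending[0].mem <= offer.mem:
            launched.append((offer.id, pending.pop(0)))
        else:
            declined.append(offer.id)
    return launched, declined
```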
a Spark driver program connects to a cluster manager—the Spark master—that in turn distributes tasks to various worker nodes.
the Spark Driver refers to the machine running the Spark job, and the SparkContext is the main entry point to Spark. The SparkContext is responsible for connecting to a cluster manager and running tasks on the cluster.