Friday, August 31, 2018

Career Development



https://lethain.com/career-advice/
  • Be discoverable. One of the reasons I write online is so that folks can discover me, and being discoverable has led to many of the best things in my life. I suspect getting access to most exceptionally interesting jobs depends on first being widely discoverable.

  • Don’t stay comfortable for too long. Growth compounds and if you stay somewhere that you’re very comfortable for too long, then you’re missing out on so much future growth.

https://medium.com/@gayle/the-best-thing-you-can-do-for-your-career-just-say-yes-3f24d3673200
In fact, almost all the major turning points in my career were times when I wanted to say no to something because I didn’t really feel like it or didn’t feel ready for it or didn’t really see the point.



And this is what many people miss. You see the successes of successful people and you assume they “strategized” and then followed the right path. You don’t see that there were a whole lot of failures along the way. You don’t see all the unpredictability. You don’t see that many of their successes were not plans but rather whims — things they just did because… why not?

If I could give one piece of career advice to people, it’s this: Just say yes. Just do it. Try it. See what happens.
https://www.w3cschool.cn/architectroad/architectroad-yuc52cxo.html
What is the number one taboo in the workplace?
A: Shelving a problem or working around it when you run into one.

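Who has the least future in the workplace?
A: The "feedback black hole": someone who does not communicate and does not keep others in sync. The more capable such a person is, the more negative the effect.
In the workplace, without communication skills, every other skill counts for nothing.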

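The logic of the adult world should be "bet on what is likely to succeed":
• Taking the college entrance exam is more likely to lead to success than skipping school to go into business, so take the exam.
• Investing in fast-growing companies in a booming sector is likely to grow your wealth, so invest in those companies.
• Reading a book is more likely to pay off than scrolling through your social feed, so read.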

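Stating the facts will not persuade everyone, but does that make the facts unimportant?
No, so setting the record straight is still necessary.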

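What is the most important ability in the workplace?
A: The ability to express yourself.
In traditional society the most important assets were wealth and power; in the society of the future the most important asset is influence.
What influence is built from:
• Writing
• Public speaking
In this society, what problem can't be solved over a round of barbecue skewers? If there is one, make it two rounds.
In the workplace, what problem can't be solved with one conversation? If there is one, make it two conversations.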

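What attitude lets you feel like a fish in water at work?
Sincerity.
Communicate sincerely and wrap yourself in facts; that is the safest approach.
No road leads to sincerity; sincerity is the road to everything.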

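To sum up, here are the four sentences Luo Zhenyu gives new employees for getting by in the workplace:
Keep the goal clear, then work out a method and act.
Do the things you are likely to win at.
Communication and expression matter most.
When you cannot find the right attitude, go back to sincerity.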


Kafka Troubleshooting



https://mp.weixin.qq.com/s/pZ4BZmhzAJDpM4D8KJJfzQ
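We recently found that our production Kafka consumer clients frequently became unable to consume, which caused a backlog of unconsumed offsets. After restarting the Kafka brokers, everything went back to normal. Before the restart, Cloudera Manager showed that none of the three brokers was acting as the KafkaController, which was puzzling.
The screenshot was captured afterwards; at the time of the incident, Leader was -1 for every partition, and Isr was -1 as well.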

At this point our guess was that losing the KafkaController left the partition leader at -1, which in turn prevented the consumers from consuming normally.
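The symptom above (every partition reporting Leader -1) can be checked mechanically. The following is a minimal sketch, assuming the text comes from the `kafka-topics --describe` command and that partition rows look like the sample below. The topic name `events`, the sample rows, and the helper name `leaderless_partitions` are hypothetical, and the exact output format may differ between Kafka versions.

```python
import re

# Matches partition rows such as:
#   Topic: events  Partition: 0  Leader: -1  Replicas: 1,2,3  Isr: -1
# The topic summary row ("PartitionCount:...") does not match.
PARTITION_ROW = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+"
    r"Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)"
)


def leaderless_partitions(describe_output):
    """Return (topic, partition) pairs whose Leader field is -1."""
    result = []
    for line in describe_output.splitlines():
        match = PARTITION_ROW.search(line)
        if match and int(match.group("leader")) == -1:
            result.append((match.group("topic"), int(match.group("partition"))))
    return result


# Hypothetical sample output, shaped like the incident described above.
SAMPLE_OUTPUT = (
    "Topic:events\tPartitionCount:3\tReplicationFactor:3\tConfigs:\n"
    "\tTopic: events\tPartition: 0\tLeader: -1\tReplicas: 1,2,3\tIsr: -1\n"
    "\tTopic: events\tPartition: 1\tLeader: 2\tReplicas: 2,3,1\tIsr: 2,3,1\n"
    "\tTopic: events\tPartition: 2\tLeader: -1\tReplicas: 3,1,2\tIsr: -1\n"
)

if __name__ == "__main__":
    for topic, partition in leaderless_partitions(SAMPLE_OUTPUT):
        print(f"{topic}-{partition} has no leader")
```

A check like this could run periodically and raise an alert before consumer lag builds up; it only reads text, so it does not depend on any Kafka client library.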

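How the leader is elected
We first need to look at KafkaController. KafkaServer.startup() creates a KafkaController, and the KafkaController starts the controller elector when it starts up.
The election code shows that when a Kafka cluster first starts, whichever broker starts first becomes the controller by default. If the /controller node changes later, the LeaderChangeListener is triggered, and it either applies the change or runs a new election.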
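The election rule described above (first broker to register wins, and a change to /controller triggers a new election) can be illustrated with a toy model. This is a sketch only, not Kafka's implementation: `FakeZooKeeper` and `Broker` are hypothetical in-memory classes that stand in for the ephemeral /controller node and its change listener.

```python
class FakeZooKeeper:
    """In-memory stand-in for the single ephemeral /controller node."""

    def __init__(self):
        self.controller = None  # broker id stored at /controller, or None
        self.listeners = []     # callbacks fired when /controller is deleted

    def create_ephemeral(self, broker_id):
        """Create /controller; succeeds only if the node does not exist."""
        if self.controller is None:
            self.controller = broker_id
            return True
        return False

    def delete_controller(self):
        """Simulate the controller's session expiring, then notify watchers."""
        self.controller = None
        for callback in list(self.listeners):
            callback()


class Broker:
    def __init__(self, broker_id, zk):
        self.broker_id = broker_id
        self.zk = zk
        self.alive = True
        zk.listeners.append(self.on_controller_deleted)

    def elect(self):
        """Try to become controller; True means this broker won."""
        return self.zk.create_ephemeral(self.broker_id)

    def on_controller_deleted(self):
        # Plays the role of the change listener: live brokers race to re-elect.
        if self.alive:
            self.elect()


def simulate():
    zk = FakeZooKeeper()
    brokers = [Broker(broker_id, zk) for broker_id in (1, 2, 3)]
    for broker in brokers:      # brokers start in order 1, 2, 3
        broker.elect()
    first_controller = zk.controller
    brokers[0].alive = False    # the first controller dies
    zk.delete_controller()      # its ephemeral node disappears
    return first_controller, zk.controller


if __name__ == "__main__":
    print(simulate())
```

In this model the cluster ends up without a controller only if no live broker can reach the coordination service when the node disappears, which matches the incident described in this post.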


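We can see that the replica state machine is managed inside KafkaController, so once the controller is lost, the cluster also loses its ability to interact with ZooKeeper. By default a leader must be chosen from the ISR list, and we found that the list was empty. Investigation showed that this is a Kafka bug: when communication between the controller and ZooKeeper runs into trouble, the leader can be lost and the partition becomes unreachable. This is quite likely to happen, because ZooKeeper tends to time out under high concurrency, which is also why, from Kafka 0.8.2.1 onward, storing offsets in a Kafka topic is recommended over storing them in ZooKeeper.
Normally, of course, we would first check the system logs for exceptions, since that is the most efficient way to locate a problem. So let's see whether the Kafka server log contains any ZooKeeper-related exceptions.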


The screenshot shows that ZooKeeper connection failures did occur. We also noticed something rather odd:
2018-04-10 00:30:11,149 INFO kafka.controller.KafkaController: [Controller 218]: Currently active brokers in the cluster: Set()
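All of the brokers had gone offline temporarily. I then checked the server logs of the other brokers and found that every machine had hit a ZooKeeper connection timeout at the same moment, which set off the subsequent chain of ERRORs.
We found that there had been a traffic peak in the early hours of the morning. The main cause was therefore connection timeouts brought on by heavy system I/O pressure at that time, so we raised the timeout moderately (zookeeper.session.timeout.ms=12), and the timeouts never came back. This is in fact a Kafka bug that was fixed in version 0.10. Part of the reason our cluster hit this problem is that our Kafka cluster and our ZooKeeper cluster sit in different data centers on different networks. Later in the investigation we also found that a faulty switch port was causing unstable bandwidth, which was especially noticeable during large-scale computation.
PS: A version mismatch between the ZooKeeper server and client can also cause connection timeouts.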
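The log check used in this investigation can be automated. The sketch below scans server log text for the controller's "Currently active brokers" line quoted earlier and reports the timestamps at which the broker set was empty. The first sample line is taken from this post; the second sample line and the function names are hypothetical, and the regular expression assumes the log layout shown in that one line.

```python
import re

# Assumes the layout of the controller log line quoted in this post.
ACTIVE_BROKERS = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) INFO "
    r"kafka\.controller\.KafkaController: \[Controller (?P<controller>\d+)\]: "
    r"Currently active brokers in the cluster: Set\((?P<brokers>[^)]*)\)"
)


def active_broker_reports(log_text):
    """Return (timestamp, controller_id, broker_ids) for each matching line."""
    reports = []
    for line in log_text.splitlines():
        match = ACTIVE_BROKERS.match(line)
        if not match:
            continue
        raw = match.group("brokers").strip()
        broker_ids = [int(part) for part in raw.split(",")] if raw else []
        reports.append((match.group("ts"), int(match.group("controller")), broker_ids))
    return reports


def empty_cluster_timestamps(log_text):
    """Timestamps at which the controller saw no active brokers at all."""
    return [ts for ts, _, broker_ids in active_broker_reports(log_text) if not broker_ids]


SAMPLE_LOG = (
    # Line quoted from the incident:
    "2018-04-10 00:30:11,149 INFO kafka.controller.KafkaController: "
    "[Controller 218]: Currently active brokers in the cluster: Set()\n"
    # Hypothetical healthy line, added for contrast:
    "2018-04-10 00:35:02,004 INFO kafka.controller.KafkaController: "
    "[Controller 218]: Currently active brokers in the cluster: Set(218, 219, 220)\n"
)

if __name__ == "__main__":
    print(empty_cluster_timestamps(SAMPLE_LOG))
```

Comparing these timestamps across the server logs of all brokers is what revealed that every machine lost its ZooKeeper connection at the same time.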

