Thursday, March 31, 2016

Solr Cross Data Center Replication
  • New support for Cross Data Center Replication consisting of active/passive replication for separate SolrClouds hosted in separate data centers.
Cross Data Center Replication
  • Accommodate 2 or more data centers
  • Accommodate active/active uses
  • Accommodate limited-bandwidth cross-datacenter connections
  • Minimize coupling between peer clusters to increase reliability
  • Support both full consistency and eventual consistency
Clusters will be configured to know about each other, most likely through keeping a cluster peer list in zookeeper. One essential piece of information will be the zookeeper quorum address for each cluster peer. Any node in one cluster can know the configuration of another cluster via a zookeeper client.
Update flow will go from the shard leader in one cluster to the shard leader in the peer clusters. This can be bi-directional, with updates flowing in both directions. Updates can be either synchronous or asynchronous, with per-update granularity.
Solr transaction logs are currently removed when no longer needed. They will be kept around (potentially much longer) to act as the source of data to be sent to peer clusters. Recovery can also be bi-directional with each peer cluster sending the other cluster missed updates.

Architecture Features & Benefits

  • Scalable – no required single points of aggregation / dissemination that could act as a bottleneck.
  • Per-update choice of synchronous/asynchronous forwarding to peer clusters.
  • Peer clusters may have different configuration, such as replication factor.
  • Asynchronous updates allow for bursts of indexing throughput that would otherwise overload cross-DC pipes.
  • “Push” operation for lowest latency async updates.
  • Low-overhead… re-uses Solr’s existing transaction logs for queuing.
  • Leader-to-leader communication means update is only sent over cross-DC connection once.

Update Flow

  1. An update will be received by the shard leader and versioned
  2. Update will be sent from the leader to its replicas
  3. Concurrently, update will be sent (synchronously or asynchronously) to the shard leader in other clusters
  4. Shard leader in the other cluster will receive the already-versioned update (and not re-version it), and forward the update to its replicas

Solr Document Versioning

The shard leader versions a document and then forwards it to replicas. Update re-orders are handled by the receiver by dropping updates that are detected to be older than the latest document version in the index. This works given that complete documents are always sent to replicas, even if it started as a partial update on the leader.
Solr version numbers are derived from a timestamp (the high bits are milliseconds and the low bits are incremented for each tie in the same millisecond to guarantee a monotonically increasing unique version number for any given leader).
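The versioning scheme above can be sketched like this (a hypothetical simplification for illustration; Solr's actual VersionInfo implementation differs):

```java
// High bits carry milliseconds, low bits break ties within the same
// millisecond, so versions are unique and monotonically increasing
// for a given leader.
public class VersionClock {
    private long highestVersion = 0;

    public synchronized long nextVersion() {
        long candidate = System.currentTimeMillis() << 20; // 20 low bits reserved for ties
        if (candidate <= highestVersion) {
            candidate = highestVersion + 1;                // same millisecond: bump low bits
        }
        highestVersion = candidate;
        return candidate;
    }

    public static void main(String[] args) {
        VersionClock clock = new VersionClock();
        long v1 = clock.nextVersion();
        long v2 = clock.nextVersion();
        System.out.println(v2 > v1); // prints true: versions strictly increase
    }
}
```

Note that because the high bits come from the wall clock, two independent leaders producing versions this way are only ordered correctly relative to each other when their clocks are in sync, which is exactly the clock skew problem discussed below.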

The Clock Skew Problem

If updates are accepted for the same document in two different clouds (implying two different leaders versioning the document), then having the correct last document “win” relies on clock synchronization between the two leaders. Updates to the same document at different data centers within the clock skew time risk being incorrectly ordered.

The Partial Update Problem

Solr only has versions at the document level. The current partial update implementation (because of other constraints) reads the current stored fields of the document, makes the requested update, and indexes the new resulting document. This creates a problem with accepting Solr atomic updates / partial updates to the same document in both data-centers.
DC1: writes document A, version=time1
DC2: receives document A (version=time1) update from DC1
DC1: updates A.street_address (Solr reads version time1, writes version time2)
DC2: updates A.phone_number (Solr reads version time1, writes version time3)
DC1: receives document A (version=time3) from DC2, writes it.
DC2: receives document A (version=time2) from DC1, ignores it (older version)
Although both data-centers became “consistent”, the partial update of street_address was completely lost in the process.


Option 1:
Configure the update for full synchronization. All peer clusters must be available for any to be writeable.
Option 2:
Use client versioning, where the update clients specify a user-level version field.
Option 3:
For a given document, consider one cluster the primary for the purposes of document changes/updates. See “Primary Cluster Routing”.

Primary Cluster Routing

To deal with potential update conflicts arising from updating the same document in different data centers, each document can have a primary cluster.
A routing enhancement can ensure that a document sent to the wrong cluster will be forwarded to the correct cluster.
Routing can take as input a request parameter, a document field, or the unique id field. The primary cluster could be determined by hash code (essentially random), or could be determined by a mapping specified in the cluster peer list. Changes to this mapping for fail-over would not happen automatically in Solr. If a data center becomes unreachable, the application/client layers have responsibility for deciding that a different cluster should become the primary for that set of documents.
Primary cluster routing will be optional. Many applications will naturally not trigger the type of undesirable update behavior described, or will have the ability to work around update limitations.
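A minimal sketch of the hash-based flavor of primary selection (class and method names here are hypothetical, not Solr's actual routing code):

```java
import java.util.Arrays;
import java.util.List;

public class PrimaryClusterRouter {
    // Deterministic, essentially random assignment of a primary cluster
    // per unique document id.
    public static String primaryCluster(String docId, List<String> clusterPeers) {
        return clusterPeers.get(Math.floorMod(docId.hashCode(), clusterPeers.size()));
    }

    public static void main(String[] args) {
        List<String> peers = Arrays.asList("dc1", "dc2");
        // Same id always maps to the same primary, so conflicting updates
        // to one document are funneled through one cluster.
        System.out.println(primaryCluster("doc-42", peers));
    }
}
```

Because the mapping is a pure function of the id and the peer list, every cluster computes the same primary without any coordination; fail-over then amounts to the application substituting a different peer list.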

Future Option: Improve Partial Updates

Implement true partial updates with vector clocks and/or finer grained versioning so that updates to different fields can be done conflict free if re-ordered. This would also lower the bandwidth costs of partial updates since the entire document would no longer be sent to all replicas and to other peer clusters.

Future Option: Update Aggregators

One could potentially further minimize cross-DC traffic by introducing traffic aggregator nodes (one per cluster) that all updates would flow through. This would likely only improve bandwidth utilization in low-update environments. The improvements would come from fewer connections (and hence less connection overhead) and better compression (a block of many small updates would generally have a better compression ratio than the same updates compressed individually).

Future Option: Clusterstate proxy

Many zookeeper clients in a peer cluster could generate significant amounts of traffic between data centers. There could be a designated listener to the remote cluster state that could disseminate this state to others in the local cluster rather than hitting ZK directly.

Also worth investigating is the use of a local zookeeper observer node that could service all local ZK reads for the remote ZK quorum.

AWS High Availability
Regions are large and widely dispersed into separate geographic locations. Availability Zones are distinct locations within a region that are engineered to be isolated from failures in other Availability Zones and provide inexpensive, low latency network connectivity to other Availability Zones in the same region.
Each region is completely independent. Any ElastiCache activity you initiate (for example, creating clusters) runs only in your current default region.
Architectures using multiple AWS regions can be broadly classified into the following categories: Cold Standby, Warm Standby, Hot Standby, and Hot Active.
In designing High Availability Architectures using Multiple AWS regions we need to address the following set of challenges:

  • Workload Migration - ability to migrate our application environment across AWS regions
  • Data Sync - ability to maintain a real-time copy of the data between two or more regions
  • Network Flow - ability to enable flow of network traffic between two or more regions
The following diagram illustrates a sample AWS multi-region HA architecture.

Workload Migration: Amazon S3 or EBS-backed AMIs operate only within regional scope in AWS, so for inter-region HA architectures we need to recreate the same AMIs in the other AWS region. Whenever a code deployment is made, applications need to synchronize the executables/jars/configuration files across the AWS regions. Automated deployment tools like Puppet or Chef make such ongoing deployments easier. Note that in addition to AMIs, Amazon EBS volumes, Elastic IPs, etc. are also scoped to an AWS region.
Amazon Availability Zones are distinct physical locations having Low latency network connectivity between them inside the same region and are engineered to be insulated from failures from other AZ’s. They have Independent power, cooling, network and security. 

It is usually recommended as a best practice to architect applications to leverage multiple Availability Zones within a region. Availability Zones (AZs) are distinct geographical locations that are engineered to be insulated from failures in other AZs, and they come in really handy during outages. By placing Amazon EC2 instances in multiple AZs, an application can be protected from failure or outages at a single location.

It is important to run independent application stacks in more than one AZ, either in the same region or in another region, so that if one zone fails, the application in the other zone can continue to run. When we design such a system, we will need a good understanding of zone dependencies.

AWS offers infrastructure building blocks like 
  • Amazon S3 for Object and File storage
  • Amazon CloudFront for CDN
  • Amazon ELB for Load balancing
  • Amazon AutoScaling for Scaling out EC2 automatically
  • Amazon CloudWatch for Monitoring
  • Amazon SNS and SQS for Messaging
as Web Services which developers and architects can use in their app architecture. These building blocks are inherently fault tolerant, robust and scalable in nature, with built-in Multi-AZ capability for high availability. Example: S3 is designed to provide 99.999999999% durability and 99.99% availability of objects over a given year, and to sustain the concurrent loss of data in two facilities. Applications architected using these building blocks can leverage the experience of Amazon engineers for building highly available systems in the form of simple API calls.
The entire failover process, from detection to the resumption of normal caching behavior, will take several minutes. Your application’s caching tier should have a strategy (and some code!) to deal with a cache that is momentarily unavailable.

Software Principles

Software Principles to Teach Every New Recruit

KISS — Keep It Stupid Simple
DRY — Don’t Repeat Yourself
YAGNI — You Ain't Gonna Need It
Symmetry — aka consistency
Robustness Principle — follow the rules, but don’t expect everyone will
DTDD — Documentation and Test-Driven Design
Effective Logging
The Deming Principle — failure behind the failure?
Some coding styles are simply easier to read than others. For example, fewer levels of indentation and brackets is simpler. Less checking of conditions is simpler (and more efficient).

Coding styles that use less context are easier to follow. Our brains don't have huge stacks. Following multiple indentation levels, just like following execution paths through extra code layers, uses up some of our limited capacity. It makes grokking the code harder. The early-return style terminates execution quickly when it can, while the nested style keeps all execution paths lingering all the way to the end of the function. So, at every line later in the function, we have to keep track of what state we're in at each indent level. That's not in the spirit of KISS.
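The side-by-side code samples referenced above aren't reproduced here, but the contrast can be sketched as:

```java
public class GuardDemo {
    // Nested style: the result variable has to be tracked through
    // every indent level all the way to the end of the function.
    public static int nestedStyle(String s) {
        int result = -1;
        if (s != null) {
            if (!s.isEmpty()) {
                result = s.length();
            }
        }
        return result;
    }

    // Guard-clause style: invalid cases exit immediately,
    // so the happy path stays flat and carries no indentation debt.
    public static int guardStyle(String s) {
        if (s == null) return -1;
        if (s.isEmpty()) return -1;
        return s.length();
    }

    public static void main(String[] args) {
        // Same behavior, different load on the reader.
        System.out.println(nestedStyle("abc") == guardStyle("abc")); // prints true
    }
}
```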

Java Misc Part 2
Common Mistake #1: Neglecting Existing Libraries

Common Mistake #6: Using Null References without Need
Optional<String> optionalString = Optional.ofNullable(nullableString);

Common Mistake #8: Concurrent Modification Exception
for (IHat hat : hats) {
    if (hat.hasEarFlaps()) {
        hats.remove(hat); // throws ConcurrentModificationException: the list is modified mid-iteration; use an Iterator and it.remove() instead
    }
}

Class path entries can contain the base name wildcard character (*), which is considered equivalent to specifying a list of all of the files in the directory with the extension .jar or .JAR. For example, the class path entry mydir/* specifies all JAR files in the directory named mydir. A class path entry consisting of * expands to a list of all the jar files in the current directory. Files are considered regardless of whether they are hidden (have names beginning with '.').
A class path entry that contains an asterisk (*) does not match class files. To match both classes and JAR files in a single directory mydir, use either mydir:mydir/* or mydir/*:mydir. The order chosen determines whether the classes and resources in mydir are loaded before JAR files in mydir or vice versa.
Subdirectories are not searched recursively. For example, mydir/* searches for JAR files only in mydir, not in mydir/subdir1, mydir/subdir2, and so on.
The order in which the JAR files in a directory are enumerated in the expanded class path is not specified and may vary from platform to platform and even from moment to moment on the same machine. A well-constructed application should not depend upon any particular order. If a specific order is required, then the JAR files can be enumerated explicitly in the class path.
Expansion of wild cards is done early, before the invocation of a program's main method, rather than late, during the class-loading process. Each element of the input class path that contains a wildcard is replaced by the (possibly empty) sequence of elements generated by enumerating the JAR files in the named directory. For example, if the directory mydir contains a.jar, b.jar, and c.jar, then the class path mydir/* is expanded into mydir/a.jar:mydir/b.jar:mydir/c.jar, and that string would be the value of the system property java.class.path.
You can run JAR packaged applications with the Java launcher (java command). The basic command is:
java -jar jar-file
The -jar flag tells the launcher that the application is packaged in the JAR file format. You can only specify one JAR file, which must contain all of the application-specific code.
Before you execute this command, make sure that the runtime environment has information about which class within the JAR file is the application's entry point.
To indicate which class is the application's entry point, you must add a Main-Class header to the JAR file's manifest. The header takes the form:
Main-Class: classname
The header's value, classname, is the name of the class that is the application's entry point.

The 'e' flag (for 'entrypoint') creates or overrides the manifest's Main-Class attribute. It can be used while creating or updating a JAR file. Use it to specify the application entry point without editing or creating the manifest file.
For example, this command creates app.jar where the Main-Class attribute value in the manifest is set to MyApp:
jar cfe app.jar MyApp MyApp.class

If the entrypoint class name is in a package, it may use a '.' (dot) character as the delimiter. For example, if Main.class is in a package called foo, the entry point can be specified in the following ways:
jar cfe Main.jar foo.Main foo/Main.class

We first create a text file named Manifest.txt with the following contents:
Class-Path: MyUtils.jar
Command-Line Arguments

Passing system properties
I suspect the problem is that you've put the "-D" after the -jar. Try this:
java -Dtest="true" -jar myApplication.jar
From the command line help:
java [-options] -jar jarfile [args...]
In other words, the way you've got it at the moment will treat -Dtest="true" as one of the arguments to pass to main instead of as a JVM argument.
A JVM runs with a number of system properties. You can configure system properties by using the -D option (an uppercase 'D').

All you have to do is use the -D flag, providing the system property name immediately after the D, then an equals sign, and then the value to be assigned to the property. For example, to set the file.encoding property of the Java runtime to utf-8, you could set the following property:

java -Dfile.encoding=utf-8

You can then grab the value programmatically as follows:

System.getProperty("file.encoding"); /*this method is overloaded, as per previous post*/

  • Command line options, contrary to command line data arguments, start with a prefix that uniquely identifies them. Prefix examples include a dash (-) on Unix platforms for options like -a, or a slash (/) on Windows platforms.
  • Options can either be simple switches (i.e., -a can be present or not) or take a value. An example is:
    java MyTool -a -b logfile.inp
  • Options that take a value can have different separators between the actual option key and the value. Such separators can be a blank space, a colon (:), or an equals sign (=):
    java MyTool -a -b logfile.inp
    java MyTool -a -b:logfile.inp
    java MyTool -a -b=logfile.inp
  • Options taking a value can add one more level of complexity. Consider the way Java supports the definition of environment properties as an example:
    java -Djava.library.path=/usr/lib ...
  • So, beyond the actual option key (D), the separator (=), and the option's actual value (/usr/lib), an additional parameter (java.library.path) can take on any number of values (in the above example, numerous environment properties can be specified using this syntax). In this article, this parameter is called "detail."
Data arguments are all command line arguments that do not start with a prefix.
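The option/data split described above can be sketched with a hypothetical helper (real tools like Commons CLI or JCommander handle the separator and "detail" variants more thoroughly):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ArgSplitter {
    // Anything starting with '-' is an option (optionally "key=value");
    // everything else is a data argument.
    public static void split(String[] args, Map<String, String> options, List<String> data) {
        for (String arg : args) {
            if (arg.startsWith("-")) {
                int eq = arg.indexOf('=');
                if (eq >= 0) {
                    options.put(arg.substring(1, eq), arg.substring(eq + 1)); // option with value
                } else {
                    options.put(arg.substring(1), ""); // simple switch
                }
            } else {
                data.add(arg); // no prefix: data argument
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> opts = new LinkedHashMap<>();
        List<String> data = new ArrayList<>();
        split(new String[]{"-a", "-b=logfile.inp", "input.txt"}, opts, data);
        System.out.println(opts + " " + data);
    }
}
```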
int foo = bar.charAt(1) - '0';
Because char is an unsigned 16-bit integral type, it can safely be widened to an int. The widening is done automatically whenever arithmetic is involved.
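A quick demonstration of the digit-character trick:

```java
public class CharMath {
    // '0'..'9' are contiguous code points, so subtracting '0'
    // from a digit character yields its numeric value.
    public static int digitValue(char c) {
        return c - '0'; // char widens to int automatically in arithmetic
    }

    public static void main(String[] args) {
        String bar = "x7y";
        System.out.println(digitValue(bar.charAt(1))); // prints 7
    }
}
```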
Parsing date and time
To create a LocalDateTime object from a string you can use the static LocalDateTime.parse()method. It takes a string and a DateTimeFormatter as parameter. The DateTimeFormatter is used to specify the date/time pattern.
String str = "1986-04-08 12:30";
DateTimeFormatter formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm");
LocalDateTime dateTime = LocalDateTime.parse(str, formatter);
Formatting date and time
To create a formatted string out a LocalDateTime object you can use the format() method.
DateTimeFormatter formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm");
LocalDateTime dateTime = LocalDateTime.of(1986, Month.APRIL, 8, 12, 30);
String formattedDateTime = dateTime.format(formatter); // "1986-04-08 12:30"
Note that there are some commonly used date/time formats predefined as constants in DateTimeFormatter. For example: Using DateTimeFormatter.ISO_DATE_TIME to format the LocalDateTime instance from above would result in the string "1986-04-08T12:30:00".
The parse() and format() methods are available for all date/time related objects (e.g. LocalDate or ZonedDateTime)
Just to note that DateTimeFormatter is immutable and thread-safe, and thus the recommended approach is to store it in a static constant where possible
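Putting that recommendation into practice:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class Timestamps {
    // DateTimeFormatter is immutable and thread-safe, so build it once
    // and share it across all threads.
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm");

    public static String format(LocalDateTime dt) {
        return dt.format(FMT);
    }

    public static LocalDateTime parse(String s) {
        return LocalDateTime.parse(s, FMT);
    }

    public static void main(String[] args) {
        // Round-trips through the shared formatter.
        System.out.println(format(parse("1986-04-08 12:30"))); // prints 1986-04-08 12:30
    }
}
```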
If you're using Java 7 or Java 8, you should strongly consider using java.nio.file.Path. Path.resolve can be used to combine one path with another, or with a string. The Paths helper class is useful too. For example:
Path path = Paths.get("foo", "bar", "baz.txt");
If you need to cater for pre-Java-7 environments, you can use the File(File parent, String child) constructor, like this:
File baseDirectory = new File("foo");
File subDirectory = new File(baseDirectory, "bar");
File fileInDirectory = new File(subDirectory, "baz.txt");
import java.text.DateFormatSymbols;
monthString = new DateFormatSymbols().getMonths()[month-1];
Alternatively, you could use SimpleDateFormat:
import java.text.SimpleDateFormat;
System.out.println(new SimpleDateFormat("MMMM").format(date));
You should have a look at the StringUtils class from the Apache Commons Lang lib - it has a method capitalize(). (Note: if you need "fOO BAr" to become "Foo Bar", then use capitalizeFully(..) instead.)
Description from the lib:
Capitalizes a String changing the first letter to title case as per Character.toTitleCase(char). No other letters are changed.

Methods in Object: hashCode, equals, toString, clone, finalize, getClass, wait, notify, notifyAll
FileUtils.copyInputStreamToFile(initialStream, targetFile);

guava Files.write(buffer, targetFile);
Most of the answers so far have had to do with the OS scheduler. However, there is a more important factor that I think would lead to your answer. Are you writing to a single physical disk, or multiple physical disks?
Even if you parallelize with multiple threads...IO to a single physical disk is intrinsically a serialized operation. Each thread would have to block, waiting for its chance to get access to the disk. In this case, multiple threads are probably useless...and may even lead to contention problems.
However, if you are writing multiple streams to multiple physical disks, processing them concurrently should give you a boost in performance. This is particularly true with managed disks, like RAID arrays, SAN devices, etc.
CPU Bound means the rate at which process progresses is limited by the speed of the CPU. A task that performs calculations on a small set of numbers, for example multiplying small matrices, is likely to be CPU bound.
I/O Bound means the rate at which a process progresses is limited by the speed of the I/O subsystem. A task that processes data from disk, for example, counting the number of lines in a file is likely to be I/O bound.
Memory bound means the rate at which a process progresses is limited by the amount memory available and the speed of that memory access. A task that processes large amounts of in memory data, for example multiplying large matrices, is likely to be Memory Bound.
Cache bound means the rate at which a process progress is limited by the amount and speed of the cache available. A task that simply processes more data than fits in the cache will be cache bound.
I/O Bound would be slower than Memory Bound would be slower than Cache Bound would be slower than CPU Bound.
The instanceof operator and the isInstance() method are both used for checking the class of an object. The main difference comes when we want to check the class of an object dynamically: isInstance() works against a Class object known only at runtime, which the instanceof operator cannot do.

NOTE: the instanceof operator causes a compile-time error (incompatible conditional operand types) if we check an object against a class it could never be an instance of.
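A short demonstration of the difference:

```java
public class InstanceChecks {
    // Runtime check against a Class object chosen dynamically; instanceof
    // cannot do this because its right-hand side must be a compile-time type.
    public static boolean matches(Object obj, Class<?> cls) {
        return cls.isInstance(obj);
    }

    public static void main(String[] args) {
        Object s = "hello";
        System.out.println(s instanceof String);      // prints true (static type check)
        System.out.println(matches(s, String.class)); // prints true (dynamic check)
        System.out.println(matches(s, Integer.class)); // prints false
    }
}
```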

Null is a special value used in Java. It is mainly used to indicate that no value is assigned to a reference variable.

import org.apache.maven.artifact.versioning.DefaultArtifactVersion;

DefaultArtifactVersion minVersion = new DefaultArtifactVersion("1.0.1");
DefaultArtifactVersion maxVersion = new DefaultArtifactVersion("1.10");

DefaultArtifactVersion version = new DefaultArtifactVersion("1.11");

if (version.compareTo(minVersion) < 0 || version.compareTo(maxVersion) > 0) {
    System.out.println("Sorry, your version is unsupported");
You can get the right dependency string for Maven Artifact from this page:
The best way to reuse existing code is to take Maven's ComparableVersion class:
  • Apache License, Version 2.0,
  • tested,
  • used (copied) in multiple projects like spring-security-core, jboss etc
  • multiple features
  • it's already a java.lang.Comparable
  • just copy-paste that one class, no third-party dependencies
Don't add a dependency on maven-artifact, as that will pull in various transitive dependencies.
On the face of it, InetAddress.getLocalHost() should give you the IP address of this host. The problem is that a host could have lots of network interfaces, and an interface could be bound to more than one IP address. And to top that, not all IP addresses will be reachable from off the machine. Some could be virtual devices, and others could be private network IP addresses.
What this means is that the IP address returned by InetAddress.getLocalHost() might not be the right one to use.
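A common workaround (a sketch, and still environment-dependent) is to enumerate the network interfaces yourself and filter out loopback and down interfaces:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class LocalAddresses {
    // Collects IPv4 addresses from interfaces that are up and not loopback.
    // Which of these is "right" still depends on the deployment.
    public static List<String> candidateAddresses() throws SocketException {
        List<String> result = new ArrayList<>();
        for (NetworkInterface nic : Collections.list(NetworkInterface.getNetworkInterfaces())) {
            if (nic.isLoopback() || !nic.isUp()) continue; // skip lo and down interfaces
            for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
                if (addr instanceof Inet4Address) {
                    result.add(addr.getHostAddress());
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws SocketException {
        System.out.println(candidateAddresses());
    }
}
```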
  1. HashSet doesn’t maintain any kind of order of its elements.
  2. TreeSet sorts the elements in ascending order.
  3. LinkedHashSet maintains the insertion order. Elements come back in the same sequence in which they were added to the Set.
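The three behaviors side by side:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.TreeSet;

public class SetOrderDemo {
    public static void main(String[] args) {
        List<String> input = Arrays.asList("banana", "apple", "cherry");

        System.out.println(new HashSet<>(input));       // no guaranteed order
        System.out.println(new TreeSet<>(input));       // prints [apple, banana, cherry] (sorted)
        System.out.println(new LinkedHashSet<>(input)); // prints [banana, apple, cherry] (insertion order)
    }
}
```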

1. Using BufferedReader class

By wrapping (the standard input stream) in an InputStreamReader, which is wrapped in a BufferedReader.

Advantages: The input is buffered for efficient reading.
Drawbacks: The wrapping code is hard to remember.
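The wrapping chain looks like this (fed from an in-memory stream here so the example is self-contained; pass in real use):

```java
import java.util.stream.Stream;

public class StdinReader {
    // The chain to remember: InputStream -> InputStreamReader -> BufferedReader.
    public static String readLine(InputStream in) throws IOException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(in));
        return reader.readLine();
    }

    public static void main(String[] args) throws IOException {
        // Substitute for in a real program.
        InputStream fake = new ByteArrayInputStream("hello\n".getBytes());
        System.out.println(readLine(fake)); // prints hello
    }
}
```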

2. Using Scanner class

The main purpose of the Scanner class (available since Java 1.5) is to parse primitive types and strings using regular expressions; however, it can also be used to read input from the user on the command line.
  • Convenient methods for parsing primitives (nextInt(), nextFloat(), …) from the tokenized input.
  • Regular expressions can be used to find tokens.
  • The reading methods are not synchronized.
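A minimal Scanner sketch (backed by a String so it is self-contained; the Locale is pinned because nextDouble() is locale-sensitive):

```java
import java.util.Locale;
import java.util.Scanner;

public class ScannerDemo {
    // Parses "name age score" from tokenized input.
    public static String parseRecord(String input) {
        Scanner sc = new Scanner(input).useLocale(Locale.US); // '.' as decimal point everywhere
        String name =;
        int age = sc.nextInt();
        double score = sc.nextDouble();
        return name + "/" + age + "/" + score;
    }

    public static void main(String[] args) {
        // In real use, construct the Scanner over instead.
        System.out.println(parseRecord("john 42 3.5")); // prints john/42/3.5
    }
}
```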

3. Using Console class

  • Reading password without echoing the entered characters.
  • Reading methods are synchronized.
  • Format string syntax can be used.
  • Does not work in non-interactive environment (such as in an IDE).
It has become a preferred way of reading user input from the command line. In addition, it can be used for reading password-like input without echoing the characters entered by the user; the format string syntax can also be used (as with System.out.printf()).

System.out.print("Enter your username: ");
String username = console.readLine();
System.out.print("Enter your password: ");
char[] password = console.readPassword();
String passport = console.readLine("Enter your %d (th) passport number: ", 2);

  • public Scanner useDelimiter(String pattern) – Sets this scanner’s delimiting pattern to a pattern constructed from the specified String. An invocation of this method of the form useDelimiter(pattern) behaves in exactly the same way as the invocation useDelimiter(Pattern.compile(pattern)). Invoking the reset() method will set the scanner’s delimiter to the default.
Scanner scanner = new Scanner(text).useDelimiter("\\s*,\\s*");
String initialString = "text";
InputStream targetStream = new ByteArrayInputStream(initialString.getBytes());
// or, with Apache Commons IO:
InputStream targetStream = IOUtils.toInputStream(initialString);
Mocking an input stream is a lot of work and not really worth doing. There are many ways to get fake input streams that your tests can set up, without using mock objects. Try this:
String fakeInput = "This is the string that your fake input stream will return";
StringReader reader = new StringReader(fakeInput);
InputStream fakeStream = new ReaderInputStream(reader);
Note that ReaderInputStream is in Apache Commons IO
import java.util.*;
import java.text.*;

    public String toString() {
        Iterator<E> it = iterator();
        if (! it.hasNext())
            return "[]";

        StringBuilder sb = new StringBuilder();
        sb.append('[');
        for (;;) {
            E e =;
            sb.append(e == this ? "(this Collection)" : e);
            if (! it.hasNext())
                return sb.append(']').toString();
            sb.append(',').append(' ');
        }
    }

In Java 8 or later:
String listString = String.join(", ", list);
In case the list is not of type String, a joining collector can be used:
String listString =
                        .collect(Collectors.joining(", "));

String joined = Joiner.on("\t").join(list);
The ArrayList class (Java Docs) extends the AbstractList class, which extends the AbstractCollection class, which contains a toString() method (Java Docs), so a List already prints itself readably. For a plain array, you simply write
Arrays.toString(current_array)
import org.apache.commons.lang3.StringUtils
StringUtils.join(slist, ',');
You'll have to use the latter sometimes when the compiler cannot automatically figure out what kind of Map is needed (this is called type inference). For example, consider a method declared like this:
public void foobar(Map<String,String> map){ ... }
When passing the empty Map directly to it, you have to be explicit about the type:
foobar(Collections.emptyMap());                // doesn't compile (before Java 8's improved type inference)
foobar(Collections.<String,String>emptyMap()); // works fine
2) If you need to be able to modify the Map, then for example:
new HashMap<String,String>();

Addendum: if your project uses Guava, you have the following alternatives:
1) Immutable map:
// or:
ImmutableMap.<String, String>of();
Granted, no big benefits here compared to Collections.emptyMap(). From the Javadoc:
This map behaves and performs comparably to Collections.emptyMap(), and is preferable mainly for consistency and maintainability of your code.
2) Map that you can modify:
// or:
Maps.<String, String>newHashMap();
Maps contains similar factory methods for instantiating other types of maps as well, such as TreeMap or LinkedHashMap.
It is, in my personal experience admittedly, very useful in cases where an API requires a collection of parameters, but you have nothing to provide. For example you may have an API that looks something like this, and does not allow null references:
public ResultSet executeQuery(String query, Map<String, Object> queryParameters);
If you have a query that doesn't take any parameters, it's certainly a bit wasteful to create a HashMap, which involves allocating an array, when you could just pass in the 'Empty Map' which is effectively a constant, the way it's implemented in java.util.Collections.
Why would I want an immutable empty collection? What is the point?
There are two different concepts here that appear strange when viewed together. It makes more sense when you treat the two concepts separately.
  • Firstly, you should prefer to use an immutable collection rather than a mutable one wherever possible. The benefits of immutability are well documented elsewhere.
  • Secondly, you should prefer to use an empty collection rather than to use null as a sentinel. This is well described here. It means that you will have much cleaner, easier to understand code, with fewer places for bugs to hide.
A final variable in Java can be assigned a value only once; we can assign the value either at declaration or later.
    final int i = 10;
    i = 30; // Error because i is final.
A blank final variable in Java is a final variable that is not initialized during declaration. Below is a simple example of a blank final.
    // A simple blank final example 
    final int i;
    i = 30;
If we have more than one constructor (overloaded constructors) in a class, then the blank final variable must be initialized in all of them. However, constructor chaining can be used to initialize the blank final variable in one place.
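For example:

```java
public class Account {
    private final int id; // blank final: not initialized at declaration

    public Account(int id) { = id; // assigned exactly once
    }

    public Account() {
        this(0); // constructor chaining satisfies the blank-final requirement
    }

    public int getId() {
        return id;

    public static void main(String[] args) {
        System.out.println(new Account().getId());   // prints 0
        System.out.println(new Account(42).getId()); // prints 42
    }
}
```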
How can I determine the type of a generic field in Java?
As pointed out by wowest in a comment, you actually need to call Field#getGenericType(), check if the returned Type is a ParameterizedType, and then grab the parameters accordingly. Use ParameterizedType#getRawType() and ParameterizedType#getActualTypeArguments() to get the raw type and the array of type arguments of a ParameterizedType, respectively. The following code demonstrates this:
for (Field field : Person.class.getDeclaredFields()) {
    System.out.print("Field: " + field.getName() + " - ");
    Type type = field.getGenericType();
    if (type instanceof ParameterizedType) {
        ParameterizedType pType = (ParameterizedType)type;
        System.out.print("Raw type: " + pType.getRawType() + " - ");
        System.out.println("Type args: " + pType.getActualTypeArguments()[0]);
    } else {
        System.out.println("Type: " + field.getType());
    }
}
class Person {
  public final String name;
  public final List<Person> children;
}

//in main
Field[] fields = Person.class.getDeclaredFields();
for (Field field : fields) {
  Type type = field.getGenericType();
  System.out.println("field name: " + field.getName());
  if (type instanceof ParameterizedType) {
    ParameterizedType ptype = (ParameterizedType) type;
    System.out.println("-raw type:" + ptype.getRawType());
    System.out.println("-type arg: " + ptype.getActualTypeArguments()[0]);
  } else {
    System.out.println("-field type: " + field.getType());
This outputs
field name: name
-field type: class java.lang.String
field name: children
-raw type:interface java.util.List
-type arg: class com.blah.Person
    public boolean equals(Object obj) {
        return EqualsBuilder.reflectionEquals(this, obj);
    }

    public String toString() {
        return ToStringBuilder.reflectionToString(this);
    }

    public int hashCode() {
        return HashCodeBuilder.reflectionHashCode(this);
    }
With Guava you can use Lists.newArrayList(Iterable) or Sets.newHashSet(Iterable), among other similar methods. This will of course copy all the elements in to memory. If that isn't acceptable, I think your code that works with these ought to take Iterable rather than Collection. Guava also happens to provide convenient methods for doing things you can do on a Collection using an Iterable (such as Iterables.isEmpty(Iterable) or Iterables.contains(Iterable, Object)), but the performance implications are more obvious.
The short answer is that, by definition, "-0.0 is less than 0.0" in all the methods provided by the Double class (that is, equals(), compare(), compareTo(), etc.).
Double totally orders all floating point values on a number line, whereas the primitives behave the way a user would expect in the real world: 0d == -0d.
The following snippets illustrate the behaviour ...
final double d1 = 0d, d2 = -0d;

System.out.println(d1 == d2); //prints ... true
System.out.println(d1 < d2);  //prints ... false
System.out.println(d2 < d1);  //prints ... false
System.out.println(Double.compare(d1, d2)); //prints ... 1
System.out.println(Double.compare(d2, d1)); //prints ... -1
The equality operator (==) and the inequality operators (< and >) treat negative zero and (positive) zero exactly the same. However, the Math.min and Math.max functions treat (positive) zero and negative zero differently. While not shown in this code, the StrictMath.min and StrictMath.max functions also treat (positive) zero as different from negative zero.

How come a primitive float value can be -0.0?
Floating point numbers are stored in memory using the IEEE 754 standard, which means there can be rounding errors. You can never store a floating point number of infinite precision with finite resources.
You should never test whether a floating point number is == to another, i.e. never write code like this:
if (a == b)
where a and b are floats. Due to rounding errors those two numbers might be stored as different values in memory.
You should define a precision you want to work with:
private final static double EPSILON = 0.00001;
and then test against the precision you need
if (Math.abs(a - b) < EPSILON)
So in your case, to test whether a floating point number equals zero within the given precision:
if (Math.abs(a) < EPSILON)

You could define callFriend this way:
public <T extends Animal> T callFriend(String name, Class<T> type) {
    return type.cast(friends.get(name));
}
Then call it as such:
jerry.callFriend("spike", Dog.class).bark();
jerry.callFriend("quacker", Duck.class).quack();

You could implement it like this:

@SuppressWarnings("unchecked")
public <T extends Animal> T callFriend(String name) {
    return (T) friends.get(name);
}

The return type will be inferred from the caller. However, note the @SuppressWarnings annotation: that tells you that this code isn't typesafe. You have to verify it yourself, or you could get ClassCastExceptions at runtime.

Unfortunately, the way you're using it (without assigning the return value to a temporary variable), the only way to make the compiler happy is to call it with an explicit type witness, like this:

jerry.<Dog>callFriend("spike").bark();
While this may be a little nicer than casting, you are probably better off giving the Animal class an abstract talk() method, as David Schmitt said.
It's perfectly fine to declare a method with the signature public <T> T getDate().
However, it is impossible to implement the method that returns what you want. What a method does at runtime cannot depend on its type parameter alone, because it doesn't know its type parameter.
To get an intuition for this, realize that any code written with generics can also be written equivalently without using generics, by simply removing generic parameters and inserting casts where appropriate. This is what "type erasure" means.
Therefore, to see whether your method would be possible in Generics, simply ask, how would you do it without Generics:
public Object getDate() {
    // what would you do here?
}

Date myDate = (Date) getDate();
If you can't do it without Generics, you cannot do it with Generics either.

With Arrays.asList, sometimes it's necessary to specify the return type. Java attempts to "narrow" the return type based on the common supertype(s) of all the arguments, but sometimes you need a specific type. For example,
List<Number> list = Arrays.asList(1, 2, 3);
Here, Arrays.asList returns a List<Integer> and you get a compiler error. To get it to return a List<Number>, you must specify the generic type.
List<Number> list = Arrays.<Number>asList(1, 2, 3);
11. floor: returns the largest element in this set that is less than or equal to the given element, or null if there is no such element.
public E floor(E e) {
        return m.floorKey(e);
}
12. headSet: returns a view of the portion of this set whose elements are strictly less than toElement.
public SortedSet<E> headSet(E toElement) {
        return headSet(toElement, false);
}
13. higher: returns the smallest element in this set that is strictly greater than the given element, or null if there is no such element.
public E higher(E e) {
        return m.higherKey(e);
}
16. last: returns the last (highest) element currently in this set.
public E last() {
        return m.lastKey();
}
17. lower: returns the largest element in this set that is strictly less than the given element, or null if there is no such element.
public E lower(E e) { return m.lowerKey(e); }
22. subSet: returns a view of a portion of this set.
3. ceiling: returns the smallest element in this set that is greater than or equal to the given element, or null if there is no such element.
public E ceiling(E e) { return m.ceilingKey(e); }
Why do indexOf and binarySearch produce different results? Because they are implemented differently: indexOf is based on equals and considers an element found as soon as equals returns true, while binarySearch is based on compareTo and considers an element found when compareTo returns 0. In our Student class we overrode both compareTo and equals, but they compare different fields: one compares age, the other name. With different comparison criteria, the results can easily differ. Knowing the cause, the fix is simple: make the two methods use the same comparison criteria.
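The mismatch described above can be reproduced with a small sketch (this Student class is a stand-in for the one in the original discussion): equals compares by name while compareTo compares by age, so indexOf and Collections.binarySearch can disagree on the same probe object.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SearchMismatchDemo {
    static class Student implements Comparable<Student> {
        final String name;
        final int age;
        Student(String name, int age) { this.name = name; this.age = age; }

        @Override public boolean equals(Object o) { // compares by name
            return o instanceof Student && ((Student) o).name.equals(name);
        }
        @Override public int hashCode() { return name.hashCode(); }
        @Override public int compareTo(Student s) { // compares by age
            return Integer.compare(age, s.age);
        }
    }

    public static void main(String[] args) {
        List<Student> list = new ArrayList<>(Arrays.asList(
                new Student("alice", 20), new Student("bob", 22)));
        Collections.sort(list); // binarySearch requires a sorted list

        // Same name as "bob" (index 1), same age as "alice" (index 0):
        Student probe = new Student("bob", 20);

        System.out.println(list.indexOf(probe));                   // 1 (equals: name)
        System.out.println(Collections.binarySearch(list, probe)); // 0 (compareTo: age)
    }
}
```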
Sooo… when you say “wrongly” do you mean, not assigning the subList to a new ArrayList?
Yes, particularly when you do not know how you are going to use the sublist.
"clear" is relocating objects in the underlying native array (an Object[]), but it doesn't resize the array. If you want reduce the array size after removing some items in the ArrayList, use trimToSize() method.
Unused element references of the array are set to null, so the elements could be garbage collected.
Two things I can think of are:
  1. list.subList(0, 5) returns an empty list, therefore .clear() does nothing.
  2. Not sure of the inner workings of the List implementation you're using (ArrayList, LinkedList, etc.), but having equals and hashCode implemented may be important. I had a similar issue with Maps, where HashMap definitely needs the hashCode implementation.



public static void main(String[] args) {
        List<Integer> list1 = new ArrayList<Integer>();
        // subList creates a view list3 covering the whole of list1
        List<Integer> list3 = list1.subList(0, list1.size());
        list1.add(1); // structural modification of the original list
        System.out.println("list1'size:" + list1.size());
        System.out.println("list3'size:" + list3.size()); // throws
}
Exception in thread "main" java.util.ConcurrentModificationException
    at java.util.ArrayList$SubList.checkForComodification(Unknown Source)
    at java.util.ArrayList$SubList.size(Unknown Source)
    at com.chenssy.test.arrayList.SubListTest.main(
public int size() {
            checkForComodification();
            return this.size;
}

private void checkForComodification() {
            if (ArrayList.this.modCount != this.modCount)
                throw new ConcurrentModificationException();
}
This method shows that a ConcurrentModificationException is thrown whenever the original list's modCount differs from this.modCount. The sub-list "inherits" the original list's modCount when it is created, and that value is only kept in sync when the list is modified through the sub-list (the change is applied to the original list first and then propagated to the sub-list). In this example we modified the original list directly, so its new modCount is never reflected in the sub-list's modCount, which is why the exception is thrown.


// Removing elements 100..200 with an index loop:
for (int i = 0; i < list1.size(); i++) {
    if (i >= 100 && i <= 200) {
        list1.remove(i);
        /*
         * This code is flawed: after remove(), the following elements shift
         * forward, so i would need a small adjustment; that is not the point
         * under discussion here.
         */
    }
}

// The same removal as a one-liner through a subList view:
list1.subList(100, 200).clear();

    protected void removeRange(int fromIndex, int toIndex) {
        int numMoved = size - toIndex;
        System.arraycopy(elementData, toIndex, elementData, fromIndex,
                         numMoved);

        // clear to let GC do its work
        int newSize = size - (toIndex-fromIndex);
        for (int i = newSize; i < size; i++) {
            elementData[i] = null;
        }
        size = newSize;
    }


    public List<E> subList(int fromIndex, int toIndex) {
        subListRangeCheck(fromIndex, toIndex, size);
        return new SubList(this, 0, fromIndex, toIndex);
    }


        protected void removeRange(int fromIndex, int toIndex) {
            parent.removeRange(parentOffset + fromIndex,
                               parentOffset + toIndex);
            this.modCount = parent.modCount;
            this.size -= toIndex - fromIndex;
        }



public static void main(String[] args) {
        int[] ints = {1, 2, 3, 4, 5};
        List list = Arrays.asList(ints);
        System.out.println("list'size:" + list.size()); // prints 1: the whole int[] becomes a single element
}


public static void main(String[] args) {
        Integer[] ints = {1, 2, 3, 4, 5};
        List list = Arrays.asList(ints);
        list.add(6); // throws UnsupportedOperationException
}
This example simply converts ints into a List via asList and then adds one element with add. It could hardly be simpler, yet the result is not what we expect:
Exception in thread "main" java.lang.UnsupportedOperationException
    at java.util.AbstractList.add(Unknown Source)
    at java.util.AbstractList.add(Unknown Source)
    at com.chenssy.test.arrayList.AsListTest.main(
public static <T> List<T> asList(T... a) {
        return new ArrayList<>(a);
asList takes its parameters and directly news up an ArrayList; so far nothing looks wrong. But keep reading:
private static class ArrayList<E> extends AbstractList<E>
    implements RandomAccess, java.io.Serializable {
        private static final long serialVersionUID = -2764017481108945198L;
        private final E[] a;

        ArrayList(E[] array) {
            if (array == null)
                throw new NullPointerException();
            a = array;
        }
public boolean add(E e) {
        add(size(), e);
        return true;
}

    public E set(int index, E element) {
        throw new UnsupportedOperationException();
    }

    public void add(int index, E element) {
        throw new UnsupportedOperationException();
    }

    public E remove(int index) {
        throw new UnsupportedOperationException();
    }
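Since the list returned by Arrays.asList is a fixed-size view backed by the array, a common workaround (a sketch, not from the original text) is to copy it into a real java.util.ArrayList, which does support add:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AsListWorkaround {
    public static void main(String[] args) {
        Integer[] ints = {1, 2, 3, 4, 5};
        // Copy the fixed-size view into a real, growable ArrayList
        List<Integer> list = new ArrayList<>(Arrays.asList(ints));
        list.add(6); // supported now
        System.out.println(list.size()); // prints 6
    }
}
```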

A method may occasionally need to use a parameter for its return value - what might be loosely called an "output parameter" or a "result parameter". The caller creates an output parameter object, and then passes it to a method which changes the state of the object (its data). When the method returns, the caller then examines this new state.
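A minimal sketch of this idiom (the Result class and fill method are hypothetical names):

```java
public class OutputParamDemo {
    // Caller-visible holder whose state the method mutates
    static class Result {
        int count;
        String message;
    }

    // The method writes its results into the caller-supplied object
    static void fill(Result out) {
        out.count = 3;
        out.message = "done";
    }

    public static void main(String[] args) {
        Result result = new Result(); // caller creates the output parameter
        fill(result);                 // method changes its state
        System.out.println(result.count + " " + result.message); // examine new state
    }
}
```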
if (annotation.annotationType().equals(javax.validation.Valid.class)){}

I would use a ByteArrayOutputStream. And on finish you can call:
new String( baos.toByteArray(), codepage );
or better
baos.toString( codepage );
For the String constructor the codepage can be a String or an instance of java.nio.charset.Charset. A possible value is java.nio.charset.StandardCharsets.UTF_8.
The toString method accepts only a String as the codepage parameter (as of Java 8).
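A short sketch of the approach (the string content is illustrative): bytes are collected in a ByteArrayOutputStream and decoded with an explicit charset at the end, via either the String constructor or toString.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class BaosDemo {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        baos.write("héllo".getBytes(StandardCharsets.UTF_8));

        // The String constructor accepts a Charset instance...
        String viaCtor = new String(baos.toByteArray(), StandardCharsets.UTF_8);
        // ...while toString (up to Java 8) takes the charset name as a String
        String viaToString = baos.toString("UTF-8");

        System.out.println(viaCtor);     // héllo
        System.out.println(viaToString); // héllo
    }
}
```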
JDK 7 introduced a feature that allows numeric literals to be written with underscore characters. Breaking up a numeric literal with underscores enhances readability.
         int inum = 1_00_00_000;
         System.out.println("inum:" + inum);
         long lnum = 1_00_00_000;
         System.out.println("lnum:" + lnum);
         float fnum = 2.10_001F;
         System.out.println("fnum:" + fnum);
         double dnum = 2.10_12_001;
         System.out.println("dnum:" + dnum);
Does JVM create an object of class Main?
The answer is "No". We have seen that main() is static in Java precisely so that it can be called without any instance. To confirm this, observe that the following program compiles and runs fine.
// Note: Main is abstract, so no object of it can ever be created
abstract class Main {
    public static void main(String args[]) {
        System.out.println("Hello");
    }
}

Getting the inputstream from a classpath resource
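A minimal sketch, assuming a resource name of config.properties (hypothetical): Class#getResourceAsStream resolves relative to the class's package, while ClassLoader#getResourceAsStream takes a path from the classpath root; both return null when the resource is missing.

```java
import java.io.InputStream;

public class ResourceDemo {
    public static void main(String[] args) {
        // Relative to this class's package:
        InputStream in1 = ResourceDemo.class.getResourceAsStream("config.properties");
        // From the classpath root (no leading slash with the class loader):
        InputStream in2 = ResourceDemo.class.getClassLoader()
                .getResourceAsStream("com/example/config.properties");
        // Neither resource exists here, so both are null:
        System.out.println(in1 == null);
        System.out.println(in2 == null);
    }
}
```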

Find element position in a Java TreeMap
An alternative solution would be to use TreeMap's headMap method. If the word exists in the TreeMap, then the size() of its head map is equal to the index of the word in the dictionary. 
            if (tm.containsKey(s)) {
                // Here is the operation you are looking for.
                // It does not work for items not in the dictionary.
                int pos = tm.headMap(s).size();
                System.out.println("Key '"+s+"' is at the position "+pos);
            }
Sort map by value
First create a Comparator; this snippet sorts the map entries in ascending order of value by passing the comparator to Collections.sort.
    Comparator<Map.Entry<Integer, String>> byMapValues = new Comparator<Map.Entry<Integer, String>>() {
        public int compare(Map.Entry<Integer, String> left, Map.Entry<Integer, String> right) {
            return left.getValue().compareTo(right.getValue());
        }
    };
    // create a list of map entries
    List<Map.Entry<Integer, String>> candyBars = new ArrayList<Map.Entry<Integer, String>>();
    // add all candy bars
    // sort the collection
    Collections.sort(candyBars, byMapValues);

    Comparator<Entry<Integer, String>> byValue = (entry1, entry2) -> entry1.getValue().compareTo(
            entry2.getValue());
    Optional<Entry<Integer, String>> val = CANDY_BARS.entrySet().stream().max(byValue);

Ordering<Map.Entry<Integer, String>> byMapValues = new Ordering<Map.Entry<Integer, String>>() {
   public int compare(Map.Entry<Integer, String> left, Map.Entry<Integer, String> right) {
        return left.getValue().compareTo(right.getValue());
   }
};
List<Map.Entry<Integer, String>> candyBars = Lists.newArrayList(CANDY_BARS.entrySet());
Collections.sort(candyBars, byMapValues);
        Set<Entry<String, Integer>> set = map.entrySet();
        List<Entry<String, Integer>> list = new ArrayList<Entry<String, Integer>>(set);
        Collections.sort( list, new Comparator<Map.Entry<String, Integer>>()
        {
            public int compare( Map.Entry<String, Integer> o1, Map.Entry<String, Integer> o2 )
            {
                return (o2.getValue()).compareTo( o1.getValue() ); // descending by value
            }
        } );


List<String> list = Arrays.asList(arr);
ArrayList<String> arrayList = new ArrayList<String>(Arrays.asList(arr));


Set<String> set = new HashSet<String>(Arrays.asList(arr));
return set.contains(targetValue);
To check whether an array contains a given value, prefer a simple for loop over the array, or the contains method of the ArrayUtils class provided by the Apache Commons library.


// Index-based loop:
ArrayList<String> list = new ArrayList<String>(Arrays.asList("a","b","c","d"));
for (int i = 0; i < list.size(); i++) {
    // ...
}
// For-each loop:
for (String s : list) {
    // ...
}
Once an iterator is created, it records the state of the original collection (in the JDK's ArrayList, the modCount at creation time). That record is not updated when the underlying collection is structurally modified behind the iterator's back, so when the iterator next advances it detects the mismatch and, following the fail-fast principle, immediately throws java.util.ConcurrentModificationException.
ArrayList<String> list = new ArrayList<String>(Arrays.asList("a", "b", "c", "d"));
Iterator<String> iter = list.iterator();
while (iter.hasNext()) {
    String s = iter.next();

    if (s.equals("a")) {
        iter.remove(); // safe; calling list.remove(s) here would throw ConcurrentModificationException
    }
}
In Java, performing too many stream operations or opening too many sockets without closing them promptly can produce a "too many open files" error, which means the system has run out of file handles.
ulimit -n        # show the current limit on open file handles
ulimit -n 2048   # raise the limit
# locale
Many systems use the C encoding by default; according to the blog post referenced here, C is the system's default Locale, backed by ANSI C - in other words, the default encoding is ANSI C. To force UTF-8 for a JVM:
java -Dfile.encoding=UTF-8 xxxx
  • PRG="$0"
    PRGDIR=`dirname "$PRG"`
    [ -z "$ROOT_PATH" ] && ROOT_PATH=`cd "$PRGDIR/.." >/dev/null; pwd`
    [ -z "$JRE_HOME" ] && JRE_HOME=`cd "$ROOT_PATH/lib/jre" >/dev/null; pwd`
When synchronizing a code block inside a static method, note one thing: whereas in an instance method we can use this or a new Object() as the lock, in a static context the monitor must be the Class object (the .class literal).
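A small sketch of the point (class and field names are illustrative): the synchronized block in the static method uses the Class object as its monitor, so concurrent increments are not lost.

```java
public class StaticSyncDemo {
    private static int counter = 0;

    static void increment() {
        synchronized (StaticSyncDemo.class) { // the .class object is the monitor
            counter++;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> { for (int i = 0; i < 1000; i++) increment(); };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(counter); // 2000: no lost updates
    }
}
```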

Kind | Syntax | Example
Reference to a static method | Class::staticMethodName | String::valueOf
Reference to an instance method of a specific object | object::instanceMethodName | x::toString
Reference to an instance method of an arbitrary object supplied later | Class::instanceMethodName | String::toString
Reference to a constructor | ClassName::new | String::new
or as lambdas
Kind | Syntax | As lambda
Reference to a static method | Class::staticMethodName | (s) -> String.valueOf(s)
Reference to an instance method of a specific object | object::instanceMethodName | () -> "hello".toString()
Reference to an instance method of an arbitrary object supplied later | Class::instanceMethodName | (s) -> s.toString()
Reference to a constructor | ClassName::new | () -> new String()
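The table above can be tied to running code with a short sketch (target types and sample values are illustrative): each of the four kinds of method reference, next to its equivalent lambda.

```java
import java.util.function.Function;
import java.util.function.Supplier;

public class MethodRefDemo {
    public static void main(String[] args) {
        Function<Integer, String> staticRef = String::valueOf;   // (i) -> String.valueOf(i)
        Supplier<String> boundRef = "hello"::toString;           // () -> "hello".toString()
        Function<String, Integer> unboundRef = String::length;   // (s) -> s.length()
        Supplier<String> ctorRef = String::new;                  // () -> new String()

        System.out.println(staticRef.apply(42));     // 42
        System.out.println(boundRef.get());          // hello
        System.out.println(unboundRef.apply("abc")); // 3
        System.out.println(ctorRef.get().isEmpty()); // true
    }
}
```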

