Massive Technical Interviews Tips: Java Misc

Thursday, December 10, 2015

Java Misc

http://zhangyi.xyz/usage-of-java-nested-class

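Nested classes play an extremely important role here. They enrich the class hierarchy and give flexible control over the access restrictions and granularity of a class's internal structure, letting us strike a reasonable balance between openness and closedness, and between public interface and internal implementation.

Nested classes can play this balancing role because of their special relationship with the enclosing (main) class. All members of the main class are fully open to the nested class, so the nested class can access the main class's private members, while the nested class itself can be seen as a self-contained, highly cohesive class inside the class boundary that the main class calls. Defining a nested class is therefore a further encapsulation of the class's internal members. Although defining a nested class inside a class does not reduce the size of the class definition, nested classes express different levels of encapsulation, so a relatively large main class gains a clearer structure and does not look chaotic because it has too many members.

When a class's business logic is very complex, yet the responsibilities it carries are not enough to be split out into separate types, nested classes, especially static nested classes, become very useful.

Encapsulating a Builder

The Builder pattern is often used to assemble a class; it simplifies the logic of building up the constituent elements through a more fluent interface. In Java, unless there is a reason not to, we generally define a class's Builder as a nested class. This has almost become an established idiom.

For example, the airlift framework defines a Request class, which wraps a client request object. Building a Request requires elements such as the URI, headers, HTTP verb, and body, and different assembly methods are called depending on the client request. This is a typical scenario for the Builder pattern.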
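A minimal sketch of the idiom follows. SimpleRequest and its fields are hypothetical names chosen for illustration; this is not airlift's actual Request class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical request type; not airlift's actual Request class.
final class SimpleRequest {
    private final String method;
    private final String uri;
    private final Map<String, String> headers;

    // Private constructor: instances can only be created through the Builder.
    private SimpleRequest(Builder builder) {
        this.method = builder.method;
        this.uri = builder.uri;
        this.headers = new LinkedHashMap<>(builder.headers);
    }

    String method() { return method; }
    String uri() { return uri; }
    Map<String, String> headers() { return headers; }

    static Builder builder() { return new Builder(); }

    // Static nested class: needs no outer instance, yet may call the private constructor.
    static final class Builder {
        private String method = "GET";
        private String uri;
        private final Map<String, String> headers = new LinkedHashMap<>();

        Builder method(String method) { this.method = method; return this; }
        Builder uri(String uri) { this.uri = uri; return this; }
        Builder header(String name, String value) { headers.put(name, value); return this; }

        SimpleRequest build() {
            if (uri == null) {
                throw new IllegalStateException("uri is required");
            }
            return new SimpleRequest(this);
        }
    }
}
```

Usage would read as SimpleRequest.builder().method("POST").uri(...).build(); the private constructor keeps the Builder as the only way in.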

https://github.com/prestodb/presto/blob/master/presto-jdbc/src/main/java/com/facebook/presto/jdbc/PrestoResultSet.java

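Encapsulating an Iterator

Whenever we need to provide a custom iterator in a class and do not want to expose its implementation, we can implement an internal iterator as a nested class. This is an idiomatic use of the Iterator pattern.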
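As a rough illustration, the sketch below hides the iterator as a private inner class. IntRange is a hypothetical container invented for this example.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical container, used only to illustrate the idiom.
final class IntRange implements Iterable<Integer> {
    private final int start;
    private final int endExclusive;

    IntRange(int start, int endExclusive) {
        this.start = start;
        this.endExclusive = endExclusive;
    }

    @Override
    public Iterator<Integer> iterator() {
        return new RangeIterator();
    }

    // Private inner class: callers only ever see the Iterator interface.
    private final class RangeIterator implements Iterator<Integer> {
        private int next = start; // reads the enclosing instance's private field

        @Override
        public boolean hasNext() {
            return next < endExclusive;
        }

        @Override
        public Integer next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return next++;
        }
    }
}
```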

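Of course, there is one point we must keep in mind: whether a responsibility belongs in a standalone class or in a subordinate nested class is decided not by the amount of code or the size of the class, but by whether the logic to be encapsulated is an internal concept or an external one.

Take the HiveWriterFactory class in the Presto framework. It is a factory class whose product is HiveWriter. During creation, it needs to combine column names, column types, and the columns' Hive types, and assign the combined result to a schema of type Properties.

These values do not need to be public, but leaving them unencapsulated causes problems:

No domain concept is expressed; there are only three kinds of loose variables
They cannot be put into a collection as a whole, so the collection API cannot be used to transform them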

http://programming.guide/java/emptylist-vs-new-collection.html

The main difference between new ArrayList<>() and Collections.emptyList() (or the slightly shorter variant List.of() introduced in JDK 9) is that the latter returns an immutable list, i.e., a list to which you cannot add elements.

In addition, Collections.emptyList()/List.of() avoids creating a new object. From the javadoc:

Implementations of this method need not create a separate List object for each call. Using this method is likely to have comparable cost to using the like-named field. (Unlike this method, the field does not provide type safety.)

The standard implementation of emptyList looks as follows:

public static final <T> List<T> emptyList() {
    return (List<T>) EMPTY_LIST;
}

So if this is a hotspot in your code, there’s even a small performance argument to be made.
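The immutability difference is easy to observe. The following sketch (the class and method names are made up for the example) probes whether a list accepts add():

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class EmptyListDemo {
    // Returns true if the list rejects add() with UnsupportedOperationException.
    static boolean isUnmodifiable(List<String> list) {
        try {
            list.add("x");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(isUnmodifiable(new ArrayList<>()));       // prints false
        System.out.println(isUnmodifiable(Collections.emptyList())); // prints true
    }
}
```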

https://stackoverflow.com/questions/1669282/find-max-value-in-java-with-a-predefined-comparator

If Foo implements Comparable<Foo>, then Collections.max(Collection) is what you're looking for.

If not, you can create a Comparator<Foo> and use Collections.max(Collection, Comparator) instead.
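Both overloads can be sketched as follows; MaxDemo and its method names are illustrative only.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class MaxDemo {
    // Natural ordering: Integer implements Comparable<Integer>.
    static int largest(List<Integer> numbers) {
        return Collections.max(numbers);
    }

    // Explicit ordering: compare strings by length instead of alphabetically.
    static String longest(List<String> words) {
        return Collections.max(words, Comparator.comparingInt(String::length));
    }
}
```

Note that both overloads throw NoSuchElementException on an empty collection.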

public static <E> Queue<E> checkedQueue(Queue<E> queue, Class<E> type) {
    return new CheckedQueue<>(queue, type);
}

java.util.Collections.CheckedCollection.typeCheck(Object)

E typeCheck(Object o) {
    if (o != null && !type.isInstance(o))
        throw new ClassCastException(badElementMsg(o));
    return (E) o;
}
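To see the type check in action, the sketch below pushes an Integer into a checked Queue<String> through a raw reference. The class name is made up; the raw-typed variable stands in for legacy code that bypasses generics.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Queue;

final class CheckedQueueDemo {
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean rejectsWrongType() {
        Queue<String> checked = Collections.checkedQueue(new ArrayDeque<>(), String.class);
        Queue raw = checked; // simulate legacy code that lost the generic type
        try {
            raw.add(42);     // typeCheck fails here, at insertion time
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```

Without the checked wrapper, the bad element would be accepted and the ClassCastException would surface later, wherever the element is read as a String.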

http://blog.joda.org/2014/02/turning-off-doclint-in-jdk-8-javadoc.html

With JDK 8, you are unable to get Javadoc unless your tool meets the standards of doclint. Some of its rules are:

no self-closed HTML tags, such as <br /> or <a id="x" />
no unclosed HTML tags, such as <ul> without matching </ul>
no invalid HTML end tags, such as </br>
no invalid HTML attributes, based on doclint's interpretation of W3C HTML 4.01
no duplicate HTML id attribute
no empty HTML href attribute
no incorrectly nested headers, such as class documentation must have <h3>, not <h4>
no invalid HTML tags, such as List<String> (where you forgot to escape using &lt;)
no broken @link references
no broken @param references, they must match the actual parameter name
no broken @throws references, the first word must be a class name

If you are running from maven, you need to use the additionalparam setting, as per the manual. Either add it as a global property:

  <properties>
    <additionalparam>-Xdoclint:none</additionalparam>
  </properties>

or add it to the maven-javadoc-plugin:

  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-javadoc-plugin</artifactId>
      <configuration>
        <additionalparam>-Xdoclint:none</additionalparam>
      </configuration>
    </plugin>
  </plugins>

Ant also uses additionalparam to pass in -Xdoclint:none, see the manual.

Gradle does not expose additionalparam but Tim Yates and Cedric Champeau advise of this solution:

  if (JavaVersion.current().isJava8Compatible()) {
    allprojects {
      tasks.withType(Javadoc) {
        options.addStringOption('Xdoclint:none', '-quiet')
      }
    }
  }

https://stackoverflow.com/questions/1417190/should-a-static-final-logger-be-declared-in-upper-case
https://google.github.io/styleguide/javaguide.html#s5.2.4-constant-names

The logger reference is not a constant, but a final reference, and should NOT be in uppercase. A constant VALUE should be in uppercase.

private static final Logger logger = Logger.getLogger(MyClass.class);

private static final double MY_CONSTANT = 0.0;

Constants are static final fields whose contents are deeply immutable and whose methods have no detectable side effects. This includes primitives, Strings, immutable types, and immutable collections of immutable types. If any of the instance's observable state can change, it is not a constant. Merely intending to never mutate the object is not enough. Examples:

static final Logger logger = Logger.getLogger(MyClass.getName());
static final String[] nonEmptyArray = {"these", "can", "change"};

Class and member modifiers, when present, appear in the order recommended by the Java Language Specification:

public protected private abstract default static final transient volatile synchronized native strictfp

https://stackoverflow.com/questions/41323735/is-actively-throwing-assertionerror-in-java-good-practice
https://github.com/google/guava/wiki/ConditionalFailuresExplained

Kind of check	The throwing method is saying...	Commonly indicated with...
Precondition	"You messed up (caller)."	`IllegalArgumentException`, `IllegalStateException`
Assertion	"I messed up."	`assert`, `AssertionError`
Verification	"Someone I depend on messed up."	`VerifyException`
Test assertion	"The code I'm testing messed up."	`assertThat`, `assertEquals`, `AssertionError`
Impossible condition	"What the? the world is messed up!"	`AssertionError`
Exceptional result	"No one messed up, exactly (at least in this VM)."	other checked or unchecked exceptions

The page says an AssertionError is the recommended way to handle these cases. The comments in their Verify class also offer some useful insights about choosing exceptions. In cases where AssertionError seems too strong, raising a VerifyException can be a good compromise.
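A small JDK-only sketch of two rows of the table (precondition and impossible condition) might look like this. The Checks class and Light enum are hypothetical.

```java
final class Checks {
    enum Light { RED, GREEN }

    // Precondition: the caller passed a bad argument.
    static double sqrt(double x) {
        if (x < 0) {
            throw new IllegalArgumentException("x must be >= 0: " + x);
        }
        return Math.sqrt(x);
    }

    // Impossible condition: every enum constant is handled, so reaching default is a bug.
    static String action(Light light) {
        switch (light) {
            case RED:
                return "stop";
            case GREEN:
                return "go";
            default:
                throw new AssertionError("unknown light: " + light);
        }
    }
}
```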

https://stackoverflow.com/questions/861296/why-does-the-java-util-setv-interface-not-provide-a-getobject-o-method
HashSet doesn't provide get method

http://www.programmr.com/blogs/two-things-every-java-developer-should-know-about-booleans
Boolean b = Boolean.FALSE;

Don't Expect a primitive Boolean value to occupy only 1 bit

It's logical to think that a boolean primitive value will take 1 bit of memory, and unfortunately it's a fallacy that many Java developers have bought into. Boolean values in Java almost always take more than a bit - significantly more.

There are no Java virtual machine instructions solely dedicated to operations on boolean values. Instead, expressions in the Java programming language that operate on boolean values are compiled to use values of the Java virtual machine int data type. -- The Java Virtual Machine Specification

What this means is that boolean values always take more than one bit, but how much more depends on where the value is stored: on the stack or on the heap.

The JVM uses a 32-bit stack cell, which will cause each boolean value to occupy an entire stack cell of 32 bits. However, the size of boolean values on the heap is implementation dependent. The semantics of storing objects and data on the heap fall under the 'private implementation' rules of the JVM. This means the JVM implementation can choose the size of boolean values on the heap.

Boolean arrays, however, are encoded as arrays of bytes, giving 8 bits to every boolean element of the array:

In Oracle’s Java Virtual Machine implementation, boolean arrays in the Java programming language are encoded as Java Virtual Machine byte arrays, using 8 bits per boolean element.

And finally, according to this answer on SO, a Boolean object takes 16 bytes of memory:

header:   8 bytes 
value:    1 byte 
padding:  7 bytes
------------------
sum:      16 bytes

https://stackoverflow.com/questions/20948361/why-does-the-boolean-data-type-need-8-bits/20948403#20948403

Although the Java Virtual Machine defines a boolean type, it only provides very limited support for it. There are no Java Virtual Machine instructions solely dedicated to operations on boolean values. Instead, expressions in the Java programming language that operate on boolean values are compiled to use values of the Java Virtual Machine int data type.

In Oracle’s Java Virtual Machine implementation, boolean arrays in the Java programming language are encoded as Java Virtual Machine byte arrays, using 8 bits per boolean element.

For example Boolean type looks in memory like this

header:   8 bytes 
value:    1 byte 
padding:  7 bytes
------------------
sum:      16 bytes

As an alternative to boolean[] you can use for example java.util.BitSet.
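For instance, a BitSet packs flags into long words instead of spending a byte per element. A brief sketch (BitSetDemo is an illustrative name):

```java
import java.util.BitSet;

final class BitSetDemo {
    // Marks every even index below n, using roughly one bit per flag.
    static BitSet evens(int n) {
        BitSet bits = new BitSet(n);
        for (int i = 0; i < n; i += 2) {
            bits.set(i);
        }
        return bits;
    }
}
```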

Why is it hard to store booleans as 1 bit? Read Vlad from Moscow's answer: you can't address a single bit of memory.

It depends on the addressability of memory. The smallest addressable unit is a byte. You can take the address of a byte and do address arithmetic with it. Moreover, there are built-in machine instructions that operate on bytes. However, it is impossible to take the address of a bit and perform address arithmetic on it. In any case, you first have to calculate the address of the byte that contains the target bit and then apply additional machine instructions, which depend on the position of the bit within the byte, to set or reset that bit.

https://stackoverflow.com/questions/383551/what-is-the-size-of-a-boolean-variable-in-java

That suggests that booleans can basically be packed into a byte each by Sun's JVM.

The actual information represented by a boolean value in Java is one bit: 1 for true, 0 for false. However, the actual size of a boolean variable in memory is not precisely defined by the Java specification. See Primitive Data Types in Java.

The boolean data type has only two possible values: true and false. Use this data type for simple flags that track true/false conditions. This data type represents one bit of information, but its "size" isn't something that's precisely defined.

http://www.oracle.com/technetwork/articles/java/java8-optional-2175753.html

Do Something If a Value Is Present

Now that you have an Optional object, you can access the methods available to explicitly deal with the presence or absence of values. Instead of having to remember to do a null check, as follows:

SoundCard soundcard = ...;
if(soundcard != null){
  System.out.println(soundcard);
}

You can use the ifPresent() method, as follows:

Optional<Soundcard> soundcard = ...;
soundcard.ifPresent(System.out::println);

Default Values and Actions

A typical pattern is to return a default value if you determine that the result of an operation is null. In general, you can use the ternary operator, as follows, to achieve this:

Soundcard soundcard = 
  maybeSoundcard != null ? maybeSoundcard 
            : new Soundcard("basic_sound_card");

Using an Optional object, you can rewrite this code by using the orElse() method, which provides a default value if Optional is empty:

Soundcard soundcard = maybeSoundcard.orElse(new Soundcard("default"));

Similarly, you can use the orElseThrow() method, which instead of providing a default value if Optional is empty, throws an exception:

Soundcard soundcard = 
  maybeSoundcard.orElseThrow(IllegalStateException::new);

Rejecting Certain Values Using the `filter` Method

Often you need to call a method on an object and check some property. For example, you might need to check whether the USB port is a particular version. To do this in a safe way, you first need to check whether the reference pointing to a USB object is null and then call the getVersion() method, as follows:

USB usb = ...;
if(usb != null && "3.0".equals(usb.getVersion())){
  System.out.println("ok");
}

This pattern can be rewritten using the filter method on an Optional object, as follows:

Optional<USB> maybeUSB = ...;
maybeUSB.filter(usb -> "3.0".equals(usb.getVersion()))
                    .ifPresent(usb -> System.out.println("ok"));

The filter method takes a predicate as an argument. If a value is present in the Optional object and it matches the predicate, the filter method returns that value; otherwise, it returns an empty Optional object. You might have seen a similar pattern already if you have used the filter method with the Stream interface.

Extracting and Transforming Values Using the `map` Method

Another common pattern is to extract information from an object. For example, from a Soundcard object, you might want to extract the USB object and then further check whether it is of the correct version. You would typically write the following code:

if(soundcard != null){
  USB usb = soundcard.getUSB();
  if(usb != null && "3.0".equals(usb.getVersion())){
    System.out.println("ok");
  }
}

We can rewrite this pattern of "checking for null and extracting" (here, the Soundcard object) using the map method.

Optional<USB> usb = maybeSoundcard.map(Soundcard::getUSB);

There's a direct parallel to the map method used with streams. There, you pass a function to the map method, which applies this function to each element of a stream. However, nothing happens if the stream is empty.

The map method of the Optional class does exactly the same: the value contained inside Optional is "transformed" by the function passed as an argument (here, a method reference to extract the USB port), while nothing happens if Optional is empty.

Finally, we can combine the map method with the filter method to reject a USB port whose version is different than 3.0:

maybeSoundcard.map(Soundcard::getUSB)
      .filter(usb -> "3.0".equals(usb.getVersion()))
      .ifPresent(usb -> System.out.println("ok"));
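For a version that compiles on its own, here is a sketch with hypothetical stand-ins for the article's USB and Soundcard types. It returns a string instead of printing so the result is easy to check.

```java
import java.util.Optional;

final class OptionalDemo {
    // Hypothetical stand-ins for the article's USB and Soundcard types.
    static final class USB {
        private final String version;
        USB(String version) { this.version = version; }
        String getVersion() { return version; }
    }

    static final class Soundcard {
        private final USB usb; // may be null
        Soundcard(USB usb) { this.usb = usb; }
        USB getUSB() { return usb; }
    }

    static String check(Optional<Soundcard> maybeSoundcard) {
        return maybeSoundcard
                .map(Soundcard::getUSB) // a null result becomes an empty Optional
                .filter(usb -> "3.0".equals(usb.getVersion()))
                .map(usb -> "ok")
                .orElse("not ok");
    }
}
```

Note that ifPresent takes a Consumer, so its lambda receives the value (usb -> ...) rather than taking no arguments.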

https://stackoverflow.com/questions/35623981/invalid-argument-to-operation/45596375#45596375

Why map.get(key)++ does not work but Integer i = 1; i++ does:

https://bukkit.org/threads/invalid-argument-to-operation.336350/

increment and decrement operators only work on variables, not values.

Just like you can't do: 1++;
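A common workaround for counters stored in a map is Map.merge, which reads, combines, and writes back in one call. A short sketch (CounterDemo is an illustrative name):

```java
import java.util.HashMap;
import java.util.Map;

final class CounterDemo {
    static Map<String, Integer> count(String... words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            // counts.get(w)++ would not compile: get() returns a value, not a variable.
            counts.merge(w, 1, Integer::sum);
        }
        return counts;
    }
}
```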

Interesting that the foreach loop for Java's Stack iterates from the bottom of the stack.

http://www.cnblogs.com/javanerd/p/6646307.html

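This is actually a bug in the JDK; http://bugs.java.com/bugdatabase/view_bug.do?bug_id=4475301 describes it in detail. In short, Stack extends java.util.Vector and therefore uses Vector's iterator, so iteration visits the elements in FIFO order. The original text reads: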

It was an incorrect design decision to have Stack extend Vector ("is-a" rather than "has-a"). We sympathize with the submitter but cannot fix this because
of compatibility.

1. Do not use a for loop; traverse the Stack as follows. Note that after one such traversal the stack is empty.

while (!stack.isEmpty()) {
      Object o = stack.pop();
      System.out.println(o);
}

Deque<Integer> stack2 = new ArrayDeque<Integer>();
stack2.push(1);
stack2.push(2);
stack2.push(3);
for (Integer i : stack2) {
    System.out.println(i);
}
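The difference in iteration order can be checked side by side. In this sketch (class and method names are illustrative), both containers receive the same pushes:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Stack;

final class IterationOrderDemo {
    // Stack inherits Vector's iterator, so it walks from the bottom of the stack.
    static List<Integer> stackOrder() {
        Stack<Integer> stack = new Stack<>();
        stack.push(1);
        stack.push(2);
        stack.push(3);
        List<Integer> seen = new ArrayList<>();
        for (Integer i : stack) {
            seen.add(i);
        }
        return seen; // [1, 2, 3]
    }

    // ArrayDeque.push adds at the head, and iteration starts at the head (LIFO).
    static List<Integer> dequeOrder() {
        Deque<Integer> deque = new ArrayDeque<>();
        deque.push(1);
        deque.push(2);
        deque.push(3);
        List<Integer> seen = new ArrayList<>();
        for (Integer i : deque) {
            seen.add(i);
        }
        return seen; // [3, 2, 1]
    }
}
```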

What is a reasonable order of Java modifiers (abstract, final, public, static, etc.)?
http://cr.openjdk.java.net/~alundblad/styleguide/index-v6.html#toc-modifiers

Modifiers should go in the following order

Access modifier (public / private / protected)
abstract
static
final
transient
volatile
default
synchronized
native
strictfp

Modifiers should not be written out when they are implicit. For example, interface methods should neither be declared public nor abstract, and nested enums and interfaces should not be declared static.

Method parameters and local variables should not be declared final unless it improves readability or documents an actual design decision.

Fields should be declared final unless there is a compelling reason to make them mutable.

Writing out modifiers where they are implicit clutters the code and learning which modifiers are implicit where is easy.
Although method parameters should typically not be mutated, consistently marking all parameters in every method as final is an exaggeration.
Making fields immutable where possible is good programming practice. Refer to Effective Java, Item 15: Minimize Mutability for details.

https://stackoverflow.com/questions/16731240/what-is-a-reasonable-order-of-java-modifiers-abstract-final-public-static-e
https://news.ycombinator.com/item?id=3906574
No, that’s not endianness; endianness refers to the ordering of bytes within a multi-byte value—least significant byte first or most significant byte first, generally. The order of octets in a UTF-8 code point is fixed, and because a bit is not an addressable unit of memory, the storage order of bits within an octet is immaterial.

It's endianness independent in the sense that the order in which you interpret the bytes in each character does not depend on the processor architecture, unlike UTF-16.

If your processor interprets the bits in each byte in a different order, that might be a problem, but it's not what we're talking about when we usually talk about the endianness of character encodings.

http://en.wikipedia.org/wiki/Endianness

Yes, basically. UTF-8 doesn't encode a code point as a single integer; it encodes it as a sequence of bytes, with a particular order, where some of the bits are used to represent the code point, and some of them are just used to represent whether you are looking at an initial byte or a continuation byte.

I'd recommend checking out the description of UTF-8 in Wikipedia. The tables make it fairly clear how the encoding works: http://en.wikipedia.org/wiki/UTF-8#Description

https://en.wikipedia.org/wiki/UTF-8#Description

Since the restriction of the Unicode code-space to 21-bit values in 2003, UTF-8 is defined to encode code points in one to four bytes, depending on the number of significant bits in the numerical value of the code point. The following table shows the structure of the encoding. The x characters are replaced by the bits of the code point. If the number of significant bits is no more than 7, the first line applies; if no more than 11 bits, the second line applies, and so on.

Number of bytes	Bits for code point	First code point	Last code point	Byte 1	Byte 2	Byte 3	Byte 4
1	7	U+0000	U+007F	`0xxxxxxx`
2	11	U+0080	U+07FF	`110xxxxx`	`10xxxxxx`
3	16	U+0800	U+FFFF	`1110xxxx`	`10xxxxxx`	`10xxxxxx`
4	21	U+10000	U+10FFFF	`11110xxx`	`10xxxxxx`	`10xxxxxx`	`10xxxxxx`

The first 128 characters (US-ASCII) need one byte. The next 1,920 characters need two bytes to encode, which covers the remainder of almost all Latin-script alphabets, and also Greek, Cyrillic, Coptic, Armenian, Hebrew, Arabic, Syriac, Thaana and N'Ko alphabets, as well as Combining Diacritical Marks. Three bytes are needed for characters in the rest of the Basic Multilingual Plane, which contains virtually all characters in common use, including most Chinese, Japanese and Korean characters. Four bytes are needed for characters in the other planes of Unicode, which include less common CJK characters, various historic scripts, mathematical symbols, and emoji (pictographic symbols).

Backward compatibility: Backwards compatibility with ASCII and the enormous amount of software designed to process ASCII-encoded text was the main driving force behind the design of UTF-8. In UTF-8, single bytes with values in the range of 0 to 127 map directly to Unicode code points in the ASCII range. Single bytes in this range represent characters, as they do in ASCII. Moreover, 7-bit bytes (bytes where the most significant bit is 0) never appear in a multi-byte sequence, and no valid multi-byte sequence decodes to an ASCII code-point.

Prefix code: The first byte indicates the number of bytes in the sequence. Reading from a stream can instantaneously decode each individual fully received sequence, without first having to wait for either the first byte of a next sequence or an end-of-stream indication. The length of multi-byte sequences is easily determined by humans as it is simply the number of high-order 1s in the leading byte. An incorrect character will not be decoded if a stream ends mid-sequence.

Self-synchronization: The leading bytes and the continuation bytes do not share values (continuation bytes start with 10 while single bytes start with 0 and longer lead bytes start with 11). This means a search will not accidentally find the sequence for one character starting in the middle of another character. It also means the start of a character can be found from a random position by backing up at most 3 bytes to find the leading byte. An incorrect character will not be decoded if a stream starts mid-sequence, and a shorter sequence will never appear inside a longer one.
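The byte counts in the table can be confirmed from Java with a few lines. This sketch (Utf8LengthDemo is an illustrative name) encodes a single code point and measures the result:

```java
import java.nio.charset.StandardCharsets;

final class Utf8LengthDemo {
    // Number of bytes UTF-8 needs for one Unicode code point.
    static int utf8Length(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length(0x41));    // U+0041 'A': 1 byte
        System.out.println(utf8Length(0xE9));    // U+00E9: 2 bytes
        System.out.println(utf8Length(0x20AC));  // U+20AC euro sign: 3 bytes
        System.out.println(utf8Length(0x1F600)); // U+1F600 emoji: 4 bytes
    }
}
```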

https://stackoverflow.com/questions/4655250/difference-between-utf-8-and-utf-16

They're simply different schemes for representing Unicode characters.

Both are variable-length - UTF-16 uses 2 bytes for all characters in the basic multilingual plane (BMP) which contains most characters in common use.

UTF-8 uses between 1 and 3 bytes for characters in the BMP, up to 4 for characters in the current Unicode range of U+0000 to U+10FFFF, and is extensible up to U+7FFFFFFF if that ever becomes necessary... but notably all ASCII characters are represented in a single byte each.

For the purposes of a message digest it won't matter which of these you pick, so long as everyone who tries to recreate the digest uses the same option.

(Note that all Java characters are UTF-16 code points within the BMP; to represent characters above U+FFFF you need to use surrogate pairs in Java.)
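The surrogate-pair point shows up directly in the String API: length() counts 16-bit char values, while codePointCount counts characters. A sketch, using U+1F600 as an arbitrary code point above U+FFFF:

```java
final class SurrogateDemo {
    // U+1F600 lies outside the BMP, so Java stores it as two char values.
    static final String GRINNING_FACE = new String(Character.toChars(0x1F600));

    static int charCount() {
        return GRINNING_FACE.length(); // 2: high and low surrogate
    }

    static int codePointCount() {
        return GRINNING_FACE.codePointCount(0, GRINNING_FACE.length()); // 1
    }
}
```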

https://stackoverflow.com/questions/8923866/convert-utf8-to-utf16-using-iconv

UTF-16LE tells iconv to generate little-endian UTF-16 without a BOM (Byte Order Mark). Apparently it assumes that since you specified LE, the BOM isn't necessary.

UTF-16 tells it to generate UTF-16 text (in the local machine's byte order) with a BOM.

If you're on a little-endian machine, I don't see a way to tell iconv to generate big-endian UTF-16 with a BOM, but I might just be missing something.

https://stackoverflow.com/questions/701624/difference-between-big-endian-and-little-endian-byte-order

Big-Endian (BE) / Little-Endian (LE) are two ways to organize multi-byte words. For example, when using two bytes to represent a character in UTF-16, there are two ways to represent the character 0x1234 as a string of bytes (0x00-0xFF):

Byte Index:      0  1
---------------------
Big-Endian:     12 34
Little-Endian:  34 12

In order to decide if a text uses UTF-16BE or UTF-16LE, the specification recommends to prepend a Byte Order Mark (BOM) to the string, representing the character U+FEFF. So, if the first two bytes of a UTF-16 encoded text file are FE, FF, the encoding is UTF-16BE. For FF, FE, it is UTF-16LE.

A visual example: The word "Example" in different encodings (UTF-16 with BOM):

Byte Index:   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
------------------------------------------------------------
ASCII:       45 78 61 6d 70 6c 65
UTF-16BE:    FE FF 00 45 00 78 00 61 00 6d 00 70 00 6c 00 65
UTF-16LE:    FF FE 45 00 78 00 61 00 6d 00 70 00 6c 00 65 00
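The same byte layouts can be reproduced with the JDK's standard charsets. In Java, encoding with UTF_16 writes a big-endian BOM, while UTF_16BE and UTF_16LE write none. A sketch (BomDemo and hex are illustrative names):

```java
import java.nio.charset.StandardCharsets;

final class BomDemo {
    // Formats bytes as space-separated uppercase hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex("Ex".getBytes(StandardCharsets.UTF_16)));   // FE FF 00 45 00 78
        System.out.println(hex("Ex".getBytes(StandardCharsets.UTF_16BE))); // 00 45 00 78
        System.out.println(hex("Ex".getBytes(StandardCharsets.UTF_16LE))); // 45 00 78 00
    }
}
```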

UTF-16 (16-bit Unicode Transformation Format) is a character encoding capable of encoding all 1,112,064 valid code points of Unicode.
http://iwillgetthatjobatgoogle.tumblr.com/post/32389142173/java-limit-for-recursion

Result for the example (for the current JVM) is about 10 000. It’s the max depth for the current JVM with default settings.

The default stack size on that machine is 320 KB, so in this case each recursive call uses about 32 bytes of stack (320 KB / 10,000 calls) for the program counter, method variables, and so on.

Avoid using recursion (especially with a lot of additional variables), because:

It really increases amount of memory which is being used
Usually isn’t effective
Can force stack overflow
It stores all the arguments in RAM

https://stackoverflow.com/questions/4734108/what-is-the-maximum-depth-of-the-java-call-stack

I tested on my system and didn't find any constant value, sometimes stack overflow occurs after 8900 calls, sometimes only after 7700, random numbers.

public class MainClass {

    private static long depth=0L;

    public static void main(String[] args){
        deep(); 
    }

    private static void deep(){
        System.err.println(++depth);
        deep();
    }

}

In Java it crashed at 8027; in Scala it got up to 8594755 before I got bored.

It depends on the amount of virtual memory allocated to the stack.

http://www.odi.ch/weblog/posting.php?posting=411

You can tune this with the -Xss VM parameter or with the Thread(ThreadGroup, Runnable, String, long) constructor.
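The constructor route can be sketched as below. StackDepthDemo is an illustrative name, and the depth reached is not deterministic; the Javadoc also notes that the stackSize argument is only a hint that some platforms ignore.

```java
final class StackDepthDemo {
    private static int depth;

    private static void recurse() {
        depth++;
        recurse();
    }

    // Runs unbounded recursion on a thread with the requested stack size
    // and returns the depth reached before StackOverflowError.
    static int maxDepth(long stackSizeBytes) throws InterruptedException {
        depth = 0;
        Thread t = new Thread(null, () -> {
            try {
                recurse();
            } catch (StackOverflowError expected) {
                // depth now holds how far the recursion got
            }
        }, "deep-recursion", stackSizeBytes);
        t.start();
        t.join(); // join() makes the worker's writes to depth visible here
        return depth;
    }
}
```

Comparing maxDepth(256 * 1024) with maxDepth(16 * 1024 * 1024) on a given machine gives a feel for how much the setting matters there.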

https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html#parse-java.lang.CharSequence-

This will parse a textual representation of a duration, including the string produced by toString(). The formats accepted are based on the ISO-8601 duration format PnDTnHnMn.nS with days considered to be exactly 24 hours.

The string starts with an optional sign, denoted by the ASCII negative or positive symbol. If negative, the whole period is negated. The ASCII letter "P" is next in upper or lower case. There are then four sections, each consisting of a number and a suffix. The sections have suffixes in ASCII of "D", "H", "M" and "S" for days, hours, minutes and seconds, accepted in upper or lower case. The suffixes must occur in order. The ASCII letter "T" must occur before the first occurrence, if any, of an hour, minute or second section. At least one of the four sections must be present, and if "T" is present there must be at least one section after the "T". The number part of each section must consist of one or more ASCII digits. The number may be prefixed by the ASCII negative or positive symbol. The number of days, hours and minutes must parse to a long. The number of seconds must parse to a long with optional fraction. The decimal point may be either a dot or a comma. The fractional part may have from zero to 9 digits.

The leading plus/minus sign, and negative values for other units are not part of the ISO-8601 standard.

Examples:

    "PT20.345S" -- parses as "20.345 seconds"
    "PT15M"     -- parses as "15 minutes" (where a minute is 60 seconds)
    "PT10H"     -- parses as "10 hours" (where an hour is 3600 seconds)
    "P2D"       -- parses as "2 days" (where a day is 24 hours or 86400 seconds)
    "P2DT3H4M"  -- parses as "2 days, 3 hours and 4 minutes"
    "PT-6H3M"   -- parses as "-6 hours and +3 minutes"
    "-PT6H3M"   -- parses as "-6 hours and -3 minutes"
    "-PT-6H+3M" -- parses as "+6 hours and -3 minutes"

https://docs.oracle.com/javase/tutorial/datetime/iso/period.html

When you write code to specify an amount of time, use the class or method that best meets your needs: the Duration class, Period class, or the ChronoUnit.between method. A Duration measures an amount of time using time-based values (seconds, nanoseconds). A Period uses date-based values (years, months, days).

Note: A Duration of one day is exactly 24 hours long. A Period of one day, when added to a ZonedDateTime, may vary according to the time zone, for example if it occurs on the first or last day of daylight saving time.
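The note can be made concrete with a date that straddles a clock change. The sketch below assumes the America/New_York zone, where clocks moved forward on 2015-03-08; the class and method names are illustrative.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.Period;
import java.time.ZoneId;
import java.time.ZonedDateTime;

final class DstDemo {
    // Noon on the day before US clocks sprang forward (2015-03-08).
    static final ZonedDateTime START =
            ZonedDateTime.of(LocalDateTime.of(2015, 3, 7, 12, 0), ZoneId.of("America/New_York"));

    // Date-based: same wall-clock time on the next calendar day (12:00).
    static ZonedDateTime plusPeriodDay() {
        return START.plus(Period.ofDays(1));
    }

    // Time-based: exactly 24 hours later, which is 13:00 local time after the shift.
    static ZonedDateTime plusDurationDay() {
        return START.plus(Duration.ofDays(1));
    }
}
```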

https://stackoverflow.com/questions/6403851/parsing-time-strings-like-1h-30min

Duration parsing is now included in Java 8. Use standard ISO 8601 format with Duration.parse.

Duration d = Duration.parse("PT1H30M");

You can convert this duration to the total length in milliseconds. Beware that Duration has a resolution of nanoseconds, so you may have data loss going from nanoseconds to milliseconds.

long milliseconds = d.toMillis();
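A few more inputs, including signed ones, are sketched below. In each string the hour and minute sections follow the "T" designator, as the grammar requires. DurationParseDemo is an illustrative name.

```java
import java.time.Duration;

final class DurationParseDemo {
    // Parses an ISO-8601 style duration and returns its length in whole minutes.
    static long minutes(String text) {
        return Duration.parse(text).toMinutes();
    }

    public static void main(String[] args) {
        System.out.println(minutes("PT1H30M"));  // 90
        System.out.println(minutes("P2DT3H4M")); // 3064 = 2*1440 + 180 + 4
        System.out.println(minutes("PT-6H3M"));  // -357: -6 hours plus 3 minutes
        System.out.println(minutes("-PT6H3M"));  // -363: the leading sign negates everything
    }
}
```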

https://stackoverflow.com/questions/4128436/query-string-manipulation-in-java

String queryString = "variableA=89&variableB=100";
Map<String,String> queryParameters = Splitter
    .on("&")
    .withKeyValueSeparator("=")
    .split(queryString);
System.out.println(queryParameters.get("variableA"));

prints out 89.

This I think is a very readable alternative to parsing it yourself.

Edit: As @raulk pointed out, this solution does not account for escaped characters. However, this may not be an issue because before you URL-Decode, the query string is guaranteed to not have any escaped characters that conflict with '=' and '&'. You can use this to your advantage in the following way.
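One way to act on that observation is to split on '&' and '=' first and URL-decode each piece afterwards. The sketch below is an assumption about what such code could look like, written with JDK-only calls rather than Guava's Splitter; QueryStringDemo is a hypothetical name.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class QueryStringDemo {
    // Splits the raw query string first, then decodes each key and value.
    static Map<String, String> parse(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(decode(key), decode(value));
        }
        return params;
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new AssertionError("UTF-8 is always supported", e);
        }
    }
}
```

Repeated keys overwrite earlier values in this sketch; a multimap would be needed to keep them all.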

https://commons.apache.org/proper/commons-collections/apidocs/org/apache/commons/collections4/map/CaseInsensitiveMap.html


  Map<String, String> map = new CaseInsensitiveMap<String, String>();
  map.put("One", "One");
  map.put("Two", "Two");
  map.put(null, "Three");
  map.put("one", "Four");

creates a CaseInsensitiveMap with three entries.
map.get(null) returns "Three" and map.get("ONE") returns "Four". The Set returned by keySet() equals {"one", "two", null}.

This map will violate the detail of various Map and map view contracts. As a general rule, don't compare this map to other maps. In particular, you can't use decorators like ListOrderedMap on it, which silently assume that these contracts are fulfilled.

How to replace case-insensitive literal substrings in Java
https://stackoverflow.com/questions/5054995/how-to-replace-case-insensitive-literal-substrings-in-java

String target = "FOOBar";
target = target.replaceAll("(?i)foo", "");
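Because replaceAll treats its first argument as a regular expression, a search string containing characters such as '.' or '$' needs quoting to be matched literally. A sketch of a helper that does this (ReplaceDemo and replaceIgnoreCase are illustrative names):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ReplaceDemo {
    // Replaces every case-insensitive occurrence of a literal substring.
    static String replaceIgnoreCase(String text, String literal, String replacement) {
        return Pattern.compile(Pattern.quote(literal), Pattern.CASE_INSENSITIVE)
                .matcher(text)
                .replaceAll(Matcher.quoteReplacement(replacement)); // keep '$' and '\' literal
    }
}
```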

Case insensitive string as HashMap key
https://stackoverflow.com/questions/8236945/case-insensitive-string-as-hashmap-key

Map<String, String> nodeMap = 
    new TreeMap<String, String>(String.CASE_INSENSITIVE_ORDER);
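A short sketch of how such a map behaves (CaseInsensitiveKeysDemo is an illustrative name): keys that differ only in case collapse into one entry, and the first spelling inserted is the one retained.

```java
import java.util.Map;
import java.util.TreeMap;

final class CaseInsensitiveKeysDemo {
    static Map<String, String> sample() {
        Map<String, String> map = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        map.put("One", "first");
        map.put("ONE", "second"); // same key under this comparator: value replaced, key "One" kept
        return map;
    }
}
```

Unlike a HashMap, a TreeMap with this comparator rejects null keys with a NullPointerException.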

public class CaseInsensitiveMap extends HashMap<String, String> {

    @Override
    public String put(String key, String value) {
       return super.put(key.toLowerCase(), value);
    }

    // not @Override because that would require the key parameter to be of type Object
    public String get(String key) {
       return super.get(key.toLowerCase());
    }
}

    SortedSet<String> eliminatedDups2 = new TreeSet<String>(IGNORE_CASE);

    Set<String> s1 = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);

https://www.beyondjava.net/blog/how-to-invoke-jsr-303-bean-validation-programmatically/

      ValidatorFactory factory = Validation.buildDefaultValidatorFactory();

      Validator validator = factory.getValidator();

      Set<ConstraintViolation<AdditionBean>> errors = validator.validate(bean);

replaceAll("\\s{2,}", " ").trim();

https://docs.oracle.com/javase/tutorial/java/javaOO/nested.html

Compelling reasons for using nested classes include the following:

It is a way of logically grouping classes that are only used in one place: If a class is useful to only one other class, then it is logical to embed it in that class and keep the two together. Nesting such "helper classes" makes their package more streamlined.
It increases encapsulation: Consider two top-level classes, A and B, where B needs access to members of A that would otherwise be declared private. By hiding class B within class A, A's members can be declared private and B can access them. In addition, B itself can be hidden from the outside world.
It can lead to more readable and maintainable code: Nesting small classes within top-level classes places the code closer to where it is used.

Static nested classes are accessed using the enclosing class name:

OuterClass.StaticNestedClass

For example, to create an object for the static nested class, use this syntax:

OuterClass.StaticNestedClass nestedObject =
     new OuterClass.StaticNestedClass();

Objects that are instances of an inner class exist within an instance of the outer class. Consider the following classes:

class OuterClass {
    ...
    class InnerClass {
        ...
    }
}

An instance of InnerClass can exist only within an instance of OuterClass and has direct access to the methods and fields of its enclosing instance.

To instantiate an inner class, you must first instantiate the outer class. Then, create the inner object within the outer object with this syntax:

OuterClass.InnerClass innerObject = outerObject.new InnerClass();
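
The two instantiation forms can be put side by side in a minimal sketch. The names Outer, Helper, and Counter are made up for illustration and do not come from the tutorial.

```java
// Hypothetical names, for illustration only.
class Outer {
    private int hits = 0;   // private state of the outer instance

    // Static nested class: holds no hidden reference to an Outer instance.
    static class Helper {
        String describe() {
            return "helper";
        }
    }

    // Inner class: every instance is tied to exactly one Outer instance.
    class Counter {
        int increment() {
            return ++hits;  // reads and writes the enclosing instance's private field
        }
    }

    int hits() {
        return hits;
    }
}

class NestedDemo {
    public static void main(String[] args) {
        Outer.Helper helper = new Outer.Helper();     // no Outer object needed
        Outer outer = new Outer();
        Outer.Counter counter = outer.new Counter();  // needs an Outer object
        counter.increment();
        System.out.println(helper.describe() + " " + outer.hits());
    }
}
```

The inner Counter can touch the private field only because it carries a reference to its enclosing Outer; the static Helper has no such reference.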

http://stackoverflow.com/questions/36157340/elidedsemicolonandrightbrace-expected
Syntax error on token ")", ElidedSemicolonAndRightBrace expected
This error simply means that you have too many closing parentheses.

text = text.startsWith(",") ? text.substring(1) : text;

http://stackoverflow.com/questions/36705880/concatenate-string-values-with-delimiter-handling-null-and-empty-strings-in-java

String joined = 
    Stream.of(val1, val2, val3, val4)
          .filter(s -> s != null && !s.isEmpty())
          .collect(joining(","));

http://blog.csdn.net/u014688145/article/details/53148607
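public static void swap(Integer a, Integer b) {
    Integer temp = a;
    a = b;
    b = temp;
}

On entry to swap, a = integerA and b = integerB: the parameters a and b are copies of the references held by the caller's variables integerA and integerB. The method body does swap the two references, but a and b are only additional pointers to the values 2 and 3 already sitting in memory, so the swap has no effect on what integerA and integerB point to. How does this differ from passing a primitive? With a primitive, the actual data in the memory cell is copied, so two identical copies exist, one for the caller's variable and one for the parameter inside the method. With an object reference, only one copy of the data exists, and both integerA and the parameter a point to it.

MyInteger integerA = new MyInteger(2);
MyInteger integerB = new MyInteger(3);
System.out.println("before swap -> " + integerA + ":" + integerB);
swap(integerA, integerB);
System.out.println("after swap -> " + integerA + ":" + integerB);

public static void swap(MyInteger a, MyInteger b) {
    MyInteger temp = new MyInteger(a.getNum()); // copy the current value of a
    a.setNum(b.getNum());                       // a takes b's value
    b.setNum(temp.getNum());                    // b takes a's old value
}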

http://www.geeksforgeeks.org/method-overloading-ambiguity-varargs-java/

    static void fun(int ... a)
    {
        System.out.print("fun(int ...): " +
                "Number of args: " + a.length +
                " Contents: ");

        // using for each loop to display contents of a
        for(int x : a)
            System.out.print(x + " ");

        System.out.println();
    }

    // A method that takes varargs(here booleans).
    static void fun(boolean ... a)
    {
        System.out.print("fun(boolean ...) " +
                "Number of args: " + a.length +
                " Contents: ");

        // using for each loop to display contents of a
        for(boolean x : a)
            System.out.print(x + " ");

        System.out.println();
    }

fun(); // Error: Ambiguous!

According to JLS 15.12.2, overload resolution runs in three phases: the first phase performs resolution without permitting boxing or unboxing conversion, the second phase allows boxing and unboxing, and the third phase additionally allows variable arity (varargs) invocation. If a phase finds more than one applicable method and none of them is more specific than the others, the call is ambiguous.
The call above could be translated into a call to fun(int ...) or fun(boolean ...). Both are applicable only in the third phase, and neither is more specific than the other because int and boolean are unrelated types. Thus, the call is inherently ambiguous.

The following overloaded versions of fun( )are inherently ambiguous:

static void fun(int ... a) { // method body  }
static void fun(int n, int ... a) { //method body }

Here, although the parameter lists of fun( ) differ, there is no way for the compiler to resolve the following call:

fun(1)

This call may resolve to either fun(int ... a) or fun(int n, int ... a), which creates the ambiguity. To resolve ambiguity errors like these, give up overloading and simply use two different method names.
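
Besides renaming, a caller can sometimes sidestep the ambiguity by passing an explicit array, because the argument type then matches only one overload. A small sketch of that idea for the int/boolean pair shown earlier (the class name VarargsDemo is made up, and the methods return strings instead of printing so the result is easy to check):

```java
class VarargsDemo {
    static String fun(int... a) {
        return "int varargs, " + a.length + " args";
    }

    static String fun(boolean... a) {
        return "boolean varargs, " + a.length + " args";
    }

    public static void main(String[] args) {
        // fun();  // would not compile: the reference to fun is ambiguous
        System.out.println(fun(new int[0]));      // an int[] argument matches only fun(int...)
        System.out.println(fun(new boolean[0]));  // a boolean[] argument matches only fun(boolean...)
        System.out.println(fun(1, 2));            // the argument types already disambiguate
    }
}
```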

http://www.geeksforgeeks.org/double-brace-initialization-java/

        Set<String> sets = new HashSet<String>()
        {
            {
                add("one");
                add("two");
                add("three");
            }
        };

The first brace creates a new anonymous inner class. Such inner classes can access the behavior of their parent class. So, in our case, we are actually creating a subclass of HashSet, which is why this inner class can call the add() method.

The second pair of braces is an instance initializer block. The code inside an instance initializer is executed whenever an instance is created.
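
That the result really is an anonymous subclass, not a plain HashSet, can be checked with a short sketch (the class name DoubleBraceDemo is made up for illustration):

```java
import java.util.HashSet;
import java.util.Set;

class DoubleBraceDemo {
    static Set<String> build() {
        // outer braces: anonymous subclass of HashSet; inner braces: instance initializer
        return new HashSet<String>() {
            {
                add("one");
                add("two");
            }
        };
    }

    public static void main(String[] args) {
        Set<String> set = build();
        System.out.println(set.getClass() == HashSet.class);   // false: it is an anonymous subclass
        System.out.println(set.getClass().getSuperclass());    // class java.util.HashSet
        System.out.println(set.size());                        // 2
    }
}
```

Each use of the idiom therefore compiles to an extra class, which is one reason some teams avoid it.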

http://stackoverflow.com/questions/1883345/whats-up-with-javas-n-in-printf

There is also one specifier that doesn't correspond to an argument. It is "%n", which outputs a line break. A "\n" can also be used in some cases, but since "%n" always outputs the correct platform-specific line separator, it is portable across platforms whereas "\n" is not.
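
A minimal sketch of the difference (the class name LineBreakDemo is made up): the formatter expands %n to System.lineSeparator(), so the result depends on the platform, while "\n" is always a single line feed.

```java
class LineBreakDemo {
    static String twoLines() {
        // %n is replaced with the platform line separator at format time
        return String.format("first%nsecond");
    }

    public static void main(String[] args) {
        String expected = "first" + System.lineSeparator() + "second";
        System.out.println(twoLines().equals(expected));   // true
    }
}
```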

http://www.javaquery.com/2015/06/what-is-difference-between.html
String.valueOf(Object) is null-safe, so using it avoids a java.lang.NullPointerException. It returns the String "null" when the object is null.

toString() can cause a java.lang.NullPointerException. It throws when the object it is called on is null, and if the exception is not handled properly it terminates the program.
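
A small sketch of both behaviors (the class and method names are made up for illustration):

```java
class NullSafeDemo {
    static String describe(Object o) {
        return String.valueOf(o);   // yields the text "null" for a null reference
    }

    static boolean toStringThrows(Object o) {
        try {
            o.toString();           // dereferences o, so a null reference throws
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(null));        // null
        System.out.println(describe(42));          // 42
        System.out.println(toStringThrows(null));  // true
    }
}
```
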
http://stackoverflow.com/questions/1187093/can-i-escape-braces-in-a-java-messageformat

The documentation for MessageFormat knows the answer:

Within a String, "''" represents a single quote. A QuotedString can contain arbitrary characters except single quotes; the surrounding single quotes are removed. An UnquotedString can contain arbitrary characters except single quotes and left curly brackets. Thus, a string that should result in the formatted message "'{0}'" can be written as "'''{'0}''" or "'''{0}'''".

You can put them inside single quotes e.g.

'{'return {2};'}'
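
As a quick check of the quoting rule, here is a sketch that escapes both braces and leaves one real placeholder (the class name BraceEscapeDemo and the placeholder index 0 are choices made for this example):

```java
import java.text.MessageFormat;

class BraceEscapeDemo {
    static String render(Object value) {
        // '{' and '}' are quoted, so only {0} is treated as a placeholder
        return MessageFormat.format("'{'return {0};'}'", value);
    }

    public static void main(String[] args) {
        System.out.println(render("x"));   // {return x;}
    }
}
```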

http://joel.barciausk.as/2008/12/10/a-guide-to-java-messageformat/

https://en.wikipedia.org/wiki/.properties

# The key and element characters #, !, =, and : are written with
# a preceding backslash to ensure that they are properly loaded.
website = http\://en.wikipedia.org/
language = English
# The backslash below tells the application to continue reading
# the value onto the next line.
message = Welcome to \
          Wikipedia!
# Add spaces to the key
key\ with\ spaces = This is the value that could be looked up with the key "key with spaces".
# Unicode
tab : \u0009

https://docs.oracle.com/cd/E23095_01/Platform.93/ATGProgGuide/html/s0204propertiesfileformat01.html

path=c:\\docs\\doc1

https://www.onehippo.org/library/development/editing-properties-files.html

http://stackoverflow.com/questions/10194855/java-map-key-class-value-instance-of-that-class

// Typesafe heterogeneous container pattern - implementation
public class Favorites {
  private Map<Class<?>, Object> favorites =
    new HashMap<Class<?>, Object>();

  public <T> void putFavorite(Class<T> type, T instance) {
    if (type == null)
      throw new NullPointerException("Type is null");
    favorites.put(type, instance);
  }

  public <T> T getFavorite(Class<T> type) {
    return type.cast(favorites.get(type));
  }
}

http://stackoverflow.com/questions/7259906/propagating-threadlocal-to-a-new-thread-fetched-from-a-executorservice

I am using the following super class for my tasks that need to have access to request scope. Basically you can just extend it and implement your logic in onRun() method.

import org.springframework.web.context.request.RequestAttributes;
import org.springframework.web.context.request.RequestContextHolder;

public abstract class RequestAwareRunnable implements Runnable {
  private final RequestAttributes requestAttributes;
  private Thread thread;

  public RequestAwareRunnable() {
    this.requestAttributes = RequestContextHolder.getRequestAttributes();
    this.thread = Thread.currentThread();
  }

  public void run() {
    try {
      RequestContextHolder.setRequestAttributes(requestAttributes);
      onRun();
    } finally {
      if (Thread.currentThread() != thread) {
        RequestContextHolder.resetRequestAttributes();
      }
      thread = null;
    }
  }

  protected abstract void onRun();
}

http://stackoverflow.com/questions/7322469/java-equals-for-a-class-is-same-as-equals
instanceof returns true also for subclasses

Class is final, so its equals() cannot be overridden. Its equals() method is inherited from Object which reads

public boolean equals(Object obj) {
    return (this == obj);
}

So yes, they are the same thing for a Class, or any type which doesn't override equals(Object)

To answer your second question, each ClassLoader can only load a class once and will always give you the same Class for a given fully qualified name.

http://www.cnblogs.com/yangqiangyu/p/5246515.html

String s1 = "abc";
StringBuffer s2 = new StringBuffer(s1); 
System.out.println(s1.equals(s2));

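Is this true or false? The answer is false.

First, s1 refers to the string "abc". Then StringBuffer s2 = new StringBuffer(s1) creates a new StringBuffer object (the constructor calls append(), which returns the buffer itself). The call then goes to String's equals method. The key point is the instanceof check inside equals: the argument must be a String for the characters to be compared at all; otherwise the method returns false immediately.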

    public boolean equals(Object anObject) {
        if (this == anObject) {
            return true;
        }
        // this is the key point
        if (anObject instanceof String) {
            String anotherString = (String) anObject;
            int n = value.length;
            if (n == anotherString.value.length) {
                char v1[] = value;
                char v2[] = anotherString.value;
                int i = 0;
                while (n-- != 0) {
                    if (v1[i] != v2[i])
                            return false;
                    i++;
                }
                return true;
            }
        }
        return false;
    }

StringBuilder and StringBuffer in Java 8 do not override the equals method, so two different builders are never equal, whatever their content.
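
When the content is what matters, String.contentEquals(CharSequence) compares characters rather than identities. A minimal sketch (the class and method names are made up):

```java
class BuilderEqualsDemo {
    static boolean sameContent(CharSequence a, CharSequence b) {
        // compare the characters, not the object identities
        return a.toString().contentEquals(b);
    }

    public static void main(String[] args) {
        StringBuilder x = new StringBuilder("abc");
        StringBuilder y = new StringBuilder("abc");
        System.out.println(x.equals(y));             // false: identity comparison inherited from Object
        System.out.println("abc".equals(x));         // false: x is not a String
        System.out.println("abc".contentEquals(x));  // true: compares characters
        System.out.println(sameContent(x, y));       // true
    }
}
```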

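How many objects does the following code create in memory?
String s1 = new String("abc");
String s2 = new String("abc");

Answer: 3
Given the analysis above, this should be clear: two objects are created with new, plus the one "abc" in the string pool.

String is an immutable constant. Whenever a string literal is used, the JVM checks the string pool (the constant pool where strings are stored). If the string is not there yet, it is created in the pool; if it already exists, the variable simply points to the pooled instance, as in String b = "abc". Now for the questions.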

1.1

String s1 = new String("abc");
String s2 = new String("abc"); 
System.out.println(s1 == s2);

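What is the output?
The JVM has a heap and a stack. The stack mainly holds local variables, while objects created with new live on the heap. For object types, == compares addresses, so s1 and s2 refer to two different objects created on the heap.

The answer is clear: the addresses differ, so the output is false.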

String s1 = "abc";
String s2 = new String("abc");
s2.intern();
System.out.println(s1 ==s2);

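In the second line above, String s2 = new String("abc"), s2 refers to the object created with new. Although the third line calls intern(), the result is not assigned back to s2, so the reference held by s2 does not change and the output is false.
If the third line is changed to s2 = s2.intern(), the output becomes true.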

 String s1 = "abc";
 String s2 = new String("abc");
 s2 = s2.intern();
 System.out.println(s1==s2);


http://www.hollischuang.com/archives/1246
http://www.hollischuang.com/archives/1230

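The string pool is a special storage area in the method area. When a string is created, the JVM first looks it up in the string pool; if it is found, a reference to the existing string is returned.

If strings were mutable, then when two references pointed to the same string, a change made through one would affect the other. (Keep this effect in mind; it helps with what follows.)

Caching the hash code

The hash code of a string is used frequently in Java, for example in HashMap. Immutability guarantees that a string's hash code never changes, which avoids unnecessary trouble. It also means the hash code need not be recomputed each time it is used, which is more efficient.

The String class contains the following code:

private int hash;//this is used to cache hash code.

The hash field stores the hash code of the String object. Because String is immutable, the hash value cannot change once the object is created, so it can simply be returned whenever the hash code is needed.

String is widely used as a parameter type by other Java classes, for example when opening network connections or files. If strings were mutable, such operations could cause security problems: a method performing a connection would believe it was connecting to a particular machine, but in fact it might not be, because a change made through another reference to the same String object would alter the string used for the connection. Mutable strings could also cause security problems in reflection, whose parameters are strings as well.

Because immutable objects cannot change, they can be shared freely among threads without any synchronization.

In short, String was designed to be immutable mainly for security and efficiency, which makes an immutable String a sound design.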

Here hashCode() works on a local variable h while computing the hash code and writes the field only once at the end; otherwise the method would not be thread-safe.

java.lang.String.hashCode()

public int hashCode() {
    int h = hash;
    if (h == 0 && value.length > 0) {
        char val[] = value;
        for (int i = 0; i < value.length; i++) {
            h = 31 * h + val[i];
        }
        hash = h;
    }
    return h;
}

https://leetcode.com/discuss/71763/share-some-thoughts

Set<String>[] sets = new Set[10];
sets[2] = new HashSet<>(); // autogenerics
sets[2].add("hello");
//sets[2].add(3); // no suitable method found for add(int)
String s = sets[2].iterator().next(); // no cast

if you're bothered by the warning just do:

@SuppressWarnings("unchecked") Set<String>[] sets = new Set[10];

http://stackoverflow.com/questions/5174696/how-locale-dependent-is-simpledateformat

Just found the getAvailableLocales static method on Locale, and it turns out that all the fields of a calendar can be locale dependent:

public static void main(String[] args) {
    String pattern = "yyyy-MM-dd HH:mm:ss";
    Date date = new Date();
    String defaultFmt = new SimpleDateFormat(pattern).format(date);

    for (Locale locale : Locale.getAvailableLocales()) {
        String localeFmt = new SimpleDateFormat(pattern, locale).format(date);
        if (!localeFmt.equals(defaultFmt)) {
            System.out.println(locale + " " + localeFmt);
        }
    }
}

On my system (in Germany, running an English version of Ubuntu) this outputs the following list; let's hope the Unicode characters come through intact:

ja_JP_JP 23-03-03 16:53:09
hi_IN २०११-०३-०३ १६:५३:०९
th_TH 2554-03-03 16:53:09
th_TH_TH ๒๕๕๔-๐๓-๐๓ ๑๖:๕๓:๐๙

So Japan and Thailand use a different epoch but are otherwise based on the Gregorian calendar, which explains why month and day are the same.

Other locales also use different scripts for writing numbers, for example Hindi as spoken in India and a variant of Thai in Thailand.

To answer the question, the locale should always be set to a known value when a locale-independent String is needed.

Edit: Java 1.6 added a constant Locale.ROOT to specify a language/country neutral locale. This would be preferred to specifying the English locale for output targeted at a computer.

The root locale is the locale whose language, country, and variant are empty ("") strings. This is regarded as the base locale of all locales, and is used as the language/country neutral locale for the locale sensitive operations.
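
A minimal sketch of machine-oriented formatting with Locale.ROOT. The class name, the pattern, and the choice to pin the time zone to UTC are assumptions of this example, not part of the quoted answer.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class RootLocaleDemo {
    static String machineTimestamp(Date date) {
        // Locale.ROOT keeps digits and calendar independent of the default locale
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));  // fixed zone, an assumption of this sketch
        return fmt.format(date);
    }

    public static void main(String[] args) {
        System.out.println(machineTimestamp(new Date(0L)));  // 1970-01-01 00:00:00
    }
}
```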

http://stackoverflow.com/questions/22091107/get-date-object-in-utc-format-in-java

A Date doesn't have any time zone. What you're seeing is only the formatting of the date by the Date.toString() method, which uses your local timezone, always, to transform the timezone-agnostic date into a String that you can understand.

If you want to display the timezone-agnostic date as a string using the UTC timezone, then use a SimpleDateFormat with the UTC timezone (as you're already doing in your question).

In other terms, the timezone is not a property of the date. It's a property of the format used to transform the date into a string.
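
To see that the zone belongs to the formatter and not to the Date, the same instant can be formatted twice. This is a sketch with made-up names; the zone IDs are just examples.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class DateZoneDemo {
    static String format(Date date, String zoneId) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone(zoneId));  // the zone is a property of the format
        return fmt.format(date);
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L);                        // one instant, no zone attached
        System.out.println(format(epoch, "UTC"));         // 1970-01-01 00:00
        System.out.println(format(epoch, "GMT+02:00"));   // 1970-01-01 02:00
    }
}
```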

http://stackoverflow.com/questions/17569608/format-a-message-using-messageformat-format-in-java

Add an extra apostrophe ' to the MessageFormat pattern String to ensure the ' character is displayed

String text = 
     java.text.MessageFormat.format("You''re about to delete {0} rows.", 5);

http://www.mscharhag.com/java/resource-bundle-single-quote-escaping

“This is the specified behavior, although admittedly it’s somewhat confusing. An apostrophe (also known as “single quote”) in a MessageFormat pattern starts a quoted string, in which {0} is just treated as a literal string and is not interpreted. Two single quotes in sequence in the pattern result in one single quote in the output string. See http://java.sun.com/j2se/1.4/docs/api/java/text/MessageFormat.html#patterns for more information.”

https://bernytech.wordpress.com/2009/06/23/messageformat/

http://www.geeksforgeeks.org/few-tricky-programs-in-java/
Comments that execute :

    public static void main(String[] args)
    {
         // the line below this gives an output
         // \u000d System.out.println("comment executed");
    }

The reason for this is that the Java compiler parses the unicode character \u000d as a new line.

    public static void main(String[] args)
    {
        loop1:
        for (int i = 0; i < 5; i++)
        {
            for (int j = 0; j < 5; j++)
            {
                if (i == 3)
                    break loop1;
                System.out.println("i = " + i + " j = " + j);
            }
        }
    }

You can also use continue to jump to start of the named loop.

We can also use break (or continue) in a nested if-else with for loops in order to break several loops with if-else, so one can avoid setting lot of flags and testing them in the if-else in order to continue or not in this nested level.
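
A labeled continue can be sketched the same way. The helper below (names made up for illustration) skips the rest of a row as soon as a negative value is seen, without any flag variable:

```java
import java.util.ArrayList;
import java.util.List;

class LabeledContinueDemo {
    // Collects the indexes of the rows that contain no negative number.
    static List<Integer> rowsWithoutNegatives(int[][] grid) {
        List<Integer> rows = new ArrayList<>();
        outer:
        for (int i = 0; i < grid.length; i++) {
            for (int value : grid[i]) {
                if (value < 0) {
                    continue outer;   // abandon this row and move on to the next i
                }
            }
            rows.add(i);
        }
        return rows;
    }

    public static void main(String[] args) {
        int[][] grid = { {1, 2}, {3, -1}, {5} };
        System.out.println(rowsWithoutNegatives(grid));   // [0, 2]
    }
}
```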

http://konigsberg.blogspot.com/2008/04/integergetinteger-are-you-kidding-me.html

Integer.valueOf(String) converts a String to a number by assuming the String is a numeric representation. In other words, Integer.valueOf("12345") yields the number 12345.
Integer.getInteger(String) converts a String to a number by assuming the String is the name of a system property holding a numeric representation. In other words, Integer.getInteger("12345") is likely to yield null.

Why would anybody consider this a sufficient distinction? How many bugs are people going to create by using getInteger when they meant valueOf and vice versa?

This type of overloading is called near-phrase overloading. I just made that term up right now. It's when people use very similar words to mean different things.

Update: it turns out there is something worse: Boolean.getBoolean("true") is usually equal to Boolean.FALSE.
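
The difference is easy to reproduce. In this sketch the property name demo.port is hypothetical, and the class and method names are made up for illustration:

```java
class GetIntegerDemo {
    static Integer fromProperty(String propertyName) {
        // looks up a system property by name and parses its value; null if absent
        return Integer.getInteger(propertyName);
    }

    public static void main(String[] args) {
        System.out.println(Integer.valueOf("12345"));       // 12345
        // null unless a system property literally named "12345" happens to be set
        System.out.println(fromProperty("12345"));
        System.setProperty("demo.port", "8080");            // hypothetical property name
        System.out.println(fromProperty("demo.port"));      // 8080
        // reads the system property named "true", which is normally not set
        System.out.println(Boolean.getBoolean("true"));
        System.out.println(Boolean.parseBoolean("true"));   // true
    }
}
```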

Redirecting System.out.println() output to a file in Java
PrintStream o = new PrintStream(new File("A.txt"));
// Store current System.out before assigning a new value
PrintStream console = System.out;
// Assign o to output stream
System.setOut(o);
System.out.println("This will be written to the text file");
// Use stored value for output stream
System.setOut(console);
System.out.println("This will be written on the console!");

http://calvin1978.blogcn.com/articles/jdk.html

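In older JDKs, reflection code could throw ClassNotFoundException, NoSuchMethodException, IllegalAccessException, and InvocationTargetException. I don't know about others, but I would lazily catch or declare plain Exception, even though there are a hundred reasons why that is bad. Since JDK 7 these exceptions share a common parent, ReflectiveOperationException, so catching that is enough.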
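
A minimal sketch of catching the common parent type (the class and method names are made up; the class names passed in are just examples):

```java
class ReflectionCatchDemo {
    static String load(String className) {
        try {
            Object instance = Class.forName(className).getDeclaredConstructor().newInstance();
            return instance.getClass().getSimpleName();
        } catch (ReflectiveOperationException e) {
            // covers ClassNotFoundException, NoSuchMethodException, InstantiationException,
            // IllegalAccessException and InvocationTargetException in one clause
            return "failed: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(load("java.util.ArrayList"));   // ArrayList
        System.out.println(load("no.such.Type"));          // failed: ClassNotFoundException
    }
}
```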

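Older JDKs had another baffling limitation: reflection could never retrieve parameter names. Whatever the names were, it returned arg0, arg1, so in CXF and Spring MVC you had to repeat each parameter name in an annotation: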

Person getEmployee(@PathParam("dept") Long dept, @QueryParam("id") Long id)

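Now the new JDK 8 class java.lang.reflect.Parameter can return parameter names through reflection, provided the code is compiled with an extra flag, as in javac -parameters xxx.java, or with the equivalent Eclipse setting. The method can then be written as: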

Person getEmployee(@PathParam Long dept, @QueryParam Long id)

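6. A better high-concurrency counter than AtomicLong

Under extremely high concurrency AtomicLong is no silver bullet: although it takes no lock, it still has to resolve contention by retrying CAS in a loop.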

for ( ; ; ) {
long current = get();
long next = current + 1;
if (compareAndSet(current, next))
return next;
}

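As you can see, under heavy contention each thread may go through several rounds of compareAndSet() before its increment() finally succeeds.

Doesn't this bring to mind ConcurrentHashMap, which spreads contention across sixteen locks to lower the chance of conflict?

JDK 8 adds LongAdder to implement this idea. It keeps several counters internally; each increment() lands on one of them, and sum() adds their values together.

Those without JDK 8 need not worry: Guava carries its own copy of LongAdder.

Note, however, that this counter suits statistics-style scenarios with highly concurrent increment() calls and only an occasional sum(). If the current value has to be queried frequently and concurrently, the benefit of the distributed counters is cancelled out.
Its implementation is also considerably more complex than AtomicLong. If concurrency is not that high, sticking with AtomicLong is perfectly fine; simple is good.
PS. CoolShell (酷壳) has a more detailed explanation: <从LongAdder看更高效的无锁实现> (a more efficient lock-free implementation, as seen through LongAdder).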
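
The write-heavy, read-once pattern described above might look like the following sketch. The class and method names, the pool setup, and the one-minute timeout are assumptions chosen for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

class LongAdderDemo {
    static long countConcurrently(int threads, int incrementsPerThread) {
        LongAdder counter = new LongAdder();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.increment();   // lands on one of the internal cells
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return counter.sum();              // read once, after all writers are done
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(4, 10_000));   // 40000
    }
}
```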

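5. Improvements to the sorting algorithms in JDK 7/8

Candidates in interview season memorize insertion sort, merge sort, bubble sort, and quicksort. So which sorting algorithms did the JDK actually pick?

Collections.sort(list) and Arrays.sort(T[])
Collections.sort() actually converts the list to an array, calls Arrays.sort(), and then copies the sorted result back into the List.
PS. In JDK 8, List has its own sort() method. ArrayList sorts its internal array directly, while LinkedList and CopyOnWriteArrayList still copy the elements out into an array.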

In jdk8, java.util.Collections#sort

public static <T> void sort(List<T> list, Comparator<? super T> c) {
    list.sort(c);
}

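As for Arrays.sort(), JDK 6 uses quicksort for primitive arrays (int[], double[], char[], byte[]) and merge sort for object arrays (Object[]). Why different algorithms? One reason is that sorting objects should be stable, and merge sort is stable while quicksort is not.

Progress in JDK 7
JDK 7 upgraded quicksort to dual-pivot quicksort (compare dual-pivot quicksort with three-way quicksort) and replaced merge sort with its improved variant, TimSort. The JDK keeps evolving.

Progress in JDK 8
JDK 8 added Arrays.parallelSort() for large collections. It uses the fork/join framework to exploit multiple cores: a large array is split, the pieces are sorted, and the results are merged, while small contiguous runs are still sorted with TimSort and DualPivotQuickSort.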
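
Calling the parallel variant looks the same as calling Arrays.sort(). A minimal sketch, with made-up class and method names:

```java
import java.util.Arrays;

class ParallelSortDemo {
    static int[] sortedCopy(int[] input) {
        int[] copy = Arrays.copyOf(input, input.length);
        Arrays.parallelSort(copy);   // small arrays are sorted sequentially under the hood
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sortedCopy(new int[] {3, 1, 2})));   // [1, 2, 3]
    }
}
```

Whether the parallel version pays off depends on array size and available cores, so it is worth measuring before switching.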

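4. ThreadLocalRandom for high concurrency

The concurrent package in JDK 7 contains ThreadLocalRandom. Its pseudorandom sequence uses the same algorithm as its parent class java.util.Random, following the formula given by Donald Knuth in The Art of Computer Programming, Volume 2: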

x(0)=seed;
x(i+1)=(A* x(i) +B) mod M;

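The difference is that Random keeps its seed in an AtomicLong and must repeatedly call compareAndSet(current, next) to avoid concurrent conflicts, whereas ThreadLocalRandom solves the concurrency problem with the thread-local pattern, so a plain long is enough for the seed.

Usage: int r = ThreadLocalRandom.current().nextInt(1000);

If you are not on JDK 7, you can copy and paste this class yourself; Netty and Esper both do exactly that.

ImportNews has translated a more detailed article: generating random numbers in a multithreaded environment.

Nashorn, combining the power of Java and JavaScript in JDK 8: the new JavaScript engine in JDK 8 is said to be 2 to 10 times faster than the one in JDK 6. In addition, Avatar.js can run Node.js and its libraries inside Java, and Node.js can in turn call Java libraries, one wrapped inside the other.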

public class ThreadLocalRandom extends Random {
private static final AtomicLong seeder = new AtomicLong(initialSeed());
}
class Random implements java.io.Serializable {
private final AtomicLong seed;
}

http://calvin1978.blogcn.com/articles/stringbuilder.html

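1. The initial capacity matters so much it is worth saying four times.

Inside a StringBuilder there is a char[], and repeated append() calls simply keep filling that char[].

With new StringBuilder() the default length of the char[] is 16. So what happens when the 17th character is appended?

The array is grown to roughly twice its size by copying with System.arraycopy!

This costs an array copy, and the old char[] is wasted and left for the GC. As you can imagine, building a 129-character string goes through four rounds of copying and discarding, at 16, 32, 64, and 128, and allocates 496 characters of arrays in total. In high-performance scenarios this is close to intolerable.

So setting a sensible initial capacity really matters.

But what if I really cannot estimate it? Then overestimate a little. As long as the final string is longer than 16 characters, wasting a little space still beats repeated doubling.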
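
When the final length can be computed up front, the builder can be sized exactly. A minimal sketch, with made-up class and method names:

```java
class BuilderCapacityDemo {
    static String join(String[] parts) {
        int length = 0;
        for (String part : parts) {
            length += part.length();
        }
        StringBuilder sb = new StringBuilder(length);   // sized once, so append never has to grow
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(new StringBuilder().capacity());      // 16
        System.out.println(new StringBuilder(129).capacity());   // 129
        System.out.println(join(new String[] {"ab", "cd"}));     // abcd
    }
}
```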

3. But a whole extra char[] is still wasted

The waste happens in the last step, StringBuilder.toString():

// Create a copy, don't share the array
return new String(value, 0, count);

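The String constructor copies the incoming char[] with System.arraycopy() to guarantee safety and immutability. If the story ended there, the char[] inside the StringBuilder would still be sacrificed for nothing.

To avoid wasting that char[], one option is black magic such as Unsafe, bypassing the constructor and assigning String's char[] field directly, but few people do this.

A more reliable approach is to reuse the StringBuilder. Reuse also solves the capacity problem above: even if the first estimate is off, after a few expansions the buffer is big enough.

4. Reusing StringBuilder

This technique comes from the JDK's BigDecimal class (which shows how much reading JDK code pays off). SpringSide extracted the code into StringBuilderHolder, which has only one method: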

public StringBuilder getStringBuilder() {
sb.setLength(0);
return sb;
}

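StringBuilder.setLength() only resets the count pointer, and the char[] keeps being reused. toString() passes the current count to the String constructor as an argument, so there is no risk of old content beyond the new length leaking in. StringBuilder can therefore be reused safely.

To avoid concurrent conflicts, the holder is usually kept in a ThreadLocal; for the canonical form, see the comments in BigDecimal or StringBuilderHolder.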
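
A rough sketch of the thread-local reuse idea follows. It is not the SpringSide or BigDecimal code; the class name ReusableBuilder and the starting capacity of 256 are assumptions made for illustration.

```java
class ReusableBuilder {
    // One builder per thread; 256 is an arbitrary starting capacity for this sketch.
    private static final ThreadLocal<StringBuilder> HOLDER =
            ThreadLocal.withInitial(() -> new StringBuilder(256));

    static StringBuilder get() {
        StringBuilder sb = HOLDER.get();
        sb.setLength(0);    // reset the count, keep the backing array
        return sb;
    }

    static String greet(String name) {
        return get().append("hello ").append(name).toString();
    }

    public static void main(String[] args) {
        System.out.println(greet("alice"));   // hello alice
        System.out.println(greet("bob"));     // hello bob
    }
}
```

Because every call to get() hands back the same per-thread instance, a caller must finish with the builder (call toString()) before anything else on that thread calls get() again.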

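5. + versus StringBuilder

String s = "hello " + user.getName();

After compilation by javac, this statement is indeed equivalent to using a StringBuilder, but without a preset capacity:

String s = new StringBuilder().append("hello ").append(user.getName()).toString();

However, if the code looks like this:

String s = "hello ";
// some other statements in between
s = s + user.getName();

every concatenation statement creates its own new StringBuilder, so nothing is shared between statements and performance is quite different. Inside a loop, s += i; makes it far worse.

According to R大, hard-working JVM engineers added a runtime optimization, controlled by -XX:+OptimizeStringConcat (on by default since JDK 7u40), that merges adjacent StringBuilders (with no control statements in between) into one and also tries hard to guess the length.

So, to be safe, keep using StringBuilder yourself and set the capacity.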

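7. Always leave log string concatenation to slf4j??

logger.info("Hello {}", user.getName());

For log messages that may or may not be emitted, letting slf4j build the string only when output is really needed does save cost.

But for messages that will definitely be emitted, concatenating them yourself with StringBuilder is faster. A look at the slf4j implementation shows that it simply calls indexOf("{}") repeatedly, calls substring() repeatedly, and then joins the pieces with a StringBuilder. There is no silver bullet.

PS. The StringBuilder in slf4j reserves 50 characters beyond the original message; if the arguments add up to more than 50 characters it still has to copy and grow, and the StringBuilder is not reused either.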

http://yq.aliyun.com/articles/2386
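Assign every field of the object inside a copy function. This approach is simple and natural but has a fatal flaw: if one day a field that needs deep copying is added to the class, the copy function must be changed as well, which greatly hurts the extensibility of the class.

1. Deep copy with the Java Cloneable interface
This approach requires the class to implement the Cloneable interface and override clone(), calling super.clone() inside it. It brings another problem: if the class has fields that are objects of other classes, those classes must also implement Cloneable and override clone().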
public Object clone() throws CloneNotSupportedException {
ComplexDO newClass = (ComplexDO) super.clone();
newClass.l = new ArrayList<SimpleDO>();
for (SimpleDO simple : this.l) {
newClass.l.add((SimpleDO) simple.clone());
}
return newClass;
}
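2. Deep copy with Java serialization

This approach uses Java serialization to turn an object into a binary byte stream and then deserializes that stream into a new object. Example: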

    public Object seirCopy(Object src) {
        try {
            ByteArrayOutputStream byteOut = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(byteOut);
            out.writeObject(src);

            ByteArrayInputStream byteIn = new ByteArrayInputStream(byteOut.toByteArray());
            ObjectInputStream in = new ObjectInputStream(byteIn);
            Object dest = in.readObject();
            return dest;
        } catch (Exception e) {
            //do some error handler
            return null;
        }
 }

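Of course, a serialization library such as JSON can also do the job. This approach neatly avoids the extensibility drawback of the Cloneable interface: one function works for essentially all classes. The drawback is speed: compared with an in-memory copy, serialization must first convert the object to a binary byte stream and then deserialize that stream back into object memory, which is relatively slow.

3. Source analysis of cloning, a library billed as the fastest deep copy
https://github.com/kostaskougios/cloning
Cloner cloner=new Cloner();
MyClass clone=cloner.deepClone(o);
Copying through the Cloneable interface is the fastest, because it involves only memory copying, but it is more tedious to write when many fields are ordinary objects.
Copying through serialization/deserialization is the slowest.
The cloning library relies on recursion and reflection, so it is slower than a Cloneable-based copy but faster than serialization.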

http://blog.csdn.net/never_cxb/article/details/47204485

int	4 bytes
char	2 bytes
byte	1 byte
short	2 bytes
long	8 bytes
float	4 bytes
double	8 bytes

http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html

boolean: The boolean data type has only two possible values: true and false. Use this data type for simple flags that track true/false conditions. This data type represents one bit of information, but its "size" isn't something that's precisely defined.

https://community.oracle.com/thread/2550321?start=0&tstart=0
In Oracle’s Java Virtual Machine implementation, boolean arrays in the Java
programming language are encoded as Java Virtual Machine byte arrays, using 8 bits per
boolean element.
http://chrononsystems.com/blog/hidden-evils-of-javas-byte-array-byte

Consider this piece of code, which allocates a boolean array of 100,000 elements:

boolean[] array = new boolean[100000];

What should the size of this array be?

Considering that a boolean can just be either true or false, every element in the array only needs a single bit of space each. Thus, the size of our boolean array in bytes should be:

100,000/8 + (overhead of array object) = 12,500 bytes + (overhead of array object)

But there lies the hidden evil….

The Evil Inside

As it turns out, when you allocate a boolean array, Java uses an entire byte for each element in the boolean array!

Thus the size of the boolean array is in fact:

100,000 bytes + (overhead of array object)

Remedy

So is there any way not to waste the 7 extra bits when you only need 1 bit? It's here that the java.util.BitSet class comes to the rescue.

The BitSet class does indeed use a single bit to represent a true/false boolean value. Its implementation uses an array of ‘long’ values, where each bit of the long value can be individually manipulated to set any position in the entire BitSet to true or false.

The boolean array takes:
100,000 + 16 (array overhead) = 100,016 bytes

The BitSet object only takes :

100,000 / 64 (size of long) = 1562.5 = 1563 long values after rounding
1563*8 + 16 (array overhead) = 12,520 bytes
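
Using a BitSet in place of a boolean[] is mostly a matter of swapping array indexing for set() and get(). A minimal sketch, with made-up class and method names:

```java
import java.util.BitSet;

class BitSetDemo {
    // Marks every even index below n.
    static BitSet evens(int n) {
        BitSet bits = new BitSet(n);   // backed by long words, one bit per flag
        for (int i = 0; i < n; i += 2) {
            bits.set(i);
        }
        return bits;
    }

    public static void main(String[] args) {
        BitSet bits = evens(100_000);
        System.out.println(bits.get(4));          // true
        System.out.println(bits.get(5));          // false
        System.out.println(bits.cardinality());   // 50000
    }
}
```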

http://java-performance.info/string-switch-implementation/
Java 7 String Switch is a syntactic sugar on top of the normal switch operator.

public int switchTest( final String s )
{
    switch ( s )
    {
        case "a" :
            System.out.println("aa");
            return 11;
        case "b" :
            System.out.println("bb");
            return 22;
        default :
            System.out.println("cc");
            return 33;
    }
}

It is converted by javac into the following code (decompiled back into Java):

The generated code consists of 2 parts:

Translation from String into a distinct int for each case, which is implemented in the first switch statement.
The actual switch based on int-s.

public int switchTest(String var1) {
    byte var3 = -1;
    switch(var1.hashCode()) {
    case 97:
        if(var1.equals("a")) {
            var3 = 0;
        }
        break;
    case 98:
        if(var1.equals("b")) {
            var3 = 1;
        }
    }
 
    switch(var3) {
    case 0:
        System.out.println("aa");
        return 11;
    case 1:
        System.out.println("bb");
        return 22;
    default:
        System.out.println("cc");
        return 33;
    }
}

The first switch contains a case for each distinct String.hashCode in the original String switch labels. After matching by hash code, a string is compared for equality to every string with the same hash code. It is pretty unlikely that 2 strings used in switch labels will have the same hash code, so in most cases you will end up with exactly one String.equals call.

It becomes clear why you cannot use null in a String switch: the first switch starts by calculating the hashCode of the switch argument.

What can we say about the performance of the underlying int switch? As you can find in one of my earlier articles, a switch is implemented as a fixed map with a table size of approximately 20 (which is fine for most of common cases).

Finally, we should note that the String.hashCode implementation has implicitly become part of the Java Language Specification since it was used in the String switch implementation. It can no longer be changed without breaking the .class files containing a String switch that were compiled with older versions of Java.
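
The equals() check after the hash match is what keeps colliding labels apart. The strings "Aa" and "BB" are a well-known pair with the same hash code, which makes for a compact sketch (the class and method names are made up):

```java
class StringSwitchCollisionDemo {
    static int classify(String s) {
        switch (s) {
            case "Aa":      // "Aa" and "BB" both hash to 2112
                return 1;
            case "BB":
                return 2;
            default:
                return 0;
        }
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() == "BB".hashCode());   // true
        System.out.println(classify("Aa"));                       // 1
        System.out.println(classify("BB"));                       // 2
    }
}
```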

http://www.zhuangjingyang.com/arrays-fill-java-lang-arraystoreexception-java-lang-integer.html

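Given a four-dimensional array a that should be initialized uniformly to -1, calling Arrays.fill() on it throws this exception.

The API documentation shows why:

Arrays.fill(int[] a, int value) only handles one-dimensional arrays; there is no overload for multidimensional arrays.

Solution:

First define a one-dimensional array, fill it with the desired initial value, and then use that array to fill the target array.

  int[][] target = new int[2][4];

  // intermediate array, assuming an initial value of -1
  // its length should match the column count of the 2D array
  int[] temp = new int[4];

  Arrays.fill(temp, -1);
  Arrays.fill(target, temp);

As the code shows, the idea is simply to use the one-dimensional array to fill each row of the two-dimensional one. Note that Arrays.fill(target, temp) stores the same temp reference in every row, so all rows share one array and a later write to one row shows up in every row; this is safe only if the rows are never modified independently. Because each row is replaced by temp, the length of temp should equal the column count.

The same idea extends to four dimensions and beyond, which saves writing several nested for loops.

  // 2D
  int[][] target = new int[2][4];
  // intermediate array, assuming an initial value of -1
  // its length should match the column count of the 2D array
  int[] temp = new int[4];
  Arrays.fill(temp, -1);
  Arrays.fill(target, temp);
  // 3D
  int[][][] target4 = new int[2][2][2];
  Arrays.fill(target4, target);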
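
One thing to watch with the fill-with-an-array trick is that every row ends up pointing at the same array object. The sketch below (names made up for illustration) contrasts it with a per-row fill that keeps the rows independent:

```java
import java.util.Arrays;

class FillDemo {
    // Fills each row separately so that every row keeps its own array.
    static int[][] filled(int rows, int cols, int value) {
        int[][] grid = new int[rows][cols];
        for (int[] row : grid) {
            Arrays.fill(row, value);
        }
        return grid;
    }

    public static void main(String[] args) {
        int[][] shared = new int[2][4];
        int[] temp = new int[4];
        Arrays.fill(temp, -1);
        Arrays.fill(shared, temp);      // both rows now reference temp
        shared[0][0] = 7;
        System.out.println(shared[1][0]);        // 7: the rows alias each other

        int[][] independent = filled(2, 4, -1);
        independent[0][0] = 7;
        System.out.println(independent[1][0]);   // -1
    }
}
```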

http://www.chenguanghe.com/amazon-abstract-class-%E4%B8%8E-anonymous-class/
An abstract class cannot be instantiated directly, even if it has no abstract methods

import java.util.LinkedList;

public abstract class Car {
    public int getAge() {
        return 5;
    }

    public static void main(String[] args) {
        Car mycar = new Car(){};
        System.out.println(mycar.getAge());
        LinkedList list = new LinkedList();
    }
}

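Although Car is an abstract class, it has no abstract methods, so it is not all that "abstract".
To instantiate Car, I used an anonymous class. Since Car has no abstract methods, an empty body is enough and this instantiation is legal.

We use anonymous classes all the time when writing a Comparator; there the compare method is abstract, so it has to be defined every time. Here Car has no abstract methods, so it can be instantiated through an anonymous class with an empty body.

Why is creating protected fields discouraged?

Although you can create protected fields, it is best to keep fields private; you should always preserve the right to change the underlying implementation. Then control what subclasses can access through protected methods.

A brief introduction to Java daemon threads

As long as at least one user thread exists in the JVM, the JVM will not exit.

The garbage collection thread is a typical daemon thread: it exists to collect garbage for the user threads. Once all user threads have exited, no more garbage is produced and there is no reason for it to exist, so the JVM exits and the garbage collection thread exits with it.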

http://chrononsystems.com/blog/hidden-evils-of-javas-stringsplit-and-stringr

In this case, the String.split() and String.replace*() methods (with the sole exception of String.replace(char, char)) internally use the regular expression APIs themselves, which can result in performance issues for your application.

public String[] split(String regex, int limit) {
    return Pattern.compile(regex).split(this, limit);
}
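
If the same delimiter is used over and over, one common way to avoid recompiling the expression on every call is to keep a precompiled Pattern. A minimal sketch, with a made-up class name and an example delimiter:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitDemo {
    // Compiled once and reused for every call.
    private static final Pattern COMMA = Pattern.compile("\\s*,\\s*");

    static String[] fields(String line) {
        return COMMA.split(line);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(fields("a , b,c")));   // [a, b, c]
    }
}
```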

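Java serialization and block data

A while ago we found a bug in our system: after an object was serialized and then deserialized, the original text came back garbled. After struggling with it all night, we found the cause: during deserialization, read(byte[]) did not read the complete data.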

http://blog.2baxb.me/archives/974

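When an object is created, if its class overrides finalize(), the JVM also creates a Finalizer object.
All Finalizer objects form a doubly linked list.
Every Finalizer object has a member variable named queue, and all of them point to the static queue of the Finalizer class.
At the end of the mark phase of a CMS GC, the objects to be collected are added to the pending list of Reference.
A dedicated high-priority thread, Reference Handler, processes the pending list: it takes each object off the list and puts it into the ReferenceQueue that the object points to. For Finalizer objects this queue is the static queue of the Finalizer class.
The Finalizer class has a dedicated thread that takes objects from the queue and runs the finalize() method of the object each Finalizer refers to.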

http://blog.mgm-tp.com/2012/03/hashset-java-puzzler/
If an implementation of hashCode() uses mutable fields to calculate the value, HashSet.contains() produces unexpected results, i.e. your object seems to be not a member of the set.

using mutable fields in hashCode() is a recipe for disaster. And disaster strikes when instances of this class are put in a hash-based collection like HashSet or HashMap (as map keys).
http://javaadventure.blogspot.com/2007/02/hashcode-pitfalls-with-hashset-and.html
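
The puzzler can be reproduced in a few lines. MutableKey below is a hypothetical class written for this sketch; its hash code depends on a field that is changed after the object has been added to the set.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical key class whose hash code depends on a mutable field.
class MutableKey {
    String name;

    MutableKey(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof MutableKey && Objects.equals(name, ((MutableKey) o).name);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(name);
    }
}

class MutableHashDemo {
    static boolean foundAfterMutation() {
        Set<MutableKey> set = new HashSet<>();
        MutableKey key = new MutableKey("a");
        set.add(key);               // stored under the hash of "a"
        key.name = "b";             // the hash code changes while the object sits in the set
        return set.contains(key);   // the lookup now uses the hash of "b"
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation());   // false
    }
}
```

The object is still physically in the set (size() stays 1), but it can no longer be found or removed through the normal lookup path.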

https://codingstyle.cn/topics/126
public class Child extends Father {
    static {
        System.out.println("child-->static");
    }

    private int n = 20;

    {
        System.out.println("Child Non-Static");
        n = 30;
    }

    public int x = 200;

    public Child() {
        this("The other constructor");
        System.out.println("child constructor body: " + n);
    }

    public Child(String s) {
        System.out.println(s);
    }

    public void age() {
        System.out.println("age=" + n);
    }

    public void printX() {
        System.out.println("x=" + x);
    }

    public static void main(String[] args) {
        new Child().printX();
    }
}

class Father {
    static {
        System.out.println("super-->static");
    }

    public static int n = 10;
    public int x = 100;

    public Father() {
        System.out.println("super's x=" + x);
        age();
    }

    {
        System.out.println("Father Non-Static");
    }

    public void age() {
        System.out.println("nothing");
    }
}

Class initialization (at class load time)
Object initialization
Parent before child
Top to bottom

super-->static
child-->static
Father Non-Static
super's x=100
age=0    // the surprising result
Child Non-Static
The other constructor
child constructor body: 30
x=200

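Explanation

When does polymorphism take effect?

Once the classes are initialized, the JVM already has the inheritance relationships, method tables, and so on, so the polymorphic dispatch mechanism (comparable to the C++ vtable) is ready to work. So when Father's constructor calls the overridable method age (a poor coding practice), the call is dispatched polymorphically, as expected, and Child.age() runs.

Zero-value initialization

However, while the Father part of the object is being initialized, the Child part has not really been initialized yet. Java has only zero-initialized the memory of the Child object, so at that moment the Child's n is 0.

Conclusion:

Do not call overridable methods from a constructor, to avoid surprises at run time. As a rule, a constructor should only call private methods (which are implicitly final).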
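The conclusion can be illustrated with a variation of the example above. This is a minimal sketch with hypothetical names (ConstructorCallDemo, describe, seenInConstructor): because describe() is private in Father, the call in the constructor is bound to Father's version, and a same-named method in Child does not override it.

```java
class ConstructorCallDemo {
    static class Father {
        final String seenInConstructor;

        Father() {
            // describe() is private, so this call always runs Father.describe(),
            // never a subclass method that reads not-yet-initialized fields.
            seenInConstructor = describe();
        }

        private String describe() {
            return "father";
        }
    }

    static class Child extends Father {
        private int n = 30;

        // Same name, but this does NOT override Father's private method.
        String describe() {
            return "child n=" + n;
        }
    }

    public static void main(String[] args) {
        Child c = new Child();
        System.out.println(c.seenInConstructor); // father
        System.out.println(c.describe());        // child n=30
    }
}
```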

http://www.java2blog.com/2013/02/difference-between-comparator-and-comparable-in-java.html

Parameter	Comparable	Comparator
Sorting logic	The sorting logic lives in the class whose objects are being sorted; this is why it is called the natural ordering of the objects.	The sorting logic lives in a separate class, so different orderings can be written for different attributes of the objects, e.g. sorting by id, by name, etc.
Implementation	The class whose objects are sorted must implement this interface, e.g. the Country class implements Comparable to sort a collection of Country objects by id.	The class whose objects are sorted does not need to implement this interface; some other class can. E.g. a CountrySortByIdComparator class can implement Comparator to sort a collection of Country objects by id.
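The two rows of the table can be sketched side by side. In this minimal example the Country fields, the sample data, and the CountrySortByNameComparator class are assumptions made for illustration: Country carries its natural ordering (by id) via Comparable, while a separate Comparator supplies an alternative ordering by name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class CountrySortDemo {
    // Natural ordering: the sorting logic lives in the class itself.
    static final class Country implements Comparable<Country> {
        final int id;
        final String name;

        Country(int id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public int compareTo(Country other) {
            return Integer.compare(id, other.id);
        }

        @Override
        public String toString() {
            return id + ":" + name;
        }
    }

    // Alternative ordering: the sorting logic lives in a separate class.
    static final class CountrySortByNameComparator implements Comparator<Country> {
        @Override
        public int compare(Country a, Country b) {
            return a.name.compareTo(b.name);
        }
    }

    // Made-up sample data, returned as a mutable list so it can be sorted in place.
    static List<Country> sample() {
        return new ArrayList<>(Arrays.asList(
                new Country(3, "India"), new Country(1, "Nepal"), new Country(2, "Bhutan")));
    }

    public static void main(String[] args) {
        List<Country> byId = sample();
        Collections.sort(byId); // uses Country.compareTo
        System.out.println(byId); // [1:Nepal, 2:Bhutan, 3:India]

        List<Country> byName = sample();
        Collections.sort(byName, new CountrySortByNameComparator());
        System.out.println(byName); // [2:Bhutan, 3:India, 1:Nepal]
    }
}
```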

https://stackoverflow.com/questions/26049329/javadoc-in-jdk-8-invalid-self-closing-element-not-allowed

Taken from "What's New in JDK 8" from oracle.com:

The javac tool now has support for checking the content of javadoc comments for issues that could lead to various problems, such as invalid HTML or accessibility issues, in the files that are generated when javadoc is run. The feature is enabled by the new -Xdoclint option. For more details, see the output from running "javac -X". This feature is also available in the javadoc tool, and is enabled there by default.

Now I did what it told me to do. On JDK 7, the output of "javac -X" does not mention the -Xdoclint option. However, on JDK 8, it gives:

 -Xdoclint:(all|none|[-]<group>)[/<access>]
    Enable or disable specific checks for problems in javadoc comments,
    where <group> is one of accessibility, html, missing, reference, or syntax,
    and <access> is one of public, protected, package, or private.

So, run the Javadoc utility as follows:

javadoc.exe -Xdoclint:none <other options...>

To remove the errors from the Javadoc comments themselves, replace:

<p/> with just <p>
<br/> with just <br>
