Sunday, October 25, 2015

Java SimpleDateFormat + Commons FastDateFormat



http://blog.jrwang.me/2016/java-simpledateformat-multithread-threadlocal/
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.concurrent.CountDownLatch;

// Class name chosen for illustration; the original snippet showed only the members.
public class SharedFormatDemo {
    // One instance shared by every thread: this is the bug being demonstrated.
    private static SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");

    public static String format(Date date) {
        return dateFormat.format(date);
    }

    public static Date parse(String dateStr) throws ParseException {
        return dateFormat.parse(dateStr);
    }

    public static void main(String[] args) {
        // The latch releases all ten threads at once to maximize contention.
        final CountDownLatch latch = new CountDownLatch(1);
        final String[] strs = new String[] {"2016-01-01 10:24:00", "2016-01-02 20:48:00", "2016-01-11 12:24:00"};
        for (int i = 0; i < 10; i++) {
            new Thread(new Runnable() {
                @Override
                public void run() {
                    try {
                        latch.await();
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }

                    // Wrong dates or unchecked exceptions may appear here.
                    for (int j = 0; j < 10; j++) {
                        try {
                            System.out.println(Thread.currentThread().getName() + "\t" + parse(strs[j % strs.length]));
                            Thread.sleep(100);
                        } catch (ParseException e) {
                            e.printStackTrace();
                        } catch (InterruptedException e) {
                            e.printStackTrace();
                        }
                    }
                }
            }).start();
        }
        latch.countDown();
    }
}
The JDK documentation calls out the thread-safety problem of SimpleDateFormat:
Date formats are not synchronized. It is recommended to create separate format instances for each thread. If multiple threads access a format concurrently, it must be synchronized externally.
So what is the cause? Let's take a brief look at the source code of SimpleDateFormat:
private StringBuffer format(Date date, StringBuffer toAppendTo,
                            FieldDelegate delegate) {
    // Convert input date to time field list
    calendar.setTime(date);

    boolean useDateFormatSymbols = useDateFormatSymbols();

    for (int i = 0; i < compiledPattern.length; ) {
        int tag = compiledPattern[i] >>> 8;
        int count = compiledPattern[i++] & 0xff;
        if (count == 255) {
            count = compiledPattern[i++] << 16;
            count |= compiledPattern[i++];
        }

        switch (tag) {
        case TAG_QUOTE_ASCII_CHAR:
            toAppendTo.append((char)count);
            break;

        case TAG_QUOTE_CHARS:
            toAppendTo.append(compiledPattern, i, count);
            i += count;
            break;

        default:
            subFormat(tag, count, delegate, toAppendTo, useDateFormatSymbols);
            break;
        }
    }
    return toAppendTo;
}

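As you can see, format() first stores the date in a Calendar object, and that Calendar object is a member field of SimpleDateFormat. The subsequent call to subFormat() uses the calendar field again. This is the root of the problem. The parse() method has the corresponding problem.
Imagine a multithreaded environment in which two threads use the same SimpleDateFormat instance: one thread may modify calendar and, immediately afterwards, the other thread modifies it as well, so when the first thread later reads calendar, it no longer holds the value that thread expects.
SimpleDateFormat is in fact stateful: it keeps its state in a Calendar member field. For parse() and format() to be thread-safe, the class would have to be stateless. Making the Calendar object a local variable and passing it as a parameter on each internal method call should be enough to achieve thread safety. The JDK implementation probably avoids this for performance reasons, since it saves the cost of creating a Calendar object on every call. In some scenarios, however, this stateful design makes the class inconvenient to use.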
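The interleaving described above can be replayed deterministically on a single thread. The sketch below is not JDK code: ToyYearFormat and SharedCalendarDemo are hypothetical names, and the toy class only mimics the two-step "store the date in a field, then read it back" shape of format(). It also shows the stateless variant with a local Calendar.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical toy class, not JDK code: like SimpleDateFormat, it keeps a
// Calendar in a field and works in two steps (store the date, read it back).
class ToyYearFormat {
    private final Calendar calendar = new GregorianCalendar(TimeZone.getTimeZone("UTC"));

    // Step 1: mirrors calendar.setTime(date) at the top of format().
    void setTime(Date date) {
        calendar.setTime(date);
    }

    // Step 2: mirrors subFormat() reading the shared calendar field.
    int readYear() {
        return calendar.get(Calendar.YEAR);
    }

    // Stateless variant: the Calendar is a local variable, so concurrent
    // callers cannot overwrite each other's date.
    static int yearOf(Date date) {
        Calendar local = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        local.setTime(date);
        return local.get(Calendar.YEAR);
    }
}

class SharedCalendarDemo {
    // Replays one unlucky interleaving of two callers on a single thread,
    // so the outcome is deterministic.
    static int yearSeenByFirstCaller() {
        ToyYearFormat shared = new ToyYearFormat();
        Date first = new Date(0L);              // 1970-01-01T00:00:00Z
        Date second = new Date(946684800000L);  // 2000-01-01T00:00:00Z

        shared.setTime(first);    // caller A, step 1
        shared.setTime(second);   // caller B, step 1, overwrites A's date
        return shared.readYear(); // caller A, step 2, now sees B's date
    }

    public static void main(String[] args) {
        System.out.println(yearSeenByFirstCaller());            // 2000, not 1970
        System.out.println(ToyYearFormat.yearOf(new Date(0L))); // 1970
    }
}
```

With real threads the same overwrite happens only occasionally, which is why the bug tends to show up as sporadic wrong dates rather than a consistent failure.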

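The simplest approach is to create a local SimpleDateFormat object every time one is needed. A local variable naturally has no thread-safety problem. But if calls are frequent, creating a new object every time is too expensive.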
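A minimal sketch of this first approach, assuming the same pattern as the example at the top; the class name LocalFormatDateUtil is made up for illustration:

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical helper: a fresh SimpleDateFormat per call, so no instance
// is ever visible to more than one thread.
class LocalFormatDateUtil {
    private static final String PATTERN = "yyyy-MM-dd HH:mm:ss";

    static String format(Date date) {
        return new SimpleDateFormat(PATTERN).format(date);
    }

    static Date parse(String dateStr) throws ParseException {
        return new SimpleDateFormat(PATTERN).parse(dateStr);
    }
}
```

Only the pattern string is shared, and a String is immutable, so sharing it is safe.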
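The second approach is to lock around the SimpleDateFormat, which ensures that only one thread holds the lock at a time and thus solves the thread-safety problem. Under heavy contention between threads, however, this approach hurts throughput.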
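A sketch of the locking approach, assuming a single shared instance guarded by its own monitor. SynchronizedDateUtil and countMismatches are hypothetical names; countMismatches exists only to exercise the helper from several threads.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: one shared instance, every access under the same lock.
class SynchronizedDateUtil {
    private static final SimpleDateFormat FORMAT = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");

    static String format(Date date) {
        synchronized (FORMAT) {
            return FORMAT.format(date);
        }
    }

    static Date parse(String dateStr) throws ParseException {
        synchronized (FORMAT) {
            return FORMAT.parse(dateStr);
        }
    }

    // Parses and re-formats the sample strings from several threads and
    // counts every round trip that fails or throws.
    static int countMismatches(int threads, int rounds) throws InterruptedException {
        String[] inputs = {"2016-01-01 10:24:00", "2016-01-02 20:48:00", "2016-01-11 12:24:00"};
        AtomicInteger mismatches = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < rounds; i++) {
                    String in = inputs[i % inputs.length];
                    try {
                        if (!format(parse(in)).equals(in)) {
                            mismatches.incrementAndGet();
                        }
                    } catch (ParseException | RuntimeException e) {
                        mismatches.incrementAndGet();
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        return mismatches.get();
    }
}
```

Both format() and parse() must take the same lock, because both mutate the shared calendar field.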
The third approach is to use ThreadLocal. ThreadLocal ensures that each thread gets its own SimpleDateFormat object, so there is no contention.
public class DateUtil {
    private static ThreadLocal<SimpleDateFormat> local = new ThreadLocal<SimpleDateFormat>() {
        @Override
        protected SimpleDateFormat initialValue() {
            return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        }
    };

    public static String format(Date date) {
        return local.get().format(date);
    }

    public static Date parse(String dateStr) throws ParseException {
        return local.get().parse(dateStr);
    }
}
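The ThreadLocal implementation resembles caching: each thread has its own dedicated object, which avoids both frequent object creation and contention between threads.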
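On Java 8 or later, the same per-thread cache can be written more compactly with ThreadLocal.withInitial. This is a sketch under that assumption; Java8DateUtil and current() are hypothetical names, and current() is exposed only so the caching can be observed.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical Java 8+ variant of the ThreadLocal-based DateUtil.
class Java8DateUtil {
    // The supplier runs at most once per thread, on that thread's first get().
    private static final ThreadLocal<SimpleDateFormat> LOCAL =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));

    static String format(Date date) {
        return LOCAL.get().format(date);
    }

    static Date parse(String dateStr) throws ParseException {
        return LOCAL.get().parse(dateStr);
    }

    // Returns the calling thread's cached instance.
    static SimpleDateFormat current() {
        return LOCAL.get();
    }
}
```

Repeated calls to current() on one thread return the same object, while a different thread gets a different one.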
The creation of the SimpleDateFormat object can also be deferred (lazy loading):
public class DateUtil {
    private static ThreadLocal<SimpleDateFormat> local = new ThreadLocal<SimpleDateFormat>();

    private static SimpleDateFormat getDateFormat() { // no synchronized needed: each thread only reads and writes its own ThreadLocal value
        SimpleDateFormat dateFormat = local.get();
        if (dateFormat == null) {
            dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
            local.set(dateFormat);
        }
        return dateFormat;
    }

    public static String format(Date date) {
        return getDateFormat().format(date);
    }

    public static Date parse(String dateStr) throws ParseException {
        return getDateFormat().parse(dateStr);
    }
}
http://stackoverflow.com/questions/3257068/is-the-java-messageformat-class-thread-safe-as-opposed-to-simpledateformat
Message formats are not synchronized. It is recommended to create separate format instances for each thread. If multiple threads access a format concurrently, it must be synchronized externally.
So officially, no - it's not thread-safe.
The docs for SimpleDateFormat say much the same thing.
Now, the docs may just be being conservative, and in practice it'll work just fine in multiple threads, but it's not worth the risk.

http://jeremymanson.blogspot.com/2010/01/note-on-thread-unsafety-of-format.html
If you create a Format object (or a MessageFormat, NumberFormat, DecimalFormat, ChoiceFormat, DateFormat or SimpleDateFormat object), it cannot be shared among threads. The above code does not share its SimpleDateFormat object among threads, so it is safe.

If the formatter field had been a static field of the class, the method would not be thread-safe. However, this way, you have to create a (relatively) expensive SimpleDateFormat object every time you invoke the method. There are many ways around this. One answer (if you want to avoid locking) is to use a ThreadLocal:

  private static final ThreadLocal<SimpleDateFormat> formatters =
    new ThreadLocal<SimpleDateFormat>() {
      @Override public SimpleDateFormat initialValue() {
        return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS");
      }
    };
  public static boolean getDate(String str, String str2) {
     SimpleDateFormat formatter = formatters.get();
     ...

http://stackoverflow.com/questions/6840803/simpledateformat-thread-safety
SimpleDateFormat stores intermediate results in instance fields. So if one instance is used by two threads they can mess each other's results.
Looking at the source code reveals that there is a Calendar instance field, which is used by operations on DateFormat / SimpleDateFormat
For example parse(..) calls calendar.clear() initially and then calendar.add(..). If another thread invokes parse(..) before the completion of the first invocation, it will clear the calendar, but the other invocation will expect it to be populated with intermediate results of the calculation.
To be honest, I don't understand why they need the instance field, but that's the way it is.
They don't need the instance field; it is undoubtedly the result of sloppy programming in a misguided attempt at efficiency. 

http://www.luyue.org/simpledateformat-thread-safety/
Use JMeter to simulate multiple threads accessing the same SimpleDateFormat object.
  • Use SimpleDateFormat as a local variable and make sure multiple threads cannot access it concurrently
  • Externally synchronize calls to the format() and parse() methods
  • Use Apache Commons Lang3 FastDateFormat
  • Use the Joda-Time library
  • Some people suggest using ThreadLocal. But you should use ThreadLocal cautiously, because improper usage could cause memory leaks.
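On the ThreadLocal caveat: one commonly cited scenario is a long-lived pooled thread that keeps holding its per-thread value after the work that needed it is finished. A hedged sketch of one mitigation follows, calling ThreadLocal.remove() when a unit of work ends. ReleasableDateFormat, release() and handleRequest() are hypothetical names, and whether this is needed depends on the environment.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical helper that lets callers drop the per-thread instance.
class ReleasableDateFormat {
    private static final ThreadLocal<SimpleDateFormat> LOCAL =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));

    // Returns the calling thread's instance, creating it on first use.
    static SimpleDateFormat current() {
        return LOCAL.get();
    }

    static String format(Date date) {
        return current().format(date);
    }

    // Drops the calling thread's instance; the next get() creates a new one.
    static void release() {
        LOCAL.remove();
    }

    // Hypothetical unit of work on a pooled thread: clean up in finally so
    // the thread does not keep the formatter after the work is done.
    static String handleRequest(Date date) {
        try {
            return format(date);
        } finally {
            release();
        }
    }
}
```

The trade-off is that releasing after every unit of work gives up part of the caching benefit, so this fits best where a task formats many dates before it finishes.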
http://www.javacodegeeks.com/2010/07/java-best-practices-dateformat-in.html
Problems arise due to the lack of synchronization in the DateFormat class. Typical exceptions thrown when parsing a string into a Date object are:
  • java.lang.NumberFormatException
  • java.lang.ArrayIndexOutOfBoundsException
Using the ThreadLocal approach without thread pools is equivalent to using the “getDateInstance(..)” approach, because every new Thread has to initialize its local DateFormat instance prior to using it; thus a new DateFormat instance will be created with every single execution.
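The difference can be made visible by counting how many formatter instances get created. The sketch below is illustrative only: PoolReuseDemo and its methods are hypothetical names, and it runs the same small task once on fresh threads and once on a fixed-size pool.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class PoolReuseDemo {
    // Counts how many SimpleDateFormat instances the ThreadLocal has created.
    private static final AtomicInteger CREATED = new AtomicInteger();

    private static final ThreadLocal<DateFormat> DF = ThreadLocal.withInitial(() -> {
        CREATED.incrementAndGet();
        return new SimpleDateFormat("yyyy MM dd");
    });

    // One new thread per task: every task pays for a new formatter.
    static int createdWithFreshThreads(int tasks) throws Exception {
        CREATED.set(0);
        for (int i = 0; i < tasks; i++) {
            Thread t = new Thread(() -> DF.get().format(new Date()));
            t.start();
            t.join();
        }
        return CREATED.get();
    }

    // Pooled threads: at most one formatter per pool thread, however many tasks run.
    static int createdWithPool(int poolSize, int tasks) throws Exception {
        CREATED.set(0);
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            Callable<String> task = () -> DF.get().format(new Date());
            List<Future<String>> results = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                results.add(pool.submit(task));
            }
            for (Future<String> r : results) {
                r.get(); // wait for completion
            }
        } finally {
            pool.shutdown();
        }
        return CREATED.get();
    }
}
```

With fresh threads the count equals the number of tasks; with a pool it is bounded by the pool size.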
public class ConcurrentDateFormatAccess {

 private ThreadLocal<DateFormat> df = new ThreadLocal<DateFormat> () {

  @Override
  public DateFormat get() {
   return super.get();
  }

  @Override
  protected DateFormat initialValue() {
   return new SimpleDateFormat("yyyy MM dd");
  }

  @Override
  public void remove() {
   super.remove();
  }

  @Override
  public void set(DateFormat value) {
   super.set(value);
  }

 };

 public Date convertStringToDate(String dateString) throws ParseException {
  return df.get().parse(dateString);
 }

}
http://li-ma.blogspot.com/2007/10/thread-safe-date-formatparser.html

http://www.cnblogs.com/xjpz/p/5083101.html
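After replacing "YYYY" with "yyyy", i.e. SimpleDateFormat("yyyy-MM-dd"), every result is correct.
A week year is in sync with a WEEK_OF_YEAR cycle. All weeks between the first and last weeks (inclusive) have the same week year value. Therefore, the first and last days of a week year may have different calendar year values.
Uppercase "YYYY" denotes the week year. From the text above, my understanding is that a week year runs from the first day of the week containing January 1 to the day before the first day of the week containing the next January 1.
Since the first day of the week is Sunday here and New Year's Day 2016 falls on a Friday (2016-01-01), week year 2016 runs from 2015-12-27 (Sunday) through 2016-12-31, because New Year's Day 2017 falls exactly on a Sunday, the first day of the week.
We can test this (the earlier test already showed that formatting 2015-12-27 with 'YYYY' yields the year 2016).
By the same reasoning, week year 2015 runs from 2014-12-28 to 2015-12-26.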
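A small check of the week-year boundary, as a sketch: it assumes Locale.US (weeks start on Sunday and the week containing January 1 is week 1) and pins the time zone to UTC so the result does not depend on the machine. WeekYearDemo and reformat() are hypothetical names.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

class WeekYearDemo {
    // Parses a calendar date written as yyyy-MM-dd and re-formats it with the given pattern.
    static String reformat(String isoDate, String pattern) throws ParseException {
        TimeZone utc = TimeZone.getTimeZone("UTC");
        SimpleDateFormat in = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
        in.setTimeZone(utc);
        SimpleDateFormat out = new SimpleDateFormat(pattern, Locale.US);
        out.setTimeZone(utc);
        return out.format(in.parse(isoDate));
    }

    public static void main(String[] args) throws ParseException {
        System.out.println(reformat("2015-12-26", "YYYY-MM-dd")); // 2015-12-26 (still week year 2015)
        System.out.println(reformat("2015-12-27", "YYYY-MM-dd")); // 2016-12-27 (week year 2016 has begun)
        System.out.println(reformat("2015-12-27", "yyyy-MM-dd")); // 2015-12-27 (calendar year)
    }
}
```

Other locales use different week rules (for example Monday as the first day of the week), so the boundary dates can differ there.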
