Saturday, October 24, 2015

Java IO Utils
Use GZIPInputStream to read gzip files and ZipInputStream to read zip files.

ByteArrayInputStream bais = new ByteArrayInputStream(responseBytes);
GZIPInputStream gzis = new GZIPInputStream(bais);
InputStreamReader reader = new InputStreamReader(gzis);
BufferedReader in = new BufferedReader(reader);

String line;
while ((line = in.readLine()) != null) {
    // process each decompressed line here
}

    public static String read(InputStream input) throws IOException {
        try (BufferedReader buffer = new BufferedReader(new InputStreamReader(input))) {
            return buffer.lines().collect(Collectors.joining("\n"));
        }
    }
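To try the GZIPInputStream chain without a real HTTP response, here is a minimal self-contained sketch. It assumes UTF-8 text, and the class name GzipRoundTrip and its gzip() helper are made up for illustration; gzip() only stands in for the responseBytes used above.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class, not part of the original post.
class GzipRoundTrip {

    // Compresses text into gzip bytes (a stand-in for responseBytes).
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return out.toByteArray();
    }

    // Decompresses gzip bytes and returns the text, lines joined by "\n".
    static String gunzip(byte[] gzipped) throws IOException {
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(gzipped)),
                StandardCharsets.UTF_8))) {
            return in.lines().collect(Collectors.joining("\n"));
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(gunzip(gzip("first line\nsecond line")));
    }
}
```

Passing the charset explicitly avoids depending on the platform default, which the shorter InputStreamReader(gzis) form relies on.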
First, create a ZipInputStream for the file you wish to expand. Then iterate with the getNextEntry method, which returns the header data for each entry in turn. Importantly, the entry does not contain the file data, which is read from the stream separately.
The java.util.zip package provides the following classes for extracting files and directories from a ZIP archive:
    • ZipInputStream: this is the main class for reading a zip file and extracting the files and directories (entries) within the archive. Here are some important usages of this class:
      • open a zip file via the constructor ZipInputStream(InputStream), for example with a FileInputStream
      • read entries of files and directories via method getNextEntry()
      • read binary data of the current entry via method read(byte[])
      • close the current entry via method closeEntry()
      • close the zip file via method close()
    • ZipEntry: this class represents an entry in the zip file. Each file or directory is represented as a ZipEntry object. Its method getName() returns a String which represents the path of the file/directory. The path is in the following form:

    // Assumed value; the definition of this constant is not shown in this excerpt.
    private static final int BUFFER_SIZE = 4096;

    public void unzip(String zipFilePath, String destDirectory) throws IOException {
        File destDir = new File(destDirectory);
        if (!destDir.exists()) {
            destDir.mkdir();
        }
        ZipInputStream zipIn = new ZipInputStream(new FileInputStream(zipFilePath));
        ZipEntry entry = zipIn.getNextEntry();
        // iterates over entries in the zip file
        while (entry != null) {
            String filePath = destDirectory + File.separator + entry.getName();
            if (!entry.isDirectory()) {
                // if the entry is a file, extracts it
                extractFile(zipIn, filePath);
            } else {
                // if the entry is a directory, make the directory
                File dir = new File(filePath);
                dir.mkdir();
            }
            zipIn.closeEntry();
            entry = zipIn.getNextEntry();
        }
        zipIn.close();
    }

    /**
     * Extracts a zip entry (file entry)
     * @param zipIn
     * @param filePath
     * @throws IOException
     */
    private void extractFile(ZipInputStream zipIn, String filePath) throws IOException {
        BufferedOutputStream bos = new BufferedOutputStream(new FileOutputStream(filePath));
        byte[] bytesIn = new byte[BUFFER_SIZE];
        int read = 0;
        // read() returns -1 at the end of the current entry
        while ((read = zipIn.read(bytesIn)) != -1) {
            bos.write(bytesIn, 0, read);
        }
        bos.close();
    }
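The getNextEntry()/read()/closeEntry() cycle described above can also be tried entirely in memory. The following is a small sketch, not the original author's code: the class ZipEntriesDemo, its sampleZip() helper and the entry names are invented for illustration, and the content is assumed to be UTF-8 text.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class, not part of the original post.
class ZipEntriesDemo {

    // Builds a small archive in memory so the example needs no files on disk.
    static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(new ZipEntry("docs/"));          // directory entry
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("docs/hello.txt")); // file entry
            zip.write("hello".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return out.toByteArray();
    }

    // Maps each file entry name to its content; directory entries are skipped.
    static Map<String, String> readFiles(byte[] zipBytes) throws IOException {
        Map<String, String> files = new LinkedHashMap<>();
        try (ZipInputStream zipIn = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zipIn.getNextEntry()) != null) {
                if (!entry.isDirectory()) {
                    ByteArrayOutputStream data = new ByteArrayOutputStream();
                    byte[] buffer = new byte[4096];
                    int read;
                    // read() returns -1 at the end of the current entry, not of the archive
                    while ((read = zipIn.read(buffer)) != -1) {
                        data.write(buffer, 0, read);
                    }
                    files.put(entry.getName(), data.toString("UTF-8"));
                }
                zipIn.closeEntry();
            }
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readFiles(sampleZip()));
    }
}
```

The try-with-resources form closes the stream even when an entry fails to read, which the explicit close() call in the longer example does not guarantee.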
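    public boolean equals(Object obj) {
        if ((obj != null) && (obj instanceof File)) {
            return compareTo((File)obj) == 0;
        }
        return false;
    }
    static private FileSystem fs = FileSystem.getFileSystem();
    public int compareTo(File pathname) {
        return fs.compare(this, pathname);
    }
The JDK source examined here has no Unix/Linux implementation of FileSystem, only Win32FileSystem, so that implementation class is the one invoked by default. Its file comparison is really just a comparison of path strings: if two File objects have the same getPath(), they are considered the same file. It also shows that Windows paths are compared case-insensitively.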
    public int compare(File f1, File f2) {
        return f1.getPath().compareToIgnoreCase(f2.getPath());
    }
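Checking whether two objects point to the same file by comparing paths works in most cases, but it needs care. On Linux, for example, file names are case-sensitive, so case cannot be ignored. In addition, a file created through a hard link is really the same underlying file, yet File.equals() returns false for it. The case-sensitive compare(File, File) looks like this: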
    public int compare(File f1, File f2) {
        return f1.getPath().compareTo(f2.getPath());
    }
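A quick way to see that File.equals() only compares path strings is to compare two spellings of the same location. This is a minimal sketch; the class name FileEqualsDemo and the file name notes.txt are invented, and the file does not need to exist.

```java
import java.io.File;

// Hypothetical demo class, not part of the original post.
class FileEqualsDemo {

    // True only when the two path strings compare equal; the disk is never consulted.
    static boolean sameByEquals(String path1, String path2) {
        return new File(path1).equals(new File(path2));
    }

    public static void main(String[] args) {
        System.out.println(sameByEquals("notes.txt", "notes.txt"));   // prints true
        System.out.println(sameByEquals("notes.txt", "./notes.txt")); // prints false
    }
}
```

Both arguments in the second call name the same file in the current directory, but their getPath() values differ, so equals() reports them as different.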
Java 7 added java.nio.file.Files.isSameFile(), which delegates to the file system provider:
    public static boolean isSameFile(Path path, Path path2) throws IOException {
        return provider(path).isSameFile(path, path2);
    }
    private static FileSystemProvider provider(Path path) {
        return path.getFileSystem().provider();
    }
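On Unix, the provider's isSameFile() compares the paths first and falls back to the file attributes (simplified excerpt):
    public boolean isSameFile(Path obj1, Path obj2) throws IOException {
        UnixPath file1 = UnixPath.toUnixPath(obj1);
        if (file1.equals(obj2))
            return true;
        // this line is missing from the excerpt and is reconstructed here;
        // the JDK checks the type of obj2 before casting
        UnixPath file2 = (UnixPath)obj2;
        file1.checkRead();file2.checkRead();
        UnixFileAttributes attrs1 = UnixFileAttributes.get(file1, true);
        UnixFileAttributes attrs2 = UnixFileAttributes.get(file2, true);
        return attrs1.isSameFile(attrs2);
    }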
UnixPath.equals() compares the raw path bytes:
    public boolean equals(Object ob) {
        if ((ob != null) && (ob instanceof UnixPath))
            return compareTo((Path)ob) == 0;    // compare two path
        return false;
    }
    public int compareTo(Path other) {
        int len1 = path.length;
        int len2 = ((UnixPath) other).path.length;
        int n = Math.min(len1, len2);
        byte v1[] = path;
        byte v2[] = ((UnixPath) other).path;
        int k = 0;
        while (k < n) {
            int c1 = v1[k] & 0xff;
            int c2 = v2[k] & 0xff;
            if (c1 != c2)
                return c1 - c2;
            k++;    // advance to the next byte; without this the loop never ends
        }
        return len1 - len2;
    }
The attribute comparison checks the inode number and the device number:
    boolean isSameFile(UnixFileAttributes attrs) {
        return ((st_ino == attrs.st_ino) && (st_dev == attrs.st_dev));
    }
Windows is much the same; see WindowsFileSystemProvider.isSameFile(), WindowsPath.equals() and WindowsFileAttributes.isSameFile().
    static boolean isSameFile(WindowsFileAttributes attrs1, WindowsFileAttributes attrs2) {
        // volume serial number and file index must be the same
        return (attrs1.volSerialNumber == attrs2.volSerialNumber) &&
            (attrs1.fileIndexHigh == attrs2.fileIndexHigh) &&
            (attrs1.fileIndexLow == attrs2.fileIndexLow);
    }
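The difference between the two approaches shows up with a hard link. The sketch below assumes the temporary directory lives on a file system that supports hard links (Files.createLink may throw UnsupportedOperationException or an IOException otherwise); the class SameFileDemo and the file names are invented for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of the original post.
class SameFileDemo {

    // Returns {File.equals result, Files.isSameFile result} for a file and a hard link to it.
    static boolean[] compareHardLink() throws IOException {
        Path dir = Files.createTempDirectory("samefile");
        Path original = Files.createFile(dir.resolve("original.txt"));
        Path link = dir.resolve("link.txt");
        try {
            Files.createLink(link, original); // may be unsupported on some file systems
            return new boolean[] {
                original.toFile().equals(link.toFile()), // compares path strings
                Files.isSameFile(original, link)         // compares file identity
            };
        } finally {
            Files.deleteIfExists(link);
            Files.deleteIfExists(original);
            Files.deleteIfExists(dir);
        }
    }

    public static void main(String[] args) throws IOException {
        boolean[] result = compareHardLink();
        System.out.println("File.equals:      " + result[0]);
        System.out.println("Files.isSameFile: " + result[1]);
    }
}
```

Where hard links are available, File.equals() is expected to report false because the two paths differ, while Files.isSameFile() reports true because both names resolve to the same underlying file.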
Scanner class (easy, less typing, but not recommended because it is very slow): in most cases we get TLE (Time Limit Exceeded) when using the Scanner class. It provides the built-in nextInt(), nextLong() and nextDouble() methods to read the desired value once a Scanner object has been created on the input stream.

The java.util.Scanner class is a simple text scanner which can parse primitive types and strings. It internally uses regular expressions to read different types. The java.io.BufferedReader class reads text from a character-input stream, buffering characters so as to provide for the efficient reading of sequences of characters.

With Scanner, if we call nextLine() right after any of the seven nextXXX() methods, nextLine() does not wait for new input from the console; it returns immediately with an empty string. The nextXXX() methods are nextInt(), nextFloat(), nextByte(), nextShort(), nextDouble(), nextLong() and next().
BufferedReader has no such problem. It occurs only with Scanner, because the nextXXX() methods leave the trailing newline character unread and nextLine() then reads only up to that newline. If we add one more call to nextLine() between nextXXX() and the intended nextLine(), the problem does not occur, because the extra call consumes the newline character. This problem is the same as scanf() followed by gets() in C/C++.
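The newline pitfall can be reproduced without a console by feeding Scanner a string. This is a minimal sketch; the class ScannerNewlineDemo, its method names and the sample input are invented for illustration.

```java
import java.util.Scanner;

// Hypothetical demo class, not part of the original post.
class ScannerNewlineDemo {

    // Reads an int, then a line, without consuming the leftover newline.
    static String naive(String input) {
        Scanner sc = new Scanner(input);
        sc.nextInt();
        return sc.nextLine(); // returns the empty remainder of the number's line
    }

    // Same, but with one extra nextLine() that discards the leftover newline.
    static String fixed(String input) {
        Scanner sc = new Scanner(input);
        sc.nextInt();
        sc.nextLine();        // consume the rest of the first line
        return sc.nextLine(); // now reads the line we actually want
    }

    public static void main(String[] args) {
        String input = "42\nAlice\n";
        System.out.println("[" + naive(input) + "]"); // prints []
        System.out.println("[" + fixed(input) + "]"); // prints [Alice]
    }
}
```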

  • BufferedReader is synchronized while Scanner is not. BufferedReader should be used if we are working with multiple threads.
  • BufferedReader has a significantly larger default buffer than Scanner: 8192 chars as opposed to Scanner's 1024 chars, although the smaller buffer is still more than enough.
  • BufferedReader is a bit faster than Scanner because Scanner parses the input data, while BufferedReader simply reads a sequence of characters.
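For number-heavy input, a common alternative to Scanner is BufferedReader combined with StringTokenizer, leaving the parsing to Long.parseLong. The following is a sketch under the assumption that the input consists of whitespace-separated integers; the class FastSum is invented for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.StringTokenizer;

// Hypothetical demo class, not part of the original post.
class FastSum {

    // Sums all whitespace-separated integers read from the given source.
    static long sum(Reader source) throws IOException {
        BufferedReader in = new BufferedReader(source);
        long total = 0;
        String line;
        while ((line = in.readLine()) != null) {
            StringTokenizer tokens = new StringTokenizer(line);
            while (tokens.hasMoreTokens()) {
                total += Long.parseLong(tokens.nextToken());
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // For console input, pass new InputStreamReader(System.in) instead.
        System.out.println(sum(new StringReader("1 2 3\n4 5\n"))); // prints 15
    }
}
```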

