Dropbox Interview Question: Find all duplicate files by c... | Glassdoor
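Given a file path, group all identical files under it and output their
paths as a List<List<String>>.
Two files are identical if they match byte for byte.
Identical files do not necessarily share a file name, and the path may contain subfolders.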
Your solution needs to tackle a few problems: obtaining a list of
all the files in the file system (e.g. via DFS), binning the files into
groups of possible matches, and repeating the binning with swappable
heuristics until you are 100% certain (e.g. size first, MD5 second,
byte-by-byte comparison third).
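The steps above can be sketched roughly as follows. This is only one possible reading of the answer, not a reference solution: the function names (walk_files, refine, split_by_bytes, find_duplicates) are made up for illustration, symlinks are skipped as an assumption, and the result is a Python list of lists of path strings standing in for List<List<String>>.

```python
import hashlib
import os
from collections import defaultdict


def walk_files(root):
    """Yield every regular file under root using an explicit-stack DFS."""
    stack = [root]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                # Assumption: symlinks are ignored to avoid cycles.
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    yield entry.path


def md5_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so it never has to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_bytes(path_a, path_b, chunk_size=1 << 20):
    """Compare two files chunk by chunk, stopping at the first difference."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a, b = fa.read(chunk_size), fb.read(chunk_size)
            if a != b:
                return False
            if not a:  # both streams ended at the same point
                return True


def refine(groups, key):
    """Split each candidate group by key(path) and drop groups of one."""
    refined = []
    for group in groups:
        bins = defaultdict(list)
        for path in group:
            bins[key(path)].append(path)
        refined.extend(g for g in bins.values() if len(g) > 1)
    return refined


def split_by_bytes(group):
    """Partition a group into classes of byte-identical files."""
    classes = []
    for path in group:
        for cls in classes:
            if same_bytes(cls[0], path):
                cls.append(path)
                break
        else:
            classes.append([path])
    return [c for c in classes if len(c) > 1]


def find_duplicates(root):
    groups = [list(walk_files(root))]
    # Cheap heuristics first: each pass only shrinks the candidate groups.
    for key in (os.path.getsize, md5_of):
        groups = refine(groups, key)
    # The final byte comparison is what makes the answer certain.
    result = []
    for group in groups:
        result.extend(split_by_bytes(group))
    return result
```

Because refine takes the key as a parameter, a heuristic can be swapped in or out by editing the tuple in find_duplicates. Only the last pass guarantees equality; the earlier ones exist to avoid reading files that cannot match.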
Follow-up question: what if the files are very big and MD5 is too slow?
Randomly sample parts of the file and compare those instead of hashing the whole file.
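A minimal sketch of the sampling idea, under some assumptions not stated in the original: the name sampled_fingerprint and its parameters are hypothetical, and the random offsets are seeded by file size so that two files of the same size are always sampled at the same positions (otherwise their fingerprints would not be comparable). Sampling can only rule files out; files whose samples match still need a full byte comparison to be certain.

```python
import hashlib
import os
import random


def sampled_fingerprint(path, samples=16, block_size=4096):
    """Hash a few pseudo-random blocks of a file instead of all of it."""
    size = os.path.getsize(path)
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Small files: sampling saves nothing, so hash everything.
        if size <= samples * block_size:
            digest.update(f.read())
            return digest.hexdigest()
        # Seeding by size gives every same-sized file identical offsets.
        rng = random.Random(size)
        offsets = sorted(
            rng.randrange(size - block_size + 1) for _ in range(samples)
        )
        for offset in offsets:
            f.seek(offset)
            digest.update(f.read(block_size))
    return digest.hexdigest()
```

With the defaults this reads at most 64 KiB per file regardless of its size, so it could serve as a cheap binning key between the size check and the full comparison.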
Read full article from Dropbox Interview Question: Find all duplicate files by c... | Glassdoor