Monday, July 10, 2017

Mini Inverted Index - LintCode



https://github.com/zxqiu/leetcode-lintcode/blob/master/system%20design/Inverted_Index.javaCreate an inverted index with given documents.
 Notice
Ensure that data does not include punctuation.
 Example

Given a list of documents with id and content. (class Document)
[
  {
    "id": 1,
    "content": "This is the content of document 1 it is very short"
  },
  {
    "id": 2,
    "content": "This is the content of document 2 it is very long bilabial bilabial heheh hahaha ..."
  },
]
Return an inverted index (HashMap with key is the word and value is a list of document ids).
{
   "This": [1, 2],
   "is": [1, 2],
   ...
}
解:
这道题只需要把每个document中的每个词作为key放进HashMap中,并且把对应的document id组成一个List放进HashMap的value中。
注意需要查重,避免同样的id多次放进某一个key的List里面。

 * Definition of Document:
 * class Document {
 *     public int id;
 *     public String content;
 * }
 */
public class Solution {
    /**
     * @param docs a list of documents
     * @return an inverted index
     */
    public Map<String, List<Integer>> invertedIndex(List<Document> docs) {
        Map<String, List<Integer>> ret = new HashMap<String, List<Integer>>();
        if (docs == null) {
            return ret;
        }
       
        for (Document doc : docs) {
            String[] words = doc.content.split(" ");
           
            for (String word : words) {
                if (word.length() == 0) {
                    continue;
                }
                if (ret.containsKey(word)) {
                    if (ret.get(word).contains(doc.id)) {
                        continue;
                    }
                    ret.get(word).add(doc.id);
                } else {
                    List<Integer> list = new ArrayList<Integer>();
                    list.add(doc.id);
                    ret.put(word, list);
                }
            }
        }
       
        return ret;
    }


Labels

Review (572) System Design (334) System Design - Review (198) Java (189) Coding (75) Interview-System Design (65) Interview (63) Book Notes (59) Coding - Review (59) to-do (45) Linux (43) Knowledge (39) Interview-Java (35) Knowledge - Review (32) Database (31) Design Patterns (31) Big Data (29) Product Architecture (28) MultiThread (27) Soft Skills (27) Concurrency (26) Cracking Code Interview (26) Miscs (25) Distributed (24) OOD Design (24) Google (23) Career (22) Interview - Review (21) Java - Code (21) Operating System (21) Interview Q&A (20) System Design - Practice (20) Tips (19) Algorithm (17) Company - Facebook (17) Security (17) How to Ace Interview (16) Brain Teaser (14) Linux - Shell (14) Redis (14) Testing (14) Tools (14) Code Quality (13) Search (13) Spark (13) Spring (13) Company - LinkedIn (12) How to (12) Interview-Database (12) Interview-Operating System (12) Solr (12) Architecture Principles (11) Resource (10) Amazon (9) Cache (9) Git (9) Interview - MultiThread (9) Scalability (9) Trouble Shooting (9) Web Dev (9) Architecture Model (8) Better Programmer (8) Cassandra (8) Company - Uber (8) Java67 (8) Math (8) OO Design principles (8) SOLID (8) Design (7) Interview Corner (7) JVM (7) Java Basics (7) Kafka (7) Mac (7) Machine Learning (7) NoSQL (7) C++ (6) Chrome (6) File System (6) Highscalability (6) How to Better (6) Network (6) Restful (6) CareerCup (5) Code Review (5) Hash (5) How to Interview (5) JDK Source Code (5) JavaScript (5) Leetcode (5) Must Known (5) Python (5)

Popular Posts