Monday, July 10, 2017

Mini Inverted Index - LintCode


https://github.com/zxqiu/leetcode-lintcode/blob/master/system%20design/Inverted_Index.javaCreate an inverted index with given documents.
 Notice
Ensure that data does not include punctuation.
 Example

Given a list of documents with id and content. (class Document)
[
  {
    "id": 1,
    "content": "This is the content of document 1 it is very short"
  },
  {
    "id": 2,
    "content": "This is the content of document 2 it is very long bilabial bilabial heheh hahaha ..."
  },
]
Return an inverted index (HashMap with key is the word and value is a list of document ids).
{
   "This": [1, 2],
   "is": [1, 2],
   ...
}
解:
这道题只需要把每个document中的每个词作为key放进HashMap中,并且把对应的document id组成一个List放进HashMap的value中。
注意需要查重,避免同样的id多次放进某一个key的List里面。

 * Definition of Document:
 * class Document {
 *     public int id;
 *     public String content;
 * }
 */
public class Solution {
    /**
     * @param docs a list of documents
     * @return an inverted index
     */
    public Map<String, List<Integer>> invertedIndex(List<Document> docs) {
        Map<String, List<Integer>> ret = new HashMap<String, List<Integer>>();
        if (docs == null) {
            return ret;
        }
       
        for (Document doc : docs) {
            String[] words = doc.content.split(" ");
           
            for (String word : words) {
                if (word.length() == 0) {
                    continue;
                }
                if (ret.containsKey(word)) {
                    if (ret.get(word).contains(doc.id)) {
                        continue;
                    }
                    ret.get(word).add(doc.id);
                } else {
                    List<Integer> list = new ArrayList<Integer>();
                    list.add(doc.id);
                    ret.put(word, list);
                }
            }
        }
       
        return ret;
    }



Labels

Review (553) System Design (289) System Design - Review (189) Java (177) Coding (75) Interview-System Design (65) Interview (60) Book Notes (59) Coding - Review (59) to-do (45) Knowledge (39) Linux (38) Interview-Java (35) Knowledge - Review (32) Database (29) Design Patterns (29) Product Architecture (28) Big Data (27) Miscs (25) Concurrency (24) Cracking Code Interview (24) MultiThread (24) Soft Skills (23) Career (22) Interview - Review (21) Java - Code (21) Operating System (21) Distributed (20) Interview Q&A (20) OOD Design (20) System Design - Practice (19) Algorithm (15) How to Ace Interview (15) Security (15) Brain Teaser (14) Google (13) Linux - Shell (13) Spark (13) Spring (13) Code Quality (12) How to (12) Interview-Database (12) Interview-Operating System (12) Redis (12) Tools (12) Architecture Principles (11) Company - LinkedIn (11) Resource (10) Testing (10) Amazon (9) Cache (9) Search (9) Solr (9) Web Dev (9) Architecture Model (8) Better Programmer (8) Company - Uber (8) Interview - MultiThread (8) Java67 (8) Math (8) OO Design principles (8) SOLID (8) Scalability (8) Git (7) Interview Corner (7) JVM (7) Java Basics (7) Machine Learning (7) NoSQL (7) C++ (6) Design (6) File System (6) Highscalability (6) How to Better (6) Network (6) Restful (6) Trouble Shooting (6) CareerCup (5) Cassandra (5) Code Review (5) Company - Facebook (5) Hash (5) How to Interview (5) JDK Source Code (5) JavaScript (5) Kafka (5) Leetcode (5) Must Known (5) Be Architect (4) Big Fata (4) C (4) Company Product Architecture (4) Data structures (4) Design Principles (4) Facebook (4) GeeksforGeeks (4) Generics (4) Google Interview (4) Hardware (4) JDK8 (4) Optimization (4) Product + Framework (4) Shopping System (4) Source Code (4) Web Service (4) node.js (4) Back-of-Envelope (3) Company - Pinterest (3) Company - Twiiter (3) Company - Twitter (3) Consistent Hash (3) GOF (3) Game Design (3) GeoHash (3) Growth (3) Guava (3) Interview-Big Data (3) Interview-Linux (3) Interview-Network (3) Java EE Patterns (3) Javarevisited (3) Map Reduce (3) Math - Probabilities (3) Performance (3) Puzzles (3) Python (3) Resource-System Desgin (3) Scala (3) UML (3) geeksquiz (3) AI (2) API Design (2) AngularJS (2) Behavior Question (2) Bugs (2) Coding Interview (2) Company - Netflix (2) Crawler (2) Cross Data Center (2) Data Structure Design (2) Database-Shard (2) Debugging (2) Docker (2) Elasticsearch (2) Garbage Collection (2) Go (2) Hadoop (2) Html (2) Interview - Soft Skills (2) Interview-Miscs (2) Interview-Web (2) JDK (2) Logging (2) POI (2) Papers (2) Programming (2) Project Practice (2) Random (2) Software Desgin (2) System Design - Feed (2) Thread Synchronization (2) Video (2) ZooKeeper (2) reddit (2) Ads (1) Advanced data structures (1) Algorithm - Review (1) Android (1) Approximate Algorithms (1) Base X (1) Bash (1) Books (1) C# (1) CSS (1) Chrome (1) Client-Side (1) Cloud (1) CodingHorror (1) Company - Yelp (1) Counter (1) DSL (1) Dead Lock (1) Difficult Puzzles (1) Distributed ALgorithm (1) Eclipse (1) Facebook Interview (1) Function Design (1) Functional (1) GoLang (1) How to Solve Problems (1) ID Generation (1) IO (1) Important (1) Internals (1) Interview - Dropbox (1) Interview - Project Experience (1) Interview Tips (1) Interview-Brain Teaser (1) Interview-How (1) Interview-Mics (1) Interview-Process (1) Jeff Dean (1) Joda (1) LeetCode - Review (1) Library (1) LinkedIn (1) LintCode (1) Mac (1) Micro-Services (1) Mini System (1) MySQL (1) Nigix (1) NonBlock (1) Process (1) Productivity (1) Program Output (1) Programcreek (1) Quora (1) RPC (1) Raft (1) RateLimiter (1) Reactive (1) Reading (1) Reading Code (1) Refactoring (1) Resource-Java (1) Resource-System Design (1) Resume (1) SQL (1) Sampling (1) Shuffle (1) Slide Window (1) Spotify (1) Stability (1) Storm (1) Summary (1) System Design - TODO (1) Tic Tac Toe (1) Time Management (1) Web Tools (1) algolist (1) corejavainterviewquestions (1) martin fowler (1) mitbbs (1)

Popular Posts