Tuesday, January 13, 2015

Solr Security



http://java.dzone.com/articles/custom-security-filtering-solr
Post filtering
Even without caching, filter sets are generated in advance by default. In some cases, generating a filter set is prohibitively expensive. One example is access control filtering that must take the user’s query context into account to decide which documents may be returned. Ideally, only matching documents (those that match the query and the straightforward filters) should be evaluated for access control. Evaluating documents that would not match anyway is wasteful.
  • Documents have an “access control list” associated with them, specifying allowed and disallowed users as well as allowed and disallowed groups.
  • The access control list is an ordered list of allowed/disallowed users and groups.  Order matters, such that the first matching rule determines access.
  • If no rule granting access is found, the document is not allowed.
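The three rules above can be sketched as a first-match evaluation. This is only an illustration of the logic, not the article's implementation: the rule shape `(effect, kind, name)` and the function name `is_allowed` are assumptions made for this example.

```python
from typing import Iterable, List, Tuple

# Hypothetical rule shape: (effect, kind, name), e.g. ("allow", "group", "staff").
Rule = Tuple[str, str, str]


def is_allowed(acl: List[Rule], user: str, groups: Iterable[str]) -> bool:
    """Return the effect of the first rule matching the user or one of their groups."""
    group_set = set(groups)
    for effect, kind, name in acl:
        matches = (kind == "user" and name == user) or (
            kind == "group" and name in group_set
        )
        if matches:
            # Order matters: the first matching rule decides.
            return effect == "allow"
    # No matching rule at all: deny by default.
    return False
```

With the list `[("deny", "user", "bob"), ("allow", "group", "staff")]`, a user bob in group staff is rejected because the deny rule comes first, while any other staff member is accepted.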
Solr has a relatively new PostFilter capability that allows this final filtering check to run on the fly, only for documents that already match.
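The idea of post filtering can be shown with a small language-neutral sketch. It is not Solr's Java API. The function `search_with_post_filter` and its callback parameters are hypothetical names, chosen only to show that the costly check runs on documents that already matched.

```python
from typing import Callable, Iterable, List


def search_with_post_filter(
    doc_ids: Iterable[int],
    matches_query: Callable[[int], bool],
    is_permitted: Callable[[int], bool],
) -> List[int]:
    """Run the cheap query match first, then the costly check on survivors only."""
    results = []
    for doc_id in doc_ids:
        if not matches_query(doc_id):
            continue  # never pay for the access check on non-matching documents
        if is_permitted(doc_id):
            results.append(doc_id)
    return results
```

If ten documents exist and five match the query, `is_permitted` is called five times rather than ten, which is the saving the text describes.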
In this implementation, the access control rules are specified entirely on each document, in the acl field. To filter by these rules efficiently at query time, Lucene’s FieldCache is used. Building the FieldCache data structure has an upfront cost in time and RAM, and that is what makes it fast to access at query time. Wherever FieldCache is used (sorting, some faceting implementations, function queries, and this custom query parser), it is wise to configure appropriate warming queries so that the FieldCache entries are built at commit time rather than making end users wait longer at query time.
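The trade-off described here (pay once to build, then read cheaply) can be illustrated with a toy cache. This is a conceptual stand-in, not Lucene's FieldCache; the class `DocValueCache`, its `warm` method, and the `stored_docs` input are all assumptions made for this sketch.

```python
from typing import Dict, List, Optional


class DocValueCache:
    """Toy stand-in for a per-field cache: one array slot per document id."""

    def __init__(self, stored_docs: Dict[int, str]) -> None:
        self._stored_docs = stored_docs  # hypothetical source of acl field values
        self._values: Optional[List[str]] = None
        self.build_count = 0

    def warm(self) -> None:
        """Build the array once; later calls are no-ops."""
        if self._values is not None:
            return
        size = max(self._stored_docs, default=-1) + 1
        values = [""] * size
        for doc_id, acl in self._stored_docs.items():
            values[doc_id] = acl
        self._values = values
        self.build_count += 1

    def get(self, doc_id: int) -> str:
        """Constant-time lookup; builds lazily if nobody warmed the cache."""
        self.warm()
        assert self._values is not None
        return self._values[doc_id]
```

Calling `warm()` right after the data changes plays the role of a warming query: the build cost is paid before the first user lookup instead of during it.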
To make it easy to present, a quick and dirty Velocity template, ids.vm, was added to the conf/velocity directory:
And finally let’s see the results, using the base request of http://localhost:8983/solr/select?q=*:*&wt=velocity&v.template=ids, 


https://trello.com/c/5z5PpR4r/50-design-solr-document-level-security-filter-solution
Document Level Security

ManifoldCF (Connector Framework)

One way to add document level security to your search is through Apache ManifoldCF. ManifoldCF "defines a security model for target repositories that permits them to enforce source-repository security policies".

It works by adding security tokens from the source repositories as metadata on the indexed documents. Then, at query time, a Search Component adds a filter to all queries, matching only documents the logged-in user is allowed to see. ManifoldCF supports AD security out of the box.
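The query-time step can be sketched as building a filter query from the user's security tokens. This is a simplification under stated assumptions: the function `build_token_filter` is hypothetical, the field names `allow_token_document` and `deny_token_document` are assumed defaults that a real deployment may name differently, and the actual ManifoldCF search component handles more cases than shown here.

```python
from typing import Iterable


def build_token_filter(
    tokens: Iterable[str],
    allow_field: str = "allow_token_document",  # assumed field name
    deny_field: str = "deny_token_document",  # assumed field name
) -> str:
    """Build a filter: must carry an allowed token, must not carry a denied one."""
    quoted = ['"' + t.replace("\\", "\\\\").replace('"', '\\"') + '"' for t in tokens]
    if not quoted:
        # Refuse to build a filter for a user with no tokens; the caller decides.
        raise ValueError("user has no security tokens")
    clause = " OR ".join(quoted)
    return f"+{allow_field}:({clause}) -{deny_field}:({clause})"
```

The resulting string would be passed as an `fq` parameter alongside the user's query, so the restriction applies regardless of what the user searched for.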


Path Based Authentication
  <requestHandler name="/instock" class="solr.DisMaxRequestHandler" >
    <lst name="appends">
      <str name="fq">inStock:true</str>
    </lst>
    <lst name="invariants">
      <str name="facet.field">cat</str>
    </lst>
  </requestHandler>
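How the two sections of this handler treat client parameters can be sketched as follows. The sketch models the documented behavior in plain Python: `appends` values are added to whatever the client sent, and `invariants` values replace it. The function `effective_params` and the `Params` type are hypothetical names, not part of Solr.

```python
from typing import Dict, List

Params = Dict[str, List[str]]


def effective_params(request: Params, appends: Params, invariants: Params) -> Params:
    """Combine request parameters with handler-level appends and invariants."""
    merged: Params = {name: list(values) for name, values in request.items()}
    for name, values in appends.items():
        # appends: added alongside whatever the client sent
        merged.setdefault(name, []).extend(values)
    for name, values in invariants.items():
        # invariants: replace the client's value outright
        merged[name] = list(values)
    return merged
```

For the /instock handler above, a client sending its own `fq` and `facet.field` still gets `inStock:true` added as an extra filter, and its `facet.field` is overridden with `cat`.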

Authentication: You have to authenticate to access any path starting with "/core1/". Other paths can be accessed without authenticating. Authentication will have to be performed against a "realm" called "Test Realm". The "realm" will verify credentials


