Wednesday, February 5, 2020

Java Stream Advanced Usage



https://stackoverflow.com/questions/36255007/is-there-any-way-to-reuse-a-stream
A stream should be operated on (invoking an intermediate or terminal stream operation) only once.
A stream implementation may throw IllegalStateException if it detects that the stream is being reused.
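For example, invoking a second terminal operation on the same stream throws; a common workaround is to hand out a fresh stream per use via a Supplier. A minimal sketch:

     // requires: java.util.function.Supplier, java.util.stream.Stream
     Stream<String> s = Stream.of("a", "b", "c");
     s.forEach(System.out::println);   // first terminal operation: fine
     // s.count();                     // would throw IllegalStateException:
                                       // "stream has already been operated upon or closed"

     // Workaround: a Supplier produces a fresh stream for each use
     Supplier<Stream<String>> supplier = () -> Stream.of("a", "b", "c");
     supplier.get().forEach(System.out::println);
     long count = supplier.get().count();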

https://stackoverflow.com/questions/38963338/stream-way-to-get-index-of-first-element-matching-boolean
Streams and indexing don't mix well. You're usually better off falling back to an old-style loop at that point.

// users is assumed to be a List<String> of names; find the index of the first match
OptionalInt indexOpt = IntStream.range(0, users.size())
     .filter(i -> searchName.equals(users.get(i)))
     .findFirst();


https://rules.sonarsource.com/java/RSPEC-3864
  • As long as a stream implementation can reach the final step, it can freely optimize processing by only producing some elements or even none at all (e.g. relying on other collection methods for counting elements). Accordingly, the peek() action will be invoked for fewer elements or not at all.
http://marxsoftware.blogspot.com/2018/06/peeking-inside-java-streams.html

https://www.baeldung.com/java-streams-peek-api
The reason peek() didn't work in our first example is that it's an intermediate operation and we didn't apply a terminal operation to the pipeline. 
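A minimal sketch of that pitfall: with no terminal operation the pipeline never executes, so peek() prints nothing; any terminal operation makes it run:

     Stream.of("one", "two", "three")
         .peek(System.out::println);    // prints nothing: no terminal operation

     Stream.of("one", "two", "three")
         .peek(System.out::println)
         .forEach(e -> {});             // prints all three elements: forEach runs the pipeline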

peek()'s Javadoc page says: "This method exists mainly to support debugging, where you want to see the elements as they flow past a certain point in a pipeline".

On top of that, peek() can be useful in another scenario: when we want to alter the inner state of an element. For example, let's say we want to convert all users' names to lowercase before printing them:
Stream<User> userStream = Stream.of(new User("Alice"), new User("Bob"), new User("Chuck"));
userStream.peek(u -> u.setName(u.getName().toLowerCase()))
  .forEach(System.out::println);
Alternatively, we could have used map(), but peek() is more convenient since we don't want to replace the element.
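For comparison, the map()-based alternative would replace each element with a new object instead of mutating it in place (a sketch reusing the User class from the snippet above):

     userStream.map(u -> new User(u.getName().toLowerCase()))
         .forEach(System.out::println);

Mutating elements inside peek() is still a side effect, which is the kind of usage the Sonar rule above warns about, since the action may run for fewer elements than expected.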

https://mkyong.com/java8/java-8-stream-the-peek-is-not-working-with-count/
Refer to the Java 9 count() Javadoc:
An implementation may choose to not execute the stream pipeline (either sequentially or in parallel) 
if it is capable of computing the count directly from the stream source. 
In such cases no source elements will be traversed and no intermediate operations will be evaluated. 

Since Java 9, if the stream implementation can compute the count directly from the stream source (an optimization added in Java 9), it doesn't traverse the stream at all, so peek() is never run.
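A minimal sketch of both behaviors on Java 9+ (the sized List source lets count() bypass the pipeline, while an intervening filter forces the pipeline to run):

     List<String> list = Arrays.asList("a", "b", "c");   // java.util.Arrays, java.util.List

     long total = list.stream()
         .peek(System.out::println)   // never invoked on Java 9+: the count comes from the source's size
         .count();                    // 3

     long nonEmpty = list.stream()
         .filter(s -> !s.isEmpty())   // the count can no longer be derived from the source...
         .peek(System.out::println)   // ...so the pipeline runs and peek prints a, b, c
         .count();                    // 3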

https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/stream/Stream.html#peek(java.util.function.Consumer)
This method exists mainly to support debugging, where you want to see the elements as they flow past a certain point in a pipeline:

     Stream.of("one", "two", "three", "four")
         .filter(e -> e.length() > 3)
         .peek(e -> System.out.println("Filtered value: " + e))
         .map(String::toUpperCase)
         .peek(e -> System.out.println("Mapped value: " + e))
         .collect(Collectors.toList());
 
In cases where the stream implementation is able to optimize away the production of some or all the elements (such as with short-circuiting operations like findFirst, or in the example described in count()), the action will not be invoked for those elements.
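A sketch of the short-circuiting case: with findFirst(), an upstream peek() only sees elements up to the first match:

     Stream.of("one", "two", "three", "four")
         .peek(e -> System.out.println("Peeked: " + e))
         .filter(e -> e.length() > 3)
         .findFirst();
     // prints "Peeked: one", "Peeked: two", "Peeked: three"; "four" is never produced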

Don't use the API in an unintended way, even if it accomplishes your immediate goal. That approach may break in the future, and it is also unclear to future maintainers.
Further, while streams guarantee maintaining the encounter order for certain combinations of operations even for parallel streams, these guarantees do not apply to peek. When collecting into a list, the resulting list will have the right order for ordered parallel streams, but the peek action may get invoked in an arbitrary order and concurrently.
So the most useful thing you can do with peek is to find out whether a stream element has been processed, which is exactly what the API documentation says:
This method exists mainly to support debugging, where you want to see the elements as they flow past a certain point in a pipeline

1. Performance – you will lose on it
2. Readability – for most people, at least
3. Maintainability



The streams framework does not (and cannot) enforce any of these. If the computation is not independent, then running it in parallel will not make any sense and might even be harmfully wrong. The other criteria stem from three engineering issues and tradeoffs:
Splittability
The most efficiently splittable collections include ArrayLists and {Concurrent}HashMaps, as well as plain arrays (i.e., those of form T[], split using static java.util.Arrays methods). The least efficient are LinkedLists, BlockingQueues, and most IO-based sources. Others are somewhere in the middle. (Data structures tend to be efficiently splittable if they internally support random access, efficient search, or both.) If it takes longer to partition data than to process it, the effort is wasted. So, if the Q factor of computations is high enough, you may get a parallel speedup even for a LinkedList, but this is not very common. Additionally, some sources cannot be split completely down to single elements, so there may be limits in how finely tasks are partitioned.
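To make splittability concrete, a small sketch: an ArrayList spliterator splits in half by index in O(1), which is what lets the framework partition work cheaply and evenly:

     List<Integer> data = new ArrayList<>();                    // java.util.*
     for (int i = 0; i < 1_000_000; i++) data.add(i);

     Spliterator<Integer> right = data.spliterator();
     Spliterator<Integer> left = right.trySplit();              // splits by index: cheap and balanced
     System.out.println(left.estimateSize() + " / " + right.estimateSize());   // 500000 / 500000

A LinkedList spliterator, by contrast, has to walk nodes to hand off a batch, so partitioning cost grows with the data and parallel speedups are rarer there.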

https://www.baeldung.com/guava-21-new
https://guava.dev/releases/23.0/api/docs/com/google/common/collect/Streams.html
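Relevant to the indexing question above: since Guava 21, com.google.common.collect.Streams offers mapWithIndex, which pairs each element with its position. A sketch:

     Streams.mapWithIndex(
             Stream.of("a", "b", "c"),
             (str, index) -> index + ":" + str)
         .forEach(System.out::println);   // prints 0:a, 1:b, 2:c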


https://docs.oracle.com/javase/9/docs/api/java/util/stream/Stream.html#count--
For example, the following will accumulate strings into an ArrayList:

     List<String> asList = stringStream.collect(ArrayList::new, ArrayList::add,
                                                ArrayList::addAll);
 
The following will take a stream of strings and concatenate them into a single string:

     String concat = stringStream.collect(StringBuilder::new, StringBuilder::append,
                                          StringBuilder::append)
                                 .toString();
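
These three-argument collect calls are the low-level mutable-reduction form; the same results are usually written with the predefined collectors (note each terminal operation needs its own fresh stream, per the reuse rule at the top):

     List<String> asList = stringStream.collect(Collectors.toList());
     String concat = otherStringStream.collect(Collectors.joining());   // a second, fresh stream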
 
