Wednesday, October 5, 2016

Learning Scala Part 2



http://www.scala-lang.org/old/node/7532
Check scala version at runtime (for debug)
scala.util.Properties.versionString
http://stackoverflow.com/questions/31818279/scala-regex-example-not-working
I tried it on Scala 2.11.6 and everything works fine. It could be connected to the Scala version.
The best action would be to update to 2.11.x. If you are stuck with 2.10, you could try 2.10.5.
http://rnowling.github.io/software/engineering/2015/07/01/gotcha-scala-collections.html
In my initial implementation, I used Scala’s version of a mutable vector, ArrayBuffer. Like Java’s ArrayList and Vector and Python’s list, ArrayBuffer is backed by an array (a contiguous block of memory). Using an array enables ArrayBuffer to provide random-access reads and writes in O(1) operations, but searching, insertion, and deletion require O(N) operations. ArrayBuffer also tends to be very cache friendly for iteration since adjacent elements are contiguous in memory.
ArrayBuffer can be a poor choice when dealing with a large number (10k or more) of elements. While expanding the array, vectors use O(3N) memory. Even worse, a vector can cause significant heap fragmentation, leading to very expensive allocations that require garbage collecting and compacting the heap.

Since I didn’t need random access, I was able to use a ListBuffer, which is backed by a linked list. Linked lists sacrifice constant-time random access and locality but don’t require contiguous blocks of memory. Using a ListBuffer improved my run times (but not considerably) and enabled me to read all ~100 GB of my data into memory.
I profiled my application with YourKit and realized that my application was spending significant amounts of time deleting/appending at the end and in calls to size in a loop. I was used to Java’s LinkedList and Python’s deque, which provide several enhancements over Scala’s ListBuffer. Both implementations use doubly-linked lists and maintain pointers to both the first and last links, enabling constant-time pushes, pops, and peeks on both ends, whereas ListBuffer requires O(N) operations for pushes, pops, and peeks at the end of the list. Additionally, LinkedList and deque store a counter which is updated whenever elements are added or removed, enabling constant-time reporting of the list’s length. ListBuffer recomputes the length of the list every time size() is called, requiring O(N) operations. As a result of their implementations, LinkedList and deque can be used equally efficiently as lists, stacks, and queues, whereas Scala’s mutable collections library contains separate implementations for each.
In the end, I was able to restructure my calculations to work around ListBuffer’s inefficiencies. I ended up using a Stack and built my list in reverse. Once built, I called reverse to get the proper order. I also moved the call to size outside of the loop.
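The build-in-reverse trick can be sketched with a plain immutable List (a minimal sketch, not the original code): prepending is O(1), so build the list backwards and reverse it once at the end instead of paying O(n) for every append.

```scala
// Build backwards with O(1) prepends, then reverse once at the end.
var acc: List[Int] = Nil
for (i <- 1 to 5) {
  acc = i :: acc            // constant-time prepend
}
val result = acc.reverse    // one O(n) pass restores the original order
// result == List(1, 2, 3, 4, 5)
```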
http://stackoverflow.com/questions/2712877/difference-between-array-and-list-in-scala
While an Array[A] is literally a Java array, a List[A] is an immutable data structure that is either Nil (the empty list) or consists of a pair (A, List[A]).
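That definition can be seen directly with pattern matching: a non-empty List deconstructs into its head element and tail List.

```scala
// A List is either Nil or a pair of a head element and a tail List.
val xs: List[Int] = 1 :: 2 :: 3 :: Nil
val described = xs match {
  case Nil          => "empty"
  case head :: tail => s"pair of $head and $tail"
}
// described == "pair of 1 and List(2, 3)"
```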
http://tutorials.jenkov.com/scala/exception-try-catch-finally.html
import java.io.IOException

try {
  throwsException();
} catch {
  case e: IllegalArgumentException => println("illegal arg. exception");
  case e: IllegalStateException    => println("illegal state exception");
  case e: IOException              => println("IO exception");
} finally {
  println("this code is always executed");
}

http://tutorials.jenkov.com/scala/for.html
for(i <- 1 to 10) {
    println("i is " + i);
}
The i <- construct is called a generator. For each iteration it initializes the val i with a value.
The 1 to 10 is actually a method call that returns the Range type. It is equivalent to
(1).to(10);

to vs. until

You can use either the keyword to or until when creating a Range object. The difference is that to includes the last value in the range, whereas until leaves it out.

for(i <- 1 to 10) {
    println("i is " + i);
}

for(i <- 1 until 10) {
    println("i is " + i);
}
The first loop iterates 10 times, from 1 to 10 including 10.
The second loop iterates 9 times, from 1 to 9, excluding the upper boundary value 10.

var myArray : Array[String] = new Array[String](10);

for(i <- 0 until myArray.length){
    myArray(i) = "value is: " + i;
}

for(value : String <- myArray ) {
    println(value);
}
for(value : String <- myArray if value.endsWith("5")) {
    println(value);
}


for(value : String <- myArray
    if value.endsWith("5");
    if value.indexOf("value") != -1 ) {

    println(value);
}
For nested iteration (this example assumes myArray is an Array[Array[String]]), you can nest for loops:
for(anArray : Array[String] <- myArray) {
    for(aString : String <- anArray) {
        println(aString);
    }
}

It is possible to bind values to a variable in the middle of a nested iteration, like this:
for(anArray : Array[String] <- myArray;
    aString : String        <- anArray;
    aStringUC = aString.toUpperCase();
    if aStringUC.indexOf("VALUE") != -1;
    if aStringUC.indexOf("5") != -1
    ) {

    println(aString);
}
http://stackoverflow.com/questions/3127208/when-is-a-return-type-required-for-methods-in-scala
The Scala compiler can often infer return types for methods, but there are some circumstances where it's required to specify the return type. Recursive methods, for example, require a return type to be specified.
I notice that sometimes I get the error message "overloaded method (methodname) requires return type", but it's not a general rule that return types must always be specified for overloaded methods (I have examples where I don't get this error).

Chapter 2, “Type Less, Do More”, of the Programming Scala book mentions when explicit type annotations are required.
In practical terms, you have to provide explicit type annotations for the following situations:
Method return values in the following cases:
  • When you explicitly call return in a method (even at the end).
  • When a method is recursive.
  • When a method is overloaded and one of the methods calls another. The calling method needs a return type annotation.
  • When the inferred return type would be more general than you intended, e.g., Any.
Example:
// code-examples/TypeLessDoMore/method-nested-return-script.scala
// ERROR: Won't compile until you put a String return type on upCase.

def upCase(s: String) = {
  if (s.length == 0)
    return s    // ERROR - forces return type of upCase to be declared.
  else
    s.toUpperCase()
}
Overloaded methods can sometimes require an explicit return type. When one such method calls another, we have to add a return type to the one doing the calling, as in this example.
// code-examples/TypeLessDoMore/method-overloaded-return-script.scala
// Version 1 of "StringUtil" (with a compilation error).
// ERROR: Won't compile: needs a String return type on the second "joiner".

object StringUtil {
  def joiner(strings: List[String], separator: String): String =
    strings.mkString(separator)

  def joiner(strings: List[String]) = joiner(strings, " ")   // ERROR
}
import StringUtil._  // Import the joiner methods.

println( joiner(List("Programming", "Scala")) )
The two joiner methods concatenate a List of strings together.
The first method also takes an argument for the separator string.
The second method calls the first with a “default” separator of a single space.
If you run this script, you get the following error.
... 9: error: overloaded method joiner needs result type
def joiner(strings: List[String]) = joiner(strings, " ")
Since the second joiner method calls the first, it requires an explicit String return type. It should look like this:
def joiner(strings: List[String]): String = joiner(strings, " ")

Basically, specifying the return type can be a good practice even though Scala can infer it.

Randall Schulz comments:
As a matter of (my personal) style, I give explicit return types for all but the most simple methods (basically, one-liners with no conditional logic).
Keep in mind that if you let the compiler infer a method's result type, it may well be more specific than you want. (E.g., HashMap instead of Map.)
And since you may want to expose the minimal interface in your return type (see for instance this SO question), this kind of inference might get in the way.

And about the last scenario ("When the inferred return type would be more general than you intended"), Ken Bloom adds:
specify the return type when you want the compiler to verify that code in the function returns the type you expected

http://docs.scala-lang.org/overviews/parallel-collections/overview.html
list.par.map(_ + 42)
http://docs.scala-lang.org/overviews/parallel-collections/conversions.html
mutable
  Array   → ParArray
  HashMap → ParHashMap
  HashSet → ParHashSet
  TrieMap → ParTrieMap
immutable
  Vector  → ParVector
  Range   → ParRange
  HashMap → ParHashMap
  HashSet → ParHashSet
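A minimal sketch of these conversions, assuming a Scala version where parallel collections ship with the standard library (before 2.13, when they moved to a separate module): .par gives the parallel counterpart, and .seq converts back.

```scala
// .par converts a sequential collection to its parallel counterpart,
// .seq converts back (names per the table above).
val v  = Vector(1, 2, 3, 4)
val pv = v.par               // ParVector
val doubled = pv.map(_ * 2)  // may run on multiple threads; order is preserved
val back = doubled.seq       // back to a sequential collection
```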

http://stackoverflow.com/questions/9535821/scala-mutable-var-method-parameter-reference

http://stackoverflow.com/questions/3954278/mutable-method-parameters-in-scala
You can't.
You'll have to declare an extra var (or use a more functional style :-)).
http://alvinalexander.com/scala/how-to-use-stream-class-lazy-list-scala-cookbook
Just like a List can be constructed with ::, a Stream can be constructed with the #:: method, using Stream.empty at the end of the expression instead of Nil:
scala> val stream = 1 #:: 2 #:: 3 #:: Stream.empty
stream: scala.collection.immutable.Stream[Int] = Stream(1, ?)
The REPL output shows that the stream begins with the number 1 but uses a ? to denote the end of the stream. This is because the end of the stream hasn’t been evaluated yet. For example, given a Stream:
scala> val stream = (1 to 100000000).toStream
stream: scala.collection.immutable.Stream[Int] = Stream(1, ?)
As discussed in Recipe 10.24, “Creating a Lazy View on a Collection”, transformer methods are computed lazily, so when transformers are called, you see the familiar ? character that indicates the end of the stream hasn’t been evaluated yet:
stream(0)  // returns 1
stream(1)  // returns 2
// ...
stream(10)  // returns 11
http://stackoverflow.com/questions/8566728/using-streams-for-iteration-in-scala
val naturals = Stream.from(0) // 0, 1, 2, ...
val odds = naturals.map(_ * 2 + 1) // 1, 3, 5, ...
val oddInverses = odds.map(1.0d / _) // 1/1, 1/3, 1/5, ...
val alternations = Stream.iterate(1)(-_) // 1, -1, 1, ...
val products = (oddInverses zip alternations)
      .map(ia => ia._1 * ia._2) // 1/1, -1/3, 1/5, ...

// Computes a stream representing the cumulative sum of another one
def sumUp(s : Stream[Double], acc : Double = 0.0d) : Stream[Double] =
  Stream.cons(s.head + acc, sumUp(s.tail, s.head + acc))

val pi = sumUp(products).map(_ * 4.0) // Approximations of pi.
http://daily-scala.blogspot.com/2010/01/streams-2-stream-construction.html
import Stream._

// create an infinite stream starting at 10
scala> from(10) take 3 foreach println

// an infinite stream starting at 10 and increasing by 3
scala> from(10, 3) take 3 foreach println

// converting an iterator to a stream
scala> (1 until 4).iterator.toStream foreach println

// converting an Iterable to a stream
scala> (1 until 4).toStream foreach println

// the signature is iterate(start)(elem): starting at 3, apply the function
// to the previous value in the stream. This is an infinite stream.
scala> iterate(3){i => i - 10} take 5 foreach println

scala> range(1, 3) foreach println

scala> iterate(3){i => i} take 5 foreach println

scala> iterate(3, 5){i => i} foreach println

Scala doesn’t force you to catch exceptions that you don’t care about, not even checked exceptions.
You certainly should handle exceptions you can do something about; that’s what catch is for.
Use pattern matching for handling the exceptions:
  catch {
    case ex: IllegalArgumentException => println(ex.getMessage)
    case _: Throwable => println("Something went wrong")
  }
Declaring checked exceptions is optional: Scala doesn’t require us to declare what exceptions we intend to throw.
Mind the catch order: Java watches over the order in which we place multiple catch blocks, and Scala’s catch cases are likewise tried in order.
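Since a Scala catch block is an ordinary pattern match, cases are tried top to bottom, so a general case placed first would shadow the specific ones. A small sketch (classify is a hypothetical helper, not from the text above):

```scala
// catch is a pattern match: cases are tried top to bottom, so list the
// most specific exception types first; Throwable last acts as a catch-all.
def classify(body: => Unit): String =
  try { body; "ok" } catch {
    case _: IllegalArgumentException => "illegal argument"
    case _: RuntimeException         => "runtime exception"
    case _: Throwable                => "something else"
  }
```

classify(throw new IllegalArgumentException) yields "illegal argument" because that case is matched before the more general RuntimeException and Throwable cases.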


asynchronous transformations of immutable state: the Future.
Java's Future requires that you access the result via a blocking get method.
On the Java platform, each object is associated with a logical monitor, which can be used to control multi-threaded access to data.

many operations on Future require an implicit execution context that provides a strategy for executing functions asynchronously
On the JVM, the global execution context uses a thread pool.
import scala.concurrent.ExecutionContext.Implicits.global
val fut = Future { Thread.sleep(10000); 21 + 21 }

val result = fut.map(x => x + 1)
for {
  x <- fut1
  y <- fut2
} yield x + y
for expressions serialize their transformations
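A sketch of that point: because fut1 and fut2 below are created before the for expression, they can run concurrently; only the transformations are serialized. Had the Future { ... } blocks been inlined in the generators, the second would not start until the first completed.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

// Both futures start here, so they can run concurrently on the
// global execution context's thread pool.
val fut1 = Future { 21 }
val fut2 = Future { 21 }

// The for expression only serializes the map/flatMap transformations.
val sum = for {
  x <- fut1
  y <- fut2
} yield x + y

val answer = Await.result(sum, 2.seconds)  // 42 (Await used only to observe)
```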

Future's collect method allows you to validate the future value and transform it in one operation.
The recover method allows you to transform a failed future into a successful one, while allowing a successful future's result to pass through unchanged.

The recoverWith method is similar to recover, except that instead of recovering to a value, it allows you to recover to a future value.
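A minimal sketch of the difference (the failing future and the fallback values are made up for illustration):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

// A hypothetical failing future.
val failure = Future[Int] { throw new ArithmeticException("oops") }

// recover maps the exception to a plain fallback value...
val recovered = failure.recover { case _: ArithmeticException => -1 }

// ...while recoverWith maps it to another Future.
val recoveredWith = failure.recoverWith { case _: ArithmeticException => Future(0) }

val a = Await.result(recovered, 2.seconds)      // -1
val b = Await.result(recoveredWith, 2.seconds)  //  0
```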
val second = failure.transform(
 res => res * -1,
 ex => new Exception("see cause", ex)
)

val futureNums = List(fortyTwo, fortySix)
val folded =
  Future.fold(futureNums)(0) { (acc, num) =>
  acc + num
}
The Future.reduce method performs a fold without a zero, using the initial future result as the start value.
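A small sketch of fold vs. reduce over futures, assuming Scala 2.12 or earlier where Future.fold and Future.reduce exist (they were later superseded by Future.foldLeft/reduceLeft):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

val fortyTwo = Future { 42 }
val fortySix = Future { 46 }
val futureNums = List(fortyTwo, fortySix)

// fold starts from an explicit zero; reduce uses the first result instead.
val folded  = Future.fold(futureNums)(0) { (acc, num) => acc + num }
val reduced = Future.reduce(futureNums) { (acc, num) => acc + num }

val f = Await.result(folded, 2.seconds)   // 88
val r = Await.result(reduced, 2.seconds)  // 88
```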
Performing side-effects: foreach, onComplete, and andThen
for (res <- success) println(res)

afuture onComplete {
  case Success(res) => println(res)
  case Failure(ex) => println(ex)
}
val x = Await.result(fut, 15.seconds)
import org.scalatest.Matchers._
x should be (42)

You could optimize this example a bit by using a withFilter call instead of filter. This would avoid the creation of an intermediate data structure

for (p <- persons; if !p.isMale; c <- p.children)
   yield (p.name, c.name)
for (p <- persons; n = p.name; if (n startsWith "To"))
yield n
for {
  p <- persons              // a generator
  n = p.name                // a definition
  if (n startsWith "To")    // a filter
} yield n

val syncArray = new ArrayBuffer[X] with mutable.SynchronizedBuffer[X]

// Spark: transform each RDD partition together with its index
.mapPartitionsWithIndex { (index, iterator) => { } }

ArrayBuffer
- Prepends and removes are O(n)
val buf = new ArrayBuffer[Int]()
buf += 12

ListBuffer
- A ListBuffer is a mutable object (contained in package scala.collection.mutable) which can help you build lists more efficiently when you need to append.
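A minimal sketch of the append-then-convert pattern:

```scala
import scala.collection.mutable.ListBuffer

// Appends to a ListBuffer are constant time; toList then converts
// the buffer to an immutable List cheaply.
val buf = new ListBuffer[Int]
buf += 1
buf += 2
buf ++= List(3, 4)
val xs = buf.toList  // List(1, 2, 3, 4)
```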
