Subtleties of Scala: learning CanBuildFrom


In the Scala standard library, collection methods (map, flatMap, scan, and others) accept an instance of type CanBuildFrom as an implicit parameter. In this article we will look in detail at why this trait is needed, how it works, and how it can be useful to a developer.

How it works

The main purpose of CanBuildFrom is to tell the compiler the result type of methods such as map and flatMap. This can be seen, for example, in the definition of map in the TraversableLike trait:

def map[B, That](f: A => B)(implicit bf: CanBuildFrom[Repr, B, That]): That

The method returns an object of type That, which appears in the signature only as a type parameter of CanBuildFrom. The compiler selects a suitable CanBuildFrom instance based on the type of the original collection, Repr, and the result type of the user-supplied function, B. The selection is made from the values declared in the Predef object and in the companions of the collections (the rules for choosing implicit values deserve a separate article and are described in detail in the language specification).
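The selection mechanism can be illustrated without the real trait. Below is a minimal sketch, not the library's implementation: MiniCanBuild, LowPriorityMiniCanBuild and mapString are hypothetical names invented for this illustration. It shows how a type parameter that occurs only in the implicit parameter ends up being fixed by whichever instance the compiler finds.

```scala
// Hypothetical, simplified stand-in for CanBuildFrom (not the library's code):
// an instance knows how to assemble elements of type Elem into a result To.
trait MiniCanBuild[From, Elem, To] {
  def build(elems: Seq[Elem]): To
}

// Lower-priority fallback: any element type produced from a String gives a Vector.
trait LowPriorityMiniCanBuild {
  implicit def fallback[T]: MiniCanBuild[String, T, Vector[T]] =
    new MiniCanBuild[String, T, Vector[T]] {
      def build(elems: Seq[T]): Vector[T] = elems.toVector
    }
}

object MiniCanBuild extends LowPriorityMiniCanBuild {
  // More specific instance: Chars produced from a String give back a String.
  implicit val stringFromChars: MiniCanBuild[String, Char, String] =
    new MiniCanBuild[String, Char, String] {
      def build(elems: Seq[Char]): String = elems.mkString
    }
}

// That appears only as a type argument of the implicit parameter,
// so the instance the compiler finds determines the result type.
def mapString[B, That](s: String)(f: Char => B)(
    implicit cb: MiniCanBuild[String, B, That]): That =
  cb.build(s.toList.map(f))

val upper = mapString("abc")(_.toUpper) // stringFromChars is chosen, so this is a String
val codes = mapString("abc")(_.toInt)   // only fallback fits, so this is a Vector[Int]
```

The fallback sits in a parent trait so that the more specific instance wins when both apply; this mirrors, in a simplified way, the split between specific and fallback instances discussed for strings later in the article.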

In fact, when CanBuildFrom is used, the same result type inference takes place as in the case of the simplest parameterized method:

scala> def f[T](x: List[T]): T = x.head
f: [T](x: List[T])T
scala> f(List(3))
res0: Int = 3
scala> f(List(3.14))
res1: Double = 3.14
scala> f(List("Pi"))
res2: String = Pi

That is, for the call

List(1, 2, 3).map(_ * 2)

the compiler will select the CanBuildFrom instance from the GenTraversableFactory class, which is declared as follows:

class GenericCanBuildFrom[A] extends CanBuildFrom[CC[_], A, CC[A]]

and will return a collection of the same type, but with the elements produced by the user-supplied function: CC[A]. In other cases the compiler can choose a more suitable result type, for example, for strings:

scala> "abc".map(_.toUpper) // Predef.StringCanBuildFrom
res3: String = ABC
scala> "abc".map(_ + "*") // Predef.fallbackStringCanBuildFrom[String]
res4: scala.collection.immutable.IndexedSeq[String] = Vector(a*, b*, c*)
scala> "abc".map(_.toInt) // Predef.fallbackStringCanBuildFrom[Int]
res5: scala.collection.immutable.IndexedSeq[Int] = Vector(97, 98, 99)

In the first case, StringCanBuildFrom is selected and the result is a String:

implicit val StringCanBuildFrom: CanBuildFrom[String, Char, String]

In the second and third cases, the fallbackStringCanBuildFrom method is selected and the result is an IndexedSeq:

implicit def fallbackStringCanBuildFrom[T]: CanBuildFrom[String, T, immutable.IndexedSeq[T]]

Using breakOut

Consider the Map class. A collection of this type is easily converted to an Iterable if the mapping function returns a single value rather than a pair:

scala> Map(1 -> "a", 2 -> "b", 3 -> "c").map(_._2)
res6: scala.collection.immutable.Iterable[String] = List(a, b, c)

But to get a Map from a list of pairs, you need to call the toMap method:

scala> List('a', 'b', 'c').map(x => x.toInt -> x)
res7: List[(Int, Char)] = List((97,a), (98,b), (99,c))
scala> List('a', 'b', 'c').map(x => x.toInt -> x).toMap
res8: scala.collection.immutable.Map[Int,Char] = Map(97 -> a, 98 -> b, 99 -> c)

Or pass the breakOut method explicitly in place of the implicit parameter:

scala> import collection.breakOut
import collection.breakOut
scala> List('a', 'b', 'c').map(x => x.toInt -> x)(breakOut)
res9: scala.collection.immutable.IndexedSeq[(Int, Char)] = Vector((97,a), (98,b), (99,c))

The method, as the name implies, allows you to "break out" of the boundaries of the original collection's type and gives the compiler more freedom in choosing the CanBuildFrom instance:

def breakOut[From, T, To](implicit b: CanBuildFrom[Nothing, T, To]): CanBuildFrom[From, T, To]

The signature shows that breakOut does not fix any of its three type parameters, which means it can stand in for any CanBuildFrom instance. breakOut itself implicitly accepts an object of type CanBuildFrom, but with the From parameter replaced by Nothing, which allows the compiler to use any available CanBuildFrom instance (this works because the From parameter is declared contravariant and Nothing is a subtype of every type).

In other words, breakOut provides an additional "layer" that allows the compiler to choose from all available CanBuildFrom implementations, and not just those that are valid for the type of the source collection. In the example above no result type was requested, so the compiler settled on IndexedSeq, as the output shows. When the expected type is Map[Int, Char] (for instance, through a type annotation on the result), the same call can use the CanBuildFrom from the Map companion, despite the fact that we started with a List. Another example is getting a string from a list of characters:

scala> List('a', 'b', 'c').map(_.toUpper)
res10: List[Char] = List(A, B, C)
scala> List('a', 'b', 'c').map(_.toUpper)(breakOut)
res11: String = ABC

The implementation of CanBuildFrom[String, Char, String] is declared in Predef and therefore takes precedence over the declarations in the collection companions.
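A minimal sketch of steering breakOut with an expected type, assuming Scala 2.12 or earlier (CanBuildFrom and breakOut were removed in Scala 2.13). The value names are chosen only for this illustration; the type annotation on each val is what tells the compiler which To to look for.

```scala
// Assumes Scala 2.12 or earlier: breakOut does not exist in Scala 2.13+.
import scala.collection.breakOut

val chars = List('a', 'b', 'c')

// The annotation fixes To = Map[Int, Char], so the instance from the
// Map companion is used and no intermediate List of pairs is built.
val byCode: Map[Int, Char] = chars.map(x => x.toInt -> x)(breakOut)

// Same idea with other target types.
val codes: Set[Int] = chars.map(_.toInt)(breakOut)
val upper: String = chars.map(_.toUpper)(breakOut)
```

Compared with calling toMap or toSet afterwards, this builds the target collection directly instead of converting a temporary one.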

Future List Usage Example

As a small example of using CanBuildFrom, we will write an implementation that automatically collects a list of Futures into a single Future, as Future.sequence does:

List[Future[T]] -> Future[List[T]]

To get started, let's take a look inside CanBuildFrom. The trait declares two abstract apply methods, each of which returns a builder for the new collection that is filled with the results of the user-supplied function:

def apply(): Builder[Elem, To]
def apply(from: From): Builder[Elem, To]
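To make the role of these methods concrete, here is a minimal sketch of how a map-like method might drive the builder, assuming Scala 2.12 or earlier. mapList is a hypothetical helper written for this illustration, not a library method.

```scala
// Assumes Scala 2.12 or earlier, where CanBuildFrom still exists.
import scala.collection.generic.CanBuildFrom

// Hypothetical helper (not a library method): a hand-written map for List.
def mapList[A, B, That](xs: List[A])(f: A => B)(
    implicit bf: CanBuildFrom[List[A], B, That]): That = {
  val builder = bf(xs)             // apply(from): a builder created for this source collection
  xs.foreach(x => builder += f(x)) // feed in the results of the user-supplied function
  builder.result()                 // the finished collection of type That
}

val doubled = mapList(List(1, 2, 3))(_ * 2) // resolved through List's own instance
```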

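Therefore, to provide your own implementation of CanBuildFrom, you need to prepare a Builder that implements the methods for adding an element, clearing the buffer, and obtaining the result:

import scala.collection.mutable.{Builder, ListBuffer}
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global // required by Future.sequence

class FutureBuilder[A] extends Builder[Future[A], Future[Iterable[A]]] {
  private val buff = ListBuffer[Future[A]]()
  def +=(elem: Future[A]): this.type = { buff += elem; this }
  def clear(): Unit = buff.clear()
  def result(): Future[Iterable[A]] = Future.sequence(buff.toSeq)
}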

The CanBuildFrom implementation itself is trivial:

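import scala.collection.generic.CanBuildFrom

class FutureCanBuildFrom[A] extends CanBuildFrom[Any, Future[A], Future[Iterable[A]]] {
  def apply(): Builder[Future[A], Future[Iterable[A]]] = new FutureBuilder[A]
  def apply(from: Any): Builder[Future[A], Future[Iterable[A]]] = apply()
}
implicit def futureCanBuildFrom[A]: FutureCanBuildFrom[A] = new FutureCanBuildFrom[A]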

Let's check:

scala> Range(0, 10).map(x => Future(x * x))
res12: scala.concurrent.Future[Iterable[Int]] = scala.concurrent.impl.Promise$DefaultPromise@360e2cfb

Everything works! Thanks to the futureCanBuildFrom method, we directly got a Future[Iterable[Int]], i.e. the conversion of the intermediate collection was done automatically.
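The REPL output only shows an opaque promise, so it can help to await the value. The following self-contained sketch repeats the definitions with explicit result types and assumes Scala 2.12 or earlier; the global execution context, the explicit type annotation on squares, and the 5-second timeout are choices made for this illustration.

```scala
// Assumes Scala 2.12 or earlier, where CanBuildFrom still exists.
import scala.collection.generic.CanBuildFrom
import scala.collection.mutable.{Builder, ListBuffer}
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

class FutureBuilder[A] extends Builder[Future[A], Future[Iterable[A]]] {
  private val buff = ListBuffer[Future[A]]()
  def +=(elem: Future[A]): this.type = { buff += elem; this }
  def clear(): Unit = buff.clear()
  // Future.sequence keeps the order in which the futures were added.
  def result(): Future[Iterable[A]] = Future.sequence(buff.toList)
}

class FutureCanBuildFrom[A] extends CanBuildFrom[Any, Future[A], Future[Iterable[A]]] {
  def apply(): Builder[Future[A], Future[Iterable[A]]] = new FutureBuilder[A]
  def apply(from: Any): Builder[Future[A], Future[Iterable[A]]] = apply()
}
implicit def futureCanBuildFrom[A]: FutureCanBuildFrom[A] = new FutureCanBuildFrom[A]

// The annotation states the intended result type explicitly.
val squares: Future[Iterable[Int]] = Range(0, 10).map(x => Future(x * x))

// Blocking is acceptable in a demo; the timeout value is arbitrary.
val values = Await.result(squares, 5.seconds)
```

Note that while this implicit is in scope it applies to any map call that produces Futures, which is one reason to keep its scope as narrow as possible.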

Warning: this is just an example of using CanBuildFrom. I am not saying that such a solution should be used in your production code or that it is better than simply wrapping the collection in Future.sequence. Be careful not to copy the code into your project without first analyzing the consequences!

Conclusion

Using CanBuildFrom is closely tied to implicit parameters, so a clear understanding of how implicit values are chosen will save you time when debugging. It is worth consulting the language specification or the Scala FAQ. The compiler can also help by showing which implicit values were selected if you build the program with the -Xprint:typer flag.

CanBuildFrom is a very specific tool, and you most likely will not have to work closely with it unless you are developing new data structures. Nevertheless, understanding how it works is useful and gives a better picture of the internal structure of the standard library.

That's all, thanks and success in learning Scala!

Corrections and additions to the article, as always, are welcome.
