released Mon, 30 Nov 2015
Swift Version 5.1

Map, FlatMap, Reduce & More

A practical introduction into the map, compactMap, filter, and reduce functions.

Intro

Even before Swift was released, iOS / Cocoa developers could use third party frameworks like ObjectiveSugar or ReactiveCocoa in order to gain functional programming constructs like map, flatMap or filter.

With Swift, they have become first class language citizens. There are many advantages to using them over a standard for loop. They typically express your intent better, they require less code, and they can be chained together in order to build up complex logic in a clear way.

In this post, I'd like to show another very cool functional addition to Swift which can sometimes be a better solution than map / filter constructs: reduce.

A simple Problem

Consider this problem: You're getting a list of persons from a JSON endpoint. You'd like to know the average age from all people living in California. The parsed data looks like this:

let persons: [[String: Any]] = [["name": "Carl Saxon", "city": "New York, NY", "age": 44],
 ["name": "Travis Downing", "city": "El Segundo, CA", "age": 34],
 ["name": "Liz Parker", "city": "San Francisco, CA", "age": 32],
 ["name": "John Newden", "city": "New Jersey, NY", "age": 21],
 ["name": "Hector Simons", "city": "San Diego, CA", "age": 37],
 ["name": "Brian Neo", "age": 27]]

Note the last entry, which omits a city for the person in question. Those cases have to be silently ignored.

The expected result in the example would be 3 persons , as we have three persons from California. Let's try to implement this in Swift in terms of compactMap and filter. The compactMap is used instead of map as compactMap automatically ignores empty optionals. So compactMap([0, nil, 1, 2, nil]) results in [0, 1, 2]. This eases the handling of persons without a proper city.

func infoFromState(state: String, persons: [[String: Any]]) 
     -> Int {
    return persons.compactMap( { $0["city"]?.componentsSeparatedByString(", ").last })
           .filter({$0 == state})
           .count
}
infoFromState(state: "CA", persons: persons)

That's simple enough.

However, now consider the following complication: You'd like to know how many of those persons live in California, and you'd like to know their average age. If we try upgrading the above example, we soon realize that his is a slightly harder problem.

There are various solutions, but they all seem to not fit well with functional constructs. A loop-based approach feels much simpler.

When we think about why this does fit, we realize it is because the shape of the data changes. map, compactMap, and filter all keep the shape of the data similar. Array goes in, array goes out. The amount and the contents of the array may change, but the array-shape stays. The problem above, however, requires us to change the shape to a struct or tuple with an Integer average and an Integer sum.

These are the kind of problems where you can apply reduce. So what is reduce? Lets have a look.

Reduce

Reduce is sort of a generalized version of map, compactMap, flatMap, or filter. The basic idea is to reduce a sequence into a different shape utilizing an accumulator that can keep incremental state. To do this, we hand the function a combinator closure/function/method which is called once for each item in the sequence. This may sound complicated but becomes really easy with a couple of examples.

It is a method on SequenceType and looks like this (simplified):

func reduce<T>(_ initialResult: T, _ nextPartialResult: (T, Self.Generator.Element) -> T) -> T

There're two parameters here:

  1. A initialResult value of generic type T
  2. A closure nextPartialResult that receives two parameters and returns the generic type T. The two parameters are T, once again, and the element type of the array. Imagine we had an array of String and we'd like to reduce it to the count of elements Int. Reduce would look like this:
func reduce(_ initialResult: Int, _ nextPartialResult: (Int, String) -> Int) -> Int

So here we have an initial value Int, and we have a closure which expects us to return the same type as the initial value (Int). The final value of the operation is also of the same type as the initial value.

If we take a very simple reduce operation - counting the elments in an array of String - , the evaluation will look like this:

func count(accumulator: Int, current: Int) -> Int {
   return accumulator += 1
}
[1, 2, 3].reduce(0, count)

// The following steps will be performed
count(0, 1) { return 0 + 1 } = 1
count(1, 2) { return 1 + 1 } = 2
count(2, 3) { return 2 + 1 } = 3
= 6

And if we want to sum up the integer elements in an array of Int, we'd write this:

func sum(accumulator: Int, current: Int) -> Int {
   return accumulator + current
}
[1, 2, 3].reduce(0, sum)
// The following steps will be performed
sum(0, 1) { return 0 + 1 } = 1
sum(1, 2) { return 1 + 2 } = 3
sum(3, 3) { return 3 + 3 } = 6
= 6

The nextPartialResult closure (sum in our example) will be called once for each item in the list [1, 2, 3]. The state will be kept in the accumulator variable which is just an Integer.

Let's start re-implementing some of our other, trusted, functional programming friends. In order to keep things simple for now, all these functions will operate on Int or Optional<Int>; i.e. we will ignore generics in here. Also, keep in mind that the implementations below exist to explain the behaviour of reduce. The native Swift implementations are usually much faster compared to the reduce versions below. Reduce shines in a different set of problems, which will be explained further below.

Map

Lets re-implement map and call it rmap (short for reduce map)

func rmap(_ elements: [Int], transform: (Int) -> Int) -> [Int] {
    return elements.reduce([Int](), combine: { 
      (var acc: [Int], obj: Int) -> [Int] in
       acc.append(transform(obj))
       return acc
    })
}
let input = [1, 2, 3, 4]
let output = rmap(input, transform: { $0 * 2})
assert(output == [2, 4, 6, 8])

This is a good example to understand the basics of reduce.

  • First, we're calling reduce on a sequence of elements elements.reduce....
  • Next, We're giving it the accumulator, i.e. an empty Int array, which will form or return type / result [Int]()
  • After that, we're handing in the combinator which takes two parameters. The first is the accumulator which we just provided acc: [Int], the second is the current object from our sequence obj: Int.
  • The actual code in the combinator is simple. We simply transform the obj and append it onto the accumulator. We then return the accumulator.

This looks like much more code than just calling map. That's indeed true, but the version above is extra detailed in order to better explain how reduce works. We can simplify it.

func rmap(_ elements: [Int], transform: (Int) -> Int) -> [Int] {
    return elements.reduce([Int](), combine: {$0 + [transform($1)]})
}
print(rmap([1, 2, 3, 4], transform: { $0 * 2}))
// [2, 4, 6, 8]

This still works fine. What happened here? We're using the convenient fact that in Swift, the + operator also works for two sequences. So [0, 1, 2] + [transform(4)] takes the left sequence and adds the right sequence, consisting out of the transformed element, to it.

It should be noted that, as of right now, [1, 2, 3] + [4] is slower than [1, 2, 3].append(4). If you operate on huge lists, instead of using collection + collection, you should have a mutable accumulator and mutate it in place:

func rmap(_ elements: [Int], transform: (Int) -> Int) -> [Int] {
    return elements.reduce([Int](), combine: { (var ac: [Int], b: Int) -> [Int] in 
        ac.append(transform(b))
        return ac
    })
}

In order to better understand reduce we will now go on and also implement compactMap and filter.

func rcompactMap(_ elements: [Int], transform: (Int) -> Int?) -> [Int] {
    return elements.reduce([Int](), 
       combine: { guard let m = transform($1) else { return $0 } 
                  return $0 + [m]})
}
print(rcompactMap([1, 3, 4], transform: { guard $0 != 3 else { return nil }; return $0 * 2}))
// [2, 8]

The main difference is that we're adding a guard to make sure the optional contains a value.

Filter

func rFilter(_ elements: [Int], filter: (Int) -> Bool) -> [Int] {
    return elements.reduce([Int](), 
       combine: { guard filter($1) else { return $0 } 
                  return $0 + [$1]})
}
print(rFilter([1, 3, 4, 6], filter: { $0 % 2 == 0}))
// [4, 6]

Again, a simple operation. We're leveraging guard again to make sure our filter condition holds.

Up until now, reduce may feel like a more complicated version of map or filter without any major advantages. However, the combinator does not need to be an array. It can be anything. This makes it easy for us to implement various reduction operations in a very simple way.

Reduce Examples

Let's start with a favorite of mine, the sum of a list of numbers:

[0, 1, 2, 3, 4].reduce(0, +)
// 10

+ is a valid combinator function as it will just add the lhs and the rhs and return the result, which is specifically the requirement reduce has.

Another example is building the product of a list of numbers:

[1, 2, 3, 4].reduce(1, *)
// 24

Or what about reversing a list:

[1, 2, 3, 4, 5].reduce([Int](), { [$1] + $0 })
// 5, 4, 3, 2, 1

Finally, something a tad more complicated. We'd like to partition a list based on a division criteria

typealias Acc = (l: [Int], r: [Int])
func partition(_ lst: [Int], criteria: (Int) -> Bool) -> Acc {
   return lst.reduce((l: [Int](), r: [Int]()), { (ac: Acc, o: Int) -> Acc in 
      if criteria(o) {
        return (l: ac.l + [o], r: ac.r)
      } else {
        return (r: ac.r + [o], l: ac.l)
      }
   })
}
partition([1, 2, 3, 4, 5, 6, 7, 8, 9], criteria: { $0 % 2 == 0 })
//: ([2, 4, 6, 8], [1, 3, 5, 7, 9])

The most interesting thing we're doing above is using a tuple as the accumulator. As you'll find out, once you start trying to incorporate reduce into your daily work-flow, tuples are a great way of quickly combining related data during a reduce operation.

Performance

Apart from the higher flexibility that reduce offers, it has another advantage: Oftentimes, chaining map and filter induces a performance penalty as Swift has to iterate over your collection multiple times in order to generate the required data. Imagine the following code:

[0, 1, 2, 3, 4].map({ $0 + 3})
    .filter({ $0 % 2 == 0})
    .reduce(0, +)

Apart from being nonsensical, it is also wasting CPU cycles. The initial sequence will be iterated over 3 times. First to map it, then to filter it, and finally to sum up the contents. Instead, all of this can just as well be implemented as one reduce call, which greatly improves the performance:

[0, 1, 2, 3, 4].reduce(0, { (ac: Int, r: Int) -> Int in 
   if (r + 3) % 2 == 0 {
     return ac + r + 3
   } else {
     return ac
   }
})

Here's a quick benchmark of running both versions and the for-loop version below over an list with 100.000 items:

var ux = 0
for i in Array(0...100000) {
    if (i + 3) % 2 == 0 {
        ux += (i + 3)
    }
}
For Loop: 0.030 seconds
Reduce: 0.034 seconds
Map/Filter: 0.072 seconds

As you can see, the reduce version' performance is very close to the mutating for loop and more than twice as fast as the chaining operation.

However, in other situations, chained operation can greatly outperform reduce. Consider the following example where we add 3 to each entry in the array.

Array(0...100000).map({ $0 + 3}).reverse().prefix(3)
// 0.027 Seconds

And the reduce version:

Array(0...100000).reduce([], { (ac: [Int], r: Int) -> [Int] in
    return ac + [r + 3]
}).prefix(3)

// 2.927 Seconds
For Loop: 0.027 seconds
Reduce: 2.927 seconds

Here, we have a stark performance difference of 0.027s for the chained operation vs. 2.927s for the reduce operation, what's happening here?

Arrays in Swift are value types with so-called copy on write semantics.

Copy on Write

Imagine you had a struct User:

struct User {
  var username: String
}
var benedikt = User(username: "Benedikt")
var secondBenedikt = benedikt
var thirdBenedikt = benedikt

struct types in Swift are value types. Value type means that each copy is a new distinct value. So if I were to change the username value of secondBenedikt to Klaus, then the username value of the other two benedikts (benedikt, thirdBenedikt) would still be Benedikt and not Klaus. So, everytime you do a a = b, b is copied to a.

Copy operations, however, are expensive. All that memory has to be copied from a to b. So Swift employs a smart trick: As long as you don't mutate / modify a variable, it will just not copy it.

So in our example above, benedikt, secondBenedikt, and thirdBenedikt are the same thing, they point to the same memory. Only once you change one of them (say benedikt.username = 'Hans') will they be copied into distinct types.

So what's all that to do with our reduce issue here?

Array Value Types

Arrays are value types, too. This means that whenever an array is mutated, a new copy is created. So in our reduce function:

Array(0...100000).reduce([], { (ac: [Int], r: Int) -> [Int] in
    return ac + [r + 3]
}).prefix(3)

This will copy the array 100000 times. That's why the performance is so abysmal. So how do we fix this?

The power of inout

There's another version of reduce with slightly different parameters. Its function signature looks like this (simplified):

func reduce<Result>(into initialResult: Result, 
  _ updateAccumulatingResult: (inout Result, Element) throws -> ()) rethrows -> Result

The magic is the inout Result. Inout is a special attribute that you can use in function signatures to denote to Swift that you wish to refer to the same instance of a type without making copies. The name implies how it works: When the function is called, the value is moved in to the function, when the function is done, the value is moved out again.

In the case of our arrays, instead of making multiple copies, we will always modify the same array.

So if we rewrite our reduce from above with reduce(into:) what is the performance?

Here is the updated code:

Array(0...100000).reduce(into: [Int](), { ac, r in
                return ac.append(r + 3)
            }).prefix(3)

And this is the new performance:

Map: 0.0295 seconds
Reduce Into: 0.0376 seconds
Reduce: 1.49 seconds

We're almost reached the speed of the simpler map implementation. It is much faster now. Awesome!

We can now go back to our initial count & average problem and try to solve it with reduce.

InfoFromState, take two


  func infoFromState(state: String, persons: [[String: Any]]) 
      -> (count: Int, age: Float) {

      // The type alias in the function will keep the code cleaner
      typealias Acc = (count: Int, age: Float)

      // reduce into a temporary variable
      let u = persons.reduce((count: 0, age: 0.0)) {
          (ac: Acc, p) -> Acc in

          // Retrive the state and the age
          guard let personState = (p["city"] as? String)?.componentsSeparatedByString(", ").last,
                personAge = p["age"] as? Int

            // make sure the person is from the correct state
            where personState == state

            // if age or state are missing, or personState!=state, leave
            else { return ac }

          // Finally, accumulate the acount and the age
          return (count: ac.count + 1, age: ac.age + Float(personAge))
      }

  // our result is the count and the age divided by count
  return (age: u.age / Float(u.count), count: u.count)
}
print(infoFromState(state: "CA", persons: persons))
// prints: (count: 3, age: 34.3333)

As in earlier examples above, we're once again using a tuple to share state in the accumulator. Apart from that, the code is easy to understand.

We also defined a typealias Acc within the func in order to simplify the type annotations a bit.

More Examples

This was a short overview of the power behind the reduce method. It is particularly helpful if you end up chaining a lot of functional methods together, or when output shape of your data differs from the input shape. We'll finish this guide with more reduce examples to give you inspirations for various use cases where reduce can easily be applied.

Unique

Return a list with all duplicates removed. The better solution would be to use a Set.

[1, 2, 5, 1, 7].reduce([], { (a: [Int], b: Int) -> [Int] in
  if a.contains(b) {
    return a
  } else {
    return a + [b]
  }
})
// prints: 1, 2, 5, 7

Group By

Go over a list and return a new list with the previous list' items grouped by a discriminator function. The function in question needs to return a Hashable type so that we can differentiate keys. The order of the items will be preserved while the order of the groups won't necessarily be preserved.

func groupby<T, H: Hashable>(_ items: [T], f: (T) -> H) -> [H: [T]] {
   return items.reduce([:], { (var ac: [H: [T]], o: T) -> [H: [T]] in 
       let h = f(o)
       if var c = ac[h] {
           c.append(o)
           ac.updateValue(c, forKey: h)
       } else {
           ac.updateValue([o], forKey: h)
       }
       return ac
   })
}
print(groupby([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], f: { $0 % 3 }))
// prints: [2: [2, 5, 8, 11], 0: [3, 6, 9, 12], 1: [1, 4, 7, 10]]
print(groupby(["Carl", "Cozy", "Bethlehem", "Belem", "Brand", "Zara"], f: { $0.characters.first! }))
// prints: ["C" : ["Carl" , "Cozy"] , "B" : ["Bethlehem" , "Belem" , "Brand"] , "Z" : ["Zara"]]

Interpose

This function returns the given items, with element inserted between every count items. The implementation below makes sure that the elements are only interposed and not appended at the end.

func interpose<T>(items: [T], element: T, count: Int = 1) -> [T] {
   typealias Acc = (ac: [T], cur: Int, cnt: Int)
   return items.reduce((ac: [], cur: 0, cnt: 1), { (a: Acc, o: T) -> Acc in 
       switch a {
       // the last item will not have any interposition
       case let (ac, cur, _) where (cur+1) == items.count: return (ac + [o], 0, 0)
       // interpose
       case let (ac, cur, c) where c == count:
          return (ac + [o, element], cur + 1, 1)
       // next
       case let (ac, cur, c):
          return (ac + [o], cur + 1, c + 1)
       }
   }).ac
}
print(interpose(items: [1, 2, 3, 4, 5], element: 9))
// : [1, 9, 2, 9, 3, 9, 4, 9, 5]
print(interpose(items: [1, 2, 3, 4, 5], element: 9, count: 2))
// : [1, 2, 9, 3, 4, 9, 5]

Interdig

This function allows you to combine two sequences by alternately selecting elements from each.

func interdig<T>(list1: [T], list2: [T]) -> [T] {
   return zip(list1, list2).reduce([], { (ac: [T], o: (T, T)) -> [T] in 
        return ac + [o.0, o.1]
   })
}
print(interdig(list1: [1, 3, 5], list2: [2, 4, 6]))
// : [1, 2, 3, 4, 5, 6]

Chunk

This function returns self, broken up into non-overlapping arrays of length n:

func chunk<T>(_ list: [T], length: Int) -> [[T]] {
   // Simplify the type signature by introducing a `typealias`
   typealias Acc = (stack: [[T]], cur: [T], cnt: Int)
   
   // Start with a `cnt` of 0
   let reducedList = list.reduce((stack: [], cur: [], cnt: 0), { (ac: Acc, o: T) -> Acc in
      if ac.cnt == length {
          return (stack: ac.stack + [ac.cur], cur: [o], cnt: 1)
      } else {
          return (stack: ac.stack, cur: ac.cur + [o], cnt: ac.cnt + 1)
      }
   })
   return reducedList.stack + [reducedList.cur]
}
print(chunk([1, 2, 3, 4, 5, 6, 7], length: 2))
// [[1, 2], [3, 4], [5, 6], [7]]

This function uses a more complicated accumulator consisting out of a stack, the current list, and the count.

Similar Articles