More on Prefix Sums
Authors: Darren Yao, Neo Wang, Qi Wang, Mihnea Brebenel
Contributors: Jesse Choe, Kevin Sheng, Brad Ma, Juheon Rhee
Max subarray sum, prefix sums in two dimensions, and a more complicated example.
Prerequisites
Max Subarray Sum
Focus Problem – try your best to solve this problem before continuing!
Solution - Max Subarray Sum
Consider the prefix sum array . The subarray sum , where is . Thus, we are looking for the maximum possible value of over .
For a fixed right bound , the maximum subarray sum is
Thus, we can keep a running minimum to store as we iterate through . This yields the maximum subarray sum for each possible right bound, and the maximum over all these values is our answer.
Implementation
C++
#include <algorithm>#include <iostream>#include <vector>using namespace std;using ll = long long;int main() {int n;cin >> n;
Java
import java.io.BufferedReader;import java.io.IOException;import java.io.InputStreamReader;import java.util.Arrays;public class MaxSubSum {public static void main(String[] args) throws IOException {BufferedReader read =new BufferedReader(new InputStreamReader(System.in));read.readLine();
Python
size = int(input())arr = [int(i) for i in input().split()]assert len(arr) == sizemax_subarray_sum = arr[0]min_pref_sum = 0running_pref_sum = 0for i in arr:running_pref_sum += imax_subarray_sum = max(max_subarray_sum, running_pref_sum - min_pref_sum)min_pref_sum = min(min_pref_sum, running_pref_sum)print(max_subarray_sum)
Alternative Solution - Kadane's Algorithm
2D Prefix Sums
Focus Problem – try your best to solve this problem before continuing!
Now, what if we wanted to process queries for the sum over a subrectangle of a 2D matrix with rows and columns? Let's assume both rows and columns are 1-indexed, and we use the following matrix as an example:
0 | 0 | 0 | 0 | 0 | 0 |
0 | 1 | 5 | 6 | 11 | 8 |
0 | 1 | 7 | 11 | 9 | 4 |
0 | 4 | 6 | 1 | 3 | 2 |
0 | 7 | 5 | 4 | 2 | 3 |
Naively, each sum query would then take time, for a total of . This is too slow.
Let's take the following example region, which we want to sum:
0 | 0 | 0 | 0 | 0 | 0 |
0 | 1 | 5 | 6 | 11 | 8 |
0 | 1 | 7 | 11 | 9 | 4 |
0 | 4 | 6 | 1 | 3 | 2 |
0 | 7 | 5 | 4 | 2 | 3 |
Manually summing all the cells, we have a submatrix sum of .
The first logical optimization would be to do one-dimensional prefix sums of each row. Then, we'd have the following row-prefix sum matrix. The desired subarray sum of each row in our desired region is simply the green cell minus the red cell in that respective row. We do this for each row to get .
0 | 0 | 0 | 0 | 0 | 0 |
0 | 1 | 6 | 12 | 23 | 31 |
0 | 1 | 8 | 19 | 28 | 32 |
0 | 4 | 10 | 11 | 14 | 16 |
0 | 7 | 12 | 16 | 18 | 21 |
Now, if we wanted to find a submatrix sum, we could break up the submatrix into a subarray for each row, and then add their sums, which would be calculated using the prefix sums method described earlier. Since the matrix has rows, the time complexity of this is . This might be fast enough for and , but we can do better.
In fact, we can do two-dimensional prefix sums. In our two dimensional prefix sum array, we have
This can be calculated as follows for row index and column index :
Let's calculate . Try playing with the interactive widget below by clicking the buttons to see which numbers are added in each step. Notice how we overcount a subrectangle, and how we fix this by subtracting .
The submatrix sum between rows and and columns and , can thus be expressed as follows:
Summing the blue region from above using the 2D prefix sums method, we add the value of the green square, subtract the values of the red squares, and then add the value of the gray square. In this example, we have
as expected.
0 | 0 | 0 | 0 | 0 | 0 |
0 | 1 | 6 | 12 | 23 | 31 |
0 | 2 | 14 | 31 | 51 | 63 |
0 | 6 | 24 | 42 | 65 | 79 |
0 | 13 | 36 | 58 | 83 | 100 |
Try playing with the interactive widget below by clicking the buttons to see which numbers are added in each step.
Since no matter the size of the submatrix we are summing, we only need to access four values of the 2D prefix sum array, this runs in per query after an preprocessing.
Solution - Forest Queries
Warning!
We need to be cautious of off-by-one errors, as intervals can be inclusive, exclusive, 1-indexed, etc.
C++
#include <iostream>#include <vector>using namespace std;constexpr int MAX_SIDE = 1000;int tree_pref[MAX_SIDE + 1][MAX_SIDE + 1];int forest[MAX_SIDE + 1][MAX_SIDE + 1];int main() {
Java
import java.io.*;import java.util.StringTokenizer;public class ForestQueries {static int N;static int Q;static int[][] pfx;static int[][] arr;public static void main(String[] args) {Kattio io = new Kattio();
Python
side_len, query_num = [int(i) for i in input().split()]tree_prefixes = [[0 for _ in range(side_len + 1)] for _ in range(side_len + 1)]for r in range(side_len):for ci, c in enumerate(input()):tree = c == "*"tree_prefixes[r + 1][ci + 1] += (tree_prefixes[r][ci + 1]+ tree_prefixes[r + 1][ci]- tree_prefixes[r][ci]+ tree
Problems
Status | Source | Problem Name | Difficulty | Tags | |
---|---|---|---|---|---|
Silver | Easy | Show Tags2D Prefix Sums | |||
Old Silver | Easy | Show Tags2D Prefix Sums | |||
CF | Normal | Show TagsPrefix Sums | |||
Silver | Hard | Show Tags2D Prefix Sums | |||
Gold | Hard | Show Tags2D Prefix Sums, Max Subarray Sum | |||
AC | Hard | Show Tags2D Prefix Sums, Trees | |||
Platinum | Very Hard | Show Tags2D Prefix Sums |
Difference Arrays
Resources | ||||
---|---|---|---|---|
Codeforces |
Focus Problem – try your best to solve this problem before continuing!
Explanation
We need to know how many times each operation is applied, i.e. how many times is contained within a query interval. For this, maintain a frequency array ; is the number of times operation is applied. The important step is how the frequency array is updated. Instead of iterating through the interval and incrementing each value by one, which would result in time complexity, we could just increment by one, the left bound, and decrement by one , the cell next to the right bound. After processing the queries we obtain the frequency array by doing prefix sums on , resulting in time complexity. The second part, applying the operations, can be done exactly the same way. This technique is called difference array and has linear time complexity coming from iterating through the array for making prefix sums.
Implementation
Time Complexity:
C++
#include <array>#include <iostream>#include <vector>using namespace std;int main() {int n, m, k;cin >> n >> m >> k;vector<int> a(n + 1);
Problems
Status | Source | Problem Name | Difficulty | Tags | |
---|---|---|---|---|---|
AtCoder | Normal | Show TagsDP, Difference Array | |||
CF | Normal | Show TagsDifference Array |
Quiz
For a grid with rows and columns, what's the ideal time complexity to compute a 2D prefix sum array of the grid.
Module Progress:
Join the USACO Forum!
Stuck on a problem, or don't understand a module? Join the USACO Forum and get help from other competitive programmers!