First tutorial blog :D Feedbacks are appreciated.
I will be discussing a divide and conquer technique, CDQ DnC. It is an offline technique that enables problems with rigorous data structure solutions to be solved in simple ways.
What is CDQ DnC?
CDQ DnC is a divide and conquer technique invented by the Chinese competitive programmer, CDQ. Honestly, there is only one difference between CDQ and normal DnC. Both divide and conquer techniques divides an interval $[l, r]$ to $[l, m]$ and $[m+1, r]$, where $m= \lfloor \frac{l+r}{2} \rfloor$. The only difference is that in CDQ, the values of $[l, m]$ have an influence on the values of $[m+1, r]$. This is usually the case for dynamic programming problems, where the formula $dp_i$ is derived from values of $dp_j$ where $j < i$.
The overall flow of CDQ DnC is the following:
- Divide $[l, r]$ to $[l, m]$ and $[m+1, r]$.
- Solve $[l, m]$ individually
- Calculate the contribution of the values $[l, m]$ apply in $[m+1, r]$
- Solve $[m+1, r]$ after the contribution is calculated
Longest Non-decreasing Subsequence of Pairs (HDU 4742)
Prerequisite: segment tree or binary indexed tree
Problem statement:
Given an array of size $N$. Each element is a triple of integers $(X_i, Y_i, Z_i)$. Find the longest set $\{A_1, A_2, \dots, A_k\}$ such that $X_{A_{i-1}} \le X_{A_i}$, $Y_{A_{i-1}} \le Y_{A_i}$, and $Z_{A_{i-1}} \le Z_{A_i}$ for all $1 < i \le k$. Note that the set $A$ is not necessarily sorted.
Also, find the number of such sets modulo $2^{30}$.Constraints: $1 \le N \le 10^5$, $1 \le X_i, Y_i, Z_i \le 2^{30}$, $(X_i, Y_i, Z_i) \ne (X_j, Y_j, Z_j)$ for all $1 \le i < j \le N$
Solution:
Before everything, apply coordinate compression for $X$, $Y$, and $Z$ such that now $X, Y, Z \le 10^5$, which we can do by simply sort all of the values in a different array, remove duplicates, and apply binary search to find the compressed indices.
First observation is to notice that we can regard the first dimension as the index of an array of $(k-1)$-tuples. In other words, let $B$ be an array of pairs such that $B_{X_i} = (Y_i, Z_i)$, we are to find the longest non-decreasing subsequence of $B$. For equal values of $X_i$, greedily sort the $(Y_i, Z_i)$ pairs as it doesn't hurt the optimality of the answer.
Let's briefly discuss the solution for 1D and 2D variant of the problem in subquadratic time. Throughout the solution, I will be using a BIT (binary indexed tree) instead of a segment tree as BIT is relatively faster in both implementation and performance speed, hence a better data structure.*
The first-dimension variant is an obvious problem. Simply sort the array. The length is always $N$ and the number of possible sets is always $1$.
The two-dimension variant is a classic as well. Sort all of the elements by the first dimension and apply LNDS at the second dimension. Let $DP_i$ be the length of LNDS ending at element $i$. Maintain a maximum BIT that stores the DP values. In particular, update point $X_i$ with value $DP_{X_i}$. Since the DP values for every index will increase, the maximum operation in the BIT is valid. Answer would be query at point $MAX$, which would be $\max(DP_1, DP_2, \dots, DP_{MAX})$. To maintain the number of longest chains, provide another sum operation BIT that stores the count values. The implementation is left to the reader as I believe that there are a ton of sources for $O(N\log N)$ LNDS.
Back to the original problem. Sort all the elements by the first dimension and apply LNDS to $(Y_i, Z_i)$. How to achieve this? 2D Segment tree???? Not enough memory, and the implementation makes one get the "I'm die, thank you forever" treatment.
Let's try to solve this problem by CDQ DnC.
- Suppose we are currently solving for the interval $[l, r]$.
- Split to two intervals and solve the first interval, $[l, m]$.
- Find the contribution of $dp$ values from $[l, m]$ to $[m+1, r]$
- Notice that $\max(X_l, X_{l+1}, \dots, X_m) \le \min(X_{m+1}, X_{m+2}, \dots, X_r)$. Thus, we should only worry for the other two dimensions. Notice that by this reduction, the problem becomes a 2D variation.
- Sort the values of $[l, m]$ and $[m+1, r]$ by the second dimension, which is by $Y$.
- Apply two pointers, one for $[l, m]$ and the other for $[m+1, r]$.
- When iterating $[m+1, r]$, update the BIT by values in $[l, m]$ accordingly, just like the 2D variation we have briefly discussed before.
- Solve the other interval $[m+1, r]$.
Let's try a different problem.
Lost Permutation (Own, no online judge (yet :D))
Prerequisite: FFT/NTT (only required if one wants to implement, but the main idea only requires the knowledge that FFT allows multiplication of polynomials to be done in linear logarithmic time)
Problem statement:
Let there be a permutation $A$ of size $N$. Construct a graph s.t. there exists an edge connecting $A_i$ and $A_j$ if and only if $A_i > A_j$ and $i < j$. You are given the number of connected components, $M$, and the size of each connected component, $X_i$. Your task is to find the number of permutations s.t. the constructed graph is consistent with the information given. Print the answer modulo $998\,244\,353$.
Constraints: $1 \le N, M, X_i \le 10^5$, $\sum_{i=1}^{M} X_i = N$
Solution:
Observation 1: For each connected component, the members must be consecutive.
Proof 1: Assume otherwise. Let $F$ be a connected component such that its members are not consecutive. Then, there must be a connected component G such that $\min(F) < \min(G) < \max(F)$. We will split $F$ to two parts: $F_1$ containing members that are less than $\min(G)$ and $F_2$ containing members that are more than $\min(G)$. Let $G_1$ be the set of members of $G$ that are less than $\max(F)$. To ensure that $F_2$ and $G_1$ are not connected, $G_1$ must have a smaller index than $F_2$. Similarly, to ensure that $F_1$ and $G_1$ are not connected, $F_1$ must have a smaller index than $F_2$. Thus, we conclude that $F_1$ has a smaller index than $F_2$, which is impossible as we know that $\min(F_2) > \min(G) > \max(F_1)$, hence $F_1$ and $F_2$ would not be connected. Contradiction.
Observation 2: There is only one way to order the connected components. Specifically, the connected components must be ordered increase in $\max(CC)$. An example is that a connected component with $\{4, 5, 6\}$ should be located left of $\{7, 8, 9, 10\}$.
Proof 2: Assume otherwise. Let $F$ and $G$ be different connected components such that $\max(F) > \max(G)$. If $F$ has a smaller index than $G$, then it satisfies $i < j$ and $A_i > A_j$ where $A_i = \max(F)$ and $A_j = \max(G)$. This means $i$ and $j$ are connected and $\max(F)$ and $\max(G)$ are in the same connected component. Contradiction.
From the two observations above, we can get the answer by multiplying $dp_{X_i}$s where $dp_x$ denotes the number of permutations of length $x$ that only have one connected component when the graph is constructed
Let's derive a transition for the dynamic programming. $dp_i = i! - \sum_{k=1}^{i-1} dp_{i-k} \times k!$, making this an $O(N^2)$ algorithm. The basecase is $dp_1 = 1$.
The logic behind this dp is $i!$ for all permutations length $i$, subtracted by invalid permutations $dp_{i-k} \times k!$. The $i - k$ denotes the first $i - k$ elements connected to one, then the next $k$ elements are located in another several connected components.
At this point, if you want to continue and solve it yourself, you can check the correctness of your implementation by comparing the $dp$ values with the quadratic solution.
Back to the original constraints, from here one would probably think of finding if such sequence in OEIS exists. Apparently, it does. There is a generating function that we can calculate in $O(N \log N)$, but we will be discussing another solution.
$\sum dp_{i-k} \times k!$ is a convolution, which we can calculate with NTT.
We will apply CDQ DnC:
- Suppose we are currently solving for dp values in the interval $[l, r]$.
- Split to two intervals and solve the dp values in the first interval, $[l, m]$.
- Find the contribution of $dp$ values from $[l, m]$ to $[m+1, r]$
- Multiply the two polynomials: $$dp_l + dp_{l+1} x + dp_{l+2} x^2 + \dots + dp_m x^{m-l}$$ and $$1! + 2! x + 3! x^2 + \dots + (r-l)! x^{r-l-1}$$
- Let $C_k$ be the coefficient of $x^k$ in the product. Notice that $C_k = \sum_j dp_{l+2+k - j} \times j!$
- Hence, we can apply $dp_i := dp_i - C_{i - l - 1}$ for all $m < i \le r$
- Solve the other interval $[m+1, r]$.
- When reaching a basecase $[x, x]$, $dp_x = x!$ as it is in the state before subtracting with $dp_{x-y} \times y!$ for $y < x$.
This is correct since in the DnC, we solved the intervals in order ($[l, m] \rightarrow [m+1, r]$), thus when calculating the contribution at the third step, the $dp_i$ values in $[l, m]$ are already final.
By master theorem, $T(N) = 2T(\frac{N}{2}) + O(NlogN)$ (NTT complexity), total complexity is $O(Nlog^2N)$.
Remember to multiply by $\frac{M!}{\prod cnt_{X_i}!}$, representing the number of permutations of the connected components.
Code:
Full code: https://pastebin.com/m2Jjmv6f
Practice problems:
- Balkan 2007 Mokia. For the english statement, please refer to other sources. (The reference in the footnote contains the abridged english statement for this problem).
- CF 1093 E (similar to Mokia)
*It's just a running joke targeted to a certain somebody who is very obsessed with segment tree. It's still a good data structure nonetheless.
CDQ DnC is sometimes refered to as online DnC, as it solves the ranges in order. The FFT technique explained in Lost Permutation is also sometimes refered to as online FFT.