Sunday, October 10, 2021

[Tutorial] CDQ DnC

    

First tutorial blog :D Feedbacks are appreciated.

I will be discussing a divide and conquer technique, CDQ DnC. It is an offline technique that enables problems with rigorous data structure solutions to be solved in simple ways.

What is CDQ DnC?

CDQ DnC is a divide and conquer technique invented by the Chinese competitive programmer, CDQ. Honestly, there is only one difference between CDQ and normal DnC. Both divide and conquer techniques divides an interval $[l, r]$ to $[l, m]$ and $[m+1, r]$, where $m= \lfloor \frac{l+r}{2} \rfloor$. The only difference is that in CDQ, the values of $[l, m]$ have an influence on the values of $[m+1, r]$. This is usually the case for dynamic programming problems, where the formula $dp_i$ is derived from values of $dp_j$ where $j < i$.

The overall flow of CDQ DnC is the following:

  1. Divide $[l, r]$ to $[l, m]$ and $[m+1, r]$.
  2. Solve $[l, m]$ individually
  3. Calculate the contribution of the values $[l, m]$ apply in $[m+1, r]$
  4. Solve $[m+1, r]$ after the contribution is calculated
To understand the algorithm more, we will directly dive into problems.

Longest Non-decreasing Subsequence of Pairs (HDU 4742)

Prerequisite: segment tree or binary indexed tree

Problem statement: 

Given an array of size $N$. Each element is a triple of integers $(X_i, Y_i, Z_i)$. Find the longest set $\{A_1, A_2, \dots, A_k\}$ such that $X_{A_{i-1}} \le X_{A_i}$, $Y_{A_{i-1}} \le Y_{A_i}$, and $Z_{A_{i-1}} \le Z_{A_i}$ for all $1 < i \le k$. Note that the set $A$ is not necessarily sorted.

Also, find the number of such sets modulo $2^{30}$.

Constraints: $1 \le N \le 10^5$, $1 \le X_i, Y_i, Z_i \le 2^{30}$, $(X_i, Y_i, Z_i) \ne (X_j, Y_j, Z_j)$ for all $1 \le i < j \le N$

Solution:

Before everything, apply coordinate compression for $X$, $Y$, and $Z$ such that now $X, Y, Z \le 10^5$, which we can do by simply sort all of the values in a different array, remove duplicates, and apply binary search to find the compressed indices.

First observation is to notice that we can regard the first dimension as the index of an array of $(k-1)$-tuples. In other words, let $B$ be an array of pairs such that $B_{X_i} = (Y_i, Z_i)$, we are to find the longest non-decreasing subsequence of $B$. For equal values of $X_i$, greedily sort the $(Y_i, Z_i)$ pairs as it doesn't hurt the optimality of the answer.

Let's briefly discuss the solution for 1D and 2D variant of the problem in subquadratic time. Throughout the solution, I will be using a BIT (binary indexed tree) instead of a segment tree as BIT is relatively faster in both implementation and performance speed, hence a better data structure.*

The first-dimension variant is an obvious problem. Simply sort the array. The length is always $N$ and the number of possible sets is always $1$.

The two-dimension variant is a classic as well. Sort all of the elements by the first dimension and apply LNDS at the second dimension. Let $DP_i$ be the length of LNDS ending at element $i$. Maintain a maximum BIT that stores the DP values. In particular, update point $X_i$ with value $DP_{X_i}$. Since the DP values for every index will increase, the maximum operation in the BIT is valid. Answer would be query at point $MAX$, which would be $\max(DP_1, DP_2, \dots, DP_{MAX})$. To maintain the number of longest chains, provide another sum operation BIT that stores the count values. The implementation is left to the reader as I believe that there are a ton of sources for $O(N\log N)$ LNDS.

Back to the original problem. Sort all the elements by the first dimension and apply LNDS to $(Y_i, Z_i)$. How to achieve this? 2D Segment tree???? Not enough memory, and the implementation makes one get the "I'm die, thank you forever" treatment.

Let's try to solve this problem by CDQ DnC.

  1. Suppose we are currently solving for the interval $[l, r]$.
  2. Split to two intervals and solve the first interval, $[l, m]$.
  3. Find the contribution of $dp$ values from $[l, m]$ to $[m+1, r]$
    • Notice that $\max(X_l, X_{l+1}, \dots, X_m) \le \min(X_{m+1}, X_{m+2}, \dots, X_r)$. Thus, we should only worry for the other two dimensions. Notice that by this reduction, the problem becomes a 2D variation.
    • Sort the values of $[l, m]$ and $[m+1, r]$ by the second dimension, which is by $Y$.
    • Apply two pointers, one for $[l, m]$ and the other for $[m+1, r]$.
    • When iterating $[m+1, r]$, update the BIT by values in $[l, m]$ accordingly, just like the 2D variation we have briefly discussed before.
  4. Solve the other interval $[m+1, r]$.
Code:




Try a similar problem: HDU 5324 Boring Class

With this technique, we can solve for $K$ dimension LNDS in $O(N{\log}^KN)$ using nested CDQ DnC, although the implementation could be very tedious.

Let's try a different problem.

Lost Permutation (Own, no online judge (yet :D))

Prerequisite: FFT/NTT (only required if one wants to implement, but the main idea only requires the knowledge that FFT allows multiplication of polynomials to be done in linear logarithmic time)

Problem statement: 

Let there be a permutation $A$ of size $N$. Construct a graph s.t. there exists an edge connecting $A_i$ and $A_j$ if and only if $A_i > A_j$ and $i < j$. You are given the number of connected components, $M$, and the size of each connected component, $X_i$. Your task is to find the number of permutations s.t. the constructed graph is consistent with the information given. Print the answer modulo $998\,244\,353$. 

Constraints: $1 \le N, M, X_i \le 10^5$, $\sum_{i=1}^{M} X_i = N$

Solution:

Observation 1: For each connected component, the members must be consecutive.

Proof 1: Assume otherwise. Let $F$ be a connected component such that its members are not consecutive. Then, there must be a connected component G such that $\min(F) < \min(G) < \max(F)$. We will split $F$ to two parts: $F_1$ containing members that are less than $\min(G)$ and $F_2$ containing members that are more than $\min(G)$. Let $G_1$ be the set of members of $G$ that are less than $\max(F)$. To ensure that $F_2$ and $G_1$ are not connected, $G_1$ must have a smaller index than $F_2$. Similarly, to ensure that $F_1$ and $G_1$ are not connected, $F_1$ must have a smaller index than $F_2$. Thus, we conclude that $F_1$ has a smaller index than $F_2$, which is impossible as we know that $\min(F_2) > \min(G) > \max(F_1)$, hence $F_1$ and $F_2$ would not be connected. Contradiction.

Observation 2: There is only one way to order the connected components. Specifically, the connected components must be ordered increase in $\max(CC)$. An example is that a connected component with $\{4, 5, 6\}$ should be located left of $\{7, 8, 9, 10\}$.

Proof 2: Assume otherwise. Let $F$ and $G$ be different connected components such that $\max(F) > \max(G)$. If $F$ has a smaller index than $G$, then it satisfies $i < j$ and $A_i > A_j$ where $A_i = \max(F)$ and $A_j = \max(G)$. This means $i$ and $j$ are connected and $\max(F)$ and $\max(G)$ are in the same connected component. Contradiction.

From the two observations above, we can get the answer by multiplying $dp_{X_i}$s where $dp_x$ denotes the number of permutations of length $x$ that only have one connected component when the graph is constructed

Let's derive a transition for the dynamic programming. $dp_i = i! - \sum_{k=1}^{i-1} dp_{i-k} \times k!$, making this an $O(N^2)$ algorithm. The basecase is $dp_1 = 1$.

The logic behind this dp is $i!$ for all permutations length $i$, subtracted by invalid permutations $dp_{i-k} \times k!$. The $i - k$ denotes the first $i - k$ elements connected to one, then the next $k$ elements are located in another several connected components.

At this point, if you want to continue and solve it yourself, you can check the correctness of your implementation by comparing the $dp$ values with the quadratic solution.

Back to the original constraints, from here one would probably think of finding if such sequence in OEIS exists. Apparently, it does. There is a generating function that we can calculate in $O(N \log N)$, but we will be discussing another solution.

$\sum dp_{i-k} \times k!$ is a convolution, which we can calculate with NTT.

We will apply CDQ DnC:

  1. Suppose we are currently solving for dp values in the interval $[l, r]$.
  2. Split to two intervals and solve the dp values in the first interval, $[l, m]$.
  3. Find the contribution of $dp$ values from $[l, m]$ to $[m+1, r]$
    • Multiply the two polynomials: $$dp_l + dp_{l+1} x + dp_{l+2} x^2 + \dots + dp_m x^{m-l}$$ and $$1! + 2! x + 3! x^2 + \dots + (r-l)! x^{r-l-1}$$
    • Let $C_k$ be the coefficient of $x^k$ in the product. Notice that $C_k = \sum_j dp_{l+2+k - j} \times j!$
    • Hence, we can apply $dp_i := dp_i - C_{i - l - 1}$ for all $m < i \le r$
  4. Solve the other interval $[m+1, r]$.
  5. When reaching a basecase $[x, x]$, $dp_x = x!$ as it is in the state before subtracting with $dp_{x-y} \times y!$ for $y < x$.

This is correct since in the DnC, we solved the intervals in order ($[l, m] \rightarrow [m+1, r]$), thus when calculating the contribution at the third step, the $dp_i$ values in $[l, m]$ are already final.

By master theorem, $T(N) = 2T(\frac{N}{2}) + O(NlogN)$ (NTT complexity), total complexity is $O(Nlog^2N)$.

Remember to multiply by $\frac{M!}{\prod cnt_{X_i}!}$, representing the number of permutations of the connected components.

Code:



Full code: https://pastebin.com/m2Jjmv6f

Practice problems:

That is the end of this tutorial. I hope you understand the topic well and put it into good use :D



Postscripts:

References: https://robert1003.github.io/2020/01/31/cdq-divide-and-conquer.html.

*It's just a running joke targeted to a certain somebody who is very obsessed with segment tree. It's still a good data structure nonetheless.

CDQ DnC is sometimes refered to as online DnC, as it solves the ranges in order. The FFT technique explained in Lost Permutation is also sometimes refered to as online FFT.


Friday, October 8, 2021

[My Experience] CPC COMPFEST 13 2021

     COMPFEST is one of the largest IT events in Indonesia which is hosted by university students of Universitas Indonesia. They host seminars and competitions which are joined by people around Indonesia. I joined the Competitive Programming Competition hosted as a part of COMPFEST. There are two divisions of the competitive programming competition: Junior and Senior. 


     The junior division is designated for school students. Most of the participants (including me) are high school students, but there are several middle school students. In fact, a primary student also participated in the junior division. The competition is in IOI style, as a typical simulation for the upcoming National Science Olympiad of Informatics. On the other hand, the senior division is designated for university students which is in ICPC style. From here, the junior division will be refered as JCPC (Junior Competitive Programming Competition), and the senior division will be refered as SCPC (Senior CPC).

[Now the formal introduction is done I'll be informal now lmao]

    Anyways, I have joined JCPC three times consecutively. The first contest was when I was a specialist in Codeforces, so you can say that I was relatively "weak". Well yes, I was, and I got the 10th or 11th rank. Nothing special at all, I just joined for experience as technically I'm new to competitive programming (actually that happened in my second year of CP, but please ignore my first year I don't feel I learnt anything that year :D). I got 4th place in my second contest, with 3 years of experience and a master rating in Codeforces. A very big improvement in a span of one year due to national training, I guess.

Me in my first competition. The second and third competition is held online due to the pandemic :(
    

    So in this blog I would be sharing my experience in my third JCPC participation. Well, not just JCPC participation :)


    The head PIC of this year's CPC is a friend of mine, Hocky. We occasionally talk in a discord server with many other people who is (or was) interested in competitive programming. It was around March 2021 when Hocky joked (okay he probably was serious but I took it as a joke, I'll explain later why) about a "call for tasks" form for CPC. That was during a discussion meeting for TROC #20, a "regular" (isn't regular, just depends on the coordinator and the availability of problemsetters) open contest in Indonesia team's local online judge. Maybe I'll write about TROC in another blog, so stay tuned :D.

    Back to topic, Hocky offered another friend of mine Rama to give C4T (call fo(u)r tasks). Well I guess at the point of competition Rama would be attending university outside Indonesia so he is eligible. But he also offered me, a high schooler and actually going to attend Junior CPC, to C4T. So yeah definitely a joke right??

    Fast forward a few months to probably late July or early August. Hocky promotes his C4T form again in the main discord server (we have a designated server for TROC discussions for obvious reasons). I jokingly asked if I can problemset since I don't really have a full set for a solo contest, but also have decent (read: "bad") problems. He seriously answered yes. Actually interested, I gave him a private chat just to make sure if he was joking. He was actually serious o_o.

    Now you might be thinking: JCPC participant, but also a problemsetter? Very sus move Maxi. WELL IN A WAY IT WAS, but I just felt the need to publish my problems somewhere. I find it hard to just keep a problem for a month or year, I just want a problem to be published almost immediately. Please excuse my impatience :D Anyways it was agreed that my problems are only to be published in the senior bugabooset and I will not take part in any testcase making, editorial making, testing supervising, etc since I am also a participant. I talked this with the two main PIC of the competition: Hocky and Steven.

    In the end, I submitted 5 problems. I only consider one of them to be good (and indeed it was good) while the others are below standard. Initially Hocky planned to use all 5, but having 5 problems in a 10-13 problem ICPC style contest authored by a participant? Definitely too much. So, I requested only three of them to be used. In the actual contest, the problems are problem G, L, and M. After that, I didn't receive any info about my problems apart from the information that it is confirmed that all three problems will be used. Well, if I were to receive other information that would make me very sussy.

    With my story of problemsetting SCPC done, it is time to tell the experience of me as a participant in JCPC.


    There are two parts of the competition: the qualification round and the final round. The qualification round was hosted at 11th of September, while the final round was hosted recently at 2nd of October.

    I can't really say I did well in the qualification round. There were 6 problems to be done in 5 hours IOI style. You can refer to the problems here. For non-Indonesian readers, you can find a button which changes the language to English, if it isn't in English already. If you want to solve them, please solve them first, as I might write out some main ideas for each problem (after all there is no official editorial [yet????]). 

    I literally bricked and only got my first AC after one hour despite the problems being relative easy for a master rank after further inspection. My first AC is problem D, an interactive problem which its solution is just binary search. My second AC is problem F, a data structure problem which its states are relatively classic, since the topic of "longest zig-zag subsequence" is relatively googleable. After that AC, I felt more warmed up and ACed the rest of the easy problems, B and C. During the scoreboard freeze, I had 400 with a blank unsubmitted A and E, while other contestants had various scores in each task due to the partial scoring. It do be menacing if one sees an IOI style scoreboard with only 100s and -s. 

    The final hour is very stressful. I implemented an 80 score solution for A but for some reason it WAed with only a score of 11. Panicking, I manually solve the easier subtasks one by one until I get a score of 52. Eventually, it was found out that my 80 score solution was correct but I misimplemeneted. Sadge. E, on the other hand, is an output-only problem, which I saved for the last few minutes to take the easier solvable-by-hand subtasks.

    Unfortunately, I did not save the frozen scoreboard, but I still have the final scoreboard of the qualification round screenshoted.

P.S. Vietnam flag for no reason, other people below top 10 changed their flags so I decided to join\

    The testers said that they did not expect the rank 1 score of the qualification round to be this small, and I have to admit, they are correct. Just imagine the number of points I can save if I did not brick the first three hours. Anyways the point of the qualification round is to just proceed to the final round, which I did.

    A week before the final round, I felt that I did not feel warmed up because of midterms in school, so I decided to do some problems. In the end, I decided to do the ICPC WF invitational mirror in codeforces with other two friends. Didn't really help anything though, I only solved the second easiest problem. With me feeling unprepared, I just went for it. What's the worst that can happen? Oh wait, a 3 hour brick yeah definitely wouldn't repeat that.
    
    Again, you can see the problems here. There won't be many spoilers though, as an official editorial from the codeforces SCPC mirror already exists.

    Gladly I didn't though. I ACed the two easiest problems, A and B in the first hour, and getting the generous 79 point subtask in problem C in the second hour. The author claims that the final subtask should worth at least 40 points, and I kinda agree. The solution development is very far and the last subtask is very hard to get. In fact, it is rated 3000 in codeforces after its debut in the SCPC mirror. Anyways 279 in the first two hours is very ideal, compared to only having 100 points in the qualification round.

    Just a reminder to never celebrate early, as look what happens when one does :)

    I tried to take the first subtask of F to clarify my understanding of the problem. I got WA and tried to fix but to no avail. I ate a whole hour for a 0 in F. It turns out that the issue was just a simple if. Anyways I didn't found that on contest so I moved to E. E was some tree problem, I don't really like these problems and I am bad at them so I don't really expect to rack up much points here. I took the bruteforce solution for 14 points, but as I code I thought of a full solution for the problem. After I submitted the 14 point bruteforce, I coded for the full solution for an hour, but after failing the sample testcase, I realised my solution is 100% wrong. Dang, now I have wasted 2 hours. I rushed to the output only problem D and get as much points as I can. So there it was, a failure.

    In the end, I got second place. Wish I got that subtask in F, I would've continued thinking to the second subtask (since I already have the idea), but the past is the past. Lesson learnt: not bricking in the first two hours does not mean not bricking in the next two hours. Also learnt that before a competition, sleep for at least 9 hours because it is what it is.

    

Final scoreboard


Me dying in awarding session

    After the contest I enjoyed the ruined SCPC mirror scoreboard in codeforces. Here is the contest portal. It's fun to see people failing at my 2400 rated problem M and ended up having only 8 teams ACing it.

    That's the conclusion to my story in CPC COMPFEST 2021. It is definitely a fun and unique experience playing a weird role of both a participant and a problemsetter. I can't join JCPC next year as I will graduate from high school and go to university. I would still find time to problemset for SCPC if the future head PIC allows it :)

    Thank you for reading :D

Wednesday, October 6, 2021

First Entry :D

Hello! My name is Maximilliano Utomo and I am a problem solver enthusiast. I usually do competitive programming in preparation for informatics olympiads. I also occasionally do mathematics olympiad problems. This blog records my experience in the world of informatics and mathematics and will have discussion of problems from time to time. Hope you enjoy reading my blog!

[My Experience] CodinGame Spring Challenge 2022 (Zero to Legend)

(Early warning that there are spoilers related to my solutions of the challenge itself. If you are interested to solve it, I recommend doing...