cps616: questions about the 2nd assignment.  

CPS616: Questions related to the 2nd assignment

This page contains questions from students and my answers to them. If a question is asked that has already been answered, then I will not post another answer. So if your questions seems to go unanswered, you should check over previous questions.

I was wondering where I can find copies of the transparencies you were using in class or copies of your notes.
Take notes in class: write down the most important definitions and ideas. You can browse handouts provided by Michael Goodrich and Roberto Tamassia. Note that a few of their original slides have minor errors and/or typos. Also, read handouts posted in the Documents folder on Blackboard.

Which programming language we can use to code the dynamic programming algorithms in the 2nd assignment?
Any common programming language is fine. But if you would like to use an exotic programming language, talk to the instructor in advance and explain your reasons.

With regards to defining the subproblems, I was thinking it's best to treat this similar to the XYZ problem. I was wondering if you could tell me if I'm on the right track? Since the rest of the question builds on this I don't want to make a big mistake early on.
Well, choosing the right set of sub-problems can be tricky when we would like to design an algorithm using DP. We have to check if we have a polynomial number of sub-problems: it is the case if we follow your idea. The main tool to check if we are on the right track is the optimal substructure property. It says that if we consider an optimal solution to a problem, then it should be constructed from optimal solutions to the subproblems. This property is important because when using DP we build a solution to the problem from solutions to smaller subproblems which usually are re-used a few times. If we find a counter-example such that optimal solutions to some of the subproblems do not yield to an optimal solution to the problem, then this violates the optimal substructure property. So, if we find such a counter-example, then we have to reconsider our choice for the set of sub-problems. Read your lecture notes. Also, read a handout about methodology of solving problems using DP.

What is the format of the input file in Part B of the 1st question?
The input files should have the following format. First line in the file should be a single positive integer number representing "n", the size of a grid along an axis. After that, each line should include 5 numbers separated by blanks. The first 2 numbers represent the coordinates of square where a transtion starts, the next 2 numbers represent coordinates of square where a transition ends and the last 5th number represent a potential income that can be a positive or negative integer. To process this input, you can read the first line of the input file to get the value of "n", then initialize all elements in your array p[n][n][n][n] to the largest negative integer number representable on your computer. After that, you can read all remaining lines in the input file in a single for-loop and assign the correct values of "p" to all legal transitions. Note that if n=3, you need to specify only 14 legal transitions in the input file, but the total number of elements in the array "p" will be 3*3*3*3=81. In other words, the input file uses a compact representation of legal transitions. As as example, consider the following input file. It should be easy to figure out what an optimal sequence of transitions should be for this input.

0 0 0 1 5
0 0 1 1 -1
1 0 0 1 15
1 0 1 1 3
1 0 2 1 -2
2 0 1 1 18
2 0 2 1 1
0 1 0 2 6
0 1 1 2 -3
1 1 0 2 13
1 1 1 2 4
1 1 2 2 -4
2 1 1 2 25
2 1 2 2 2

Do we get different income from being in different squares on the south border?
No, for simplicity, in this problem we assume that the intial income we get from any square on the south border is 0. In other words, you can assume that all the values in your array OPT[0][0], . . . , OPT[n-1][0] will be 0. Subsequently, we collect an income from doing legal transitions between squares.

If square has coordinates (i,j), then where this square is located on a grid?
Assume that the first coordinate 0 =< i =< (n-1) corresponds to a horizontal axis, and the 2nd coordinate 0 =< j =< (n-1) corresponds to the vertical axis. For example, square (0,n-1) will be the North-West corner of the valley, and square (n-1,0) will be the South-East corner of the valley.

In Part B of Q1, should our program compute all optimal sequences, or only one optimal sequence of transitions?
Computing and printing a single optimal sequence of transitions should be fine. There is no need to compute all of them.

In Q2, can the right hand sides of different rules be same?
Yes, they can be same. For example, a program might include two different rules X_5 --> X_7 X_9 and X_8 --> X_7 X_9.

In Part A, Q1, when I encountered the section in which you ask us to define an array Opt[i][j], I became confused since I thought the array A[i][j] was to be the max profit of having made previous decisions to square (i,j). But is this not the same definition of Opt[i][j]? So is there any way to clarify why do we need both these arrays and how they differ from one another?
The array A is a "semantic" array, i.e., you use it to formulate in English the meaning of a problem that you are solving. The array OPT is an array that we use to formulate a recurrence. The recurrence is a kind of "number-crunching" mechanism that does calculations, but the question remains whether the numbers calculated from the recurrence are what we need, or not. If the recurrence is correct, then A=OPT, but this needs a proof. Usually, this can be proved by induction, because elements on the right-hand side of your recurrence will be correct by inductive hypothesis, and using the recurrence you prove (in the inductive step) that the value on the left-hand side of the recurrence is also correct. In the assignment, you are asked to argue (as precisely as you can) that the values A[i][j] and OPT[i][j] are the same. At least, you have to explain why the numbers in OPT[i][j] computed by recurrence will be exactly what is defined informally in English. See the textbook for some examples.

I know we're supposed to find the highest profit but i also noticed this: "For each square, your program should print an optimal transition and an optimal value earned. Make sure that your output is easy to read: please use ASCII pseudo-graphics or similar symbols to print output neatly." Does this mean that for example in a 3x3 "forest" we need to show the maximum profit and route for all 9 (or well at least the top 6 since the bottom 3 would be 0) squares, or do you just want the top row or even just the best route?
This means that in a "n*n forest", you will print a two-dimensional array of the maximum profits, and another two-dimensional array with symbols "u" (go up), "w" (go north-west), "e" (go north-east), or some other readable symbols (your choice). To make sure that the TA will understand your notation, provide a brief legend. So, any notation you choose should be fine, as long as it is easy to understand.
Similarly, in the LCS problem that we discussed in class, you are asked to provide vertical, horizontal or diagonal arrows to show how to read the longest common sub-sequence itself (not only the length of this sub-sequence). See also Section "Computing a Solution in Addition to its Value" in the textbook (p. 257).
One purpose of this illustration is to simplify the task of marking. Another purpose is that you will realize that in addition to a value optimization problem, there is a closely related search problem. As a solution to the search problem, you demonstrate transitions themselves, not only optimal profits.

I'm working on question 2 for the second assignment and I am confused as to what exactly is being asked. When you say that some programs cause undesirable effect, is that because the program does not follow the rules that you gave us? Hence, the programmer did not make a program syntactically correct or is it because the actions that the program does is harmful, meaning the program is syntactically correct and then these harmful programs are stored in the library?
This means that a sequences of actions that the program produces is "harmful". For the same program, other sequences of actions can be harmful or not. All programs (=finite sets of rules) are syntactically correct as long as they have the specified format, i.e., all rules in the program are of 2 types only: X_i --> a_j or X_i1 --> X_i2   X_i3. Your algorithm takes as an input a finite set of rules and a single sequence of actions. Assume that a given set of rules is syntactically correct, you do not need to check the input. You can also assume that a given sequence of actions is "harmful" (for whatever reason). Your task is to design an algorithm that for a given input will find whether or not a given sequence of actions can be produced by a given program.

(Continuation of Q11) Also, do we have to actually write the pseudo code that checks if the program is syntactically correct? Meaning it will return 0 or 1 accordingly.
No, you do not. A program, i.e., a finite set of rules, is a part of the input to an algorithm. When we design algorithms, we never check whether the input is correct. We just assume it is correct, and based on this assumption design an algorithm that solves a problem. In a similar vein, when we studied the problem of multiplying a chain of matrices using the minimal number of multiplications, we did not check whether all matrices contain only numbers (and not characters or something else). We just assumed that the input is syntactically correct.

(Continuation of Q11 and Q12.) We are asked to solve the verification problem, what is the verification problem exactly? Is the verification problem the problem of checking the library for undesirable behavior or checking the program for proper syntax? I'm not entirely sure what I am solving or even how to use the array for the semi-solutions that I am thinking might work.
None of what you have mentioned. The so-called "verification problem", is the problem of finding whether a given sequence of actions can be produced from X_1 by using a given set of rules. If yes, then show how it can be produced. This is well explained in the assignment, where a sequence of actions is called a behavior, for brevity.

In Q2, we've been trying to start with one symbol (from the bottom); for each of the symbols of the bad-behavior sequence, we expand it based on the given rules. We get stuck on what should we do if there are several rules. Should we look at the neighbors of that symbol and choose the rule that fits? Should we replace that symbol or should we save the intermediate result in a list?
Please read again the paragraph on the 3rd page that starts with "Develop a DP algorithm ..." If you read carefully the hints given in the assignment, you will see how to design this algorithm. I'll rephrase these hints again. Consider subproblems of the following form: for each contiguous sub-sequence w_i   w_{i+1}   ...   w_j of the given sequence of actions w_1 ... w_n we have to find a set of intermediate symbols that can be used to produce this sub-sequence. We keep a set of these symbols as an entry A[i][j] in the array A (if the set is empty, we keep the empty set symbol). Note that we abstract away here from any implementation details and talk simply about sets. Since there are rules of 2 types only, if the length of the sub-sequence w_i w_{i+1} ... w_j is greater than 2, then it can be only produced starting from one of the rules X_i1 --> X_i2   X_i3, if any. Consequently, if the sub-sequence w_i w_{i+1} ... w_j can be produced at all from one of the intermediate symbols X_i1, this can happen only if a shorter sub-sequence w_i ... w_k can be produced, say, from X_i2 and the remaining sub-sequence w_{k+1} ... w_j can be produced from X_i3 by applying the rules of the given program. Of course, the question remains how to find "k" such that the problem for contiguous sub-sequence w_i w_{i+1} ... w_j will split into two smaller sub-problems. You should see now that the structure of overlapping sub-problems is somewhat similar to the structure of overlapping subproblems of the matrix chain multiplication problem (it was discussed in class and you did an example in a lab). Read your lecture notes and read the handout if you forgot the details. The links to handouts are provided from the Documents folder on Blackboard.

For question 2, the problem I am having is how to make a decision about the structure of the algorithm's input. For the string of undesirable behaviour, can I say it is an array u[] that is size n? The structuring of the input of the production rules is where I become confused. When it is said that we are given m primitive actions and r intermediate symbols, how to represent this. Am I allowed to represent this by a 2D array, where the number of rows is the corresponding production rule and the columns represent the different versions of the rules?
Yes, you can represent an undesirable behavior as an an array u[] that has size n. In any case, try to take an abstract perspective on this. In other words, it is irrelevant what data structures can be used to represent rules, sequences of actions, sets of symbols. When you describe an algorithm, you can characterize it on a high level of abstraction. For example, if a given "bad" bahavior consists just of a single primitive action a_j, and we have a rule X_1 --> a_j, then yes, we know that this "bad bahavior" can be produced by a given finite set of rules. Otherwise, if a given "bad bahavior" consists of a sequence of "n" primitive actions, then we consider sub-problems, as I explained in hints above (Q14/A14). Note that when we do this kind of reasoning and analysis, it does not matter what data structures are used to represent rules X_i --> X_j1 X_j2. If you keep thinking on the level of implementation details (what classes and objects you need and what methods you have to call), you still can design a correct algorithm, but it will take you more time and more efforts to do this. Note that you are NOT asked to implement an algorithm for Q2. So, implementation-level details are irrelevant.

For Question2, should the derivation of the rules always start from X1? For example, if we have two rules, X1-->a1 and X2-->a2, and the given sequence is 'a2', should the program return 0 since starting from X1 we only can produce a1?
Yes, in your example, "a_2" cannot be produced by a given program. However, if you add the rule X_1 --> a_2, then a_2 can be produced as well. Moreover, if we add 2 more rules, e.g.,
X_1 --> X_2 X_3
X_3 --> a_3,

then a sequence "a_2 a_3" also can be produced from X_1.

I am wondering whether it is absolutely necessary for the question 1(part B) to test the program with n=10, since to do this I will have to come up with 252 different legal transitions? I figured out general formula for the number of legal transitions needed for any n value greater than 2: ((n-2)*3+4)*(n-1). That's how I got number 252. Also since you mentioned that we need to put the output of the program into file Results.txt and include input file(just one) as a part of overall electronic submission, should my input file (and produced by program output file dependent on it) use value of the n equal to 4, 10 or some other value between those two?
No, it is not required to try n=10. But please try at least one more value in addition to n=4, e.g., n=5 or n=6. The transitions for different rows can be similar.
Please include 2 input files into your ZIP file and provide both outputs from them in your Results.txt file.

Must a specified robot expression have its derivations in order? ie: Is it possible that X1 -> X5;X3 such that X5 come before X3, or does the first part have to be of lower order (X1 -> X2;X3)
Rules can use the intermediate symbols in any order. So, it is possible to have rules like X1 -> X5;X3 such that X5 occurs before X3

To solve question 2 should we be borrowing ideas directly from matrix multiplication example, more specifically is the optimal sub structure of the two question similar?
Structure of subproblems is somewhat similar to the matrix chain multiplication example, with minor differences of how subproblems are being re-used. As you see from the answer A18 to Q18, the intermediate symbols can be used in any order. When we consider a chain of matrices A1 * A2 * ... * A_n than these matrices are multiplied strictly in order of their appearance.

The assignment PDF provides 3 restrictions for transitioning from a square x to y:

1) a northern square next to the current square (unless the current square is already on the north-most
border of the valley, then logging has to stop),
2) a square across the north-east corner from the current square (but only if the company is not already
on the left border of the valley),
3) a square across the north-west corner from the current square (but only if the company is not already
on the right border of the valley).
From what I understand, movements are made in either north, northeast or northwest directions only, so there are not lateral east or west movements. I understand (1) is for northern, (2) is for northeastern and 3) is for northwestern movements. However, the wording is confusing inside parentheses. For example, Rule2 says it moves across the northeast corner but then stipulates it must not be at the left border. There is a similar confusing situation in Rule3, where a company at the western border could move further north-west since it is not at the right border. Is it a typo in PDF?
Yes, you are right: there are no lateral east or west transitions. Your understanding of transitions is correct. But there is a typo in the rules. In Rule2: replace "left" with "right" and in Rule3, replace "right" with "left".
The eastern and western borders cannot be crossed. From an area on the eastern border, the company can make only north-bound or north-west transitions, but it cannot move north-east. Similarly, from an area on the western border, the company can move only to the adjacent northern area or to the adjacent north-eastern area.

I cannot see what sub-problems I should consider in Part 1. Can you give a hint?
If you read Part 1 carefully, you see you are told to define a semantic array A[i][j] with 2 parameters where "i" represents a row, and "j" represents a column. Also, in the problem statement, we read that a company collects income from logging in different squares. This implies that we should be looking for the most profitable way to get to (i,j) from the South border. Consequently, the natural optimization sub-problem would be finding the maximal profit from moving to the square (i,j) from some initial square on the South border. Think how a solution to this sub-problem can be developed by reasoning backward, e.g., if there is an optimal way to get to (i,j), then it includes an optimal way of getting to an intermediate square, and then making an optimal move from there to (i,j). Next, think about smaller sub-problems that need to be solved to solve an (i,j)-subproblem. In particular, how the company can get to the square (i,j)? In the assignment, there are 3 rules defining how the company can move to (i,j) from the previous row (i-1). Consider all these rules to develop a recurrence.

I have a small inquiry about question 1. Are we to assume that the most profitable path always end in the north most row of the logging area? I ask because it is possible for profit to be negative, which means that it could be possible that all transitions from the second most north row to the north most row could all be negative, which means that the logging should actually end at the second most north row. In fact, it could be possible that every transition given is negative, or that the combination of transitions and profit from previous squares are all negative which would mean that the most optimal path would be to not log at all or the company would only lose money. Thanks for clarifying.
Let's assume for simplicity that it is mandatory (for some reason) to proceed until the North border, i.e., logging should not terminate anywhere in between South and North borders. This might be ensured either if all p(x,y) >= 0, or if for every x there exists at least one y such that transition from x to y brings positive p(x,y) > 0. However, notice that in the latter case the greedy strategy of choosing the largest immediate positive profit p(x,y) will not always yield the globally optimal solution. Sometimes, the company might wish to incur a loss (=negative profit), that later will be compensated by very high positive returns from profitable areas in the forest. You should be able to construct counter-examples when an obvius greedy strategy will lead astray from the globally optimal solution.