Purdue University Timetabling

Description

In the timetabling problem at Purdue University, a timetable for large lecture classes is constructed by a central scheduling
office in order to balance the requirements of many departments offering large classes that serve students from across
the university. Smaller classes, usually focused on students in a single discipline, are timetabled by “schedule deputies”
in the individual departments. Such a complex timetabling process, including subsequent student registration, takes a
rather long time. Initial timetables are generated about half a year before the semester starts. The importance of
creating a solver for a dynamic problem increases with the length of this time period and the need to incorporate
the various changes that arise.
As of the Fall 2004 semester, this problem consists of about 830 classes (forming almost 1800 meetings) with a high density of interaction that must fit within 50 lecture rooms with capacities of up to 474 students. Room availability is a major constraint for Purdue. Overall utilization of the time available in rooms exceeds 78%; moreover, it is around 94% for the four largest rooms. About 90,000 course requests by almost 30,000 students must also be considered, and 8.4% of class pairs have at least one student enrolment in common.

The timetable maps classes (students, instructors) to meeting locations and times. A major objective in developing an automated system is to minimize the number of potential student course conflicts that occur during this process. This requirement substantially influences the automated timetable generation process, since most programs of study offered by the University have many specific course requirements.

To minimize potential time conflicts, Purdue has historically subscribed to a set of standard meeting patterns. With few exceptions, 1 hour x 3 day per week classes meet on Monday, Wednesday, and Friday at the half hour (7:30, 8:30, 9:30, ...). 1.5 hour x 2 day per week classes meet on Tuesday and Thursday during set time blocks. 2 or 3 hours x 1 day per week classes must also fit within specific blocks, etc. Generally, all meetings of a class should be taught in the same location. Such meeting patterns are of interest to the problem solution because they allow easier changes between classes having the same or similar meeting patterns.

Model

Due to the set of standardized time patterns and administrative rules enforced at the university, it is generally possible to represent all meetings of a class by a single variable. This tying together of meetings considerably simplifies the problem constraints. Most classes have all meetings taught in the same room, by the same instructor, at the same time of day. Only the day of week differs.
Moreover, these days and times are mapped together with the help of meeting patterns, e.g., a 2 hours x 3 day per week class can be taught only on Monday, Wednesday, Friday, beginning at 5 possible times.
Or, for instance, a 1 hour x 2 day per week class can be taught only on Monday-Wednesday, Wednesday-Friday or Monday-Friday, beginning at 10 possible times.
In addition, all valid placements of a course in the timetable have a one-to-one mapping with values in the variable's domain. This domain can be seen as a subset of the Cartesian product of the possible starting times, rooms, etc. for a class represented by these values. Therefore, each value encodes the selected time pattern (some alternatives may occur, e.g., 1.5 hour x 2 day per week may be an alternative to 1 hour x 3 day per week), selected days (e.g., a two-meeting course can be taught on Monday-Wednesday, Tuesday-Thursday, or Wednesday-Friday), and possible starting times. A value also encodes the instructor and selected meeting room. Each such placement also encodes its preferences (soft constraints), combined from the preferences for time, room, building, and the room's available equipment.

Only placements with valid times and rooms are present in a class's domain. For example, when a computer (classroom equipment) is required, only placements in a room containing a computer are present. Also, only rooms large enough to accommodate all the enrolled students can be present in valid class placements. Similarly, if a time slice is prohibited, no placement containing this time slice is in the class's domain.

As mentioned above, each value, besides encoding a class's placement (time, room, instructor), also contains information about the preference for the given time and room. Room preference is a combination of preferences on the choice of building, room, and classroom equipment.

The second group of soft constraints is formed by student requirements. Each student can enrol in several classes, so the aim is to minimize the total number of student conflicts among these classes. Such a conflict occurs when a student cannot attend two classes in which he or she is enrolled because the classes overlap in time.

Finally, there are some group constraints (additional relations between two or more classes).
These may either be hard (required or prohibited) or soft (preferred), similar to the time and room preferences (from -2 to 2).

Constraints

There are two types of basic hard constraints: resource constraints (expressing that only one course can be taught by an instructor or in a particular room at the same time), and group constraints (expressing relations between several classes, e.g., that two sections of the same lecture cannot be taught at the same time, or that some classes have to be taught one immediately after another).

Besides the constraints described above, several additional constraints came up during our work on this lecture timetabling problem. These constraints were defined in order to make the automatically computed timetable acceptable to users at Purdue University.

First of all, if two classes are placed one after another so that there is no time slot in between (also called back-to-back classes), distances between buildings need to be considered. The general feeling is that different rooms in the same building are always reasonable, moving to the building next door is to be discouraged, a couple of buildings away strongly discouraged, and any longer distance prohibited. Each building has its location defined as a pair of coordinates [x,y]. The distance between two buildings is estimated by the Euclidean distance in this two-dimensional space, i.e., (dx^2 + dy^2)^(1/2), where dx and dy are the differences between the x and y coordinates of the buildings. For instructors, two back-to-back classes are infeasible to teach when this distance is more than 200 meters (hard constraint). The other options (soft constraints) are:
Our concern for distance between back-to-back classes for students is different. Here it is simply a question of whether it is feasible for students to get from one class to another during the 10-minute passing period. At present, a distance between buildings of no more than 670 meters is considered an acceptable travel distance. For distances above 670 meters, the classes are considered too far apart. If a student attends both classes, this counts as a student conflict (the same as when the classes overlap in time).

Next, since the automatic solver tries to maximize the overall satisfaction of soft time and room constraints (preferences), the resultant timetable might be unacceptable for some departments. The problem is that some departments define their time and room preferences more strictly than others. The departments which have not defined time and room preferences usually have most of their classes taught in early morning or late evening hours. Therefore, we introduced a mechanism for balancing departmental time and room preferences. The solver tries to fulfill the time and room preferences as well as to balance the used times between individual departments. This means that each department should use each time unit (half-hour, e.g., Monday 7:30 – 8:00) in a similar proportion to the other time units used by the department.

At first, for each department and time unit, there is a number stating how many times each time unit can be used (i.e., how many placements of all classes from the department can be placed over the time unit). For instance, if there are two 1 hour x 2 days per week classes, the time unit Wednesday 8:00 – 8:30 can be used four times, i.e., each of these classes can be placed either on Monday-Wednesday or Wednesday-Friday from 8:00 till 9:00. Then, an average fill factor is computed for each department and time unit.
It is the ratio between the computed number of placements using the time unit and the total number of placements of all classes from the department (sixty for the above example with two classes, since each class can be placed at thirty different times if all possible times are allowed). So, this factor states the overall usage of a time unit for a department. The reason for computing this number is that some times are used much more than others (e.g., if the department has most of its classes in n hours x 3 days per week patterns, Tuesday and Thursday are used much less than Monday, Wednesday and Friday). The initial allowance, which states how many times each time unit can be used by a department, is computed from this maximal fill factor: it is increased by the given percentage (20% is used in our tests) and rounded up to the next integer. The overall department balancing penalty of a solution is the sum of overruns of this initial allowance over all time units and departments. The intention is to keep this number as low as possible.

Finally, since all of the classes are at least two time slots long (60 minutes), an empty time slot of a room which is surrounded by classes on both sides (i.e., the room is not used for 30 minutes between two consecutive classes) is considered useless, since no other class can use it. The number of such useless half-hours should be minimized. Also, the situation when a room is occupied by a class that uses less than 2/3 of its seats is discouraged. Both of these soft constraints are considered much less important than all the constraints described above.

Heuristics

The quality of a solution is expressed as a weighted sum combining soft time and classroom preferences, satisfied soft group constraints and the total number of student conflicts. This allows us to express the importance of different types of soft constraints. The following weights are considered in the sum:
Note that preferences of all time, classroom and group soft constraints range from -2 (strongly preferred) to 2 (strongly discouraged). So, for instance, the value of the weighted sum increases when a discouraged time or room is selected or a discouraged group constraint is satisfied. Therefore, of two solutions, the better one has the lower weighted sum of the above criteria.

The termination condition stops the search when the solution is complete and good enough (expressed in terms of the number of perturbations and the solution quality described above). It also allows the solver to be stopped by the user. Characteristics of the current and the best achieved solution, describing the number of assigned variables, time and classroom preferences, the total number of student conflicts, etc., are visible to the user during the search.

The solution comparator prefers a more complete solution (with a smaller number of unassigned variables) and, among solutions with the same number of unassigned variables, a solution with a smaller number of perturbations. If both solutions have the same number of unassigned variables and perturbations, the solution of better quality is selected.

If there are one or more unassigned variables, the variable selection criterion picks one of them randomly. We have tried several approaches using domain sizes, the number of previous assignments, the number of constraints in which the variable participates, etc., but none gave a significant improvement over random selection of an unassigned variable in this timetabling problem. The reason is that it is easy to go back when a wrong variable is picked: such a variable is unassigned when there is a conflict with it in one of the subsequent iterations. When all variables are assigned, each variable is evaluated according to the weights described above. The variable with the worst evaluation is selected, since it promises the best improvement in optimization.
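The solution comparator and the variable selection rule described above can be illustrated with a small Python sketch. It is not the actual solver code: the names `SolutionInfo`, `is_better` and `select_variable` are hypothetical, and the sketch assumes that a "worse" evaluation of an assigned variable means a higher weighted penalty.

```python
import random
from dataclasses import dataclass


@dataclass
class SolutionInfo:
    """Summary of a solution (hypothetical structure for this sketch)."""
    unassigned: int       # number of unassigned variables
    perturbations: int    # number of perturbations
    weighted_sum: float   # solution quality; lower is better


def is_better(candidate: SolutionInfo, best: SolutionInfo) -> bool:
    """Compare lexicographically: completeness first, then perturbations,
    then the weighted sum of soft-constraint penalties."""
    key_candidate = (candidate.unassigned, candidate.perturbations,
                     candidate.weighted_sum)
    key_best = (best.unassigned, best.perturbations, best.weighted_sum)
    return key_candidate < key_best


def select_variable(unassigned, evaluation, rng=random):
    """Pick a random unassigned variable if there is one; otherwise pick the
    assigned variable with the worst (assumed: highest) evaluation.

    unassigned -- list of unassigned variables
    evaluation -- dict mapping each assigned variable to its weighted penalty
    """
    if unassigned:
        return rng.choice(unassigned)
    return max(evaluation, key=evaluation.get)
```

For example, `is_better(SolutionInfo(0, 5, 10.0), SolutionInfo(1, 0, 0.0))` holds because a complete solution is preferred regardless of its quality.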
We have implemented a hierarchical handling of the value selection criteria. There are three levels of comparison. At each level a weighted sum of the criteria described below is computed. Only values with the smallest sum are considered in the next level. The weights express how quickly a complete solution should be found. Only hard constraints are satisfied in the first level sum. Distance from the initial solution (MPP), and a weighting of major preferences (including time, classroom requirements and student conflicts), are considered in the next level. In the third level, other minor criteria are considered. In general, a criterion can be used in more than one level, e.g., with different weights.

The above sums order the values lexicographically: the best value has the smallest first level sum, the smallest second level sum among values with the smallest first level sum, and the smallest third level sum among these values. As mentioned above, this allows the importance of individual criteria to be differentiated. Furthermore, the value selection heuristics also support some limits (e.g., that all values with a first level sum smaller than a given percentage Pth above the best value [typically 10%] will go to the second level comparison and so on). This keeps in play a value that is close to the best at one level and may yet be much better in the next level of comparison. If more than one value remains after these three levels of comparison, one is selected randomly. This approach helped us to significantly improve the quality of the resultant solutions. In general, there can be more than three levels of these weighted sums; however, three of them seem to be sufficient for spreading the weights of various criteria for our problem.
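The multi-level comparison can be sketched in Python as follows. This is only an illustration under stated assumptions: `select_value`, `level_sums` and `threshold` are hypothetical names, each level is supplied as a function returning the weighted sum of a value, and the limit is taken as the best sum plus the given percentage of its absolute value (so that negative sums are handled sensibly).

```python
import random


def select_value(values, level_sums, threshold=0.10, rng=random):
    """Filter candidate values level by level, then pick one at random.

    values     -- candidate values (placements) of the selected variable
    level_sums -- one function per level, mapping a value to its weighted sum
    threshold  -- fraction above the best sum that still passes to the
                  next level (0.10 corresponds to the 10% mentioned above)
    """
    candidates = list(values)
    for level_sum in level_sums:
        scored = [(level_sum(value), value) for value in candidates]
        best = min(score for score, _ in scored)
        # Assumption of this sketch: the tolerance is relative to |best|.
        limit = best + abs(best) * threshold
        candidates = [value for score, value in scored if score <= limit]
    # Several values may survive all levels; choose among them randomly.
    return rng.choice(candidates)
```

With a threshold of zero this reduces to a strict lexicographic ordering. A positive threshold lets a value that is slightly worse at one level win at a later level.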
The value selection heuristics also allow for random selection of a value with a given probability (random walk, e.g., 2%) and, in the case of MPP, to select the initial value (if it exists) with a given probability (e.g., 70%). Criteria used in the value selection heuristics can be divided into two sets. Criteria in the first set are intended to generate a complete assignment:
Let us emphasize that the criteria from the second group are needed for optimization only, i.e., they are not needed to find a feasible solution. Furthermore, assigning a different weight to a particular criterion influences the value of the corresponding objective function.

Student Sectioning

Many courses at Purdue University consist of several sections, with students enrolled in the course divided among them. Sections are often tied together by constraints. For example, sections of the same course should not overlap. Each such section forms one class which has its own preferences. Therefore, each is treated separately: there is a variable for each section.

An initial sectioning of students into course sections is performed first. This student sectioning is based on Carter's homogeneous sectioning and is intended to minimize future student conflicts. However, there is still a possibility of improving the solution with respect to the number of student conflicts. This can be achieved via section changes during the search. In the current implementation, sectioning is altered only by switching student enrolments between two different sections of the same course. Each student enrolment in a course with more than one section is processed. An attempt is made to switch it with a student enrolment from a different section. If this switch decreases the total number of student conflicts, it is applied. In the current implementation, the students are switched in the current solution before it is stored as the best solution. All classes are processed and attempted switches are made between students in the same course. Note that a switch of a student enrolment can be followed by subsequent switches, so classes can be processed more than once.
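The enrolment switching described above can be illustrated with the following Python sketch for a single course. It is not the solver's implementation: `improve_sectioning` and the `conflicts` callback are hypothetical, and the sketch assumes that the number of conflicts of a student in a section depends only on that student's other classes.

```python
from itertools import combinations


def improve_sectioning(sections, conflicts):
    """Swap student enrolments between sections of one course while the
    total number of student conflicts decreases.

    sections  -- dict: section id -> set of student ids (modified in place)
    conflicts -- hypothetical callback conflicts(student, section_id) giving
                 the number of conflicts the student would have with his or
                 her other classes when placed in that section
    Returns the number of swaps applied.
    """
    swaps = 0
    improved = True
    while improved:  # sections may be processed more than once
        improved = False
        for sec_a, sec_b in combinations(list(sections), 2):
            for s1 in list(sections[sec_a]):
                for s2 in list(sections[sec_b]):
                    before = conflicts(s1, sec_a) + conflicts(s2, sec_b)
                    after = conflicts(s1, sec_b) + conflicts(s2, sec_a)
                    if after < before:
                        # Apply the swap; section sizes stay unchanged.
                        sections[sec_a].remove(s1)
                        sections[sec_a].add(s2)
                        sections[sec_b].remove(s2)
                        sections[sec_b].add(s1)
                        swaps += 1
                        improved = True
                        break  # s1 has left sec_a; continue with next student
    return swaps
```

Because every applied swap strictly lowers the total number of conflicts, the loop terminates. Swapping pairs rather than moving single students keeps the section sizes fixed.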