Assumptions and limitations

  • To get an accurate estimated start time, you must specify a run limit at the job level (the bsub -W option), at the queue level (RUNLIMIT in lsb.queues), or at the application level (RUNLIMIT in lsb.applications), or you must specify a run time estimate by defining the RUNTIME parameter in lsb.applications. If neither a run limit nor a run time estimate is defined, the scheduler tries to use the CPU limit instead.

  • Estimated start time is only as accurate as the current running job information. If running jobs finish earlier than expected, the estimated start time may move earlier. Only the highest priority job gets an accurate predicted start time; the estimated start times of other jobs can change after the first job starts.

  • Under time-based slot reservation, only information from currently running jobs is used for making reservation decisions.

  • Estimated start time calculation does not consider Deadline scheduling.

  • Estimated start time calculation does not consider Advance Reservation.

  • Estimated start time calculation does not consider DISPATCH_WINDOW in lsb.hosts and lsb.queues configuration.

  • If preemptive scheduling is used, the estimated start time may not be accurate. The scheduler may calculate an estimated start time, but the job may actually preempt other jobs and start earlier.

  • For resizable jobs, time-based slot reservation does not schedule pending resize allocation requests. However, for resized running jobs, the allocation change is used when calculating a pending job's predicted start time and resource reservation. For example, if a running job uses 4 slots at the beginning but adds another 4 slots, then after the new resources are added, LSF expects 8 slots to be available when the running job completes.
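As noted in the first limitation above, an accurate estimate requires a run limit or a run time estimate. Either of the following provides one (the queue name and time values are illustrative):

    # At job submission, set a 3-hour run limit (hours:minutes):
    bsub -W 3:00 myjob

    # Or at the queue level, in lsb.queues:
    Begin Queue
    QUEUE_NAME = normal
    RUNLIMIT   = 3:00
    End Queue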

Slot limit enforcement

The following slot limits are enforced:

  • Slot limits configured in lsb.resources (SLOTS, PER_SLOT)

  • MXJ, JL/U in lsb.hosts

  • PJOB_LIMIT, HJOB_LIMIT, QJOB_LIMIT, UJOB_LIMIT in lsb.queues
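For example, a per-user slot limit can be configured in lsb.resources (the limit name, user, and value below are illustrative):

    Begin Limit
    NAME     = user_slot_limit
    USERS    = user1
    SLOTS    = 10
    End Limit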

Memory request

To request memory resources, configure RESOURCE_RESERVE in lsb.queues.

When RESOURCE_RESERVE is used, LSF considers both memory and slot requests during the time-based reservation calculation. LSF does not reserve slots or memory if any other resource requirement cannot be satisfied.
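For example, to enable reservation in a queue and submit a job that requests memory (the reservation time and memory amount are illustrative):

    # In lsb.queues, within the queue definition:
    RESOURCE_RESERVE = MAX_RESERVE_TIME[20]

    # Submit a job that requests memory:
    bsub -R "rusage[mem=4000]" myjob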

If SLOT_RESERVE is configured instead, time-based reservation does not make a slot reservation if any other type of resource, including a memory request, cannot be satisfied.

Host partition and queue-level scheduling

If host partitions are configured, LSF first schedules jobs on the host partitions and then goes through each queue to schedule jobs. The same job may be scheduled several times: once for each host partition, and last at the queue level. The available candidate hosts may be different each time.

Because of this difference, the same job may get different estimated start times, future allocations, and reservations in different host partitions and in queue-level scheduling. With time-based reservation configured, LSF always keeps the reservation and future allocation that have the earliest estimated start time.

bjobs displays future allocation information

  • By default, a job's future allocation contains the LSF host list and the number of CPUs per host, for example:

    alloc=2*hostA 3*hostB

  • LSF integrations can define their own future allocation string to override the default LSF allocation. For example, in the cpuset integration, the future allocation is displayed as:

    alloc=2*mstatx01 2*mstatx00
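To view a pending job's estimated start time and future allocation, you can use the long format of bjobs (the job ID below is illustrative):

    bjobs -l 1234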

Predicted start time may be postponed for some jobs

If a pending job cannot be placed in a future resource allocation, the scheduler can skip it in the start time reservation calculation and fall back to greedy slot reservation. There are two possible reasons:

  • The job slot request cannot be satisfied in the future allocation.

  • Other non-slot resources cannot be satisfied.

Either way, the scheduler continues calculating predicted start time for the remaining jobs without considering the skipped job.

Later, once the resource request of the skipped job can be satisfied and the job can be placed in a future allocation, the scheduler reevaluates the predicted start times for the remaining jobs, which may postpone their start times.

To minimize the overhead of recalculating predicted start times to include previously skipped jobs, configure a small value for LSB_TIME_RESERVE_NUMJOBS in lsf.conf.
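For example, to limit the calculation to only the first few pending jobs (the value below is illustrative):

    # In lsf.conf:
    LSB_TIME_RESERVE_NUMJOBS=4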