ABSTRACT
Workflow scheduling is crucial to the efficient operation of cloud platforms and has attracted considerable attention. Many algorithms have been proposed to schedule budget-constrained workflows so as to optimize their makespan on cloud resources. Nevertheless, the hourly billing model used in cloud computing remains a challenge for workflow scheduling, since it easily leads to longer makespans or even infeasible solutions. Moreover, because of data dependencies among workflow tasks, cloud resources inevitably contain many idle slots. Few works adequately exploit these idle slots to duplicate tasks’ predecessors and thereby shorten the tasks’ completion times, minimizing the workflow’s makespan while satisfying its budget constraint. Motivated by these observations, we propose a task-duplication-based scheduling algorithm, named TDSA, to optimize the makespan of budget-constrained workflows on cloud platforms. TDSA introduces two novel mechanisms: 1) a dynamic sub-budget allocation mechanism, which recovers the unused budget of scheduled workflow tasks and redistributes the remaining budget, making it possible to use more expensive and more powerful cloud resources to accelerate the completion of unscheduled tasks; and 2) a duplication-based task scheduling mechanism, which exploits idle slots on resources to selectively duplicate tasks’ predecessors, thus advancing these tasks’ completion times while trying to satisfy their sub-budget constraints. Finally, we carry out four groups of experiments, three on randomly generated workflows and one on real-world workflows, to compare the proposed TDSA with four baseline algorithms.
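As a rough illustration of the first mechanism, the following minimal Python sketch shows how hourly billing rounds cost up and how unused sub-budget could be recovered and redistributed. The abstract does not give TDSA's actual formulas, so everything here is an assumption made for illustration: the function names `hourly_cost` and `redistribute`, the proportional-to-weight split, and the example prices and weights are all hypothetical.

```python
import math


def hourly_cost(busy_seconds, price_per_hour):
    """Cost under hourly billing: every started hour is charged in full."""
    return math.ceil(busy_seconds / 3600) * price_per_hour


def redistribute(remaining_budget, weights):
    """Split the remaining budget over unscheduled tasks.

    Assumed policy (not taken from the paper): each task receives a share
    proportional to its weight, e.g. its estimated workload.
    """
    total = sum(weights.values())
    return {task: remaining_budget * w / total for task, w in weights.items()}


# Hypothetical workflow: total budget 10.0, three tasks with weights 1, 1, 2.
budget = 10.0
weights = {"a": 1, "b": 1, "c": 2}
sub_budgets = redistribute(budget, weights)  # "a" is allotted 2.5

# Task "a" runs for 3601 s on a resource priced at 0.5 per hour.
# Hourly billing charges two full hours, i.e. 1.0, which is below 2.5.
spent = hourly_cost(3601, 0.5)

# Recover the unused part of a's sub-budget and reallocate what is left
# to the tasks that are still unscheduled.
del weights["a"]
sub_budgets = redistribute(budget - spent, weights)
print(sub_budgets)
```

In this sketch the sub-budgets of the unscheduled tasks "b" and "c" grow once "a" finishes under budget, which is the effect the abstract attributes to dynamic sub-budget allocation: later tasks can afford faster, more expensive resources.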