eScience '16: 12th IEEE International Conference on eScience
Science is increasingly driven by collecting and evaluating growing amounts of data. Scientific workflows are one suitable way to organize processing pipelines for this purpose. In this work, we show that the execution performance of existing workflows can be improved if the start conditions for selected tasks with certain data access characteristics are relaxed. We provide a scheme for identifying eligible tasks in a given workflow and demonstrate how an earlier task start can be realized in Pegasus WMS by transforming the workflow DAG and by wrapping the task executable at runtime. Our wrapper handles the read accesses of task instances, so existing workflows can be executed without modification. We evaluate our approach in simulations and in experiments on real distributed computing resources, and observe a significant reduction in total execution time for the Montage workflow.
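
To make the idea of relaxing a start condition concrete, the following is a minimal sketch of one possible DAG rewrite. It is not the transformation used in this work or in Pegasus WMS: the function name `loosen_dependency`, the task names `A`, `B`, `C`, and the rule "let the eligible task inherit its parent's own parents" are all assumptions made for illustration. In such a scheme, a runtime wrapper would still have to delay each read until the data it needs has been produced.

```python
from typing import Dict, Set


def loosen_dependency(
    parents: Dict[str, Set[str]], task: str, parent: str
) -> Dict[str, Set[str]]:
    """Return a copy of the DAG in which `task` no longer waits for `parent`.

    `parents` maps each task name to the set of tasks it depends on.
    Hypothetical rule (an assumption, not the paper's scheme): the edge
    parent -> task is removed and `task` inherits the parents of `parent`,
    so both become eligible to start at the same time.
    """
    if parent not in parents.get(task, set()):
        raise ValueError(f"{task!r} does not depend on {parent!r}")

    # Copy every parent set so the input DAG is left untouched.
    loosened = {name: set(deps) for name, deps in parents.items()}
    loosened[task].discard(parent)
    loosened[task] |= loosened.get(parent, set())
    return loosened


# Hypothetical three-task chain: A -> B -> C.
dag = {"A": set(), "B": {"A"}, "C": {"B"}}
loosened = loosen_dependency(dag, "C", "B")
```

In this toy chain, `C` depends only on `A` after the rewrite, so `B` and `C` can be scheduled concurrently once `A` has finished.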