Pilot job

From HandWiki
Revision as of 17:30, 6 February 2024 by Steve2012 (talk | contribs) (over-write)
Short description: Type of multilevel scheduling

In computer science, a pilot job is a form of multilevel scheduling in which an application acquires a resource and then schedules work onto that resource directly, rather than submitting each work unit to the local job scheduler, where every unit could incur its own queue wait. The term comes from the Condor High-Throughput Computing System, in which Condor GlideIns[1] provide this functionality. Other examples of pilot jobs include BigJob, implemented in SAGA;[2] Swift Coasters, part of the Swift[3] parallel scripting system; the Falkon[4] lightweight task execution framework; and HTCaaS.[5]
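The two scheduling levels described above can be illustrated with a minimal Python sketch. It is not the API of Condor, SAGA, Swift, Falkon, or HTCaaS: the names `run_pilot` and `run_with_pilots` are hypothetical, threads stand in for pilots, and starting a thread stands in for acquiring a resource through the local scheduler. Once the pilots are running, the application places work units on its own queue and the pilots pull from it, so no work unit passes through the local scheduler.

```python
import queue
import threading


def run_pilot(task_queue, results, pilot_id):
    """Pilot body: runs inside an already-acquired resource and pulls work
    from the application's own queue until it receives a stop sentinel."""
    while True:
        task = task_queue.get()
        if task is None:              # sentinel: no more work for this pilot
            task_queue.task_done()
            break
        func, arg = task
        results.append((pilot_id, arg, func(arg)))
        task_queue.task_done()


def run_with_pilots(func, args, n_pilots=2):
    """Start n_pilots pilots, feed them every work unit, collect the results."""
    task_queue = queue.Queue()        # application-level queue (second level)
    results = []                      # list.append is thread-safe in CPython
    pilots = [
        threading.Thread(target=run_pilot, args=(task_queue, results, i))
        for i in range(n_pilots)
    ]
    for pilot in pilots:              # stands in for acquiring the resources
        pilot.start()
    for arg in args:                  # work units go straight to the pilots
        task_queue.put((func, arg))
    for _ in pilots:                  # one stop sentinel per pilot
        task_queue.put(None)
    for pilot in pilots:
        pilot.join()
    return results


if __name__ == "__main__":
    out = run_with_pilots(lambda x: x * x, range(5))
    print(sorted(value for _, _, value in out))
```

Which pilot runs which work unit is not fixed in this sketch; only the set of results is deterministic.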

Pilot jobs are most often used on systems that have queues, since part of their purpose is to avoid repeated waits in those queues. Such queues are most often found in parallel computing systems, but pilot jobs are usually part of a distributed application and are often associated with many-task computing.
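The saving in queue waits can be made concrete with a toy model. The function below is only an illustration under strong assumptions that are not in the sources cited here: every submission to the local scheduler waits the same fixed time, and a single pilot holds its resource long enough to run all work units. The name `total_queue_wait` and the numbers in the example are hypothetical.

```python
def total_queue_wait(n_tasks, wait_per_submission, use_pilot):
    """Total time spent waiting in the local scheduler's queue.

    Toy model: each submission waits a fixed `wait_per_submission`.
    Without a pilot, every work unit is its own submission; with a pilot,
    only the pilot itself is submitted.
    """
    if n_tasks == 0:
        return 0
    submissions = 1 if use_pilot else n_tasks
    return submissions * wait_per_submission


if __name__ == "__main__":
    # 100 work units, an assumed 600 s wait per submission
    print(total_queue_wait(100, 600, use_pilot=False))  # 60000
    print(total_queue_wait(100, 600, use_pilot=True))   # 600
```

Real queue waits vary with system load and with the size of the request, so the model shows only the direction of the effect, not its magnitude.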

References

  1. Sfiligoi, I. (2008). "glideinWMS—a generic pilot-based workload management system". J. Phys.: Conf. Ser. 119 (6): 062044. doi:10.1088/1742-6596/119/6/062044. Bibcode: 2008JPhCS.119f2044S. https://digital.library.unt.edu/ark:/67531/metadc895137/. 
  2. Luckow, André; Lacinski, Lukasz; Jha, Shantenu (2010). SAGA BigJob: An Extensible and Interoperable Pilot-Job Abstraction for Distributed Applications and Systems, 10th IEEE/ACM International Conference on Cluster. 135–144. doi:10.1109/CCGRID.2010.91. ISBN 978-1-4244-6987-1. 
  3. Wilde, Michael (2011). "Swift: A language for distributed parallel scripting". Parallel Computing 37 (9): 633–652. doi:10.1016/j.parco.2011.05.005. 
  4. I. Raicu, Y. Zhao, C. Dumitrescu, I. Foster, M. Wilde. "Falkon: A Fast and Lightweight Task Execution Framework," IEEE/ACM SC, 2007, http://www.cs.iit.edu/~iraicu/research/publications/2007_SC07_Falkon.pdf
  5. Jik-Soo Kim, Seungwoo Rho, Seoyoung Kim, Sangwan Kim, Seokkyoo Kim, and Soonwook Hwang, HTCaaS: Leveraging Distributed Supercomputing Infrastructures for Large-Scale Scientific Computing, ACM 6th Workshop on Many-Task Computing on Clouds, Grids, and Supercomputers (MTAGS'13) held with SC13, November 2013, http://datasys.cs.iit.edu/events/MTAGS13/p02.pdf
