.TH L3Q "1" "February 2024" "L3Q client" "User Commands"
.SH NAME
l3q \- l3q client. \fBLight\fR, \fBlight\fR, \fBlight\fRweight \fBqueuing\fR system to run processes/programs in parallel or in sequence on multiple nodes.
.SH SYNOPSIS
.B l3q
\fI[-h] [--version] [-l] {add,show,list,cancel,node,validate} ...\fR
.SH DESCRIPTION
.PP
l3q is the client that sends queries, requests and commands to the l3q daemon \fBl3qd\fR. All communication with the daemon takes place over the network using TCP/IP: the client sends POST requests and l3qd answers with JSON responses. l3q consists of sub-commands that handle different kinds of actions: \fBadd\fR adds jobs to l3q, \fBshow\fR prints information of different types, \fBlist\fR prints information in list format, \fBcancel\fR cancels running or queued jobs, \fBnode\fR lists the status of nodes and changes the state of nodes (must be run as root), and \fBvalidate\fR validates the client with the l3q daemon.
.br
l3q launches \fBjobs\fR that contain one or more \fBtasks\fR on compute \fBnodes\fR.
.br
Before the l3q daemon accepts communication from this client, the client has to be validated. On the l3q daemon server, start the validation daemon and follow the instructions: \fBl3qd --validate-host\fR
.SH QUEUE
The default queueing system in the l3q daemon works like a FIFO (first in, first out) queue. The job that was added first is launched first, and the remaining queued jobs are processed in turn from the top. As long as enough resources are available for the next queued job, it is started; this continues until no queued jobs are left or the resources run out. Processing stops at the first queued job that does not fit, even if smaller jobs further down the queue could be launched. Such a job has to wait until all jobs queued before it have been launched.
.br
Jobs in the Depend state are not included in the queue until the jobs they depend on have finished.
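.PP
As an illustration with made-up numbers, assume that 4 nodes are free, that the node count is the only limiting resource, and that three jobs are queued in this order:
.br
job 1 needs 2 nodes, job 2 needs 4 nodes, job 3 needs 1 node.
.br
Job 1 is launched, which leaves 2 free nodes. Job 2 does not fit, so processing stops there. Job 3 stays queued although it would fit on the free nodes, and is not considered until job 2 has been launched.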
.SH COMMANDS
.SS "Show commands"
.PP
.TP
\fB\ show [-h] {queue,history,job,status} ... \fR
Shows status and information about nodes and jobs.
.IP
\fB-h\fR, \fB--help\fR show help message and exit
.TP
\fB show status [-h] [-H]\fR
Shows information about all hosts available in l3q.
.IP
\fB-H\fR, \fB--human\fR Prints human-readable output
.br
\fB-h\fR, \fB--help\fR show help message and exit
.TP
\fB show queue [-h] [-l] \fR
Lists all running and queued jobs in l3q.
.br
As a shorthand, \fBl3q [-l]\fR prints the same output.
.IP
\fBColumns in queue\fR:
.br
\fBJobid\fR: l3q id of the job
.br
\fBUser\fR: User that added the job
.br
\fBName\fR: Free text that describes the job
.br
\fBCores alloc\fR: Number of cores per node that this job is allocating.
.br
\fBNodes alloc\fR: Number of nodes that this job is allocating.
.br
\fBTasks R/T/E:N\fR: Number of tasks: Running/Terminated/Error:Total
.br
\fBState\fR: A job can have the following states: Running, Queued, Terminated, Cancel, Canceled, Error
.br
\fBNodes\fR: Number of nodes that l3q is using for the job.
.IP
\fBAdditional columns in queue --long\fR:
.br
\fBInit\fR: Date and time when the job was added.
.br
\fBStart\fR: Date and time when the job started.
.br
\fBDepend\fR: Lists all jobs that this job depends on.
.br
\fBNodes\fR: Names of nodes that l3q is using for the job.
.IP
\fB-l\fR, \fB--long\fR Prints a long and more detailed view of the queue
.br
\fB-h\fR, \fB--help\fR show help message and exit
.TP
\fB show history [-h] [-l] [-n NUMBER] \fR
Lists all terminated and canceled jobs in l3q.
.br
The columns are the same as for \fBshow queue\fR.
.IP
\fB-l\fR, \fB--long\fR Prints a long and more detailed view of the history
.br
\fB-n NUMBER\fR, \fB--number NUMBER\fR Limits the number of old jobs that are displayed.
.TP
\fB show job [-h] [-d] JOBID \fR
Shows information about the specified job.
.IP
\fBJOBID\fR Id number of the job to show.
.br
\fB-d\fR, \fB--details\fR Shows detailed information about the specified job
.br
\fB-h\fR, \fB--help\fR show help message and exit
.TP
\fB show task [-h] TASKID \fR
Shows information about the specified task.
.IP
\fBTASKID\fR Id number of the task to show.
.br
\fB-h\fR, \fB--help\fR show help message and exit
.SS "List commands"
.PP
.TP
\fB\ list [-h] {task} ... \fR
Lists different types of information and status in l3q.
.TP
\fB list task [-h] JOBID \fR
Lists all tasks of the specified job.
.IP
\fBJOBID\fR Id number of the job whose tasks are listed.
.br
\fB-h\fR, \fB--help\fR show help message and exit
.SS "Add commands"
.PP
.TP
\fB\ add [-h] [-t TASKFILE] [-i] [-d DEPEND] {para,seq} ... \fR
Adds a new job to l3q.
.IP
\fB-t TASKFILE\fR, \fB--taskfile TASKFILE\fR
.br
Path to a file containing all tasks that should be run in sequence or in parallel.
.br
One task per line; empty lines and lines starting with # are ignored.
.br
TASKFILE must contain a special syntax line for either a parallel or a sequence job.
.IP
Special syntax in TASKFILE:
.br
#--add-para ...
.br
#--add-seq ...
.br
\fB-i\fR, \fB--jobid\fR
.br
The jobid of this job is returned on success as the only output.
.br
On failure the error message is printed.
.br
\fB-d DEPEND\fR, \fB--depend DEPEND\fR
.br
Comma-separated list of ids of other jobs that this job depends on.
.br
All jobs in this list have to terminate without errors before this job is started.
.br
If a job in the list fails, this job will also fail and will not be started.
.br
Example: 1,5,9 (comma-separated, without whitespace)
.IP
\fB-h\fR, \fB--help\fR show help message and exit
.TP
\fB add para [-h] [-c CORES] [-n NODES] [-j JOBNAME] [-i] [-d DEPEND] TASKFILE \fR
Adds a new parallel job of single-threaded tasks to l3q.
.br
The number of tasks executed in parallel depends on the number of cores and nodes specified.
.IP
\fBTASKFILE\fR Path to a file containing all tasks that should be run in parallel.
.br
One task per line; empty lines and lines starting with # are ignored.
.br
Lines containing tasks must have one or two columns, separated by whitespace.
.br
The first column specifies the path to an executable program and the optional second column the working directory.
.br
If the second column (workdir) is not specified, the user's home directory is used as the default workdir.
.br
\fB-c CORES\fR, \fB--cores CORES\fR
.br
Number of cores to use on each node. --nodes is also required.
.br
\fB-n NODES\fR, \fB--nodes NODES\fR
.br
Number of nodes to use; L3Q chooses which nodes to use. --cores is also required.
.br
\fB-j JOBNAME\fR, \fB--jobname JOBNAME\fR
.br
Name of the job.
.br
\fB-i\fR, \fB--jobid\fR
.br
The jobid of this job is returned on success as the only output.
.br
On failure the error message is printed.
.br
\fB-d DEPEND\fR, \fB--depend DEPEND\fR
.br
Comma-separated list of ids of other jobs that this job depends on.
.br
All jobs in this list have to terminate without errors before this job is started.
.br
If a job in the list fails, this job will also fail and will not be started.
.br
Example: 1,5,9 (comma-separated, without whitespace)
.br
\fB-h\fR, \fB--help\fR show help message and exit
.IP
If TASKFILE contains only task lines and no special syntax line, both options --cores and --nodes are required.
.br
If TASKFILE contains a line with the special syntax specifying the required options, no options are required on the command line.
.br
If TASKFILE contains the special syntax line and --cores and --nodes are also given as arguments, the arguments are used and the special syntax line is ignored.
.IP
Special syntax in TASKFILE:
.br
#--add-para --cores 5 --nodes 8
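.IP
A minimal TASKFILE sketch for a parallel job. The program paths and working directories are hypothetical placeholders:
.br
#--add-para --cores 5 --nodes 8
.br
# lines starting with # are ignored
.br
/home/user/bin/simulate_a.sh /home/user/run_a
.br
/home/user/bin/simulate_b.sh /home/user/run_b
.br
/home/user/bin/cleanup.sh
.br
The last task has no second column and would therefore run with the user's home directory as workdir. Because the file carries the special syntax line, it could be added without further options: l3q add para TASKFILE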
.TP
\fB add seq [-h] [-j JOBNAME] [-i] [-d DEPEND] TASKFILE \fR
Adds a new sequential job of single-threaded tasks to l3q.
.br
All tasks are executed one after another.
.IP
\fBTASKFILE\fR Path to a file containing all tasks that should be run in sequence.
.br
One task per line; empty lines and lines starting with # are ignored.
.br
Lines containing tasks must have one or two columns, separated by whitespace.
.br
The first column specifies the path to an executable program and the optional second column the working directory.
.br
If the second column (workdir) is not specified, the user's home directory is used as the default workdir.
.br
\fB-j JOBNAME\fR, \fB--jobname JOBNAME\fR
.br
Name of the job.
.br
\fB-i\fR, \fB--jobid\fR
.br
The jobid of this job is returned on success as the only output.
.br
On failure the error message is printed.
.br
\fB-d DEPEND\fR, \fB--depend DEPEND\fR
.br
Comma-separated list of ids of other jobs that this job depends on.
.br
All jobs in this list have to terminate without errors before this job is started.
.br
If a job in the list fails, this job will also fail and will not be started.
.br
Example: 1,5,9 (comma-separated, without whitespace)
.br
\fB-h\fR, \fB--help\fR show help message and exit
.IP
If TASKFILE contains a line with the special syntax specifying the name of the job, this name is used.
.br
If TASKFILE contains the special syntax line and --jobname is also given as an argument, the argument is used and the special syntax line is ignored.
.br
If TASKFILE contains multiple special syntax lines, the last one is used.
.IP
Special syntax in TASKFILE:
.br
#--add-seq --jobname The name of the job
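.IP
A sketch of how --jobid and --depend might be combined in a shell script to chain two jobs. The file names prepare.tasks and compute.tasks and the job name are hypothetical:
.br
JOBID=$(l3q add seq --jobname prepare --jobid prepare.tasks)
.br
l3q add para --cores 4 --nodes 2 --depend "$JOBID" compute.tasks
.br
With --jobid the first command prints only the job id on success, so it can be captured in a variable. On failure the error message is printed instead, so a real script should check the result before using it.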
.SS "Cancel commands"
.PP
.TP
\fB\ cancel [-h] {job} ... \fR
\fB-h\fR, \fB--help\fR show help message and exit
.TP
\fB cancel job [-h] JOBID \fR
Cancels the specified running or queued job.
.IP
\fBJOBID\fR Id number of the job to cancel
.br
\fB-h\fR, \fB--help\fR show help message and exit
.SS "Node commands"
.PP
.TP
\fB\ node [-h] {set} \fR
Configures compute nodes in L3Q.
.IP
\fB-h\fR, \fB--help\fR show help message and exit
.TP
\fB\ node set [-h] [--online] [--offline] NODENAME \fR
Sets the status of a compute node in L3Q.
.br
When a node is set offline, processes that are currently running are allowed to finish before the node goes offline, and the node does not accept any new processes.
.IP
\fBNODENAME\fR Name of the node whose state is changed in the L3Q daemon.
.br
\fB--online\fR Sets the node soft online.
.br
The node is displayed as Soft Online until it is reachable from the L3Q daemon, at which point it changes to Online.
.br
If the node is not reachable, the L3Q daemon sets it offline again.
.br
\fB--offline\fR Sets the node offline.
.br
If the node is still running processes, these are allowed to finish before the node is put in Maintenance mode; until then, Maintenance draining mode is shown in the status.
.br
While in Maintenance mode the node will not run any new jobs.
.br
\fB-h\fR, \fB--help\fR show help message and exit
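.IP
For example, a node could be taken out for maintenance, checked, and brought back later as follows (run as root; node01 is a hypothetical node name):
.br
l3q node set --offline node01
.br
l3q show status --human
.br
l3q node set --online node01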
.SS "Validate commands"
.PP
.TP
\fB validate [-h] -p PORT \fR
Configures and validates the client with the daemon.
.br
Without validation, the clients on this host are not able to communicate with the daemon.
.br
Start the validation server on the same server as the l3q daemon with:
.br
l3qd --validate-host
.br
and follow the instructions.
.IP
\fB-p PORT\fR, \fB--port PORT\fR Port that the validation server is listening on.
.br
\fB-h\fR, \fB--help\fR show help message and exit
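.IP
A typical sequence might look like this, where 12345 is only a placeholder for the port that the validation server is listening on:
.br
On the daemon host: l3qd --validate-host
.br
On the client host: l3q validate --port 12345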
.SH FILES
/etc/l3q/l3q.conf
.br
/etc/l3q/network.l3q
.br
/var/log/l3q/l3q-client.log
.SH AUTHOR
Written by Marcus Pedersén
.SH "REPORTING BUGS"
<https://notabug.org/marcux/l3q>
.SH COPYRIGHT
Copyright \(co 2023 Marcus Pedersén
.br
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
.br
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
.SH "SEE ALSO"
l3q.conf(5)