BATCH
Introduction
Extended actions (xactions) are batch operations, or jobs, that run asynchronously, report statistics (viewable at runtime and later), can be waited upon, and can be stopped.
As for terminology: the code mostly says xaction, after the name of the corresponding software abstraction. Elsewhere, it is simply a job; the two terms are interchangeable.
In the source code, all supported xactions are enumerated here.
For users, there’s an API to start, stop, and wait for a job:
In CLI, there’s the ais job command and its subcommands (<TAB-TAB> completions):
$ ais job
start stop wait rm show
$ ais job start
prefetch download lru rebalance resilver ec-encode copy-bck
blob-download dsort etl cleanup mirror warm-up-metadata move-bck
Not all supported jobs can be started via ais start or by the corresponding Go or Python API call. For example, the job to copy or (ETL) transform datasets has its own dedicated API (both Python and Go) and CLI.
See, e.g., ais cp --help
The complete and most up-to-date list of supported jobs can be found in this table of job descriptors.
Last but not least is time: job execution may take many seconds, sometimes minutes or hours.
Examples include erasure coding or n-way mirroring a dataset, resharding and reshuffling a dataset and more.
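Because a job may run for hours, a caller that waits on it typically polls with a timeout rather than blocking indefinitely. The following is a minimal sketch of such a wait loop, not the AIS API: `wait_for_job` and the `is_finished` callable are hypothetical names introduced here, and the callable stands in for whatever status query the Go or Python API actually provides.

```python
import time
from typing import Callable


def wait_for_job(
    is_finished: Callable[[], bool],
    timeout: float = 60.0,
    initial_delay: float = 0.1,
    max_delay: float = 5.0,
) -> bool:
    """Poll is_finished() until it returns True or the timeout elapses.

    is_finished is a hypothetical stand-in for a real job-status query.
    Returns True if the job finished in time, False on timeout.
    """
    deadline = time.monotonic() + timeout
    delay = initial_delay
    while True:
        if is_finished():
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Sleep, but never past the deadline; back off exponentially.
        time.sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay)


# Simulated job that reports completion on the third poll.
polls = iter([False, False, True])
done = wait_for_job(lambda: next(polls), timeout=5.0, initial_delay=0.01)
```

The exponential backoff keeps polling cheap for long-running jobs such as rebalance while still returning quickly for short ones.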
Global rebalance is triggered automatically by any membership change (nodes joining, leaving, power-cycling, etc.) and can be further visualized via the ais show rebalance CLI.
Another example would be primary election. AIS proxies provide access points (“endpoints”) for the frontend API. At any point in time there is a single primary proxy that also controls versioning and distribution of the current cluster map. When and if the primary fails, another proxy is majority-elected to perform the (primary) role.
This (election by simple majority) is also a job that cannot be started via ais start or the corresponding API. Like global rebalance, it is event-driven, and there’s a separate dedicated API to run it administratively.
Rebalance and a few other AIS jobs have their own CLI extensions. Generally, though, you can always monitor xactions via the ais show job xaction command, which also supports verbose mode and other options.
AIS subsystems integrate subsystem-specific stats - e.g.:
Related CLI documentation:
- CLI: ais show job
- CLI: multi-object operations
- CLI: reading, writing, and listing archives
- CLI: copying buckets
Operations on multiple selected objects
AIStore provides APIs to operate on batches of objects:
API Message (apc.ActionMsg) | Description |
---|---|
apc.ActCopyObjects | copy multiple objects |
apc.ActDeleteObjects | delete --/-- |
apc.ActETLObjects | etl (transform) --/-- |
apc.ActEvictObjects | evict --/-- |
apc.ActPrefetchObjects | prefetch --/-- |
apc.ActArchive | archive --/-- |
For CLI documentation and examples, please see Operations on Lists and Ranges (and entire buckets).
There are two distinct ways to specify the objects: list them (i.e., the names) explicitly, or specify a template.
The supported template syntax comes in 3 alternative formats:
- bash (or shell) brace expansion:
prefix-{0..100}-suffix
prefix-{00001..00010..2}-gap-{001..100..2}-suffix
- at style:
prefix-@100-suffix
prefix-@00001-gap-@100-suffix
- fmt style:
prefix-%06d-suffix
In all cases, prefix and/or suffix are optional.
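To make the bash brace-expansion format concrete, here is a minimal Python sketch that expands `{start..end}` and `{start..end..step}` ranges into object names. It is an illustration rather than the AIS implementation: `expand_template` is a hypothetical helper, and zero-padding to the width of the range start is an assumption inferred from the examples in this document. The at and fmt styles are not covered.

```python
import itertools
import re

# Matches {start..end} and {start..end..step} with decimal bounds.
_RANGE = re.compile(r"\{(\d+)\.\.(\d+)(?:\.\.(\d+))?\}")


def expand_template(template: str) -> list[str]:
    """Expand bash-style numeric ranges in a template (illustrative only)."""
    parts: list[list[str]] = []  # alternating literal text and range values
    pos = 0
    for m in _RANGE.finditer(template):
        parts.append([template[pos:m.start()]])
        start_s, end_s, step_s = m.groups()
        step = int(step_s) if step_s else 1
        if step < 1:
            raise ValueError("step must be a positive integer")
        # Assumption: pad every value to the width of the range start.
        width = len(start_s)
        parts.append(
            [str(i).zfill(width) for i in range(int(start_s), int(end_s) + 1, step)]
        )
        pos = m.end()
    parts.append([template[pos:]])
    # Cartesian product: the rightmost range varies fastest.
    return ["".join(combo) for combo in itertools.product(*parts)]


names = expand_template("prefix-{00001..00010..2}-suffix")
```

A template with no ranges is returned here unchanged as a single name; per the Range parameter description, AIS instead treats such a template as an object name prefix.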
List
List APIs take a JSON array of object names, and initiate the operation on those objects.
Parameter | Description |
---|---|
objnames | JSON array of object names |
Range
Parameter | Description |
---|---|
template | The object name template with optional range parts. If a range is omitted, the template is used as an object name prefix |
Examples
All the following examples assume that the action is delete and the bucket name is bck, so only the value part of the request is shown:
"value": {"objnames": ["obj1","dir/obj2"]}
- deletes objects obj1 and dir/obj2 from the bucket bck
"value": {"template": "obj-{07..10}"}
- removes the following objects from bck (note leading zeroes in object names):
- obj-07
- obj-08
- obj-09
- obj-10
"value": {"template": "dir-{0..1}/obj-{07..08}"}
- a template can contain more than one range; this example removes the following objects from bck (note leading zeroes in object names):
- dir-0/obj-07
- dir-0/obj-08
- dir-1/obj-07
- dir-1/obj-08
"value": {"template": "dir-10/"}
- the template defines no ranges, so the request deletes all objects whose names start with dir-10/
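The value part of such requests can be assembled programmatically. The sketch below builds it as a Python dictionary and serializes it to JSON, using the objnames and template parameter names from the tables above. `make_value` is a hypothetical helper, and rejecting requests that specify both or neither form is an assumption made for this sketch, not documented AIS behavior.

```python
import json
from typing import Optional, Sequence


def make_value(
    objnames: Optional[Sequence[str]] = None,
    template: Optional[str] = None,
) -> dict:
    """Build the "value" part of a multi-object request (illustrative only)."""
    # Assumption: exactly one of the two ways to specify objects is used.
    if (objnames is None) == (template is None):
        raise ValueError("specify exactly one of objnames or template")
    if objnames is not None:
        return {"objnames": list(objnames)}
    return {"template": template}


list_value = json.dumps(make_value(objnames=["obj1", "dir/obj2"]))
range_value = json.dumps(make_value(template="obj-{07..10}"))
prefix_value = json.dumps(make_value(template="dir-10/"))
```

Serializing with `json.dumps` keeps the list of names a proper JSON array rather than a quoted string.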