----- Original Message -----
Dan rightly suggested that I be more specific about what the task system is instead of what it isn't.
The problem is that I'm not completely sure how it's going to work yet; it also depends on the events mechanism. This is my current working draft:
TaskInfo:
    id          string
    methodName  string
    kwargs      json-object (string keys, variant values) *filtered to remove sensitive information
getRunningTasks(filter string, filterType enum{glob, regexp})
    Returns a list of TaskInfo for all tasks whose ids match the filter.
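As a concrete illustration (not part of the draft), a raw call against this interface might look roughly like the following; the client object, the filter value and the returned ids are all made up:

    # Hypothetical usage sketch; "client", the filter value and the returned
    # ids are illustrative only.
    tasks = client.getRunningTasks(filter="storage.*", filterType="glob")
    # tasks might then contain entries such as:
    #   {"id": "storage.copyImage.1f3a",        # invented id, for illustration
    #    "methodName": "copyImage",
    #    "kwargs": {"srcDomainID": "...", "dstDomainID": "..."}}  # sensitive values already filtered out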
That's it, not even stopTask()
So for each verb we need to implement a specific 'cancel' op where relevant? e.g. copyImage -> cancelCopyImage? (or just deleteImage on the target image in this case?), migrateVM -> cancelMigrateVM? etc.
As explained, I would like to offload handling to the subsystems. In order to make things easier for the clients, every subsystem can choose a field of its objects to be of type OperationInfo. This is a generic structure that gives the user a uniform way to track tasks across all subsystems through a single reporting interface. The extraData field is for subsystem-specific data; this is where the storage subsystem would put, for example, imageState (broken, degraded, optimized) data.
OperationInfo:
    operationDescription  string - one of an agreed enum of strings vaguely describing the operation at hand, for example "Copying", "Merging", "Deleting", "Configuring", "Stopped", "Paused", .... They must be known to the client so it can translate them in the UI. They also have to remain relatively vague: they are part of the interface, so new values will break old clients, which means existing values have to be reusable.
    stageDescription - similar to operationDescription, in case you want more granularity; optional.
    stage (int, int) - (5, 10) means 5 out of 10. 1 out of 1 tells the UI not to display stage widgets.
    percentage - 0-100; -1 means unknown.
I'm assuming this is not overall percentage but relative to current stage?
    lastError - (code, message), the same errors that regular calls can return.
    extraData - json-object
For example, createVM will return once the object has been created in VDSM. getVmInfo() would return, amongst other things, the operation info. For the case of preparing for launch it would be:
    {"Creating", "configuring", (2, 4), 40, (0, ""), {state="preparing for launch"}}
In the case of a VM paused on EIO:
    {"Paused", "Paused", (1, 1), -1, (123, "Error writing to disks"), {state="paused"}}
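To make the field mapping explicit, here are the same two examples spelled out field by field as plain Python dicts; this is purely illustrative and not a wire format, the field names simply follow the draft above:

    # The same two OperationInfo examples from above, written out field by field.
    preparing_for_launch = {
        "operationDescription": "Creating",
        "stageDescription": "configuring",
        "stage": (2, 4),            # stage 2 out of 4
        "percentage": 40,
        "lastError": (0, ""),       # no error
        "extraData": {"state": "preparing for launch"},
    }

    paused_on_eio = {
        "operationDescription": "Paused",
        "stageDescription": "Paused",
        "stage": (1, 1),            # 1 out of 1 -> UI hides the stage widgets
        "percentage": -1,           # unknown
        "lastError": (123, "Error writing to disks"),
        "extraData": {"state": "paused"},
    }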
Migration is a tricky one: it will be reported as a task while it's in progress, but all the information is available on the image operationInfo. In the case of migration:
    {"Migration", "Configuring", (1, 3), -1, (0, ""), {status="Migrating"}}
Since you wrote 'image' I'm assuming you're referring to storage migration here? So the migrate command is successful the moment it starts (or even a bit before, once it has persisted the info or something), and then to query the state of the migration the user needs to call the image operationInfo?
For StorageConnection this is already somewhat the case, but in a simplified form.
If you want to ask about any other operation I'd be more than happy to write up my suggestion for it.
Subsystems have complete freedom about how to set up their API. For Storage you have Fixes() to start/stop operations.
Fixes?
Gluster is pretty autonomous once operations have been started.
Since operations return as soon as they are registered (persisted), or fail to register, synchronous programming becomes a bit clunky. vdsm.pauseVm(vmId) doesn't return when the VM is paused but when VDSM has committed to trying to pause it. This means you will have to poll in order to see whether the operation finished. For gluster, as an example, this is the only way we can check that an operation has finished.
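A minimal polling loop in that style might look like this (a sketch only; it assumes getVmInfo() exposes the operationInfo layout drafted above, and the helper name is made up):

    # Polling sketch, assuming getVmInfo() returns the operationInfo draft above.
    import time

    def wait_for_vm_operation(host, vmID, poll_interval=5):
        while True:
            opInfo = host.vm.getVmInfo(vmID).operationInfo
            if opInfo.stage.number == opInfo.stage.total:
                # Stage N out of N: nothing is in flight any more; the caller
                # inspects lastError / extraData to see how it ended.
                return opInfo
            time.sleep(poll_interval)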
For the stuff we have a bit more control over, VDSM will fire events using JSON-RPC notifications sent to the clients. They will be in the form of:
    {"method": "alert",
     "params": {"alertName": <subsystem>(.<objectType>)?.<object>.(<subobject>., ...),
                "operationInfo": OperationInfo}}
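For example, the paused-on-EIO case above might arrive as a notification shaped roughly like this (the exact alertName spelling is my assumption, following the template above):

    # Rough shape of one notification, written as a Python dict mirroring the JSON.
    notification = {
        "method": "alert",
        "params": {
            "alertName": "vdsm.VM.best_vm",    # assumed naming, per the template above
            "operationInfo": {
                "operationDescription": "Paused",
                "stageDescription": "Paused",
                "stage": [1, 1],
                "percentage": -1,
                "lastError": [123, "Error writing to disks"],
                "extraData": {"state": "paused"},
            },
        },
    }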
The user can register to receive events using a glob or a regexp. Registering to vdsm.VM.* fires every time any VM changes stage. This means that whenever a task finishes, fails or makes significant progress, and VDSM is there to track it, an event will be sent to the client.
This means that the general flow is:

    # Register the operation
    vmID = best_vm
    host.VM.pauseVM(vmID)

    while True:
        opInfo = None
        try:
            event = host.waitForEvent("vdsm.VM.best_vm", timeout=10)
            opInfo = event.opInfo
        except VdsmDisconnectionError:
            host.waitForReconnect()
            # Double check that we didn't miss the event
            opInfo = host.vm.getVmInfo(vmID).operationInfo
        except Timeout:
            # This is a long operation; poll to see that we didn't miss any
            # event but, more commonly, update the percentage in the UI to
            # show progress.
            vmInfo = host.vm.getVmInfo(vmID)
            opInfo = vmInfo.operationInfo

        if opInfo.stage.number != opInfo.stage.total:
            # Operation in progress
            updateUI(opInfo)
        else:
            # Operation completed.
            # Check that the state is what we expected it to be.
            if opInfo.extraData.state == "paused":
                return SUCCESS
            else:
                return opInfo.lastError
vdsm.waitForEvent(filter, timeout) is a client-side libvdsm helper operation. Clients that access the raw API need to write their own client-side code to filter events and manage their distribution. I'm open to also defining server-side filters, but I'm not sure whether it's worth it or whether having it be a boolean (all events or none) is sufficient.
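For example, a raw-API client might roll something like the following dispatcher itself (this is not part of libvdsm; the names are made up):

    # Client-side event filtering sketch for raw-API consumers; not libvdsm code.
    import fnmatch
    import re

    class EventDispatcher:
        def __init__(self):
            self._subscriptions = []            # list of (predicate, callback)

        def subscribe_glob(self, pattern, callback):
            self._subscriptions.append(
                (lambda name, p=pattern: fnmatch.fnmatch(name, p), callback))

        def subscribe_regexp(self, pattern, callback):
            rx = re.compile(pattern)
            self._subscriptions.append(
                (lambda name, r=rx: r.match(name) is not None, callback))

        def dispatch(self, notification):
            # notification is the parsed {"method": "alert", "params": {...}} object
            params = notification["params"]
            for matches, callback in self._subscriptions:
                if matches(params["alertName"]):
                    callback(params["alertName"], params["operationInfo"])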
This is a very simplified example but the general flow is clear. Even if the connection is lost for 1 second or 4 days, the code still works. Furthermore, the user can wait for multiple operations in the same thread using:

    host.waitForEvent("vdsm.VM.(best_vm_ever|not_so_good_vm)")

This means that the client can wait for 100 VMs, or all VMs (using wildcards), in a mechanism similar to "poll()" with minimal overhead. The fact that operations are registered also means that even if the connection is lost because VDSM crashed or the network went down, the manager doesn't need to care whether the original command returned, since it knows the operation was registered. This doesn't mean that every operation must retry forever; how persistent each method is can and should vary between operations. It also means that a manager that didn't initiate an operation can track it in the same way as the one that did. This makes clustered managers a lot easier to implement, since if one goes down a second one can take over more or less immediately with minimal extra code.
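Waiting on several VMs then becomes something like the following (a sketch under the same assumptions as the flow example above; the event's objectName field is an assumption, and error/timeout handling is omitted for brevity):

    # Sketch of the poll()-like pattern for several VMs at once.
    pending = {"best_vm_ever", "not_so_good_vm"}
    while pending:
        event = host.waitForEvent("vdsm.VM.(%s)" % "|".join(pending), timeout=30)
        opInfo = event.opInfo
        if opInfo.stage.number == opInfo.stage.total:
            # That VM's operation finished; event.objectName is assumed to carry
            # which object the alert was about.
            pending.discard(event.objectName)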